The joys of screen scraping

In an ideal world an organisation would make its data available through a simple RESTful interface. Sometimes they do, sometimes they don’t.

I have had two projects brewing in my mind, to build and showcase the capabilities of “this fully armed and operational battle station”, by which I mean me:

  • has done so much for Web 2.0. I wanted to write a fun app to see which of my friends have listened to more tracks than me. All I need is a list of my friends and a play count for each of them. I can get the friend list easily, but to get the play count I need to get the user to log in and authenticate. There should be no need: the data is public on each and every user’s web page. This is a known problem, and a fix has been in the offing for several months. I could wait, or I could screen scrape. It ain’t polite, but it gets the job done.
  • If you commute into the city then the state of the tube becomes almost a religion; it has improved a lot, but one can still have a couple of nightmare journeys a year. TFL now lets you see the departure board for each and every tube station, which is a great help. I wouldn’t mind seeing where each train is, and where they are all bunched up. I could live without it, but it seems like an interesting piece of software to write, and it would be of at least some relevance to people. It would be great if there were an API for this, but there isn’t one that I know of. So I am going to have to roll my sleeves up and build one myself, fishing the data out of each station’s web page; there are 270 stations, and the data is updated minute by minute. I need to think about this one.
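
For the first project, the scraping step might look something like the sketch below. It is only a rough illustration: the markup (a span with class "playcount") is an assumption, not the real page structure, and the names PlayCountParser, extract_play_count and friends_ahead_of are made up for this example. It uses only Python's standard html.parser module and works on HTML that has already been downloaded.

```python
from html.parser import HTMLParser


class PlayCountParser(HTMLParser):
    """Grabs the number inside the first element with class "playcount".

    The class name is a hypothetical stand-in; a real page would need
    its own selector, found by reading the page source.
    """

    def __init__(self):
        super().__init__()
        self._inside = False
        self.play_count = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if self.play_count is None and ("class", "playcount") in attrs:
            self._inside = True

    def handle_endtag(self, tag):
        self._inside = False

    def handle_data(self, data):
        if self._inside:
            digits = data.strip().replace(",", "")
            if digits.isdigit():
                self.play_count = int(digits)


def extract_play_count(html):
    """Return the play count found in the page, or None if there is none."""
    parser = PlayCountParser()
    parser.feed(html)
    return parser.play_count


def friends_ahead_of(my_count, friend_pages):
    """Return (name, count) pairs for friends with more plays, highest first.

    friend_pages maps a friend's name to the HTML of their public page.
    """
    counts = {name: extract_play_count(html) for name, html in friend_pages.items()}
    ahead = [(name, c) for name, c in counts.items() if c is not None and c > my_count]
    return sorted(ahead, key=lambda pair: pair[1], reverse=True)


# Made-up sample markup, standing in for a downloaded profile page.
SAMPLE_PAGE = '<div><span class="playcount">12,345</span> tracks played</div>'
```

Calling extract_play_count(SAMPLE_PAGE) on the made-up sample gives the integer 12345. A scraper like this breaks the moment the page layout changes, which is one more reason a proper API would be the better answer.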

Then there comes the web page, and that is a whole other set of skills…


~ by zeristor on June 18, 2009.
