Project Wolverine
Broadcasting LIVE from the orbiting command centre

Two cool things came to my attention today.

First, Google are opening up their BigQuery service.  This allows companies to scan billions of rows in seconds, and retrieve results for complex analysis queries.  Neat!

http://www.bbc.co.uk/news/technology-15734243

Secondly, Common Crawl have released a 7 billion page archive of websites they’ve scanned.  If you think you can write a better search engine than Google, then you can test your theories against their pre-made dataset.  You’ll need an Amazon EC2 account to do so, but it’s way cheaper and more convenient than building your own crawl.

http://www.commoncrawl.org/
