«Machine Scale Analysis of Digital Collections: An Interview with Lisa Green of Common Crawl»



How do we make digital collections available at scale for today’s scholars and researchers? Lisa Green, director of Common Crawl, tackled this and related questions in her keynote address at Digital Preservation 2013. (You can view her slides and watch a video of her talk online.) As a follow-up to ongoing discussions of what users can do with dumps of large sets of data, I’m thrilled to continue exploring the issues she raised in this insights interview.

Trevor: Could you tell us a bit about Common Crawl? What is your mission, what kinds of content do you have and how do you make it available to your users?

Lisa: Common Crawl is a non-profit organization that builds and maintains an open repository of web crawl data that is available for everyone to access and analyze. We believe that the web is an incredibly valuable dataset capable of driving innovation in research, business, and education …

