Dear Gordon,

Great to hear from you. freetable.org sounds like a really interesting
project.

The "wikipedia" of data idea is one that has come up several times in
the "open" community over the last few years. Moreover, in its general
form of a public, shared, open data commons, it is one that the Open
Knowledge Foundation is directly seeking to promote and create, for
example via projects such as the Comprehensive Knowledge Archive
Network: http://www.ckan.net/.

However, it is worth noting a clear division in approaches to creating
such an "open data commons" between a more decentralized "debian-style"
approach and the more centralized "wikipedia-style" one. CKAN is more
along the debian-style model (see these recent slides [1]) while yours
is obviously more focused on building a "central" store. Both are
valuable, and complementary, though I guess it is unlikely that we're
ever going to have one store with all the world's data in it ;) so it's
important to think about how different such "wikipedias of data" will
talk to each other (or at least share information about their
existence) ...

Here are a few comments, specific to your effort (no doubt you're
already thinking about some of these):

1. Wikipedia as a project has/had a specific, and fairly well-defined,
   goal: making an open encyclopaedia. Specific subareas (news, travel
   etc) have spawned their own subprojects. This is an important point
   to bear in mind when trying to make a "wikipedia" for data. Not all
   data is the same and people interested in genomics may not be much
   interested in sports or the economy. Much of what makes Wikipedia (or
   any other open project) work is the community. Having a well-defined
   focus is important in creating and maintaining that community.

2. Data is different from (and likely harder than) text. For example,
   in data terms a project like Wikipedia is in fact still pretty
   small. If you want to build a wikipedia of data you'll need to deal
   with significant size and scaling issues.
   Furthermore, the tools for doing collaborative (versioned)
   development of data are still in their infancy compared to the
   situation for (unstructured) text -- everything from (good) diff
   tools to (distributed) versioning protocols [2].

3. I'd advise against using a Creative Commons Attribution-Sharealike
   license for data. CC licenses were designed for content (text, images
   etc) and aren't a great match for data (just as free/open code
   licenses weren't a good match for data). Instead I'd suggest using a
   license specifically designed for data. For example, Open Data
   Commons (http://www.opendatacommons.org/) have produced an
   attribution-sharealike license for data, the Open Database License
   (ODbL): <http://www.opendatacommons.org/licenses/odbl/>.

4. If you're looking for data that could be usefully entered into your
   system, you could take a look through http://www.ckan.net/. There
   are quite a few packages on there with data that needs to be stored
   somewhere, especially somewhere that supports collaborative editing.
   You may also want to get in contact with the people at
   http://scraperwiki.com/ (I imagine they are storing lots of data from
   the scraping they do ...).

Look forward to hearing more about this very interesting project.

Regards,

Rufus

[1]: http://m.okfn.org/files/talks/ccc_20091228/
[2]: http://blog.okfn.org/2007/02/20/collaborative-development-of-data/

--
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/

2009/12/22 Gordon Irlam <[email protected]>:
> Hi,
>
> Some friends and I are embarking on a project called freetable.org.
> We aim to create a public commons for shared data in much the same way
> Wikipedia created a public commons for textual data. That is, we seek
> to be a centralized real time repository for shared data.
> Some examples of the broad range of data we are considering:
>
> - classified, realty, job, and personal ads
>
> - customer reviews of products and businesses
>
> - app data for open source applications
>
> - geographic, scientific, and economic data
>
> Data will be able to be contributed by anyone, or at least initially,
> by any programmer, under an open license. Data will be in the
> standard table, record, field format. An interface similar to SQL
> will be provided for programmers to access the data. Permissions will
> be used to control who can modify the data.
>
> Rather than trying to carefully design the database tables we are
> going to support, we will allow programmers to create any database
> table on our system, and then see which tables prove popular.
>
> The purpose of this email is to gauge interest in freetable.org. We
> don't want to build something that isn't useful to people.
>
> So if you have or know of an open dataset that you would like to make
> use of via freetable, could you please reply letting us know what that
> dataset is.
>
> many thanks,
> gordon
>
> _______________________________________________
> okfn-discuss mailing list
> [email protected]
> http://lists.okfn.org/mailman/listinfo/okfn-discuss
