Re: [okfn-discuss] freetable.org: Expressions of interest sought

Rufus Pollock Mon, 11 Jan 2010 10:50:37 -0800

Dear Gordon,

Great to hear from you. freetable.org sounds like a really interesting project.


The "wikipedia" of data idea is one that has come up several times in
the "open" community over the last few years. Moreover, in its general
form of a public, shared, open data commons, it is one that the Open
Knowledge Foundation is directly seeking to promote and create, for
example via projects such as the Comprehensive Knowledge Archive
Network: http://www.ckan.net/.

However, it is worth noting a clear division in approaches to creating
such an "open data commons" between a more decentralized
"debian-style" approach and the more centralized "wikipedia-style"
one. CKAN is more along the debian-style model (see these recent
slides [1]) while yours is obviously more focused on building a
"central" store. Both are valuable, and complementary, though I guess
it is unlikely that we're ever going to have one store with all the
world's data in it ;) so it's important to think about how different
such "wikipedias of data" will talk to each other (or at least share
information about their existence) ...

Here are a few comments, specific to your effort (no doubt you're
already thinking about some of these):

1. Wikipedia as a project has/had a specific and fairly well-defined
goal: making an open encyclopaedia. Specific subareas (news, travel
etc) have spawned their own subprojects. This is an important point to
bear in mind when trying to make a "wikipedia" for data. Not all data
is the same and people interested in genomics may not be much
interested in sports or the economy. Much of what makes Wikipedia (or
any other open project) work is the community. Having a well-defined
focus is important in creating and maintaining that community.

2. Data is different from (and likely harder than) text. For example, in
data terms a project like Wikipedia is in fact still pretty small. If
you want to build a wikipedia of data you'll need to deal with
significant size and scaling issues. Furthermore, the tools for doing
collaborative (versioned) development of data are still in their
infancy compared to the situation for (unstructured) text --
everything from (good) diff tools to (distributed) versioning
protocols [2].

3. I'd advise against using a Creative Commons Attribution-Sharealike
license for data. CC licenses were designed for content (text, images
etc) and aren't a great match for data (just as free/open code
licenses weren't a good match for data). Instead I'd suggest using a
license specifically designed for data. For example Open Data Commons
(http://www.opendatacommons.org/) have produced an
attribution-sharealike license for data, the Open Database License
(ODbL): <http://www.opendatacommons.org/licenses/odbl/>.

4. If you're looking for data that could be usefully entered into your
system, you could take a look through http://www.ckan.net/. There are
quite a few packages on there with data that needs to be stored
somewhere, especially somewhere that supports collaborative editing.
You may also want to get in contact with the people at
http://scraperwiki.com/ (I imagine they are storing lots of data from
the scraping they do ...).

Look forward to hearing more about this very interesting project.

Regards,

Rufus

[1]: http://m.okfn.org/files/talks/ccc_20091228/
[2]: http://blog.okfn.org/2007/02/20/collaborative-development-of-data/
-- 
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/

2009/12/22 Gordon Irlam <[email protected]>:
> Hi,
>
> Some friends and I are embarking on a project called freetable.org.
> We aim to create a public commons for shared data in much the same way
> Wikipedia created a public commons for textual data.  That is, we seek
> to be a centralized real time repository for shared data.  Some
> examples of the broad range of data we are considering:
>
>    - classified, realty, job, and personal ads
>
>    - customer reviews of products and businesses
>
>    - app data for open source applications
>
>    - geographic, scientific, and economic data
>
> Data will be able to be contributed by anyone, or at least initially,
> by any programmer, under an open license.  Data will be in the
> standard table, record, field format.  An interface similar to SQL
> will be provided for programmers to access the data.  Permissions will
> be used to control who can modify the data.
>
> Rather than trying to carefully design the database tables we are
> going to support, we will allow programmers to create any database
> table on our system, and then see which tables prove popular.
>
> The purpose of this email is to gauge interest in freetable.org.  We
> don't want to build something that isn't useful to people.
>
> So if you have or know of an open dataset that you would like to make
> use of via freetable, could you please reply letting us know what that
> dataset is.
>
> many thanks,
> gordon
>
> _______________________________________________
> okfn-discuss mailing list
> [email protected]
> http://lists.okfn.org/mailman/listinfo/okfn-discuss
>



