I found an article via Twitter that seems very pertinent to this discussion. I'll include the paragraphs that sum up the current data situation and what drives it:
"When datasets were sparse and only connected to the lab that produced them, we would brood every one of them, protect (patent) them and work on them in isolation in order to 'sell' them as chickens, usually in the form of a largely narrative article. Other scientists need to combine a minimum of two existing publications to generate new eggs and breed more chickens. However, chickens have become overabundant: more than 20 million articles exist in biomedicine alone. More recently, valuable aggregations of data were brought online (for example, data sets in GEO, curated databases such as SwissProt and locus-specific human gene variation databases (locus-specific databases such as the Leiden Open Variation Database LOVD). Now, data (eggs) have become a direct source of new in silico discoveries and a unit of scientific trade. But the scientific market has no way to value eggs because the entire system is built upon judging and exchanging chickens for acknowledgement and credit (through citations and other measures of impact). On the other hand, for effective and evidence-based breeding, we need the eggs as well as information from the parent chickens to assess the value of the eggs. This is where a major challenge lies: in the long overdue adaptation in scholarly communication. The data-intensive science wave that has come over us calls for innovative ways of data sharing, stewardship and valuation. We must respect the connection between the articles and the data and value both appropriately."

[Full article at: http://www.nature.com/ng/journal/v43/n4/full/ng0411-281.html (Nature Genetics 43, 281–283 (2011) doi:10.1038/ng0411-281) "The value of data" Abstract - "Data citation and the derivation of semantic constructs directly from datasets have now both found their place in scientific communication.
The social challenge facing us is to maintain the value of traditional narrative publications and their relationship to the datasets they report upon while at the same time developing appropriate metrics for citation of data and data constructs."]

On Fri, 2011-04-01 at 18:19 +0100, Peter Murray-Rust wrote:
> On Fri, Apr 1, 2011 at 4:10 PM, <[email protected]> wrote:
> > hi Rufus,
> >
> > On Fri, 1 Apr 2011 14:45:18 +0100, Rufus Pollock wrote
> > > Hi All,
> > >
> > > ...
> > >
> > > ## The Present: A One-Way Street
> > >
> > > At the current time, the basic model for data processing is a
> > > “one way street”. Sources of data, such as government, publish
> > > data out into the world, where (if we are lucky) it is processed
> > > by intermediaries such as app creators or analysts, before finally
> > > being consumed by end users.
> > >
> > > It is a one way street because there is no feedback loop, no
> > > sharing of data back to publishers and no sharing between
> > > intermediaries.
>
> Agreed. I have been working out these ideas at the Am. Chem. Soc. I
> came up with the term "asymmetric" - and this is well argued in
> Becky's chilling analysis. So Open Data is not just the crumbs that
> the peasants consume under the table.
>
> I also addressed the ecosystem aspect and here I use terms like
> "bottom-up" and "web-democratic". For me this describes Wikipedia,
> OKF, OSM and my own seeds in the ecosystem: BlueObelisk
> http://en.wikipedia.org/wiki/Blue_Obelisk and Quixote
> http://quixote.wikispot.org/Front_Page. They have been designed (or
> have evolved) to have no centre and no hierarchy - they work by
> "rough consensus and running code".
>
> Of course the ACS just continues to go ahead and copyright data
> (deliberately)...
>
> > > So what should be different?
> > >
> > > ## The Future: An Ecosystem
> > >
> > > With the introduction of data cycles we have a real ecosystem not a
> > > one way street and this ecosystem thrives on collaboration,
> > > componentization and open data.
>
> This is exactly how Blue Obelisk and Quixote work.
>
> The power of the ecosystem is that it can find very dilute resources
> and concentrate them (to use chemical terms). If there are (say)
> 100,000 chemists and 0.1% care about doing something for Openness then
> that's 100 activists and that is enough.
>
> P.
>
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
> _______________________________________________
> okfn-discuss mailing list
> [email protected]
> http://lists.okfn.org/mailman/listinfo/okfn-discuss
