I agree with what Bjoern and Mark have said. It is imperative that we develop a new set of tools, and most of what we need is already in place.
For my part I am launching "the Content Mine" over these few days. The goal is simple - to extract 100,000,000 facts from the scholarly scientific literature. See https://vimeo.com/78353557 (5-minute video), http://www.slideshare.net/petermurrayrust/the-content-mine-presented-at-uksg and the many recent posts on http://blogs.ch.cam.ac.uk/pmr/

I would very much welcome help. I have been offered some from outside academia - it would be nice to have some academics who also believe in liberation.

This is not vapourware. I demo'ed this at OKFN/Open Science in Oxford last Wednesday. I am starting with "Open Access" papers, such as those in PLoSONE, and once it has been tested there I will move on to other outlets. These papers can be queried for a wide range of scientific facts such as species (where we start), chemicals, sequences, geolocations, identifiers, phylogenetic trees, etc. We have means of publishing this and means of capturing it. Everything - code, protocols, extractions, stores, etc. - is fully Open (OKD-compliant).

This has the potential to act as a semantic current-awareness system and also as a scientific search engine. At present there is no Open search engine for science, except Wikipedia. As Bjoern and Mark have made clear, we must create one - and rapidly. Otherwise we shall remain completely reliant on the charity of mega-corporations - do we trust them?

I have applied for a personal grant to work on this. I will be delighted to work with anyone else, outside or inside academia - all my software is Open for anyone to re-use without my permission.

Only by making science immediately Open (OKD-compliant) at the time it is published do we have Open Access in the true (BOAI) sense of the word.

On Sun, Dec 1, 2013 at 2:27 PM, Bjoern Brembs <b.bre...@gmail.com> wrote:

> On Saturday, November 30, 2013, 12:30:54 AM, you wrote:
>
> > The technology to do all of this already exists. Most of
> > the STEM metadata you describe is actually directly
> > available in Medline, and the core parts can be used as
> > per the open biblio principles. Crawling the websites is
> > already possible using pubcrawler and other tools, and
> > finding out what their stated licence status is can be
> > done with howopenisit (although more often than not the
> > answer is "not properly defined").
>
> Precisely!!
>
> > However the hard part is not building or running these
> > things or collecting all the data, but sustaining it
> > and imbuing it with credibility.
>
> Totally agreed!
>
> > For example I can run a server with all this on it at not
> > too much personal expense, but who would treat it as a
> > serious source? Scaling up to handle a large number of
> > users and providing a good service does cost money, which
> > I (we) could probably find a way to fund - but even then,
> > we still have to solve that credibility problem. It has to
> > be known by those in or entering the field that "this is
> > where you go to find this stuff" - as opposed to the
> > current "go to the library and follow all the rules" approach.
>
> What we should be able to do right now (and for some of that we're
> applying for grants as I type this) is to start building the
> infrastructure for software and data. This will provide the opportunity to
> develop standards for how to make the databases for text (repositories),
> data and software interoperable.
>
> Simultaneously, these standards need to be communicated and adopted by a
> critical mass of institutions.
>
> But perhaps most importantly, the institutions participating in crawling
> and harvesting all our literature need to develop a way of searching,
> filtering and sorting not only the existing literature, but especially the
> incoming, new literature in a way that is superior to what we have now.
> Given that there isn't really a single place where you can exhaustively
> search the literature, the first part should be easy (existing literature).
>
> For the second part (incoming, newly published literature), we're
> currently in the process of developing an RSS reader which is tailor-made
> for scientists.
>
> Thus, if there is a superior way to handle the literature that
> outcompetes everything we have right now (again, not too difficult), people
> will go there, simply because they save time and effort that way.
>
> The next step will be an authoring tool that allows collaborative writing
> with scientific referencing and peer-review. There are currently several
> initiatives developing that environment. Once this is running, submission
> will be as simple as hitting 'submit'. Everybody who has ever submitted to
> a journal knows how people will flock to a service that allows submission
> with a single click.
>
> Thus, I agree, this will be the important part, but offering a superior
> way should do most of the work - just look at how quickly GScholar was
> accepted.
>
> Cheers,
>
> Bjoern
>
> --
> Björn Brembs
> ---------------------------------------------
> http://brembs.net
> Neurogenetics
> Universität Regensburg
> Germany
>
> _______________________________________________
> open-access mailing list
> open-acc...@lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-access
> Unsubscribe: http://lists.okfn.org/mailman/options/open-access
>

--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dept. of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
_______________________________________________
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal