To follow up on this, do you think it would be possible to create a generic GRDDL transformation that would extract information from any well-structured XHTML table, using the scoped <th> row and column headers?
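
As a rough illustration of what I mean, here is a minimal sketch, assuming each header cell carries scope="col" or scope="row" and that there are no colspan/rowspan attributes. It is in Python only to show the logic; an actual GRDDL transformation would normally be an XSLT stylesheet. The function name and the sample table are made up for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Invented sample table, purely illustrative.
SAMPLE = """<table xmlns="http://www.w3.org/1999/xhtml">
  <tr><th/><th scope="col">control</th><th scope="col">treated</th></tr>
  <tr><th scope="row">geneA</th><td>1.2</td><td>3.4</td></tr>
  <tr><th scope="row">geneB</th><td>0.7</td><td>0.9</td></tr>
</table>"""


def table_to_statements(xhtml):
    """Return (row header, column header, cell text) for each data cell.

    Assumes one header row with scope="col" cells and one scope="row"
    header cell per data row; spanning cells are not handled.
    """
    table = ET.fromstring(xhtml)
    col_headers = {}  # cell position within the row -> column header text
    statements = []
    for tr in table.iter(XHTML_NS + "tr"):
        row_header = None
        for index, cell in enumerate(tr):
            text = "".join(cell.itertext()).strip()
            scope = cell.get("scope")
            if cell.tag == XHTML_NS + "th" and scope == "col":
                col_headers[index] = text
            elif cell.tag == XHTML_NS + "th" and scope == "row":
                row_header = text
            elif (cell.tag == XHTML_NS + "td"
                  and row_header is not None
                  and index in col_headers):
                statements.append((row_header, col_headers[index], text))
    return statements


for statement in table_to_statements(SAMPLE):
    print(statement)
```

Each tuple is effectively a subject/predicate/object statement, so mapping the two headers to URIs would be the remaining step towards RDF.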

alf.

On 19 Feb 2006, at 15:07, Alf Eaton wrote:


I've been trying to decide on a good way to provide tabular data in papers using XHTML, for presentation online. The best options seem to be either embedding the data as a JSON array, or using tables with class and id markup that can be processed with GRDDL or JavaScript to transform the data. Has there been any work on presenting spreadsheets in XHTML?

alf.

On 19 Feb 2006, at 12:17, Eric Neumann wrote:


Matt,

Spreadsheets are indeed useful as formatted sources that can be readily converted into RDF. We've used them as the primary source of expression data for BioDash (see attached averages; full GeneLogic data at http://www.samsi.info/200304/dmml/web-internal/bio/data/data_rsvd.xls ). It almost seems a mapping tool could be written that takes any Excel file and applies a GRDDL-like conversion of column headers, row headers, and cells to produce RDF (see the example).
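
To make the mapping idea concrete, here is a minimal sketch, assuming the sheet has been exported as CSV with column headers in the first row and a row identifier in the first column. It is not the BioDash conversion script; the http://example.org/ namespace, the function name, and the sample values are all invented for illustration.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; a real mapping would use agreed URIs.
BASE = "http://example.org/expr/"

# Invented sample data standing in for an exported spreadsheet.
SAMPLE_CSV = "probe,liver,kidney\nP001,12.5,8.1\nP002,3.0,4.4\n"


def sheet_to_ntriples(csv_text, base=BASE):
    """Emit one N-Triples line per data cell.

    The row header (first column) becomes the subject, the column
    header becomes the predicate, and the cell value a plain literal.
    """
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows)
    lines = []
    for row in rows:
        subject = f"<{base}row/{quote(row[0])}>"
        for column, value in zip(header[1:], row[1:]):
            predicate = f"<{base}column/{quote(column)}>"
            # Escape backslashes and double quotes inside the literal.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


for line in sheet_to_ntriples(SAMPLE_CSV):
    print(line)
```

Swapping the generated row and column URIs for real identifiers (for example, symbol strings mapped to gene URIs) is where the actual mapping work would lie.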

In our example, we wrote the conversion scripts directly into the Excel file. The resulting (adenine/N3) file is shown as well, with symbol strings mapped to URIs. The cool thing here is that if you add a DB query using the symbol strings (we did this within BioDash), you can take the returned gene information, convert it to RDF, and connect it to the expression graph through the probes for each row (see resulting adenine file).

Perhaps the BIORDF group should include using sdf sources as part of their overall strategy for producing RDF from current structured files (e.g., gene expression, screening, and clinical data in sdf). Many published papers have data tables, and this would be a great way to auto-convert them to RDF!

Eric

--- Matthew Cockerill <[EMAIL PROTECTED]> wrote:


I couldn't agree more.

Spreadsheets (and equivalently, CSV files) are a large fraction of the 'additional datafiles' that BioMed Central receives from authors.

What would be great would be to be able to define some simple standards and/or templates which authors could follow in their spreadsheets, to allow the automatic recognition of key life science identifiers, and quantitative attributes, and so the generation of RDF.

From my point of view, that's the most basic, practical and prevalent example of the whole semi-structured data, and so seems like a good starting point.

Matt

On 15 Feb 2006, at 5:42, Cutler, Roger (RogerCutler) wrote:


That's too deep for me. I'll be satisfied, at least in an immediate sense, with a demonstration of how to generate RDF from an Excel spreadsheet. I think I'll just start saying "Excel spreadsheet" and forget about the term that we use internally to categorize the kinds of problems we have. Spreadsheets are pretty much the 80-20 of that problem, so why not call a spade a spade. I'm really not very good at generalizing and categorizing.


