Hi Michel,

On 12/05/2013 07:05 PM, Michel Dumontier wrote:
Hi all, As you may know, Bio2RDF produces RDF dumps of its RDF datasets [1,2]. For each dataset, we generate a dataset description file (as per [3]; example [4]) in n-triples format, while the dataset itself consists of one or more *gzipped* n-triples files. I just noticed that LODStats did not correctly parse [5] these files to generate the dataset statistics, owing, perhaps, to the assignment of "application/x-ntriples" in the relevant datahub.io resource metadata. I'd like to know what mime type we should specify for zipped or gzipped RDF data.
If you assume that the recipient will want to unzip them before parsing (as opposed to parsing *while* unzipping) then you could use a normal RDF MIME type but specify a gzip HTTP Content-Encoding:
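A minimal sketch of that idea in Python, using only the standard library. The media type "application/n-triples" and the helper names `describe_gzipped_rdf` and `read_rdf_payload` are assumptions chosen for illustration; they are not taken from Bio2RDF, LODStats, or datahub.io.

```python
import gzip
from typing import Dict, Tuple

# Assumption: "application/n-triples" stands in for "a normal RDF MIME type".
RDF_MIME = "application/n-triples"


def describe_gzipped_rdf(raw_ntriples: bytes) -> Tuple[Dict[str, str], bytes]:
    """Server side: label the payload as RDF and mark gzip as an encoding."""
    body = gzip.compress(raw_ntriples)
    headers = {
        "Content-Type": RDF_MIME,    # what the data is
        "Content-Encoding": "gzip",  # how the bytes were encoded for transfer
        "Content-Length": str(len(body)),
    }
    return headers, body


def read_rdf_payload(headers: Dict[str, str], body: bytes) -> bytes:
    """Client side: undo the content encoding before handing bytes to a parser."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        return gzip.decompress(body)
    return body


# Hypothetical example triple, used only to show the round trip.
triple = b'<http://example.org/s> <http://example.org/p> "o" .\n'
headers, body = describe_gzipped_rdf(triple)
assert read_rdf_payload(headers, body) == triple
```

The point of the split is that the Content-Type keeps describing the RDF serialization, so a consumer that honours Content-Encoding sees plain n-triples after decoding.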
http://stackoverflow.com/questions/864448/how-to-set-content-encoding-with-gzip

David
As we prepare for our next release, we're planning to generate n-quads for the datasets, thereby linking versioned datasets with their metadata. We are wondering whether there will be sufficient support for this format. Also, we are wondering whether it would be problematic to provide single file downloads that are tar.gz formatted.

Comments and suggestions most welcome,

m.

[1] http://bio2rdf.org/datasets
[2] http://download.bio2rdf.org/
[3] https://github.com/bio2rdf/bio2rdf-scripts/wiki/Bio2RDF-Dataset-Provenance
[4] http://download.bio2rdf.org/current/affymetrix/bio2rdf-affymetrix-20121004.nt
[5] http://stats.lod2.eu/rdfdocs?search=bio2rdf

--
Michel Dumontier
Associate Professor of Medicine (Biomedical Informatics), Stanford University
Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
http://dumontierlab.com