On 2015-01-19 20:36, Larry Masinter wrote:
I just joined this list. I’m looking to help improve the story for Linked Open
Data in PDF, to lift PDF (and other formats) from one-star to five, perhaps
using XMP. I’ve found a few hints in the mailing list archive here.
http://lists.w3.org/Archives/Public/public-lod/2014Oct/0169.html
but I’m still looking. Any clues, problem statements, sample sites?
Larry
--
http://larry.masinter.net
Hi Larry,
First off, I totally acknowledge your interest in improving the state of
things for PDF.
I'm happy to be proven wrong, but for the "big picture", I don't
believe that LaTeX/XMP/PDF is the way to go for being LD-friendly -
perhaps the effort is better invested elsewhere. There are a number of
issues and shortcomings with the PDF approach which in the end will not
play well with what the Web is intended to be, nor with how it
functions. Most importantly, it is neither fault tolerant nor
machine-friendly (regardless of what can be stuffed into XMP), and it
will not scale. At the end of the day, PDF is a silo-document, its
rendering is a resource-hog on different devices, and it does not offer
a ubiquitous reading/interactive experience across devices.
For XMP/PDF to work, I presume you are going to end up dealing with
RDF/XML, and an appropriate interface for authors to mark up their
statements with. Keep in mind that this will most likely treat the data
as a separate island, disassociated from the context in which it appears.
May I invite you to read:
http://csarven.ca/enabling-accessible-knowledge
It covers my position in sufficient depth - not intended to be overly
technical, but rather covering the ground rules and ongoing work.
While you are at it, please do a quick print-view from your Web browser
(preferably in Firefox) or print to PDF.
The RDF bits are visible here:
http://www.w3.org/2012/pyRdfa/extract?uri=http%3A%2F%2Fcsarven.ca%2Fenabling-accessible-knowledge&rdfa_lite=false&vocab_expansion=false&embedded_rdf=true&validate=yes&space_preserve=true&vocab_cache_report=false&vocab_cache_bypass=false
I will spare you the details on what's going on there, unless you really
want to know, but to put it in a nutshell: it covers statements dealing
with sections, provenance, and references/citations.
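If you want to run the same extraction against another page, the query
string above can be built programmatically. Here is a minimal sketch in
Python, assuming the pyRdfa service keeps accepting the parameters exactly
as they appear in the URL above. The helper name extract_url is my own
and is not part of pyRdfa.

```python
from urllib.parse import urlencode

# Endpoint and parameter values are copied from the URL above.
PYRDFA_EXTRACT = "http://www.w3.org/2012/pyRdfa/extract"


def extract_url(page_uri):
    """Build a pyRdfa extract URL for page_uri (hypothetical helper)."""
    params = [
        ("uri", page_uri),
        ("rdfa_lite", "false"),
        ("vocab_expansion", "false"),
        ("embedded_rdf", "true"),
        ("validate", "yes"),
        ("space_preserve", "true"),
        ("vocab_cache_report", "false"),
        ("vocab_cache_bypass", "false"),
    ]
    # urlencode percent-encodes the page URI so it is safe in a query string.
    return PYRDFA_EXTRACT + "?" + urlencode(params)


if __name__ == "__main__":
    print(extract_url("http://csarven.ca/enabling-accessible-knowledge"))
```

Passing a different page URI gives the equivalent extraction link for
that page.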
Here is another paper: http://linked-reseach.270a.info/ (which can just
as well be a PDF - after all, PDF is just a view), which in addition to
the above, includes more atomic things like hypotheses, variables, and
workflows.
The work is based on Linked Research:
https://github.com/csarven/linked-research
If you are comfortable with your browser's developer toolbar, try
changing the stylesheet lncs.css in <head> to acm.css.
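
The same swap can be done outside the browser on a saved copy of the
page. This is a rough sketch in Python, not how Linked Research itself
does it: the <link> markup and the media/css/ path in the sample are
assumptions for illustration, and swap_stylesheet is a made-up helper.

```python
import re


def swap_stylesheet(html, old="lncs.css", new="acm.css"):
    """Replace `old` with `new` inside <link ...> tags only (hypothetical helper)."""

    def fix_link(match):
        # Only the text of the matched <link> tag is rewritten.
        return match.group(0).replace(old, new)

    return re.sub(r"<link\b[^>]*>", fix_link, html, flags=re.IGNORECASE)


# Assumed markup; the real page's <head> may differ.
SAMPLE = (
    '<head><link rel="stylesheet" href="media/css/lncs.css"/></head>'
    "<body><p>Styled with lncs.css by default.</p></body>"
)

if __name__ == "__main__":
    print(swap_stylesheet(SAMPLE))
```

The content and structure of the document are untouched; only the
reference to the presentation layer changes.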
There is a whole behavioural/interactive layer which I'll skip over now,
but you can take a look at it if you fancy JavaScript.
As you may have already noticed, the HTML template is flexible enough
for "blog" posts and "papers" - again, this is about separating the
structure/content from the other layers: presentation, and behaviour.
Any feedback or questions are always welcome!
-Sarven
http://csarven.ca/#i