Re: future of rdflib (Was: Re: [rdflib-dev] text indexing)

Chimezie Ogbuji Sat, 06 May 2006 11:53:25 -0700

I think the first and most important design goal for rdflib is to
*correctly* serialize RDF graphs into one or more literal syntaxes.
The really, really, really primitive datatyping support works against
this design goal.


NTriples serialization is currently hampered (to my understanding)
only by escaping.  The lack of 'complete' datatyping support is not so
much an issue for serialization.  For instance, the sparql-p
documentation
(http://dev.w3.org/cvsweb/~checkout~/2004/PythonLib-IH/Doc/sparqlDesc.html?rev=1.8#literals)
explicitly mentioned datatypes, but the issue there was proper
comparison, not serialization.  I'm not sure if the concerns in that
document were addressed when sparql-p was ported.  I think NTriples
and N3 serialization of literals is straightforward once the Literal
is properly persisted with its lexical value and its datatype URI or
language tag.
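
To make the point about escaping concrete, here is a minimal sketch of
how a literal could be written out in NTriples once its lexical value,
datatype URI and language tag are known.  The function names
(escape_ntriples, ntriples_literal) are made up for illustration and
are not rdflib API; the escaping assumes the ASCII-only string rules of
the original N-Triples grammar.

```python
def escape_ntriples(text):
    """Escape a lexical value for an ASCII-only N-Triples string.

    Hypothetical helper, not part of rdflib.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= cp < 0x7F:
            # Printable ASCII passes through unchanged.
            out.append(ch)
        elif cp <= 0xFFFF:
            # Other control characters and BMP code points.
            out.append("\\u%04X" % cp)
        else:
            # Code points outside the BMP.
            out.append("\\U%08X" % cp)
    return "".join(out)


def ntriples_literal(lexical, datatype=None, lang=None):
    """Serialize one literal; a literal has a datatype or a language, not both."""
    if datatype is not None and lang is not None:
        raise ValueError("a literal cannot have both a datatype and a language tag")
    quoted = '"%s"' % escape_ntriples(lexical)
    if lang:
        return "%s@%s" % (quoted, lang)
    if datatype:
        return "%s^^<%s>" % (quoted, datatype)
    return quoted


print(ntriples_literal("5", datatype="http://www.w3.org/2001/XMLSchema#int"))
print(ntriples_literal("chat", lang="fr"))
```

Nothing in the sketch needs to understand the datatype itself; it only
needs the three stored pieces, which is why incomplete datatyping
support mostly affects comparison rather than serialization.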

Why a method? Even easier, just have literal's constructor do a
Python type() of the constructor arg, map the Python type to the XSD
datatype URI, and use that. If there is no mapping, or it's
ambiguous, *then* make an untyped literal.


How about this as a comprehensive 'mechanism' for Literals?

1) An explicit, formal mapping from Python types to XSD datatypes is
written into the Literal module.
2) The constructor does some introspection on the value given, using
this formal mapping.  The default (a string/unicode value would be the
most common case) would be untyped.
3) The __repr__ should (where it's feasible) return a string that can
be 'evaled' to produce the Python equivalent of the Literal:  xsd:int
-> int(..lexical value..), xsd:boolean -> bool(..lexical value..), etc.
This doesn't work 'directly' in all cases (xsd:date and xsd:dateTime,
for example, since Python's date and datetime types are not
builtins).  This follows Python's basic customization convention:
http://docs.python.org/ref/customization.html
4) Optionally, an additional toPython() method can be defined which
returns the equivalent Python object using __repr__ (where the string
can be evaled) or an explicit conversion via the formal mapping.
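
A rough sketch of the four points above, to show how they might fit
together.  SketchLiteral, PYTHON_TO_XSD and XSD_TO_PYTHON are
hypothetical names, not the rdflib Literal class, and the mapping is
only an illustrative subset (mapping Python int to xsd:integer is an
assumption).

```python
import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"

# Point 1: explicit mapping from Python types to XSD datatype URIs.
PYTHON_TO_XSD = {
    bool: XSD + "boolean",
    int: XSD + "integer",
    float: XSD + "double",
    datetime.datetime: XSD + "dateTime",
    datetime.date: XSD + "date",
}


def _parse_boolean(lexical):
    # bool("false") is True in Python, so parse the lexical form explicitly.
    if lexical in ("true", "1"):
        return True
    if lexical in ("false", "0"):
        return False
    raise ValueError("not an xsd:boolean lexical form: %r" % lexical)


# Reverse mapping used by toPython() (point 4).
XSD_TO_PYTHON = {
    XSD + "boolean": _parse_boolean,
    XSD + "integer": int,
    XSD + "int": int,
    XSD + "double": float,
    XSD + "dateTime": datetime.datetime.fromisoformat,
    XSD + "date": datetime.date.fromisoformat,
}


def _lexical_form(value):
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()
    return str(value)


class SketchLiteral(object):
    def __init__(self, value, datatype=None, lang=None):
        # Point 2: introspect the value; plain strings stay untyped.
        # An exact type() lookup keeps bool from being treated as int.
        if datatype is None and lang is None:
            datatype = PYTHON_TO_XSD.get(type(value))
        self.lexical = value if isinstance(value, str) else _lexical_form(value)
        self.datatype = datatype
        self.lang = lang

    def toPython(self):
        # Point 4: explicit conversion via the mapping, else the lexical value.
        convert = XSD_TO_PYTHON.get(self.datatype)
        return convert(self.lexical) if convert else self.lexical

    def __repr__(self):
        # Point 3: an eval-able string where the Python value is a builtin.
        value = self.toPython()
        if self.lang is None and type(value) in (bool, int, float, str):
            return repr(value)
        return "SketchLiteral(%r, datatype=%r, lang=%r)" % (
            self.lexical, self.datatype, self.lang)


print(repr(SketchLiteral(5)), SketchLiteral(True).lexical)
print(SketchLiteral("2006-05-06", datatype=XSD + "date").toPython())
```

One caveat the sketch works around: calling bool() on a lexical value
such as "false" returns True in Python, so xsd:boolean needs an
explicit parser rather than a direct bool(..lexical value..) call.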

Well, exactly right, and still no one's written a parser for it. And
there were Versa ports around and they never got integrated. rdflib
needs query language support if it's going to be useful and actually
used in serious applications.


The eventual goal with regard to Versa was to port it into rdflib
(per: http://rdflib.net/4rdf_rdflib_migration/), but there hasn't been
enough bandwidth to do that 'properly', so the current API mapping is
a temporary solution.

Now, if this is a technically separate package that's (1) documented,
(2) easy to install, (3) supported, I don't care if it's technically
separate. Someone can make a sumo build and us *users* can get on w/
our lives and projects. :>


I have (on my todo pile) an idea for a brief by-example tutorial on the
Graph / Store API.  I'll post my first draft here to see what people
think. The idea is that the write-up would go hand in hand with API
documentation (which can be, and is, easily automated).  Upon
integration of 1) a data binding library and 2) SPARQL/Versa query
processing libraries, you no longer have software dependency issues and
the installation becomes as easy as distutils allows it to be.

It sounds to me like datatype support (introspection on Python values
and casting to Python datatypes), RDF querying libraries, and
documentation (in prose to complement API documentation) are the
lowest hanging fruit with regard to the road map.

Chimezie

_______________________________________________
Dev mailing list
[email protected]
http://rdflib.net/mailman/listinfo/dev
