Thanks for the quick feedback!

You're right that I should have implemented Turtle output. I've done
that now, and this is the result (as you'd expect):

<urn:artist:Best%20Coast> nmm:artistName "Best Coast" ;
  rdf:type nmm:Artist .

<urn:album:The%20Only%20Place> nmm:albumTitle "The Only Place" ;
  rdf:type nmm:MusicAlbum ;
  nmm:albumArtist <urn:artist:Best%20Coast> .

<urn:album-disc:The%20Only%20Place:Disc1> nmm:setNumber 1 ;
  nmm:albumDiscAlbum <urn:album:The%20Only%20Place> ;
  rdf:type nmm:MusicAlbumDisc .

<file:///home/sam/Downloads/Best%20Coast%20-%20The%20Only%20Place.mp3>
nie:comment "Free download from http://www.last.fm/music/Best+Coast
and http://MP3.com" ;
  nmm:trackNumber 1 ;
  nmm:performer <urn:artist:Best%20Coast> ;
  nfo:averageBitrate 128000 ;
  nmm:musicAlbum <urn:album:The%20Only%20Place> ;
  nfo:channels 2 ;
  nmm:dlnaProfile "MP3" ;
  nmm:musicAlbumDisc <urn:album-disc:The%20Only%20Place:Disc1> ;
  rdf:type nmm:MusicPiece , nfo:Audio ;
  nfo:duration 164 ;
  nfo:codec "MPEG" ;
  nmm:dlnaMime "audio/mpeg" ;
  nfo:sampleRate 44100 ;
  nie:title "The Only Place" .


I'm still kinda interested in JSON-LD, because JSON (though not
JSON-LD) has such a massive user base already. Phillip, JSON-LD *is* a
W3C standard: <https://www.w3.org/TR/json-ld/>. The great thing about
standards is there are so many!

That said, all the W3C's previous attempts at RDF-in-JSON are quite
bad, and I think JSON-LD is definitely an improvement. There's a great
blog post from the main guy behind the standard called "JSON-LD and
Why I Hate the Semantic Web" which I recommend reading :-)
<http://manu.sporny.org/2014/json-ld-origins-2/>

Anyway, for my purposes, Turtle output from the extractors is fine
(and a big improvement on SPARQL). I'll keep the JSON-LD stuff around
in a separate commit.


On Sat, Apr 9, 2016 at 12:49 PM, Carlos Garnacho <carl...@gnome.org> wrote:
> Hey Sam :),
>
>> so, inspired by something in the Python RDFLib library, I came up with a
>> TrackerResource class that the extractors can use instead. This is a
>> work in progress, but I have a branch in git.gnome.org that adds
>> TrackerResource, and converts some of the extractors to use it. The
>> TrackerResource class can serialize either to SPARQL update commands or
>> to JSON-LD. The branch also adds the `tracker extract` command from
>> <https://bugzilla.gnome.org/show_bug.cgi?id=751991> so you can try out
>> the extractors easily and specify `-o json` or `-o sparql` as you prefer.
>
> Nice! Should it have a turtle serializer too? Do you think this can be
> possibly used in the tracker store side to serialize contents?

I hadn't thought of that, but it's definitely possible. You could have
a `tracker serialize-the-whole-database` command :-)

In terms of backups, part of me thinks we should use an efficient
binary format, but then it's hard to trust a backup that is an opaque
binary format. If we could serialize to Turtle or JSON-LD then you
could tell just by looking whether it was valid or not. We can just
gzip it to make it small.
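
To illustrate the "just gzip it" idea, here is a minimal Python sketch
using only the standard library. The helper names and the sample data
are made up for illustration and are not part of Tracker; the point is
only that the decompressed backup is still plain Turtle you can read.

```python
import gzip


def compress_turtle(turtle_text: str) -> bytes:
    """Gzip a Turtle document (hypothetical helper, not a Tracker API)."""
    return gzip.compress(turtle_text.encode("utf-8"))


def decompress_turtle(blob: bytes) -> str:
    """Recover the human-readable Turtle text from a gzipped backup."""
    return gzip.decompress(blob).decode("utf-8")


# Made-up sample: one resource repeated to mimic a larger database dump.
sample = (
    '<urn:artist:Best%20Coast> nmm:artistName "Best Coast" ;\n'
    "  rdf:type nmm:Artist .\n\n"
) * 200

blob = compress_turtle(sample)
restored = decompress_turtle(blob)
```

Turtle is repetitive (the same prefixes and predicates over and over),
so it should compress well, and `zcat` on the backup file would let you
check the contents by eye.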

...
>>
>> Here's an example of auto-generated SPARQL for an MP3 extraction:
>
> <snip>
>>
>>
>> Note there are a lot more DELETE statements than before. I figured that
>> anywhere we want to replace the existing data we need a DELETE
>> statement, and the reason we don't normally do it is because previously
>> it had to be done manually. That said, the TrackerResource class does
>> have a way of avoiding this. If you ever call _set_value() for a
>> property then
>> it assumes you want to *overwrite* it, and will generate a DELETE. If you
>> only use _add_value() then it will assume you want to *add* to it, and won't
>> generate a DELETE. The latter case is needed for stuff like nao:hasTag.
>> I may be misunderstanding things here of course, I didn't actually write any
>> of the extractors myself.
>
> Sounds good :). It seems to me that the generated SPARQL already
> ensures some correctness, which is great. The difference between set
> and add makes sense, given that we have to deal with single and
> multivalued properties. The only potentially harmful combination would
> be doing add_value() on a single valued property, is there any way
> that could raise a warning in tracker-extract, rather than being
> caught late due to the failed insert?

I don't think that's possible because libtracker-sparql doesn't have
any knowledge of the ontologies. We could move a bunch of code from
libtracker-data to libtracker-sparql to make it happen, but I actually
think it's a good design to have libtracker-sparql separate from
Tracker's own database and Tracker's own ontologies.
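
For anyone following along, here is a rough Python sketch of the
set/add distinction quoted above. It is a toy model and not the real
TrackerResource code (which lives in the branch); the class name,
method names, and example URIs are all assumptions for illustration,
and prefix declarations are left out. Note that it has no ontology
knowledge either, so it can't tell a single-valued property from a
multi-valued one.

```python
class ResourceSketch:
    """Toy stand-in for the idea; not the real TrackerResource API."""

    def __init__(self, uri):
        self.uri = uri
        self.values = {}        # property -> list of pre-serialized values
        self.overwrite = set()  # properties that were set via set_value()

    def set_value(self, prop, value):
        # Replace whatever was recorded and remember that a DELETE is needed.
        self.values[prop] = [value]
        self.overwrite.add(prop)

    def add_value(self, prop, value):
        # Append only; this alone never causes a DELETE to be generated.
        self.values.setdefault(prop, []).append(value)

    def to_sparql_update(self):
        statements = []
        # One DELETE per overwritten property, so old values are dropped.
        for prop in sorted(self.overwrite):
            statements.append(
                "DELETE { <%s> %s ?v } WHERE { <%s> %s ?v }"
                % (self.uri, prop, self.uri, prop)
            )
        triples = [
            "<%s> %s %s" % (self.uri, prop, value)
            for prop, vals in self.values.items()
            for value in vals
        ]
        statements.append("INSERT DATA { %s }" % " . ".join(triples))
        return " ;\n".join(statements)


# Hypothetical usage: the title is overwritten, the tag is only added.
track = ResourceSketch("urn:example:track")
track.set_value("nie:title", '"The Only Place"')
track.add_value("nao:hasTag", "<urn:example:tag:indie>")
update = track.to_sparql_update()
```

Here only nie:title gets a DELETE; nao:hasTag is inserted alongside
whatever tags are already stored.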

Sam
_______________________________________________
tracker-list mailing list
tracker-list@gnome.org
https://mail.gnome.org/mailman/listinfo/tracker-list
