Thanks for the quick feedback! You're right that I should have
implemented Turtle output. I've done that now; this is the result (as
you'd expect):

<urn:artist:Best%20Coast>
    nmm:artistName "Best Coast" ;
    rdf:type nmm:Artist .

<urn:album:The%20Only%20Place>
    nmm:albumTitle "The Only Place" ;
    rdf:type nmm:MusicAlbum ;
    nmm:albumArtist <urn:artist:Best%20Coast> .

<urn:album-disc:The%20Only%20Place:Disc1>
    nmm:setNumber 1 ;
    nmm:albumDiscAlbum <urn:album:The%20Only%20Place> ;
    rdf:type nmm:MusicAlbumDisc .

<file:///home/sam/Downloads/Best%20Coast%20-%20The%20Only%20Place.mp3>
    nie:comment "Free download from http://www.last.fm/music/Best+Coast and http://MP3.com" ;
    nmm:trackNumber 1 ;
    nmm:performer <urn:artist:Best%20Coast> ;
    nfo:averageBitrate 128000 ;
    nmm:musicAlbum <urn:album:The%20Only%20Place> ;
    nfo:channels 2 ;
    nmm:dlnaProfile "MP3" ;
    nmm:musicAlbumDisc <urn:album-disc:The%20Only%20Place:Disc1> ;
    rdf:type nmm:MusicPiece , nfo:Audio ;
    nfo:duration 164 ;
    nfo:codec "MPEG" ;
    nmm:dlnaMime "audio/mpeg" ;
    nfo:sampleRate 44100 ;
    nie:title "The Only Place" .

I'm still kinda interested in JSON-LD, because JSON (though not
JSON-LD) has such a massive user base already.

Phillip, JSON-LD *is* a W3C standard: <https://www.w3.org/TR/json-ld/>.
The great thing about standards is there are so many! That said all the
W3C's previous attempts at RDF-in-JSON are quite bad, I think JSON-LD is
definitely an improvement. There's a great blog post from the main guy
behind the standard called "JSON-LD and Why I Hate the Semantic Web"
which I recommend reading :-)
<http://manu.sporny.org/2014/json-ld-origins-2/>

Anyway, for my purposes, Turtle output from the extractors is fine (and
a big improvement on SPARQL). I'll keep the JSON-LD stuff around in a
separate commit.

On Sat, Apr 9, 2016 at 12:49 PM, Carlos Garnacho <carl...@gnome.org> wrote:
> Hey Sam :),
>
>> so, inspired by something in the Python RDFLib library, I came up with a
>> TrackerResource class that the extractors can use instead. This is a
>> work in process, but I have a branch in git.gnome.org that adds
>> TrackerResource, and converts some of the extractors to use it.
>> The TrackerResource class can serialize either to SPARQL update commands or
>> to JSON-LD. The branch also adds the `tracker extract` command from
>> <https://bugzilla.gnome.org/show_bug.cgi?id=751991> so you can try out
>> the extractors easily and specify `-o json` or `-o sparql` as you prefer.
>
> Nice! Should it have a turtle serializer too? Do you think this can be
> possibly used in the tracker store side to serialize contents?

I hadn't thought of that, but it's definitely possible. You could have
a `tracker serialize-the-whole-database` command :-)

In terms of backups, part of me thinks we should use an efficient
binary format, but then it's hard to trust a backup that is an opaque
binary format. If we could serialize to Turtle or JSON-LD then you
could tell just by looking whether it was valid or not. We can just
gzip it to make it small.

...

>>
>> Here's an example of auto-generated SPARQL for an MP3 extraction:
>
> <snip>
>>
>>
>> Note there are a lot more DELETE statements than before. I figured that
>> anywhere we want to replace the existing data we need a DELETE
>> statement, and the reason we don't normally do it is because previously
>> it had to be done manually. That said, the TrackerResource class does
>> have a way of avoiding this. If you ever call _set_value() for a property
>> then it assumes you want to *overwrite* it, and will generate a DELETE.
>> If you only use _add_value() then it will assume you want to *add* to it,
>> and won't generate a DELETE. The latter case is needed for stuff like
>> nao:hasTag. I may be misunderstanding things here of course, I didn't
>> actually write any of the extractors myself.
>
> Sounds good :), It seems to me that the generated sparql already
> ensures some correctness, which is great. The difference between set
> and add makes sense, given that we have to deal with single and
> multivalued properties.
> The only potentially harmful combination would
> be doing add_value() on a single valued property, is there any way
> that could raise a warning in tracker-extract, rather than being
> caught late due to the failed insert?

I don't think that's possible because libtracker-sparql doesn't have
any knowledge of the ontologies. We could move a bunch of code from
libtracker-data to libtracker-sparql to make it happen, but I actually
think it's a good design to have libtracker-sparql separate from
Tracker's own database and Tracker's own ontologies.

Sam
_______________________________________________
tracker-list mailing list
tracker-list@gnome.org
https://mail.gnome.org/mailman/listinfo/tracker-list
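
To make the set/add distinction discussed above concrete, here is a toy
Python model. It is not the TrackerResource API: the `Resource` class,
its `set_value()`, `add_value()` and `to_sparql()` methods, and the
`urn:example:` URIs are all hypothetical names made up for this sketch,
and values are assumed to be passed in as already-formatted SPARQL
terms. It only illustrates the idea that a property touched by a "set"
call gets a DELETE before the insert, while a property that was only
ever "added" to does not.

```python
class Resource:
    """Toy stand-in for the idea described above; NOT the real TrackerResource."""

    def __init__(self, uri):
        self.uri = uri
        self.values = {}        # property -> list of pre-formatted SPARQL terms
        self.overwrite = set()  # properties that set_value() has touched

    def set_value(self, prop, term):
        # "Set" means overwrite: drop earlier values and remember to DELETE.
        self.values[prop] = [term]
        self.overwrite.add(prop)

    def add_value(self, prop, term):
        # "Add" means append: existing values (here and in the store) stay.
        self.values.setdefault(prop, []).append(term)

    def to_sparql(self):
        statements = []
        # One DELETE per overwritten property, so stale values are removed.
        for prop in self.values:
            if prop in self.overwrite:
                statements.append(
                    f"DELETE {{ <{self.uri}> {prop} ?v }} "
                    f"WHERE {{ <{self.uri}> {prop} ?v }};"
                )
        # A single INSERT carrying every value, whether set or added.
        pairs = " ;\n    ".join(
            f"{prop} {term}"
            for prop, terms in self.values.items()
            for term in terms
        )
        statements.append(f"INSERT DATA {{\n  <{self.uri}> {pairs} .\n}}")
        return "\n".join(statements)


# Hypothetical usage: a single-valued title and a multi-valued tag.
track = Resource("urn:example:track")
track.set_value("nie:title", '"The Only Place"')
track.add_value("nao:hasTag", "<urn:example:tag:a>")
track.add_value("nao:hasTag", "<urn:example:tag:b>")
print(track.to_sparql())
```

In this sketch only nie:title gets a DELETE; both nao:hasTag values are
inserted alongside whatever tags already exist. Nothing here knows
whether a property is single-valued, which is the same limitation
described above for libtracker-sparql: an "add" on a single-valued
property would only fail later, at insert time.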