Ivan Frade <[email protected]> writes:
> Hi,
>
> On Sat, Nov 20, 2010 at 12:21 AM, Nikolaus Rath <[email protected]> wrote:
>
>> Nikolaus Rath <[email protected]> writes:
>> >> extractor = ExtractorHelper ()
>> >> results = extractor.get_metadata (filename)
>>
>> Upon closer investigation, get_metadata() fails whenever it encounters a
>> text/plain file that contains a '['. Looking at the code, this does not
>> seem surprising.
>>
>> Is the format of the string that's returned by GetMetadata() described
>> somewhere? Then I could try to fix the parser.
>
> GetMetadata() returns triplets in "turtle" format, with the subject missing
> (because the caller should know it and probably wants to add more
> information). That python "parser" (if you can call it that) uses just
> regular expressions to parse those triplets and handle the anonymous nodes
> (those "[ xxx ]") in a tricky way to form a single key for the dictionary.
>
> Nodes like:
>   A slo:location [a slo:GeoLocation; slo:city "Helsinki"]
> Are translated in the dictionary to:
>   slo:location:city "Helsinki"
>
> Not nice, but good enough for our testing. Remember that this code is just
> an internal utility and not a public API. Patches are welcome if you find
> issues,

Well, I would be quite happy to submit patches. But at the moment I still
have absolutely no idea what GetMetadata() returns. What is a "turtle
format"? What is a "subject"? What defines a node? I think you are
assuming that I know something which I actually don't.

Btw, if tracker itself does not use ExtractorHelper, how does e.g.
tracker-miner make sense of the metadata? Maybe I could use that code as
a start instead. I tried to follow the invocation of
get_metadata_fast_async but could not really identify the part where the
metadata is parsed.

Thanks,

   -Nikolaus

--
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
