Ivan Frade <[email protected]> writes:
> Hi,
>
> On Sat, Nov 20, 2010 at 12:21 AM, Nikolaus Rath <[email protected]> wrote:
>
>>
>> Nikolaus Rath <[email protected]> writes:
>> >> extractor = ExtractorHelper ()
>> >> results = extractor.get_metadata (filename)
>> >>
>> Upon closer investigation, get_metadata() fails whenever it encounters a
>> text/plain file that contains a '['. Looking at the code, this does not
>> seem surprising.
>>
>> Is the format of the string that's returned by GetMetadata() described
>> somewhere? Then I could try to fix the parser.
>>
>
> GetMetadata() returns triples in "Turtle" format, with the subject omitted
> (because the caller should already know it and probably wants to add more
> information). That Python "parser" (if you can call it that) just uses
> regular expressions to parse those triples, and it handles the anonymous
> nodes (those "[ xxx ]") in a tricky way to form a single key for the
> dictionary.
>
> Nodes like:
> A slo:location [a slo:GeoLocation; slo:city "Helsinki"]
> are translated in the dictionary to:
> slo:location:city "Helsinki"
>
> Not nice, but good enough for our testing. Remember that this code is just
> an internal utility and not a public API. Patches are welcome if you find
> issues.
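
The flattening described above can be sketched roughly as follows. This is
only a hypothetical illustration, not the actual ExtractorHelper code: the
name flatten() and the tokenizer are made up, and it assumes the input is a
subject-less list of predicate/object pairs separated by ";", with no
comma-separated object lists. It tokenizes string literals first, so a '['
inside text content is not mistaken for the start of an anonymous node.

```python
import re

# Tokens: quoted string literals (with backslash escapes), the structural
# characters "[", "]", ";" and ".", and bare terms such as prefixed names.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|[\[\];.]|[^\s\[\];"]+')


def flatten(fragment):
    """Flatten subject-less predicate/object pairs into a dictionary.

    Properties of an anonymous node are stored under a combined key, so
    slo:location [ slo:city "Helsinki" ] becomes slo:location:city.
    """
    tokens = TOKEN.findall(fragment)
    result = {}
    pos = 0

    def parse_pairs(prefix):
        nonlocal pos
        while pos < len(tokens) and tokens[pos] != "]":
            if tokens[pos] in (";", "."):
                pos += 1  # skip separators between pairs
                continue
            predicate = tokens[pos]
            pos += 1
            if pos >= len(tokens):
                break  # dangling predicate without an object
            if prefix:
                # Inside an anonymous node: append the local name only.
                key = prefix + ":" + predicate.split(":", 1)[-1]
            else:
                key = predicate
            if tokens[pos] == "[":
                pos += 1  # consume "["
                parse_pairs(key)
                pos += 1  # consume the matching "]"
                continue
            value = tokens[pos]
            pos += 1
            if prefix and predicate == "a":
                continue  # assumption: drop the type of an anonymous node
            if value.startswith('"'):
                value = value[1:-1]  # strip quotes, leave escapes untouched
            result[key] = value

    parse_pairs("")
    return result
```

With the node from the example, flatten('slo:location [a slo:GeoLocation;
slo:city "Helsinki"]') gives a dictionary with the single key
slo:location:city. The property names used to try it out (nie:title and so
on) are just sample input.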

Well, I would be quite happy to submit patches. But at the moment I
still have absolutely no idea what GetMetadata() returns. What is
"turtle format"? What is a "subject"? What defines a node? I think you
are assuming that I know something which I actually don't.


Btw, if tracker itself does not use ExtractorHelper, how does e.g.
tracker-miner make sense of the metadata? Maybe I could use that code as
a starting point instead. I tried to follow the invocation of
get_metadata_fast_async but could not really identify the part where the
metadata is parsed.


Thanks,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C