Hi Antoni,
The roadmap doesn't give much detail about the intended vocabularies.
Dublin Core is great, but what else? Joerg? What other kinds of metadata
information would you like to extract with Tika, and what vocabularies would
you like to use to express them?
At Adobe, you'll likely
+1
This does indeed look like a good combination.
Jörg
-Original Message-
From: Mattmann, Chris A (388J) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Friday, 27 April 2012 01:33
To: dev@tika.apache.org
Subject: Re: [metadata] roadmap proposal available on the wiki
Hi Antoni
-Original Message-
From: Mattmann, Chris A (388J) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Wednesday, 25 April 2012 22:40
To: dev@tika.apache.org
Subject: Re: [metadata] roadmap proposal available on the wiki
Hi Jörg,
On Apr 25, 2012, at 10:27 AM, Joerg Ehrlich wrote:
I am not strongly
Hi Jörg,
Thanks for your email, comments below:
On Apr 26, 2012, at 3:35 AM, Joerg Ehrlich wrote:
Hi Chris,
Those are all valid points and I agree that you could do everything with a
HashMap.
Having the parsers fill the Metadata class and its HashMap with all needed
information which
I think besides the namespaces, one of the issues Jörg is trying to tackle is
the structured metadata and the extra time and effort referred to is dealing
with serialization of structured data to and from a hashmap.
For example I may have metadata similar to:
Contact1
|-- First Name
|-- Last
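To make the serialization effort mentioned above concrete, here is a minimal sketch of flattening structured metadata into a flat string map and rebuilding it. It is written in Python for brevity and is not Tika code: the `/` separator, the `flatten`/`unflatten` helpers, and the sample value are assumptions made for illustration only.

```python
from typing import Any, Dict

SEP = "/"  # hypothetical path separator, not a Tika convention


def flatten(nested: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Serialize nested metadata into a flat string-to-string map."""
    flat: Dict[str, str] = {}
    for key, value in nested.items():
        path = f"{prefix}{SEP}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into the substructure, extending the key path.
            flat.update(flatten(value, path))
        else:
            flat[path] = str(value)
    return flat


def unflatten(flat: Dict[str, str]) -> Dict[str, Any]:
    """Rebuild the nested structure from the flattened key paths."""
    nested: Dict[str, Any] = {}
    for path, value in flat.items():
        *parents, leaf = path.split(SEP)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Sample value is made up; only the field name comes from the example above.
contact = {"Contact1": {"First Name": "Ada"}}
flat = flatten(contact)
restored = unflatten(flat)
```

Every consumer of the flat map has to agree on the same key convention, and a field name that itself contains the separator would break the round trip. That is the kind of extra effort a structured representation would avoid.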
Message-
From: Ray Gauss II [mailto:ray.ga...@alfresco.com]
Sent: Thursday, 26 April 2012 18:03
To: dev@tika.apache.org
Subject: Re: [metadata] roadmap proposal available on the wiki
I think besides the namespaces, one of the issues Jörg is trying to tackle is
the structured metadata
2012/04/25 Joerg Ehrlich wrote:
Hi,
I have put a proposal of a roadmap for the metadata features in Tika on the
wiki:
http://wiki.apache.org/tika/MetadataRoadmap
The proposal is based on a discussion around this topic I have had with Jukka.
Please review and feel free to edit the wiki
2012/04/26 Mattmann, Chris A (388J) wrote:
Hi Guys,
One comment RE: the below too -- this is precisely where I see
Any23 coming into play and why there is a strong relationship
between it and Tika:
http://incubator.apache.org/any23/
I'm the current Champion for the project and the
Hi Antoni,
Precisely! :) That would be awesome huh. And, my goal there too is to turn
Any23 parsers into Tika parsers too as I think they could be one and the
same (with an RDF or XMP or RSS ContentHandler transforming the
Tika intermediate SAX output the same).
Cheers,
Chris
On Apr 26, 2012,
Hi Jörg,
First off, thanks for taking the time to put your thoughts down on the Wiki. I
will try to leverage that for helping push these ideas forward. I am +1 on most
of the things you proposed.
Regarding:
{quote}
Use XMP instead of Hashmap in Metadata class
The idea is to have just one
Hi Chris,
Thanks for your comments,
I am not strongly supportive of changing out the HashMap internal
representation in Metadata.
A couple of things I like about the HashMap:
* It's simple.
* It doesn't require dependency on any external libraries and helps keep
tika-core minimal.
Hi Jörg,
On Apr 25, 2012, at 10:27 AM, Joerg Ehrlich wrote:
I am not strongly supportive of changing out the HashMap internal
representation in Metadata.
A couple of things I like about the HashMap:
* It's simple.
* It doesn't require dependency on any external libraries and helps