Hi Markus,

I've read that Tika does not parse MP3 files because of copyright issues.
Is there currently a patch that adds MP3 parsing?

Regards,

On Tue, 2011-05-10 at 00:01 +0200, Markus Jelsma wrote:
> The MIME type is added via the index-more plugin. By default it creates
> multiple values, e.g. text/html, text and html for an HTML page. It can also
> be configured to output only the full text/html value (see nutch-default for
> an example).
> 
> I've never indexed multimedia data so I can't help there, but what's not 
> working in Tika? I know Tika will do mp3 and jpeg but not videos (except 
> Flash). I haven't seen ogg around either.
> 
> Nutch passes unmapped mime types to Tika.
> 
> > Hi everyone,
> > 
> > I'm trying to index images (jpeg, exif data), videos and audio (mp3,
> > ogg, id3 data) but Tika is not working.
> > 
> > How can I index those files and create the respective fields?
> > Also, I haven't found out how to store the MIME type of the indexed files.
> > 
> > Basically I need to index sites with multimedia.
> > 
> > Thanks,
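
For reference, the MIME type splitting Markus describes above can be sketched
roughly as follows. This is only an illustration of the described behaviour,
not the actual index-more code; the function name `mime_type_values` and the
`split_parts` flag are hypothetical and merely stand in for whatever option
nutch-default exposes.

```python
def mime_type_values(mime_type, split_parts=True):
    """Return the values that would be indexed for one MIME type.

    Hypothetical helper: mimics the behaviour described in the thread,
    not the real index-more implementation.
    """
    mime_type = mime_type.strip().lower()
    if not split_parts:
        # Only the full type is kept, e.g. "text/html".
        return [mime_type]
    values = [mime_type]
    # Add the primary type and subtype as separate values, skipping duplicates.
    for part in mime_type.split("/"):
        if part and part not in values:
            values.append(part)
    return values


print(mime_type_values("text/html"))                      # ['text/html', 'text', 'html']
print(mime_type_values("audio/mpeg", split_parts=False))  # ['audio/mpeg']
```

With splitting enabled, the field holding the type has to be multi-valued,
since one document ends up with three values.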

