Sorry guys I'm nutters! :)
Cheers,
Chris
On Jul 20, 2011, at 1:39 AM, Julien Nioche wrote:
Glad you managed to get it to work. I don't know what Chris meant by that,
can't see why we'd open a JIRA when we are already using the latest version
Julien
On 20 July 2011 08:19, Fernando Arreola
Hey Fernando,
Would be great to get a JIRA issue and patch to bring
Nutch 1.4-branch up to date with the latest Tika
based on your experience.
Thanks for your help!
Cheers,
Chris
On Jul 19, 2011, at 4:48 PM, Fernando Arreola wrote:
Hi,
You were right, it is enough to provide the right
You probably need to make sure that conf/tika-mimetypes.xml is the version
you've modified and contains the clues for detecting afm files.
BTW out of curiosity why did you have to modify tika-core.jar? Isn't it
enough to provide the clues in tika-mimetypes.xml?
Jul
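To illustrate what "clues" means in the message above: a magic-byte clue lets a detector recognise a file from its leading bytes instead of falling back to a generic type. The sketch below is a toy, self-contained Java illustration of that idea only. It is not Tika's or Nutch's detection code. The class name MagicSketch, the method detect, and the type string application/x-font-adobe-metric are assumptions made for the example, as is the text/plain fallback (chosen because that is what the thread reports seeing). It relies on AFM files beginning with the ASCII keyword StartFontMetrics.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Toy illustration of magic-byte detection; NOT Tika's implementation.
class MagicSketch {
    // Adobe Font Metrics files start with the ASCII keyword "StartFontMetrics".
    private static final byte[] AFM_MAGIC =
            "StartFontMetrics".getBytes(StandardCharsets.US_ASCII);

    // Returns a (hypothetical) type name if the leading bytes match the clue,
    // otherwise the generic fallback that the thread reports seeing.
    static String detect(byte[] head) {
        if (head.length >= AFM_MAGIC.length
                && Arrays.equals(Arrays.copyOf(head, AFM_MAGIC.length), AFM_MAGIC)) {
            return "application/x-font-adobe-metric"; // assumed type name
        }
        return "text/plain"; // assumed fallback when no clue matches
    }

    public static void main(String[] args) {
        byte[] afm = "StartFontMetrics 4.1\n".getBytes(StandardCharsets.US_ASCII);
        byte[] txt = "hello world\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(afm));
        System.out.println(detect(txt));
    }
}
```

In a real setup the clue would live in tika-mimetypes.xml rather than in code, and the detector would have to be the one actually reading that file, which is the point the thread is circling around.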
On 13 July 2011 01:16,
I did update the runtime/local/conf/tika-mimetypes.xml and my changes are
there. I looked at the code for the ParserChecker and it seems to be doing
its own content type detection using a Protocol call, so I am trying to set
up Solr in hopes that it would work there (having some unix memory issues
Hello,
I have made some additions (a new parser) to the Apache Tika application and
I am trying to see if I can run my new changes using the crawl mechanism in
Nutch, but I am having some trouble updating Nutch with my modified Tika
application.
The Tika updates I made run fine if I run Tika as
Hi Fernando,
One point for me to mention which I did not pick up from your post. Did you
rebuild your Nutch dist after making the changes to include your new parser?
I know that this is a pretty simple suggestion but hopefully it might be the
right one.
Also can you please provide more details
Hello,
Thanks for the replies.
I have started trying to use Nutch 1.3 after your suggestions, especially
since I am using Tika 0.9, but I am not getting anywhere with it. I am able
to build fine but whenever I try to run any command it gives the error
stating that it cannot find C:\Program. For
Thanks, I really appreciate all the help. I used the ParserChecker and I
could see the metadata my parser extracted!
I have one more question, though: I could only see the metadata my parser
extracted if I used the -forceAs mimetype option. Otherwise it is detected
as a text/plain file and my