Re: buildbot failure in ASF Buildbot on tika-trunk

2010-07-13 Thread Ken Krugler
Hi Gavin,

The change we'd really like to make for Tika is to have two build commands:

  mvn clean install
  mvn site

as this works around a known bug in Maven's handling of dependencies with the 'site' target. Don't know if buildbot supports that, but if so that would be most useful. Tha
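
For illustration only, the two-command sequence requested above could be sketched as a small Python wrapper that runs the commands in order and stops at the first failure. This is not the actual Buildbot configuration for tika-trunk; the names `BUILD_STEPS` and `run_build` and the injectable `runner` parameter are hypothetical, and the sketch assumes `mvn` is on the PATH.

```python
import subprocess

# The two commands named in the message, in the order given.
BUILD_STEPS = [
    ["mvn", "clean", "install"],
    ["mvn", "site"],
]


def run_build(steps=BUILD_STEPS, runner=subprocess.call):
    """Run each build step in order and stop at the first failure.

    `runner` takes a command (list of strings) and returns its exit code.
    It defaults to subprocess.call; it is a parameter only so the sequencing
    can be exercised without invoking Maven (a hypothetical convenience).

    Returns 0 if every step succeeds, otherwise the exit code of the first
    failing step.
    """
    for command in steps:
        exit_code = runner(command)
        if exit_code != 0:
            # Skip the remaining steps: 'mvn site' is only meaningful after
            # 'mvn clean install' has succeeded.
            return exit_code
    return 0
```

Calling `run_build()` from the project root would run `mvn clean install` first and `mvn site` only if the install step exits with code 0.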

Re: buildbot failure in ASF Buildbot on tika-trunk

2010-07-13 Thread Mattmann, Chris A (388J)
Hi Guys, +1, I like Buildbot, doesn't really matter to me whether we use that or Hudson or anything below, just that the person in Tika-ville that monitors/maintains our config for it knows how to communicate with builds@ or infra@ to fix little things like this, or to alleviate having to conta

Re: Getting started

2010-07-13 Thread Mattmann, Chris A (388J)
Thanks Nick and thanks Arturo, for the offer to write a small guide to getting started with parsing. It might be good to create a JIRA issue for this? Arturo, can you head over to JIRA and create an issue to contribute a "get Tika parsing up and running in 5 minutes" quick start guide? Then, you

RE: buildbot failure in ASF Buildbot on tika-trunk

2010-07-13 Thread Gav...
> -Original Message-
> From: Jukka Zitting [mailto:jukka.zitt...@gmail.com]
> Sent: Tuesday, 13 July 2010 7:01 PM
> To: bui...@apache.org
> Cc: dev@tika.apache.org
> Subject: Re: buildbot failure in ASF Buildbot on tika-trunk
>
> Hi,
>
> On Mon, Jul 12, 2010 at 8:35 PM, wrote:
> > The

[jira] Commented: (TIKA-460) HTMLHandler misses treatment of A elements

2010-07-13 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887718#action_12887718 ] Julien Nioche commented on TIKA-460: this would work if we had in the list of safe eleme

[jira] Commented: (TIKA-463) HtmlParser doesn't extract links from img, map, object, frame, iframe, area, link

2010-07-13 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887716#action_12887716 ] Julien Nioche commented on TIKA-463: creating a LinksHtmlMapper : +1, that would be a nic

Re: Getting started

2010-07-13 Thread Nick Burch
On Tue, 13 Jul 2010, Arturo Beltran wrote: It might be interesting to write a small manual: "How to create a new Tika Parser for Dummies". Simply including the three steps that I have finally figured out (new Parser, tika-mimetypes.xml, list the new parser). The 3rd step is only needed if you

Re: Getting started

2010-07-13 Thread Arturo Beltran
That was my "big" problem all this time, I almost went crazy. Now it works perfectly, thank you very much for your help. It might be interesting to write a small manual: "How to create a new Tika Parser for Dummies". Simply including the three steps that I have finally figured out (new Parser,

Re: Getting started

2010-07-13 Thread Nick Burch
On Tue, 13 Jul 2010, Arturo Beltran wrote:
> I'm calling my parser using the Tika-app included, so I think I'm using AutoDetectParser.

You have to explicitly tell the AutoDetectParser to try your parser, in addition to the mime type definition.

List your new parser in: tika-parsers/src/main/res

Re: Getting started

2010-07-13 Thread Arturo Beltran
Hi Chris and all,

On 07/07/2010 16:04, Mattmann, Chris A (388J) wrote:
> Hi Arturo, How exactly are you calling your parser? Are you using the AutoDetectParser? If so, can you put some print statements in the public void parse(...) method of CompositeParser? Specifically, add a line right

Re: buildbot failure in ASF Buildbot on tika-trunk

2010-07-13 Thread Jukka Zitting
Hi,

On Mon, Jul 12, 2010 at 8:35 PM, wrote:
> The Buildbot has detected a new failure of tika-trunk on ASF Buildbot.
> Full details are available at:
>  http://ci.apache.org/builders/tika-trunk/builds/49

Gavin, I assume you set up the Tika build on buildbot? Can you change the build command fro