Jukka Zitting wrote:

> Hi,
>
> Any interest in this?

definitely :-)

Michi

> If not, is there some other Lucene project that
> I should approach?
>
> BR,
>
> Jukka Zitting
>
> On 7/18/06, Jukka Zitting <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I'm a committer of the Apache Jackrabbit project, and I've recently
>> been working on improving the full text indexing support in
>> Jackrabbit. We've used standard Lucene Java as the embedded full text
>> search engine in Jackrabbit, but created our own set of parsers for
>> extracting text content from binary files. So far our parser interface
>> TextFilter [1] has been Jackrabbit-specific, but my recent refactoring
>> proposal, TextExtractor, [2] aims for a generic solution that converts
>> a generic InputStream into a Reader for passing to Lucene Java.
>>
>> Before coming up with the proposal I tried looking for similar
>> solutions, but couldn't find any that would have satisfied my
>> requirement of no external dependencies other than the JRE. Your
>> o.a.nutch.parse.Parser interface however came quite close, and you
>> already have an extensive set of existing implementations, so I'd like
>> to leverage your work with the Parser implementations while finding a
>> way to avoid the full Nutch and Hadoop dependencies. I believe that
>> there are a number of other Lucene users who have similar needs.
>>
>> Thus I'd like to ask if there would be interest in making your Parser
>> interface and implementations more easily accessible to external
>> projects, perhaps as a separate library. If you're interested, I'd be
>> happy to participate in such an effort.
>>
>> [1] http://svn.apache.org/viewvc/jackrabbit/trunk/jackrabbit/src/main/java/org/apache/jackrabbit/core/query/TextFilter.java?view=markup
>>
>> [2] http://issues.apache.org/jira/browse/JCR-415
>>
>> BR,
>>
>> Jukka Zitting
>>
>> --
>> Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
>> Software craftsmanship, JCR consulting, and Java development
>>
>

--
Michael Wechner
Wyona - Open Source Content Management - Apache Lenya
http://www.wyona.com http://lenya.apache.org
[EMAIL PROTECTED] [EMAIL PROTECTED]
+41 44 272 91 61

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
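
For readers following the thread, the idea in the quoted message (convert a generic InputStream into a Reader that can be handed to Lucene Java, with no dependencies beyond the JRE) can be illustrated roughly as below. This is only a sketch under assumptions: the method names, the parameters, and the PlainTextExtractor class are hypothetical and are not taken from the JCR-415 proposal or from the Nutch Parser interface.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TextExtractorSketch {

    /** Hypothetical interface: turns a binary stream into a character stream. */
    public interface TextExtractor {
        /** Content types this extractor claims to handle (assumed method). */
        String[] getContentTypes();

        /** Returns a Reader over the text found in the stream (assumed signature). */
        Reader extractText(InputStream stream, String type, String encoding)
                throws IOException;
    }

    /** Simplest possible case: plain text only needs character decoding. */
    public static class PlainTextExtractor implements TextExtractor {
        @Override
        public String[] getContentTypes() {
            return new String[] {"text/plain"};
        }

        @Override
        public Reader extractText(InputStream stream, String type, String encoding) {
            // Fall back to UTF-8 when the caller does not know the encoding.
            Charset charset = (encoding != null)
                    ? Charset.forName(encoding)
                    : StandardCharsets.UTF_8;
            return new InputStreamReader(stream, charset);
        }
    }

    /** Helper that drains a Reader into a String, standing in for the indexer. */
    public static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int count;
        while ((count = reader.read(buffer)) != -1) {
            text.append(buffer, 0, count);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        TextExtractor extractor = new PlainTextExtractor();
        InputStream binary = new ByteArrayInputStream(
                "hello lucene".getBytes(StandardCharsets.UTF_8));
        try (Reader reader = extractor.extractText(binary, "text/plain", "UTF-8")) {
            System.out.println(readAll(reader)); // prints "hello lucene"
        }
    }
}
```

In an actual indexer the returned Reader would presumably be passed to a Lucene field rather than drained into a String; extractors for binary formats would need real parsing logic behind the same interface.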
