Hi,

Any interest in this? If not, is there some other Lucene project that I should approach?

BR,

Jukka Zitting

On 7/18/06, Jukka Zitting <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm a committer of the Apache Jackrabbit project, and I've recently
> been working on improving the full text indexing support in
> Jackrabbit. We've used standard Lucene Java as the embedded full text
> search engine in Jackrabbit, but created our own set of parsers for
> extracting text content from binary files. So far our parser interface
> TextFilter [1] has been Jackrabbit-specific, but my recent refactoring
> proposal, TextExtractor, [2] aims for a generic solution that converts
> a generic InputStream into a Reader for passing to Lucene Java.
>
> Before coming up with the proposal I tried looking for similar
> solutions, but couldn't find any that would have satisfied my
> requirement of no external dependencies other than the JRE. Your
> o.a.nutch.parse.Parser interface however came quite close, and you
> already have an extensive set of existing implementations, so I'd like
> to leverage your work with the Parser implementations while finding a
> way to avoid the full Nutch and Hadoop dependencies. I believe that
> there are a number of other Lucene users who have similar needs.
>
> Thus I'd like to ask if there would be interest in making your Parser
> interface and implementations more easily accessible to external
> projects, perhaps as a separate library. If you're interested, I'd be
> happy to participate in such an effort.
>
> [1]
> http://svn.apache.org/viewvc/jackrabbit/trunk/jackrabbit/src/main/java/org/apache/jackrabbit/core/query/TextFilter.java?view=markup
> [2] http://issues.apache.org/jira/browse/JCR-415
>
> BR,
>
> Jukka Zitting
>
> --
> Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
> Software craftsmanship, JCR consulting, and Java development

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
