Hi,

Any interest in this? If not, is there some other Lucene project that
I should approach?

BR,

Jukka Zitting

On 7/18/06, Jukka Zitting <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'm a committer of the Apache Jackrabbit project, and I've recently
> been working on improving the full text indexing support in
> Jackrabbit. We've used standard Lucene Java as the embedded full text
> search engine in Jackrabbit, but created our own set of parsers for
> extracting text content from binary files. So far our parser interface
> TextFilter [1] has been Jackrabbit-specific, but my recent refactoring
> proposal, TextExtractor [2], aims for a generic solution that converts
> a generic InputStream into a Reader for passing to Lucene Java.
>
> Before coming up with the proposal I tried looking for similar
> solutions, but couldn't find any that would have satisfied my
> requirement of no external dependencies other than the JRE. Your
> o.a.nutch.parse.Parser interface however came quite close, and you
> already have an extensive set of existing implementations, so I'd like
> to leverage your work with the Parser implementations while finding a
> way to avoid the full Nutch and Hadoop dependencies. I believe that
> there are a number of other Lucene users who have similar needs.
>
> Thus I'd like to ask if there would be interest in making your Parser
> interface and implementations more easily accessible to external
> projects, perhaps as a separate library. If you're interested, I'd be
> happy to participate in such an effort.
>
> [1] http://svn.apache.org/viewvc/jackrabbit/trunk/jackrabbit/src/main/java/org/apache/jackrabbit/core/query/TextFilter.java?view=markup
> [2] http://issues.apache.org/jira/browse/JCR-415
>
>
> BR,
>
> Jukka Zitting
>
> --
> Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
> Software craftsmanship, JCR consulting, and Java development
>

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
