Hi Todd,

All of these sound good. Personally, I think analyzers like these belong in Lucene's contrib/analyzers package, with Solr factory implementations built on those, but that's your call.

As for Protocol Buffers, I assume you mean http://code.google.com/p/protobuf/. It is under the Apache license, so it is fine to incorporate. It sounds like it might be a contrib to start, but that's just my take.

Protocol Buffers sound like they might be worth using in SolrJ and for distributed search, but I am interested in how they compare to other similar technologies. Can you share your use case for them?

-Grant

On Oct 15, 2008, at 2:48 PM, Feak, Todd wrote:

Reposting, as I inadvertently thread hijacked on the first one. My bad.

Hi all,

I have a handful of custom classes that we've created for our purposes
here. I'd like to share them if you think they have value for the rest
of the community, but I wanted to check here before creating JIRA
tickets and patches.

Here's what I have:

1. DoubleMetaphoneFilter and Factory. This replaces usage of the
PhoneticFilter and Factory, allowing access to set maxCodeLength() on the
DoubleMetaphone encoder and access to the "alternate" encodings that the
encoder provides for some words.

2. JapaneseHalfWidthFilter and Factory. Some Japanese characters (and
Latin alphabet) exist in both a FullWidth and HalfWidth form. This
filter normalizes by switching to the FullWidth form for all the
characters. I have seen at least one JIRA ticket about this issue. This
implementation doesn't rely on Java 1.6.

3. JapaneseHiraganaFilter and Factory. Japanese Hiragana can be
translated to Katakana. This filter normalizes to Katakana so that data
and queries can come in either way and get hits.
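
A minimal sketch of the character mappings behind items 2 and 3, assuming only the fixed-offset Unicode ranges are handled. The class and method names are hypothetical and are not the proposed filters; HalfWidth Katakana, which needs a lookup table rather than a fixed offset, is deliberately left out. It uses plain char arithmetic, so it does not need Java 1.6.

```java
// Sketch only: class and method names are hypothetical, not the proposed filters.
final class KanaWidthSketch {

    // Hiragana U+3041..U+3096 sits exactly 0x60 below the matching Katakana.
    static String hiraganaToKatakana(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '\u3041' && c <= '\u3096') {
                c = (char) (c + 0x60);
            }
            out.append(c);
        }
        return out.toString();
    }

    // Printable ASCII '!'..'~' maps to the FullWidth forms U+FF01..U+FF5E
    // by adding 0xFEE0. Assumption: everything else passes through unchanged.
    static String latinToFullWidth(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '!' && c <= '~') {
                c = (char) (c + 0xFEE0);
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "hiragana" written in Hiragana becomes the same word in Katakana.
        String katakana = hiraganaToKatakana("\u3072\u3089\u304C\u306A");
        System.out.println(katakana.equals("\u30D2\u30E9\u30AC\u30CA")); // prints true

        // HalfWidth "Ab1" becomes FullWidth "Ab1".
        String fullWidth = latinToFullWidth("Ab1");
        System.out.println(fullWidth.equals("\uFF21\uFF42\uFF11")); // prints true
    }
}
```

In a real TokenFilter the same per-character mapping would be applied to each token's term text in place, rather than building new strings.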


Also, I have been requested to create a prototype that you may be
interested in. I'm to construct a QueryResponseWriter that returns
documents using Google's Protocol Buffers. This would rely on an
existing patch that exposes the OutputStream, but I would like to start
the work soon. Are there license concerns that would block sharing this
with you? Is there any interest in this?

Thanks for your consideration,
Todd Feak

--------------------------
Grant Ingersoll
Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
http://www.lucenebootcamp.com


Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ








