[jira] Created: (LUCENE-1408) DocumentsWriter.init() doesn't grow fieldDataHash array at same rate as allFieldData array, leading to OOM errors

2008-09-30 Thread David C. Navas (JIRA)
DocumentsWriter.init() doesn't grow fieldDataHash array at same rate as allFieldData array, leading to OOM errors - Key: LUCENE-1408 URL: https://issue

Re: draft 2.4 announcement

2008-09-30 Thread Michael McCandless
Good idea -- done: Release 2.4.0 of Lucene is now available! With 2.4 we have relaxed the backwards compatibility policy of the Fieldable interface: we now allow changes on a case by case basis. This means any custom classes that implement Fieldable will need to be updated. This was done to a

Re: draft 2.4 announcement

2008-09-30 Thread Chris Hostetter
: The next release will be 2.9. After that will be 3.0, which will : remove all deprecated APIs from 2.9 and will be the first release of : Lucene to require JRE 1.5. The timing on these two releases is not : yet known. I would move that para to the end, possibly starting with "SPECIAL NOTE:"

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Robert Muir
Thanks for the clarification. With this method the Arabic analyzer could lemmatize, not stem, using the Buckwalter dictionary, and things like broken plurals will work correctly. I'm not sure yet if Hspell has this type of information, but it would at least be a better stem for Hebrew as well. On Tue, Sep 30

Re: Ocean and GData

2008-09-30 Thread Jason Rutherglen
The wiki site has now changed to http://wiki.apache.org/lucene-java/RealtimeSearch which only has the realtime search documentation. The other components that may be a fit for SOLR are listed at http://wiki.apache.org/lucene-java/OceanComponents On Mon, Sep 29, 2008 at 11:43 AM, Jason Rutherglen

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Otis Gospodnetic
Oh, and a note on GPL. It's fine to make use of GPL data, it's just that ASF cannot distribute it. So the code could come with (java)docs that point out that things would be better if the analyzer could use the GPL data that can be downloaded from X and it could be written to make use of the G

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Otis Gospodnetic
Yeah, there is interest in Hebrew. People ask for it occasionally and I know one very large news organization that will be looking for Lucene/Solr Hebrew support in the coming weeks/months. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ro

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Robert Muir
Thanks for your feedback. Below is a description of an idea for a Biblical Hebrew stemmer that would work somewhat differently than a Modern Hebrew stemmer. With regard to pointing, I can imagine a user might be frustrated if a word is stemmed too aggressively when niqqud is present in query or te

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread DM Smith
Robert Muir wrote: Can you provide any more information on your use case? I had originally imagined MH, ktiv male spelling only, but your use case is interesting. Are you currently indexing Biblical Hebrew text? Dotted or undotted? Biblical Hebrew. Variety of texts. Some unpointed. Others w/ p

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Robert Muir
Can you provide any more information on your use case? I had originally imagined MH, ktiv male spelling only, but your use case is interesting. Are you currently indexing Biblical Hebrew text? Dotted or undotted? On Tue, Sep 30, 2008 at 8:54 AM, DM Smith <[EMAIL PROTECTED]> wrote: > > On Sep 30

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread DM Smith
On Sep 30, 2008, at 8:19 AM, Robert Muir wrote: Cool. Is there interest in similar basic functionality for Hebrew? I'm interested as I use Lucene for biblical research. Same rules apply: without using GPL data (i.e. Hspell data) you can't do it right, but you can do a lot of the common

Re: draft 2.4 announcement

2008-09-30 Thread Michael McCandless
Woops, thanks Steven. New version: Release 2.4.0 of Lucene is now available! The next release will be 2.9. After that will be 3.0, which will remove all deprecated APIs from 2.9 and will be the first release of Lucene to require JRE 1.5. The timing on these two releases is not yet known. W

RE: draft 2.4 announcement

2008-09-30 Thread Steven A Rowe
Spelling nits: On 09/30/2008 at 8:27 AM, Michael McCandless wrote: > Fieldable interface: we now allow changes on a case by case bases. bases -> basis > This means any custom classes that implement Fielable will need to be Fielable -> Fieldable Steve --

[VOTE] Release Lucene 2.4.0

2008-09-30 Thread Michael McCandless
I've built the release artifacts, from revision 700430 on the 2.4 branch. These are the changes: http://people.apache.org/~mikemccand/staging-area/lucene2.4changes/Changes.html Please vote to officially release these artifacts as 2.4.0: http://people.apache.org/~mikemccand/staging-area/l

Re: draft 2.4 announcement

2008-09-30 Thread Michael McCandless
OK another iteration: Release 2.4.0 of Lucene is now available! The next release will be 2.9. After that will be 3.0, which will remove all deprecated APIs from 2.9 and will be the first release of Lucene to require JRE 1.5. The timing on these two releases is not yet known. With 2.4 we ha

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Robert Muir
Cool. Is there interest in similar basic functionality for Hebrew? Same rules apply: without using GPL data (i.e. Hspell data) you can't do it right, but you can do a lot of the common stuff just like Arabic. Tokenization is a tad bit more complex, and out-of-the-box Western behavior is probably annoy

Re: draft 2.4 announcement

2008-09-30 Thread Michael McCandless
Good points -- I'll update! Thanks Grant. Mike Grant Ingersoll wrote: Couple of notes: I would call out the Fieldable change I would also add something to the effect that the next release, 2.9, will be a deprecation release (or whatever you want to call it) and that 3.0 will require Jav

[jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635723#action_12635723 ] Grant Ingersoll commented on LUCENE-1406: - I'll commit once 2.4 is released. > ne

Re: draft 2.4 announcement

2008-09-30 Thread Grant Ingersoll
Couple of notes: I would call out the Fieldable change I would also add something to the effect that the next release, 2.9, will be a deprecation release (or whatever you want to call it) and that 3.0 will require Java 1.5. I don't think we can start spreading that news too soon. -Grant

Re: draft 2.4 announcement

2008-09-30 Thread Michael McCandless
OK here's the new version with suggestions folded in: Release 2.4.0 of Lucene is now available! Many new features, fixes and optimizations have happened since 2.3, including: * New InstantiatedIndex (contrib/instantiated): RAM-based index that enables much faster searching than RAMDirec