[jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634838#action_12634838 ] Grant Ingersoll commented on LUCENE-1406: Very cool. I've used a modified versio

[jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-26 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634944#action_12634944 ] Robert Muir commented on LUCENE-1406: Thought I would add the following comments: I

[jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635723#action_12635723 ] Grant Ingersoll commented on LUCENE-1406: I'll commit once 2.4 is released. > ne

[jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-10-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641075#action_12641075 ] Grant Ingersoll commented on LUCENE-1406: Committed revision 706342. I made some

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Robert Muir
cool. is there interest in similar basic functionality for Hebrew? same rules apply: without using GPL data (i.e. Hspell data) you can't do it right, but you can do a lot of the common stuff just like Arabic. Tokenization is a tad bit more complex, and out of box western behavior is probably annoy

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread DM Smith
On Sep 30, 2008, at 8:19 AM, Robert Muir wrote: cool. is there interest in similar basic functionality for Hebrew? I'm interested as I use lucene for biblical research. same rules apply: without using GPL data (i.e. Hspell data) you can't do it right, but you can do a lot of the common

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Robert Muir
can you provide any more information on your use case? I had originally imagined MH, ktiv male spelling only, but your use case is interesting. Are you currently indexing biblical hebrew text? dotted or undotted? On Tue, Sep 30, 2008 at 8:54 AM, DM Smith <[EMAIL PROTECTED]> wrote: > > On Sep 30

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread DM Smith
Robert Muir wrote: can you provide any more information on your use case? I had originally imagined MH, ktiv male spelling only, but your use case is interesting. Are you currently indexing biblical hebrew text? dotted or undotted? Biblical Hebrew. Variety of texts. Some unpointed. Others w/ p

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Robert Muir
thanks for your feedback. below is a description of an idea for a biblical hebrew stemmer that would work somewhat differently than a modern hebrew stemmer. With regards to pointing i can imagine a user might be frustrated if a word is stemmed too aggressively when niqqud is present in query or te

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Otis Gospodnetic
From: Robert Muir <[EMAIL PROTECTED]> To: java-dev@lucene.apache.org Sent: Tuesday, September 30, 2008 8:19:35 AM Subject: Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license) cool. is there interest in similar basic functionality for Hebrew? same rules apply: without using GP

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Otis Gospodnetic
Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license) cool. is there interest in similar basic functionality for Hebrew? same rules apply: without using GPL data (i.e. Hspell data) you can't do it right, but you can do a lot of the common stuff just like Arabic. Tokenization

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-09-30 Thread Robert Muir
ge > From: Robert Muir <[EMAIL PROTECTED]> > To: java-dev@lucene.apache.org > Sent: Tuesday, September 30, 2008 8:19:35 AM > Subject: Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache > license) > > cool. is there interest in similar basic functionality for Hebrew? >

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-10-01 Thread Nadav Har'El
On Tue, Sep 30, 2008, Robert Muir wrote about "Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)": > Thanks for clarification. With this method arabic analyzer could lemmatize, > not stem, using buckwalter dictionary, and things like broken plural will

Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apache license)

2008-10-01 Thread Grant Ingersoll
Can we have the Hebrew discussion on another thread? FWIW, I do agree it would be a good thing to add. Thanks, Grant On Oct 1, 2008, at 4:02 PM, Nadav Har'El wrote: On Tue, Sep 30, 2008, Robert Muir wrote about "Re: [jira] Commented: (LUCENE-1406) new Arabic Analyzer (Apac