Splitting of words

2005-09-13 Thread Madhu Satyanarayana Panitini
Hi all, I want to know how text is split before indexing in Lucene. Does it split wherever there is a space between the words, or is there some other pattern for splitting the words of a text document? In which program can I find the code that splits the words? Madhu Madhu Satyanarayana. Pan

RE: Splitting of words

2005-09-13 Thread Kunemann Frank
This depends on the analyzer you are using. You can find the standard analyzers in org.apache.lucene.analysis. To find out what they do, I recommend the example in Lucene in Action, section 4.2.3, called "AnalyzerDemo". If you don't have the book, you can also download the examples from http://www.manning

Re: Splitting of words

2005-09-13 Thread Paul Libbrecht
Madhu, Analyzer is the magic word here. Lucene's StandardAnalyzer has a whole grammar to split text into tokens. There are many more analyzers, most of which are language specific (e.g. based on the Snowball or Porter stemmers, see contribs or javadoc of core). For which language do you wish to u

RE: Splitting of words

2005-09-13 Thread Madhu Satyanarayana Panitini
Hi Paul, I agree with you: "Analyzer is the magic word". Let's look at it in depth. I would consider three parts in the analyzer: 1. Tokenization (splitting of words) 2. Stopword removal (depends on the language) 3. Stemming of the words (depends on the language) First to start analyze
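The three parts listed above can be sketched in plain Java. This is only an illustration under stated assumptions, not Lucene's code: the class name `AnalysisSketch`, the stop-word list, and the suffix-stripping "stemmer" are all hypothetical stand-ins. A real analyzer uses a proper grammar for tokenization and a language-specific stemmer such as Porter or Snowball.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of a three-step analysis chain; not a Lucene class.
class AnalysisSketch {
    // Assumed English stop-word list, for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "of", "is", "in"));

    // Step 1: tokenization. Lowercase, then split on runs of
    // characters that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Step 2: stop-word removal (language dependent).
    static List<String> removeStopWords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String t : tokens) {
            if (!STOP_WORDS.contains(t)) {
                kept.add(t);
            }
        }
        return kept;
    }

    // Step 3: toy stemming. Strips a trailing "ing" or a plural "s".
    // This is NOT the Porter algorithm, only a placeholder for it.
    static String stem(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s") && !token.endsWith("ss")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Runs the three steps in order.
    static List<String> analyze(String text) {
        List<String> out = new ArrayList<>();
        for (String t : removeStopWords(tokenize(text))) {
            out.add(stem(t));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [splitt, word, lucene]
        System.out.println(analyze("The splitting of words in Lucene"));
    }
}
```

The crude stemmer turns "splitting" into "splitt" rather than "split", which shows why stemming is usually delegated to a language-specific algorithm.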

Re: Splitting of words

2005-09-13 Thread Erik Hatcher
On Sep 13, 2005, at 7:24 AM, Madhu Satyanarayana Panitini wrote: Hi Paul, I agree with you: "Analyzer is the magic word". Let's look at it in depth. I would consider three parts in the analyzer: 1. Tokenization (splitting of words) 2. Stopword removal (depends on the language) 3. stemmin

Lucene issue tracker moved to JIRA

2005-09-13 Thread Erik Hatcher
cross-posting to all the Lucene e-mail lists The Lucene issue tracking system has migrated from Bugzilla to JIRA. Please use the new system for all issue tracking activities from this point forward: http://issues.apache.org/jira/browse/LUCENE If a Lucene committer has the time to swit

MultiSearcher... Multiple Analyzer

2005-09-13 Thread Olivier Jaquemet
Hi, I have many indices, one for each language, and each one has been indexed using a specific analyzer. I want to search in all my indices, but I still want/need to use the same analyzer that was used for indexing. MultiSearcher only accepts one query, and if I use for example QueryParser, I ca

Is George Aroush still around?

2005-09-13 Thread Jeff Rodenburg
Mayday, mayday Has anyone had recent contact with George Aroush? He's presently managing the C# port of Lucene. Thanks, Jeff Rodenburg

Re: too many files open

2005-09-13 Thread Otis Gospodnetic
Hello, (replying to java-user list, as that's the place to ask) Your mergeFactor is way too high. Leave it at the default (10). Also look at IndexWriter javadocs, where mergeFactor and friends are described. If you have Lucene in Action, mergeFactor is described in detail in chapter 2 (see h

lock directory system property is read too early!

2005-09-13 Thread Paul Libbrecht
Hi, I am facing the problem that the system property LOCK_DIR in FSDirectory seems to be loaded too early, that is, at classloading time, whereas I am setting this property myself later... Dare I request that its initialization is done lazily ? thanks paul ---

Re: lock directory system property is read too early!

2005-09-13 Thread Otis Gospodnetic
Hello, The code in SVN has already been changed and the use of system properties has been deprecated. Otis --- Paul Libbrecht <[EMAIL PROTECTED]> wrote: > > Hi, > > I am facing the problem that the system property LOCK_DIR in > FSDirectory seems to be loaded too early, that is, at classload

Hits issue or custom filter issue?

2005-09-13 Thread Jeff Rodenburg
I'm encountering some unexpected behavior teeing up multiple Hits objects from a searcher, and I think I'm missing something obvious. Hoping a second pair of eyes might see what I'm missing. Here's my code sequence: // Some liberties taken in the code regarding names, etc. // v1.4.3 codebase Bo

Re: Hits issue or custom filter issue?

2005-09-13 Thread Chris Hostetter
: Hits h1 = oMultiSearcher.Search(new FilteredQuery(combinedQuery, new : myCustomFilter(1))); : Hits h2 = oMultiSearcher.Search(new FilteredQuery(combinedQuery, new : myCustomFilter(2))); ...do you get the same results if you use... Hits h1 = oMultiSearcher.search(combinedQuery, myCustomFilte

Re: Hits issue or custom filter issue?

2005-09-13 Thread Jeff Rodenburg
Might be the same issue, haven't been able to determine during a step-through on the code exec. You're right, no need to add a new FilteredQuery to the statement, just a search on combinedQuery with a new myCustomFilter. Unfortunately, no joy; same response. -- j On 9/13/05, Chris Hostetter <[E

Is Lucene for Me?

2005-09-13 Thread James Reynolds
Please forgive this low tech question, but I'm wondering if Lucene is an appropriate solution for a challenge I'm facing. I need a quick look up method for a growing list of customers in a database (the alphabetical select list has become too cumbersome). Lucene seems to be an excellent option f

Re: Is Lucene for Me?

2005-09-13 Thread Erik Hatcher
On Sep 13, 2005, at 8:27 PM, James Reynolds wrote: Please forgive this low tech question, but I'm wondering if Lucene is an appropriate solution for a challenge I'm facing. I need a quick look up method for a growing list of customers in a database (the alphabetical select list has become t

Re: Hits issue or custom filter issue?

2005-09-13 Thread Chris Hostetter
If you can post a short unit test demonstrating the problem, that would help us understand the problem you are having. At this point, I would guess the problem relates to your custom filter. If you look at the attachment to the bug I mentioned, you can see that the "testFilters" method demonstra

Re: Hits issue or custom filter issue?

2005-09-13 Thread Chris Hostetter
I just had another thought: is the number of results you get back from each of the filters the same as the number you get back if you apply no filter? If so, then: a) Perhaps you don't realize this, but a Filter can never be used to increase the number of results returned by a query. Filter's
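The point that a filter can only narrow a result set can be sketched with `java.util.BitSet`. This is a minimal illustration, assuming both the query matches and the filter are modelled as sets of document ids. The class `FilterSketch` and the method `applyFilter` are hypothetical names, not part of Lucene or its C# port.

```java
import java.util.BitSet;

// Hypothetical illustration of filtering as set intersection; not a Lucene class.
class FilterSketch {
    // Returns the documents that match the query AND are allowed by the filter.
    // The inputs are left unmodified.
    static BitSet applyFilter(BitSet queryMatches, BitSet filterBits) {
        BitSet result = (BitSet) queryMatches.clone();
        result.and(filterBits);
        return result;
    }

    public static void main(String[] args) {
        BitSet queryMatches = new BitSet();
        queryMatches.set(1);
        queryMatches.set(3);
        queryMatches.set(5);

        BitSet filterBits = new BitSet();
        filterBits.set(3);
        filterBits.set(5);
        filterBits.set(6); // allowed by the filter, but not matched by the query

        BitSet filtered = applyFilter(queryMatches, filterBits);
        System.out.println(filtered); // prints {3, 5}
        // The filtered set is never larger than the unfiltered one.
        System.out.println(filtered.cardinality() <= queryMatches.cardinality()); // prints true
    }
}
```

Document 6 passes the filter but never appears in the results, because the query did not match it. If a filtered search returns exactly as many hits as the unfiltered one, the filter is allowing every matching document.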

Re: Hits issue or custom filter issue?

2005-09-13 Thread Jeff Rodenburg
On 9/13/05, Chris Hostetter <[EMAIL PROTECTED]> wrote: > > > I just had another thought: is the number of results you get back from > each of the filters the same as the number you get back if you apply no > filter? In one case yes, in another case no. I've been able to test to either some res

RE: Is George Aroush still around?

2005-09-13 Thread George Aroush
Hi Jeff, Yes, I am here but *very* busy. I expect things to cool off in a week or so, at which time I intend to pick up the 1.9 port as well as the move of DotLucene from SourceForge.net to its incubated home at Apache. If you have cycles and you can help, let me know as I am currently the only