Searching in various Index

2007-08-27 Thread payo
Hi all, I'm new to Lucene. I create a directory for each document format: one directory for HTML documents, one for PDF documents, and one for XML documents. For each directory I create an index. How do I search across all my indexes in all directories? Thanks --
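
A note on the question above: Lucene releases of that era shipped a MultiSearcher class that wrapped several searchers so that one query could run across all of them; check the javadoc of the version in use before relying on it. The library-free Java sketch below only illustrates the idea behind it: query each per-format index, then merge the hits by score. The class and method names (MultiIndexSketch, Hit, searchAll, sampleIndexes), the toy documents, and the term-count score are all assumptions made for illustration; none of this is Lucene API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class MultiIndexSketch {

    /** A single result: the index it came from, the document id, and a score. */
    static final class Hit {
        final String index;
        final String docId;
        final int score;

        Hit(String index, String docId, int score) {
            this.index = index;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Counts occurrences of the term in the text (case-insensitive, whitespace-tokenized). */
    static int termFrequency(String text, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    /** Searches every index and merges the hits into one list, highest score first. */
    static List<Hit> searchAll(Map<String, Map<String, String>> indexes, String term) {
        List<Hit> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> index : indexes.entrySet()) {
            for (Map.Entry<String, String> doc : index.getValue().entrySet()) {
                int tf = termFrequency(doc.getValue(), term);
                if (tf > 0) {
                    hits.add(new Hit(index.getKey(), doc.getKey(), tf));
                }
            }
        }
        hits.sort((a, b) -> Integer.compare(b.score, a.score));
        return hits;
    }

    /** Builds three toy indexes (hypothetical data), one per document format. */
    static Map<String, Map<String, String>> sampleIndexes() {
        Map<String, String> html = new LinkedHashMap<>();
        html.put("a.html", "lucene search basics");
        Map<String, String> pdf = new LinkedHashMap<>();
        pdf.put("b.pdf", "lucene index lucene query");
        Map<String, String> xml = new LinkedHashMap<>();
        xml.put("c.xml", "unrelated content");

        Map<String, Map<String, String>> indexes = new LinkedHashMap<>();
        indexes.put("html", html);
        indexes.put("pdf", pdf);
        indexes.put("xml", xml);
        return indexes;
    }

    public static void main(String[] args) {
        for (Hit hit : searchAll(sampleIndexes(), "lucene")) {
            System.out.println(hit.index + "/" + hit.docId + " score=" + hit.score);
        }
    }
}
```

The per-format indexes stay separate on disk; only the result lists are combined. Scores from different indexes are compared directly here, which is reasonable for a toy term count but may need more care with real relevance scores.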

Re: Searching Diacritics

2007-08-27 Thread Karl Wettin
On 27 Aug 2007, at 20:30, anorman wrote: I've tried to implement an analyzer that differs only slightly, by using: result = new ISOLatin1AccentFilter(result); in the tokenStream method. Everything appears to work; however, my search does not work for any word with diacritics after that change.

Re: Searching Diacritics

2007-08-27 Thread anorman
I've tried to implement an analyzer that differs only slightly, by using: result = new ISOLatin1AccentFilter(result); in the tokenStream method. Everything appears to work; however, my search does not work for any word with diacritics after that change. Without it, the search finds words such as "cèdu

Re: Reusing FIeld Cache when opening a new IndexSearcher

2007-08-27 Thread Karl Wettin
On 27 Aug 2007, at 16:52, Antoine Baudoux wrote: Is there a way to re-use some of the fieldCaches of the previous IndexSearcher? You can take a look at what Solr does: http://wiki.apache.org/solr/SolrCaching -- karl

Re: Highlighter that works with phrase and span queries

2007-08-27 Thread Mike Klaas
Mark, I'm still interested in integrating this into Solr--this is a feature that has been requested a few times. It would be easier to do so if it were a contrib/... thanks for the great work, -Mike On 27-Aug-07, at 4:21 AM, Mark Miller wrote: I am a bit unclear about your question. The

Re: Searching Diacritics

2007-08-27 Thread anorman
Can I do this at search time rather than index time? Below is my code that is handling the searching, where would I utilize such a filter? Thanks for the help! package search.lucene.search; import org.apache.lucene.document.Document; import java.io.IOException; import java.util.ArrayList; im
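
On the index-time versus search-time question: the symptom reported in this thread would be consistent with accents being folded in the query while the index still holds the accented terms, although the truncated messages do not confirm that. The sketch below illustrates the general point with the JDK only: a folded query term can match only if the indexed terms were folded the same way. It uses java.text.Normalizer as a stand-in for Lucene's accent filter; the class and method names (AccentFoldingSketch, fold, index, matches) and the sample sentence are assumptions for illustration.

```java
import java.text.Normalizer;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class AccentFoldingSketch {

    /** Lower-cases the text and strips combining marks left by NFD decomposition. */
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    /** Builds a toy index: the set of terms, accent-folded or only lower-cased. */
    static Set<String> index(String text, boolean foldAtIndexTime) {
        Set<String> terms = new HashSet<>();
        for (String token : text.split("\\s+")) {
            terms.add(foldAtIndexTime ? fold(token) : token.toLowerCase(Locale.ROOT));
        }
        return terms;
    }

    /** Looks up a single query term, folding it first when requested. */
    static boolean matches(Set<String> index, String query, boolean foldAtSearchTime) {
        String term = foldAtSearchTime ? fold(query) : query.toLowerCase(Locale.ROOT);
        return index.contains(term);
    }

    public static void main(String[] args) {
        String doc = "Voyage en Am\u00e8rique";
        Set<String> rawIndex = index(doc, false);
        Set<String> foldedIndex = index(doc, true);

        // Folding only the query: the index still holds the accented term.
        System.out.println(matches(rawIndex, "Am\u00e8rique", true));    // false
        // Folding on both sides: accented and plain queries both match.
        System.out.println(matches(foldedIndex, "Am\u00e8rique", true)); // true
        System.out.println(matches(foldedIndex, "Amerique", true));      // true
    }
}
```

Under these assumptions, folding at search time alone makes accented words harder to find, not easier; the existing index would need to be rebuilt with the same folding for both spellings to match.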

Re: Highlighter that works with phrase and span queries

2007-08-27 Thread Michael Stoppelman
Ah, much clearer now. It seems that the jar file is just the class files. Is the source/javadoc code somewhere else? -M On 8/27/07, Mark Miller <[EMAIL PROTECTED]> wrote: > > I am a bit unclear about your question. The patch you mention extends > the original Highlighter to support phrase and spa

Reusing FIeld Cache when opening a new IndexSearcher

2007-08-27 Thread Antoine Baudoux
Hello, I have a customScoreQuery that uses some FieldCaches to compute document scores. Every 5 minutes I open a new IndexSearcher. Is there a way to re-use some of the fieldCaches of the previous IndexSearcher? thx, Antoine -- Antoine Baudoux Development Manager [EMAIL PR

Re: Searching Diacritics

2007-08-27 Thread thomas arni
You can extend the DefaultAnalyzer. The only thing you have to do is override the tokenStream method like this: /** Constructs a {@link StandardTokenizer} filtered by a {@link StandardFilter}, a {@link LowerCaseFilter} and a {@link StopFilter}.

Re: Searching Diacritics

2007-08-27 Thread anorman
This looks like exactly what I want. Would I implement this along with another analyzer, such as the standard one, or on its own? Does anyone have any code examples of implementing such a thing? Thanks, Albert karl wettin-3 wrote: > > > On 27 Aug 2007, at 16:03, anorman wrote: > >> >> I have a se

Re: Searching Diacritics

2007-08-27 Thread Karl Wettin
On 27 Aug 2007, at 16:03, anorman wrote: I have a searchable index of documents which contain French and Spanish diacritics (è, é, À, etc.). I would like to make the content searchable so that when a user searches for a word such as "Amèrique" or "Amerique" (without diacritic) then it returns

Searching Diacritics

2007-08-27 Thread anorman
I have a searchable index of documents which contain French and Spanish diacritics (è, é, À, etc.). I would like to make the content searchable so that when a user searches for a word such as "Amèrique" or "Amerique" (without diacritic) then it returns the same results. Has anyone set up something

Re: function query - get DocValues

2007-08-27 Thread Will Johnson
the patch i have worked out allows passing up the DocValues returned by the ValueSourceScorer to the ValueSourceQuery and then on up via a getDocValues(). am i missing any design/performance issues or does this sound generally useful? i'll submit a patch to jira once everything is doc'd a

Re: Maximum index size

2007-08-27 Thread Karl Wettin
On 27 Aug 2007, at 14:34, Antoine Baudoux wrote: Ok, I will follow the documentation. But apart from optimization tips, are there intrinsic limits in terms of: - Index size. Integer.MAX_VALUE documents. - number of distinct indexed terms. Integer.MAX_VALUE unique tokens per field. htt

Re: Maximum index size

2007-08-27 Thread Antoine Baudoux
Ok, I will follow the documentation. But apart from optimization tips, are there intrinsic limits in terms of: - Index size. - number of distinct indexed terms. -- Antoine Baudoux Development Manager [EMAIL PROTECTED] Tél.: +32 2 333 58 44 GSM: +32 499 534 538 Fax.: +32 2 648 16 53 On 27

Re: Highlighter that works with phrase and span queries

2007-08-27 Thread Mark Miller
I am a bit unclear about your question. The patch you mention extends the original Highlighter to support phrase and span queries. It does not include any major performance increases over the original Highlighter (in fact, it takes a bit longer to Highlight a Span or Phrase query than it does t

Re: Maximum index size

2007-08-27 Thread Grant Ingersoll
Have a look at: http://wiki.apache.org/lucene-java/BasicsOfPerformance and the section on indexing. Please provide more details about what you are doing and your hardware, memory, etc. if that doesn't help. On Aug 27, 2007, at 2:26 AM, Antoine Baudoux wrote: Hello, I'm i

RE: Get results from partial search keyword

2007-08-27 Thread Samir Abdou
Hi, Here's an example of the analyzer's main method: public TokenStream tokenStream(String fieldName, Reader reader) { TokenStream result = new StandardTokenizer(reader); result = new StandardFilter(result); result = new LowerCaseFilter(result); result = new StopFilter(result, s
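
The chain in the message above wraps each stage around the previous one: tokenize, normalize, lower-case, drop stop words, with a stemmer added as a further stage. As a rough, library-free illustration of why that helps partial keywords match, the sketch below runs the same kind of pipeline in plain Java. It is not Lucene code: the names (AnalyzerChainSketch, tokenize, stem, analyze), the stop-word list, and the naive suffix stripper are assumptions, and the stripper is a deliberately simple placeholder, not the Porter algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class AnalyzerChainSketch {

    /** Hypothetical stop-word list for the example. */
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of", "and"));

    /** Splits on any run of characters that are neither letters nor digits. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Naive suffix stripping; a placeholder for a real stemmer, NOT Porter. */
    static String stem(String token) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (token.endsWith(suffix) && token.length() > suffix.length() + 2) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    /** Runs the whole chain: tokenize, lower-case, remove stop words, stem. */
    static List<String> analyze(String text) {
        List<String> result = new ArrayList<>();
        for (String token : tokenize(text)) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                result.add(stem(lower));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [index, search, document]
        System.out.println(analyze("The indexing of searched documents"));
        // Prints [search], so a query for "searching" shares a term with the text above
        System.out.println(analyze("searching"));
    }
}
```

Because the same chain runs over both documents and queries, "searching" and "searched" reduce to one term and match each other.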

RE: Get results from partial search keyword

2007-08-27 Thread spinergywmy
Hi, I just wonder whether you have an example I can refer to on how to use the stemmer filter. Thank you.

RE: Get results from partial search keyword

2007-08-27 Thread Samir Abdou
Hi, To handle such a problem, you should use an analyzer with a stemmer (Porter stemmer for example). You just have to add the stemmer filter to your analyzer. Hope this helps, Samir -Original Message- From: spinergywmy [mailto:[EMAIL PROTECTED] Sent: Monday, 27 August 2007 05:55 To:

Re: Highlighter that works with phrase and span queries

2007-08-27 Thread Michael Stoppelman
Is this jar going to be in the next release of Lucene? Also, are these the same as the changes in the following patch: https://issues.apache.org/jira/secure/attachment/12362653/spanhighlighter10.patch -M On 6/27/07, Mark Miller <[EMAIL PROTECTED]> wrote: > > > > I have not looked at any highlight