Re: Time of processing hits.doc()

2007-11-18 Thread Tzvika Barenholz
You can feed the hits vector into Quaere (http://quaere.codehaus.org/) to accomplish the SQL-like grouping you desire very easily. But I'm not sure it'll be that much quicker. Worth a shot. T On 11/18/07, Haroldo Nascimento <[EMAIL PROTECTED]> wrote: > > I have a problem of performance when I ne

Re: urgent

2007-11-18 Thread rohit saini
Hi, I think you may need to use an escape function to escape the words that do not come up in searching. Rohit On 11/17/07, Shakti_Sareen <[EMAIL PROTECTED]> wrote: > > Hi > > > > I am facing a problem in searching for a word containing a forward slash (/). > > My index file contains more than one docum
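The escape function Rohit mentions can be sketched as a helper that backslash-escapes the characters Lucene's query syntax reserves. This is a self-contained sketch modeled on QueryParser.escape(); the special-character set below matches the 2.x parser, and note that the forward slash itself is not in that set, so Shakti's slash problem may actually come from the analyzer splitting on it rather than from the parser.

```java
// Sketch of an escape helper for Lucene query syntax, modeled on
// QueryParser.escape(). The special-character set matches Lucene 2.x;
// later versions add a few more characters.
public class QueryEscaper {
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');           // prefix each special character
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("foo/bar (v1.0)")); // foo/bar \(v1.0\)
    }
}
```

The escaped string would then be handed to the query parser instead of the raw user input.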

RE: XML parsing using Lucene in Java

2007-11-18 Thread Chhabra, Kapil
Check out http://www.ibm.com/developerworks/web/library/j-lucene/ Though this page does not list a comparison between SAX and Digester, it convinced me enough to use Digester. Regards, kapilChhabra -Original Message- From: syedfa [mailto:[EMAIL PROTECTED] Sent: Monday, November 19,

RE: neither IndexWriter nor IndexReader would delete documents

2007-11-18 Thread Chhabra, Kapil
Hi, Check out the following lines in the documentation of IndexReader: * "An IndexReader can be opened on a directory for which an IndexWriter is opened already, but it cannot be used to delete documents from the index then. " * "Once a document is deleted it will not appear in TermDocs or Term

RE: Time of processing hits.doc()

2007-11-18 Thread Chhabra, Kapil
Hey! Search for the topic "Aggregating Category Hits" in the list. You'll get a few approaches that you may use to implement "groupby". Regards, kapilChhabra -Original Message- From: Haroldo Nascimento [mailto:[EMAIL PROTECTED] Sent: Monday, November 19, 2007 3:02 AM To: java-user@lucene

XML parsing using Lucene in Java

2007-11-18 Thread syedfa
Dear Fellow Lucene Developers: I am a java/jsp developer and have started learning lucene for the purpose of creating a search engine for some books that I have in xml format. The XML document is actually quite large, and I would like to provide as accurate results as possible to the user searchin

Re: Time of processing hits.doc()

2007-11-18 Thread Mark Miller
Correction: that issue to watch out for is in regard to the TopDocs HitCollector. If you were to go with your own HitCollector rather than TopDocs you might not necessarily have this problem (or at the least you can code around it). Mark Miller wrote: Hey Haroldo. First thing you need to d

Re: Time of processing hits.doc()

2007-11-18 Thread Mark Miller
Hey Haroldo. First thing you need to do is *stop* using Hits in your searches. Hits is optimized for some pretty specific use cases and you will get along much better by using a HitCollector. Hits has three main functions: It caches documents, normalizes scores, and stores ids associated wit
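The HitCollector approach Mark recommends can be sketched as follows. In Lucene 2.x you would extend org.apache.lucene.search.HitCollector and pass it to IndexSearcher.search(query, collector); since the library is not assumed here, the abstract class is reproduced locally as a stand-in so the sketch is self-contained. The point is that the searcher calls collect() once per matching document, with none of the document caching or score normalization that Hits performs.

```java
// Local stand-in for org.apache.lucene.search.HitCollector (Lucene 2.x shape).
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// A collector that counts matches and remembers the best-scoring doc id,
// doing no per-hit document loading at all.
class CountingCollector extends HitCollector {
    int count = 0;
    int bestDoc = -1;
    float bestScore = Float.NEGATIVE_INFINITY;

    @Override
    public void collect(int doc, float score) {
        count++;
        if (score > bestScore) {
            bestScore = score;
            bestDoc = doc;
        }
    }
}

public class CollectorDemo {
    public static void main(String[] args) {
        CountingCollector c = new CountingCollector();
        // Simulate the searcher calling back once per matching document.
        c.collect(3, 0.4f);
        c.collect(7, 0.9f);
        c.collect(12, 0.2f);
        System.out.println(c.count + " hits, best doc " + c.bestDoc);
    }
}
```

With the real API, only the extends target and the search call change; the collect() body stays the same.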

Re: neither IndexWriter nor IndexReader would delete documents

2007-11-18 Thread Daniel Naber
On Sunday, 18 November 2007, flateric wrote: > Has absolutely no effect. I also tried delete on the IndexWriter - no > effect. Please use the tool Luke to have a look inside your index to see if a document with field "uid" and the uid you're expecting really exists. The field should be UN_TOK

neither IndexWriter nor IndexReader would delete documents

2007-11-18 Thread flateric
Hello all; I went through some examples of the Lucene in Action book, found that the API has changed, and then applied the corrections with the help of this forum. One runtime problem however remains: I cannot delete any documents. I store documents like this: IndexWriter iw = new IndexWriter(

Re: Payloads, Tokenizers, and Filters. Oh My!

2007-11-18 Thread Tricia Williams
I apologize for cross-posting but I believe both Solr and Lucene users and developers should be concerned with this. I am not aware of a better way to reach both communities. In this email I'm looking for comments on: * Do TokenFilters belong in the Solr code base at all? * How to deal

Re: Time of processing hits.doc()

2007-11-18 Thread N. Hira
Can you explain the problem you're trying to address from the user's perspective? From the description you've provided, you may want to look up "Faceted Searching". Another option may be to use a HitCollector, but it would help us if you could describe the problem at a higher level. Re

Time of processing hits.doc()

2007-11-18 Thread Haroldo Nascimento
I have a performance problem when I need to group the results of a search. I have the code below: for (int i = 0; i < hits.length(); i++) { doc = hits.doc(i); obj1 = doc.get(Constants.STATE_DESC_FIELD_LABEL); obj2 = doc.get(xxx);
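The "group by" step Haroldo describes can be done with a map of per-value counts rather than further work on each stored document. This sketch is decoupled from Lucene: the string array stands in for the field values that the loop above pulls out with doc.get(Constants.STATE_DESC_FIELD_LABEL), and the sample values are invented.

```java
import java.util.HashMap;
import java.util.Map;

// Accumulate a count per distinct field value, i.e. a SQL-style
// "GROUP BY field, COUNT(*)" over the search results.
public class GroupByField {
    public static Map<String, Integer> countByValue(String[] fieldValues) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : fieldValues) {
            Integer old = counts.get(v);
            counts.put(v, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical per-hit values of the state field.
        String[] states = {"SP", "RJ", "SP", "MG", "SP"};
        System.out.println(countByValue(states));
    }
}
```

Combined with a HitCollector (or a cached FieldCache-style lookup from doc id to field value), this avoids loading every stored document just to read one field.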

Re: Can I use Ispell dictionaries for analyzers in Lucene?

2007-11-18 Thread Daniel Naber
On Sunday, 18 November 2007, Alebu wrote: > So what is an ispell dictionary actually? A list of rules for translating > some words (or sentences?) to a 'base form'? Or what? It's a list of terms with optional flags. For example: walk/xy In a different file, the flag "x" would then be defined as "appe
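The format Daniel describes can be illustrated with a toy expander: a dictionary line like "walk/xy" is a base term plus flag letters, and each flag points at an affix rule defined elsewhere. The two rules below are invented examples, not real ispell affix definitions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of ispell-style expansion: split "base/flags", then apply
// the affix rule each flag letter refers to.
public class IspellDemo {
    public static List<String> expand(String entry, Map<Character, String> suffixRules) {
        int slash = entry.indexOf('/');
        String base = slash < 0 ? entry : entry.substring(0, slash);
        List<String> forms = new ArrayList<>();
        forms.add(base);                       // the base form itself
        if (slash >= 0) {
            for (char flag : entry.substring(slash + 1).toCharArray()) {
                String suffix = suffixRules.get(flag);
                if (suffix != null) {
                    forms.add(base + suffix);  // one derived form per flag
                }
            }
        }
        return forms;
    }

    public static void main(String[] args) {
        Map<Character, String> rules = new HashMap<>();
        rules.put('x', "s");    // hypothetical: flag x appends "s"
        rules.put('y', "ing");  // hypothetical: flag y appends "ing"
        System.out.println(expand("walk/xy", rules)); // [walk, walks, walking]
    }
}
```

Running the expansion in reverse (mapping inflected forms back to the base term) is what a dictionary-based stemming analyzer would do at index and query time.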

Re: Can I use Ispell dictionaries for analyzers in Lucene?

2007-11-18 Thread Alebu
So what is an ispell dictionary actually? A list of rules for translating some words (or sentences?) to a 'base form'? Or what? If so, then as I understand it, it is actually possible to create an analyzer which takes an ispell dictionary as a parameter, and this way to get the full power of ispell dictionari

Re: Can I use Ispell dictionaries for analyzers in Lucene?

2007-11-18 Thread Daniel Naber
On Sunday, 18 November 2007, Alebu wrote: > 1. To analyze a non-English language I need to use a specific analyzer. You don't have to, but it helps improve recall. > Can I use Ispell dictionaries with Lucene? It depends on the dictionary. Some dictionary authors use the ispell flagging system

Can I use Ispell dictionaries for analyzers in Lucene?

2007-11-18 Thread Alebu
I was wondering about methods for analyzing various languages, and this is what I understand (please correct me if I am wrong): 1. To analyze a non-English language I need to use a specific analyzer. Link to already available contributions in the sandbox: http://svn.apache.org/repos/asf/lucene/java/trunk/contr

Estimate index filesystem requirements

2007-11-18 Thread Lothar Maerkle
Hi, I'm wondering if there is a kind of "formula" to estimate the size of a Lucene index. Searching the list, I did not find any pointers. Does anybody have a hint? What I figured out from the file format description and some empirical tests is that for every index file: Field-files: field-dat
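A back-of-the-envelope version of such a formula can be sketched as below. Both factors are assumptions, not Lucene constants: the 0.3 ratio is a commonly quoted rule of thumb for the inverted index over plain text without stored fields, and stored fields are assumed to add roughly their own size on top. Numbers like these should be calibrated against empirical tests like Lothar's before being relied on.

```java
// Rough index size estimate: inverted index as a fraction of the plain
// text, plus the stored fields roughly at their own size. Both factors
// are guesses to be tuned per corpus and analyzer.
public class IndexSizeEstimate {
    public static long estimateBytes(long plainTextBytes, long storedFieldBytes) {
        double invertedIndexRatio = 0.3; // assumed ratio, varies widely
        return (long) (plainTextBytes * invertedIndexRatio) + storedFieldBytes;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // 10 GB of text, 2 GB of stored fields -> about 5 GB of index.
        System.out.println(estimateBytes(10 * gb, 2 * gb) / gb + " GB");
    }
}
```

A per-file estimate (as the file format description suggests) would refine this by summing the term dictionary, frequency, position, and field data files separately.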