RE: best way to share cookie info (user search history, etc.) between two load balanced lucene search servers..

2007-11-12 Thread Chhabra, Kapil
Ah! There are as many ways to do this as there are unanswered questions in your mail. What kind of load balancer are you going to install? Will you be replicating the complete Lucene index on both servers? Do you plan to use MultiSearcher/ParallelMultiSearcher? Do these servers sha

RE: best way to share cookie info (user search history, etc.) between two load balanced lucene search servers..

2007-11-12 Thread Chhabra, Kapil
about the load balancer yet. That is what we are trying to find out. I was thinking of using LVS, but I am not sure how easy it is to use. I also found the Persistence feature to be quite useful. Is there any better solution for load balancing? Chhabra, Kapil wrote: > > Ah! There are so many way

RE: Lucene search question

2007-11-13 Thread Chhabra, Kapil
If it's only about the search, you could have "section" as just another field in your index. You could simply search on work as well as "section". Otherwise, if you are looking at aggregating category hits, then look at http://mail-archives.apache.org/mod_mbox/lucene-java-user/200605.mbox/[EMAI

RE: neither IndexWriter nor IndexReader would delete documents

2007-11-18 Thread Chhabra, Kapil
Hi, Check out the following lines in the documentation of IndexReader: * "An IndexReader can be opened on a directory for which an IndexWriter is opened already, but it cannot be used to delete documents from the index then." * "Once a document is deleted it will not appear in TermDocs or Term

RE: XML parsing using Lucene in Java

2007-11-18 Thread Chhabra, Kapil
Check out "http://www.ibm.com/developerworks/web/library/j-lucene/". Though this page does not compare SAX and Digester, it convinced me to use Digester. Regards, kapilChhabra -Original Message- From: syedfa [mailto:[EMAIL PROTECTED] Sent: Monday, November 19,

RE: Time of processing hits.doc()

2007-11-18 Thread Chhabra, Kapil
Hey! Search for the topic "Aggregating Category Hits" in the list. You'll get a few approaches that you may use to implement "groupby". Regards, kapilChhabra -Original Message- From: Haroldo Nascimento [mailto:[EMAIL PROTECTED] Sent: Monday, November 19, 2007 3:02 AM To: java-user@lucene

RE: Lucene Setting

2007-11-19 Thread Chhabra, Kapil
Liaqat, What exactly are you looking for? Are you sure you want to build Lucene from source and then use it? Alternatively, you could simply use the Lucene jar file (i.e., already built for you) and start playing around with it. This jar file is bundled in the archive that you might have downloaded.

RE: how to increase the performance!!!

2007-11-26 Thread Chhabra, Kapil
Hi Shakti, > I am using Searching is taking a lot of time. What do you mean by a lot of time? How much time is it taking? There are a lot of factors that affect the search speed. > The size of the folder where I am keeping the index files is 160 MB containing 3277 documents. That's not too much.

RE: RAMDirectory vs FSDirectory

2007-11-26 Thread Chhabra, Kapil
> one can improve search performance by using a RAMDirectory created from an underlying FSDirectory using one of the parameterised constructors. Is this correct? Absolutely. > Will a FSDirectory not automatically load the index into memory provided enough RAM is available? Not all index files are

RE: document deletion problem

2007-12-19 Thread Chhabra, Kapil
Hi Tushar, If you refer to the Javadocs for IndexReader, you'll come across the following line: "For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may ch

RE: TermFreqVector

2007-07-20 Thread Chhabra, Kapil
http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/Hits.html#id(int) public final int id(int n) throws IOException Returns the id for the nth document in this set. Note that ids may change when the index changes, so you cannot rely on the id to be stable. kapilCh
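The warning about unstable ids can be illustrated with a toy model. This is a plain-Java sketch, not Lucene code: it assumes a document's number is simply its position in a list, and the class name `EphemeralIds` and the keys `doc-a`, `doc-b`, `doc-c` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (not the Lucene API): a document's number is its list position.
class EphemeralIds {
    // Returns the current number of the document with the given stable key, or -1.
    static int numberOf(List<String> docs, String key) {
        return docs.indexOf(key);
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>(Arrays.asList("doc-a", "doc-b", "doc-c"));
        int before = numberOf(docs, "doc-c");
        // Deleting an earlier document and compacting shifts the later ones.
        docs.remove("doc-a");
        int after = numberOf(docs, "doc-c");
        System.out.println("doc-c: " + before + " -> " + after);
    }
}
```

In this model the number of `doc-c` changes once an earlier entry is removed, which is why a stable key stored with the document is a safer handle than the number itself.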

RE: Question regarding ignore case?

2007-07-20 Thread Chhabra, Kapil
I don't think there is any way out other than re-indexing in all-lowercase or all-uppercase (through an Analyzer or externally), and then searching in the same case you used while indexing. Even if you find a way to run case-insensitive searches, I am sure it'll add to the co
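The idea of normalizing case at both index time and search time can be sketched as follows. This is a simplified plain-Java model with a hand-rolled postings map, not a Lucene Analyzer; the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy inverted index (not the Lucene API) that lowercases terms on both sides.
class CaseInsensitiveIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // The same normalization must be applied when indexing and when searching.
    static String normalize(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    void add(int docId, String text) {
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) continue;
            postings.computeIfAbsent(normalize(token), k -> new HashSet<>()).add(docId);
        }
    }

    Set<Integer> search(String term) {
        return postings.getOrDefault(normalize(term), Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        CaseInsensitiveIndex index = new CaseInsensitiveIndex();
        index.add(1, "Apache Lucene");
        index.add(2, "lucene in action");
        System.out.println(index.search("LUCENE"));
    }
}
```

Because the normalization happens once per term, the search itself stays an exact lookup and needs no extra case handling.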

RE: deleting/updating/identifying a document

2007-07-20 Thread Chhabra, Kapil
Is that not also true of any RDBMS table that does not have a primary key? If this is the problem you are facing, it can be solved by introducing a unique identifier as a field in your index, which would act as a primary key for your index. Using an untokenized field might not be a good

RE: Problem Search using lucene

2007-07-31 Thread Chhabra, Kapil
You just have to make sure that what you are searching for is indexed (and especially in the same format/case). Use Luke (http://www.getopt.org/luke/) to browse through your index. This might give you an insight into what you have indexed and what you are searching for. Regards, kapilChhabra -Original Me

RE: Getting only the Ids, not the whole documents.

2007-08-02 Thread Chhabra, Kapil
What is the structure of your index? If you haven't already, add a new field to your index that stores the contractId. For all other fields, set the "store" flag to false while indexing. You can now safely retrieve the value of this contractId field based on your search results. Regards, kapil
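The split between searchable text and a single stored identifier might be modelled like this. It is a plain-Java sketch under assumptions, not Lucene code: the body text is indexed but not kept, and only the contractId is stored per document. The class name and sample ids are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model (not the Lucene API): body terms are searchable, only the id is stored.
class StoredFieldModel {
    private final Map<String, List<Integer>> postings = new HashMap<>(); // term -> doc numbers
    private final List<String> storedContractIds = new ArrayList<>();    // doc number -> stored id

    void add(String contractId, String body) {
        int docNumber = storedContractIds.size();
        storedContractIds.add(contractId);
        // The body is tokenized into postings and then discarded ("store" = false).
        for (String token : body.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) continue;
            List<Integer> docs = postings.computeIfAbsent(token, k -> new ArrayList<>());
            if (!docs.contains(docNumber)) docs.add(docNumber);
        }
    }

    // Returns only the stored ids of matching documents, never the body text.
    List<String> searchContractIds(String term) {
        List<String> ids = new ArrayList<>();
        String key = term.toLowerCase(Locale.ROOT);
        for (int docNumber : postings.getOrDefault(key, Collections.<Integer>emptyList())) {
            ids.add(storedContractIds.get(docNumber));
        }
        return ids;
    }

    public static void main(String[] args) {
        StoredFieldModel index = new StoredFieldModel();
        index.add("C-100", "lease agreement office");
        index.add("C-200", "supply agreement");
        System.out.println(index.searchContractIds("agreement"));
    }
}
```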

RE: speedup indexing

2007-08-06 Thread Chhabra, Kapil
Try going through: http://wiki.apache.org/lucene-java/ImproveIndexingSpeed Regards, kapilChhabra -Original Message- From: SK R [mailto:[EMAIL PROTECTED] Sent: Monday, August 06, 2007 5:09 PM To: java-user@lucene.apache.org Subject: speedup indexing Hi, I have indexed 5 fields and s

RE: Multiple fields vs one field

2007-08-06 Thread Chhabra, Kapil
Hey Albert, Just to remind you that fields in Lucene are per document and not per index. This means that you can have documents in an index which have different fields altogether. So, in effect, you can add all your document types to your existing index. And guess what, you don't need to change an
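The point that fields belong to documents rather than to the index can be pictured with a toy model. This is a plain-Java sketch, not Lucene code: each document is a map of field names to values, and the field names `title`, `isbn`, and `name` are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (not the Lucene API): one index holding documents with different fields.
class MixedDocumentIndex {
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Adds a document with whatever fields it has and returns its position.
    int add(Map<String, String> fields) {
        docs.add(new HashMap<>(fields));
        return docs.size() - 1;
    }

    // Matches only documents that actually carry the field with that value.
    List<Integer> search(String field, String value) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (value.equals(docs.get(i).get(field))) hits.add(i);
        }
        return hits;
    }

    public static void main(String[] args) {
        MixedDocumentIndex index = new MixedDocumentIndex();
        Map<String, String> book = new HashMap<>();
        book.put("title", "lucene");
        book.put("isbn", "123");
        Map<String, String> person = new HashMap<>();
        person.put("name", "albert");
        index.add(book);
        index.add(person);
        System.out.println(index.search("name", "albert"));
    }
}
```

Documents that lack a field are simply never matched by a query on that field, so adding a new document type does not disturb the existing ones.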

RE: lucene suggest

2007-08-21 Thread Chhabra, Kapil
Hi, Yes, there are ways and workarounds to remove duplicates based on one field. But you should not need this if you don't index duplicates in the first place. Just put a call to "delete" from the index right before you add the document to it. Best Regards, Kapil Chhabra -Original Message- Fr
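The delete-before-add pattern can be sketched as below. This is a plain-Java model, not Lucene code: it assumes each document carries a unique key field, and the class and method names are hypothetical. In Lucene itself, IndexWriter.updateDocument(Term, Document) combines the two steps in one call.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not the Lucene API): each document is a {key, body} pair.
class DeleteBeforeAdd {
    private final List<String[]> docs = new ArrayList<>();

    // Naive add: repeated keys pile up as duplicates.
    void addWithoutDelete(String key, String body) {
        docs.add(new String[] {key, body});
    }

    // Delete every document with this key first, then add the new version.
    void deleteThenAdd(String key, String body) {
        docs.removeIf(doc -> doc[0].equals(key));
        docs.add(new String[] {key, body});
    }

    int count(String key) {
        int n = 0;
        for (String[] doc : docs) {
            if (doc[0].equals(key)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        DeleteBeforeAdd index = new DeleteBeforeAdd();
        index.addWithoutDelete("42", "first version");
        index.addWithoutDelete("42", "second version");
        System.out.println("naive adds: " + index.count("42"));
        index.deleteThenAdd("42", "third version");
        System.out.println("after delete-then-add: " + index.count("42"));
    }
}
```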