RE: exception to open a large index Insufficient system resources exist

2009-09-01 Thread Fang_Li
32-bit JVM, 1.3G allocated heap size, Lucene 2.4.1. In my opinion, this exception should not be caused by running out of memory or out of system file handles, because a different exception would be thrown in each of those two cases. Any hints? Thanks, Fang, Li -Original Message- From: Uwe Schindler

RE: How to search both Tokenized and Untokenized fields

2009-03-11 Thread Fang_Li
Hi, What do you mean by an untokenized field? Are you using a different analyzer for different fields? If so, I think you should just use the same analyzer (PerFieldAnalyzerWrapper, I guess) for the query. Li -Original Message- From: rokham [mailto:somebodyik...@gmail.com] Sent: Monday, March 09, 2009 11:02 PM

RE: IndexWriter 2-phase commit usage

2009-02-24 Thread Fang_Li
prepareCommit should do most of the real work, so the chance of index2.commit() failing should be slim. I think it's very hard to compensate for changes that are already committed. One solution is to create a separate index for each transaction and merge them later. Merging can fail, but the transaction
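
The two-phase pattern discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the poster's code: `TwoPhaseResource` and `TwoPhaseCommit` are hypothetical names, and the interface only mirrors the `prepareCommit()`, `commit()`, and `rollback()` calls that `IndexWriter` offers, so the sketch runs without Lucene on the classpath.

```java
import java.util.List;

/** Hypothetical stand-in for the three IndexWriter calls used in a two-phase commit. */
interface TwoPhaseResource {
    void prepareCommit() throws Exception; // phase 1: do the expensive work, publish nothing
    void commit() throws Exception;        // phase 2: publish what phase 1 prepared
    void rollback() throws Exception;      // discard everything not yet committed
}

/** Hypothetical helper that commits several resources together. */
final class TwoPhaseCommit {
    private TwoPhaseCommit() {}

    /**
     * Returns true if every resource committed. Returns false if any
     * prepareCommit() failed, in which case all resources are rolled back.
     */
    static boolean commitAll(List<TwoPhaseResource> resources) throws Exception {
        try {
            // Phase 1: most failures (disk full, I/O errors) should surface here.
            for (TwoPhaseResource r : resources) {
                r.prepareCommit();
            }
        } catch (Exception e) {
            // Nothing has been published yet, so rolling back is still safe.
            for (TwoPhaseResource r : resources) {
                try {
                    r.rollback();
                } catch (Exception ignored) {
                    // Best effort: keep rolling back the remaining resources.
                }
            }
            return false;
        }
        // Phase 2: expected to be cheap. A failure here is not compensated,
        // because earlier resources in the list are already committed.
        for (TwoPhaseResource r : resources) {
            r.commit();
        }
        return true;
    }
}
```

The sketch makes the weak point visible: an exception in the second loop leaves earlier resources committed, which is the case the reply describes as very hard to compensate.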

RE: the efficiency of creating indexes

2009-02-18 Thread Fang_Li
Did you try it? The cost of index merging grows as the indexes get bigger. Try limiting the maximum number of documents in a segment by calling setMaxMergeDocs on IndexWriter. -Original Message- From: 治江 王 [mailto:wangzhijiang...@yahoo.com.cn] Sent: Monday, February 16, 2009 1:49 PM To:

the impact of thousands of field in a single document

2009-02-18 Thread Fang_Li
Hi, Due to a requirement, we need to construct a Lucene document with tens of thousands of Field instances. Has anyone tried this? What is the performance penalty, for both indexing and searching, compared with using one single field to store all tokens? Thanks, Li

RE: Does lucene support distributed indexing?

2008-04-28 Thread Fang_Li
Solr does not do distributed indexing, only index replication; all copies are identical. Lucene has some built-in support for distributed search; please take a look at RemoteSearchable. For indexing, a naïve approach is to put a load balancer in front. Regards, -Original Message- From:

RE: how to query against payload

2008-04-22 Thread Fang_Li
Hi Grant, Thanks for your help. BoostingTermQuery uses reader.termPositions(term) to get the term positions. We cannot put a payload value into the Term to find the matching documents. What I want is to find all documents that have a specific payload value on a specific term. We do not

how to query against payload

2008-04-21 Thread Fang_Li
Hi, I want to use payloads to store an object ID, which is an arbitrary byte array, for better performance. But I also need some way to search against the payload value. Also, once the hits are available, how do I get the payload of a specific term from a

RE: Need additional info for Field (hoping friends who can read Chinese can give me some advice)

2008-04-21 Thread Fang_Li
Try using a payload, which is stored as additional information. Currently Lucene only supports per-token payloads, but you can add an arbitrary token to carry the time information. I am not sure what you are querying on: only the subtitle, or both the subtitle and the time? Regards, -Original

RE: Performance searching over multiple indexes

2007-10-25 Thread Fang_Li
Using more than one index will definitely decrease search performance. Most of Lucene's search latency comes from loading the hits. If there are no hits, a search takes a short time, a few dozen milliseconds, and that is roughly constant if the number of documents is less than 1M. Searching 100 indexes will take 100

RE: store index on web server directory

2007-09-29 Thread Fang_Li
Can you map the remote file system in Windows or mount it in Linux? Lucene relies on OS-level file system access. Has anyone tried storing the index files on a NAS/SAN? How is the performance? Regards, Li -Original Message- From: othman [mailto:[EMAIL PROTECTED] Sent: Thursday, September

RE: Lucene index performance

2007-06-19 Thread Fang_Li
Hi Andreas, I am very interested in indexing and searching with multiple index files. Could you kindly help me with the following questions? 1) Why do you use multiple index files? How much is the performance gain for both indexing and searching? Someone reported that there is no big performance difference except the

RE: Help IndexWriter,Multi-threaded index access

2007-05-11 Thread Fang_Li
Hi, You cannot create more than one IndexWriter for a single index. But you can share the IndexWriter across multiple servlets or threads. Don't open a new IndexWriter in each thread; reuse the existing one. Regards, -Original Message- From: legrand thomas [mailto:[EMAIL
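
The "one writer, many threads" advice above might look something like the sketch below. It is only an illustration under stated assumptions: `SketchWriter`, `SharedWriter`, and `SharedWriterDemo` are hypothetical names, and `SketchWriter` is a thread-safe stand-in used so the example runs without Lucene. In a real application the shared object would be the single `IndexWriter` opened for the index.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an index writer; only one should exist per index. */
final class SketchWriter {
    private final AtomicInteger docs = new AtomicInteger();

    /** Thread-safe: counts the document instead of really indexing it. */
    void addDocument(String doc) {
        docs.incrementAndGet();
    }

    int docCount() {
        return docs.get();
    }
}

/** Holds the single writer instance that every servlet or thread reuses. */
final class SharedWriter {
    private static final SketchWriter INSTANCE = new SketchWriter();

    private SharedWriter() {}

    static SketchWriter get() {
        return INSTANCE;
    }
}

final class SharedWriterDemo {
    private SharedWriterDemo() {}

    /** Adds docsPerThread documents from each thread through the one shared writer. */
    static int indexConcurrently(int threads, int docsPerThread) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                // Every worker asks for the same instance; none opens its own writer.
                SketchWriter writer = SharedWriter.get();
                for (int i = 0; i < docsPerThread; i++) {
                    writer.addDocument("doc-" + id + "-" + i);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return SharedWriter.get().docCount();
    }
}
```

Calling `SharedWriterDemo.indexConcurrently(4, 100)` on a fresh JVM adds 400 documents through one writer. The holder class is one possible design; passing the writer to each worker explicitly would serve the same purpose.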