log4j error

2008-01-18 Thread testn
Since I upgraded to Lucene 2.3, I have started to see some error messages coming from log4j via Lucene. Has anyone ever experienced this? Is this a classloading issue? javax.ejb.EJBException: EJB Exception: : java.lang.IllegalStateException: Current state = FLUSHED, new state = CODING at java.ni

How's 2.3 doing?

2007-11-13 Thread testn
Hi, Are we close to releasing Lucene 2.3? Is it stable enough for production? I thought it was supposed to be released in October. Thanks, -- View this message in context: http://www.nabble.com/How%27s-2.3-doing--tf4802426.html#a13740560 Sent from the Lucene - Java Users mailing list archive at Na

Re: Java Heap Space -Out Of Memory Error

2007-09-17 Thread testn
As I mentioned, the IndexReader is what holds the memory. You should explicitly close the underlying IndexReader to make sure that the reader releases the memory. Sebastin wrote: > > Hi testn, > Every IndexFolder is 1.5 GB in size, even though when > i
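
Something like this untested sketch is what I mean (the index path is just an example); closing the searcher alone does not close a reader you opened yourself:

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;

public class CloseReaderSketch {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open("/path/to/index"); // hypothetical path
        IndexSearcher searcher = new IndexSearcher(reader);
        try {
            // ... run your searches here ...
        } finally {
            searcher.close();
            reader.close(); // the reader holds the term index in memory; close it explicitly
        }
    }
}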

Re: Java Heap Space -Out Of Memory Error

2007-09-15 Thread testn
: > > Hi testn, > > it gives a performance improvement while optimizing the index. > > Now I separate the IndexStore on a daily basis (i.e.) > for every day it creates a new index store, sep-08-2007, sep-09-2007; like > wise it will minimize the size of the IndexStore. Could you giv

Re: Java Heap Space -Out Of Memory Error

2007-09-14 Thread testn
So did you see any improvement in performance? Sebastin wrote: > > It works finally. I use Lucene 2.2 in my application. Thanks testn and > Mike > > Michael McCandless-2 wrote: >> >> >> It sounds like there may be a Lucene version mismatch? When Luke was

Re: Java Heap Space -Out Of Memory Error

2007-09-13 Thread testn
As Mike mentioned, which version of Lucene are you using? Also, can you post the stack trace? Sebastin wrote: > > Hi testn, > I wrote the case wrongly; actually the error is > > java.io.ioexception file not found-segments > > testn wrote: >

Re: Java Heap Space -Out Of Memory Error

2007-09-13 Thread testn
Should the file be "segments_8" and "segments.gen"? Why is it "Segment"? The case is different. Sebastin wrote: > > java.io.IoException:File Not Found- Segments is the error message > > testn wrote: >> >> What is the error message? P

Re: Java Heap Space -Out Of Memory Error

2007-09-13 Thread testn
What is the error message? Mike, Erick or Yonik can probably help you better on this since I'm not an expert in the index area. Sebastin wrote: > > Hi testn, > 1. I optimize the large indexes of size 10 GB using Luke. It > optimizes all the content into a single CFS file

OutOfMemoryError: allocLargeArray

2007-09-13 Thread testn
I got this intermittent stack trace 3 times in a month using Lucene with JRockit. Has anyone ever seen this with Lucene 2.2? java.lang.OutOfMemoryError: allocLargeArray at org.apache.lucene.util.PriorityQueue.initialize(PriorityQueue.java:36) at org.apache.lucene.index.SegmentMe

Re: Performance Questions

2007-09-10 Thread testn
- The Searcher itself doesn't cost much. The cost comes from the construction of the TermInfosReader from the IndexReader.
- This means you can construct a number of searchers based on different combinations of indices.
- If I were you, I would construct a number of indices based on the demand for freshness.

Re: Query without Analyzer

2007-09-10 Thread testn
Alice, You need to do the following: - When you create a document, you need to add a category id field, using something like doc.add(new Field("categoryId", categoryId, Field.Store.YES, Field.Index.UN_TOKENIZED)); Alice-21 wrote: > > Hi folks! > > > > I'm using Lucene to provide search on m
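
A rough sketch of both sides (indexing the untokenized category id and then restricting a search to it); the field and method names here are only illustrative:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class CategorySketch {
    // Index side: store the category id as a single untokenized term.
    static void addCategory(Document doc, String categoryId) {
        doc.add(new Field("categoryId", categoryId, Field.Store.YES, Field.Index.UN_TOKENIZED));
    }

    // Search side: require the category id alongside the user's text query.
    static Query restrictToCategory(Query textQuery, String categoryId) {
        BooleanQuery q = new BooleanQuery();
        q.add(textQuery, BooleanClause.Occur.MUST);
        q.add(new TermQuery(new Term("categoryId", categoryId)), BooleanClause.Occur.MUST);
        return q;
    }
}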

Re: Implement a filter to the search results

2007-09-10 Thread testn
It's probably easier to add category, department, and year as part of the query and then requery to get the hits you need. M.K wrote: > > Hi All, > > I have a search form which has an input area for keyword search and also > three > optional select boxes *Category, Department and Year. * > My question
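
A sketch of that approach (field names are guesses based on the form described); each selected dropdown value just becomes an extra required term:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class FormQuerySketch {
    // keywordQuery comes from parsing the free-text box; the other
    // arguments may be null when the user left a dropdown unselected.
    static Query build(Query keywordQuery, String category, String department, String year) {
        BooleanQuery q = new BooleanQuery();
        q.add(keywordQuery, BooleanClause.Occur.MUST);
        if (category != null)   q.add(new TermQuery(new Term("category", category)), BooleanClause.Occur.MUST);
        if (department != null) q.add(new TermQuery(new Term("department", department)), BooleanClause.Occur.MUST);
        if (year != null)       q.add(new TermQuery(new Term("year", year)), BooleanClause.Occur.MUST);
        return q;
    }
}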

Re: Java Heap Space -Out Of Memory Error

2007-09-10 Thread testn
1. You can close the searcher once you're done. If you want to reopen the index, you can close and reopen only the 3 updated readers and keep reusing the 2 old IndexReaders; see the sketch below. That should reduce the time it takes to reopen. 2. Make sure that you optimize the index every once in a while. 3. You might consider s
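
A rough sketch of point 1, assuming five sub-indexes where only the last three ever change (the paths and the static/updated split are hypothetical):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;

public class ReuseReadersSketch {
    public static void main(String[] args) throws Exception {
        // Readers over the two static indexes: open once and keep them.
        IndexReader stable1 = IndexReader.open("/idx/old-1");   // hypothetical paths
        IndexReader stable2 = IndexReader.open("/idx/old-2");

        // Readers over the three indexes that receive updates: reopen only these.
        IndexReader fresh1 = IndexReader.open("/idx/new-1");
        IndexReader fresh2 = IndexReader.open("/idx/new-2");
        IndexReader fresh3 = IndexReader.open("/idx/new-3");

        IndexSearcher searcher = new IndexSearcher(
                new MultiReader(new IndexReader[] { stable1, stable2, fresh1, fresh2, fresh3 }));
        // ... search ...
        // When the updated indexes change: close and reopen only fresh1..fresh3,
        // rebuild the MultiReader, and keep reusing stable1 and stable2.
    }
}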

Re: Java Heap Space -Out Of Memory Error

2007-09-05 Thread testn
area code and phone number separately if the numbers are pretty evenly distributed. Sebastin wrote: > > Hi testn, > here are my index details: > Indexed fields: 5 fields > Stored fields: 10 fields >

Re: Java Heap Space -Out Of Memory Error

2007-09-04 Thread testn
Can you provide more info about your index? How many documents and fields, and what is the average document length? Sebastin wrote: > > Hi testn, > I index the dateSc in 070904 (2007/09/04) format. I am not using > any timestamp here. How can we effectively reopen the IndexSear

Re: Java Heap Space -Out Of Memory Error

2007-09-04 Thread testn
Check out the wiki for more information: http://wiki.apache.org/jakarta-lucene/LargeScaleDateRangeProcessing Sebastin wrote: > > Hi All, > I search 3 Lucene index stores of size 6 GB, 10 GB, 10 GB of > records using the MultiReader class. > > Here is the code snippet: > > >

Re: Java Heap Space -Out Of Memory Error

2007-09-04 Thread testn
I think you store dateSc with full precision, i.e. with the time. You should consider indexing just the date part, or only down to the resolution you really need. That should reduce the memory used when constructing the date range query, and it will improve search performance as well. Sebastin wrote: > > Hi All,
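
A small sketch using DateTools to index only day resolution (the field name dateSc is taken from the thread; everything else is illustrative):

import java.util.Date;

import org.apache.lucene.document.DateTools;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class DateResolutionSketch {
    static void addDate(Document doc, Date date) {
        // "20070904" instead of a full timestamp: far fewer unique terms,
        // so a range query over the field has far fewer terms to enumerate.
        String day = DateTools.dateToString(date, DateTools.Resolution.DAY);
        doc.add(new Field("dateSc", day, Field.Store.YES, Field.Index.UN_TOKENIZED));
    }
}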

Re: out of order

2007-08-16 Thread testn
There are two files: 1. segments_2 [-1, -1, -3, 0, 0, 1, 20, 112, 39, 17, -80, 0, 0, 0, 0, 0, 0, 0, 0] 2. segments.gen [-1, -1, -1, -2, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2] but this is from when the index is done properly. hossman wrote: > > : After you close that IndexWriter, can

Re: out of order

2007-08-16 Thread testn
t; > Can you shed some light on how the application is using Lucene? Are you > doing > deletes as well as adds? Opening readers against this RAMDirectory? > Closing/ > opening writers at different times? Any changes to the default parameters > (mergeFactor, maxBufferedDocs,

Re: query question

2007-08-16 Thread testn
Can you post your code? Make sure that when you use a wildcard in your custom query parser, it generates either a WildcardQuery or a PrefixQuery correctly. is_maximum wrote: > > Yes karl, when I explore the index with Luke I can see the terms; > for example I have a field named patientResult, it

Re: out of order

2007-08-16 Thread testn
s-2 wrote: > > > Well then that is particularly spooky!! > > And, hopefully, possible/easy to reproduce. Thanks. > > Mike > > "testn" <[EMAIL PROTECTED]> wrote: >> >> I use RAMDirectory and the error often shows the low number. Last time it >

Re: out of order

2007-08-15 Thread testn
I use RAMDirectory and the error often shows a low number. Last time it happened with the message "7<=7". Next time it happens, I will try to capture the stack trace. Michael McCandless-2 wrote: > > > "testn" <[EMAIL PROTECTED]> wrote: >> >> Us

out of order

2007-08-15 Thread testn
Using Lucene 2.2.0, I still sporadically get a "doc out of order" error. I index all of my stuff in one thread. Do you have any idea why this happens? Thanks! -- View this message in context: http://www.nabble.com/out-of-order-tf4276385.html#a12172277 Sent from the Lucene - Java Users mailing list

RE: High CPU usage during index and search

2007-08-13 Thread testn
. Chew Yee Chuang wrote: > > Hi testn, > > I have tested Filter; it is pretty fast, but it still takes a lot of CPU > resources. Maybe it could be due to the number of filters I run. > > Thank you > eChuang, Chew > > > -----Original Message- > From: testn [mailt

RE: High CPU usage during index and search

2007-08-07 Thread testn
Check out the Filter class. You can create a separate filter for each field and then chain them together using ChainedFilter. If you cache the filters, it will be pretty fast. Chew Yee Chuang wrote: > > Greetings, > > Yes, processing a little bit and then stopping for a while really reduces the CPU > usage, > bu
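
A hedged sketch of the idea; ChainedFilter lives in the contrib "miscellaneous" jar around this version, so its exact package is an assumption, and the field values are made up:

import org.apache.lucene.index.Term;
import org.apache.lucene.misc.ChainedFilter;          // contrib/miscellaneous; location is an assumption
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.QueryWrapperFilter;
import org.apache.lucene.search.TermQuery;

public class ChainedFilterSketch {
    // One cached filter per field; reuse these objects across searches so the
    // underlying bit sets are computed only once per reader.
    static final Filter CATEGORY = new CachingWrapperFilter(
            new QueryWrapperFilter(new TermQuery(new Term("category", "books"))));
    static final Filter YEAR = new CachingWrapperFilter(
            new QueryWrapperFilter(new TermQuery(new Term("year", "2007"))));

    static Filter combined() {
        // AND the per-field filters together before passing the result to search().
        return new ChainedFilter(new Filter[] { CATEGORY, YEAR }, ChainedFilter.AND);
    }
}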

Re: You are right but it doesn't make it faster.

2007-08-06 Thread testn
Does that mean you already reuse the IndexReader without reopening it? If you haven't done so, please try it out. docFreq() should be really quick. Thanks Daniel, you are completely right. I changed the code - but it doesn't make it [noticeably faster] - probably behind the scenes it does run on the en

Re: speedup indexing

2007-08-06 Thread testn
1. If you only search on the docId field, a database might be a better solution in this case. 2. To improve indexing speed, you can consider using the trunk code, which includes LUCENE-843; indexing will be faster by almost an order of magnitude. SK R wrote: > > Hi, > I have indexed

Re: strange MultiFieldQueryParser error: java.lang.Integer

2007-08-03 Thread testn
The boosts must be a Map with Float values, not Integer. Luca123 wrote: > > Hi all, > I've always used the MultiFieldQueryParser class without problems but > now i'm experiencing a strange problem. > This is my code: > > Map boost = new HashMap(); > boost.put("field1",5); > boost.put("field2",1); > > Analyzer analyzer = new Standa
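
A sketch of the fix, assuming the java.lang.Integer error is a ClassCastException from the autoboxed int boosts and assuming the Lucene 2.x constructor that takes a boosts Map:

import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.search.Query;

public class BoostMapSketch {
    public static void main(String[] args) throws Exception {
        Map boosts = new HashMap();
        boosts.put("field1", new Float(5));  // Float, not Integer -- the parser casts the value to Float
        boosts.put("field2", new Float(1));

        MultiFieldQueryParser parser = new MultiFieldQueryParser(
                new String[] { "field1", "field2" }, new StandardAnalyzer(), boosts);
        Query q = parser.parse("some text");
        System.out.println(q);
    }
}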

Re: Get the terms and frequency vector of an indexed but unstored field

2007-08-03 Thread testn
You can use IndexReader.getTermFreqVectors(int n) to get all terms and their frequencies. Make sure that when you create the index, you choose to store term vectors by specifying the Field.TermVector option. Check out http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf tierecke wrote: > > Hi, >
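
A compact sketch of both steps (index with term vectors, then read them back); the field name is illustrative:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;

public class TermVectorSketch {
    // Index side: indexed but unstored field, with a term vector stored.
    static void addBody(Document doc, String text) {
        doc.add(new Field("body", text, Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES));
    }

    // Search side: pull the terms and their frequencies for one document.
    static void dumpVectors(IndexReader reader, int docNum) throws Exception {
        TermFreqVector[] vectors = reader.getTermFreqVectors(docNum);
        if (vectors == null) return;            // no term vectors stored for this doc
        for (int i = 0; i < vectors.length; i++) {
            String[] terms = vectors[i].getTerms();
            int[] freqs = vectors[i].getTermFrequencies();
            for (int j = 0; j < terms.length; j++) {
                System.out.println(terms[j] + " -> " + freqs[j]);
            }
        }
    }
}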

Re: Clustered Indexing on common network filesystem

2007-08-02 Thread testn
Why don't you check out Hadoop and Nutch? They should provide what you are looking for. Zach Bailey wrote: > > Hi, > > It's been a couple of days now and I haven't heard anything on this > topic, while there has been substantial list traffic otherwise. > > Am I asking in the wrong place? Was I

Re: extracting non-english text from word, pdf, etc....??

2007-08-02 Thread testn
This is what I use now to extract English. > > Thanks, > Michael > > testn wrote: >> If you can extract token streams from those files already, you can simply >> use >> different analyzers to analyze those token streams appropriately. Check >> out >> Luce

Re: LUCENE-843 Release

2007-08-02 Thread testn
; changes, just bug fixes, and so I don't think we should violate that > accepted practice. > > I would rather see us finish up 2.3 and release it, and going forwards > do more frequent releases, instead of porting big changes back onto > point releases. > > Mike > &

Re: Solr's NumberUtils doesnt work

2007-08-02 Thread testn
How did you encode your integer into a String? Did you use int2sortableStr? is_maximum wrote: > > Hi > I am using NumberUtils to encode and decode numbers while indexing and > searching. When I am going to decode a number retrieved from the index it > throws an exception for some fields > the exce

Re: LUCENE-843 Release

2007-08-02 Thread testn
Mike, as a committer, what do you think? Thanks! Peter Keegan wrote: > > I've built a production index with this patch and done some query stress > testing with no problems. > I'd give it a thumbs up. > > Peter > > On 7/30/07, testn <[EMAIL PROTECTED]&g

Re: extracting non-english text from word, pdf, etc....??

2007-08-02 Thread testn
If you can already extract token streams from those files, you can simply use different analyzers to analyze those token streams appropriately. Check out the Lucene contrib analyzers at http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/analyzers/src/java/org/apache/lucene/analysis/ heybluez wr
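
A hedged sketch of one way to wire that up with the contrib analyzers (the language choices and field names are just examples, and the contrib jar must be on the classpath):

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.de.GermanAnalyzer;     // contrib analyzers
import org.apache.lucene.analysis.fr.FrenchAnalyzer;     // contrib analyzers
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class MultiLanguageAnalyzerSketch {
    static Analyzer build() {
        // Default analyzer for English text, with per-field overrides for
        // fields that hold extracted German or French content.
        PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(new StandardAnalyzer());
        wrapper.addAnalyzer("body_de", new GermanAnalyzer());
        wrapper.addAnalyzer("body_fr", new FrenchAnalyzer());
        return wrapper;
    }
}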

RE: High CPU usage during index and search

2007-08-02 Thread testn
20,000 queries continuously? That sounds a bit much. Can you elaborate more on what you need to do? You probably won't need that many queries. Chew Yee Chuang wrote: > > Hi, > > Thanks for the link provided. Actually I've gone through those articles while > I was > developing the index and search functio

Re: Getting only the Ids, not the whole documents.

2007-08-02 Thread testn
Hi, Why don't you consider using a FieldSelector? LoadFirstFieldSelector lets you load only the first field of a document without loading all the fields. After that, you can keep the whole document if you like. It should help improve performance. is_maximum wrote: >
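
A minimal sketch of the idea, assuming a Lucene version that has IndexReader.document(int, FieldSelector) and LoadFirstFieldSelector:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.LoadFirstFieldSelector;
import org.apache.lucene.index.IndexReader;

public class FieldSelectorSketch {
    static Document loadFirstFieldOnly(IndexReader reader, int docNum) throws Exception {
        // Loads only the first stored field of the document and stops there,
        // instead of materializing every stored field.
        return reader.document(docNum, new LoadFirstFieldSelector());
    }
}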

Re: Do AND + OR Search in Lucene

2007-08-02 Thread testn
You can create two queries from two query parsers, one with AND and the other with OR as the default operator. After you create both of them, call setBoost() to give them different boost levels and then join them together in a BooleanQuery with BooleanClause.Occur.SHOULD. That should do the trick. askarzaidi w
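
A sketch of that trick; the boost values and field name are arbitrary:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;

public class AndOrSketch {
    static Query build(String userInput) throws Exception {
        QueryParser andParser = new QueryParser("contents", new StandardAnalyzer());
        andParser.setDefaultOperator(QueryParser.AND_OPERATOR);
        Query andQuery = andParser.parse(userInput);
        andQuery.setBoost(2.0f);                    // documents matching all terms rank higher

        QueryParser orParser = new QueryParser("contents", new StandardAnalyzer());
        orParser.setDefaultOperator(QueryParser.OR_OPERATOR);
        Query orQuery = orParser.parse(userInput);
        orQuery.setBoost(1.0f);

        BooleanQuery combined = new BooleanQuery();
        combined.add(andQuery, BooleanClause.Occur.SHOULD);
        combined.add(orQuery, BooleanClause.Occur.SHOULD);
        return combined;
    }
}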

LUCENE-843 Release

2007-07-30 Thread testn
Hi guys, Do you think LUCENE-843 is stable enough? If so, do you think it's worth releasing it in, say, Lucene 2.2.1? It would be nice so that people could take advantage of it right away without risking other breaking changes on the HEAD branch or waiting until the 2.3 release. Thanks, --

Re: NPE in MultiReader

2007-07-27 Thread testn
- like > configuration, parameters..can you describe more? > thanks, > dt, > www.ejinz.com > Search Engine News > > - Original Message - > From: "testn" <[EMAIL PROTECTED]> > To: > Sent: Friday, July 27, 2007 7:50 PM > Subject: NPE in MultiReade

NPE in MultiReader

2007-07-27 Thread testn
Every once in a while I get the following exception with Lucene 2.2. Do you have any idea? Thanks, java.lang.NullPointerException at org.apache.lucene.index.MultiReader.getFieldNames(MultiReader.java:264) at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:180

Re: Search for null

2007-07-24 Thread testn
ng field of docs and >> index the field without tokenizing it. Then you may search for that >> special value to find the docs. >> >> Jay >> >> Les Fletcher wrote: >> > Does this particular range query have any significant performance >> issues?
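
A sketch of the sentinel-value approach described above; the marker token is arbitrary:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class NullFieldSketch {
    static final String NULL_MARKER = "_null_";   // arbitrary sentinel token

    // Index side: when the real value is missing, index the sentinel instead.
    static void addOptional(Document doc, String field, String value) {
        String v = (value == null) ? NULL_MARKER : value;
        doc.add(new Field(field, v, Field.Store.NO, Field.Index.UN_TOKENIZED));
    }

    // Search side: find the documents whose field was "null".
    static Query missing(String field) {
        return new TermQuery(new Term(field, NULL_MARKER));
    }
}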

Search for null

2007-07-23 Thread testn
Is it possible to search for documents where a specified field doesn't exist or the field's value is null? -- View this message in context: http://www.nabble.com/Search-for-null-tf4130600.html#a11746864 Sent from the Lucene - Java Users mailing list archive at Nabble.com. --