Re: outof memory error

2008-02-05 Thread SK R
problem you're trying to solve by indexing this doc. > Is it a log file? I can't imagine a text document that big. That's like a > 100 volume encyclopedia, and I can't help but wonder whether your users > would be better served by indexing it in pieces. > > Best >

outof memory error

2008-02-04 Thread SK R
Hi, I got an out-of-memory exception while indexing huge documents (~1GB) in one thread and optimizing 2 to 3 other indexes in different threads. The max JVM heap size is 512MB. I'm using Lucene 2.3.0. Please suggest a way to avoid this exception. Regards RSK

Re: speedup indexing

2007-08-07 Thread SK R
Hi, Thanks for this valuable information. I'm using Lucene 2.1 now. Do I need to apply the patch "LUCENE-843" to my existing version, or do I have to move to the latest? Do I need to use flushByRam instead of flushbydoc to work with this patch? Regards RSK On 8/7/07, Michael McCandless <[EMAIL PROT

speedup indexing

2007-08-06 Thread SK R
Hi, I have indexed 5 fields and stored 2 of them (field length is around 1). My index keeps growing and is already gigabytes in size. I need to get search results based on docID only. Scoring, additional sorting, delete, and update are never used; nothing complicated is required. In my testing

How to get FastAnalyzer?

2007-07-30 Thread SK R
Hi, During my search for an alternative to StandardAnalyzer, I found some useful information about the JFlex-based FastAnalyzer in this user group. I tried to get the corresponding files from https://issues.apache.org/jira/browse/LUCENE-966 . But they are in txt format, so how can I get and test that impr

Re: zero termfreq for some search strings with special characters

2007-06-20 Thread SK R
e right! "emp-id" will be separated into two terms, CONTENT:"emp" and CONTENT:"id", by the standard tokenizer for indexing and searching. But a term written directly (CONTENT:"emp-id") will not. Andy -Original Message- From: SK R [mailto:[EMAIL PROTECT

zero termfreq for some search strings with special characters

2007-06-20 Thread SK R
Hi, I'm using the standard tokenizer for both indexing and searching. My indexed value is like "emp-id Aq234 kaith creating document for search". I can get search results for the query CONTENT:"emp-id" by using hits = indexSearcher.search(*query*). But if I try to get the term frequency of t

Re: Does Lucene search over memory too?

2007-05-29 Thread SK R
Hi Michael McCandless, Thanks a lot for this clarification. Calling writer.flush() before every search is the solution for my case. But could this cause performance issues, i.e. more time or a higher memory requirement? Any idea how long writer.flush() takes? Thanks & Regards RSK On

Does Lucene search over memory too?

2007-05-28 Thread SK R
Hi, Does Lucene search the FSDirectory as well as the buffered in-memory docs when we call searcher.search(query)? The reason I ask is that I've indexed my docs with mergeFactor & Max.Buff.Docs = 50, and I optimize and close the index only at midnight. Before optimization, my search gives partial

How to get term frequency of multi terms and TimeRange?

2007-04-24 Thread SK R
Hi, How can I get the term frequency of multiple terms in a particular document? Is there any API method other than TermVector that may help? Also, how can I calculate the term frequency over a time range? I.e., if my index has a field "TIME" with values in millis (like 1176281188000), and I want to calculate term freq. of

Re: How to get termfreq. of each doc for wildcard terms?

2007-04-24 Thread SK R
Hi, Does anybody have an idea about my previous post? Regards RSK On 4/23/07, SK R <[EMAIL PROTECTED]> wrote: Hi, In my application, I sometimes need to find doc IDs with the term frequency of my terms in my index of multi lines, tokenized & indexed with Standard Analyzer. For this, now

How to get termfreq. of each doc for wildcard terms?

2007-04-23 Thread SK R
Hi, In my application, I sometimes need to find doc IDs with the term frequency of my terms in my index of multi lines, tokenized & indexed with Standard Analyzer. For this, now I'm using * TermDocs termDocs = reader.termDocs(new Term("FIELD","book1")); while(termDocs.next()) { matches +=
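
The subject of this thread asks for per-document term frequencies of wildcard terms. One way to picture it, as a rough sketch only: enumerate every indexed term that matches the wildcard and add up that term's frequency for each document. The class below is a toy in-memory index in plain Java and does not use Lucene's API; the class name, method names, and the restriction to prefix wildcards (such as book*) are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy index (hypothetical, not Lucene's API): term -> (docId -> term frequency).
class WildcardTermFreqSketch {
    // A sorted term dictionary lets us jump straight to the first term with the prefix.
    private final TreeMap<String, Map<Integer, Integer>> postings = new TreeMap<>();

    // Tokenize on whitespace and count how often each term occurs in the document.
    void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("\\s+")) {
            postings.computeIfAbsent(token, t -> new TreeMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    // For a prefix wildcard such as "book*", sum the frequencies of all matching
    // terms per document. Returns docId -> total frequency.
    Map<Integer, Integer> prefixTermFreqs(String prefix) {
        Map<Integer, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, Integer>> e
                : postings.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // terms are sorted, so no later term can match the prefix
            }
            e.getValue().forEach((doc, freq) -> totals.merge(doc, freq, Integer::sum));
        }
        return totals;
    }
}
```

With a real index the same idea would presumably apply: walk the matching terms, then walk each term's postings, accumulating a per-document total.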

Re: what's the use of proximity data?

2007-03-26 Thread SK R
7, karl wettin <[EMAIL PROTECTED]> wrote: 27 mar 2007 kl. 08.49 skrev SK R: > Hi, >Please clarify my doubts. >What's the use of storing proximity data internally while > indexing? Is > it only for score calculation or any other additional purpose? >How lucene

what's the use of proximity data?

2007-03-26 Thread SK R
Hi, Please clarify my doubts. What is the use of storing proximity data internally while indexing? Is it only for score calculation, or does it serve some additional purpose? How does Lucene handle a phrase query? Does it depend on the proximity data of the phrase terms, or on something else? Thanks & Regards RSK

Re: MergeFactor and MaxBufferedDocs value should ...?

2007-03-23 Thread SK R
<[EMAIL PROTECTED]> wrote: "SK R" <[EMAIL PROTECTED]> wrote: > If I set MergeFactor = 100 and MaxBufferedDocs=250 , then first 100 > segments will be merged in RAMDir when 100 docs arrived. At the end of > 350th > doc added to writer , RAMDir have 2 merged seg

MergeFactor and MaxBufferedDocs value should ...?

2007-03-22 Thread SK R
Hi, I've looked at the uses of MergeFactor and MaxBufferedDocs. If I set MergeFactor = 100 and MaxBufferedDocs = 250, then the first 100 segments will be merged in RAMDir when 100 docs have arrived. After the 350th doc is added to the writer, RAMDir has 2 merged segment files + 50 separate segment files

Re: how ungrouped query handled?

2007-03-22 Thread SK R
there's a way I can see to fix PrecedenceQueryParser. : : Best : Erick : : On 3/22/07, SK R <[EMAIL PROTECTED]> wrote: : > : > Hi, : > Can anyone explain how lucene handles the belowed query? : > My query is *field1:source AND (field2:name OR field3:dest)* . I'v

how ungrouped query handled?

2007-03-22 Thread SK R
Hi, Can anyone explain how Lucene handles the query below? My query is *field1:source AND (field2:name OR field3:dest)* . I've given this string to the QueryParser and then searched using the searcher. It returns correct results. Its query.toString() output is :: +field1:source +(field2:name f

Re: can't get docFreq of phrase

2007-03-20 Thread SK R
Thanks a lot. On 3/20/07, karl wettin <[EMAIL PROTECTED]> wrote: 20 mar 2007 kl. 12.14 skrev SK R: > Hi Mark, > Thanks for your reply. > Could i get this match length (docFreq) without using > searcher.search(..) ? > > One more doubt is

Re: can't get docFreq of phrase

2007-03-20 Thread SK R
h(pq).length(); Cheers Mark - Original Message From: SK R <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, 20 March, 2007 10:32:32 AM Subject: can't get docFreq of phrase Hi, I can get docFreq. of single term like (f1:test) by using indexReader.docFreq(new

can't get docFreq of phrase

2007-03-20 Thread SK R
Hi, I can get the docFreq. of a single term like (f1:test) by using indexReader.docFreq(new Term("f1","test")). But I can't get the docFreq. of a phrase like (f2:"test under") by the same method. Is anything wrong with this code? Please help me resolve this problem. Thanks & Regards RSK
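
A possible reason the two cases differ, sketched under assumptions: a document frequency is recorded per indexed term, whereas a phrase such as "test under" is not a single term once the text is tokenized, so counting its documents means checking that the terms occur at consecutive positions. The toy positional index below is plain Java and does not use Lucene's API; the class and method names are hypothetical and chosen only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy positional index (hypothetical, not Lucene's API): term -> docId -> positions.
class PhraseDocFreqSketch {
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Tokenize on whitespace and record the position of every token.
    void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Single-term docFreq: the number of documents containing the term.
    int docFreq(String term) {
        return postings.getOrDefault(term, Collections.emptyMap()).size();
    }

    // Phrase docFreq: the number of documents in which the terms occur
    // at consecutive positions, in the given order.
    int phraseDocFreq(String... terms) {
        if (terms.length == 0) {
            return 0;
        }
        Map<Integer, List<Integer>> first =
                postings.getOrDefault(terms[0], Collections.emptyMap());
        int count = 0;
        for (Map.Entry<Integer, List<Integer>> e : first.entrySet()) {
            if (containsPhrase(e.getKey(), e.getValue(), terms)) {
                count++;
            }
        }
        return count;
    }

    // True if, for some start position of the first term, term i sits at start + i.
    private boolean containsPhrase(int docId, List<Integer> starts, String[] terms) {
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < terms.length && ok; i++) {
                List<Integer> positions = postings
                        .getOrDefault(terms[i], Collections.emptyMap())
                        .get(docId);
                ok = positions != null && positions.contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }
}
```

In this sketch the single-term count is a dictionary lookup, while the phrase count has to visit the postings of each candidate document, which is roughly the work a search for the phrase would do anyway.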