problem you're trying to solve by indexing this doc.
> Is it a log file? I can't imagine a text document that big. That's like a
> 100 volume encyclopedia, and I can't help but wonder whether your users
> would be better served by indexing it in pieces.
>
> Best
>
Hi,
I got an out-of-memory exception while indexing huge documents (~1GB) in
one thread while optimizing a few other (2 to 3) indexes in different threads.
The max JVM heap size is 512MB. I'm using Lucene 2.3.0.
Please suggest a way to avoid this exception.
Regards
RSK
Hi,
Thanks for this valuable information.
I'm using Lucene 2.1 now. Do I need to apply the patch "LUCENE-843" on top of
my existing version, or do I have to move to the latest? Do I need to use flushByRam
instead of flushByDoc to work with this patch?
Regards
RSK
On 8/7/07, Michael McCandless <[EMAIL PROT
Hi,
I have indexed 5 fields and stored 2 of them (field length is around
1). My index is growing in nature and is in the GB range. I need to get search
results based on docID only. Scoring, additional sorting, delete and update
are never used. None of the complicated things are required.
In my testing
Hi,
During my search for an alternative to StandardAnalyzer, I got some useful
information about the JFlex-based FastAnalyzer in this user group. I tried to
get the corresponding files from
https://issues.apache.org/jira/browse/LUCENE-966 . But they are in txt
format, so how can I get and test that impr
e right!
"emp-id" will be separated into two terms CONTENT:"emp" CONTENT:"id" by the
standard tokenizer for indexing and searching. But a directly written term
(CONTENT:"emp-id") will not.
Andy
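The split Andy describes can be illustrated with a small, self-contained sketch. This is only a toy approximation of StandardTokenizer's behavior on hyphenated words (the real JFlex grammar is far more subtle, e.g. around numbers and acronyms), and it does not use the Lucene API at all:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TokenizeSketch {
    // Toy approximation: break on any character that is not a letter
    // or digit, lowercase the rest. (Not the real StandardTokenizer.)
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase().split("[^\\p{L}\\p{N}]+"))
                     .filter(t -> !t.isEmpty())
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "emp-id" reaches the index as the two terms "emp" and "id",
        // which is why the single term CONTENT:"emp-id" matches nothing.
        System.out.println(tokenize("emp-id Aq234 kaith"));
    }
}
```

So a TermQuery for the literal text "emp-id" cannot match; the query parser succeeds only because it runs the same analyzer and produces a phrase of the two sub-terms.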
-Original Message-
From: SK R [mailto:[EMAIL PROTECT
Hi,
I'm using the standard tokenizer for both the indexing and searching
process. My indexed value is like "emp-id Aq234 kaith creating document
for search".
I can get search results for the query CONTENT:"emp-id" by using hits =
indexSearcher.search(query).
But if I try to get the term frequency of t
Hi Michael McCandless,
Thanks a lot for this clarification.
Calling writer.flush() before every search is the solution for my case.
But might this cause any performance issues, i.e. more time or more memory
required?
Any idea about the time taken by writer.flush()?
Thanks & Regards
RSK
On
Hi,
Does Lucene search the FSDirectory as well as buffered in-memory docs while
we are calling searcher.search(query)?
Why I'm asking this is, I've indexed my doc with mergeFactor &
maxBufferedDocs = 50, and I've optimized and closed it at midnight
only. Before optimization, my search gives partial
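The visibility problem hinted at above can be illustrated with a plain java.io analogy rather than Lucene itself: data sitting in a writer's buffer is invisible to an independent reader until flush() pushes it out, just as buffered documents are invisible to a searcher opened against the directory:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushAnalogy {
    // Returns the file size an outside reader would see before and
    // after flushing a buffered writer.
    static long[] sizesAroundFlush() throws IOException {
        Path p = Files.createTempFile("flush-demo", ".txt");
        try (BufferedWriter w = Files.newBufferedWriter(p)) {
            w.write("buffered doc");
            long before = Files.size(p); // still 0: data sits in the buffer
            w.flush();
            long after = Files.size(p);  // now visible to any reader
            return new long[] { before, after };
        } finally {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] s = sizesAroundFlush();
        System.out.println("before flush: " + s[0] + " bytes, after: " + s[1]);
    }
}
```

The analogous Lucene fix is flushing (or closing) the IndexWriter and opening a fresh IndexSearcher, since a searcher only sees segments that existed when it was opened.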
Hi,
How do I get the term frequency of multiple terms in a particular document? Is
there any API method other than TermVector that may help?
Also, how do I calculate the term frequency of a time range? I.e., if my index has a
field "TIME" with values in millis (like 1176281188000), and I want to
calculate the term freq. of
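One way to approach the time-range part of the question: if the TIME terms and their frequencies can be enumerated (in Lucene 2.x one could walk them with a TermEnum/TermDocs pair; that part is not shown here), a sorted map makes the range sum trivial. The map contents below are purely hypothetical sample data:

```java
import java.util.TreeMap;

public class RangeFreqSketch {
    // termFreqs maps each indexed TIME value (millis) to its total
    // term frequency; rangeFreq sums frequencies over [from, to].
    static long rangeFreq(TreeMap<Long, Integer> termFreqs, long from, long to) {
        return termFreqs.subMap(from, true, to, true) // inclusive bounds
                        .values().stream()
                        .mapToLong(Integer::longValue)
                        .sum();
    }

    public static void main(String[] args) {
        TreeMap<Long, Integer> freqs = new TreeMap<>();
        freqs.put(1176281188000L, 3); // hypothetical frequencies
        freqs.put(1176281190000L, 2);
        freqs.put(1176281200000L, 5);
        System.out.println(rangeFreq(freqs, 1176281188000L, 1176281195000L)); // 3 + 2 = 5
    }
}
```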
Hi,
Does anybody have an idea about my previous post?
Regards
RSK
On 4/23/07, SK R <[EMAIL PROTECTED]> wrote:
Hi,
In my application, sometimes I need to find the doc ID together with the term
frequency of my terms; my index has multi-line fields, tokenized & indexed with
StandardAnalyzer. For this, now
Hi,
In my application, sometimes I need to find the doc ID together with the term
frequency of my terms; my index has multi-line fields, tokenized & indexed with
StandardAnalyzer. For this, now I'm using:
TermDocs termDocs = reader.termDocs(new Term("FIELD", "book1"));
while (termDocs.next())
{
matches +=
7, karl wettin <[EMAIL PROTECTED]> wrote:
On 27 Mar 2007, at 08:49, SK R wrote:
> Hi,
> Please clarify my doubts.
> What's the use of storing proximity data internally while
> indexing? Is
> it only for score calculation or any other additional purpose?
> How Lucene
Hi,
Please clarify my doubts.
What's the use of storing proximity data internally while indexing? Is
it only for score calculation, or is there some additional purpose?
How does Lucene handle a phrase query? Does it depend on the proximity data
of the phrase terms or on something else?
Thanks & Regards
RSK
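The role of proximity data in phrase queries can be sketched without Lucene at all. For a two-word phrase, the index's position lists are consulted to check adjacency: the phrase matches a document when some position p of the first word has p + 1 among the positions of the second word. A minimal illustration with hypothetical position lists:

```java
import java.util.Arrays;
import java.util.List;

public class PhraseSketch {
    // A two-word phrase matches when some position p of the first word
    // has p + 1 among the positions of the second word. This adjacency
    // check is what stored proximity data makes possible.
    static boolean phraseMatches(List<Integer> firstPositions, List<Integer> secondPositions) {
        for (int p : firstPositions) {
            if (secondPositions.contains(p + 1)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // hypothetical positions of "test" and "under" in one document
        List<Integer> test = Arrays.asList(0, 7);
        List<Integer> under = Arrays.asList(3, 8);
        System.out.println(phraseMatches(test, under)); // true: positions 7 and 8 are adjacent
    }
}
```

So proximity data is not only for scoring: without positions, a phrase query could not distinguish "test under" from the two words appearing far apart in the same document.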
<[EMAIL PROTECTED]> wrote:
"SK R" <[EMAIL PROTECTED]> wrote:
> If I set MergeFactor = 100 and MaxBufferedDocs=250 , then first 100
> segments will be merged in RAMDir when 100 docs arrived. At the end of
> 350th
> doc added to writer , RAMDir have 2 merged seg
Hi,
I've looked at the uses of mergeFactor and maxBufferedDocs.
If I set mergeFactor = 100 and maxBufferedDocs = 250, then the first 100
segments will be merged in the RAMDir when 100 docs have arrived. At the end of the
350th doc added to the writer, the RAMDir has 2 merged segment files + 50 separate
segment files
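The arithmetic above can be checked with a deliberately simplified model: every added document starts as a one-doc segment, and whenever mergeFactor one-doc segments accumulate they merge into one larger segment. This ignores cascading merges of the larger segments and any maxBufferedDocs flushing, so it is only a back-of-the-envelope sketch; note that under this model 350 docs at mergeFactor = 100 yield 3 merged segments plus 50 buffered ones:

```java
public class MergeSketch {
    // Simplified model: each doc is a one-doc segment; every mergeFactor
    // one-doc segments are merged into a single larger segment.
    // Cascading merges of the larger segments are ignored.
    static int[] segmentsAfter(int docs, int mergeFactor) {
        int merged = docs / mergeFactor;  // completed merges so far
        int singles = docs % mergeFactor; // still-unmerged one-doc segments
        return new int[] { merged, singles };
    }

    public static void main(String[] args) {
        int[] s = segmentsAfter(350, 100);
        System.out.println(s[0] + " merged + " + s[1] + " single-doc segments");
    }
}
```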
there's a way I can see to fix PrecedenceQueryParser.
:
: Best
: Erick
:
: On 3/22/07, SK R <[EMAIL PROTECTED]> wrote:
: >
: > Hi,
: > Can anyone explain how Lucene handles the query below?
: > My query is *field1:source AND (field2:name OR field3:dest)* .
I'v
Hi,
Can anyone explain how Lucene handles the query below?
My query is *field1:source AND (field2:name OR field3:dest)*. I've
given this string to the query parser and then searched using the searcher. It
returns correct results. Its query.toString() print is :: +field1:source
+(field2:name f
Thanks a lot.
On 3/20/07, karl wettin <[EMAIL PROTECTED]> wrote:
On 20 Mar 2007, at 12:14, SK R wrote:
> Hi Mark,
> Thanks for your reply.
> Could I get this match length (docFreq) without using
> searcher.search(..) ?
>
> One more doubt is
h(pq).length();
Cheers
Mark
- Original Message
From: SK R <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, 20 March, 2007 10:32:32 AM
Subject: can't get docFreq of phrase
Hi,
I can get docFreq. of single term like (f1:test) by using
indexReader.docFreq(new
Hi,
I can get the docFreq of a single term like (f1:test) by using
indexReader.docFreq(new Term("f1","test")). But I can't get the docFreq of a
phrase term like (f2:"test under") by the same method.
Is anything wrong in this code?
Please help me resolve this problem.
Thanks & Regards
RSK
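The reason the call above fails is that docFreq accepts a single Term; a phrase's document frequency has to come from checking term adjacency per document, which is what a PhraseQuery does with position data. A toy stand-in over a hypothetical pre-tokenized corpus shows the counting logic:

```java
public class PhraseDocFreq {
    // Counts the documents whose token streams contain "first second"
    // as adjacent tokens -- a toy stand-in for what a PhraseQuery does
    // with position data; indexReader.docFreq(new Term(...)) can only
    // answer this question for single terms.
    static int phraseDocFreq(String[][] docs, String first, String second) {
        int count = 0;
        for (String[] tokens : docs) {
            for (int i = 0; i + 1 < tokens.length; i++) {
                if (tokens[i].equals(first) && tokens[i + 1].equals(second)) {
                    count++;
                    break; // count each document at most once
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[][] docs = {
            {"test", "under", "fire"},   // matches
            {"under", "test"},           // reversed order: no match
            {"a", "test", "under", "way"} // matches
        };
        System.out.println(phraseDocFreq(docs, "test", "under")); // 2
    }
}
```

In Lucene itself the practical equivalent is running a PhraseQuery and taking the hit count, since docFreq has no phrase-aware overload.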