I _think_ Lucene 2.1 (or is it trunk?, I lose track) has the ability to
delete all documents containing a term.
Actually it's been in IndexReader for longer than I can remember. We're
still on 1.4.3 and it's in there.
(Only difference in 2.1 is that it's now on IndexWriter as well.)
Dan
There are (at least) two ways to generate a BitSet which can be used for
filtering.
Filter.bits()
BitSet bits = new BitSet(reader.maxDoc());
TermDocs td = reader.termDocs(new Term("field", "text"));
while (td.next())
{
    bits.set(td.doc());
}
return bits;
and HitCollector.collec
:
: I have a question regarding close() in the CSIndexInput class: why is there
: no close operation defined...
: Like ...
: base.close()
I'm no expert on this kind of thing, but since CSIndexInput only reads
from a portion of another IndexInput, closing base seems like a bad idea
... super.close()
thanks
regards
ashwin
On 3/13/07, karl wettin <[EMAIL PROTECTED]> wrote:
13 Mar 2007 at 04:01, ashwin kumar wrote:
> Can you please give any links to POI forums?
Sure,
http://www.google.com/search?q=poi+forum
--
karl
>
> On 3/13/07, karl wettin <[EMAIL PROTECTED]> wrote:
>>
>>
>> 13 mar 200
13 Mar 2007 at 04:01, ashwin kumar wrote:
Can you please give any links to POI forums?
Sure,
http://www.google.com/search?q=poi+forum
--
karl
On 3/13/07, karl wettin <[EMAIL PROTECTED]> wrote:
13 Mar 2007 at 03:51, ashwin kumar wrote:
> Hi all, I have successfully indexed .txt and .pdf
Can you please give any links to POI forums?
On 3/13/07, karl wettin <[EMAIL PROTECTED]> wrote:
13 Mar 2007 at 03:51, ashwin kumar wrote:
> Hi all, I have successfully indexed .txt and .pdf files using
> Lucene. Now I
> want to index Word documents, Excel sheets, and PowerPoint
> slides, and f
13 Mar 2007 at 03:51, ashwin kumar wrote:
Hi all, I have successfully indexed .txt and .pdf files using
Lucene. Now I
want to index Word documents, Excel sheets, and PowerPoint
slides, and for
this I have downloaded the POI API
from the following link:
http://jakarta.apache.org/poi/
can some
Hi all, I have successfully indexed .txt and .pdf files using Lucene. Now I
want to index Word documents, Excel sheets, and PowerPoint slides, and for
this I have downloaded the POI API
from the following link:
http://jakarta.apache.org/poi/
Can someone help me with sample code for indexing the a
That worked for me too. Thanks!
-Original Message-
From: Steffen Heinrich [mailto:[EMAIL PROTECTED]
Sent: Friday, March 09, 2007 1:39 PM
To: java-user@lucene.apache.org
Subject: Re: Words not found, large file indexing
Hello Chris,
this is incredible!
I'm new to Lucene and did just su
: You're entirely correct about the analyzer (I'm using one that breaks on
: non-alphanumeric characters, so all punctuation is ignored). To be
: honest, I hadn't thought about altering this, but I guess I could; just
: reticent that there might be unforeseen consequences.
this is where the PerF
: I suspect that if you stored your dates (use DateTools) as strings
: with a resolution of a day you'd get much faster queries, assuming
: that this is fine enough for your app.
...especially if you use ConstantScoreRangeQuery ... if you find that isn't
fast enough, having the various granularitie
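
To make the day-resolution suggestion concrete, here is a minimal sketch using only the JDK. It turns a date into a yyyyMMdd string in GMT, so that string order matches chronological order and every timestamp within one day collapses to the same term. As far as I know this is the same shape DateTools produces at day resolution, but that is an assumption; the names DayResolution and toDayString are made up for this example and are not Lucene API.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not part of Lucene: formats dates at day resolution.
class DayResolution {

    // Returns the date as yyyyMMdd in GMT, e.g. "20070313".
    // Fixed-width, most-significant-first digits mean that comparing the
    // strings lexicographically gives the same order as comparing the dates.
    static String toDayString(Date date) {
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(date);
    }
}
```

Indexing the field with strings like these means a range query only has to walk one term per day instead of one term per distinct timestamp.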
: ok, so does that mean i can use both q1 and q2 for phrase query ie; for
: searching words adjacent to each other. Actually that was my only concern,
: as i wanted to use q1 for phrase query, rather than q2.
: Regards,
Your example "q1" is not the correct syntax for a phrase query .. the
correct
Hi All,
I have a question regarding close() in the CSIndexInput class: why is there
no close operation defined...
Like ...
base.close()
If I analyze the memory dump of our server I can see a lot of objects of
type CSIndexInput. So I am not sure whether calling close() on IndexSearcher
closes all in
Hi Ruchika,
Are there any quote characters in your index (may the Luke be with
you[1])? If not, you could just remove all quotes from your query
(except the surrounding ones indicating phrase matching, of course), and
things will work, as you have indicated.
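
A minimal sketch of that quote-stripping idea, using only the JDK. It keeps the surrounding quotes that mark the phrase and drops every double quote inside, whether bare or backslash-escaped. The class and method names (QuoteStripper, stripInnerQuotes) are made up for this example, and it assumes only ASCII double quotes need removing; a character like „ is left untouched.

```java
// Hypothetical helper, not part of Lucene: cleans a query string before parsing.
class QuoteStripper {

    // Removes double quotes inside the query while keeping one surrounding
    // pair, if present, so the query is still treated as a phrase.
    static String stripInnerQuotes(String query) {
        boolean phrase = query.length() >= 2
                && query.startsWith("\"")
                && query.endsWith("\"");
        String inner = phrase ? query.substring(1, query.length() - 1) : query;

        // Drop escaped quotes (\") first so no stray backslash is left behind,
        // then drop any remaining bare quotes.
        String cleaned = inner.replace("\\\"", "").replace("\"", "");

        return phrase ? "\"" + cleaned + "\"" : cleaned;
    }
}
```

The cleaned string can then be passed to the query parser as before.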
Which version of Lucene are you u
This may help:
http://www.pdfbox.org/userguide/text_extraction.html#Lucene+Integration
ashwin kumar wrote:
> Hi all, I am able to convert a PDF into a text file using PDFBox, and this
> is the code that I used
>
> import org.pdfbox.pdfparser.PDFParser;
> import org.pdfbox.pdmodel.PDDocument;
> i
As of Lucene 2.1 you can make optimal use of RAM by monitoring
IndexWriter.ramSizeInBytes() and calling IndexWriter.flush() when memory is
tight.
This avoids the issue of trying to estimate a value for maxBufferedDocs which
you think can fit into RAM.
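
A rough sketch of that loop, kept free of Lucene so it runs on its own. The RamTrackedWriter interface is a hypothetical stand-in for the two IndexWriter calls mentioned above (ramSizeInBytes() and flush()); its addDocument(String) signature, the class name, and the maxRamBytes threshold are assumptions made for the example.

```java
// Hypothetical sketch: flush whenever buffered RAM reaches a budget,
// instead of guessing a document count up front.
class RamBoundedIndexing {

    // Stand-in for the writer; not Lucene's actual IndexWriter class.
    interface RamTrackedWriter {
        void addDocument(String doc);   // buffer one document
        long ramSizeInBytes();          // bytes currently buffered in RAM
        void flush();                   // write buffered documents out
    }

    // Adds every document and flushes each time the buffer reaches
    // maxRamBytes. Returns how many flushes were triggered.
    static int indexAll(RamTrackedWriter writer, Iterable<String> docs, long maxRamBytes) {
        int flushes = 0;
        for (String doc : docs) {
            writer.addDocument(doc);
            if (writer.ramSizeInBytes() >= maxRamBytes) {
                writer.flush();
                flushes++;
            }
        }
        return flushes;
    }
}
```

With a real writer, a thin adapter implementing this interface would delegate to the corresponding writer methods; documents still buffered after the loop are written when the writer is closed or flushed one last time.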
Cheers
Mark
- Original Message
F
I think you can try
MergeFactor=1000
MaxMergeDocs=2147483647
MaxBufferedDocs=1000
--LM
On 3/12/07, Harini Raghavan <[EMAIL PROTECTED]> wrote:
Hi Everyone,
We have been using Lucene integrated with our application for over a year
now. Indexing and searching have been pretty fast until re
Hi Everyone,
We have been using Lucene integrated with our application for over a year
now. Indexing and searching have been pretty fast until recently. But now
we are having some scalability issues. We have a job that indexes around
2 documents into the index every day. There are 2 processes
Hi,
I am not sure if I can help you a lot but you can check how Nutch does this
(although it does not do exactly what you want). See
*org.apache.nutch.summary.basic.BasicSummarizer*
or
*org.apache.nutch.summary.lucene.LuceneSummarizer*
You should also check the Highlighter API (
http://lucene.apac
>> where can i find such examples
In the source distribution here:
http://apache.rmplc.co.uk/lucene/java/lucene-2.1.0-src.zip
See the "HighlighterTest.java" file
Alternatively, read the Javadocs example for the Highlighter package here
http://lucene.apache.org/java/docs/api/
- Original
Thanks Mark Harwood,
I want something like the Highlighter.
Where can I find such examples?
Regards Chaminda
mark harwood <[EMAIL PROTECTED]> wrote:
>> Why nobody is answering me?
Apologies for your 2 hour delay earlier this morning. The Lucene 24 hour
helpdesk was tempora
>> Why nobody is answering me?
Apologies for your 2 hour delay earlier this morning. The Lucene 24 hour
helpdesk was temporarily closed while we had a weekend/life/sleep.
If you file an official complaint you will be entitled to an immediate and full
refund of your support fee.
As for your prob
Many thanks Vipin,
I'll check
Vipin <[EMAIL PROTECTED]> wrote:
Hi chaminda,
you just go through this link
http://today.java.net/pub/a/today/2005/08/09/didyoumean.html?page=1
In the last part of this article (page 3) the author suggests a way to
handle this kind of thing (Composit
Hi chaminda,
you just go through this link
http://today.java.net/pub/a/today/2005/08/09/didyoumean.html?page=1
In the last part of this article (page 3) the author suggests a way to
handle this kind of thing (Composite didyoumean parser).
I think it will open up a way...
Regard
I am using StopAnalyzer.
Yes, something like "innere Organe bezeichnet" works exactly fine.
The „ character should not be a problem, since if I remove this character I still
get the error.
I get this error as long as I have Organe\" in my query. Guess it is the
double quote inside a phrase, which is
12 Mar 2007 at 08:24, ruchi thakur wrote:
Yes, that is exactly what I am doing.
In a Java String I have something like
String aSearchStr = "\"„innere Organe\\\" bezeichnet\"";
Query query = parser.parse(aSearchStr);
I'm not sure why you get this exception. Perhaps it has something to
do with th