Naman Gupta wrote:
Does Lucene support files in PDF and MHT formats? I wasn't able to
retrieve any results after creating an index of such files.
Well, the answer is simple: Lucene itself does not support any file
format. You need a file parser that converts your files to a plain text
Hey
Does Lucene support files in PDF and MHT formats? I wasn't able to
retrieve any results after creating an index of such files. This is the
first time I am using Lucene.
Thanks
Naman K Gupta
Did your index size increase drastically?
As a first step I would recommend optimizing your index if you haven't
already.
-M
On Feb 12, 2008 7:42 PM, Cesar Ronchese <[EMAIL PROTECTED]> wrote:
>
> I was doing normal queries happily, seeing the results statistics come in
> about 0.02 seconds.
>
>
I was doing normal queries happily, seeing the results statistics come in
about 0.02 seconds.
But then I added an extra field to search together with the normal query,
and the time rose to 0.35 seconds. That was a lot.
example:
normal query: some test (it returns quick)
extra field que
My bad. Thanks for the link!
Jay
Chris Hostetter wrote:
: Do you know why FieldNormModifier is removed from Lucene 2.3?
: thanks.
it wasn't...
http://lucene.apache.org/java/2_3_0/api/contrib-misc/org/apache/lucene/index/FieldNormModifier.html
...it's in the "miscellaneous" contrib though so
I think this is what you are asking:
http://lucene.apache.org/java/2_3_0/api/core/org/apache/lucene/index/IndexReader.html#getFieldNames(org.apache.lucene.index.IndexReader.FieldOption)
On Feb 12, 2008, at 11:13 AM, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]
> wrote:
Hi,
Does anyone have a
: Do you know why FieldNormModifier is removed from Lucene 2.3?
: thanks.
it wasn't...
http://lucene.apache.org/java/2_3_0/api/contrib-misc/org/apache/lucene/index/FieldNormModifier.html
...it's in the "miscellaneous" contrib though so you'll need to use that
jar explicitly.
-Hoss
--
OK, understood.
Maybe a little hint in the legend, like "Only for stored fields".
> -Original Message-
> From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 12 February 2008 19:13
> To: java-user@lucene.apache.org
> Subject: Re: Lukes document hitlist display
>
> [EMAIL PR
[EMAIL PROTECTED] wrote:
Hi,
using Luke 0.7.1.
The document hitlist has a column header ITSVop0LBC.
When I add a field like this:
new Field("CONTENT", contentReader, TermVector.WITH_OFFSETS)
Luke shows only "--". Why?
Shouldn't it be "IT-Vo-"?
It should, but this information i
Hi,
using Luke 0.7.1.
The document hitlist has a column header ITSVop0LBC.
When I add a field like this:
new Field("CONTENT", contentReader, TermVector.WITH_OFFSETS)
Luke shows only "--". Why?
Shouldn't it be "IT-Vo-"?
Thank you
-
Do you know why FieldNormModifier is removed from Lucene 2.3?
thanks.
Jay
Chris Hostetter wrote:
: I read the doc for the api indexreader.setNorm() after I posted the question
: earlier. To use that setNorm() to modify the field boost, it seems to me that
: one has to know how the boost is fold
It'd be helpful if there were an API for getting the norm of a given field
in a given doc.
Thanks for the pointers.
Jay
Chris Hostetter wrote:
: I read the doc for the api indexreader.setNorm() after I posted the question
: earlier. To use that setNorm() to modify the field boost, it seems to me
Did you take a look at the
org.apache.lucene.analysis.ngram.NGramTokenFilter? Or other ngram
implementation? Works great for us.
Patrick
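To make the n-gram suggestion above concrete, here is a minimal plain-Java sketch of character n-grams. It is not Lucene's NGramTokenFilter, only an illustration of the idea; the class name CharNGrams and the methods ngrams and shared are made up for this example. It shows why a misspelled query can still overlap with the indexed term: the two share most of their trigrams.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
class CharNGrams {

    // Returns all character n-grams of the term for sizes minGram..maxGram.
    static List<String> ngrams(String term, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int i = 0; i + n <= term.length(); i++) {
                out.add(term.substring(i, i + n));
            }
        }
        return out;
    }

    // Counts the distinct n-grams of size n that both terms have in common.
    static int shared(String a, String b, int n) {
        Set<String> common = new HashSet<>(ngrams(a, n, n));
        common.retainAll(new HashSet<>(ngrams(b, n, n)));
        return common.size();
    }

    public static void main(String[] args) {
        String indexed = "clamoxyle";
        System.out.println(ngrams(indexed, 3, 3));
        for (String query : new String[] {"claomxyle", "clamoxile", "camoxyle"}) {
            System.out.println(query + " shares " + shared(indexed, query, 3)
                    + " trigrams with " + indexed);
        }
    }
}
```

In an n-gram based index, both the indexed text and the query would be broken up this way, so a query with a typo can still match on the grams it shares with the indexed term.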
Ulrich Vachon wrote:
Hi all,
Is it possible to simply use Lucene (without Java preprocessing, if possible)
to find items under these constraints:
I have
Hi,
Does anyone have a code snippet which would allow me to ask my index how
many instances of a field are indexed?
Thanks,
Marc Dumontier
Manager, Software Development
Thomson Scientific (Canada)
1 Yonge Street, Suite 1801
Toronto, Ontario M5E 1W7
Direct +1 416 214 3448
Mobile +
This would be really nice!
> -Original Message-
> From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 12 February 2008 16:41
> To: java-user@lucene.apache.org
> Subject: Re: TermPositionVector
>
> [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > could somebody please explain wha
[EMAIL PROTECTED] wrote:
Hi,
could somebody please explain what the difference between positions and
offsets is?
And: is there a trick to show this information in Luke?
Not yet :) Funny thing, I've been thinking about adding this to Luke,
but ran out of time before the last release. Perhaps I'l
Erica - it has never been in the core JAR. It should be available
in lucene-regex-2.3.0.jar
Erik
On Feb 12, 2008, at 10:01 AM, Mitchell, Erica wrote:
Hi,
I've downloaded Lucene 2.3.0 and the jar lucene-core-2.3.0.jar does not
contain the SpanRegexQuery class.
Has this bee
Hi,
I've downloaded Lucene 2.3.0 and the jar lucene-core-2.3.0.jar does not
contain the SpanRegexQuery class.
Has this been deprecated?
Thanks,
Erica
IONA Technologies PLC (registered in Ireland)
Registered Number: 171387
Registered Address: The IONA Building, Shel
You should probably think about synonym analyzers, both at index
time and query time, because I think you have a problem here.
Let's say you can do what you ask, at query time transform
any of your three options into "clamoxyle". Would it really
be satisfactory to your users to then NOT get any
TermA TermB
TermA has position 0 and offset 0
TermB has position 1 and offset 6
Right?
> -Original Message-
> From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 12 February 2008 15:16
> To: java-user@lucene.apache.org
> Subject: Re: TermPositionVector
>
> Position is jus
Position is just relative to other tokens
(Token.getPositionIncrement()), offsets are character offsets
(Token.startOffset(), Token.endOffset())
-Grant
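The distinction can be sketched in plain Java without Lucene. This is only an illustration under simple assumptions: tokens are split on whitespace, every token advances the position by one (a position increment of 1), and the names PositionsAndOffsets, Tok, and tokenize are made up for this example. Position counts tokens; offsets count characters in the original text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not Lucene's Token class.
class PositionsAndOffsets {

    static final class Tok {
        final String text;
        final int position; // index of the token among the tokens
        final int start;    // character offset where the token begins
        final int end;      // character offset just past the token's last character

        Tok(String text, int position, int start, int end) {
            this.text = text;
            this.position = position;
            this.start = start;
            this.end = end;
        }
    }

    // Splits on whitespace and records position and character offsets per token.
    static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        Matcher m = Pattern.compile("\\S+").matcher(input);
        int position = 0;
        while (m.find()) {
            out.add(new Tok(m.group(), position++, m.start(), m.end()));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Tok t : tokenize("TermA TermB")) {
            System.out.println(t.text + " position=" + t.position
                    + " offsets=[" + t.start + "," + t.end + ")");
        }
    }
}
```

For the input "TermA TermB" this gives TermA position 0 with start offset 0, and TermB position 1 with start offset 6, matching the example in the thread. An analyzer that drops stop words or injects synonyms would change positions without changing the character offsets.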
On Feb 12, 2008, at 8:31 AM, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wrote:
Hi,
could somebody please explain what the difference between
Hi,
could somebody please explain what the difference between positions and
offsets is?
And: is there a trick to show this information in Luke?
Thank you.
Hi all,
Is it possible to simply use Lucene (without Java preprocessing, if possible)
to find items under these constraints?
I have indexed this word: clamoxyle
I want to find it with these queries: claomxyle, clamoxile, camoxyle.
Is this possible?
Thank you,
Ulrich.
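One way to reason about the three misspellings above is edit distance. Lucene's FuzzyQuery (the `term~` query syntax) matches terms by edit-distance similarity, but the sketch below is plain Java, not Lucene code, and the class name EditDistance is made up for this example. It computes the standard Levenshtein distance, where a swap of two adjacent letters counts as two edits.

```java
// Hypothetical helper, not part of Lucene.
class EditDistance {

    // Standard Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "clamoxyle";
        for (String query : new String[] {"claomxyle", "clamoxile", "camoxyle"}) {
            System.out.println(query + " -> distance " + levenshtein(indexed, query));
        }
    }
}
```

All three queries are within two edits of the indexed word, which is the kind of closeness a fuzzy match relies on; whether a given similarity threshold accepts them depends on the query settings.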