I guess you have to provide a customized tokenizer in your analyzer.
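As a rough illustration of what a customized tokenizer changes, here is a minimal plain-Java sketch with no Lucene dependency. It mimics the per-character decision that a Lucene CharTokenizer subclass makes in isTokenChar(int). The class name KeepCharsTokenizer and the choice of kept characters are assumptions made up for this example, not something taken from the thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a Lucene class: splits text into tokens while
// keeping a caller-chosen set of extra characters inside the tokens.
public class KeepCharsTokenizer {
    private final Set<Integer> keep = new HashSet<>();

    public KeepCharsTokenizer(String extraTokenChars) {
        // Record every code point the caller wants preserved inside tokens.
        for (int i = 0; i < extraTokenChars.length(); ) {
            int c = extraTokenChars.codePointAt(i);
            keep.add(c);
            i += Character.charCount(c);
        }
    }

    // Plays the same role as isTokenChar in a CharTokenizer subclass.
    boolean isTokenChar(int c) {
        return Character.isLetterOrDigit(c) || keep.contains(c);
    }

    public List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);
            i += Character.charCount(c);
            if (isTokenChar(c)) {
                current.appendCodePoint(c);
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With "#+" as the extra characters, tokens such as C++ and C# survive intact, while other punctuation still acts as a separator. In a real analyzer the same decision would live in the tokenizer that the analyzer creates.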
-Original Message-
From: Scott Smith [mailto:ssm...@mainstreamdata.com]
Sent: Wednesday, September 18, 2013 12:26 AM
To: java-user@lucene.apache.org
Subject: Can you escape characters you don't want the analyzer to
Hi,
This means that there is either a bug in Lucene or that your index is
corrupted. Can you reproduce this failure if you reindex data? The
output of CheckIndex would be interesting as well, see
In Lucene 4.3.0 there is no IndexFileNameFilter.
And in org.apache.lucene.index.IndexFileNames I find that the index file
extensions have only 3 types:
public static final String INDEX_EXTENSIONS[] = new String[] {
COMPOUND_FILE_EXTENSION,
COMPOUND_FILE_ENTRIES_EXTENSION,
GEN_EXTENSION,
Hi,
This looks bad! Can you write a small test case that reproduces the
issue so that we can try to understand what happens here?
Thanks!
--
Adrien
Hi,
Since Lucene 4.0 which introduced codecs, it is not possible anymore
to know based on filename extensions whether files have been created
by Lucene or not: every codec is free to use any file extension.
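To make the consequence concrete, here is a minimal plain-Java sketch with no Lucene dependency. It contrasts an extension-based guess with a check against an explicit list of files the index reports as its own. The class IndexFileCheck, the file names, and the "xyz" extension are hypothetical and chosen only for illustration; the three known extensions are the string values I believe the IndexFileNames constants quoted earlier in the thread hold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
public class IndexFileCheck {
    // Assumed values of the three extension constants quoted in this thread.
    static final Set<String> KNOWN_EXTENSIONS =
            new HashSet<>(Arrays.asList("cfs", "cfe", "gen"));

    // Fragile: guesses ownership from the file extension alone.
    static boolean hasKnownExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot >= 0 && KNOWN_EXTENSIONS.contains(fileName.substring(dot + 1));
    }

    // More robust: compare the directory listing against an explicit list
    // of files the index says it uses, however that list is obtained.
    static List<String> notReferenced(Collection<String> filesInDirectory,
                                      Collection<String> referencedFiles) {
        Set<String> referenced = new HashSet<>(referencedFiles);
        List<String> result = new ArrayList<>();
        for (String name : filesInDirectory) {
            if (!referenced.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

A file written by a codec with an unlisted extension fails hasKnownExtension even though it belongs to the index, while notReferenced does not care about extensions at all. In Lucene the referenced list would have to come from the index itself; IndexCommit.getFileNames() looks like one candidate, though that is a suggestion rather than something stated in the thread.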
On Wed, Sep 18, 2013 at 1:03 PM, Yonghui Zhao zhaoyong...@gmail.com wrote:
In lucene
Hi,
Are you talking about updating the content of the index or customizing
the file formats of the index?
On Tue, Sep 17, 2013 at 11:31 PM, Ralf Bierig ralf.bie...@gmail.com wrote:
Hi all,
is there any good documentation of how to change and modify the index of
Lucene version 4 other than
Got it. Currently I don't use any custom codecs.
2013/9/18 Adrien Grand jpou...@gmail.com
Hi,
Since Lucene 4.0 which introduced codecs, it is not possible anymore
to know based on filename extensions whether files have been created
by Lucene or not: every codec is free to use any file
Hi,
On Wed, Sep 18, 2013 at 1:39 PM, Yonghui Zhao zhaoyong...@gmail.com wrote:
Got it. Currently I don't use any custom codecs.
Part of the problem is that even the current codec keeps evolving, and
file extensions that exist today might not be used anymore in 6 months
and vice-versa. I would
Hi,
I wrote some simple code to update a Lucene document with new values.
Code Snippet:
Term term = new Term(PRODUCT_CODE, productCode);
TermQuery query = new TermQuery(term);
TopDocs productDoc = this.searcher.search(query, 1);
int docNum = scoreDoc.doc;
Document doc =
Hi,
the problem is that a document retrieved by IndexReader.document() only
contains stored fields and no indexed fields (they are no longer accessible
from the index). Also, the field types only contain stored as attribute, so
when reindexing with IndexWriter you just create a document with
Hi,
While trying to play with the CompoundWordTokenFilterBase I noticed that
the behavior is to include the original token together with the new
sub-tokens.
I assume this is expected (haven't found any relevant docs on this), but I
was wondering if it's a hard requirement or can I propose a
Hello,
I was going to use the TotalHitCountCollector in cases where I'm
interested just in the number of results.
Obviously I was hoping to gain performance compared to a scored
query.
From my tests it seems it's not as performant as the scored
search. At this point I'm wondering if
Hi,
The ConstantScoreQuery part is just overhead. If scores are not requested, they
should not be calculated - but CSQ cannot prevent this from happening at all.
It just prevents the collector from seeing the scores. As the counting
collector does not request any scores, you just add a
Out of curiosity, what is your use case? I mean, the normal use of this
filter is to permit a shorthand reference to a long term, but why would
you necessarily want to preclude direct reference to the full term?
-- Jack Krupansky
-Original Message-
From: Alex Parvulescu
Sent:
Hello,
Over the last few weeks I've been working on upgrading an application from
Lucene 3.x to Lucene 4.x in hopes of improving performance. Unfortunately,
after going through the full migration process and playing with all sorts
of tweaks I found online and in the documentation, Lucene 4 is
That's the conclusion I was coming to.
Thanks
-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Tuesday, September 17, 2013 9:20 PM
To: java-user@lucene.apache.org
Subject: Re: Can you escape characters you don't want the analyzer to modify
It sounds like
Hi,
Thanks, Mark Miller, for your advice.
I had missed part of it, which is why I could not get the proper value.
I should get the binary value instead of using get() for compressed content.
I tested all the scenarios and I have some doubts:
1. I observed that while searching with highlighter tool, it
Hi Uwe,
Thanks for explaining.
Earlier our system was using version 2.4, and this was possible there.
Anyway, I will implement it correctly as you suggested.
On 18-09-2013 07:41 PM, Uwe Schindler wrote:
Hi,
the problem is that a document retrieved by IndexReader.document() only contains