We are using a similar technique to yours: we keep smaller indexes and use
ParallelMultiSearcher to search across them. Keeping smaller indexes is
good, as indexing and index optimization are faster. There is a small delay
when searching across the indexes.
1. What is your search time?
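For reference, the setup described above can be sketched like this (a minimal sketch assuming Lucene 2.9/3.x, where ParallelMultiSearcher is still available; the index paths and field name are invented):

```java
import java.io.File;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ParallelMultiSearcher;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

// One IndexSearcher per small index; ParallelMultiSearcher fans the
// query out to all of them in parallel and merges the results.
Searchable[] searchers = new Searchable[] {
    new IndexSearcher(FSDirectory.open(new File("/indexes/idx1"))),
    new IndexSearcher(FSDirectory.open(new File("/indexes/idx2"))),
};
ParallelMultiSearcher multi = new ParallelMultiSearcher(searchers);
TopDocs hits = multi.search(new TermQuery(new Term("body", "lucene")), 10);
System.out.println("total hits: " + hits.totalHits);
multi.close();
```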
We are planning to ingest some non-English content into our application. All
content is OCR'ed, and there are a lot of misspellings and garbage terms
because of this. Each document has one primary language, with some exceptions
(e.g. a few English terms mixed in with primarily non-English docu
Hi All,
Can someone please direct me on how to boost the result when specific
keywords are found while searching the documents?
example:
1. While indexing the documents A, B and C, I do not boost any of these
documents. (Field.Store.YES, Field.Index.ANALYZED) and setBoost(1.0)
2. Now I read documen
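One way to achieve this without touching index-time boosts is to boost the keyword clause at query time. A sketch, assuming Lucene 2.9; the field and term names are invented:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

// Documents matching the main term are required (MUST); documents that
// also contain the special keyword match a 5x-boosted SHOULD clause,
// so they rank higher without the keyword being mandatory.
BooleanQuery query = new BooleanQuery();
query.add(new TermQuery(new Term("body", "contract")), BooleanClause.Occur.MUST);
TermQuery keyword = new TermQuery(new Term("body", "urgent"));
keyword.setBoost(5.0f);
query.add(keyword, BooleanClause.Occur.SHOULD);
```

The same effect is available in QueryParser syntax as `contract urgent^5`.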
Hello,
I have some questions about what behavior is expected when passing
Version.LUCENE_24/29/30 to QueryParser and the StandardAnalyzer when parsing
a query. I know that passing the Version to the constructors makes Lucene
act like that version, with all features and bugs intact. The be
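For concreteness, the Version constant is passed like this (a sketch for Lucene 2.9/3.0; the field name is arbitrary):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

// Both the analyzer and the parser take a Version; using the same
// constant for both keeps query-time tokenization consistent.
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
QueryParser parser = new QueryParser(Version.LUCENE_30, "body", analyzer);
Query query = parser.parse("some query text");
```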
> ...
> 1. I've not tested my application with single index as initially (a few
> years back) we thought smaller the index size (7 indexes for default 80%
> searches) the faster the search time would be ...
Possibly. Maybe it will be acceptable to make some searches a bit
slower in order to make
Hi All,
This is my first question on this forum. I am fairly familiar with Lucene
and am using 2.9.4 in my project (not using Solr). I have the following
question about the use of the synonym filter.
While indexing content, I am using the following analyzer setup:
[Analyzer1] == StandardTokenizer --> Stand
Hi Ian,
Thanks for sharing your knowledge and to-the-point answers.
1. I've not tested my application with a single index, as initially (a few
years back) we thought the smaller the index size (7 indexes for the default
80% of searches), the faster the search time would be. Anyway, I'll give it
a try and share t
30 GB isn't that big by Lucene standards. Have you considered or tried
just having one large index? If necessary you could restrict searches
to particular "indexes", or groups thereof, by a field in the combined
index, preferably used as a filter. If the slow searches have to
search across 63 sep
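Restricting searches by a field in a combined index might look like this (a sketch assuming Lucene 2.9/3.x; the "source" field name is invented, and searcher/userQuery stand for an existing IndexSearcher and parsed query):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.CachingWrapperFilter;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.QueryWrapperFilter;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;

// Each document carries a "source" field naming the logical index it
// came from. A cached filter restricts searches to one such "index"
// cheaply, since the filter's bit set is reused across queries.
Filter source = new CachingWrapperFilter(
    new QueryWrapperFilter(new TermQuery(new Term("source", "idx1-3"))));
TopDocs hits = searcher.search(userQuery, source, 10);
```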
Hi list,
We have an index directory of 30 GB which is divided into 3 subdirectories
(idx1, idx2, idx3), which are again divided into 21 sub-subdirectories each
(idx1-1, idx1-2, ..., idx2-1, ..., idx3-1, ..., idx3-21).
We are running with Java 1.6, Lucene 2.9 (going to upgrade to 3.1 very
soon), linu
Attachment didn't work - test below:
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.I
I attach a junit test which shows strange behaviour of the inOrder
parameter on the SpanNearQuery constructor, using Lucene 2.9.4.
My understanding of this parameter is that true forces the order and
false doesn't care about the order.
Using true always works. However using false works fine when
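For context, the two variants differ only in the inOrder flag (a sketch against Lucene 2.9.4; the terms are invented):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

SpanQuery[] clauses = new SpanQuery[] {
    new SpanTermQuery(new Term("body", "quick")),
    new SpanTermQuery(new Term("body", "fox")),
};
// slop = 3; inOrder = true should match only "quick ... fox", while
// inOrder = false should also match "fox ... quick".
SpanNearQuery ordered = new SpanNearQuery(clauses, 3, true);
SpanNearQuery unordered = new SpanNearQuery(clauses, 3, false);
```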
Hi,
Luke cannot search NumericFields correctly, as the official Lucene
QueryParser does not produce numeric range queries, because it does not know
that the field is numeric. It uses a TermRangeQuery, and that may hit random
documents.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http
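In code, bypassing the QueryParser for numeric fields means building the query directly (a sketch; the field name and bounds are made up):

```java
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

// TermRangeQuery compares terms lexicographically, so it misfires on
// the trie-encoded terms that NumericField writes; NumericRangeQuery
// understands that encoding and matches correctly.
Query q = NumericRangeQuery.newLongRange("timestamp", 1000L, 2000L, true, true);
```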
Well, you can use one of the sorting search methods and pass multiple
sort keys, including relevance and a timestamp. But I suspect the
Google algorithm may be a bit more complex than that.
One technique is boosting: set an index-time document boost on recent
documents. Of course, what is recent t
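The two techniques above might look like this in Lucene 2.9/3.x (a sketch; the field names are invented, and searcher/query stand for an existing IndexSearcher and parsed query):

```java
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;

// Primary sort key: relevance score; tie-breaker: newest first
// (reverse = true on a long-valued timestamp field).
Sort sort = new Sort(
    SortField.FIELD_SCORE,
    new SortField("timestamp", SortField.LONG, true));
TopDocs hits = searcher.search(query, null, 20, sort);

// Alternatively, favor recent documents at index time:
// doc.setBoost(1.5f);  // call before writer.addDocument(doc)
```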
Oh sorry, you're right, these are Hibernate Search classes. But I tried to
search inside the Luke tool and didn't find the right data with the given
query (the query is in fact an org.apache.lucene.search.Query, which will be
wrapped into Hibernate Search queries), so I thought this might be a Lu
Hello Jacqueline,
I have no idea what classes inside Lucene you use; the term "FieldBridge"
relates more to Hibernate Search, right? So maybe you should ask this
question on their mailing list. NumericFieldUtils is also not a Lucene class;
to create a numeric query use NumericRangeQuery.newDoubleRange(fi
Hi,
I have indexed some numeric properties (double) by adding numeric fields like
this in a custom FieldBridge:
NumericField field = new NumericField(propertyName, Store.YES, true);
field.setDoubleValue(propertyValue);
document.add(field);
This works fine and with my RangeQueries I g
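On the query side, the matching range query for such a field would be built like this (a sketch; the bounds are arbitrary):

```java
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

// Must use the same field name and numeric type (double) that the
// NumericField was indexed with, or the query will match nothing.
Query range = NumericRangeQuery.newDoubleRange(propertyName, 0.5, 2.5, true, true);
```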
> The same functionality can be achieved per field using
> Field.Index.NOT_ANALYZED.
;)
> -Original Message-
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Monday, 9 May 2011 10:07
> To: java-user@lucene.apache.org
> Subject: RE: Is there kind of a "NullAnalyzer" ?
>
>
Hi,
KeywordTokenizer and KeywordAnalyzer.
The same functionality can be achieved per field using
Field.Index.NOT_ANALYZED.
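For example (a sketch; both variants keep the whole field value as a single token):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// Per-field: store and index the raw value as one token, no analysis.
Document doc = new Document();
doc.add(new Field("title", "three word phrase",
                  Field.Store.YES, Field.Index.NOT_ANALYZED));

// Or globally: pass a KeywordAnalyzer to the IndexWriter, which emits
// each field value as a single token for every field.
```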
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Clemens Wyss [mailto:clemens...
Thx!
> -Original Message-
> From: Federico Fissore [mailto:feder...@fissore.org]
> Sent: Monday, 9 May 2011 09:52
> To: java-user@lucene.apache.org
> Subject: Re: Is there kind of a "NullAnalyzer" ?
>
> Clemens Wyss, on 09/05/2011 09:42, wrote:
> > i.e. an analyzer which t
Clemens Wyss, on 09/05/2011 09:42, wrote:
i.e. an analyzer which takes the field to be analyzed as is into the index...?
The fields I am trying to index have a max length of 3 words and I don't want
to match sub terms of these fields.
keyword analyzer?
https://lucene.apache.org/java/3_0
i.e. an analyzer which takes the field to be analyzed as is into the index...?
The fields I am trying to index have a max length of 3 words and I don't want
to match sub terms of these fields.