Is it possible to do a fuzzy prefix query in Lucene:
e.g.:
term "foh*~"
should match:
foo
foobar
fohbar
...
i.e. the fuzziness should be applied to the given prefix.
Thx
- Clemens
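As far as I know there is no built-in query that applies fuzziness to a prefix. A hedged
approximation (Lucene 4.x-era API, field name "field" assumed) is to OR an exact PrefixQuery
with a whole-term FuzzyQuery; this surfaces "fohbar" (prefix) and "foo" (fuzzy), but a true
fuzzy prefix such as matching "foobar" from "foh" would still need custom work, e.g.
intersecting the terms dictionary with a Levenshtein automaton of the prefix.

BooleanQuery query = new BooleanQuery();
query.add( new PrefixQuery( new Term( "field", "foh" ) ), BooleanClause.Occur.SHOULD );
query.add( new FuzzyQuery( new Term( "field", "foh" ), 1 ), BooleanClause.Occur.SHOULD );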
Say our index has (documents with) three fields "f1", "f2" and "f3" and I want
to find all documents matching "foo" and "bar" in any combination of the three
fields.
1) The more words that match, the higher the document's ranking. So it is not really a
strict AND-query...
2) The more words that match in a si
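A minimal sketch of such a query, assuming a Lucene 4.x-era MultiFieldQueryParser and the
default OR operator: the terms are spread over all three fields, and documents matching more
of the words score higher, which matches the "not really a strict AND" behaviour described above.

String[] fields = { "f1", "f2", "f3" };
MultiFieldQueryParser parser = new MultiFieldQueryParser( Version.LUCENE_47, fields, analyzer );
Query query = parser.parse( "foo bar" );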
16:33 GMT+09:00 Clemens Wyss DEV :
> I'll give it a try. Any plans for luke to support Lucene 5.x, too?
>
> -Original Message-
> From: Koji Sekiguchi [mailto:koji.sekigu...@rondhuit.com]
> Sent: Friday, April 24, 2015 08:23
> To: java-user@lucene.apache.o
which stands for Natural Language Processing for Lucene, has a function
for browsing Lucene index aside from NLP tools. It supports 5.x index format.
https://github.com/NLP4L/nlp4l#using-lucene-index-browser
Thanks,
Koji
On 2015/04/24 15:10, Clemens Wyss DEV wrote:
> From time to time I make
From time to time I make use of luke to inspect lucene indexes. I appreciate
this tool very much.
Will there be a version of Luke for Lucene 5.x? Or is there one already?
Thx
Clemens
effects).
But in general, you should really enable DocValues for fields you want to sort
on.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-----
> From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
>
eap consuming than
FieldCache.
[1] https://wiki.apache.org/lucene-java/ReleaseNote50
On Mon, Feb 23, 2015 at 1:24 PM, Clemens Wyss DEV
wrote:
> After upgrading to Lucene 5 one of my unittest which tests sorting
> fails
> with:
> unexpected docvalues type NONE for field 'providerte
After upgrading to Lucene 5, one of my unit tests, which tests sorting, fails with:
unexpected docvalues type NONE for field 'providertestfield' (expected=SORTED).
Use UninvertingReader or index with docvalues
What am I missing?
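A minimal indexing-side sketch, assuming a plain string field on Lucene 5.x: adding a
SortedDocValuesField alongside the indexed/stored field is what makes sorting work without
resorting to UninvertingReader.

Document doc = new Document();
doc.add( new StringField( "providertestfield", value, Field.Store.YES ) );
doc.add( new SortedDocValuesField( "providertestfield", new BytesRef( value ) ) );
writer.addDocument( doc );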
Got this one sorted out. I was still referencing the 4.x lucene-analyzers.jar
which required the reader ;)
Sorry for the noise!
-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
Sent: Monday, February 23, 2015 12:42
To: java-user@lucene.apache.org
Subject
My custom Analyzer had the following (Lucene 4) impl of createComponents:
protected TokenStreamComponents createComponents ( final String fieldName,
                                                   final Reader reader )
{
    Tokenizer source = new KeywordTokenizer( reader );
    TokenStream
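For comparison, a hedged sketch of the Lucene 5.x shape of the same method: createComponents
no longer receives a Reader, and the Tokenizer is constructed without one (the framework calls
setReader later). The filter chain shown here is illustrative only.

@Override
protected TokenStreamComponents createComponents( final String fieldName )
{
    Tokenizer source = new KeywordTokenizer();
    TokenStream result = new LowerCaseFilter( source );
    return new TokenStreamComponents( source, result );
}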
ericType() chain you'll end up
with INT or DOUBLE or whatever.
But if you know you stored it as an IntField then surely you already know it's
an integer? Unless you sometimes store different things in the one field. I
wouldn't do that.
--
Ian.
On Thu, Feb 19, 2015 at 12:22 P
When I index a Document with an IntField and then retrieve that very Document, the
former IntField is returned as a StoredField. How do I determine the "original"
field type (IntField, LongField, DoubleField ...)?
Must I do:
Number number = field.numericValue();
if( number != null )
{
    if( number instanc
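A hedged completion of that idea: the stored value still carries its Java type, so instanceof
checks on numericValue() can recover whether the original field was an int, long, float or
double ('field' being the IndexableField returned from the stored document).

Number number = field.numericValue();
if( number instanceof Integer )      { /* originally an IntField */ }
else if( number instanceof Long )    { /* originally a LongField */ }
else if( number instanceof Float )   { /* originally a FloatField */ }
else if( number instanceof Double )  { /* originally a DoubleField */ }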
Sorry for cross-posting, but the tika-ml does not seem to be too "lively":
I am trying to make use of the ForkParser. Unfortunately I am getting „Lost
connection to a forked server process“ for an (encrypted) pdf which I can
extract „in-process“. Extracting the document "in-process" takes appro
ereas I'd like to have the original only ...
-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
Sent: Tuesday, January 27, 2015 09:08
To: java-user@lucene.apache.org
Subject: LowercaseFilter, preserveOriginal?
Why does the LowercaseFilter, as opposed to
Why does the LowercaseFilter, as opposed to the ASCIIFoldingFilter, have no
preserveOriginal argument?
I very much like preserveOriginal="true" when applying the ASCIIFoldingFilter for
(German) suggestions
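LowerCaseFilter has no such flag; a hedged sketch of a custom filter (not part of Lucene) that
emits the original token and, where it differs, a lowercased copy at the same position:

import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.AttributeSource;

public final class PreserveOriginalLowerCaseFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute( CharTermAttribute.class );
    private final PositionIncrementAttribute posIncAtt = addAttribute( PositionIncrementAttribute.class );
    private AttributeSource.State pending; // original token whose lowercased copy is still due

    public PreserveOriginalLowerCaseFilter( TokenStream input ) {
        super( input );
    }

    @Override
    public boolean incrementToken() throws IOException {
        if ( pending != null ) {
            // second emission: same token, lowercased, at the same position
            restoreState( pending );
            pending = null;
            posIncAtt.setPositionIncrement( 0 );
            char[] buffer = termAtt.buffer();
            for ( int i = 0; i < termAtt.length(); i++ ) {
                buffer[i] = Character.toLowerCase( buffer[i] );
            }
            return true;
        }
        if ( !input.incrementToken() ) {
            return false;
        }
        // if lowercasing would change the token, remember it and emit the original now
        char[] buffer = termAtt.buffer();
        for ( int i = 0; i < termAtt.length(); i++ ) {
            if ( Character.toLowerCase( buffer[i] ) != buffer[i] ) {
                pending = captureState();
                break;
            }
        }
        return true;
    }
}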
e there are good (but less popular) prefix hits. Having
control over which suggester is used when, and how its specific suggestions are
merged into the final result list, helps improve the user experience, at
least with our use cases.
Cheers, Oli
-Original Message-
From: Clemens Wyss D
ry would then be:
visiblefrom:[0 TO ] AND visibleto:[ TO ]
And a rather Solr'y question, nevertheless I ask it here:
I intended to use this very query as query filter (qf), but I guess it doesn't
make sense because '' changes at every call ;)
-Original Message
me, use maxlong.
-Mike
On 1/12/15 4:23 AM, Clemens Wyss DEV wrote:
> I'll add/start with my proposal ;)
>
> Document-meta fields:
> + visiblefrom [long]
> + visibleto [long]
>
> Query or query filter:
> (*:* -visiblefrom:[* TO *] AND -visibleto:[* TO *]) OR (*:*
> -
)
OR ( visiblefrom:[* TO ] AND visibleto:[ TO *])
-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
Sent: Monday, January 12, 2015 09:40
To: java-user@lucene.apache.org
Subject: howto: handle temporal visibility of a document?
We have documents that are not always vi
We have documents that are not always visible (visiblefrom-visibleto). In order
not to have to ask the originating object of the document whether it is
currently visible (after the query), we'd like to put metadata into the
documents, so that the visibility can be determined at query time (by
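A hedged sketch of the query side (Lucene 4.x numeric fields, field names as proposed in the
thread, maxlong as the convention for "no end"): a document is visible when
visiblefrom <= now AND visibleto >= now.

long now = System.currentTimeMillis();
BooleanQuery visibility = new BooleanQuery();
visibility.add( NumericRangeQuery.newLongRange( "visiblefrom", 0L, now, true, true ),
    BooleanClause.Occur.MUST );
visibility.add( NumericRangeQuery.newLongRange( "visibleto", now, Long.MAX_VALUE, true, true ),
    BooleanClause.Occur.MUST );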
TO *]
(That's asterisk:asterisk -field1:[* TO *] in case the silly list interprets
the asterisks as markup)
There's some special magic in filter query processing to handle this case, but
not in the main query parser.
Best,
Erick
On Wed, Jan 7, 2015 at 8:14 AM, Clemens Wyss DEV wrote:
> Sa
query processing to handle this case, but
not in the main query parser.
Best,
Erick
On Wed, Jan 7, 2015 at 8:14 AM, Clemens Wyss DEV wrote:
> Say I wanted to find documents which have no content in "field1" (or
> dosuments that have no field 'field1'), wouldn&
Say I wanted to find documents which have no content in "field1" (or documents
that have no field 'field1'), wouldn't that be the following query?
-field1:[* TO *]
Thanks for your help
Clemens
One of our indexes is updated completely quite frequently -> "batch update" or
"re-index".
In doing so, more than 2 million documents are added/updated to/in that index. This
creates an immense IO load on our system. Does it make sense to set the merge
scheduler to NoMergeScheduler (and/or the MergePolicy
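A hedged sketch of that configuration, assuming a Lucene 5.x release where
NoMergePolicy.INSTANCE exists (the constants differ on 4.x): merges are skipped entirely
during the batch and can be triggered once at the end instead.

IndexWriterConfig config = new IndexWriterConfig( analyzer );
config.setMergePolicy( NoMergePolicy.INSTANCE );        // no merges during the batch
config.setMergeScheduler( NoMergeScheduler.INSTANCE );  // no background merge threads
IndexWriter writer = new IndexWriter( directory, config );
// ... add/update the ~2 million documents ...
// writer.forceMerge( 1 );  // optionally merge once afterwards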
Subject: Re: [suggestions] fetch terms from a FilterAtomicReader(subclass)?
On 10/27/2014 07:32 AM, Clemens Wyss DEV wrote:
> Is it possible to fetch the terms of a FilterAtomicReader in order to provide
> suggestions from a subset of all documents in an index?
Yes, it is possible.
I do it by f
Is it possible to fetch the terms of a FilterAtomicReader in order to provide
suggestions from a subset of all documents in an index?
So my target is to "provide suggestions from a subset of all documents in an
index".
Note:
I have an "equal" discussion ongoing in the solr-mailinglist. But I th
The following "testcase" runs endlessly and produces VERY heavy load.
...
String query = "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed
diam nonumy eirmod tempor invidunt ut "
+ "labore et dolore magna aliquyam erat, sed
diam voluptua. At vero eos et
Sorry for re-asking.
Has anyone implemented an AnalyzingSuggester which
- is fuzzy
- is case insensitive (or must/should this be implemented by the analyzer?)
- does infix search
[- has a small memory footprint]
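A hedged sketch covering the first two points (assuming Lucene 4.6+): FuzzySuggester provides
the fuzziness, and case insensitivity comes from the analyzer you pass in (e.g. one containing
a LowerCaseFilter). Infix search is not covered here; AnalyzingInfixSuggester is the separate
candidate for that. 'analyzer', 'indexReader' and the field name are assumptions.

FuzzySuggester suggester = new FuzzySuggester( analyzer );
suggester.build( new LuceneDictionary( indexReader, "suggestField" ) );
List<Lookup.LookupResult> results = suggester.lookup( "foh", false, 5 );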
-Original Message-
From: Clemens Wyss DEV [mailto:clemens
http://blog.mikemccandless.com
On Thu, Jun 19, 2014 at 12:54 PM, Clemens Wyss DEV wrote:
> directory = new SimpleFSDirectory( indexLocation );
> IndexWriterConfig config = new IndexWriterConfig( Version.LUCENE_47,
>     new WhitespaceAnalyzer( Version.LUCENE_47 ) );
> indexWriter = new Index
case is supposed to work; if it doesn't it's a bad bug :)
Can you reduce it to a small example?
Mike McCandless
http://blog.mikemccandless.com
On Wed, Jun 18, 2014 at 10:08 AM, Clemens Wyss DEV wrote:
> I would like to perform a batch update on an index. In order to omit
>
I would like to perform a batch update on an index. In order to avoid duplicate
entries I am making use of IndexWriter#updateDocument(Term, Document):
open an IndexWriter;
foreach( element in elementsToBeUpdatedWhichHaveDuplicates )
{
    doc = element.toDoc();
    indexWriter.updateDocument( uniqueTermFor
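A hedged completion of that pseudocode (Element and uniqueTermFor are the poster's hypothetical
helpers): updateDocument deletes any existing document matching the unique term and then adds
the new one.

IndexWriter indexWriter = new IndexWriter( directory, config );
try {
    for ( Element element : elementsToBeUpdatedWhichHaveDuplicates ) {
        Document doc = element.toDoc();
        indexWriter.updateDocument( uniqueTermFor( element ), doc );
    }
    indexWriter.commit();
} finally {
    indexWriter.close();
}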
- does this mean
5485824 bytes, or ~5.2 MB? This is probably "correct", meaning this is the RAM
to hold the terms index. But I can't see from your heap dump output where the
other ~51.3 MB is being used by StandardDirectoryReader.
-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Friday, June 13, 2014 15:48
To: Lucene Users
Subject: Re: [lucene 4.6] NPE when calling IndexReader#openIfChanged
On Fri, Jun 13, 2014 at 8:53 AM, Clemens Wyss DEV wrote:
> Thanks a lot!
>>
Looking for an AnalyzingSuggester which supports
- fuzzyness
- case insensitivity
- small (in-memory) footprint (*)
(*) Just tried to "hand" my big IndexReader (see other post "[lucene 4.6] NPE
when calling IndexReader#openIfChanged") into JaspellLookup. Got an OOM.
Is there any (Jaspell)Lookup im
Does this help?
-----Original Message-
From: Michael McCandless [mailto:luc...@mikemc
anged
On Wed, May 21, 2014 at 3:17 AM, Clemens Wyss DEV wrote:
>> Can you just decrease IW's ramBufferSizeMB to relieve the memory pressure?
> +1
> Is there something alike for IndexReaders?
No, although you can take steps during indexing to reduce the RAM required
during sear
I'm doing something similar, adding weighting as some function of doc freq (and
using Scala).
Cheers,
Neil
On 13/06/14 00:19, Clemens Wyss DEV wrote:
> enter InputIteratorWrapper ;) i.e. new InputIteratorWrapper(tfit )
>
> -Original Message-----
> From: Clemens Wyss DEV [mailt
enter InputIteratorWrapper ;) i.e. new InputIteratorWrapper(tfit )
-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
Sent: Thursday, June 12, 2014 16:01
To: java-user@lucene.apache.org
Subject: RE: Analyzing suggester for many fields
trying to re-build
-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
Sent: Wednesday, June 11, 2014 12:57
To: java-user@lucene.apache.org
Subject: RE: Analyzing suggester for many fields
Unfortunately the link provided by Goutham is no more valid. Anybody still got
the code?
-Original Message
Unfortunately the link provided by Goutham is no more valid. Anybody still got
the code?
-Original Message-
From: Goutham Tholpadi [mailto:gtholp...@gmail.com]
Sent: Thursday, August 29, 2013 06:21
To: java-user@lucene.apache.org
Subject: Re: Analyzing suggester for many fiel
cene 4.6] NPE when calling IndexReader#openIfChanged
On Mon, May 19, 2014 at 6:14 AM, Clemens Wyss DEV wrote:
> Mike,
> first of all thanks for all your input, I really appreciate (as much as I
> like reading your blog).
You're welcome!
>> Hmm, but you swap these files over w
ommit in order to see changes.
What if I were to search right after deleteAll?
-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Monday, May 19, 2014 11:05
To: Lucene Users
Subject: Re: [lucene 4.6] NPE when calling IndexReader#openIfChanged
O
ing files
directly from the index directory yourself between reopens?
Mike McCandless
http://blog.mikemccandless.com
On Mon, May 19, 2014 at 1:36 AM, Clemens Wyss DEV wrote:
> Sorry for being imprecise
> java version "1.6.0_26"
> Java(TM) SE Runtime Environment (build 1
e :)
Possibly a concurrency/timing issue?
-Original Message-----
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
Sent: Monday, May 19, 2014 07:37
To: java-user@lucene.apache.org
Subject: RE: [lucene 4.6] NPE when calling IndexReader#openIfChanged
Sorry for being imprecise
java ve
May 18, 2014 16:51
To: Lucene Users
Subject: Re: [lucene 4.6] NPE when calling IndexReader#openIfChanged
But what is the output of "java -fullversion"?
Mike McCandless
http://blog.mikemccandless.com
On Sun, May 18, 2014 at 5:24 AM, Clemens Wyss DEV wrote:
>> What java versio
ttp://blog.mikemccandless.com
On Wed, May 14, 2014 at 2:16 AM, Clemens Wyss DEV wrote:
> Tackled this down a little bit more:
> Lucene40LiveDocsFormat#readLiveDocs calls
> IndexFileNames#fileNameForGeneration
> If I get this right, param 'gen' seems to be -1.
> Gen is being gat
-Original Message-----
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
Sent: Tuesday, May 13, 2014 18:23
To: java-user@lucene.apache.org
Subject: [lucene 4.6] NPE when calling IndexReader#openIfChanged
I am facing the following stacktrace:
java.lang.NullPointerException: null
at jav
Not closing an IndexReader most probably (to say the least) results in a
mem-leak -> OOM
> But if I close it, given that it is shared by multiple threads, I will
> need to check each time before doing the search if the IndexReader is still open,
> correct?
You can make use of IndexReader#incRef/#decRef ,
> But if I close it, given that it is shared by multiple threads, I will need to
> check each time
> before doing the search if the IndexReader is still open, correct?
You can make use of IndexReader#incRef/#decRef , i.e.
ir.incRef();
try
{
Or maybe SearcherManager
http://blog.mikemccandless.com/2011/09
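A hedged completion of that pattern: every searching thread brackets its work with
incRef/decRef, and the reader is only really closed once the last reference has been released.

ir.incRef();
try {
    IndexSearcher searcher = new IndexSearcher( ir );
    TopDocs hits = searcher.search( query, 10 );
    // ... use hits while the reference is held ...
} finally {
    ir.decRef();
}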
I am facing the following stacktrace:
java.lang.NullPointerException: null
at java.io.File.<init>(File.java:305) ~[na:1.6.0_26]
at
org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:80)
~[lucene-core.jar:4.6.0 1543363 - simon - 2013-11-19 11:05:50]
, and extends LuceneTestCase, using
newDirectory and so on.
If you have files still open, this will fail the test and give you a stacktrace
of where you initially opened the file.
On Sun, Dec 9, 2012 at 12:28 PM, Clemens Wyss DEV wrote:
> Hi Mike,
> unfortunately not. When I run the unit t
@lucene.apache.org
Betreff: Re: Lucene (4.0), junit, failed to delete _0_nrm.cfs
Can you post the source code for your test case?
Mike McCandless
http://blog.mikemccandless.com
On Sun, Dec 9, 2012 at 11:45 AM, Clemens Wyss DEV wrote:
> I am (also) running lucene unit tests.
>
> In the teardo
I am (also) running lucene unit tests.
In the teardown method (@After) I (try to) delete the complete index directory.
Unfortunately this does not always work. If not, the file _0_nrm.cfs (or
_0.fdx) is the first to cause problems, i.e. it is "locked"...
I do explicitly close the writers/read
I have a CustomAnalyzer which overrides "public final TokenStream tokenStream (
String fieldName, Reader reader )":
@Override
public final TokenStream tokenStream ( String fieldName, Reader reader )
{
    boolean fieldRequiresExactMatching =
        IndexManager.getInstance().isExactMatchField( fieldName );
d) and then convert the query *plan to
nalp* :).
You can also index the suffixes of words, e.g. vacancyplan, acancyplan,
cancyplan and so forth, and then convert the query *plan to plan. Note that it
increases the lexicon !
Shai
On Fri, Dec 7, 2012 at 11:16 AM, Clemens Wyss DEV wrote:
> In
In order to provide suggestions our query also includes a "WildcardQuery with a
leading *", which, of course, has a HUGE performance impact :-(
E.g.
Say we have indexed "vacancyplan", then if a user typed "plan" he should also
be offered "vacancyplan" ...
How can this feature be implemented wit
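A hedged sketch of the suffix-indexing idea suggested in the reply above: at index time, add
every suffix of a term to an extra field, so the expensive leading-wildcard query "*plan" can
be replaced by a cheap prefix (or term) query on that field. The field name is hypothetical.

// for "vacancyplan" this adds vacancyplan, acancyplan, cancyplan, ..., plan, ...
for ( int i = 0; i < term.length(); i++ ) {
    doc.add( new StringField( "term_suffixes", term.substring( i ), Field.Store.NO ) );
}
// query time: new PrefixQuery( new Term( "term_suffixes", "plan" ) )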
Is there a way to limit an IndexWriter's memory usage?
While indexing very many documents my IndexWriter occupies > 30MB in memory.
Is there a way to limit this "usage"?
Thx
Clemens
ings (indexed documents)
If you turn on IndexWriter's infoStream, do you see output saying it's flushing
a new segment because RAM is > 5.0 MB?
Mike McCandless
http://blog.mikemccandless.com
On Mon, Apr 16, 2012 at 4:46 AM, Clemens Wyss wrote:
> We limit the memory con
We limit the memory consumption of our IndexWriters by setting RAMBufferSizeMB
to 5MB (IndexWriterConfig.setRAMBufferSizeMB).
Inspecting a heap dump reveals that I still have writers which consume/retain
more than 35MB!
How come?
Any help/advice appreciated
Clemens
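A hedged sketch (4.x-era API, Version constant and analyzer are assumptions) of the
configuration in question, with infoStream enabled to verify the flushes; note that the RAM
buffer only bounds the buffered documents, not every structure the writer retains, which is
one reason a heap dump can show more.

IndexWriterConfig config = new IndexWriterConfig( Version.LUCENE_47, analyzer );
config.setRAMBufferSizeMB( 5.0 );    // flush a new segment at ~5 MB of buffered documents
config.setInfoStream( System.out );  // log flush/merge decisions for verification
IndexWriter writer = new IndexWriter( directory, config );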
---
>Chuck Norris
... with his swiss army knife ... ;)
Greetings from Switzerland
- Clemens-having-a-swiss-army-knife-too
> -Original Message-
> From: Dawid Weiss [mailto:dawid.we...@gmail.com]
> Sent: Monday, October 24, 2011 17:01
> To: java-user@lucene.apache.org
> Subject: Re: A
Writers.
> > Clone it before, if you want to use it multiple times.
> >
> > Uwe
> >
> > -
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> >> --
I am seeing this stack trace in my logs:
org.apache.lucene.util.SetOnce$AlreadySetException: The object cannot be set
twice!
at org.apache.lucene.util.SetOnce.set(SetOnce.java:69)
at
org.apache.lucene.index.MergePolicy.setIndexWriter(MergePolicy.java:271)
at org.apache.luc
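The stack trace is the situation described in the quoted reply above: a MergePolicy (via its
IndexWriterConfig) can only be bound to one IndexWriter. A hedged sketch of the usual fix is
simply to build a fresh config per writer rather than sharing one instance; the Version
constant and analyzer are assumptions.

IndexWriter openWriter( Directory dir, Analyzer analyzer ) throws IOException {
    // one IndexWriterConfig (and thus one MergePolicy instance) per IndexWriter
    IndexWriterConfig config = new IndexWriterConfig( Version.LUCENE_36, analyzer );
    return new IndexWriter( dir, config );
}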
Ok, I just read the java doc ...
Is there a possibility to just revert the pending writes of an IR?
> -Original Message-
> From: Clemens Wyss [mailto:clemens...@mysign.ch]
> Sent: Tuesday, July 26, 2011 17:25
> To: java-user@lucene.apache.org
> Subject: RE: impl
Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Clemens Wyss [mailto:clemens...@mysign.ch]
> > Sent: Tuesday, July 26, 2011 4:27 PM
> > To: java-user@lucene.apache.org
> > Subject: AW: implicit closing of an IndexWr
ot;,
> the original JVM problem itself is still there and cannot be fixed (if you
> interrupt threads).
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> >
I am using Lucene 3.3
> -Original Message-
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: Tuesday, July 26, 2011 16:05
> To: java-user@lucene.apache.org
> Subject: Re: implicit closing of an IndexWriter
>
>
> On Jul 26, 2011, at 9:5
Under which circumstances is an IndexWriter "implicitly" closed?
I have an IndexWriter member in one of my helper classes which is opened in
the constructor. I never close this member explicitly. Nevertheless I
encounter AlreadyClosedExceptions when writing through the IndexWriter ...
; H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Clemens Wyss [mailto:clemens...@mysign.ch]
> > Sent: Thursday, June 30, 2011 9:44 AM
> > To: java-user@lucene.apache.org
> >
), filter, 10 );
The filter never ever lets any documents through... when calling
result = indexSearcher.search( new WildcardQuery( new Term( "description",
"*happy*" ) ), 10 );
I have hits...
> -Original Message-
> From: Clemens Wyss [mailto:clemens...
rom, eg a MatchAllDocsQuery.
>
> karl
>
> 29 jun 2011 kl. 17.25 skrev Clemens Wyss:
>
> > Say I have a document with field "f1". How can I search Documents which
> have not "test" in field "f"
> > I tried:
> > -f: *test*
&
Say I have a document with field "f1". How can I search for documents which do
not have "test" in field "f"?
I tried:
-f: *test*
f: -*test*
f: NOT *test*
but no luck. Using the WildcardQuery class...
Any advice?
Thx
Clemens
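A hedged sketch of the pattern hinted at in the reply above (Lucene 3.3-era API): a purely
negative clause matches nothing on its own, so anchor it to a MatchAllDocsQuery.

BooleanQuery query = new BooleanQuery();
query.add( new MatchAllDocsQuery(), BooleanClause.Occur.MUST );
query.add( new WildcardQuery( new Term( "f1", "*test*" ) ), BooleanClause.Occur.MUST_NOT );
TopDocs result = indexSearcher.search( query, 10 );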
I have a minimal unit test in which I add three documents to an index. The
documents have two fields "year" and "description".
doc1(year = "2007", desc = "text with 2007 and 2009")
doc2(year = "2009", desc = "text with 2007 and 2009")
doc3(year = "2008", desc = "text with 2007 and 2009")
To searc
gt; eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: Clemens Wyss [mailto:clemens...@mysign.ch]
> > Sent: Monday, May 09, 2011 9:43 AM
> > To: java-user@lucene.apache.org
> > Subject: Is there kind of a "NullAnalyzer" ?
> >
> > i.
Thx!
> -Original Message-
> From: Federico Fissore [mailto:feder...@fissore.org]
> Sent: Monday, May 9, 2011 09:52
> To: java-user@lucene.apache.org
> Subject: Re: Is there kind of a "NullAnalyzer" ?
>
> Clemens Wyss, on 09/05/2011 09:42, wrote
i.e. an analyzer which puts the field to be analyzed into the index as-is...?
The fields I am trying to index have a max length of 3 words and I don't want
to match sub-terms of these fields.
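A hedged pointer: KeywordAnalyzer behaves like the "NullAnalyzer" described here, emitting the
whole field value as a single token. Wrap it per field (sketch below, Lucene 3.x-era API, field
name hypothetical) if only some fields should be indexed verbatim.

Analyzer verbatim = new KeywordAnalyzer();
Map<String, Analyzer> perField = Collections.<String, Analyzer>singletonMap( "title", new KeywordAnalyzer() );
Analyzer analyzer = new PerFieldAnalyzerWrapper( new StandardAnalyzer( Version.LUCENE_33 ), perField );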
sition, and found after a while that I didn't
> really
> want to go back FWIW.
>
> Best
> Erick
>
> On Thu, May 5, 2011 at 2:26 AM, Clemens Wyss
> wrote:
> > What I am looking for is the autosuggestion implemented here (@solr)
> >
> > http://search-
ll get (cased) suggestions back.
> If you need cased suggestions, but provide normalized (lowercased) prefixes
> you'll get nothing, although such a feature would be relatively easy to
> implement based on the automaton code currently in the SVN.
>
> Dawid
>
> On Fri, May
> normalization at the time you query for suggestions.
> 3. "http://search-lucene.com/m/586gA4ccL11";. I have no idea.
>
> Dawid
>
> On Fri, May 6, 2011 at 11:06 AM, Clemens Wyss
> wrote:
>
> > I have come across TSTLookup.
> > In which jar Do I find FSTL
> FSTLookupTest -- you can populate FSTLookup manually with terms/ phrases
> from your index and then use the resulting automaton for suggestions.
>
> Dawid
>
> On Thu, May 5, 2011 at 2:54 PM, Clemens Wyss
> wrote:
>
> > I have implemented my index (in fact it'
I have implemented my index (in fact it's a pluggable indexing API) in "plain
Lucene". I tried to implement a term suggestion mechanism on my own, but am not
too happy so far.
At
http://search-lucene.com/m/0QBv41ssGlh/suggestion&subj=Auto+Suggest
I have seen Solr's auto suggestion for search terms.
are after.
> See how Solr uses it here:
> http://search-
> lucene.com/c/Solr:/src/java/org/apache/solr/analysis/EdgeNGramTokenizer
> Factory.java||EdgeNGramTokenizer
>
>
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem
> search ::
want to do this?
> What is the use-case you're trying to solve? Is relevance not what you want?
> Are you just experimenting?
>
> The statement of *what* you want to do is clear, but I don't know an easy way
> to do that. Perhaps there's a better approach to solving the un
ing of a field/document
>
> What is the problem you're trying to solve? I'm wondering if this is an XY
> problem. See:
> http://people.apache.org/~hossman/#xyproblem
>
> Best
> Erick
>
> On Wed, May 4, 2011 at 3:16 AM, Clemens Wyss
> wrote:
> > G
Given that I have 3 documents with exactly one field and the fields have the
following contents:
This is a moon
The moon is bright
moon
If I analyze these documents they all hit on "moon". But how do I need to
analyze/search my index in order to have the following "sort order":
moon
The moon is b
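One hedged approach (3.x-era API, field names are assumptions): index an additional
untokenized copy of the field and boost exact matches on it, so the document that is exactly
"moon" ranks above documents that merely contain it.

TermQuery exact = new TermQuery( new Term( "content_exact", "moon" ) );
exact.setBoost( 10f );
BooleanQuery query = new BooleanQuery();
query.add( exact, BooleanClause.Occur.SHOULD );
query.add( new TermQuery( new Term( "content", "moon" ) ), BooleanClause.Occur.SHOULD );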
just an example. Stick another tokenizer in there, like
> WhitespaceTokenizer in there, for example.
>
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem
> search :: http://search-lucene.com/
>
>
>
> - Original Message
&g
ter.Side.FRONT, 1, 4); }
>
>
> Check out page 265 of Lucene in Action 2.
>
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message
> > F
/examples closely enough, but you may want to look at
> this if you haven't done so yet.
>
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem
> search :: http://search-lucene.com/
>
>
>
> - Original Message
&g
Subject: RE: "fuzzy prefix" search
>
> Have you tried
>
> Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.499f);
>
>
> Sven
>
>
> -Original Message-
> From: Clemens Wyss [mailto:clemens...@mysign.ch]
> Sent
; and "merlot"? Would it be
> less than 1.5, which I reckon would be the value of length(term)*0.5 as
> detailed in the javadocs? Seems unlikely, but I don't really know anything
> about the Levenshtein (edit distance) algorithm as used by FuzzyQuery.
> Wouldn'
" search
>
> Mer != mer. The latter will be what is indexed because StandardAnalyzer
> calls LowerCaseFilter.
>
> --
> Ian.
>
>
> On Tue, May 3, 2011 at 9:56 AM, Clemens Wyss
> wrote:
> > Sorry for coming back to my issue. Can anybody exp
archer( indexReader );
Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f, 0, 10 );
// or Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f);
TopDocs result = searcher.search( q, 10 );
Assert.assertEquals( 1, result.totalHits );
- Clemens
&
Is it the combination of FuzzyQuery and Term which makes the search go for
"word boundaries"?
> -Original Message-
> From: Clemens Wyss [mailto:clemens...@mysign.ch]
> Sent: Monday, May 2, 2011 14:13
> To: java-user@lucene.apache.org
> Subject:
rms must match this prefix and
> the
> rest of each term is matched using fuzzy.
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -Original Message-
> > From: Clemens Wyss
I'd like to search fuzzily but not on a full term.
E.g.
I have a text "Merlot del Ticino"
I'd like
"mer", "merr", "melo", ... to match.
If I use FuzzyQuery, only "merlot" and "merlott" hit. What Query combination should
I use?
Thx
Clemens
--
ens
> -Original Message-
> From: Grant Ingersoll [mailto:gsing...@apache.org]
> Sent: Sunday, April 24, 2011 08:30
> To: java-user@lucene.apache.org
> Subject: Re: "Umlaute" getting lost
>
>
> On Apr 21, 2011, at 5:02 PM, Clemens Wyss wrot
pril 2011 12:13
> To: java-user@lucene.apache.org
> Subject: Re: "Umlaute" getting lost
>
> On Sun, Apr 24, 2011 at 8:30 AM, Grant Ingersoll
> wrote:
> >
> > On Apr 21, 2011, at 5:02 PM, Clemens Wyss wrote:
> >
> >> I keep my search terms in a
I keep my search terms in a dedicated RAMDirectory (the termIndex).
In there I place all the terms of my real index. When putting the terms into the
termIndex I can still see [using the debugger] the umlauts (äöü). Unfortunately
when searching the
termIndex the documents no longer contain these um
> Subject: Re: German*Filter, Analyzer "cutting" off letters from (French)
> words...
>
> On Fri, Apr 15, 2011 at 8:48 AM, Clemens Wyss
> wrote:
> > Does the StandardAnalyzer lowercase its terms?
> yes!
>
> simon
> >
> >> -Original Message
Does the StandardAnalyzer lowercase its terms?
> -Original Message-
> From: Clemens Wyss [mailto:clemens...@mysign.ch]
> Sent: Wednesday, April 13, 2011 13:34
> To: java-user@lucene.apache.org
> Subject: RE: German*Filter, Analyzer "cutting" off lett