Hi all,
I am using Lucene 2.3.2 as the search engine for my e-paper site so that
users can search the news. That part works, but now I am trying to
implement autosuggest so that the user can pick a choice from a drop-down
instead of typing in the entire sentence.
I have d
Hi,
I am finding it very hard to implement a GivenCharFilter (extends
TokenFilter) using incrementToken. My requirement is this: I want to
analyze a StringReader("axb xxa xx c") into these tokens, written as
term(startOffset, endOffset, posIncr):
a(0,1,1) b(2,3,1) a(4,5,1) c(6,7,1).
First I use a WhiteS
On Mon, Nov 23, 2009 at 12:14 PM, Uwe Schindler wrote:
> If you just want to autocomplete the current term the user enters,
> initialize a TermEnum with the current entered term fragment. If you then
> iterate through the termenum, you get all terms that exist in the index
> *after* that term (in
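The idea in the quoted reply (seek to the typed fragment, then walk the sorted terms that follow it) can be illustrated without Lucene. The sketch below is only an analogy: it uses a java.util.TreeSet as a stand-in for the index's sorted term dictionary, and the class name PrefixSuggest, the method suggest, and the sample terms are all hypothetical. Stopping at the first term that no longer starts with the fragment is an assumption, since the quoted message is cut off before it says when to stop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Stand-in for a sorted term dictionary; this is NOT Lucene's TermEnum.
class PrefixSuggest {

    // Returns up to `max` terms that start with `prefix`, in sorted order.
    static List<String> suggest(TreeSet<String> terms, String prefix, int max) {
        List<String> out = new ArrayList<>();
        // tailSet(prefix) starts at the first term >= prefix,
        // analogous to positioning a term enumeration at the fragment.
        for (String term : terms.tailSet(prefix)) {
            // Terms are sorted, so the first non-matching term ends the prefix range.
            if (!term.startsWith(prefix) || out.size() >= max) {
                break;
            }
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(
                Arrays.asList("home", "homer", "hotel", "house", "lucene"));
        System.out.println(suggest(terms, "ho", 3)); // prints [home, homer, hotel]
    }
}
```

Because the terms are sorted, every term sharing the fragment sits in one contiguous run, so the loop never has to scan the whole dictionary.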
For what it's worth, AOL uses a Solr cluster to handle searches for @aol users.
Each user has his own index.
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
- Original Message
> From: fulin tang
> To
No, createOutput is designed to always create a new file. This fits
with Lucene because Lucene is "write once" (never writes to the same
file more than once).
You should probably fall back to Java's IO APIs?
Mike
On Sat, Nov 21, 2009 at 10:36 AM, nithin kamath wrote:
>
> I use Lucene's IndexOut
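As an illustration of the "fall back to Java's IO APIs" suggestion in the reply above: the original question is cut off here, so the sketch below assumes the goal was to add bytes to a file that already exists. It uses only java.nio.file; the class name AppendExample and the methods append and demo are hypothetical, and nothing in it touches Lucene's Directory or IndexOutput.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Plain java.nio sketch; the use case (appending) is an assumption.
class AppendExample {

    // Appends text to the file, creating the file first if it does not exist.
    static void append(Path file, String text) {
        try {
            Files.write(file, text.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes twice to the same temp file and returns its full contents.
    static String demo() {
        try {
            Path file = Files.createTempFile("append-demo", ".txt");
            append(file, "first\n");
            append(file, "second\n"); // second call extends the file, it does not replace it
            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```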
OK, I'll add that to the javadocs; thanks.
But the fact that you weren't closing the old readers was probably
also tying up lots of disk space...
Mike
On Tue, Nov 24, 2009 at 3:31 PM, vsevel wrote:
>
> Hi, this is good information. As I read your post I realized that I am
> supposed to commit a
Hi, this is good information. As I read your post I realized that I am
supposed to commit after an optimize, which is something I do not currently
do. That would probably explain the extra disk space I saw being consumed.
If this is correct, then the optimize javadoc could be improved to say that
On Tue, Nov 24, 2009 at 9:08 AM, vsevel wrote:
> Hi, just to make sure I understand correctly... After an optimize, without
> any reader, my index takes 30Gb on the disk. Are you saying that if I can
> ensure there is only one reader at a time, it could take up to 120Gb on the
> disk if searching
I am also having a hard time understanding the NearSpansUnordered
isPayloadAvailable() method.
For my test case, where 2 tokens are at the same position, the code below
seems to fail to traverse the 2 SpansCells. The first SpansCell it
retrieves has its next field set to null so it cannot fi
Hello,
For certain span queries that I construct programmatically by piecing together my
own SpanTermQueries, I would like to enforce that payload data is not
returned for matches on the specific terms used by the constituent
SpanTermQueries.
For example, if I search for a position match with a SpanQue
yes that indeed works for me.
thanks,
C>T>
On Mon, Nov 23, 2009 at 5:50 PM, Paul Elschot wrote:
> On Monday, 23 November 2009 20:07:58, Christopher Tignor wrote:
> > Also, I noticed that with the above edit to NearSpansOrdered I am getting
> > erroneous results for normal ordered searches using
Hi, just to make sure I understand correctly... After an optimize, without
any reader, my index takes 30Gb on the disk. Are you saying that if I can
ensure there is only one reader at a time, it could take up to 120Gb on the
disk if searching while an optimize is going on?
I did not get your 3X
On Nov 24, 2009, at 7:09 AM, m.harig wrote:
>
> What should I do now? Could you clarify?
>
How are you constructing the SpellChecker?
>
> Grant Ingersoll-6 wrote:
>>
>>
>> On Nov 24, 2009, at 1:16 AM, m.harig wrote:
>>
>>>
>>> String[] suggestions = spellChecker.suggestSimilar
Thanks a lot!
Of course performance :)
I used 2.4.1 and now with 2.9.1 it works :)
Michael McCandless wrote:
Are you sure you even need to do this (manage separate RAMDir)? Ie
what's driving this (performance)?
IndexWriter already has its own RAM buffer for newly added docs / deletions.
You ca
What should I do now? Could you clarify?
Grant Ingersoll-6 wrote:
>
>
> On Nov 24, 2009, at 1:16 AM, m.harig wrote:
>
>>
>> String[] suggestions = spellChecker.suggestSimilar("hoem", 3, indexReader,
>> "contents", true);
>>
>> This is how I am retrieving my "did you mean" words.
>>
>
>
On Nov 24, 2009, at 1:16 AM, m.harig wrote:
>
> String[] suggestions = spellChecker.suggestSimilar("hoem", 3, indexReader,
> "contents", true);
>
> This is how I am retrieving my "did you mean" words.
>
And which distance measure are you using?
I'm guessing this has to do with the use of "onlyMore
I don't really like that name, for the same reason ("create" and "new"
imply that an entirely new reader is being created, which is far more
costly than what normally happens).
Mike
On Tue, Nov 24, 2009 at 5:02 AM, Uwe Schindler wrote:
> How about newReader()?
>
> -
> Uwe Schindler
> H.-H.-M
Erick,
Thanks for all your help so far.
I'll try to see whether upgrading to 2.9.1 would require too many changes on
our side and whether it's stable enough.
If upgrading won't work then I can revert to using TermsFilter and
BooleanFilter from contrib which should cover all my needs and maybe it will
even be faster t
How about newReader()?
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Tuesday, November 24, 2009 11:00 AM
> To: java-user@lucene.apache.org
On Tue, Nov 24, 2009 at 1:44 AM, vsevel wrote:
>
> 1) Correct: I am using IndexWriter.getReader(). I guess I was assuming that
> it was a privately owned object and I had no business dealing with its
> lifecycle. The API would be clearer if the operation were renamed createReader().
I just committed an ad
Are you sure you even need to do this (manage separate RAMDir)? Ie
what's driving this (performance)?
IndexWriter already has its own RAM buffer for newly added docs / deletions.
You can use the new near-realtime API (IndexWriter.getReader) in 2.9
to reopen your IndexReader, for searching the fu