Hi all,
I have used Lucene so far for solving toy examples and making
tutorial examples, but now I am facing my first real-world high-quality
application.
I need to manage around 50,000 docs, ranging from a few lines to a
couple of pages. I also need to handle lemmas and synonyms, and h
I stopped procrastinating on this today.
I signed up for a BOF slot at 8 on Thursday. Hopefully not against other
stuff of interest.
I've not done this before, but the BOF slots were filling.
From my perspective, it'd be great to have people from any of the
subprojects. Plenty of cross fertiliz
On 8/23/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: I was wondering if there have been any other self/semi-organized things
: around Lucene in the past, like a BOF?
This will be my first ApacheCon, so I can't speak to what's happened in
the past -- but I'm certainly up for putting some face
Hello, Ronald.
What I have found is that nothing except createWeight uses that
docFreqs(Term[]) method...
Maybe I need to parallelize it... But I don't understand something.
When is MultiSearcher.createWeight() called? Only this method
uses docFreqs, and this method creates a HashMap of d
I understand...because I've experienced it. I think the answer is to
'parallelize' the docFreq process... and/or try to make use of
docFreqs(Term[]). By passing an array of Terms, you can avoid the 'call
per Term' per remote and just make a single docFreq call per remote.
You might have to ex
I guarantee that if you have a large index (actually, not that
large), you'll find yourself dealing with TooManyClauses exceptions. Look at
the thread in this list titled "I just don't get wildcards at all" for a
discussion of wildcards and applicable strategies. "The guys" explained a
lot
I have had some badly behaved Lucene indexing software crash on me several
times and have been left with an index directory containing lots of
non-compound files, when all I ought to be getting is the compound .cfs
files plus deletable and segments.
Re-indexing everything doesn't bear think
I believe Lucene's QueryParser doesn't allow you to specify a leading wildcard.
However, the WildcardQuery class does allow leading wildcard queries, such as
"*technology". This is probably the easiest way to get around this.
You do have other options that can specify a wildcard search, such as
On Fri, 2006-09-15 at 15:31 +0200, Mark Müller wrote:
> I guess terms will only be taken into the corpus when the search found
> results at least once for that term (and removed if no more results were
> found).
>
> Persisting the corpus has to be done, but should be no problem.
I use ObjectIn&Out
Hello All,
I have a question: how do I use a wildcard at the start of
the query string?
For example, I want to search on title with the query value "*technology". When I
try to create a Lucene query by using QueryParser, it throws the
exception:
Lexical error at line 1, column 1. Enco
Depending on the size of your index, you might want to put it in the
downloaded page. I have a small index of maybe 1,500 words so I have
the word list in the page. This is simpler than AJAX, but will not
work for big indexes, of course.
On Sep 15, 2006, at 8:02 AM, Mark Müller wrote:
Hi a
Hi,
I'm new to Lucene and currently I'm searching across all the
fields. I only specify the term I want to search for, so I would like to
know how to get the name of the field that contains this term in each
document hit by the searcher.
Please suggest a solution for this.
Thanks in ad
Thanks for the pointer to your code. It's a smart approach, even if it is
not specific to Lucene.
I guess terms will only be taken into the corpus when the search found
results at least once for that term (and removed if no more results were
found).
Persisting the corpus has to be done, but should be no
First, you really must understand analyzers and what they do. If you haven't
seen the book Lucene in Action, I highly recommend it.
Second, get a copy of Luke (google luke and lucene). It is a graphical tool
that lets you examine an index and fire queries at it. It'll show you
exactly what was ind
Hello, Yura.
Does anyone understand my email? Maybe my English is too bad...
Thanks.
YS> Here is the situation. I have a ParallelMultiSearcher object
YS> initialized with two or more RemoteSearchables.
YS> I run a PrefixQuery search on some keyword field, say "link". When I run
YS> search starti
We've done something similar at http://www.123dictionar.ro. As you type,
the word is sent to the server using AJAX and if an exact match is not
found, a Lucene index is searched using a FuzzyQuery search. Counts are
precomputed, as data is not changing.
Regards,
Ioan
Mark Müller wrote:
Hi al
Thanks for the response,
I have another small problem.
I have some text in an XML tag like
\A1;Frank\PPaul
Can Lucene not index it using SimpleAnalyzer or TextContentExtractor?
Thanks...
Catalin Mititelu <[EMAIL PROTECTED]> wrote:
One more hint for 2) and 3): use SimpleAnalyzer on
On Fri, 2006-09-15 at 14:02 +0200, Mark Müller wrote:
> Hi all,
> I would like to know whether it is possible to have Lucene make suggestions
> while the user types the search query.
>
> Like in Google Suggest: http://www.google.com/webhp?complete=1&hl=en
>
> I just need to send with AJAX the part of the
Hi all,
I would like to know whether it is possible to have Lucene make suggestions
while the user types the search query.
Like in Google Suggest: http://www.google.com/webhp?complete=1&hl=en
I just need to send with AJAX the part of the word the user already typed
and get back the list of matching terms.
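The lookup being asked for (partial word in, matching terms out) can be sketched in plain Java, independent of Lucene. This is only an illustration under assumptions: the class PrefixSuggest and its methods are hypothetical names, and the term list is assumed to have been collected from the index beforehand and held in memory on the server that answers the AJAX request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: prefix lookup over a sorted term set.
class PrefixSuggest {
    private final TreeSet<String> terms = new TreeSet<>();

    // Terms are stored lowercased so that lookups are case-insensitive.
    public void add(String term) {
        terms.add(term.toLowerCase(Locale.ROOT));
    }

    // Returns up to `limit` terms that start with `prefix`, in sorted order.
    public List<String> suggest(String prefix, int limit) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<String> out = new ArrayList<>();
        // tailSet(p) starts at the first term >= p; all terms sharing the
        // prefix are contiguous from that point in a sorted set.
        for (String t : terms.tailSet(p)) {
            if (!t.startsWith(p) || out.size() >= limit) {
                break;
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        PrefixSuggest s = new PrefixSuggest();
        for (String t : new String[] {"lucene", "luke", "lemma", "lucid"}) {
            s.add(t);
        }
        System.out.println(s.suggest("lu", 10)); // prints [lucene, lucid, luke]
    }
}
```

Because the set is sorted, each request only walks the terms that actually share the prefix, so the cost depends on the number of suggestions returned rather than on the size of the term list.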
One more hint for 2) and 3): use SimpleAnalyzer on your XML (give up on
XmlContentExtractor). In this manner you can index all "words" from the XML file in
lowercase (tag name, attribute name, attribute value and content).
Of course, you should use the same analyzer for searching.
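To see why the same analyzer matters on both sides, here is a plain-Java approximation of what SimpleAnalyzer does to text: it splits at every non-letter character and lowercases each token. This is a sketch, not the Lucene class itself; SimpleTokens and tokenize are hypothetical names, and the sample XML is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Approximation of SimpleAnalyzer-style tokenization (not the Lucene class):
// split at every non-letter character and lowercase each token.
class SimpleTokens {
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Made-up sample document: tag names, attributes and content all become tokens.
        List<String> indexed = tokenize("<Book Title=\"Lucene\">In Action</Book>");
        System.out.println(indexed); // prints [book, title, lucene, in, action, book]

        // The raw query "Title" misses; the analyzed query "title" matches.
        System.out.println(indexed.contains("Title"));              // prints false
        System.out.println(indexed.containsAll(tokenize("Title"))); // prints true
    }
}
```

The indexed tokens are all lowercase, so a query term that has not gone through the same analysis can fail to match even though the word is in the document.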
Simon Willnauer
On 9/15/06, aslam bari <[EMAIL PROTECTED]> wrote:
Dear Mititelu,
Thanks for the reply. Can you help me with some small issues related to it?
1) I am new to Lucene. Can you tell me where this DEFAULT_MAX_FIELD_LENGTH
variable is available, how to set it, and, for my purpose (a 6-10MB file), how
m
Dear Mititelu,
Thanks for the reply. Can you help me with some small issues related to it?
1) I am new to Lucene. Can you tell me where this DEFAULT_MAX_FIELD_LENGTH
variable is available, how to set it, and, for my purpose (a 6-10MB file), how
much I should set?
2) How can I index all the words
Yes. By default, only the first 10,000 tokens of a field are indexed.
Look here
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexWriter.html#DEFAULT_MAX_FIELD_LENGTH
aslam bari <[EMAIL PROTECTED]> wrote:
Dear all,
I am trying to index an XML file that is 6MB in size. Does Lucene support t
Dear all,
I am trying to index an XML file that is 6MB in size. Does Lucene support
documents this big? What is the maximum file size Lucene can index?
I ask because when I search the indexed file, I am not able
to get all the results. It gives me some results but not o