Otis Gospodnetic wrote:
I suspect Martijn really wants that snippet dynamically generated, with
KWIC, as on the lucenebook.com screen shot. Thus, he can't generate
and store the snippet at index time, and has to construct it at search
time.
Otis
That is correct. I won't be having a lot of hits,
Erik Hatcher wrote:
On Dec 22, 2004, at 12:43 PM, M. Smit wrote:
Consider that you're only highlighting 20 or so entries at one time.
Getting the text from a Lucene index you're already navigating will be
quite quick. But it shouldn't be too bad to pull 20 records from a
database either.
The
On Dec 22, 2004, at 12:43 PM, M. Smit wrote:
Erik Hatcher wrote:
But for the other issue of 'store lucene' vs 'store db': can anyone
provide me with some field experience on size?
The system I'm developing will provide searching through some 2000
PDFs, say some 200 pages each. I feed the pl
For simpy.com I store the full text of web pages in Lucene, in order to
provide full-text web searches. Nutch (nutch.org) does the same. You
can set the maximal number of tokens you want indexed via IndexWriter.
You can also compress fields in the newest version of Lucene (or maybe
just the one
I suspect Martijn really wants that snippet dynamically generated, with
KWIC, as on the lucenebook.com screen shot. Thus, he can't generate
and store the snippet at index time, and has to construct it at search
time.
Otis
--- Mike Snare <[EMAIL PROTECTED]> wrote:
> > But for the other issue on
> But for the other issue of 'store lucene' vs 'store db': can anyone
> provide me with some field experience on size?
> The system I'm developing will provide searching through some 2000
> PDFs, say some 200 pages each. I feed the plain text into Lucene on a
> Field.UnStored basis. I also st
Erik Hatcher wrote:
Highlighter does not mandate you store your text in the index. It is
just a convenient way to do it. You're free to pull the text from
anywhere and highlight it based on the query.
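To make the "pull the text from anywhere and highlight it" idea concrete, here is a minimal sketch of a search-time KWIC snippet in plain Java. It is not the Sandbox Highlighter and does not use any Lucene API; the class name KwicSnippet, the method snippet, and the <b> markup are assumptions made up for this illustration. It only shows the general shape: take text from wherever it is stored (index, database, file), find the query term, and cut a window around it.

```java
public class KwicSnippet {

    // Case-insensitive search that avoids lower-casing the whole text,
    // so offsets always refer to the original string.
    static int indexOfIgnoreCase(String text, String term) {
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns up to `context` characters on each side of the first
     * occurrence of `term`, with the matched text wrapped in <b> tags.
     * Hypothetical helper for illustration only.
     */
    public static String snippet(String text, String term, int context) {
        int hit = term.isEmpty() ? -1 : indexOfIgnoreCase(text, term);
        if (hit < 0) {
            // No match: fall back to the beginning of the text.
            return text.substring(0, Math.min(text.length(), 2 * context));
        }
        int hitEnd = hit + term.length();
        int start = Math.max(0, hit - context);
        int end = Math.min(text.length(), hitEnd + context);

        StringBuilder sb = new StringBuilder();
        if (start > 0) sb.append("...");          // text was cut on the left
        sb.append(text, start, hit);
        sb.append("<b>").append(text, hit, hitEnd).append("</b>");
        sb.append(text, hitEnd, end);
        if (end < text.length()) sb.append("..."); // text was cut on the right
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = "The quick brown fox jumps over the lazy dog";
        System.out.println(snippet(body, "fox", 6));
    }
}
```

Since only the 20 or so hits on the current page need a snippet, a helper like this would be called once per displayed hit, with the text fetched from whichever store was chosen.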
Furthermore, you are saying that the highlighter takes care of the
corresponding field/words
On Dec 22, 2004, at 12:04 PM, M. Smit wrote:
Problem is though that I'm a little reluctant to store the data as
Field.Text instead of Field.UnStored, because of the sheer size of the
documents and the multitude I would like to index (say some 100 pages *
2k documents). But then again, it's size vers
Otis,
Problem is though that I'm a little reluctant to store the data as
Field.Text instead of Field.UnStored, because of the sheer size of the
documents and the multitude I would like to index (say some 100 pages *
2k documents). But then again, it's size versus
go-back-in-the-db-and-do-your-thing
Martijn, have you seen the Highlighter in the Lucene Sandbox?
If you've stored your text in the Lucene index, there is no need to go
back to DB to pull out the blog, parse it, and highlight it - the
Highlighter in the Sandbox will do this for you.
Otis
--- "M. Smit" <[EMAIL PROTECTED]> wrote:
>
Hello list,
I'm not sure if this subject will cover my question, but here goes:
consider the following snippet:
is = new IndexSearcher((String) envContext.lookup("search_index_dir"));
StopAnalyzer analyzer = new StopAnalyzer(ArticleIndexer.SEARCH_STOP_WORDS_NL);
parser = new
NewMultiFieldQueryPa