I have title field and description field indexed. Now I want to
search for "object oriented programming" either in title "and or"
description using Lucene search query.
How do I do this?
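One way to express that with the plain query API is to combine a PhraseQuery per field with BooleanClause.Occur.SHOULD, so a match in either field (or both) qualifies; MultiFieldQueryParser is another option. A minimal sketch, assuming the fields are literally named "title" and "description":

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;

public class TitleOrDescriptionQuery {
    public static BooleanQuery build() {
        // SHOULD on both clauses means "match in either field, or both".
        BooleanQuery query = new BooleanQuery();
        query.add(phrase("title"), BooleanClause.Occur.SHOULD);
        query.add(phrase("description"), BooleanClause.Occur.SHOULD);
        return query;
    }

    private static PhraseQuery phrase(String field) {
        // Terms must match what your analyzer produced at index time
        // (lowercased here).
        PhraseQuery p = new PhraseQuery();
        p.add(new Term(field, "object"));
        p.add(new Term(field, "oriented"));
        p.add(new Term(field, "programming"));
        return p;
    }
}

Hand the result to your Searcher as usual; documents matching the phrase in only one of the two fields will still be returned.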
jim shirreffs wrote:
Thanks for the advice, I just don't see where in the Lucene code I
should plug OOParcer into Lucene.
I've walked the code in LIUS and Nutch (moving on to Solr) trying to
find common objects. If I can find common objects in Lucene and Nutch
I'll know where to plug in.
Yo
Another (obvious) option is to use two indexes and direct the query to the
appropriate index depending on the search specification. Of course you
double your space requirements, but you're basically going to do that anyway
if you use two fields. I chose this for the slight benefit of fewer fields
on
I know you have a solution already that I agree with, but I do think
the DisjunctionMaxQuery could serve as the start for writing your own
Query that did what you want. Why would you want to? Well, maybe
you have other ways you want to search as well and don't want to mess
with a custom Similarity.
Thanks for the advice, I just don't see where in the Lucene code I should
plug OOParcer into Lucene.
I've walked the code in LIUS and Nutch (moving on to Solr) trying to find
common objects. If I can find common objects in Lucene and Nutch I'll know
where to plug in.
Lucene Objects looks li
On 5/25/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: Yes, indeed we could but it brings other problems, for example increasing
: the index size, and extending the query to search for multiple fields, etc.
1) if you index both the raw and stemmed forms your index is going to grow to roughly
: Yes, indeed we could but it brings other problems, for example increasing
: the index size, and extending the query to search for multiple fields, etc.
1) if you index both the raw and stemmed forms your index is going to grow
to roughly the same size regardless of whether the stem and the raw are
jim shirreffs wrote:
Thanks to all who tried to help me out.
Jim S
P.S. If I get it working I will be happy to post the code.
If you have looked at the code in Nutch, you can take most of the parse-oo
plugin verbatim, because all this plugin does is extract the text
content and metadata
I've been working on this for a while; I am trying to get the demo code that
comes with Lucene to index OpenOffice documents. I've looked at the LIUS code
and at the Nutch code, but I can't find an easy way, so I am digging into the
code.
I wrote a KcmiDocument class that returns a Document. In it I
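For what it's worth, a rough sketch of the kind of factory being described, modeled on the demo's FileDocument; the field names and the idea of handing in already-extracted text are my assumptions, since the post doesn't show the class:

import java.io.File;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class KcmiDocument {
    // Builds a Lucene Document for an OpenOffice file. extractedText is
    // whatever your parser of choice (e.g. the logic behind Nutch's
    // parse-oo plugin) pulled out of the file.
    public static Document document(File f, String title, String extractedText) {
        Document doc = new Document();
        doc.add(new Field("path", f.getPath(), Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("title", title, Field.Store.YES, Field.Index.TOKENIZED));
        doc.add(new Field("contents", extractedText, Field.Store.NO, Field.Index.TOKENIZED));
        return doc;
    }
}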
I can only speak to the "avoid matching stemmed
or canonical forms" part...
Yes, but you've got to do some fancy dancing when you index,
something like adding a special signifier to, say, the original token.
I'll ignore the canonical part of your question for the sake of
brevity.
Consider inde
Hello Ard,
What you are after is a higher mergeFactor and probably also a higher
maxBufferedDocs. Is indexing performance the concern?
Don't go crazy with setting a super high (e.g. 100+) mergeFactor, unless you
really have the number of open files on your server(s) set to a solid/high
number.
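As a rough illustration of those two knobs on a 2.x IndexWriter (the numbers are arbitrary, not recommendations):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class TunedWriter {
    public static IndexWriter open(String path) throws Exception {
        IndexWriter writer = new IndexWriter(path, new StandardAnalyzer(), true);
        // Allow more segments to accumulate before a merge is triggered.
        writer.setMergeFactor(30);
        // Buffer more documents in RAM before flushing, so freshly written
        // segments hold more than the default 10 documents.
        writer.setMaxBufferedDocs(1000);
        return writer;
    }
}

Keep the caveat above in mind: a very high mergeFactor means many segment files open at once.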
On 5/25/07, Walt Stoneburner <[EMAIL PROTECTED]> wrote:
In reading the math for scoring at the bottom of:
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html
It appears that if I can make tf() and idf(), term frequency and
inverse document frequency
Yes, indeed we could but it brings other problems, for example increasing
the index size, and extending the query to search for multiple fields, etc.
On 5/25/07, Steven Rowe <[EMAIL PROTECTED]> wrote:
Hi Enis,
Enis Soztutar wrote:
> In nutch we have a use case in which we need to store tokens
Hi Enis,
Enis Soztutar wrote:
> In nutch we have a use case in which we need to store tokens with their
> original text plus their stemmed form plus their canonical form (through
> some asciifization). From my understanding of lucene, it makes sense to
> write a tokenstream which generates several tokens for each "word"
Hi,
In nutch we have a use case in which we need to store tokens with their
original text plus their stemmed form plus their canonical form (through
some asciifization). From my understanding of lucene, it makes sense to
write a tokenstream which generates several tokens for each "word", but
p
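To make the two suggestions in this thread concrete, here is a rough sketch against the Lucene 2.x TokenStream API (next() returning Token): for each incoming word it emits the original form with a marker prefix, plus a stemmed and a canonical variant at the same position. The MultiFormFilter name, the "$" marker, and the stem()/asciify() stand-ins are all mine, not from the thread; substitute a real stemmer and ASCII folding.

import java.io.IOException;
import java.util.LinkedList;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

public class MultiFormFilter extends TokenFilter {
    private final LinkedList pending = new LinkedList();

    public MultiFormFilter(TokenStream input) {
        super(input);
    }

    public Token next() throws IOException {
        if (!pending.isEmpty()) {
            return (Token) pending.removeFirst();
        }
        Token orig = input.next();
        if (orig == null) {
            return null;
        }
        String text = orig.termText();
        // Queue the stemmed and canonical forms at the same position
        // (positionIncrement 0) so phrase and proximity queries still work.
        pending.add(variant(stem(text), orig));
        pending.add(variant(asciify(text), orig));
        // Emit the original form first, marked with "$" so that exact-form
        // queries can be directed at it and never hit stemmed output.
        Token marked = new Token("$" + text, orig.startOffset(), orig.endOffset());
        marked.setPositionIncrement(orig.getPositionIncrement());
        return marked;
    }

    private static Token variant(String text, Token orig) {
        Token t = new Token(text, orig.startOffset(), orig.endOffset());
        t.setPositionIncrement(0);
        return t;
    }

    // Toy placeholders -- plug in your real stemmer and asciifier here.
    private static String stem(String s)    { return s.endsWith("s") ? s.substring(0, s.length() - 1) : s; }
    private static String asciify(String s) { return s.replace('é', 'e'); }
}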
In reading the math for scoring at the bottom of:
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html
It appears that if I can make tf() and idf(), term frequency and
inverse document frequency respectively, both return 1, then coord(),
w
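The usual way to experiment with that is a small DefaultSimilarity subclass along these lines (a sketch, not a recommendation); set it with Searcher.setSimilarity(), and on the IndexWriter as well if you also change length norms, since those are baked in at index time:

import org.apache.lucene.search.DefaultSimilarity;

// Every matching term contributes the same tf and idf, so those two
// factors drop out of the score.
public class FlatSimilarity extends DefaultSimilarity {
    public float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }

    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }
}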
Grant writes:
Have a look at the DisjunctionMaxQuery, I think it might help,
although I am not sure it will fully cover your case.
The definition for DisjunctionMaxQuery is provided at this URL:
http://incubator.apache.org/lucene.net/docs/2.1/Lucene.Net.Search.DisjunctionMaxQuery.html,
Grossly
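Roughly, DisjunctionMaxQuery scores a document by the best of its per-clause scores plus a tie-breaker fraction of the others, instead of summing them, which is why it keeps one field from drowning out the rest. A small sketch over the title/description fields discussed earlier (field names assumed):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.DisjunctionMaxQuery;
import org.apache.lucene.search.TermQuery;

public class DisMaxExample {
    public static DisjunctionMaxQuery build(String word) {
        // Tie-breaker 0.1: the best-matching field dominates the score,
        // a match in the other field adds only a small amount.
        DisjunctionMaxQuery q = new DisjunctionMaxQuery(0.1f);
        q.add(new TermQuery(new Term("title", word)));
        q.add(new TermQuery(new Term("description", word)));
        return q;
    }
}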
>
> Hello,
>
> I am trying to change the maximum number of documents in a
> lucene segment. By default it seems to be 10.
Correction: 10 for the smallest (just created) segments of course, because
obviously merged segments are likely to contain many more documents
> When I have a
> mergeFactor
Hello,
I am trying to change the maximum number of documents in a lucene segment. By
default it seems to be 10. When I have a mergeFactor of, say, 10, then on
average, after every 100 added documents, Lucene merges segments.
I want each segment to contain more than the default 10 documents, be
On 5/25/07, karl wettin <[EMAIL PROTECTED]> wrote:
PerFieldAnalyzerWrapper
that was fast! thanks!
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/
org/apache/lucene/analysis/PerFieldAnalyzerWrapper.html
--
karl
Hi Grant,
Is there any code example for this case?
Thanks,
Anny
On 5/15/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
Yes, have a look at the SpanQuery functionality.
-Grant
On May 15, 2007, at 3:05 AM, Anny Bridge wrote:
> Hi all,
>
> When doing a search with Lucene, can I get the number of ti
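Since the question is cut off in the archive, here is a guess at the kind of example being asked for: counting how many times a single term matches in each document by walking the spans of a SpanTermQuery (Lucene 2.x API; the class and method names below are mine):

import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.search.spans.Spans;

public class SpanCountExample {
    // Returns a map from document id (Integer) to the number of positions
    // at which the term matched in that document.
    public static Map countMatches(IndexReader reader, String field, String text)
            throws Exception {
        SpanTermQuery query = new SpanTermQuery(new Term(field, text));
        Spans spans = query.getSpans(reader);
        Map counts = new HashMap();
        while (spans.next()) {
            Integer doc = new Integer(spans.doc());
            Integer old = (Integer) counts.get(doc);
            counts.put(doc, new Integer(old == null ? 1 : old.intValue() + 1));
        }
        return counts;
    }
}

For more complex patterns (phrases, ordered or unordered proximity) the other SpanQuery types compose the same way and expose the same getSpans() walk.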
On 25 May 2007, at 09:32, Paulo Silveira wrote:
I have a Document with two fields: one I would like to index with
SimpleAnalyzer and the other with StandardAnalyzer. Is there a
simple way to do it?
PerFieldAnalyzerWrapper
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/analysis/PerFieldAnalyzerWrapper.html
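A short sketch of the wrapper in use; "simpleField" stands in for whichever of your two fields should get SimpleAnalyzer:

import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class PerFieldExample {
    public static IndexWriter open(String path) throws Exception {
        // StandardAnalyzer is the default; SimpleAnalyzer only for one field.
        PerFieldAnalyzerWrapper analyzer =
                new PerFieldAnalyzerWrapper(new StandardAnalyzer());
        analyzer.addAnalyzer("simpleField", new SimpleAnalyzer());
        // Hand the same wrapper to QueryParser so queries are analyzed
        // consistently with the index.
        return new IndexWriter(path, analyzer, true);
    }
}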
Hello!
I have a Document with two fields: one I would like to index with
SimpleAnalyzer and the other with StandardAnalyzer. Is there a
simple way to do it?
thanks
--
Paulo E. A. Silveira
Caelum Ensino e Soluções em Java
http://www.caelum.com.br/