If you need the number of documents in which a given term appears, you can
use the docFreq method:
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexReader.html#docFreq(org.apache.lucene.index.Term)
Otherwise you can use
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/ind
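To illustrate what "number of docs in which a given term appears" means, here
is a minimal plain-Java sketch of a toy inverted index. It does not use the
Lucene API: the class DocFreqSketch and its methods are hypothetical names
chosen for this example, and the whitespace tokenization is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (NOT the Lucene API): illustrates document frequency only.
final class DocFreqSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Adds a document made of whitespace-separated terms and returns its id.
    int addDocument(String text) {
        int docId = nextDocId++;
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Number of documents containing the term; repeats inside one document count once.
    int docFreq(String term) {
        Set<Integer> docs = postings.get(term.toLowerCase());
        return docs == null ? 0 : docs.size();
    }

    public static void main(String[] args) {
        DocFreqSketch index = new DocFreqSketch();
        index.addDocument("burgundy wine");
        index.addDocument("burgundy wines from burgundy");
        index.addDocument("bordeaux wine");
        System.out.println(index.docFreq("burgundy")); // prints 2
    }
}
```

Note that the second document contains "burgundy" twice but is counted once:
a document frequency counts documents, not occurrences.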
I submitted a patch a few days ago (see
http://issues.apache.org/bugzilla/show_bug.cgi?id=30232).
Has anyone tested this feature? Is there a reason not to commit it?
Julien
DocumentWriter is typically created with the
ramDirectory field of IndexWriter and not the actual directory field.
So getDirectory() should return this ramDirectory in order to work,
which is not very intuitive (one could expect it to return the real
directory). One could change the visibility of r
Of course I'd be pleased to make a draft of the javadoc and a patch file.
I'll try to do it, but I can't promise to deliver it soon
Julien
- Original Message -
From: "Doug Cutting" <[EMAIL PROTECTED]>
To: "Lucene Developers List" <[EMAIL PROTECTED]>
Sent: Monday, July 12, 2004 7:34 PM
The xls files did not go through. You can download them from the following URLs:
http://jnioche.freesurf.fr/shortQueries.xls
http://jnioche.freesurf.fr/longQueries.xls
- Original Message -
From: "Julien Nioche" <[EMAIL PROTECTED]>
To: "Lucene Developers List"
;~3^3.0)) ((descr:"burgundy wines"~3^4.0
descr:"burgundy wine"~3^4.0)) ((kw:"burgundy wines"~3^4.0 kw:"burgundy
wine"~3^4.0))
- Original Message -
From: "Julien Nioche" <[EMAIL PROTECTED]>
To: "Lucene Developers List" <[EMAI
s the best way to set up this value in IndexWriter? Maybe we could
limit it to a few possible values such as:
DEFAULT = 128
AVERAGE = 64
HIGH = 32
in order to avoid settings that are too low.
Any comments or suggestions? Can anyone give feedback on this?
Julien
- Original Message -
From: "Julien
[I am moving this discussion to the dev list]
> Then use this in place of BooleanQuery when you don't want coordination
> scoring. I think that should do the trick.
In my case it works perfectly. As we generate multilingual and semantic
expansions of the original words of a query, the coordination fa
Just a question:
Classes extending MultiTermQuery (i.e. WildcardQuery and FuzzyQuery) are
changed into BooleanQueries by the rewrite() method before a search. The
default coord() method of Similarity implies that the score of this BQ is
multiplied by the (ratio number of Terms found / number o
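A small worked example of the ratio described above (terms found / terms in
the rewritten query), in plain Java. This is only a sketch: CoordSketch, coord
and applyCoord are illustrative names, not Lucene's Similarity class, and the
numbers are made up.

```java
// Illustration of a coordination factor: matching terms / total terms in the query.
final class CoordSketch {
    // Fraction of the query's terms that were found in the document.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Scales a raw score by the coordination factor.
    static float applyCoord(float rawScore, int overlap, int maxOverlap) {
        return rawScore * coord(overlap, maxOverlap);
    }

    public static void main(String[] args) {
        // A wildcard rewritten into 10 terms; the document contains only 2 of them.
        System.out.println(applyCoord(1.0f, 2, 10));
        // The same document scored against a single-term query keeps its full score.
        System.out.println(applyCoord(1.0f, 1, 1));
    }
}
```

With such a factor, the more terms the rewrite produces, the more a document
matching only one or two of them is penalized.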
There's a new release of limo available!
This new version:
- includes lucene-1.3-final.jar
- fixes a bug with index loading
- detects when index changes and auto refreshes the information (as proposed
by Jakob Flierl)
- uses css for easier customisation (as proposed by E Hatcher)
- escapes HTML
>
To: "Lucene Developers List" <[EMAIL PROTECTED]>
Sent: Thursday, December 04, 2003 7:28 PM
Subject: Re: suggestion for a CustomDirectory
> Julien Nioche wrote:
> > However in most cases the
> > application would be faster because :
> > - tree access to the Te
Here is a use case:
- my Lucene application is running under W2K
- I have (just) one gigabyte of RAM
- my index is quite big, let's say 1.7 GB (with a .tis of 31 MB and a .tii of
479 KB)
Using RAMDirectory is impossible, FSDirectory works but is quite slow.
Could it be possible to create a custom Direc
IL PROTECTED]>
Sent: Thursday, December 04, 2003 3:56 PM
Subject: Re: suggestion for a CustomDirectory
> On Thursday, December 4, 2003, at 09:45 AM, Julien Nioche wrote:
> > Here is a use case :
> > - my Lucene application is running under W2K
> > - I have (just) a gigab
ith less RAM available FSDirectory should be faster?
BTW modifications are to be made in the org.apache.lucene.store.InputStream
class, not in Directory.
Has anybody else tried it? Do you find similar results? What does it bring
on a bigger index?
Cheers
---
J
ry much.
>
> --
>
> - Original Message -
>
> DATE: Mon, 7 Jul 2003 09:32:22
> From: "Julien Nioche" <[EMAIL PROTECTED]>
> To: "Lucene Developers List"
<[EMAIL PROTECTED]>,<[EMAIL PROTECTED]>
> Cc:
>
> >No, TermD
No, TermDocs operates only on Terms, not on PhraseQueries. A PhraseQuery is
a query and is not stored in an index.
- Original Message -
From: "none none" <[EMAIL PROTECTED]>
To: "Lucene Developers List" <[EMAIL PROTECTED]>
Sent: Monday, July 07, 2003 7:00 AM
Subject: Re: [FAQ] Finding
I have observed a change in the ordering of the results since I moved from 1.2
to the 1.3 RC1 version of the API (with the extensible Scoring API).
Maybe it's related?
- Original Message -
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Lucene Developers List" <[EMAIL PROTECTED]>
Sent: Monday
d for something else
> originally), but that will run a few things of this nature and print
> out the timings. Not a good way to benchmark, but I think it gives an
> idea.
>
> Otis
>
>
> --- Julien Nioche <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> >
during the execution of these methods, this kind of change must be
pretty harmless...
Any opinion on that? Could it have a side effect?
Julien Nioche
www.lingway.com
What about the Support for Search Term Highlighting? (see Maik Schreiber's
paper)
It seems to have vanished from the list of features?
- Original Message -
From: "Peter Carlson" <[EMAIL PROTECTED]>
To: "Lucene Developers List" <[EMAIL PROTECTED]>
Sent: Thursday, May 23, 2002 8:29 AM
Subj
Another feature could be the ability to retrieve the number of occurrences
not only for a term but also for a phrase (see
http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00101.html)
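To make the difference between a term count and a phrase count concrete, here
is a minimal plain-Java sketch. It works on an already tokenized document and
does not use the Lucene API; PhraseCountSketch and countPhrase are hypothetical
names, and exact, adjacent matching (no slop) is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Counts how many times a sequence of terms occurs contiguously in a token list.
final class PhraseCountSketch {
    // Overlapping matches are counted separately.
    static int countPhrase(List<String> tokens, List<String> phrase) {
        if (phrase.isEmpty() || phrase.size() > tokens.size()) {
            return 0;
        }
        int count = 0;
        for (int start = 0; start + phrase.size() <= tokens.size(); start++) {
            if (tokens.subList(start, start + phrase.size()).equals(phrase)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("burgundy", "wine", "and", "more", "burgundy", "wine");
        // A one-term "phrase" behaves like a plain term count.
        System.out.println(countPhrase(doc, Arrays.asList("burgundy")));
        System.out.println(countPhrase(doc, Arrays.asList("burgundy", "wine")));
        System.out.println(countPhrase(doc, Arrays.asList("wine", "burgundy")));
    }
}
```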
- Original Message -
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, May 20, 2
Hi folks,
Just a little advertising message for those who are interested in semantic
expansions:
http://kant.lingway.com/DemoUN is a demo of a multilingual IR system based
on Lucene.
Please take a look at it - feedback is welcome!
Julien
- Original Message -
From: "Peter Carlson" <
Hi,
We had a discussion on this topic a few months ago:
http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00091.html
See in particular Doug's answer:
http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00101.html
Retrieving this information requires a few changes in the code (
By the way, I was wondering whether any Analyzer uses the following
constructor:
public Token(String text, int start, int end, String typ)
Maybe it could be interesting to build an analyzer that recognizes
punctuation marks and
keeps them in the index as Tokens with a given type (say fo
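A rough plain-Java sketch of that idea: a tokenizer that emits punctuation
marks as tokens carrying their own type. It does not use Lucene's Token or
Analyzer classes; TypedTokenSketch and Tok are hypothetical stand-ins, and the
type names "word" and "punct" are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Splits text into word tokens and single-character punctuation tokens.
final class TypedTokenSketch {
    // Minimal stand-in for a token: text, start/end offsets and a type label.
    static final class Tok {
        final String text;
        final int start;
        final int end;
        final String type;

        Tok(String text, int start, int end, String type) {
            this.text = text;
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // Consume a run of letters or digits as one word token.
                int start = i;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                out.add(new Tok(input.substring(start, i), start, i, "word"));
            } else if (!Character.isWhitespace(c)) {
                // Any other non-blank character becomes its own punctuation token.
                out.add(new Tok(String.valueOf(c), i, i + 1, "punct"));
                i++;
            } else {
                i++; // skip whitespace
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (Tok t : tokenize("Hello, world!")) {
            System.out.println(t.type + " [" + t.start + "," + t.end + ") " + t.text);
        }
    }
}
```

A search layer could then filter on the type label, for instance to ignore
punctuation tokens when matching terms while still keeping their positions.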
Any questions or comments are welcome. You can send them to
[EMAIL PROTECTED]
Please take a look at our site (www.lingway.com) for more information about
our activities.
Thank you
Julien Nioche / www.lingway.com
Hi all,
I also implemented a highlighting feature based on Maik Schreiber's code
and
modified the Lucene source. I agree with Peter that it would be great to have
these changes made
in the Lucene core code.
The only difference in my implementation is that query.getTerms() returns
both TermQuery
e modifying the score() methods of the different Scorers? Or
> >is this information already computed somewhere else?
> >
> >Thanks a lot for your help
> >
> >Julien Nioche
> >