Re: Normalization Techniques

2005-09-27 Thread Paul Elschot
On Wednesday 28 September 2005 02:35, Ira Goldstein wrote: > Hi. I’m working on a project to compare various normalization techniques > and want to make sure that I understand the code before I begin making > changes. It appears that while the tf’s are being stored in DocumentWriter, > the actual

[jira] Updated: (LUCENE-374) You cannot sort on fields that don't exist

2005-09-27 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-374?page=all ] Yonik Seeley updated LUCENE-374: Attachment: sort.diff Attaching latest version. I looked into TestSort to add some tests for this, and was surprised to see some tests that should have covere

[jira] Created: (LUCENE-442) TestIndexModifier.testIndexWithThreads is not valid?

2005-09-27 Thread Hoss Man (JIRA)
TestIndexModifier.testIndexWithThreads is not valid? Key: LUCENE-442 URL: http://issues.apache.org/jira/browse/LUCENE-442 Project: Lucene - Java Type: Bug Components: Search Versions: 1.9 Reporter:

[jira] Updated: (LUCENE-374) You cannot sort on fields that don't exist

2005-09-27 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-374?page=all ] Yonik Seeley updated LUCENE-374: Attachment: sort.diff Updated my original patch from http://www.mail-archive.com/java-user@lucene.apache.org/msg00611.html which actually wasn't meant to be a c

Normalization Techniques

2005-09-27 Thread Ira Goldstein
Hi. I’m working on a project to compare various normalization techniques and want to make sure that I understand the code before I begin making changes. It appears that while the tf’s are being stored in DocumentWriter, the actual normalization (and boosting) is calculated in BooleanQuery.sumOfSq

KinoSearch merge model

2005-09-27 Thread Marvin Humphrey
Greets, As mentioned in my previous post, the most significant architectural difference between the Lucene/Plucene indexer and KinoSearch indexer is the merge model. KinoSearch's merge model is considerably more efficient in Perl; I suspect that it may also be incrementally more efficien

Perl progress

2005-09-27 Thread Marvin Humphrey
Greets, A week ago, a revamped indexer based on my Perl search engine library, KinoSearch, successfully built a Lucene-compatible index. The corpus was 1,000 documents from Wikipedia. Better, it did so in a reasonable amount of time: Time to index 1000 docs on my G4 laptop ===

[jira] Updated: (LUCENE-441) IntParser and FloatParser unused by FieldCacheImpl

2005-09-27 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-441?page=all ] Yonik Seeley updated LUCENE-441: Attachment: FieldCacheImpl_useParsers.txt attached patch to use the parse() methods of IntParser and FloatParser > IntParser and FloatParser unused by FieldCac

[jira] Created: (LUCENE-441) IntParser and FloatParser unused by FieldCacheImpl

2005-09-27 Thread Yonik Seeley (JIRA)
IntParser and FloatParser unused by FieldCacheImpl -- Key: LUCENE-441 URL: http://issues.apache.org/jira/browse/LUCENE-441 Project: Lucene - Java Type: Bug Components: Search Versions: CVS Nightly - Specify date

[jira] Commented: (LUCENE-395) CoordConstrainedBooleanQuery + QueryParser support

2005-09-27 Thread paul.elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-395?page=comments#action_12330622 ] paul.elschot commented on LUCENE-395: I glanced through the code quickly; I'll give it a try in a week or so. BooleanScorer2 has one catch for which I did not see a check

Re: Lucene and UTF-8

2005-09-27 Thread Marvin Humphrey
On Sep 27, 2005, at 7:01 AM, Ken Krugler wrote: Just to clarify, an incompatibility will occur if: a. The new code is used to write the index. b. The text being written contains an embedded null or an extended (not in the BMP) Unicode code point. c. Old code is then used to read the index.
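
Condition (b) above can be checked mechanically. The following is a minimal sketch in Python, not Lucene code: the function name `has_incompatible_chars` is hypothetical, and it only tests whether a string contains an embedded null (U+0000) or a code point outside the BMP (above U+FFFF), the two cases the message names.

```python
def has_incompatible_chars(text: str) -> bool:
    """Return True if text contains U+0000 or a code point outside the BMP.

    Hypothetical helper for illustration only; it mirrors condition (b)
    from the message and is not part of any Lucene API.
    """
    # Python 3 strings iterate by Unicode code point, so ord(ch) > 0xFFFF
    # identifies supplementary (non-BMP) characters directly.
    return any(ch == "\x00" or ord(ch) > 0xFFFF for ch in text)


if __name__ == "__main__":
    samples = ["plain ASCII", "caf\u00e9", "null\x00inside", "clef \U0001D11E"]
    for sample in samples:
        print(repr(sample), has_incompatible_chars(sample))
```

Text for which this returns False would not meet condition (b), so by the reasoning in the message it would not trigger the incompatibility.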

Re: Lucene and UTF-8

2005-09-27 Thread Ken Krugler
> Perl development is going very well, by the way. On the indexing side, I've got a new app going which solves both the index compatibility issue and the speed issue, about which I'll make a presentation in this forum after I flesh it out and clean it up. > Well, I'm lying a little. Th

[jira] Updated: (LUCENE-395) CoordConstrainedBooleanQuery + QueryParser support

2005-09-27 Thread Hoss Man (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-395?page=all ] Hoss Man updated LUCENE-395: Attachment: LUCENE-395.patch TestBooleanMinShouldMatch.java I don't really understand BooleanScorer2 very much, but I thought I understood it enough t

[jira] Updated: (LUCENE-440) FilteredQuery should have getFilter()

2005-09-27 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-440?page=all ] Yonik Seeley updated LUCENE-440: Attachment: FilteredQuery.txt > FilteredQuery should have getFilter() > - > > Key: LUCENE-440 > URL: http:

[jira] Created: (LUCENE-440) FilteredQuery should have getFilter()

2005-09-27 Thread Yonik Seeley (JIRA)
FilteredQuery should have getFilter() - Key: LUCENE-440 URL: http://issues.apache.org/jira/browse/LUCENE-440 Project: Lucene - Java Type: Bug Versions: 1.9 Reporter: Yonik Seeley Priority: Minor Unless you are in

[jira] Commented: (LUCENE-383) ConstantScoreRangeQuery - fixes "too many clauses" exception

2005-09-27 Thread paul.elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-383?page=comments#action_12330549 ] paul.elschot commented on LUCENE-383: Since the constant score is taken from the query boost, idf issues can be dealt with elsewhere. IOW I don't think there is a need to