[jira] Created: (LUCENE-770) CfsExtractor tool

2007-01-11 Thread Otis Gospodnetic (JIRA)
CfsExtractor tool - Key: LUCENE-770 URL: https://issues.apache.org/jira/browse/LUCENE-770 Project: Lucene - Java Issue Type: New Feature Components: Index Affects Versions: 2.1 Reporter: Otis

[jira] Updated: (LUCENE-770) CfsExtractor tool

2007-01-11 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Otis Gospodnetic updated LUCENE-770: Attachment: LUCENE-770.patch CfsExtractor tool - Key:

[jira] Updated: (LUCENE-741) Field norm modifier (CLI tool)

2007-01-11 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Otis Gospodnetic updated LUCENE-741: Attachment: LUCENE-741.patch Field norm modifier (CLI tool)

[jira] Commented: (LUCENE-140) docs out of order

2007-01-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463872 ] Michael McCandless commented on LUCENE-140: --- Phew! I'm glad we finally got to the bottom of this one.

[jira] Commented: (LUCENE-140) docs out of order

2007-01-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463875 ] Michael McCandless commented on LUCENE-140: --- Actually, this reminds me that, as of lockless commits,

[jira] Resolved: (LUCENE-140) docs out of order

2007-01-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-140. --- Resolution: Fixed Fix Version/s: 2.1 Resolving this now, finally (I'll move

Re: Beyond Lucene 2.0 Index Design

2007-01-11 Thread Grant Ingersoll
Hi Jeff, Wondering if you (and/or others) would be interested in taking a look at https://issues.apache.org/jira/browse/LUCENE-662 and vetting the new interfaces, etc. to see if you could come up w/ a prototype implementation. This would help move along 662 as it would sort out some of

IndexWriter forceOptimize() ?

2007-01-11 Thread Otis Gospodnetic
Hi, What do people here think about adding forceOptimize() to IndexWriter? public synchronized void forceOptimize() throws IOException { flushRamSegments(); int minSegment = segmentInfos.size() - mergeFactor; mergeSegments(minSegment < 0 ? 0 : minSegment); } I need it
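
The start-index expression in the proposed method reads as a clamp to zero. Below is a minimal standalone sketch of just that arithmetic, assuming the comparison is `minSegment < 0`. The class name `MergeStartSketch` and the helper `mergeStart` are hypothetical and are not part of Lucene; the real method would pass this value to `mergeSegments`.

```java
public class MergeStartSketch {

    // Hypothetical helper (not a Lucene API): index of the first segment to
    // merge, given the current segment count and the merge factor.
    // The result is clamped so it never goes below zero.
    public static int mergeStart(int segmentCount, int mergeFactor) {
        int minSegment = segmentCount - mergeFactor;
        return minSegment < 0 ? 0 : minSegment;
    }

    public static void main(String[] args) {
        // Fewer segments than the merge factor: start from the first segment.
        System.out.println(mergeStart(3, 10));   // prints 0
        // More segments than the merge factor: start at the last mergeFactor segments.
        System.out.println(mergeStart(25, 10));  // prints 15
    }
}
```

With a merge factor of 10, an index holding 3 segments would be merged from segment 0, while one holding 25 segments would be merged from segment 15 onward.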

Re: Lockless commits -- great stuff!

2007-01-11 Thread Michael McCandless
Marvin Humphrey wrote: I've finished integrating the lockless commits concept into KinoSearch, and I wanted to pop in and say that it's a very nice piece of work. Real outside-the-box thinking -- or at least outside my box. :) Nothing better than an innovation which solves long-standing

Re: Lockless commits -- great stuff!

2007-01-11 Thread Marvin Humphrey
On Jan 11, 2007, at 6:48 AM, Michael McCandless wrote: I too am happy that we have no more commit lock :) Not just that. :) No more lock directory, since we can put write.lock in the index directory itself. No more lock file name munging, since lock files from different indexes no

[jira] Updated: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-11 Thread Artem Vasiliev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Vasiliev updated LUCENE-769: -- Attachment: StoredFieldSorting.patch [PATCH] Performance improvement for some cases of sorted

[jira] Commented: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-11 Thread Artem Vasiliev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463996 ] Artem Vasiliev commented on LUCENE-769: --- Renamed classes as Hoss proposed. Tried to hide

Re: IndexWriter forceOptimize() ?

2007-01-11 Thread robert engels
I agree with the boolean addition. optimize(false) is a request to maybe optimize; optimize(true) should always optimize to a single segment. optimize(false) might check some parameter as to the maximum number of segments allowed before an actual optimize is performed. On Jan 11, 2007,
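
The optimize(boolean) semantics described in this message can be sketched with a toy model. This is an illustration only, under stated assumptions: `OptimizeSketch`, its `maxSegments` threshold, and the per-segment document counts are all hypothetical, and nothing here is Lucene's IndexWriter API.

```java
import java.util.ArrayList;
import java.util.List;

public class OptimizeSketch {

    // Toy stand-in for an index: one entry per segment, holding its doc count.
    private final List<Integer> segments = new ArrayList<>();

    // Hypothetical threshold: optimize(false) does nothing while the
    // segment count is at or below this value.
    private final int maxSegments;

    public OptimizeSketch(int maxSegments) {
        this.maxSegments = maxSegments;
    }

    public void addSegment(int docCount) {
        segments.add(docCount);
    }

    public int segmentCount() {
        return segments.size();
    }

    // force == true: always rewrite down to a single segment.
    // force == false: merge only if the segment count exceeds maxSegments.
    // Returns true if a merge was actually performed.
    public boolean optimize(boolean force) {
        if (segments.isEmpty()) {
            return false; // nothing to merge
        }
        if (!force && segments.size() <= maxSegments) {
            return false; // the request was declined
        }
        int total = 0;
        for (int docCount : segments) {
            total += docCount;
        }
        segments.clear();
        segments.add(total);
        return true;
    }

    public static void main(String[] args) {
        OptimizeSketch index = new OptimizeSketch(3);
        index.addSegment(10);
        index.addSegment(20);
        System.out.println(index.optimize(false)); // prints false
        System.out.println(index.optimize(true));  // prints true
        System.out.println(index.segmentCount());  // prints 1
    }
}
```

In this sketch a forced call rewrites the index even when it already holds a single segment, which is one possible reading of "always should optimize".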

Re: Beyond Lucene 2.0 Index Design

2007-01-11 Thread jian chen
I also have the same question. It seems it is very hard to do phrase-based queries efficiently. I think most search engines do phrase-based queries, or at least appear to. So, as in Google, the query result must contain all the words the user searched on. It seems to me that the impact-sorted

[jira] Commented: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-11 Thread Chuck Williams (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464055 ] Chuck Williams commented on LUCENE-769: --- Robert, Could you attach your current implementation of reopen() as

[jira] Updated: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-11 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] robert engels updated LUCENE-769: - Attachment: IndexReaderUtils.java [PATCH] Performance improvement for some cases of sorted

[jira] Commented: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-11 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464056 ] robert engels commented on LUCENE-769: -- The IndexReaderUtils I posted is not compilable - there are a few more

Re: Beyond Lucene 2.0 Index Design

2007-01-11 Thread Marvin Humphrey
On Jan 11, 2007, at 2:30 PM, jian chen wrote: It seems to me that the impact-sorted list makes sense if you are trying to do pure vector space based ranking. This is from what I have read from the research papers. They all talk about how to optimize the vector space model using this

Re: [jira] Commented: (LUCENE-140) docs out of order

2007-01-11 Thread Chris Hostetter
: I think we should deprecate the create argument to : FSDirectory.getDirectory(*) and leave only the create argument in : IndexWriter's constructors. Am I missing something? Is there a : reason not to do this? I actually wonder about the problem from the opposite direction: to me it makes

Re: [jira] Commented: (LUCENE-769) [PATCH] Performance improvement for some cases of sorted search

2007-01-11 Thread Chris Hostetter
: Chuck Williams commented on LUCENE-769: : --- : : Robert, : : Could you attach your current implementation of reopen() as well? The : attachment did not come through in your java-dev message today, or the : one from 12/11. I'd like to look at an incremental

Re: IndexWriter forceOptimize() ?

2007-01-11 Thread Otis Gospodnetic
Yeah, I actually had: public int segments() { return segmentInfos.size(); } in my IndexReader, but then erased it precisely because I thought this was exposing too much about the impl. I think optimize(int) that Chris mentioned exposes too much. I thought about having optimize(boolean force)

Re: IndexWriter forceOptimize() ?

2007-01-11 Thread Otis Gospodnetic
Doron, Maybe my browser is misbehaving, but I don't see your comments in http://issues.apache.org/jira/browse/LUCENE-741 . Didn't see the JIRA email with them either... Otis - Original Message From: Doron Cohen [EMAIL PROTECTED] To: java-dev@lucene.apache.org Sent: Thursday, January

Re: Beyond Lucene 2.0 Index Design

2007-01-11 Thread Ming Lei
Marvin, Several posts back on this thread, I talked about an algorithm using impact-sorted posting lists for conjunctive boolean queries. Your concerns about impact-sorting in the boolean retrieval model are valid. But practically, the approximation (as in my original post) should work well enough for large

Re: IndexWriter forceOptimize() ?

2007-01-11 Thread Chris Hostetter
: I think optimize(int) that Chris mentioned exposes too much. I thought : about having optimize(boolean force) in place of optimize(), but then : we'd have to deprecate, so I opted for forceOptimize() that, I feel : exposes a little less. i have no strong feelings about exposing the number of

Re: IndexWriter forceOptimize() ?

2007-01-11 Thread Otis Gospodnetic
One day I read email in a different order, I miss replies like this. If optimize(boolean force) looks more attractive than optimizeForce(), that's fine by me. I just want to be able to force the cfs index, even if it's already optimized, to expand. Getting it to have a single segment is just a

[jira] Commented: (LUCENE-741) Field norm modifier (CLI tool)

2007-01-11 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464105 ] Doron Cohen commented on LUCENE-741: I was looking at what it would take to make this work with .nrm file as

Re: IndexWriter forceOptimize() ?

2007-01-11 Thread Doron Cohen
Otis Gospodnetic [EMAIL PROTECTED] wrote on 11/01/2007 20:17:31: Doron, Maybe my browser is misbehaving, but I don't see your comments in http://issues.apache.org/jira/browse/LUCENE-741 . Didn't see the JIRA email with them either... Otis Otis, your browser is perfect, just that I was

[jira] Updated: (LUCENE-741) Field norm modifier (CLI tool)

2007-01-11 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-741: --- Attachment: for.nrm.patch Field norm modifier (CLI tool) --

[jira] Updated: (LUCENE-741) Field norm modifier (CLI tool)

2007-01-11 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-741: --- Attachment: (was: for.nrm.patch) Field norm modifier (CLI tool) --

[jira] Updated: (LUCENE-741) Field norm modifier (CLI tool)

2007-01-11 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-741: --- Attachment: for.nrm.patch Field norm modifier (CLI tool) --

[jira] Commented: (LUCENE-741) Field norm modifier (CLI tool)

2007-01-11 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464109 ] Doron Cohen commented on LUCENE-741: Attached for.nrm.patch was very noisy - so I replaced it with one created

Re: Beyond Lucene 2.0 Index Design

2007-01-11 Thread Marvin Humphrey
On Jan 11, 2007, at 8:37 PM, Ming Lei wrote: But practically, the approximation (as in my original post) should work well enough for large corpus and relevancy-driven retrieval. The saving on disk access for large corpus (implies very long posting list) will be huge by impact-sorted posting