[jira] Commented: (LUCENE-1510) InstantiatedIndexReader throws NullPointerException in norms() when used with a MultiReader

2009-01-08 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661908#action_12661908 ] Karl Wettin commented on LUCENE-1510: - Thanks for the report Robert! I've committed a

[jira] Closed: (LUCENE-1510) InstantiatedIndexReader throws NullPointerException in norms() when used with a MultiReader

2009-01-08 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wettin closed LUCENE-1510. --- Resolution: Fixed Fix Version/s: 2.9 > InstantiatedIndexReader throws NullPointerException in

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Michael McCandless
robert engels wrote: Then why not always write segment.del, where is incremented. This is what Lucene does today. It's "write once". Each file may be compressed or uncompressed based on the number of deletions it contains. Lucene also does this. Still, as Marvin pointed out,

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661934#action_12661934 ] Michael McCandless commented on LUCENE-1476: {quote} > PostingList would be c

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661944#action_12661944 ] Paul Elschot commented on LUCENE-1476: -- bq. To minimize CPU cycles, it would theoreti

[jira] Commented: (LUCENE-1510) InstantiatedIndexReader throws NullPointerException in norms() when used with a MultiReader

2009-01-08 Thread Robert Newson (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661956#action_12661956 ] Robert Newson commented on LUCENE-1510: --- Looks good to me. I wonder if you should ad

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661977#action_12661977 ] Marvin Humphrey commented on LUCENE-1476: - Paul Elschot: > How about a SegmentSea

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661982#action_12661982 ] Marvin Humphrey commented on LUCENE-1476: - How about if we model deletions-as-iter

[jira] Assigned: (LUCENE-1497) Minor changes to SimpleHTMLFormatter

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1497: -- Assignee: Michael McCandless > Minor changes to SimpleHTMLFormatter >

[jira] Commented: (LUCENE-1497) Minor changes to SimpleHTMLFormatter

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661992#action_12661992 ] Michael McCandless commented on LUCENE-1497: In fact I think it may be faster

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661995#action_12661995 ] Marvin Humphrey commented on LUCENE-1476: - Mike McCandless: > For a TermQuery (on

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661998#action_12661998 ] Mark Miller commented on LUCENE-1476: - bq. I noticed that in one version of the patch

[jira] Commented: (LUCENE-1497) Minor changes to SimpleHTMLFormatter

2009-01-08 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662004#action_12662004 ] Shai Erera commented on LUCENE-1497: If I understand you correctly, you propose to cha

[jira] Commented: (LUCENE-1497) Minor changes to SimpleHTMLFormatter

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662020#action_12662020 ] Michael McCandless commented on LUCENE-1497: Ahh, OK, then let's leave your ap

[jira] Resolved: (LUCENE-1497) Minor changes to SimpleHTMLFormatter

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1497. Resolution: Fixed Fix Version/s: (was: 2.4.1) Lucene Fields: [New, P

[jira] Updated: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1483: --- Attachment: LUCENE-1483.patch Attached full patch (though you'll get failed hunks be

[jira] Commented: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-08 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662028#action_12662028 ] Mark Miller commented on LUCENE-1483: - bq. It runs legacy vs new sort and asserts that

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662033#action_12662033 ] Jason Rutherglen commented on LUCENE-1476: -- Marvin: "The whole tombstone idea aro

[jira] Commented: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-08 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662038#action_12662038 ] Mark Miller commented on LUCENE-1483: - It's the ORDSUBORD again (which I don't think we

[jira] Commented: (LUCENE-1482) Replace infoSteram by a logging framework (SLF4J)

2009-01-08 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662039#action_12662039 ] Shai Erera commented on LUCENE-1482: Grant, given what I wrote below, having Lucene us

[jira] Issue Comment Edited: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

2009-01-08 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662038#action_12662038 ] markrmil...@gmail.com edited comment on LUCENE-1483 at 1/8/09 9:15 AM: -

[jira] Commented: (LUCENE-1314) IndexReader.clone

2009-01-08 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662043#action_12662043 ] Jason Rutherglen commented on LUCENE-1314: -- I executed on Eclipse Mac OS X on a 4

[jira] Commented: (LUCENE-1482) Replace infoSteram by a logging framework (SLF4J)

2009-01-08 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662044#action_12662044 ] Yonik Seeley commented on LUCENE-1482: -- It seems we should take into consideration th

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662065#action_12662065 ] Marvin Humphrey commented on LUCENE-1476: - Jason Rutherglen: > I found in making

[jira] Commented: (LUCENE-1479) TrecDocMaker skips over documents when "Date" is missing from documents

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662073#action_12662073 ] Michael McCandless commented on LUCENE-1479: Shai, it seems like a doc that ha

[jira] Assigned: (LUCENE-1479) TrecDocMaker skips over documents when "Date" is missing from documents

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1479: -- Assignee: Michael McCandless > TrecDocMaker skips over documents when "Date" i

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662089#action_12662089 ] Michael McCandless commented on LUCENE-1476: {quote} > There's going to be a c

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662092#action_12662092 ] Michael McCandless commented on LUCENE-1476: {quote} > If Lucene crashed for s

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662097#action_12662097 ] Michael McCandless commented on LUCENE-1476: {quote} > It would be exposed as

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662100#action_12662100 ] Marvin Humphrey commented on LUCENE-1476: - Mike McCandless: > I'm also curious wh

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662101#action_12662101 ] Michael McCandless commented on LUCENE-1476: {quote} > If we move the deletion

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662102#action_12662102 ] Michael McCandless commented on LUCENE-1476: {quote} > How about if we model d

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662107#action_12662107 ] Marvin Humphrey commented on LUCENE-1476: - Mike McCandless: > Commit is for crash

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662110#action_12662110 ] Marvin Humphrey commented on LUCENE-1476: - Mike McCandless: > if it's sparse, you

[jira] Updated: (LUCENE-1314) IndexReader.clone

2009-01-08 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-1314: - Attachment: LUCENE-1314.patch LUCENE-1314.patch All tests pass. IndexReader.close wa

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662143#action_12662143 ] Marvin Humphrey commented on LUCENE-1476: - Mike McCandless: > So, net/net it seem

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey
> You can do that now by implementing BitVector.nextSetBit(int tick) and using > that in TermDocs to set a nextDeletion member var instead of checking every > doc num with BitVector.get(). This seems so easy, I should take a crack at it. :) Marvin Humphrey --
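The idea quoted above (cache the next deleted doc number instead of calling BitVector.get() for every doc) can be sketched roughly as follows. This is a minimal illustration, not the attached patch and not Lucene's actual classes: SketchBitVector, SketchDocFilter, and liveDocs are hypothetical names, the postings are assumed to be sorted in ascending order, and nextSetBit is assumed to return -1 when no further bit is set.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a deletions bit vector; not Lucene's BitVector. */
class SketchBitVector {
    private final byte[] bits;
    private final int size;

    SketchBitVector(int size) {
        this.size = size;
        this.bits = new byte[(size + 7) >> 3];
    }

    void set(int bit) {
        bits[bit >> 3] |= (byte) (1 << (bit & 7));
    }

    boolean get(int bit) {
        return (bits[bit >> 3] & (1 << (bit & 7))) != 0;
    }

    /** Returns the first set bit at or after tick, or -1 if there is none (assumed convention). */
    int nextSetBit(int tick) {
        for (int i = Math.max(tick, 0); i < size; i++) {
            if (get(i)) {
                return i;
            }
        }
        return -1;
    }
}

/** Hypothetical helper that mimics a TermDocs-style loop over sorted doc numbers. */
class SketchDocFilter {
    static List<Integer> liveDocs(int[] sortedPostings, SketchBitVector deletions) {
        List<Integer> live = new ArrayList<>();
        // Cached "next deletion" member, as suggested in the quoted message.
        int nextDeletion = deletions.nextSetBit(0);
        for (int doc : sortedPostings) {
            // Only consult the bit vector when the cached deletion has fallen behind.
            if (nextDeletion != -1 && nextDeletion < doc) {
                nextDeletion = deletions.nextSetBit(doc);
            }
            if (doc == nextDeletion) {
                continue; // this doc is deleted, skip it
            }
            live.add(doc);
        }
        return live;
    }
}
```

With sparse deletions, most iterations reduce to a single integer comparison against the cached value. The linear scan inside nextSetBit is kept simple here; a real implementation would likely skip whole zero bytes.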

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread robert engels
The way we've simplified this is that every document has an OID. It simplifies updates and delete tracking (in the transaction log). On Jan 8, 2009, at 2:28 PM, Marvin Humphrey (JIRA) wrote: [ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetab

Re: stored fields / unicode compression

2009-01-08 Thread Chris Hostetter
Catching up on my holiday email, I don't think there were any replies to this question yet. The low level file formats used by Lucene are an area I don't have time/expertise to follow carefully, but if I remember correctly the consensus is/was to move more towards pure (byte[] data, int star

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662214#action_12662214 ] Jason Rutherglen commented on LUCENE-1476: -- M.M.:" I think the transactions layer

Re: stored fields / unicode compression

2009-01-08 Thread Robert Muir
thanks for the response, this sounds great. some way to plug in arbitrary schemes would be helpful. I've experimented with a few for my case and unicode compression gave the best bang for the buck, but i remember some of the other schemes such as arithmetic coding seemed to provide wins for reason

Re: Realtime Search

2009-01-08 Thread Jason Rutherglen
Based on our discussions, it seems best to get realtime search going in small steps. Below are some possible steps to take. Patch #1: Expose an IndexWriter.getReader method that returns the current reader and shares the write lock Patch #2: Implement a realtime ram index class Patch #3: Implement

[jira] Updated: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marvin Humphrey updated LUCENE-1476: Attachment: quasi_iterator_deletions.diff Here's a patch implementing BitVector.nextSetBit

Re: Realtime Search

2009-01-08 Thread John Wang
We have worked on this problem on the server level as well. We have also open sourced it at: http://code.google.com/p/zoie/ wiki on the realtime aspect: http://code.google.com/p/zoie/wiki/ZoieSystem -John On Fri, Dec 26, 2008 at 12:34 PM, Robert Engels wrote: > If you move to the "either embe

[jira] Commented: (LUCENE-1494) Additional features for searching for value across multiple fields (many-to-one style)

2009-01-08 Thread Paul Cowan (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662244#action_12662244 ] Paul Cowan commented on LUCENE-1494: Hi Hoss, I don't disagree that an inverted inher

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662247#action_12662247 ] Doug Cutting commented on LUCENE-1476: -- bq. To really tighten this loop, you have to

[jira] Updated: (LUCENE-1479) TrecDocMaker skips over documents when "Date" is missing from documents

2009-01-08 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1479: --- Attachment: (was: LUCENE-1479.patch) > TrecDocMaker skips over documents when "Date" is missing

[jira] Updated: (LUCENE-1479) TrecDocMaker skips over documents when "Date" is missing from documents

2009-01-08 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1479: --- Attachment: LUCENE-1479.patch Thanks Mike, you're right. The compilation error is a result of a refa