[jira] Commented: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-17 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858157#action_12858157 ] Michael McCandless commented on LUCENE-2393: Patch looks good Tom -- th

Re: Fix to contrib/misc/HighFreqTerms.java

2010-04-17 Thread Michael McCandless
terms.docFreq())); > > Tom > > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Wednesday, April 14, 2010 3:50 PM > To: java-dev@lucene.apache.org > Subject: Re: Bug in contrib/misc/HighFreqTerms.java? > > OK I committed th

[jira] Commented: (LUCENE-2398) Improve tests to work easier from IDEs

2010-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857914#action_12857914 ] Michael McCandless commented on LUCENE-2398: This is a great cleanup Ro

Re: Proposal about Version API "relaxation"

2010-04-16 Thread Michael McCandless
ts that do > this ... > > Dev on the experimental latest greatest fun branch, or the more in the past, > back compat hassle stable branch? Port most patches to two somewhat > diverging code bases? > > If that was actually how things worked out, I'd be +1. I just wonder ...

Re: Proposal about Version API "relaxation"

2010-04-16 Thread Michael McCandless
Getting back to the stable/experimental branches... I think, with separate stable & experimental branches, development would/should be active on both branches. It'd depend on the feature... Eg today we'd have 3.x stable branch and the experimental branch (= trunk). Small features, bug fixes, wo

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Michael McCandless
On Thu, Apr 15, 2010 at 3:50 PM, Robert Muir wrote: > for now simply moving analyzers to its own jar file would be a great step! +1 -- why not consolidate all analyzers now? (And fix indexer to require a minimal API = TokenStream minus reset & close). Mike -

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Michael McCandless
Unfortunately, live searching against an old index can get very hairy. EG look at what I had to do for the "flex API on pre-flex index" flex emulation layer. It's also not great because it gives the illusion that all is good, yet, you've taken a silent hit (up to ~10% or so) in your search perf.

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857381#action_12857381 ] Michael McCandless commented on LUCENE-2324: {quote} i would love to be

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857373#action_12857373 ] Michael McCandless commented on LUCENE-2324: bq. The usual design is a qu

[jira] Resolved: (LUCENE-1278) Add optional storing of document numbers in term dictionary

2010-04-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1278. Resolution: Won't Fix I think the pulsing codec (wraps any other codec

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Michael McCandless
2010/4/15 Shai Erera : > One way is to define 'major' as X and minor X.Y, and another is to define > major as 'X.Y' and minor as 'X.Y.Z'. I prefer the latter but don't have any > strong feelings against the former. I prefer X.Y, ie, changes to Y only is a minor release (mostly bug fixes but may

Re: TestCodecs running time

2010-04-15 Thread Michael McCandless
Yah :) TestStressIndexing2 is another slow one... I'll go fix it... Mike On Thu, Apr 15, 2010 at 2:15 AM, Shai Erera wrote: > See you already did that Mike :). Thanks ! now the tests run for 2s. > > Shai > > On Fri, Apr 9, 2010 at 12:49 PM, Michael McCandless > wrot

Re: SnapshotDeletionPolicy throws NPE if no commit happened

2010-04-15 Thread Michael McCandless
Presumably you'd also hit this exception if the DP deletes all commit points, right? I like IllegalStateException. Mike 2010/4/15 Shai Erera : > BTW, even if it's a stupid thing to do, someone can today create SDP and > call snapshot without ever creating IW. And it's not an impossible scenario.

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857124#action_12857124 ] Michael McCandless commented on LUCENE-2324: This is awesome Michael!

[jira] Commented: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857121#action_12857121 ] Michael McCandless commented on LUCENE-2393: Programmatically indexing t

Re: Proposal about Version API "relaxation"

2010-04-14 Thread Michael McCandless
On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey wrote: > Essentially, we're free to break back compat within "Lucy" at any time, but > we're not able to break back compat within a stable fork like "Lucy1", > "Lucy2", etc. So what we'll probably do during normal development with > Analyzers is

Re: Bug in contrib/misc/HighFreqTerms.java?

2010-04-14 Thread Michael McCandless
72] 536480 body:[55 6e 69 74 65 64] 543746 Which is not very readable, but, it does this because flex terms are arbitrary byte[], not necessarily utf8... maybe we should fix it to print both hex and String if we assume bytes are utf8? Mike On Wed, Apr 14, 2010 at 3:25 PM, Michael McCandless
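
A minimal sketch of the "print both hex and String" idea from the message above, using only the JDK. The class and method names (TermBytesPrinter, toHex, tryUtf8, describe) are hypothetical and are not part of HighFreqTerms or any Lucene API. It assumes the term is available as a plain byte[] and appends the decoded text only when the bytes are well-formed UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Lucene code: formats term bytes as hex and,
// when they decode cleanly as UTF-8, as text too.
public class TermBytesPrinter {

    // Space-separated lowercase hex in brackets, e.g. [55 6e].
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.append(']').toString();
    }

    // Returns the decoded string, or null if the bytes are not valid UTF-8.
    public static String tryUtf8(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    // Hex always; text in parentheses only when decoding succeeded.
    public static String describe(byte[] bytes) {
        String text = tryUtf8(bytes);
        return text == null ? toHex(bytes) : toHex(bytes) + " (" + text + ")";
    }

    public static void main(String[] args) {
        byte[] term = {0x55, 0x6e, 0x69, 0x74, 0x65, 0x64};
        // prints: body:[55 6e 69 74 65 64] (United)
        System.out.println("body:" + describe(term));
    }
}
```

Terms that are not valid UTF-8 fall back to hex only, so arbitrary byte[] terms still print safely.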

Re: Bug in contrib/misc/HighFreqTerms.java?

2010-04-14 Thread Michael McCandless
Ugh, I'll fix this. With the new flex API, you can't ask a composite (Multi/DirReader) for its postings -- you have to go through the static methods on MultiFields. I'm trying to put some distance b/w IndexReader and composite readers... because I'd like to eventually deprecate them. Ie, the comp

[jira] Updated: (LUCENE-2387) IndexWriter retains references to Readers used in Fields (memory leak)

2010-04-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2387: --- Attachment: LUCENE-2387-29x.patch 29x version of this patch. > IndexWriter reta

Re: Google-developed posting list encoding

2010-04-14 Thread Michael McCandless
Flex has already landed (in trunk, for 3.1), so this is "just" a matter of someone creating a codec using Group VarInt. Mike On Wed, Apr 14, 2010 at 4:58 AM, John Wang wrote: > This would be something that's excellent for contribution after the > Flex-Indexing support is added. > -John > > On We
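
For readers unfamiliar with the encoding mentioned above, here is a rough sketch of Group VarInt on its own, outside any codec. It is not a Lucene codec and uses no Lucene classes. The class name GroupVarInt and the exact tag layout (2 bits per value, first value in the lowest bits, little-endian value bytes) are assumptions made for illustration; a real implementation may order these differently.

```java
import java.io.ByteArrayOutputStream;

// Illustrative Group VarInt: four ints share one tag byte that stores
// (byteLength - 1) for each value in 2 bits, followed by the value bytes.
public class GroupVarInt {

    // Bytes (1..4) needed to hold v when treated as unsigned 32-bit.
    static int byteLength(int v) {
        if ((v & 0xFFFFFF00) == 0) return 1;
        if ((v & 0xFFFF0000) == 0) return 2;
        if ((v & 0xFF000000) == 0) return 3;
        return 4;
    }

    // Encodes exactly four ints: tag byte, then 1..4 little-endian bytes each.
    public static byte[] encode(int[] four) {
        if (four.length != 4) {
            throw new IllegalArgumentException("need exactly 4 values");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int tag = 0;
        for (int i = 0; i < 4; i++) {
            tag |= (byteLength(four[i]) - 1) << (i * 2);
        }
        out.write(tag);
        for (int v : four) {
            int n = byteLength(v);
            for (int b = 0; b < n; b++) {
                out.write((v >>> (8 * b)) & 0xff);
            }
        }
        return out.toByteArray();
    }

    // Decodes one group of four ints written by encode().
    public static int[] decode(byte[] buf) {
        int tag = buf[0] & 0xff;
        int pos = 1;
        int[] result = new int[4];
        for (int i = 0; i < 4; i++) {
            int n = ((tag >>> (i * 2)) & 3) + 1;
            int v = 0;
            for (int b = 0; b < n; b++) {
                v |= (buf[pos++] & 0xff) << (8 * b);
            }
            result[i] = v;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] enc = encode(new int[] {1, 300, 70000, 1 << 24});
        // 1 tag byte + 1 + 2 + 3 + 4 value bytes
        System.out.println(enc.length + " bytes");
    }
}
```

The decoder reads all four lengths from a single tag byte instead of testing a continuation bit on every byte, which is the usual argument for this layout over plain VInt.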

[jira] Commented: (LUCENE-2371) Update fileformats spec to match how flex's standard codec writes terms

2010-04-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856479#action_12856479 ] Michael McCandless commented on LUCENE-2371: Reminder to future self:

[jira] Resolved: (LUCENE-2111) Wrapup flexible indexing

2010-04-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2111. Resolution: Fixed > Wrapup flexible index

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856357#action_12856357 ] Michael McCandless commented on LUCENE-2386: Patch looks good

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856065#action_12856065 ] Michael McCandless commented on LUCENE-2386: Yeah I think new IW(),

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856022#action_12856022 ] Michael McCandless commented on LUCENE-2386: Shai, can you also test CR

[jira] Commented: (LUCENE-2316) Define clear semantics for Directory.fileLength

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855916#action_12855916 ] Michael McCandless commented on LUCENE-2316: bq. I'm also ok w/ the

[jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855906#action_12855906 ] Michael McCandless commented on LUCENE-2392: bq. I think what I'm

[jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855905#action_12855905 ] Michael McCandless commented on LUCENE-2392: bq. Mike, I don

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855786#action_12855786 ] Michael McCandless commented on LUCENE-2386: Patch looks good... thanks

[jira] Updated: (LUCENE-2392) Enable flexible scoring

2010-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2392: --- Attachment: LUCENE-2392.patch Rough first patch attached > Enable flexi

[jira] Created: (LUCENE-2392) Enable flexible scoring

2010-04-11 Thread Michael McCandless (JIRA)
Enable flexible scoring --- Key: LUCENE-2392 URL: https://issues.apache.org/jira/browse/LUCENE-2392 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Michael McCandless

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855750#action_12855750 ] Michael McCandless commented on LUCENE-2386: Shai I think you should

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855747#action_12855747 ] Michael McCandless commented on LUCENE-2386: Actually I consider this a

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855738#action_12855738 ] Michael McCandless commented on LUCENE-2386: I like the fix (catching

[jira] Commented: (LUCENE-2316) Define clear semantics for Directory.fileLength

2010-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855736#action_12855736 ] Michael McCandless commented on LUCENE-2316: I don't think Lucene

[jira] Commented: (LUCENE-2376) java.lang.OutOfMemoryError:Java heap space

2010-04-10 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855548#action_12855548 ] Michael McCandless commented on LUCENE-2376: Yes total unique fields are

Re: IndexWriter memory leak?

2010-04-10 Thread Michael McCandless
> > On Fri, Apr 9, 2010 at 12:32 PM, Michael McCandless > wrote: >> >> I agree IW should not hold refs to the Field instances from the last >> doc indexed... I put a patch on LUCENE-2387 to null the reference as >> we go. Can you confirm this lets GC reclaim? >

[jira] Commented: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855492#action_12855492 ] Michael McCandless commented on LUCENE-2372: +1 to making KeywordAnal

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855470#action_12855470 ] Michael McCandless commented on LUCENE-2386: I think oal.index is

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855421#action_12855421 ] Michael McCandless commented on LUCENE-2386: Patch looks good! Hmm... m

[jira] Resolved: (LUCENE-2387) IndexWriter retains references to Readers used in Fields (memory leak)

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2387. Resolution: Fixed Fix Version/s: 3.1 > IndexWriter retains references

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855364#action_12855364 ] Michael McCandless commented on LUCENE-2386: How about we subclass FNFE?

[jira] Commented: (LUCENE-2387) IndexWriter retains references to Readers used in Fields (memory leak)

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855347#action_12855347 ] Michael McCandless commented on LUCENE-2387: I agree, Uwe -- I'll

[jira] Commented: (LUCENE-2364) Add support for terms in BytesRef format to Term, TermQuery, TermRangeQuery & Co.

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855343#action_12855343 ] Michael McCandless commented on LUCENE-2364: Maybe we should simply depre

Re: IndexWriter memory leak?

2010-04-09 Thread Michael McCandless
I agree IW should not hold refs to the Field instances from the last doc indexed... I put a patch on LUCENE-2387 to null the reference as we go. Can you confirm this lets GC reclaim? Mike On Fri, Apr 9, 2010 at 12:54 AM, Ruben Laguna wrote: > But the Readers I'm talking about are not held by th

[jira] Updated: (LUCENE-2387) IndexWriter retains references to Readers used in Fields (memory leak)

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2387: --- Attachment: LUCENE-2387.patch Attached patch nulls out the Fieldable reference

[jira] Assigned: (LUCENE-2387) IndexWriter retains references to Readers used in Fields (memory leak)

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-2387: -- Assignee: Michael McCandless > IndexWriter retains references to Readers u

[jira] Commented: (LUCENE-2376) java.lang.OutOfMemoryError:Java heap space

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855336#action_12855336 ] Michael McCandless commented on LUCENE-2376: Hmm indeed you have a great

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855333#action_12855333 ] Michael McCandless commented on LUCENE-2386: bq. This is a behaviora

Re: TestCodecs running time

2010-04-09 Thread Michael McCandless
It's also slow because it repeats all the tests for each of the core codecs (standard, sep, pulsing, intblock). I think it's fine to reduce the number of iterations -- just make sure there's no seed to newRandom() so the distributed testing is "effective". Mike On Fri, Apr 9, 2010 at 12:43 AM,

Re: Getting fsync out of the loop

2010-04-08 Thread Michael McCandless
On Thu, Apr 8, 2010 at 6:21 PM, Earwin Burrfoot wrote: >> But, IW doesn't let you "hold on to" checkpoints... only to commits. >> >> Ie SnapshotDP will only "see" actual commit/close calls, not >> intermediate checkpoints like a random segment merge completing, a >> flush happening, etc. >> >> Or.

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855215#action_12855215 ] Michael McCandless commented on LUCENE-2386: I think the patch is good

Re: (LUCENE-2335) optimization: when sorting by field, if index has one segment and field values are not needed, do not load String[] into field cache)

2010-04-08 Thread Michael McCandless
Actually Toke opened a new issue (LUCENE-2369) for the new approach to Locale-based sorting... I think we should leave the existing issue as the single-segment optimization (it's a separate issue). Mike On Thu, Apr 8, 2010 at 6:06 PM, Chris Hostetter wrote: > > : > Is it possible to change it? I

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855135#action_12855135 ] Michael McCandless commented on LUCENE-2386: I agree: IW really should

Re: Move NoDeletionPolicy to core

2010-04-08 Thread Michael McCandless
+1 I don't think bw needs to be kept -- contrib/benchmark is allowed to change. Mike On Thu, Apr 8, 2010 at 5:44 AM, Shai Erera wrote: > Hi > > I've noticed benchmark has a NoDeletionPolicy class and I was wondering if > we can move it to core. I might want to use it for the parallel index stuf

[jira] Commented: (LUCENE-2376) java.lang.OutOfMemoryError:Java heap space

2010-04-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854876#action_12854876 ] Michael McCandless commented on LUCENE-2376: OK but I suspect the root c

Re: Getting fsync out of the loop

2010-04-08 Thread Michael McCandless
On Wed, Apr 7, 2010 at 3:27 PM, Earwin Burrfoot wrote: >> No, this doesn't make sense.  The OS detects a disk full on accepting >> the write into the write cache, not [later] on flushing the write >> cache to disk.  If the OS accepts the write, then disk is not full (ie >> flushing the cache will

[jira] Commented: (LUCENE-2364) Add support for terms in BytesRef format to Term, TermQuery, TermRangeQuery & Co.

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854717#action_12854717 ] Michael McCandless commented on LUCENE-2364: Once we fix Term to ta

[jira] Resolved: (LUCENE-2383) Some small fixes after the flex merge...

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2383. Resolution: Fixed > Some small fixes after the flex me

[jira] Commented: (LUCENE-2383) Some small fixes after the flex merge...

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854684#action_12854684 ] Michael McCandless commented on LUCENE-2383: Thanks Uwe, I agree th

Re: Commit freeze in flex branch

2010-04-07 Thread Michael McCandless
Yes +1 to that -- thanks Uwe!! And thanks to the many other people who helped out on flex. It's a big and exciting improvement :) Mike On Wed, Apr 7, 2010 at 4:11 PM, Michael Busch wrote: > Uwe, thanks for doing all the svn work! Was a smooth transition! > > Michael > > On 4/6/10 12:27 PM,

[jira] Updated: (LUCENE-2383) Some small fixes after the flex merge...

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2383: --- Attachment: LUCENE-2383.patch > Some small fixes after the flex me

[jira] Created: (LUCENE-2383) Some small fixes after the flex merge...

2010-04-07 Thread Michael McCandless (JIRA)
Some small fixes after the flex merge... Key: LUCENE-2383 URL: https://issues.apache.org/jira/browse/LUCENE-2383 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless

[jira] Commented: (LUCENE-2380) Add FieldCache.getTermBytes, to load term data as byte[]

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854615#action_12854615 ] Michael McCandless commented on LUCENE-2380: We could also do shared

[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854612#action_12854612 ] Michael McCandless commented on LUCENE-1536: With flex, you can now get

[jira] Created: (LUCENE-2382) Merging implemented by codecs must catch aborted merges

2010-04-07 Thread Michael McCandless (JIRA)
Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.1 This is a regression (we lost functionality on landing flex). When you close IW with "false" (meaning abort all running merges), IW asks the merge threads to abort. T

[jira] Created: (LUCENE-2381) Use packed ints for sort ords (in FieldCache.getStringIndex/.getTermBytesIndex)

2010-04-07 Thread Michael McCandless (JIRA)
- Java Issue Type: Improvement Reporter: Michael McCandless Fix For: 3.1 We wastefully use a whole int today, but for enumerated fields (eg "country", "state", "color", "category") this is very wasteful since you could use
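
To illustrate the saving described above, here is a small self-contained sketch of bit-packing ords into a long[]. It is not the oal.util packed ints implementation; the class PackedOrds and its methods are hypothetical, and it assumes ords are non-negative with a known maximum.

```java
// Illustrative only: stores each ord in the minimum number of bits
// instead of a full 32-bit int.
public class PackedOrds {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    // maxValue is the largest ord that will be stored (assumed >= 0).
    public PackedOrds(int valueCount, int maxValue) {
        bitsPerValue = Math.max(1, 32 - Integer.numberOfLeadingZeros(maxValue));
        mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) valueCount * bitsPerValue;
        blocks = new long[(int) ((totalBits + 63) / 64)];
    }

    public void set(int index, int value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long v = value & mask;
        blocks[block] = (blocks[block] & ~(mask << shift)) | (v << shift);
        int spill = shift + bitsPerValue - 64; // bits that cross into the next long
        if (spill > 0) {
            long highMask = (1L << spill) - 1;
            blocks[block + 1] = (blocks[block + 1] & ~highMask) | (v >>> (bitsPerValue - spill));
        }
    }

    public int get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long v = blocks[block] >>> shift;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            v |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return (int) (v & mask);
    }

    public int bitsPerValue() {
        return bitsPerValue;
    }

    public long ramBytes() {
        return 8L * blocks.length;
    }

    public static void main(String[] args) {
        // 100 docs, 6 distinct ords (0..5): 3 bits each rather than 32.
        PackedOrds ords = new PackedOrds(100, 5);
        for (int i = 0; i < 100; i++) ords.set(i, i % 6);
        System.out.println(ords.bitsPerValue() + " bits/value, " + ords.ramBytes() + " bytes");
    }
}
```

For an enumerated field with a handful of distinct values, this kind of layout needs a few bits per document, at the cost of some shifting and masking on each lookup.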

[jira] Created: (LUCENE-2380) Add FieldCache.getTermBytes, to load term data as byte[]

2010-04-07 Thread Michael McCandless (JIRA)
Reporter: Michael McCandless Fix For: 3.1 With flex, a term is now an opaque byte[] (typically, utf8 encoded unicode string, but not necessarily), so we need to push this up the search stack. FieldCache now has getStrings and getStringIndex; we need corresponding methods to

[jira] Resolved: (LUCENE-2379) TermRangeQuery & FieldCacheRangeFilter should accepts BytesRef

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2379. Resolution: Duplicate Woops -- dup of LUCENE-2364. > TermRangeQu

[jira] Commented: (LUCENE-2364) Add support for terms in BytesRef format to Term, TermQuery, TermRangeQuery & Co.

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854575#action_12854575 ] Michael McCandless commented on LUCENE-2364:

[jira] Created: (LUCENE-2379) TermRangeQuery & FieldCacheRangeFilter should accepts BytesRef

2010-04-07 Thread Michael McCandless (JIRA)
Type: Improvement Components: Search Affects Versions: 3.1 Reporter: Michael McCandless Fix For: 3.1 With flex, a term is a byte[] (BytesRef) not a String... we need to push this "up the search stack". TermRangeQuery / FieldCacheRangeFilter.newStringRa

[jira] Created: (LUCENE-2378) Cutover remaining usage of pre-flex APIs

2010-04-07 Thread Michael McCandless (JIRA)
Cutover remaining usage of pre-flex APIs Key: LUCENE-2378 URL: https://issues.apache.org/jira/browse/LUCENE-2378 Project: Lucene - Java Issue Type: Improvement Reporter: Michael

[jira] Commented: (LUCENE-2373) Change StandardTermsDictWriter to work with streaming and append-only filesystems

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854409#action_12854409 ] Michael McCandless commented on LUCENE-2373: I would love to make Lu

Re: Getting fsync out of the loop

2010-04-07 Thread Michael McCandless
On Tue, Apr 6, 2010 at 7:26 PM, Earwin Burrfoot wrote: >> Running out of disk space with fsync disabled won't lead to corruption. >> Even kill -9 the JRE process with fsync disabled won't corrupt. >> In these cases index just falls back to last successful commit. >> >> It's "only" power loss / OS

[jira] Commented: (LUCENE-2377) Enable the use of NoMergePolicy and NoMergeScheduler by Benchmark

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854401#action_12854401 ] Michael McCandless commented on LUCENE-2377: Patch looks good Shai! >

[jira] Commented: (LUCENE-2376) java.lang.OutOfMemoryError:Java heap space

2010-04-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854398#action_12854398 ] Michael McCandless commented on LUCENE-2376: Is this the same issue as LU

[jira] Resolved: (LUCENE-1990) Add unsigned packed int impls in oal.util

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1990. Resolution: Fixed Fix Version/s: (was: Flex Branch

[jira] Commented: (LUCENE-1990) Add unsigned packed int impls in oal.util

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854174#action_12854174 ] Michael McCandless commented on LUCENE-1990: OK indeed now I can see

[jira] Resolved: (LUCENE-1976) isCurrent() and getVersion() on an NRT reader are broken

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1976. Resolution: Fixed OK fixed on 3.1. > isCurrent() and getVersion() on an

[jira] Resolved: (LUCENE-2329) Use parallel arrays instead of PostingList objects

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2329. Resolution: Fixed Third time's a charm? > Use parallel arrays in

Re: Getting fsync out of the loop

2010-04-06 Thread Michael McCandless
On Tue, Apr 6, 2010 at 10:11 AM, Earwin Burrfoot wrote: > So, I want to pump my IndexWriter hard and fast with documents. Nice. > Removing fsync from FSDirectory helps. But for that I pay with possibility of > index corruption, not only if my node suddenly loses > power/kernelpanics, but also i

[jira] Commented: (LUCENE-2361) OutOfMemoryException while Indexing

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854120#action_12854120 ] Michael McCandless commented on LUCENE-2361: Hmm but the above infoSt

[jira] Updated: (LUCENE-1976) isCurrent() and getVersion() on an NRT reader are broken

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1976: --- Fix Version/s: 3.1 > isCurrent() and getVersion() on an NRT reader are bro

[jira] Reopened: (LUCENE-1976) isCurrent() and getVersion() on an NRT reader are broken

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reopened LUCENE-1976: Reopening to fix on 3.1 after flex lands... > isCurrent() and getVersion() on an

[jira] Commented: (LUCENE-2370) Reintegrate flex branch into trunk

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854112#action_12854112 ] Michael McCandless commented on LUCENE-2370: The bug is LUCENE-1976 -- a

[jira] Created: (LUCENE-2371) Update fileformats spec to match how flex's standard codec writes terms

2010-04-06 Thread Michael McCandless (JIRA)
Java Issue Type: Bug Components: Website Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.1 The standard codec changes how the terms index is written (eg uses packed ints, writes a whole field's terms at once, etc.)... we h

Re: Incremental Field Updates

2010-04-06 Thread Michael McCandless
have to be ordered if we > introduce updates?  Or does the onus of maintaining order fall on the > application? > > -Babak > > On Sat, Apr 3, 2010 at 3:28 AM, Michael McCandless > wrote: >> On Sat, Apr 3, 2010 at 1:25 AM, Babak Farhang wrote: >>>> I think the

[jira] Updated: (LUCENE-2329) Use parallel arrays instead of PostingList objects

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2329: --- Attachment: LUCENE-2329.patch New patch, init'ing the postings arrays in THPF.

[jira] Updated: (LUCENE-2265) improve automaton performance by running on byte[]

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2265: --- Attachment: LUCENE-2265.patch bq. The problem is it does not handle at least the

[jira] Commented: (LUCENE-2329) Use parallel arrays instead of PostingList objects

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853815#action_12853815 ] Michael McCandless commented on LUCENE-2329: bq. We could move th

[jira] Commented: (LUCENE-2361) OutOfMemoryException while Indexing

2010-04-06 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853811#action_12853811 ] Michael McCandless commented on LUCENE-2361: Hmm are you sure you'r

Re: Term space continuity

2010-04-05 Thread Michael McCandless
The flex API isolates fields, ie you get a TermsEnum for a given field and it enums only the term's text (as a BytesRef). Mike On Mon, Apr 5, 2010 at 7:22 PM, Earwin Burrfoot wrote: > A random thought from some of the earlier discussions. > > Had anybody used the fact that Lucene Term space is c

[jira] Updated: (LUCENE-2265) improve automaton performance by running on byte[]

2010-04-05 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2265: --- Attachment: LUCENE-2265.patch Patch w/ first cut at method to cutover whole UTF32

[jira] Updated: (LUCENE-2265) improve automaton performance by running on byte[]

2010-04-05 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2265: --- Attachment: LUCENE-2265.patch Attached patch for first cut at APIs to convert a

[jira] Resolved: (LUCENE-2365) Finding Newest Segment In Empty Index

2010-04-05 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-2365. Resolution: Fixed Fix Version/s: (was: 3.0.1) 3.1

[jira] Updated: (LUCENE-2329) Use parallel arrays instead of PostingList objects

2010-04-05 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2329: --- Attachment: LUCENE-2329.patch > Use parallel arrays instead of PostingList obje

[jira] Reopened: (LUCENE-2329) Use parallel arrays instead of PostingList objects

2010-04-05 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reopened LUCENE-2329: Reopening -- this fix causes an intermittent deadlock in TestStressIndexing2. It

[jira] Commented: (LUCENE-2365) Finding Newest Segment In Empty Index

2010-04-05 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853497#action_12853497 ] Michael McCandless commented on LUCENE-2365: Thanks, patch looks good;

[jira] Assigned: (LUCENE-2365) Finding Newest Segment In Empty Index

2010-04-05 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-2365: -- Assignee: Michael McCandless > Finding Newest Segment In Empty In

[jira] Commented: (LUCENE-2354) Convert NumericUtils and NumericTokenStream to use BytesRef instead of Strings/char[]

2010-04-04 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853273#action_12853273 ] Michael McCandless commented on LUCENE-2354: Patch looks good Uwe! >
