[jira] Updated: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-2386: --- Attachment: LUCENE-2386.patch Patch includes the proposed test in TestIndexWriter. I think this is r

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856065#action_12856065 ] Michael McCandless commented on LUCENE-2386: Yeah I think new IW(), set maxBuf

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856063#action_12856063 ] Shai Erera commented on LUCENE-2386: So just call "new IW()", then rollback and ensure

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856022#action_12856022 ] Michael McCandless commented on LUCENE-2386: Shai, can you also test CREATE on

[jira] Commented: (LUCENE-2355) Refactor Directory/Multi/SegmentReader creation/reopening/cloning/closing

2010-04-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855964#action_12855964 ] Earwin Burrfoot commented on LUCENE-2355: - - Norm holds a reference to papa-Segmen

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855961#action_12855961 ] Earwin Burrfoot commented on LUCENE-2386: - I'm surrendering the issue, any outcome

[jira] Commented: (LUCENE-1934) Rework (Float)LatLng implementation and distance calculation

2010-04-12 Thread Nicolas Helleringer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855944#action_12855944 ] Nicolas Helleringer commented on LUCENE-1934: - This has to be reworked when LU

Re: chinese stopwords

2010-04-12 Thread Grant Ingersoll
+1 On Apr 9, 2010, at 9:59 PM, John Wang wrote: > Hi: > >I am using the SmartChineseAnalyzer class and it is great! > >Was wondering if we should have a set of Chinese stopwords. The default > set contains only punctuation. > > Thanks > > -John -

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855937#action_12855937 ] Earwin Burrfoot commented on LUCENE-2386: - bq. So if you pass CREATE on an already

[jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855934#action_12855934 ] Robert Muir commented on LUCENE-2392: - {quote} Really, maybe somehow we should be usin

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855924#action_12855924 ] Shai Erera commented on LUCENE-2386: I don't think that people need to write that "emp

[jira] Commented: (LUCENE-2316) Define clear semantics for Directory.fileLength

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855916#action_12855916 ] Michael McCandless commented on LUCENE-2316: bq. I'm also ok w/ the bw break r

[jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855913#action_12855913 ] Shai Erera commented on LUCENE-2392: I'd like to withdraw my request from above. I mis

[jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855906#action_12855906 ] Michael McCandless commented on LUCENE-2392: bq. I think what I'm saying is th

[jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855905#action_12855905 ] Michael McCandless commented on LUCENE-2392: bq. Mike, I don't think overlapTe

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855899#action_12855899 ] Earwin Burrfoot commented on LUCENE-2386: - bq. The point here was that we should n

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855892#action_12855892 ] Shai Erera commented on LUCENE-2386: bq. what is the proper way (after this fix) to op

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855890#action_12855890 ] Earwin Burrfoot commented on LUCENE-2386: - bq. I'm sure such apps are more sophist

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855889#action_12855889 ] Earwin Burrfoot commented on LUCENE-2386: - Meh, that all is just a matter of persp

Re: [jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Shai Erera
Robert, I'm not sure where I proposed to shove random statistics into the index. Lucene calculates a doc length today in a way that some in academia/research here disagree with. So instead of attempting to fix it for all, I think it'd be great if one can define what is the doc Length as on

Re: [jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Robert Muir
I disagree. I think what Mike has defined here is way beyond a baby-step: it's all the stats needed to support modern IR models in Lucene: BM25, additional vector space algorithms, divergence from randomness, and language modelling. I think the ability to calculate your own random statistics and sh

[jira] Commented: (LUCENE-2373) Change StandardTermsDictWriter to work with streaming and append-only filesystems

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855877#action_12855877 ] Shai Erera commented on LUCENE-2373: I'd rather not count on file length as well ... s

[jira] Commented: (LUCENE-2392) Enable flexible scoring

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855875#action_12855875 ] Shai Erera commented on LUCENE-2392: Mike - it'll also be great if we can store the le

[jira] Commented: (LUCENE-2316) Define clear semantics for Directory.fileLength

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855873#action_12855873 ] Shai Erera commented on LUCENE-2316: Well ... dir.fileLength is also used by SegmentIn

[jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory

2010-04-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855870#action_12855870 ] Shai Erera commented on LUCENE-2386: I'm not sure if we're arguing about the same thin