RE: losing history

2009-06-16 Thread Uwe Schindler
I think the problem here is that the file was not renamed in SVN. As patches in JIRA normally do not contain the rename (because they cannot be applied with all actions, such as renames, done automatically), I think the rename got lost. Renames only work correctly if the person who did the rename in his

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719940#action_12719940 ] Uwe Schindler commented on LUCENE-1673: --- bq. re: NumericField - it wouldn't have

Re: losing history

2009-06-16 Thread Simon Willnauer
Uwe is right! As long as you use diffs (patches) and have any kind of svn cp / svn mv done to your repository, they will not be reflected in the diff. I don't think there is any way of doing this currently except for the committer doing it by hand (again) when applying the patch. This is

RE: losing history

2009-06-16 Thread Uwe Schindler
The problem is, when you applied the patch, the files were already deleted/created by patch2 and the SVN client lost the move operation (it only sees a new unversioned file and one missing file). As your link notes, you cannot replay the changes already done (by the patch command). So the

[jira] Updated: (LUCENE-1630) Mating Collector and Scorer on doc Id orderness

2009-06-16 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1630: --- Attachment: LUCENE-1630.patch Changed Query.createQueryWeight to public, as was suggested by Yonik.

[jira] Assigned: (LUCENE-1504) SerialChainFilter should use DocSet API rather then deprecated BitSet API

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-1504: - Assignee: Uwe Schindler Hello Ryan, I will try to get this into 2.9, but before some

[jira] Commented: (LUCENE-1504) SerialChainFilter should use DocSet API rather then deprecated BitSet API

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720014#action_12720014 ] Uwe Schindler commented on LUCENE-1504: --- And other things: - Use a

[jira] Created: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
AttributeSource/TokenStream API improvements Key: LUCENE-1693 URL: https://issues.apache.org/jira/browse/LUCENE-1693 Project: Lucene - Java Issue Type: Improvement Components: Analysis

[jira] Updated: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-1693: -- Attachment: lucene-1693.patch Patch that includes all mentioned improvements, but needs

Re: losing history

2009-06-16 Thread Michael McCandless
I'm afraid this was my bad -- I blindly applied the patch, svn-deleted the 0-byte files, and failed to manually do the svn move instead. I believe the trunk version of svn includes an svn patch command (that is sorely needed). It'd fix this as well as, e.g., forgetting to svn add new files in a

[jira] Assigned: (LUCENE-1630) Mating Collector and Scorer on doc Id orderness

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1630: -- Assignee: Michael McCandless Mating Collector and Scorer on doc Id orderness

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720031#action_12720031 ] Uwe Schindler commented on LUCENE-1693: --- Why do you add a new class SmallToken? I

Re: Field.tokenStreamValue

2009-06-16 Thread Michael McCandless
Seems reasonable? So you're saying that if a Field has both TokenStream and some other value, the TokenStream gets indexed into postings and term vectors, but the other value gets stored? Mike On Mon, Jun 15, 2009 at 9:48 PM, Yonik Seeley <yo...@lucidimagination.com> wrote: The JavaDoc suggests that

RE: Field.tokenStreamValue

2009-06-16 Thread Uwe Schindler
Yes, I need exactly this for NumericField! The numeric value gets indexed using the tokenStream, but an optional stored field value (e.g. the number as plain text or even prefixEncoded) would also be good. Currently the user must index both types separately (but can use the same field name). As far

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720046#action_12720046 ] Michael Busch commented on LUCENE-1693: --- {quote} Why do you add a new class

Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the incredibly long recent thread about backwards-compatibility, so maybe what I'm about to propose has been proposed already. In that case my apologies in advance. Rather than discussing our current

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720052#action_12720052 ] Uwe Schindler commented on LUCENE-1693: --- {quote} bq. What was your conclusion about

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720054#action_12720054 ] Michael McCandless commented on LUCENE-1673: Patch looks good Uwe! The only

[jira] Issue Comment Edited: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720052#action_12720052 ] Uwe Schindler edited comment on LUCENE-1693 at 6/16/09 3:43 AM:

Re: Field.tokenStreamValue

2009-06-16 Thread Michael McCandless
OK let's do it then... Yonik, do you want to open an issue, patch, etc.? We should spell this out clearly in the javadocs that this case (tokenStream + string/binary value) is handled specially, because this does break from Field's normal semantics. Mike On Tue, Jun 16, 2009 at 6:18 AM, Uwe

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720056#action_12720056 ] Uwe Schindler commented on LUCENE-1673: --- What do you think about deprecating

RE: Field.tokenStreamValue

2009-06-16 Thread Uwe Schindler
Maybe we should also add ctors to Field, with TokenStream and String/binary that set Field.Store.YES (compress is deprecated, so no need to support). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From:

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720060#action_12720060 ] Michael Busch commented on LUCENE-1693: --- {quote} - If somebody implements the new

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720066#action_12720066 ] Uwe Schindler commented on LUCENE-1693: --- {quote} What if you currently have a filter

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Shai Erera
Since I proposed the same changes ( http://www.nabble.com/Re%3A-Lucene%27s-default-settings---back-compatibility-p23792927.html), I can only give my +1 to all 4 :). On the other thread I also proposed to change the policy around changing default settings. But maybe we should take it one step at a

[jira] Created: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Simon Willnauer (JIRA)
Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[] -- Key: LUCENE-1694 URL: https://issues.apache.org/jira/browse/LUCENE-1694

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720076#action_12720076 ] Shai Erera commented on LUCENE-1693: I have a couple of TokenFilters that work that

[jira] Updated: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-1694: Priority: Minor (was: Major) Query#mergeBooleanQueries argument should be of type

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720081#action_12720081 ] Uwe Schindler commented on LUCENE-1693: --- If you clone, you would not fall into the

Re: Field.tokenStreamValue

2009-06-16 Thread Michael McCandless
That sounds good. Mike On Tue, Jun 16, 2009 at 6:53 AM, Uwe Schindler <u...@thetaphi.de> wrote: Maybe we should also add ctors to Field, with TokenStream and String/binary that set Field.Store.YES (compress is deprecated, so no need to support). - Uwe Schindler H.-H.-Meier-Allee 63,

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720082#action_12720082 ] Michael McCandless commented on LUCENE-1673: I think deprecating DateTools

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720084#action_12720084 ] Michael McCandless commented on LUCENE-1673: bq. NumericField would only work

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael McCandless
+1 to all 4. Mike On Tue, Jun 16, 2009 at 6:37 AM, Michael Busch <busch...@gmail.com> wrote: Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the incredibly long recent thread about backwards-compatibility, so maybe what I'm about to propose has been

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Simon Willnauer
+1 to all 4. On Tue, Jun 16, 2009 at 2:07 PM, Michael McCandless <luc...@mikemccandless.com> wrote: +1 to all 4. Mike On Tue, Jun 16, 2009 at 6:37 AM, Michael Busch <busch...@gmail.com> wrote: Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the incredibly

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720089#action_12720089 ] Uwe Schindler commented on LUCENE-1673: --- bq. Actually, this need not be a

[jira] Updated: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-1694: Attachment: Query_mergeBooleanQueries.patch Attached patch + testcase. The patch passes

[jira] Created: (LUCENE-1695) Update the Highlighter to use the new TokenStream API

2009-06-16 Thread Mark Miller (JIRA)
Update the Highlighter to use the new TokenStream API - Key: LUCENE-1695 URL: https://issues.apache.org/jira/browse/LUCENE-1695 Project: Lucene - Java Issue Type: Improvement

[jira] Updated: (LUCENE-1695) Update the Highlighter to use the new TokenStream API

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1695: Attachment: LUCENE-1695.patch Rough, non backward compat patch. There is still an issue with

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
Just to catcall from the corner over here: So unless you update on *every* minor release, from a user's perspective, this is the same as tossing out API back compat (though still with the option to keep what we want around as long as we want)? Michael Busch wrote: Probably everyone is

Re: Field.tokenStreamValue

2009-06-16 Thread Yonik Seeley
Yep, it's also useful for pre-analyzing text. Wish I had it way back when I started Solr (to avoid an unnecessary pass through the analyzer, I actually stored and indexed the number in transformed but untokenized form... not great for Luke :-) -Yonik http://www.lucidimagination.com On Tue, Jun

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Grant Ingersoll
+1 on everything. This is the sanity we need, especially #2. Thanks for bringing this up again. I'd add a slight mod to #2 that I think helps further communicate to users our expectations (marked by my initials GSI) by employing some convention in our @deprecated comments: 2.

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
I'd be interested in what the users list has to say. With this many +1's, seems reasonable to take it over there. - Mark Grant Ingersoll wrote: +1 on everything. This is the sanity we need, especially #2. Thanks for bringing this up again. I'd add a slight mod to #2 that I think helps

[jira] Assigned: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1694: -- Assignee: Michael McCandless Query#mergeBooleanQueries argument should be of

[jira] Commented: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720139#action_12720139 ] Michael McCandless commented on LUCENE-1694: Patch looks good, thanks Simon.

[jira] Resolved: (LUCENE-1694) Query#mergeBooleanQueries argument should be of type BooleanQuery[] instead of Query[]

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1694. Resolution: Fixed Thanks Simon! Query#mergeBooleanQueries argument should be of

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720143#action_12720143 ] Michael McCandless commented on LUCENE-1673: bq. I only wanted to hear one

[jira] Assigned: (LUCENE-1692) Contrib analyzers need tests

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1692: -- Assignee: Michael McCandless Contrib analyzers need tests

[jira] Updated: (LUCENE-1692) Contrib analyzers need tests

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1692: --- Fix Version/s: 2.9 Contrib analyzers need tests

[jira] Commented: (LUCENE-1692) Contrib analyzers need tests

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720144#action_12720144 ] Michael McCandless commented on LUCENE-1692: These are much needed... thanks

Re: madvise(ptr, len, MADV_SEQUENTIAL)

2009-06-16 Thread Michael McCandless
Lucene could really make use of this method. When a segment merge takes place, we can read and write many GB of data, which without madvise would, on many OSs, effectively flush the IO cache (thus hurting our search performance). Mike On Mon, Jun 15, 2009 at 6:01 PM, Jason

[jira] Commented: (LUCENE-1313) Near Realtime Search

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720147#action_12720147 ] Michael McCandless commented on LUCENE-1313: {quote} I think this is

[jira] Commented: (LUCENE-1313) Near Realtime Search

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720148#action_12720148 ] Michael McCandless commented on LUCENE-1313: bq. conditionalize them to run

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720152#action_12720152 ] Uwe Schindler commented on LUCENE-1673: --- With a NumericTermQuery you would only hit

RE: madvise(ptr, len, MADV_SEQUENTIAL)

2009-06-16 Thread Uwe Schindler
But to use it, we should change MMapDirectory to also use the mapping when writing to files. I thought about it, it is very simple to implement (just copy the IndexInput and change all gets() to sets()) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail:

[jira] Commented: (LUCENE-1692) Contrib analyzers need tests

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720154#action_12720154 ] Robert Muir commented on LUCENE-1692: - Michael: LUCENE-973 would save me from having

[jira] Updated: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-1696: Attachment: ASCIIFoldingFilter._newTokenAPI.patch all tests pass Added New Token API

[jira] Created: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
Added New Token API impl for ASCIIFoldingFilter --- Key: LUCENE-1696 URL: https://issues.apache.org/jira/browse/LUCENE-1696 Project: Lucene - Java Issue Type: Improvement Components:

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720163#action_12720163 ] Yonik Seeley commented on LUCENE-1673: -- bq. We could easily add numeric; then

[jira] Commented: (LUCENE-973) Token of returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720169#action_12720169 ] Mark Miller commented on LUCENE-973: So the latest patch is ready to go in? I guess I

[jira] Commented: (LUCENE-1377) Add HTMLStripReader and WordDelimiterFilter from SOLR

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720171#action_12720171 ] Michael McCandless commented on LUCENE-1377: Robert, would ICUTokenizer

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720173#action_12720173 ] Robert Muir commented on LUCENE-1696: - Simon, I think if you want to handle accents in

[jira] Commented: (LUCENE-1377) Add HTMLStripReader and WordDelimiterFilter from SOLR

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720175#action_12720175 ] Robert Muir commented on LUCENE-1377: - they are a bit different. for example:

[jira] Commented: (LUCENE-973) Token of returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720177#action_12720177 ] Michael McCandless commented on LUCENE-973: --- I'll take it Mark! Fixes a bug and

[jira] Assigned: (LUCENE-973) Token of returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-973: - Assignee: Michael McCandless Token of returns in CJKTokenizer + new

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720183#action_12720183 ] Robert Muir commented on LUCENE-1696: - i uploaded a testcase under LUCENE-1581 showing

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720189#action_12720189 ] Simon Willnauer commented on LUCENE-1696: - bq. i don't see an alternative,

[jira] Assigned: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned LUCENE-1696: --- Assignee: Mark Miller Added New Token API impl for ASCIIFoldingFilter

[jira] Assigned: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-06-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned LUCENE-1486: --- Assignee: Mark Miller Wildcards, ORs etc inside Phrase queries

[jira] Updated: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1696: Attachment: TestGermanCollation.java show how to do this with german... it's a bit more involved

[jira] Created: (LUCENE-1697) MoreLikeThis should use the new Token API

2009-06-16 Thread Grant Ingersoll (JIRA)
MoreLikeThis should use the new Token API - Key: LUCENE-1697 URL: https://issues.apache.org/jira/browse/LUCENE-1697 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720192#action_12720192 ] Simon Willnauer commented on LUCENE-1696: - Thanks robert, I did know about

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720191#action_12720191 ] Michael McCandless commented on LUCENE-1673: {quote} bq. We could easily add

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720193#action_12720193 ] Robert Muir commented on LUCENE-1696: - simon, actually i think it's documented you can

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720196#action_12720196 ] Michael Busch commented on LUCENE-1693: --- But, the additional copying would affect

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720197#action_12720197 ] Simon Willnauer commented on LUCENE-1696: - bq. simon, actually i think it's

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Wow this is *very* similar! :) On 6/16/09 4:29 AM, Shai Erera wrote: Since I proposed the same changes (http://www.nabble.com/Re%3A-Lucene%27s-default-settings---back-compatibility-p23792927.html), I can only give my +1 to all 4 :). On the other thread I also proposed to change the policy

Re: madvise(ptr, len, MADV_SEQUENTIAL)

2009-06-16 Thread Earwin Burrfoot
Except, you don't know the size of the file to be written upfront. One probable solution is to map the output file in pages. As a complementary solution you can map a huge area of the file, and hope little real memory is allocated by the OS unless you actually write all over that area. Dunno. The idea of

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Sounds good, Grant. I'll open a task to change the policy with target release=3.0. Michael On 6/16/09 6:53 AM, Grant Ingersoll wrote: +1 on everything. This is the sanity we need, especially #2. Thanks for bringing this up again. I'd add a slight mod to #2 that I think helps further

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Earwin Burrfoot
Oh yes! Again! +1 One point is missing. What about incompatible behavioral changes that do not touch API and file format? Like posIncr=0 at the first token in stream, or analyzer fixes, or something along these lines. Are we free to introduce them in a minor release without warning, or are we

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Fair enough. We certainly want our users to understand our reasons for these changes, and keep their trust that we're making our best efforts to keep upgrading as effortless as possible. However, there will always be someone who is not happy with such a change. But if the vast majority of the

[jira] Commented: (LUCENE-973) Token of returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720200#action_12720200 ] Michael McCandless commented on LUCENE-973: --- Does anyone know if the added

[jira] Commented: (LUCENE-1696) Added New Token API impl for ASCIIFoldingFilter

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720201#action_12720201 ] Robert Muir commented on LUCENE-1696: - since this seems to be a recurring theme maybe

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Yonik Seeley
So under this proposal, what's the difference between a major and minor release? -Yonik http://www.lucidimagination.com On Tue, Jun 16, 2009 at 6:37 AM, Michael Busch <busch...@gmail.com> wrote: Probably everyone is thinking right now "Oh no! Not again!". I admit I didn't fully read the

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
I'd suggest treating a runtime change like an API change (unless it's fixing a bug of course), i.e. giving a warning, providing a switch, and switching the default behavior only after a major or minor release that had the warning/switch has been around. Michael On 6/16/09 8:54 AM, Earwin Burrfoot

[jira] Commented: (LUCENE-973) Token of returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720204#action_12720204 ] Robert Muir commented on LUCENE-973: sounds like another good test case, add a few

Re: madvise(ptr, len, MADV_SEQUENTIAL)

2009-06-16 Thread Michael McCandless
Hmm... posix_fadvise lets you do this with a file descriptor; this would be better for Lucene (per descriptor, not per mapped region of RAM) since we could advise independent of which FSDir impl is in use... Mike On Tue, Jun 16, 2009 at 10:32 AM, Uwe Schindler <u...@thetaphi.de> wrote: But to use

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Shai Erera
Index back-compat is guaranteed to hold within minor releases. On Tue, Jun 16, 2009 at 6:59 PM, Yonik Seeley <yo...@lucidimagination.com> wrote: So under this proposal, what's the difference between a major and minor release? -Yonik http://www.lucidimagination.com On Tue, Jun 16, 2009 at

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
Right - I'm not saying that the users should trump the devs, just curious what the response will be, if any. I also think that when we update the back compat policy, there should be wording that stresses where we should use our new powers carefully (e.g. common APIs and such). And we should

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
From a backwards-compatibility point of view, nothing really. Michael On 6/16/09 8:59 AM, Yonik Seeley wrote: So under this proposal, what's the difference between a major and minor release? -Yonik http://www.lucidimagination.com On Tue, Jun 16, 2009 at 6:37 AM, Michael

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Shai Erera
Ahh ... I wish I had finished http://www.nabble.com/Re%3A-Lucene%27s-default-settings---back-compatibility-p23792927.html with a +1 of my own. Guess that's what was missing to get it to closure :). Shai On Tue, Jun 16, 2009 at 7:03 PM, Michael Busch busch...@gmail.com wrote: I'd suggest to treat

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
Yeah, the only difference now is that we can remove deprecated APIs. And I guess we add nothing. Which, as Michael has said, is goofy. 3.0 will be 2.9 like 1.9 was 2.0. Without deprecations. Not a big deal at all, but I find it goofy too. - Mark Michael Busch wrote: From a

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Michael Busch
Well I'd actually hope that there will be significantly less need to do these tricks to get around the new policy. I'll open a JIRA issue and we can use it to work on the exact wording. Michael On 6/16/09 9:03 AM, Mark Miller wrote: Right - I'm not saying that the users should trump the

[jira] Commented: (LUCENE-973) Token of returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12720207#action_12720207 ] Michael McCandless commented on LUCENE-973: --- Well, my question is: is there any

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Mark Miller
I would guess you hit what I call thread fatigue by the time you summed that up :) Michael hasn't been around for a bit - perhaps it was easier for him to spawn a new thread. Also, much shorter text to read :) Shai Erera wrote: Ahh ... I wish I had finished

[jira] Created: (LUCENE-1698) Change backwards-compatibility policy

2009-06-16 Thread Michael Busch (JIRA)
Change backwards-compatibility policy - Key: LUCENE-1698 URL: https://issues.apache.org/jira/browse/LUCENE-1698 Project: Lucene - Java Issue Type: Task Reporter: Michael Busch

Re: Proposal for changing the backwards-compatibility policy

2009-06-16 Thread Shai Erera
Also, much shorter text to read :) You're right, Michael's is 484 words, mine was 691. But in my defense, I did offer two more changes, that were later brought up on this thread (summing to 563 words) :). Anyway, I'm glad it's kept alive and hopefully things will change. Shai On Tue, Jun 16,

[jira] Commented: (LUCENE-973) Token of returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12720212#action_12720212 ] Robert Muir commented on LUCENE-973: Michael, I don't see anything obvious, but a test

[jira] Updated: (LUCENE-973) Token of returns in CJKTokenizer + new TestCJKTokenizer

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-973: -- Attachment: LUCENE-973.patch Or... how about we just switch to iteration not

[jira] Commented: (LUCENE-1693) AttributeSource/TokenStream API improvements

2009-06-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12720222#action_12720222 ] Michael McCandless commented on LUCENE-1693: bq. What do you or others think

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-16 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12720223#action_12720223 ] Yonik Seeley commented on LUCENE-1673: -- bq. But we are already baking in the trie
