date:20090815

[jira] Assigned: (LUCENE-1809) highlight-vs-vector-highlight.alg is unfair

2009-08-15 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-1809:
--

Assignee: Michael McCandless

> highlight-vs-vector-highlight.alg is unfair
> ---
>
> Key: LUCENE-1809
> URL: https://issues.apache.org/jira/browse/LUCENE-1809
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/benchmark
>Affects Versions: 2.9
>Reporter: Koji Sekiguchi
>Assignee: Michael McCandless
>Priority: Trivial
> Attachments: LUCENE-1809.patch, LUCENE-1809.patch
>
>
> highlight-vs-vector-highlight.alg uses EnwikiQueryMaker which makes 
> SpanQueries, but FastVectorHighlighter simply ignores SpanQueries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Commented: (LUCENE-1809) highlight-vs-vector-highlight.alg is unfair

2009-08-15 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743666#action_12743666
 ] 

Michael McCandless commented on LUCENE-1809:


Patch looks good, thanks Koji.  I'll commit shortly!

> highlight-vs-vector-highlight.alg is unfair
> ---
>
> Key: LUCENE-1809
> URL: https://issues.apache.org/jira/browse/LUCENE-1809
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/benchmark
>Affects Versions: 2.9
>Reporter: Koji Sekiguchi
>Priority: Trivial
> Attachments: LUCENE-1809.patch, LUCENE-1809.patch
>
>
> highlight-vs-vector-highlight.alg uses EnwikiQueryMaker which makes 
> SpanQueries, but FastVectorHighlighter simply ignores SpanQueries.


[jira] Resolved: (LUCENE-1809) highlight-vs-vector-highlight.alg is unfair

2009-08-15 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1809.


   Resolution: Fixed
Fix Version/s: 2.9

Thanks Koji!

> highlight-vs-vector-highlight.alg is unfair
> ---
>
> Key: LUCENE-1809
> URL: https://issues.apache.org/jira/browse/LUCENE-1809
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/benchmark
>Affects Versions: 2.9
>Reporter: Koji Sekiguchi
>Assignee: Michael McCandless
>Priority: Trivial
> Fix For: 2.9
>
> Attachments: LUCENE-1809.patch, LUCENE-1809.patch
>
>
> highlight-vs-vector-highlight.alg uses EnwikiQueryMaker which makes 
> SpanQueries, but FastVectorHighlighter simply ignores SpanQueries.


[jira] Created: (LUCENE-1811) TestIndexReaderReopen nightly build failure

2009-08-15 Thread Michael McCandless (JIRA)

TestIndexReaderReopen nightly build failure
---

 Key: LUCENE-1811
 URL: https://issues.apache.org/jira/browse/LUCENE-1811
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.9
Reporter: Michael McCandless
Priority: Minor
 Fix For: 2.9


An interesting failure in last night's build 
(http://hudson.zones.apache.org/hudson/job/Lucene-trunk/920).

I think the root cause was the AIOOB exception... all the "lock obtain timed 
out" exceptions look like they cascaded.

{code}
[junit] Testsuite: org.apache.lucene.index.TestIndexReaderReopen
[junit] Lock obtain timed out: 
org.apache.lucene.store.singleinstancel...@6ac615: write.lock)
[junit] Tests run: 15, Failures: 1, Errors: 0, Time elapsed: 31.087 sec
[junit] 
[junit] - Standard Output ---
[junit] java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 
148
[junit] at org.apache.lucene.util.BitVector.getAndSet(BitVector.java:74)
[junit] at 
org.apache.lucene.index.SegmentReader.doDelete(SegmentReader.java:908)
[junit] at 
org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:1122)
[junit] at 
org.apache.lucene.index.DirectoryReader.doDelete(DirectoryReader.java:521)
[junit] at 
org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:1122)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen$8.modifyIndex(TestIndexReaderReopen.java:638)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen.refreshReader(TestIndexReaderReopen.java:840)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen.access$400(TestIndexReaderReopen.java:47)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen$9.run(TestIndexReaderReopen.java:681)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen$ReaderThread.run(TestIndexReaderReopen.java:822)
[junit] org.apache.lucene.store.LockObtainFailedException: Lock obtain 
timed out: org.apache.lucene.store.singleinstancel...@88d319: write.lock
[junit] at org.apache.lucene.store.Lock.obtain(Lock.java:85)
[junit] at 
org.apache.lucene.index.DirectoryReader.acquireWriteLock(DirectoryReader.java:666)
[junit] at 
org.apache.lucene.index.IndexReader.setNorm(IndexReader.java:994)
[junit] at 
org.apache.lucene.index.IndexReader.setNorm(IndexReader.java:1020)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen$8.modifyIndex(TestIndexReaderReopen.java:634)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen.refreshReader(TestIndexReaderReopen.java:840)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen.access$400(TestIndexReaderReopen.java:47)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen$9.run(TestIndexReaderReopen.java:681)
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen$ReaderThread.run(TestIndexReaderReopen.java:822)
...
[junit] -  ---
[junit] Testcase: 
testThreadSafety(org.apache.lucene.index.TestIndexReaderReopen):  FAILED
[junit] Error occurred in thread Thread-36:
[junit] Lock obtain timed out: 
org.apache.lucene.store.singleinstancel...@6ac615: write.lock
[junit] junit.framework.AssertionFailedError: Error occurred in thread 
Thread-36:
[junit] Lock obtain timed out: 
org.apache.lucene.store.singleinstancel...@6ac615: write.lock
[junit] at 
org.apache.lucene.index.TestIndexReaderReopen.testThreadSafety(TestIndexReaderReopen.java:764)
[junit] 
[junit] 
{code}


Re: Build failed in Hudson: Lucene-trunk #920

2009-08-15 Thread Michael McCandless

This failure looks real.  We hit a spooky AIOOBE in
TestIndexReaderReopen.testThreadSafety.  I've opened
https://issues.apache.org/jira/browse/LUCENE-1811 to track it.

Mike

On Fri, Aug 14, 2009 at 11:16 PM, Apache Hudson
Server wrote:
> See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/920/changes
>
> Changes:
>
> [uschindler] LUCENE-1801: All Tokenizers/TokenStreams that are source of 
> tokens call AttributeSource.clearAttributes() first. Made Token.clear() 
> consistent to AttributeImpl (clear everything)
>
> [gsingers] LUCENE-1790: pass in position information for scoring
>
> [ehatcher] LUCENE-1806: add args to test macro (Jason Rutherglen via ehatcher)
>
> [mikemccand] LUCENE-1807: allow passing the Map of field name -> analyzer to 
> PerFieldAnalyzerWrapper
>
> --
> [...truncated 16851 lines...]
>    [junit] Testsuite: org.apache.lucene.index.store.TestRAMDirectory
>    [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 3.36 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.queryParser.TestMultiAnalyzer
>    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.41 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.queryParser.TestMultiFieldQueryParser
>    [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 2.281 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.queryParser.TestQueryParser
>    [junit] Tests run: 26, Failures: 0, Errors: 0, Time elapsed: 1.582 sec
>    [junit]
>    [junit] - Standard Output ---
>    [junit] Result: (fieldX:x fieldy:)^2.0
>    [junit] -  ---
>    [junit] Testsuite: org.apache.lucene.search.TestBoolean2
>    [junit] Tests run: 11, Failures: 0, Errors: 0, Time elapsed: 12.37 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestBooleanMinShouldMatch
>    [junit] Tests run: 15, Failures: 0, Errors: 0, Time elapsed: 12.08 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestBooleanOr
>    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 2.15 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestBooleanPrefixQuery
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.572 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestBooleanQuery
>    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.339 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestBooleanScorer
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.773 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestCachingWrapperFilter
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.503 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestComplexExplanations
>    [junit] Tests run: 22, Failures: 0, Errors: 0, Time elapsed: 3.39 sec
>    [junit]
>    [junit] Testsuite: 
> org.apache.lucene.search.TestComplexExplanationsOfNonMatches
>    [junit] Tests run: 22, Failures: 0, Errors: 0, Time elapsed: 0.895 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestConstantScoreRangeQuery
>    [junit] Tests run: 11, Failures: 0, Errors: 0, Time elapsed: 10.066 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestCustomSearcherSort
>    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 2.433 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestDateFilter
>    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.038 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestDateSort
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.841 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestDisjunctionMaxQuery
>    [junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 2.333 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestDocBoost
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.626 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestExplanations
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.666 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestExtendedFieldCache
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.439 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestFilteredQuery
>    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 2.108 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestFilteredSearch
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.637 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestFuzzyQuery
>    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.924 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestMatchAllDocsQuery
>    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.946 sec
>    [junit]
>

[jira] Commented: (LUCENE-1792) new QueryParser fails to set AUTO REWRITE for multi-term queries

2009-08-15 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743668#action_12743668
 ] 

Michael McCandless commented on LUCENE-1792:


On quick glance the patch looks good, but I'm not going to have enough time to 
look more thoroughly!

I think you used "svn move" to rename PrefixWildcardQueryNodeProcessor -> 
WildcardQueryNodeProcessor?  (because "patch" fails to apply the changes).

> new QueryParser fails to set AUTO REWRITE for multi-term queries
> 
>
> Key: LUCENE-1792
> URL: https://issues.apache.org/jira/browse/LUCENE-1792
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: Michael McCandless
>Assignee: Michael Busch
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1792.patch, LUCENE-1792.patch, 
> removal_of_wildcard_and_prefix_detection_from_the_syntaxparser.patch
>
>
> The old QueryParser defaults to constant score rewrite for 
> Prefix,Fuzzy,Wildcard,TermRangeQuery, but the new one seems not to.


[jira] Commented: (LUCENE-1811) TestIndexReaderReopen nightly build failure

2009-08-15 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743715#action_12743715
 ] 

Michael McCandless commented on LUCENE-1811:


I believe this is just a thread-safety bug in the test.  It's deleting by a 
fixed docID, but, depending on how threads are scheduled, that docID may be 
invalid.  I'll commit a simple fix shortly...
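
A minimal sketch of the failure mode described above, not the actual test code or the committed fix. ToyReader, deleteUnchecked and deleteIfValid are hypothetical names; a plain boolean array stands in for the reader's deleted-docs bits. It only illustrates why deleting a fixed docID can go out of range when the document count is smaller than assumed, and one way a caller could guard against it.

```java
// Hypothetical stand-in for a reader's deleted-docs bits; this is NOT Lucene code.
final class ToyReader {
    private final boolean[] deleted;

    ToyReader(int maxDoc) {
        this.deleted = new boolean[maxDoc];
    }

    int maxDoc() {
        return deleted.length;
    }

    // Unchecked delete: throws ArrayIndexOutOfBoundsException when docID >= maxDoc,
    // which is the kind of exception seen in the build log.
    void deleteUnchecked(int docID) {
        deleted[docID] = true;
    }

    // Guarded delete: returns false instead of throwing when docID is out of range.
    boolean deleteIfValid(int docID) {
        if (docID < 0 || docID >= deleted.length) {
            return false;
        }
        deleted[docID] = true;
        return true;
    }

    boolean isDeleted(int docID) {
        return deleted[docID];
    }

    public static void main(String[] args) {
        // Assume another thread left this reader with only 100 documents.
        ToyReader reader = new ToyReader(100);
        System.out.println("delete 148 accepted: " + reader.deleteIfValid(148));
        System.out.println("delete 42 accepted: " + reader.deleteIfValid(42));
    }
}
```

A test that picks the docID relative to maxDoc(), or that checks the range first as above, does not depend on how the threads happen to be scheduled.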

> TestIndexReaderReopen nightly build failure
> ---
>
> Key: LUCENE-1811
> URL: https://issues.apache.org/jira/browse/LUCENE-1811
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Affects Versions: 2.9
>Reporter: Michael McCandless
>Priority: Minor
> Fix For: 2.9
>
>
> An interesting failure in last night's build 
> (http://hudson.zones.apache.org/hudson/job/Lucene-trunk/920).
> I think the root cause was the AIOOB exception... all the "lock obtain timed 
> out" exceptions look like they cascaded.


[jira] Resolved: (LUCENE-1811) TestIndexReaderReopen nightly build failure

2009-08-15 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1811.


Resolution: Fixed

> TestIndexReaderReopen nightly build failure
> ---
>
> Key: LUCENE-1811
> URL: https://issues.apache.org/jira/browse/LUCENE-1811
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Affects Versions: 2.9
>Reporter: Michael McCandless
>Priority: Minor
> Fix For: 2.9
>
>
> An interesting failure in last night's build 
> (http://hudson.zones.apache.org/hudson/job/Lucene-trunk/920).
> I think the root cause was the AIOOB exception... all the "lock obtain timed 
> out" exceptions look like they cascaded.


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743721#action_12743721
 ] 

Yonik Seeley commented on LUCENE-1794:
--

Patch looks good - do you plan on committing soon Robert?

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Created: (LUCENE-1812) Static index pruning by in-document term frequency (Carmel pruning)

2009-08-15 Thread Andrzej Bialecki (JIRA)

Static index pruning by in-document term frequency (Carmel pruning)
---

 Key: LUCENE-1812
 URL: https://issues.apache.org/jira/browse/LUCENE-1812
 Project: Lucene - Java
  Issue Type: New Feature
  Components: contrib/*
Affects Versions: 2.9
Reporter: Andrzej Bialecki 


This module provides tools to produce a subset of input indexes by removing 
postings data for those terms where their in-document frequency is below a 
specified threshold. The net effect of this processing is a much smaller index 
that for common types of queries returns nearly identical top-N results as 
compared with the original index, but with increased performance. 
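
The pruning rule in the paragraph above can be sketched as follows. This is an illustration under stated assumptions, not the code in the attached patch: ToyPruner and its nested-map postings representation (term -> docId -> in-document frequency) are hypothetical and far simpler than a real index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of frequency-threshold pruning; NOT the patch's code.
final class ToyPruner {

    // postings maps term -> (docId -> in-document term frequency).
    // Postings whose frequency is below the threshold are dropped;
    // terms left with no postings are dropped entirely.
    static Map<String, Map<Integer, Integer>> prune(
            Map<String, Map<Integer, Integer>> postings, int threshold) {
        Map<String, Map<Integer, Integer>> pruned = new HashMap<>();
        for (Map.Entry<String, Map<Integer, Integer>> term : postings.entrySet()) {
            Map<Integer, Integer> kept = new HashMap<>();
            for (Map.Entry<Integer, Integer> posting : term.getValue().entrySet()) {
                if (posting.getValue() >= threshold) {
                    kept.put(posting.getKey(), posting.getValue());
                }
            }
            if (!kept.isEmpty()) {
                pruned.put(term.getKey(), kept);
            }
        }
        return pruned;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Integer>> postings = new HashMap<>();
        Map<Integer, Integer> lucene = new HashMap<>();
        lucene.put(0, 3);
        lucene.put(1, 1);
        postings.put("lucene", lucene);
        Map<Integer, Integer> index = new HashMap<>();
        index.put(0, 1);
        postings.put("index", index);
        System.out.println(prune(postings, 2));
    }
}
```

Because every stored posting has a frequency of at least 1, a threshold of 1 removes nothing in this sketch.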

Optionally, stored values and term vectors can also be removed. This 
functionality is largely independent, so it can be used without term pruning 
(when term freq. threshold is set to 1).

As the threshold value increases, the total size of the index decreases, search 
performance increases, and recall decreases (i.e. search quality deteriorates). 
NOTE: especially phrase recall deteriorates significantly at higher threshold 
values. 

Primary purpose of this class is to produce small first-tier indexes that fit 
completely in RAM, and store these indexes using 
IndexWriter.addIndexes(IndexReader[]). Usually the performance of this class 
will not be sufficient to use the resulting index view for on-the-fly pruning 
and searching. 

NOTE: If the input index is optimized (i.e. doesn't contain deletions) then the 
index produced via IndexWriter.addIndexes(IndexReader[]) will preserve internal 
document id-s so that they are in sync with the original index. This means that 
all other auxiliary information not necessary for first-tier processing, such 
as some stored fields, can also be removed, to be quickly retrieved on-demand 
from the original index using the same internal document id. 

Threshold values can be specified globally (for terms in all fields) using 
defaultThreshold parameter, and can be overridden using per-field or per-term 
values supplied in a thresholds map. Keys in this map are either field names, 
or terms in field:text format. The precedence of these values is the following: 
first a per-term threshold is used if present, then per-field threshold if 
present, and finally the default threshold.
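
The precedence described above can be sketched like this. ThresholdLookup and thresholdFor are hypothetical names chosen for illustration; only the defaultThreshold parameter, the thresholds map, and the field:text key format come from the description, and the patch's real classes may look different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the lookup order; NOT the patch's API.
final class ThresholdLookup {
    private final int defaultThreshold;
    // Keys are either a field name ("body") or a term in field:text form ("body:lucene").
    private final Map<String, Integer> thresholds;

    ThresholdLookup(int defaultThreshold, Map<String, Integer> thresholds) {
        this.defaultThreshold = defaultThreshold;
        this.thresholds = new HashMap<>(thresholds);
    }

    // Per-term value first, then per-field value, then the default.
    int thresholdFor(String field, String text) {
        Integer perTerm = thresholds.get(field + ":" + text);
        if (perTerm != null) {
            return perTerm;
        }
        Integer perField = thresholds.get(field);
        if (perField != null) {
            return perField;
        }
        return defaultThreshold;
    }

    public static void main(String[] args) {
        Map<String, Integer> thresholds = new HashMap<>();
        thresholds.put("body", 2);
        thresholds.put("body:lucene", 5);
        ThresholdLookup lookup = new ThresholdLookup(1, thresholds);
        System.out.println(lookup.thresholdFor("body", "lucene"));
        System.out.println(lookup.thresholdFor("body", "index"));
        System.out.println(lookup.thresholdFor("title", "lucene"));
    }
}
```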

A command-line tool (PruningTool) is provided for convenience. At this moment 
it doesn't support all functionality available through API.


[jira] Updated: (LUCENE-1812) Static index pruning by in-document term frequency (Carmel pruning)

2009-08-15 Thread Andrzej Bialecki (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated LUCENE-1812:
--

Attachment: pruning.patch

Patch relative to the current trunk.

> Static index pruning by in-document term frequency (Carmel pruning)
> ---
>
> Key: LUCENE-1812
> URL: https://issues.apache.org/jira/browse/LUCENE-1812
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: Andrzej Bialecki 
> Attachments: pruning.patch
>
>
> This module provides tools to produce a subset of input indexes by removing 
> postings data for those terms where their in-document frequency is below a 
> specified threshold. The net effect of this processing is a much smaller 
> index that for common types of queries returns nearly identical top-N results 
> as compared with the original index, but with increased performance. 
> Optionally, stored values and term vectors can also be removed. This 
> functionality is largely independent, so it can be used without term pruning 
> (when term freq. threshold is set to 1).
> As the threshold value increases, the total size of the index decreases, 
> search performance increases, and recall decreases (i.e. search quality 
> deteriorates). NOTE: especially phrase recall deteriorates significantly at 
> higher threshold values. 
> Primary purpose of this class is to produce small first-tier indexes that fit 
> completely in RAM, and store these indexes using 
> IndexWriter.addIndexes(IndexReader[]). Usually the performance of this class 
> will not be sufficient to use the resulting index view for on-the-fly pruning 
> and searching. 
> NOTE: If the input index is optimized (i.e. doesn't contain deletions) then 
> the index produced via IndexWriter.addIndexes(IndexReader[]) will preserve 
> internal document id-s so that they are in sync with the original index. This 
> means that all other auxiliary information not necessary for first-tier 
> processing, such as some stored fields, can also be removed, to be quickly 
> retrieved on-demand from the original index using the same internal document 
> id. 
> Threshold values can be specified globally (for terms in all fields) using 
> defaultThreshold parameter, and can be overridden using per-field or per-term 
> values supplied in a thresholds map. Keys in this map are either field names, 
> or terms in field:text format. The precedence of these values is the 
> following: first a per-term threshold is used if present, then per-field 
> threshold if present, and finally the default threshold.
> A command-line tool (PruningTool) is provided for convenience. At this moment 
> it doesn't support all functionality available through API.


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743727#action_12743727
 ] 

Robert Muir commented on LUCENE-1794:
-

Yonik, thanks for reviewing it. 
I wanted to wait a bit and see if Shai wanted to take a crack at 
ReusingAnalyzer, but we could do that as a separate issue and then refactor 
the code to use it?


> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743728#action_12743728
 ] 

Yonik Seeley commented on LUCENE-1794:
--

Yes, I think we should just commit this now - the most important part is that 
people can create their own reusable tokenstreams from Lucene's tokenizers and 
token filters.  Making an easier to use ReusingAnalyzer can be a separate issue.

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743730#action_12743730
 ] 

Robert Muir commented on LUCENE-1794:
-

Yonik, ok, I will look over the patch again, but I plan on committing this 
tonight or tomorrow if nothing comes up.

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Created: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens

2009-08-15 Thread Andrzej Bialecki (JIRA)

Add option to ReverseStringFilter to mark reversed tokens
-

 Key: LUCENE-1813
 URL: https://issues.apache.org/jira/browse/LUCENE-1813
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/analyzers
Affects Versions: 2.9
Reporter: Andrzej Bialecki 
 Attachments: reverseMark.patch

This patch implements additional functionality in the filter to "mark" reversed 
tokens with a special marker character (Unicode 0001). This is useful when 
indexing both straight and reversed tokens (e.g. to implement efficient leading 
wildcards search).
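
A minimal plain-Java sketch of the idea described above, assuming the marker is simply prepended to the reversed text. The class and method names are hypothetical and are not the ReverseStringFilter API; the point is only that a leading-wildcard pattern becomes a prefix lookup over the marked, reversed terms, and that the marker keeps those terms from colliding with ordinary ones.

```java
// Sketch only: a stand-in for the technique, not the actual filter from the patch.
final class ReverseMarkSketch {
    /** Marker prepended to reversed tokens; U+0001 as described in the issue. */
    static final char MARKER = '\u0001';

    /** Returns the token reversed, with the marker as its first character. */
    static String reverseWithMark(String token) {
        return MARKER + new StringBuilder(token).reverse().toString();
    }

    /**
     * Turns a leading-wildcard pattern such as "*ing" into the prefix to look up
     * among the marked reversed terms. Returns null unless the pattern has the
     * form "*suffix" with no further wildcard.
     */
    static String leadingWildcardToPrefix(String pattern) {
        if (pattern.length() < 2 || pattern.charAt(0) != '*' || pattern.indexOf('*', 1) >= 0) {
            return null;
        }
        return reverseWithMark(pattern.substring(1));
    }

    public static void main(String[] args) {
        String indexed = reverseWithMark("searching");
        String prefix = leadingWildcardToPrefix("*ing");
        // The reversed, marked term matches the prefix; the straight term does not.
        System.out.println(indexed.startsWith(prefix));
        System.out.println("searching".startsWith(prefix));
    }
}
```

Because the straight token never starts with the marker, a prefix scan for the reversed form cannot accidentally match unreversed terms in the same field.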


[jira] Updated: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens

2009-08-15 Thread Andrzej Bialecki (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated LUCENE-1813:
--

Attachment: reverseMark.patch

Patch and unit tests.

> Add option to ReverseStringFilter to mark reversed tokens
> -
>
> Key: LUCENE-1813
> URL: https://issues.apache.org/jira/browse/LUCENE-1813
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Affects Versions: 2.9
>Reporter: Andrzej Bialecki 
> Attachments: reverseMark.patch
>
>
> This patch implements additional functionality in the filter to "mark" 
> reversed tokens with a special marker character (Unicode 0001). This is 
> useful when indexing both straight and reversed tokens (e.g. to implement 
> efficient leading wildcards search).


[jira] Assigned: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens

2009-08-15 Thread Robert Muir (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reassigned LUCENE-1813:
---

Assignee: Robert Muir

> Add option to ReverseStringFilter to mark reversed tokens
> -
>
> Key: LUCENE-1813
> URL: https://issues.apache.org/jira/browse/LUCENE-1813
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Affects Versions: 2.9
>Reporter: Andrzej Bialecki 
>Assignee: Robert Muir
> Attachments: reverseMark.patch
>
>
> This patch implements additional functionality in the filter to "mark" 
> reversed tokens with a special marker character (Unicode 0001). This is 
> useful when indexing both straight and reversed tokens (e.g. to implement 
> efficient leading wildcards search).


[jira] Commented: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens

2009-08-15 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743737#action_12743737
 ] 

Robert Muir commented on LUCENE-1813:
-

the corresponding solr task (SOLR-1321) is marked as version 1.4

does anyone oppose putting this one in 2.9? 


> Add option to ReverseStringFilter to mark reversed tokens
> -
>
> Key: LUCENE-1813
> URL: https://issues.apache.org/jira/browse/LUCENE-1813
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Affects Versions: 2.9
>Reporter: Andrzej Bialecki 
>Assignee: Robert Muir
> Attachments: reverseMark.patch
>
>
> This patch implements additional functionality in the filter to "mark" 
> reversed tokens with a special marker character (Unicode 0001). This is 
> useful when indexing both straight and reversed tokens (e.g. to implement 
> efficient leading wildcards search).


[jira] Commented: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens

2009-08-15 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743740#action_12743740
 ] 

Robert Muir commented on LUCENE-1813:
-

andrzej, the reverse() methods are public, can you supply default impls 
(withMark=false) just in the case that someone is using them?

alternatively, maybe the reverse() methods could stay the same, and the marking 
could happen in incrementToken() ?


> Add option to ReverseStringFilter to mark reversed tokens
> -
>
> Key: LUCENE-1813
> URL: https://issues.apache.org/jira/browse/LUCENE-1813
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Affects Versions: 2.9
>Reporter: Andrzej Bialecki 
>Assignee: Robert Muir
> Attachments: reverseMark.patch
>
>
> This patch implements additional functionality in the filter to "mark" 
> reversed tokens with a special marker character (Unicode 0001). This is 
> useful when indexing both straight and reversed tokens (e.g. to implement 
> efficient leading wildcards search).


[jira] Commented: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens

2009-08-15 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743743#action_12743743
 ] 

Andrzej Bialecki  commented on LUCENE-1813:
---

Either way is fine with me. To preserve the public API I think it's better to 
move this marking logic to incrementToken(). I'll prepare an updated patch.
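
A rough sketch of that option, using a hypothetical class rather than the real filter: the public reverse() helper keeps its existing contract, and the marker is added only in the per-token step, which stands in for incrementToken(). The withMark flag name comes from the comment above; everything else is an assumption made for illustration.

```java
// Sketch only: illustrates keeping a public helper unchanged while adding an option.
final class MarkingReverser {
    static final char MARKER = '\u0001'; // marker value taken from the issue text
    private final boolean withMark;

    MarkingReverser(boolean withMark) {
        this.withMark = withMark;
    }

    /** Public helper with its original behaviour: plain reversal, no marker. */
    public static String reverse(String input) {
        return new StringBuilder(input).reverse().toString();
    }

    /** Stand-in for the per-token step (incrementToken() in a real filter). */
    public String process(String token) {
        String reversed = reverse(token);
        return withMark ? MARKER + reversed : reversed;
    }
}
```

Existing callers of reverse() see no change, while new callers opt in through the constructor flag.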

> Add option to ReverseStringFilter to mark reversed tokens
> -
>
> Key: LUCENE-1813
> URL: https://issues.apache.org/jira/browse/LUCENE-1813
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Affects Versions: 2.9
>Reporter: Andrzej Bialecki 
>Assignee: Robert Muir
> Attachments: reverseMark.patch
>
>
> This patch implements additional functionality in the filter to "mark" 
> reversed tokens with a special marker character (Unicode 0001). This is 
> useful when indexing both straight and reversed tokens (e.g. to implement 
> efficient leading wildcards search).


[jira] Commented: (LUCENE-1790) Add Boosting Function Term Query and Some Payload Query refactorings

2009-08-15 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743747#action_12743747
 ] 

Mark Miller commented on LUCENE-1790:
-

BoostingFunctionTermQuery implements equals but not hashCode, which is 
important for a Query class, I think.

> Add Boosting Function Term Query and Some Payload Query refactorings
> 
>
> Key: LUCENE-1790
> URL: https://issues.apache.org/jira/browse/LUCENE-1790
> Project: Lucene - Java
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1790-position.patch, LUCENE-1790.patch, 
> LUCENE-1790.patch, LUCENE-1790.patch
>
>
> Similar to the BoostingTermQuery, the BoostingFunctionTermQuery is a 
> SpanTermQuery, but the difference is the payload score for a doc is not the 
> average of all the payloads, but applies a function to them instead.  
> BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
> AveragePayloadFunction applied to it.
> Also add marker interface to indicate PayloadQuery types.  Refactor 
> Similarity.scorePayload to also take in the doc id.


[jira] Updated: (LUCENE-1790) Add Boosting Function Term Query and Some Payload Query refactorings

2009-08-15 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-1790:


Attachment: LUCENE-1790.patch

remove some unused imports
added missing license header

Added hashCode to BoostingFunctionTermQuery

Added hashCode/equals to PayloadFunction classes

Added hashCode/equals to Query - really it should be handling equals/hashCode 
for boost rather than leaving that to subclasses (which are likely to forget 
it; you should check superclasses for these things anyway).

BoostingFunctionTermQuery is a subclass of SpanTermQuery, but both of them use 
a weak equals method (using instanceof), so while 
BoostingFunctionTermQuery.equals(SpanTermQuery) should return the same result 
as SpanTermQuery.equals(BoostingFunctionTermQuery), it doesn't.

Added new hashCode/equals for both classes that work properly.

Added a couple tests for these fixes
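
The asymmetry described above can be reproduced with two small stand-in classes. This is a sketch under stated assumptions: TermQ and BoostQ are hypothetical names, not the Lucene classes, and the getClass() comparison in the "strict" pair is one common way to restore symmetry, not necessarily what the attached patch does.

```java
import java.util.Objects;

// Weak pair: instanceof-based equals in both the parent and the subclass.
class TermQ {
    final String term;
    TermQ(String term) { this.term = term; }

    @Override public boolean equals(Object o) {
        // Accepts any subclass instance, including BoostQ.
        return o instanceof TermQ && term.equals(((TermQ) o).term);
    }
    @Override public int hashCode() { return term.hashCode(); }
}

class BoostQ extends TermQ {
    final String function;
    BoostQ(String term, String function) { super(term); this.function = function; }

    @Override public boolean equals(Object o) {
        // Rejects a plain TermQ, so a.equals(b) and b.equals(a) can disagree.
        return o instanceof BoostQ && super.equals(o)
            && function.equals(((BoostQ) o).function);
    }
    @Override public int hashCode() { return Objects.hash(term, function); }
}

// Strict pair: comparing runtime classes makes equals symmetric again.
class StrictTermQ {
    final String term;
    StrictTermQ(String term) { this.term = term; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        return term.equals(((StrictTermQ) o).term);
    }
    @Override public int hashCode() { return term.hashCode(); }
}

class StrictBoostQ extends StrictTermQ {
    final String function;
    StrictBoostQ(String term, String function) { super(term); this.function = function; }

    @Override public boolean equals(Object o) {
        // super.equals already guarantees that o is a StrictBoostQ.
        return super.equals(o) && function.equals(((StrictBoostQ) o).function);
    }
    @Override public int hashCode() { return Objects.hash(term, function); }
}
```

With the weak pair, new TermQ("a").equals(new BoostQ("a", "avg")) is true while the reverse call is false; with the strict pair both directions return false.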

> Add Boosting Function Term Query and Some Payload Query refactorings
> 
>
> Key: LUCENE-1790
> URL: https://issues.apache.org/jira/browse/LUCENE-1790
> Project: Lucene - Java
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1790-position.patch, LUCENE-1790.patch, 
> LUCENE-1790.patch, LUCENE-1790.patch, LUCENE-1790.patch
>
>
> Similar to the BoostingTermQuery, the BoostingFunctionTermQuery is a 
> SpanTermQuery, but the difference is the payload score for a doc is not the 
> average of all the payloads, but applies a function to them instead.  
> BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
> AveragePayloadFunction applied to it.
> Also add marker interface to indicate PayloadQuery types.  Refactor 
> Similarity.scorePayload to also take in the doc id.


[jira] Commented: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743754#action_12743754
 ] 

Mark Miller commented on LUCENE-1808:
-

I haven't yet figured out how to do this without breaking back compat  - I think 
this was an issue before as well. I'd have to dig it up, but some user 
complained about a similar issue when QueryWeight was put in.

If you add createQueryWeight as a public method, then all of the Lucene classes 
have to be changed to call it - otherwise, if you override it in a user Query, 
it won't be called on that Query.

But anyone with an external Query class that overrode createWeight will not 
call createQueryWeight, and won't work correctly with classes that override it. 
I guess if we make it final it would close that loophole, but then that's a 
loss from createWeight where you could override, and is still a back compat 
break?
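
One way to picture the problem, as a simplified sketch: the classes below are hypothetical stand-ins that return strings instead of Weight objects, and they live in one package so that the protected call compiles. A composite written against the old extension point calls createWeight() on its child, so a child that overrides only the new public method is silently bypassed.

```java
// Sketch only: two extension points on one base class, with callers split between them.
class SketchQuery {
    /** Old extension point. */
    protected String createWeight() { return "default"; }

    /** New public entry point that delegates to the old one. */
    public String createQueryWeight() { return createWeight(); }
}

/** Written against the new API: overrides only the public method. */
class NewStyleQuery extends SketchQuery {
    @Override public String createQueryWeight() { return "new-style"; }
}

/** Written against the old API: wraps a child and calls the old method on it. */
class OldStyleWrapper extends SketchQuery {
    private final SketchQuery child;

    OldStyleWrapper(SketchQuery child) { this.child = child; }

    @Override protected String createWeight() {
        // NewStyleQuery's override is never consulted on this path.
        return "wrapped(" + child.createWeight() + ")";
    }
}
```

Calling new OldStyleWrapper(new NewStyleQuery()).createQueryWeight() yields "wrapped(default)" rather than "wrapped(new-style)", which is the kind of mismatch the comment above is worried about.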

> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Commented: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743755#action_12743755
 ] 

Mark Miller commented on LUCENE-1808:
-

1. make createWeight() public on Query (breaks back compat)

hmmm - I took that as fact, but is that true? Can't you open up visibility 
without breaking back compat? Time to look this stuff up again ...

> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Updated: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Shai Erera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-1794:
---

Attachment: LUCENE-1794-reusing-analyzer.patch

Apologies for the late post, I had a busy weekend. Attached patch includes 
ReusingAnalyzer, Streams in Analyzer and javadocs.

Robert, please have a look. I think extending it should be fairly 
straightforward and we can probably finish the integration in a couple of days. 
However if you discover it isn't the case, we can separate it into a different 
issue.

Also, I did not include a note in CHANGES. Once you're done merging it into the 
larger patch, I can help w/ the javadocs and CHANGES if required.

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Commented: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743760#action_12743760
 ] 

Shai Erera commented on LUCENE-1808:


bq. Can't you open up visibility without breaking back compat?

I don't see why this would break back-compat. I can always extend a class and 
make a package-private or protected method public. I cannot reduce visibility, 
but can always increase it.

About the issues w/ createQueryWeight, I think you're referring to the chain of 
comments that started here: 
https://issues.apache.org/jira/browse/LUCENE-1630?focusedCommentId=12723976&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12723976.
 Is that what you were talking about?

> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Commented: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743761#action_12743761
 ] 

Shai Erera commented on LUCENE-1808:


bq. I can always extend a class and make a package-private or protected method 
public. I cannot reduce visibility, but can always increase it.

Ohh ... after hitting Submit I understood why it would break back-compat: if I 
extend Query and override createWeight, leaving it 'protected', my class won't 
compile once we make it public, since I'd be reducing visibility.
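
Both halves of this exchange can be seen in a small sketch with hypothetical class names (not the Lucene classes). A subclass may widen a protected method to public, but an existing subclass that keeps its override protected stops compiling, at least at the source level, once the base method itself becomes public.

```java
// Sketch only: string return values stand in for Weight objects.
class BaseQuery {
    /** Current state: a protected extension point. */
    protected String createWeight() { return "base"; }
}

class UserQuery extends BaseQuery {
    // Legal today: same visibility as the parent method.
    @Override protected String createWeight() { return "user"; }
}

class WideningQuery extends BaseQuery {
    // Also legal: an override may widen protected to public.
    @Override public String createWeight() { return "widened"; }
}

// If BaseQuery.createWeight() were changed to public, UserQuery above would be
// rejected by the compiler, because an overriding method may not assign weaker
// access privileges than the method it overrides.
```

So widening visibility is always safe inside a subclass, but widening it in the base class is a source-incompatible change for every subclass that overrides the method with the old modifier.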

> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Commented: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743762#action_12743762
 ] 

Mark Miller commented on LUCENE-1808:
-

Ahh - nice catch. I'm not sure what to do here then...

The previous possible break (I didn't actually look into it so I dunno) was 
referenced here:

http://search.lucidimagination.com/search/document/41004a9436799675/spanquery_and_boostingtermquery_oddities

> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743763#action_12743763
 ] 

Yonik Seeley commented on LUCENE-1794:
--

Perhaps the Streams class should be part of ReusingAnalyzer and not Analyzer?  
It's a specific implementation of a reusable token stream, not part of the 
Analyzer interface.

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Updated: (LUCENE-1813) Add option to ReverseStringFilter to mark reversed tokens

2009-08-15 Thread Andrzej Bialecki (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated LUCENE-1813:
--

Attachment: reverseMark-2.patch

Updated patch that moves the marking logic to incrementToken().

> Add option to ReverseStringFilter to mark reversed tokens
> -
>
> Key: LUCENE-1813
> URL: https://issues.apache.org/jira/browse/LUCENE-1813
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Affects Versions: 2.9
>Reporter: Andrzej Bialecki 
>Assignee: Robert Muir
> Attachments: reverseMark-2.patch, reverseMark.patch
>
>
> This patch implements additional functionality in the filter to "mark" 
> reversed tokens with a special marker character (Unicode 0001). This is 
> useful when indexing both straight and reversed tokens (e.g. to implement 
> efficient leading wildcards search).


[jira] Commented: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743767#action_12743767
 ] 

Shai Erera commented on LUCENE-1808:


When I changed createQueryWeight from protected to public, it was because we 
introduced it in 2.9 only, so it was possible. Perhaps we should deprecate 
createWeight, and add back createQueryWeight as public?

> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743770#action_12743770
 ] 

Shai Erera commented on LUCENE-1794:


Well ... it's true and false at the same time. On one hand, I think Analyzer 
should impl reusableTokenStream just like ReusingAnalyzer, but we can't do that 
because of back-compat. On the other hand, Streams does belong to 
ReusingAnalyzer because it makes use of it.

What I thought was that maybe someone would want to make use of Streams w/o 
extending Analyzer. And ... we may want to constrain setPreviousTokenStream to 
Streams, or TokenStream or a generic type of thing, to avoid casting and be 
more type-safe.

I wonder if we'll stay w/ Analyzer.reusableTS as it is forever, or will we 
break it one day to be like ReusingAnalyzer (and by that deprecate 
ReusingAnalyzer?).

I guess that if we think for the long term that ReusingAnalyzer will stay, and 
hence most Analyzers will actually be ReusingAnalyzer extensions, then I'm ok w/ 
moving Streams into ReusingAnalyzer. But keeping it in Analyzer will allow us 
in the future to constrain prevTokenStream to be of that type and not a generic 
Object.

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Commented: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743771#action_12743771
 ] 

Mark Miller commented on LUCENE-1808:
-

Don't you have the above problem though:
{quote}
If you add createQueryWeight as a public method, then all of the Lucene classes 
have to be changed to call it - otherwise, if you override it in a user Query, 
it won't be called on that Query.

But anyone with an external Query class that calls {<-FIXED} createWeight will 
not call createQueryWeight, and won't work correctly with classes that override 
it. I guess if we make it final it would close that loophole, but then that's a 
loss from createWeight where you could override, and is still a back compat 
break?
{quote}
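
The dispatch problem described in the quote can be reproduced with a minimal sketch. The types below are hypothetical stand-ins written for this illustration, not the real Lucene Query and Weight classes, and the Searcher argument is dropped to keep it short. It assumes a user Query overrides only the new public method while some older caller still invokes the protected one directly.

```java
// Hypothetical stand-ins for illustration only; NOT the real Lucene classes.
class Weight {
    final String origin;
    Weight(String origin) { this.origin = origin; }
}

abstract class Query {
    // Existing extension point that subclasses may override today.
    protected Weight createWeight() {
        return new Weight("default");
    }

    // Proposed public wrapper: simply delegates to the protected method.
    public Weight createQueryWeight() {
        return createWeight();
    }
}

// A user query written against the new public method only.
class NewStyleQuery extends Query {
    @Override
    public Weight createQueryWeight() {
        return new Weight("custom");
    }
}

// Older code that still calls createWeight() directly. It compiles here
// because protected members are visible within the same package.
class LegacyCaller {
    static Weight weigh(Query q) {
        return q.createWeight();
    }
}

class WeightDispatchDemo {
    public static void main(String[] args) {
        Query q = new NewStyleQuery();
        System.out.println(q.createQueryWeight().origin); // prints "custom"
        System.out.println(LegacyCaller.weigh(q).origin); // prints "default"
    }
}
```

The legacy caller silently bypasses the override. Marking createQueryWeight final would remove that mismatch, but it would also remove the ability to override, which is the trade-off raised in the quote.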


> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Issue Comment Edited: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743771#action_12743771
 ] 

Mark Miller edited comment on LUCENE-1808 at 8/15/09 1:43 PM:
--

Don't you have the above problem though:
{quote}
If you add createQueryWeight as a public method, then all of the Lucene classes 
have to be changed to call it - otherwise, if you override it in a user Query, 
it won't be called on that Query.

But anyone with an external Query class that calls [<-FIXED] createWeight will 
not call createQueryWeight, and won't work correctly with classes that override 
it. I guess if we make it final it would close that loophole, but then that's a 
loss from createWeight where you could override, and is still a back compat 
break?
{quote}


  was (Author: markrmil...@gmail.com):
Don't you have the above problem though:
{quote}
If you add createQueryWeight as a public method, then all of the Lucene classes 
have to be changed to call it - otherwise, if you override it in a user Query, 
it won't be called on that Query.

But anyone with an external Query class that calls {<-FIXED} createWeight will 
not call createQueryWeight, and won't work correctly with classes that override 
it. I guess if we make it final it would close that loophole, but then that's a 
loss from createWeight where you could override, and is still a back compat 
break?
{quote}

  
> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743774#action_12743774
 ] 

Robert Muir commented on LUCENE-1794:
-

Shai, I will take a look at your patch as soon as I am at a real computer. 
Thanks in advance for your work. Maybe we should put it on another issue, though, 
just to keep the scope of this one reasonably contained.

{quote}
And ... we may want to constrain setPreviousTokenStream to Streams, or 
TokenStream or a generic type of thing, to avoid casting and be more type-safe.
{quote}

See QueryAutoStopWordAnalyzer in my patch for a counter-example to this. In 
this case, it is a Set, because it depends on the field.

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Created: (LUCENE-1814) Some Lucene tests try and use a Junit Assert in new threads

2009-08-15 Thread Mark Miller (JIRA)

Some Lucene tests try and use a Junit Assert in new threads
---

 Key: LUCENE-1814
 URL: https://issues.apache.org/jira/browse/LUCENE-1814
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Mark Miller
Priority: Minor


There are a few cases in Lucene tests where JUnit Asserts are used inside a new 
thread's run method. This won't work, because JUnit throws an exception when a 
call to Assert fails: that will kill the thread, but the exception will not 
propagate to JUnit. So unless a failure is caused later by the thread 
termination, the Asserts are ineffective.

TestThreadSafe
TestStressIndexing2
TestStringIntern
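
One common way around this, shown here only as a sketch and not as the fix that was applied to these tests, is to capture any Throwable raised inside the worker thread and rethrow it from the thread that runs the test. The helper name runAndPropagate is hypothetical, and the sketch uses plain AssertionError and modern Java syntax rather than JUnit itself.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadAssertDemo {
    // Runs the task in a new thread and rethrows, in the calling thread,
    // any Throwable (including AssertionError) that the task raised.
    static void runAndPropagate(Runnable task) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            try {
                task.run();
            } catch (Throwable t) {
                failure.set(t); // remember it instead of letting the thread die silently
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        Throwable t = failure.get();
        if (t != null) {
            throw new AssertionError("failure in worker thread", t);
        }
    }

    public static void main(String[] args) {
        try {
            runAndPropagate(() -> {
                throw new AssertionError("expected 1 but was 2");
            });
            System.out.println("no failure seen");
        } catch (AssertionError e) {
            System.out.println("caught: " + e.getCause().getMessage());
        }
    }
}
```

Because the error is rethrown on the calling thread, a test runner that invoked runAndPropagate would see the failure instead of a quietly terminated worker.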


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743776#action_12743776
 ] 

Yonik Seeley commented on LUCENE-1794:
--

In general, we should strive to treat our base abstract classes like 
interfaces, with the ability to provide default implementations to avoid back 
compatibility breaks (while avoiding adding members or non-overridable 
methods).  One could make the case that the CloseableThreadLocal should not be 
in Analyzer either, but it's been there long enough now that it would break back 
compat to move it.

bq. What I thought was that maybe someone would want to make use of Streams w/o 
extending Analyzer.

They still can - ReusingAnalyzer.Streams.

bq. But keeping it in Analyzer will allow us in the future to constrain 
prevTokenStream to be of that type and not a generic Object.

Doesn't seem like we should force all tokenstreams to be reusable, or constrain 
the exact form of how a reusable token stream is obtained.
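
The "abstract class treated like an interface, plus a default implementation" idea can be sketched as follows. All types here are hypothetical and heavily simplified (a stream is just a holder for its text); they are not the real Lucene Analyzer, TokenStream, or the ReusingAnalyzer from the patch, and the sketch is not thread-safe.

```java
// Hypothetical, simplified types; not the real Lucene classes.
class SketchStream {
    String text;
    SketchStream(String text) { this.text = text; }
}

abstract class SketchAnalyzer {
    // Original abstract method that every existing subclass implements.
    abstract SketchStream tokenStream(String text);

    // Method added later. The default delegates to tokenStream, so
    // subclasses written before it existed keep compiling and behaving
    // exactly as before.
    SketchStream reusableTokenStream(String text) {
        return tokenStream(text);
    }
}

// Written before reusableTokenStream existed; knows nothing about reuse.
class LegacyAnalyzer extends SketchAnalyzer {
    @Override
    SketchStream tokenStream(String text) {
        return new SketchStream(text);
    }
}

// Opts in to reuse by overriding the new method in a separate subclass,
// leaving the base class free of reuse-specific members.
class ReusingSketchAnalyzer extends LegacyAnalyzer {
    private SketchStream previous;

    @Override
    SketchStream reusableTokenStream(String text) {
        if (previous == null) {
            previous = tokenStream(text);
        } else {
            previous.text = text; // re-point the saved stream at the new input
        }
        return previous;
    }
}

class DefaultImplDemo {
    public static void main(String[] args) {
        SketchAnalyzer legacy = new LegacyAnalyzer();
        SketchAnalyzer reusing = new ReusingSketchAnalyzer();
        System.out.println(legacy.reusableTokenStream("a") == legacy.reusableTokenStream("b"));   // false
        System.out.println(reusing.reusableTokenStream("a") == reusing.reusableTokenStream("b")); // true
    }
}
```

Keeping the reuse state in the subclass is what leaves the base class free to stay interface-like; a real implementation would also need per-thread storage for the saved stream.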


> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743778#action_12743778
 ] 

Shai Erera commented on LUCENE-1794:


I guess you're both right. I thought that one day we'd cancel ReusingAnalyzer 
and pull it up to Analyzer, but it looks like it makes sense for ReusingAnalyzer 
to stay, and so we can move Streams to it.

Robert, if possible, I'd like to get this one in as part of this issue. The 
reason is that you already modified all Analyzers to impl reusableTokenStream. 
I'm afraid that if we do it in another issue, some Analyzers will be skipped 
over. If you want, I can apply this to your patch and post back an updated one 
tomorrow.

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Commented: (LUCENE-1808) make Query.createWeight public (or add back Query.createQueryWeight())

2009-08-15 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743780#action_12743780
 ] 

Shai Erera commented on LUCENE-1808:


I thought that's partly what we took care of here: 
https://issues.apache.org/jira/browse/LUCENE-1630?focusedCommentId=12723996&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12723996

True, if someone overrides createWeight (he ought to) and calls it specifically, 
createQueryWeight won't be called. But then, all of our code will call 
createQueryWeight. And if we deprecate createWeight, those who call it directly 
will need to move to createQueryWeight, so I think we should be fine?

I may not be thinking too clearly at this hour (1 AM), so if I misunderstood 
something, I'll read it again in the morning.

> make Query.createWeight public (or add back Query.createQueryWeight())
> --
>
> Key: LUCENE-1808
> URL: https://issues.apache.org/jira/browse/LUCENE-1808
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: Tim Smith
>Assignee: Mark Miller
>
> Now that the QueryWeight class has been removed, the public QueryWeight 
> createQueryWeight() method on Query was also removed
> i have cases where i want to create a weight for a sub query (outside of the 
> org.apache.lucene.search package) and i don't want the weight normalized 
> (think BooleanQuery outside of the o.a.l.search package)
> in order to do this, i have to create a static Utils class inside 
> o.a.l.search, pass in the Query and searcher, and have the static method call 
> the protected createWeight method
> this should not be necessary
> This could be fixed in one of 2 ways:
> 1. make createWeight() public on Query (breaks back compat)
> 2. add the following method:
> {code}
> public Weight createQueryWeight(Searcher searcher) throws IOException {
>   return createWeight(searcher);
> }
> {code}
> createWeight(Searcher) should then be deprecated in favor of the publicly 
> accessible method


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743786#action_12743786
 ] 

Mark Miller commented on LUCENE-1794:
-

To not break back compat, everything has got to work even if they don't yet 
move from the deprecated method.



> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743788#action_12743788
 ] 

Robert Muir commented on LUCENE-1794:
-

{quote}
Robert, if possible, I'd like to get this one in as part of this issue. The 
reason is that you already modified all Analyzers to impl reusableTokenStream. 
I'm afraid that if we do it in another issue, some Analyzers will be skipped 
over. If you want, I can apply this to your patch and post back an updated one 
tomorrow.
{quote}

Shai, this is a valid concern. But let's also not forget analyzers that already 
implement reusableTS that are not a part of this patch (yet should be changed 
to extend ReusingAnalyzer)... examples include collation/*, analyzers/fa, etc.

But even before this I think we should make sure everyone is happy with 
ReusingAnalyzer itself... this is the only reason I think it might merit 
another issue... this patch is already a little unwieldy because I let the 
scope creep to include reset(Reader) and reset() methods for tokenstreams that 
keep state...


> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743791#action_12743791
 ] 

Yonik Seeley commented on LUCENE-1794:
--

bq. But even before this I think we should make sure everyone is happy with 
ReusingAnalyzer itself... this is the only reason I think it might merit 
another issue

+1

The ReusingAnalyzer brings up other issues of protocol - right now consumers 
like lucene indexing call reset() on the stream, but I see the prototype 
ReusingAnalyzer also calling reset() on the stream.
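
To make the protocol question concrete, here is a small sketch in which both the analyzer and the consumer call reset() on the same stream. The types are hypothetical and simplified, not the real Lucene TokenStream or the prototype ReusingAnalyzer; the stream only counts how often it is reset.

```java
// Hypothetical, simplified types; not the real Lucene classes.
class CountingStream {
    int resets = 0;

    void reset() {
        resets++;
    }
}

class ResettingAnalyzer {
    private final CountingStream saved = new CountingStream();

    // Analyzer side: resets the saved stream before handing it out.
    CountingStream reusableTokenStream() {
        saved.reset();
        return saved;
    }
}

class ResetProtocolDemo {
    // Consumer side: also resets before consuming. Returns the total
    // number of resets the stream has seen so far.
    static int consume(ResettingAnalyzer analyzer) {
        CountingStream stream = analyzer.reusableTokenStream();
        stream.reset();
        return stream.resets;
    }

    public static void main(String[] args) {
        System.out.println(consume(new ResettingAnalyzer())); // prints 2
    }
}
```

Each use of the stream triggers two resets. That is harmless when reset() is cheap and idempotent, but it is wasted or surprising work otherwise, which is why the protocol needs to say which side is responsible.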

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.


[jira] Reopened: (LUCENE-1522) another highlighter

2009-08-15 Thread Koji Sekiguchi (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reopened LUCENE-1522:



There is a bug in BaseFragmentsBuilder. When the highlighting field is not 
stored, a StringIndexOutOfBoundsException is thrown. I'd like to reopen this 
issue so the fix can be included in 2.9. I'll post the fix soon.

> another highlighter
> ---
>
> Key: LUCENE-1522
> URL: https://issues.apache.org/jira/browse/LUCENE-1522
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/highlighter
>Reporter: Koji Sekiguchi
>Assignee: Michael McCandless
>Priority: Minor
> Fix For: 2.9
>
> Attachments: colored-tag-sample.png, 
> LUCENE-1522-multiValued-test.patch, LUCENE-1522.patch, LUCENE-1522.patch, 
> LUCENE-1522.patch, LUCENE-1522.patch, LUCENE-1522.patch, LUCENE-1522.patch, 
> LUCENE-1522.patch
>
>
> I've written this highlighter for my project to support bi-gram token stream 
> (general token stream (e.g. WhitespaceTokenizer) also supported. see test 
> code in patch). The idea was inherited from my previous project with my 
> colleague and LUCENE-644. This approach needs highlight fields to be 
> TermVector.WITH_POSITIONS_OFFSETS, but is fast and can support N-grams. This 
> depends on LUCENE-1448 to get refined term offsets.
> usage:
> {code:java}
> TopDocs docs = searcher.search( query, 10 );
> Highlighter h = new Highlighter();
> FieldQuery fq = h.getFieldQuery( query );
> for( ScoreDoc scoreDoc : docs.scoreDocs ){
>   // fieldName="content", fragCharSize=100, numFragments=3
>   String[] fragments = h.getBestFragments( fq, reader, scoreDoc.doc,
>       "content", 100, 3 );
>   if( fragments != null ){
>     for( String fragment : fragments )
>       System.out.println( fragment );
>   }
> }
> {code}
> features:
> - fast for large docs
> - supports not only whitespace-based token stream, but also "fixed size" 
> N-gram (e.g. (2,2), not (1,3)) (can solve LUCENE-1489)
> - supports PhraseQuery, phrase-unit highlighting with slops
> {noformat}
> q="w1 w2"
> w1 w2
> ---
> q="w1 w2"~1
> w1 w3 w2 w3 w1 w2
> {noformat}
> - highlight fields need to be TermVector.WITH_POSITIONS_OFFSETS
> - easy to apply patch due to independent package (contrib/highlighter2)
> - uses Java 1.5
> - uses query boost to score fragments (currently ignores idf, but it 
> should be possible)
> - pluggable FragListBuilder
> - pluggable FragmentsBuilder
> to do:
> - term positions can be unnecessary when phraseHighlight==false
> - collect performance numbers


[jira] Updated: (LUCENE-1522) another highlighter

2009-08-15 Thread Koji Sekiguchi (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated LUCENE-1522:
---

Attachment: LUCENE-1522-fix-SIOOBE.patch

The patch includes the fix and a test case.

> another highlighter
> ---
>
> Key: LUCENE-1522
> URL: https://issues.apache.org/jira/browse/LUCENE-1522
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/highlighter
>Reporter: Koji Sekiguchi
>Assignee: Michael McCandless
>Priority: Minor
> Fix For: 2.9
>
> Attachments: colored-tag-sample.png, LUCENE-1522-fix-SIOOBE.patch, 
> LUCENE-1522-multiValued-test.patch, LUCENE-1522.patch, LUCENE-1522.patch, 
> LUCENE-1522.patch, LUCENE-1522.patch, LUCENE-1522.patch, LUCENE-1522.patch, 
> LUCENE-1522.patch
>
>
> I've written this highlighter for my project to support bi-gram token stream 
> (general token stream (e.g. WhitespaceTokenizer) also supported. see test 
> code in patch). The idea was inherited from my previous project with my 
> colleague and LUCENE-644. This approach needs highlight fields to be 
> TermVector.WITH_POSITIONS_OFFSETS, but is fast and can support N-grams. This 
> depends on LUCENE-1448 to get refined term offsets.
> usage:
> {code:java}
> TopDocs docs = searcher.search( query, 10 );
> Highlighter h = new Highlighter();
> FieldQuery fq = h.getFieldQuery( query );
> for( ScoreDoc scoreDoc : docs.scoreDocs ){
>   // fieldName="content", fragCharSize=100, numFragments=3
>   String[] fragments = h.getBestFragments( fq, reader, scoreDoc.doc,
>       "content", 100, 3 );
>   if( fragments != null ){
>     for( String fragment : fragments )
>       System.out.println( fragment );
>   }
> }
> {code}
> features:
> - fast for large docs
> - supports not only whitespace-based token stream, but also "fixed size" 
> N-gram (e.g. (2,2), not (1,3)) (can solve LUCENE-1489)
> - supports PhraseQuery, phrase-unit highlighting with slops
> {noformat}
> q="w1 w2"
> w1 w2
> ---
> q="w1 w2"~1
> w1 w3 w2 w3 w1 w2
> {noformat}
> - highlight fields need to be TermVector.WITH_POSITIONS_OFFSETS
> - easy to apply patch due to independent package (contrib/highlighter2)
> - uses Java 1.5
> - uses query boost to score fragments (currently ignores idf, but it 
> should be possible)
> - pluggable FragListBuilder
> - pluggable FragmentsBuilder
> to do:
> - term positions can be unnecessary when phraseHighlight==false
> - collect performance numbers


Hudson build is back to normal: Lucene-trunk #921

2009-08-15 Thread Apache Hudson Server

See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/921/changes




[jira] Updated: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-15 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1791:
-

Attachment: LUCENE-1791.patch

I put the doc "ids" into a KEY field and refactored ItemizedFilter to be a 
trivial subclass of FieldCacheTermsFilter.

I also added more wrap permutations to address some of the possible edge cases 
Simon pointed out (good catch, Simon) but didn't introduce any randomization, for 
the reasons mentioned before: even with the change to not rely on consistent 
docIds in ItemizedFilter, we can't allow deletions before the wrapped 
searcher/reader because CheckHits does its magic based on docIds.

(Hmm... I suppose the wrap functions could return some metadata about what 
offset the old ids have in the new searcher/reader and CheckHits could use that 
... hmmm ... seems kludgy, so I'm not going to worry about it.)

I think we're good to go here unless anyone has any objections.

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them


[jira] Assigned: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-15 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reassigned LUCENE-1791:


Assignee: Hoss Man

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them


[jira] Commented: (LUCENE-1794) implement reusableTokenStream for all contrib analyzers

2009-08-15 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743816#action_12743816
 ] 

Shai Erera commented on LUCENE-1794:


bq. right now consumers like lucene indexing call reset() on the stream, but I 
see the prototype ReusingAnalyzer also calling reset() on the stream.

I don't think that's a new problem. I simply coded what I think most Analyzers 
that implement reusableTS already do. And if there are reusableTS impls that 
deliberately don't call reset(), then we shouldn't call it.

Therefore, I think we should change our code to not call reset(). I don't 
think any reusableTS impl skips reset() because it relies on the consumer to 
call it (nobody guarantees that anyway). We should simply note this in the 
reusableTS javadoc (e.g., something like "returns an already reset token 
stream"). I don't mind doing that in a separate issue if that's what you prefer.
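
For illustration only, here is a minimal sketch of the contract being proposed: the analyzer resets the cached stream before handing it out, so the consumer never calls reset() itself. All class names here (SketchTokenStream, SketchReusingAnalyzer, SketchConsumer) are hypothetical stand-ins written for this example. They are not Lucene's TokenStream or Analyzer classes and not the code in the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a token stream; NOT Lucene's TokenStream. */
class SketchTokenStream {
    private String[] tokens = new String[0];
    private int pos = 0;

    /** Points the stream at new input; does not rewind by itself. */
    void setInput(String text) {
        String trimmed = text.trim();
        tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Rewinds the stream so iteration starts from the first token. */
    void reset() {
        pos = 0;
    }

    /** Returns the next token, or null when the stream is exhausted. */
    String next() {
        return pos < tokens.length ? tokens[pos++] : null;
    }
}

/** Hypothetical analyzer that caches one stream per thread. */
class SketchReusingAnalyzer {
    private final ThreadLocal<SketchTokenStream> cached =
        ThreadLocal.withInitial(SketchTokenStream::new);

    /** Assumed contract: the returned stream has already been reset. */
    SketchTokenStream reusableTokenStream(String text) {
        SketchTokenStream ts = cached.get();
        ts.setInput(text);
        ts.reset(); // the analyzer resets, so the consumer does not have to
        return ts;
    }
}

/** Hypothetical consumer, standing in for indexing code. */
class SketchConsumer {
    /** Reads all tokens without calling reset(), relying on the contract. */
    static List<String> consume(SketchTokenStream ts) {
        List<String> out = new ArrayList<>();
        for (String t = ts.next(); t != null; t = ts.next()) {
            out.add(t);
        }
        return out;
    }
}
```

In this sketch, if the analyzer did not call reset(), the second use of the cached stream would start from a stale position. That is the failure the javadoc note is meant to rule out.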

> implement reusableTokenStream for all contrib analyzers
> ---
>
> Key: LUCENE-1794
> URL: https://issues.apache.org/jira/browse/LUCENE-1794
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1794-reusing-analyzer.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, LUCENE-1794.patch, 
> LUCENE-1794.patch
>
>
> most contrib analyzers do not have an impl for reusableTokenStream
> regardless of how expensive the back compat reflection is for indexing speed, 
> I think we should do this to mitigate any performance costs. hey, overall it 
> might even be an improvement!
> the back compat code for non-final analyzers is already in place so this is 
> easy money in my opinion.

