[jira] [Updated] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-18 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6466:
--
Attachment: LUCENE-6466.patch

Another patch.  This one is nicer, I think.  It uses the needsScores parameter 
to determine whether or not to build the SpanSimilarity for a search.
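
As a rough illustration of that gating (a sketch only: ScoringGateSketch, its nested Similarity interface and buildIfNeeded() are hypothetical names, not the classes in the patch), the similarity object can be created lazily so that nothing is built when needsScores is false:

```java
import java.util.function.Supplier;

// Hypothetical stand-in types; these are not the Lucene classes from the patch.
final class ScoringGateSketch {

  /** Minimal scoring abstraction used only for this illustration. */
  interface Similarity {
    float score(int freq);
  }

  /** Invokes the factory only when scores are needed; otherwise returns null. */
  static Similarity buildIfNeeded(boolean needsScores, Supplier<Similarity> factory) {
    return needsScores ? factory.get() : null;
  }

  public static void main(String[] args) {
    int[] built = {0};  // counts how many times the factory actually ran
    Supplier<Similarity> factory = () -> {
      built[0]++;
      return freq -> freq * 2.0f;
    };
    Similarity skipped = buildIfNeeded(false, factory);
    Similarity sim = buildIfNeeded(true, factory);
    System.out.println("skipped is null: " + (skipped == null));
    System.out.println("factory calls: " + built[0]);
    System.out.println("score(3): " + sim.score(3));
  }
}
```

The point of the design is that a filter-only search never pays for the statistics that a similarity would need.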

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to call it from SpanWeight.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-05-13 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371.patch

I've been playing around with various APIs for this, and I think this one works 
reasonably well.

Spans.isPayloadAvailable() and getPayload() are replaced with a collect() 
method that takes a SpanCollector.  If you want to get payloads from a Spans, 
you do the following:

{code:java}
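// Assumes spans is already positioned on a document.
PayloadSpanCollector collector = new PayloadSpanCollector();
while (spans.nextStartPosition() != NO_MORE_POSITIONS) {
  collector.reset();             // clear payloads from the previous match
  spans.collect(collector);      // gather payloads for the current match
  doSomethingWith(collector.getPayloads());
}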
{code}

The actual job of collecting information from postings lists is devolved to the 
collector itself (via SpanCollector.collectLeaf(), called from 
TermSpans.collect()).

The API is made slightly complicated by the need to buffer collected 
information in NearOrderedSpans, because the algorithm there moves child spans 
on eagerly when finding the smallest possible match, so by the time collect() 
is called we're out of position.  This is dealt with using a 
BufferedSpanCollector, with collectCandidate(Spans) and accept() methods.  The 
default (No-op) collector has a no-op implementation of this, which should get 
optimized away by HotSpot, meaning that we don't need to have separate 
implementations for collecting and non-collecting algorithms, and can do away 
with PayloadNearOrderedSpans.
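
To make the buffering idea concrete, here is a minimal sketch under stated assumptions: BufferedCollectSketch, BufferedPayloads and collectLastBefore() are hypothetical names, payloads are plain Strings, and the loop only imitates the "advance eagerly, then confirm" behaviour described above. It is not the code in the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these are not the classes from the LUCENE-6371 patch.
final class BufferedCollectSketch {

  /** Holds the most recent candidate until the caller confirms it. */
  static final class BufferedPayloads {
    private String candidate;  // latest unconfirmed payload, or null
    private final List<String> accepted = new ArrayList<>();

    /** Remember a payload while the child span is still at that position. */
    void collectCandidate(String payload) {
      candidate = payload;
    }

    /** Confirm the buffered candidate as part of the match. */
    void accept() {
      if (candidate != null) {
        accepted.add(candidate);
        candidate = null;
      }
    }

    List<String> payloads() {
      return accepted;
    }
  }

  /** Advances through all positions below limit and keeps only the last one's payload. */
  static List<String> collectLastBefore(int[] positions, String[] payloads, int limit) {
    BufferedPayloads buffer = new BufferedPayloads();
    for (int i = 0; i < positions.length && positions[i] < limit; i++) {
      buffer.collectCandidate(payloads[i]);  // overwritten if we move on again
    }
    buffer.accept();  // the iteration has moved on, but the buffer still holds the data
    return buffer.payloads();
  }

  public static void main(String[] args) {
    int[] positions = {1, 4, 7, 9};
    String[] payloads = {"a", "b", "c", "d"};
    System.out.println(collectLastBefore(positions, payloads, 8));
  }
}
```

A collector that ignores collectCandidate() and accept() entirely would play the role of the no-op implementation mentioned above.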

This patch also moves the PayloadCheck queries to the .payloads package, which 
tidies things up a bit.

All tests pass.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Priority: Minor
 Attachments: LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Resolved] (LUCENE-6371) Improve Spans payload collection

2015-05-19 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6371.
---
Resolution: Fixed
  Assignee: Alan Woodward

Thanks for the reviews, everyone!

I've marked the new classes and methods as lucene.experimental, rather than 
moving to the sandbox - if anyone feels strongly about that, maybe it could be 
done in a follow up issue.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-05-19 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Fix Version/s: 5.2

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.2

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Updated] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-19 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6466:
--
Attachment: LUCENE-6466.patch

Final patch.  I'll commit this tomorrow, absent any objections.

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to call it from SpanWeight.






[jira] [Commented] (LUCENE-6489) Move span payloads to sandbox

2015-05-19 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550336#comment-14550336
 ] 

Alan Woodward commented on LUCENE-6489:
---

The collectPayloads parameter in SpanNearQuery can be nuked now, I think, as 
collection depends on the SpanCollector passed in.

 Move span payloads to sandbox
 -

 Key: LUCENE-6489
 URL: https://issues.apache.org/jira/browse/LUCENE-6489
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir

 As mentioned on LUCENE-6371:
 {noformat}
 I've marked the new classes and methods as lucene.experimental, rather than 
 moving to the sandbox - if anyone feels strongly about that, maybe it could 
 be done in a follow up issue.
 {noformat}
 I feel strongly about this and will do the move.






[jira] [Updated] (SOLR-1387) Add more search options for filtering field facets.

2015-04-08 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated SOLR-1387:

Attachment: SOLR-1387-contains.patch

Patch moving the 'contains' method to SimpleFacets, and refactoring it a bit to 
just use Strings.  I'll commit this later today.
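
For readers unfamiliar with the idea, a String-only substring filter over facet terms could look roughly like the sketch below. FacetContainsSketch and both of its methods are hypothetical and written for illustration; this is not the SimpleFacets code from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; not the SimpleFacets code from the patch.
final class FacetContainsSketch {

  /** True if term contains substring; a null substring matches everything. */
  static boolean contains(String term, String substring, boolean ignoreCase) {
    if (substring == null) {
      return true;
    }
    if (ignoreCase) {
      return term.toLowerCase(Locale.ROOT).contains(substring.toLowerCase(Locale.ROOT));
    }
    return term.contains(substring);
  }

  /** Keeps only the facet terms that pass the contains check, preserving order. */
  static List<String> filter(List<String> terms, String substring, boolean ignoreCase) {
    List<String> kept = new ArrayList<>();
    for (String term : terms) {
      if (contains(term, substring, ignoreCase)) {
        kept.add(term);
      }
    }
    return kept;
  }
}
```

For example, filtering the terms Apple, grape and plum by "ap" keeps only grape when case matters, and keeps Apple and grape when case is ignored.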

 Add more search options for filtering field facets.
 ---

 Key: SOLR-1387
 URL: https://issues.apache.org/jira/browse/SOLR-1387
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Anil Khadka
Assignee: Alan Woodward
 Fix For: Trunk, 5.1

 Attachments: SOLR-1387-contains.patch, SOLR-1387.patch


 Currently for filtering the facets, we have to use prefix (which uses 
 String.startsWith() in Java). 
 We can add some parameters like
 * facet.iPrefix : this would act like case-insensitive search. (or ---  
 facet.prefix=a&facet.caseinsense=on)
 * facet.regex : this is pure regular expression search (which obviously would 
 be expensive if issued).
 Moreover, allowing multiple filters for the same field would be great, like
 facet.prefix=a OR facet.prefix=A ... something like this.
 All of the above concepts could be equally applicable to TermsComponent.






[jira] [Resolved] (SOLR-1387) Add more search options for filtering field facets.

2015-04-08 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved SOLR-1387.
-
Resolution: Fixed

 Add more search options for filtering field facets.
 ---

 Key: SOLR-1387
 URL: https://issues.apache.org/jira/browse/SOLR-1387
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Anil Khadka
Assignee: Alan Woodward
 Fix For: Trunk, 5.1

 Attachments: SOLR-1387-contains.patch, SOLR-1387-contains.patch, 
 SOLR-1387.patch


 Currently for filtering the facets, we have to use prefix (which uses 
 String.startsWith() in Java). 
 We can add some parameters like
 * facet.iPrefix : this would act like case-insensitive search. (or ---  
 facet.prefix=a&facet.caseinsense=on)
 * facet.regex : this is pure regular expression search (which obviously would 
 be expensive if issued).
 Moreover, allowing multiple filters for the same field would be great, like
 facet.prefix=a OR facet.prefix=A ... something like this.
 All of the above concepts could be equally applicable to TermsComponent.






[jira] [Updated] (SOLR-1387) Add more search options for filtering field facets.

2015-04-08 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated SOLR-1387:

Attachment: SOLR-1387-contains.patch

Oops, yes, that's exactly what I did.  Here's the correct version...

 Add more search options for filtering field facets.
 ---

 Key: SOLR-1387
 URL: https://issues.apache.org/jira/browse/SOLR-1387
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Anil Khadka
Assignee: Alan Woodward
 Fix For: Trunk, 5.1

 Attachments: SOLR-1387-contains.patch, SOLR-1387-contains.patch, 
 SOLR-1387.patch


 Currently for filtering the facets, we have to use prefix (which uses 
 String.startsWith() in Java). 
 We can add some parameters like
 * facet.iPrefix : this would act like case-insensitive search. (or ---  
 facet.prefix=a&facet.caseinsense=on)
 * facet.regex : this is pure regular expression search (which obviously would 
 be expensive if issued).
 Moreover, allowing multiple filters for the same field would be great, like
 facet.prefix=a OR facet.prefix=A ... something like this.
 All of the above concepts could be equally applicable to TermsComponent.






[jira] [Commented] (LUCENE-6412) Merge SpanTermQuery into TermQuery

2015-04-09 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486971#comment-14486971
 ] 

Alan Woodward commented on LUCENE-6412:
---

But this is pretty non-invasive, though?  It just makes SpanQueries easier to 
use - progress not perfection, etc, etc...

 Merge SpanTermQuery into TermQuery
 --

 Key: LUCENE-6412
 URL: https://issues.apache.org/jira/browse/LUCENE-6412
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: Trunk, 5.2
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6412.patch


 Having a separate SpanTermQuery doesn't actually gain us anything now, and 
 it's trivial enough to make TermQuery extend SpanQuery and copy the getSpans() 
 and getField() impls over from STQ.






[jira] [Updated] (LUCENE-6412) Merge SpanTermQuery into TermQuery

2015-04-08 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6412:
--
Attachment: LUCENE-6412.patch

Patch moving the functionality into TermQuery and deprecated SpanTermQuery.

The one wrinkle is that scores from plain TermQueries and SpanTermQueries are 
different, because STQ uses a SpanWeight/SpanScorer.  I've kept 
TestSpansAdvanced and TestSpansAdvanced2 using STQ rather than TermQuery so 
that the tests still pass, but maybe this doesn't matter that much?

 Merge SpanTermQuery into TermQuery
 --

 Key: LUCENE-6412
 URL: https://issues.apache.org/jira/browse/LUCENE-6412
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: Trunk, 5.2
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6412.patch


 Having a separate SpanTermQuery doesn't actually gain us anything now, and 
 it's trivial enough to make TermQuery extend SpanQuery and copy the getSpans() 
 and getField() impls over from STQ.






[jira] [Comment Edited] (LUCENE-6412) Merge SpanTermQuery into TermQuery

2015-04-08 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486108#comment-14486108
 ] 

Alan Woodward edited comment on LUCENE-6412 at 4/8/15 9:36 PM:
---

Patch moving the functionality into TermQuery and deprecating SpanTermQuery.

The one wrinkle is that scores from plain TermQueries and SpanTermQueries are 
different, because STQ uses a SpanWeight/SpanScorer.  I've kept 
TestSpansAdvanced and TestSpansAdvanced2 using STQ rather than TermQuery so 
that the tests still pass, but maybe this doesn't matter that much?


was (Author: romseygeek):
Patch moving the functionality into TermQuery and deprecated SpanTermQuery.

The one wrinkle is that scores from plain TermQueries and SpanTermQueries are 
different, because STQ uses a SpanWeight/SpanScorer.  I've kept 
TestSpansAdvanced and TestSpansAdvanced2 using STQ rather than TermQuery so 
that the tests still pass, but maybe this doesn't matter that much?

 Merge SpanTermQuery into TermQuery
 --

 Key: LUCENE-6412
 URL: https://issues.apache.org/jira/browse/LUCENE-6412
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: Trunk, 5.2
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6412.patch


 Having a separate SpanTermQuery doesn't actually gain us anything now, and 
 it's trivial enough to make TermQuery extend SpanQuery and copy the getSpans() 
 and getField() impls over from STQ.






[jira] [Created] (LUCENE-6412) Merge SpanTermQuery into TermQuery

2015-04-08 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6412:
-

 Summary: Merge SpanTermQuery into TermQuery
 Key: LUCENE-6412
 URL: https://issues.apache.org/jira/browse/LUCENE-6412
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: Trunk, 5.2
Reporter: Alan Woodward
Priority: Minor


Having a separate SpanTermQuery doesn't actually gain us anything now, and it's 
trivial enough to make TermQuery extend SpanQuery and copy the getSpans() and 
getField() impls over from STQ.






[jira] [Created] (LUCENE-6491) Failure in TestTermRangeQuery

2015-05-20 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6491:
-

 Summary: Failure in TestTermRangeQuery
 Key: LUCENE-6491
 URL: https://issues.apache.org/jira/browse/LUCENE-6491
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Alan Woodward


ant test  -Dtestcase=TestTermRangeQuery -Dtests.method=testAutoPrefixTermsKickIn -Dtests.seed=1970E47D58D0CF50 -Dtests.locale=uk -Dtests.timezone=Canada/Newfoundland -Dtests.asserts=true -Dtests.file.encoding=US-ASCII

Reproduces for me






[jira] [Commented] (LUCENE-6491) Failure in TestTermRangeQuery

2015-05-20 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552072#comment-14552072
 ] 

Alan Woodward commented on LUCENE-6491:
---

{code}
[10:37:03.803] FAILURE 2.19s J0 | TestTermRangeQuery.testAutoPrefixTermsKickIn 

Throwable #1: java.lang.AssertionError: expected:<1> but was:<121>
   at __randomizedtesting.SeedInfo.seed([1970E47D58D0CF50:3D6443D83137B394]:0)
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at org.apache.lucene.search.TestTermRangeQuery.testAutoPrefixTermsKickIn(TestTermRangeQuery.java:495)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:483)
   at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
   at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
   at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
   at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
   at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
   at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
   at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
   at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
   at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
   at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
   at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
   at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
   at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
   at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
   at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
   at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
   at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
   at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
   at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
   at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
   at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
   at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
   at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
   at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
   at java.lang.Thread.run(Thread.java:745)
  2> NOTE: test params are: codec=Lucene50, sim=RandomSimilarityProvider(queryNorm=false,coord=yes): {content=DFR I(ne)2}, locale=uk, timezone=Canada/Newfoundland
  2> NOTE: Mac OS X 10.7.5 x86_64/Oracle Corporation 1.8.0_25 (64-bit)/cpus=8,threads=1,free=96435912,total=358088704
  2> NOTE: All tests run in this JVM: [TestAddIndexes, TestShardSearching, TestElevationComparator, TestBlockPostingsFormat3, TestIntsRef, Test2BPostings, 

[jira] [Resolved] (LUCENE-6490) TestPayloadNearQuery fails with NPE

2015-05-20 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6490.
---
Resolution: Fixed

 TestPayloadNearQuery fails with NPE
 ---

 Key: LUCENE-6490
 URL: https://issues.apache.org/jira/browse/LUCENE-6490
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Alan Woodward
 Attachments: LUCENE-6490.patch


 ant test  -Dtestcase=TestPayloadNearQuery -Dtests.method=test -Dtests.seed=24743B1132665845 -Dtests.slow=true -Dtests.locale=es_NI -Dtests.timezone=Israel -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
 {noformat}
[junit4] Started J0 PID(19895@localhost).
[junit4] Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
[junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestPayloadNearQuery -Dtests.method=test -Dtests.seed=24743B1132665845 -Dtests.slow=true -Dtests.locale=es_NI -Dtests.timezone=Israel -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR   0.09s | TestPayloadNearQuery.test 
[junit4] Throwable #1: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
[junit4]  at __randomizedtesting.SeedInfo.seed([24743B1132665845:AC2004CB9C9A35BD]:0)
[junit4]  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:669)
[junit4]  at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:353)
[junit4]  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:382)
[junit4]  at org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:144)
[junit4]  at java.lang.Thread.run(Thread.java:745)
[junit4] Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
[junit4]  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
[junit4]  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
[junit4]  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
[junit4]  ... 39 more
[junit4] Caused by: java.lang.NullPointerException
[junit4]  at org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanScorer.processPayloads(PayloadNearQuery.java:202)
[junit4]  at org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanScorer.setFreqCurrentDoc(PayloadNearQuery.java:223)
[junit4]  at org.apache.lucene.search.spans.SpanScorer.ensureFreq(SpanScorer.java:65)
[junit4]  at org.apache.lucene.search.spans.SpanScorer.score(SpanScorer.java:118)
[junit4]  at org.apache.lucene.search.AssertingScorer.score(AssertingScorer.java:67)
[junit4]  at org.apache.lucene.search.TopScoreDocCollector$SimpleTopScoreDocCollector$1.collect(TopScoreDocCollector.java:64)
[junit4]  at org.apache.lucene.search.AssertingLeafCollector.collect(AssertingLeafCollector.java:53)
[junit4]  at org.apache.lucene.search.AssertingCollector$1.collect(AssertingCollector.java:57)
[junit4]  at org.apache.lucene.search.AssertingLeafCollector.collect(AssertingLeafCollector.java:53)
[junit4]  at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:203)
[junit4]  at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:174)
[junit4]  at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
[junit4]  at org.apache.lucene.search.AssertingBulkScorer.score(AssertingBulkScorer.java:69)
[junit4]  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:714)
[junit4]  at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:93)
[junit4]  at org.apache.lucene.search.IndexSearcher$4.call(IndexSearcher.java:656)
[junit4]  at org.apache.lucene.search.IndexSearcher$4.call(IndexSearcher.java:653)
[junit4]  at java.util.concurrent.FutureTask.run(FutureTask.java:265)
[junit4]  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[junit4]  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[junit4]  ... 1 more
[junit4]   2> NOTE: test params are: codec=FastDecompressionCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=FAST_DECOMPRESSION, chunkSize=25825, maxDocsPerChunk=709, blockSize=459), termVectorsFormat=CompressingTermVectorsFormat(compressionMode=FAST_DECOMPRESSION, chunkSize=25825, blockSize=459)), sim=DefaultSimilarity, locale=es_NI, 
 

[jira] [Updated] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-20 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6466:
--
Attachment: LUCENE-6466.patch

Updated to take into account changes from LUCENE-6490.  Running precommit now

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to call it from SpanWeight.






[jira] [Resolved] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-20 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6466.
---
Resolution: Fixed

Thanks for the review, Paul!

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to call it from SpanWeight.






[jira] [Commented] (SOLR-7613) solrcore.properties file should be loaded if it resides in ZooKeeper

2015-06-05 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574208#comment-14574208
 ] 

Alan Woodward commented on SOLR-7613:
-

I've been playing around with this a bit, and have come up with the following 
solution: the extra properties are loaded by the SolrResourceLoader, rather 
than in CoreDescriptor, which means that it's automatically loaded from the 
'correct' place (and will allow overriding and editing when SOLR-7570 is in).  
Properties are only actually used by the resource loader, so there's no 
particular need for them to be available via CoreDescriptor anyway.
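
To show why routing the load through a loader abstraction makes the storage location irrelevant, here is a small sketch. CorePropertiesSketch and ResourceOpener are hypothetical stand-ins invented for this example; they are not Solr classes, and the in-memory opener merely plays the part of "disk or ZooKeeper".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical sketch; ResourceOpener stands in for a resource loader and is not a Solr type.
final class CorePropertiesSketch {

  /** Opens a named config resource, or returns null if it does not exist. */
  interface ResourceOpener {
    InputStream open(String name) throws IOException;
  }

  /** Loads properties through the opener; a missing resource yields an empty Properties. */
  static Properties load(ResourceOpener opener, String name) throws IOException {
    Properties props = new Properties();
    try (InputStream in = opener.open(name)) {
      if (in != null) {
        props.load(in);
      }
    }
    return props;
  }

  public static void main(String[] args) throws IOException {
    // In-memory opener used purely for demonstration.
    ResourceOpener inMemory = name -> "solrcore.properties".equals(name)
        ? new ByteArrayInputStream("my.flag=true\n".getBytes(StandardCharsets.ISO_8859_1))
        : null;
    System.out.println(load(inMemory, "solrcore.properties").getProperty("my.flag"));
  }
}
```

The calling code never learns where the bytes came from, which is the property the comment above relies on.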

I should have a patch to upload early next week.

 solrcore.properties file should be loaded if it resides in ZooKeeper
 

 Key: SOLR-7613
 URL: https://issues.apache.org/jira/browse/SOLR-7613
 Project: Solr
  Issue Type: Bug
Reporter: Steve Davids
 Fix For: 5.3


 The solrcore.properties file is used to load user-defined properties, 
 primarily for use in the solrconfig.xml file. However, this properties file 
 will only load if it resides in the core/conf directory on the physical disk; 
 it will not load if it is in ZK's core/conf directory. There should be a 
 mechanism that allows a core properties file to be specified in ZK and 
 updated appropriately, along with the ability to reload the properties when 
 the file changes (or via a core reload).






[jira] [Commented] (LUCENE-6527) TermWeight should not load norms when needsScores is false

2015-06-05 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574479#comment-14574479
 ] 

Alan Woodward commented on LUCENE-6527:
---

For SpanWeight, you don't need to pass needsScores back down to the 
constructor, as if the TermContexts map is null then you know scores are not 
required.  So you can just call {{IndexSearcher.getSimilarity(termContexts != 
null)}}.
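
The suggestion amounts to treating a null map as the "no scores" signal instead of carrying a separate boolean. A minimal sketch of that pattern follows; WeightSketch, its field and similarityFor() are hypothetical and only illustrate the idea, they are not the Lucene classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; WeightSketch is illustrative and not a Lucene class.
final class WeightSketch {
  final boolean needsScores;

  /** A null map means term statistics were never gathered, so scores are not needed. */
  WeightSketch(Map<String, Long> termStats) {
    this.needsScores = termStats != null;
  }

  /** Picks a similarity flavour from the derived flag. */
  String similarityFor() {
    return needsScores ? "scoring" : "non-scoring";
  }

  public static void main(String[] args) {
    Map<String, Long> stats = new HashMap<>();
    stats.put("content:lucene", 42L);
    System.out.println(new WeightSketch(stats).similarityFor());
    System.out.println(new WeightSketch(null).similarityFor());
  }
}
```

The trade-off is one fewer constructor parameter in exchange for an implicit convention that callers must know about.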

 TermWeight should not load norms when needsScores is false
 --

 Key: LUCENE-6527
 URL: https://issues.apache.org/jira/browse/LUCENE-6527
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Attachments: LUCENE-6527.patch, LUCENE-6527.patch


 TermWeight currently loads norms all the time, even when needsScores is false.






[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-06-09 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371.patch

Here's a patch that makes NearSpansOrdered non-lazy in the way Adrien 
described, and simplifies the SpanCollector accordingly.

Should I break out the changes to NearSpansOrdered into their own issue?  It 
seems like a big enough change in its own right, really.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-06-08 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371.patch

The patch again, this time taken from the correct point in the source tree :-/  
I've fixed the javadoc comment as well.

bq. should we consider removing it entirely?

I don't think so, it's a pretty fundamental operation.  One way of simplifying 
it might be to make SpanCollector final, and have it collect either everything 
or nothing, so that creating subcollectors is easier.  But that then makes it 
difficult to move payload collection out of core.  Or maybe instead we could 
make SpanCollector implement Cloneable, and move the responsibility of building 
subcollectors directly into NearSpansOrdered?

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-06-08 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14576830#comment-14576830
 ] 

Alan Woodward commented on LUCENE-6371:
---

I'd like to commit this today, and backport this and LUCENE-6466 to 5.x (with 
some beasting first to see if changing to an ordered map has fixed LUCENE-6495)

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-06-08 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14577233#comment-14577233
 ] 

Alan Woodward commented on LUCENE-6371:
---

That might work, as long as NSO still reports all the matches, rather than just 
the first one it encounters.  It will change scoring a bit, but that shouldn't 
be too much of a problem.  I'll work on a patch.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (SOLR-7570) Config APIs should not modify the ConfigSet

2015-06-03 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570723#comment-14570723
 ] 

Alan Woodward commented on SOLR-7570:
-

bq. What I'm suggesting is make the mutable conf location configurable on a per 
collection basis

OK, that seems reasonable.

There are a few moving parts here, so what I'd like to do is commit the current 
patch with its changes to SolrResourceLoader, and then open new issues for 
moving individual features to the new API:
* Config overlays
* Mutable schema
* Managed resources in general (I think this may supersede the StorageIO 
interface)

Noble's idea for configurable config locations can be in a followup issue as 
well.

 Config APIs should not modify the ConfigSet
 ---

 Key: SOLR-7570
 URL: https://issues.apache.org/jira/browse/SOLR-7570
 Project: Solr
  Issue Type: Improvement
Reporter: Tomás Fernández Löbbe
 Attachments: SOLR-7570.patch


 Originally discussed here: 
 http://mail-archives.apache.org/mod_mbox/lucene-dev/201505.mbox/%3CCAMJgJxSXCHxDzJs5-C-pKFDEBQD6JbgxB=-xp7u143ekmgp...@mail.gmail.com%3E
 The ConfigSet used to create a collection should be read-only. Changes made 
 via any of the Config APIs should only be applied to the collection where the 
 operation is done and not to other collections that may be using the same 
 ConfigSet. As discussed in the dev list: 
 When a collection is created we should have two things, an immutable part 
 (the ConfigSet) and a mutable part (configoverlay, generated schema, etc). 
 The ConfigSet will still be placed in ZooKeeper under /configs but the 
 mutable part should be placed under /collections/$COLLECTION_NAME/…
 [~romseygeek] suggested: 
 {quote}
 A nice way of doing it would be to make it part of the SolrResourceLoader 
 interface.  The ZK resource loader could check in the collection-specific 
 zknode first, and then under configs/, and we could add a writeResource() 
 method that writes to the collection-specific node as well.  Then all config 
 I/O goes via the resource loader, and we have a way of keeping certain parts 
 immutable.
 {quote}






[jira] [Commented] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-09 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579075#comment-14579075
 ] 

Alan Woodward commented on LUCENE-6537:
---

I think it's scoring changes.  The benchmark is getting the top ten hits, 
ranking them by score, merging any docs that have the same score into a group, 
and then counting the groups.  What's happened here is that doc 3979685's score 
has increased (presumably because NSO is now finding an extra Span in that 
document that was being discarded by the eager shrink-to-smallest-fit algorithm 
before), and it has pushed doc 85504 out of the top 10.  But 85504 was part of 
a group of three docs with identical scores, so the number of score groups has 
increased by one.

I'm not sure what the point of doing the score-grouping is though?  It seems a 
pretty arbitrary thing to be checking?
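
As a rough illustration of the grouping step described above, here is a minimal sketch.  It is not the benchmark's actual code: the class name, the method name and the sample scores are all made up for the example, and the top-N size is shrunk to 5 to keep it short.

```java
import java.util.Arrays;

/** Hypothetical sketch: count groups of identical scores among the top-N hits. */
public class ScoreGroups {

  /** Returns the number of distinct scores among the topN highest scores. */
  public static int countGroups(float[] scores, int topN) {
    float[] sorted = scores.clone();
    Arrays.sort(sorted); // ascending, so walk backwards for best-first order
    int groups = 0;
    int taken = 0;
    float previous = Float.NaN; // NaN never equals anything, so the first hit opens a group
    for (int i = sorted.length - 1; i >= 0 && taken < topN; i--, taken++) {
      if (sorted[i] != previous) {
        groups++;
      }
      previous = sorted[i];
    }
    return groups;
  }

  public static void main(String[] args) {
    // Before: three docs tie on 5.0 and all of them are in the top 5.
    float[] before = {9f, 8f, 5f, 5f, 5f, 3f};
    // After: the doc that scored 3.0 now scores 7.0 and pushes one tied doc out.
    float[] after = {9f, 8f, 5f, 5f, 5f, 7f};
    System.out.println(countGroups(before, 5)); // prints 3
    System.out.println(countGroups(after, 5));  // prints 4
  }
}
```

The set of matching documents is the same in both arrays; only one score moved, yet the group count goes from 3 to 4.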

 Make NearSpansOrdered use lazy iteration
 

 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6537.patch









[jira] [Commented] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-09 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14579296#comment-14579296
 ] 

Alan Woodward commented on LUCENE-6537:
---

Thanks Mike.  So I think this is good to go then?

 Make NearSpansOrdered use lazy iteration
 

 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6537.patch









[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-06-10 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371.patch

Updated patch, following on from LUCENE-6537.  I'd like to commit this, then 
backport LUCENE-6490 and LUCENE-6537.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3, Trunk

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-10 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580250#comment-14580250
 ] 

Alan Woodward commented on LUCENE-6537:
---

I started back-porting this to 5x, but I think it makes more sense to backport 
LUCENE-6490 and LUCENE-6371 first.

 Make NearSpansOrdered use lazy iteration
 

 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6537.patch, LUCENE-6537.patch









[jira] [Commented] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-09 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578911#comment-14578911
 ] 

Alan Woodward commented on LUCENE-6537:
---

It should return the same document matches, but it will return some extra Span 
hits within each document, so scores might be different.  Is that benchmark 
result saying that there are unexpected document matches, or that the scores 
have changed?

 Make NearSpansOrdered use lazy iteration
 

 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6537.patch









[jira] [Commented] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-09 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578917#comment-14578917
 ] 

Alan Woodward commented on LUCENE-6537:
---

It looks as though there's an extra hit in there, on document 3979685.

 Make NearSpansOrdered use lazy iteration
 

 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6537.patch









[jira] [Created] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-09 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6537:
-

 Summary: Make NearSpansOrdered use lazy iteration
 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor









[jira] [Updated] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-09 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6537:
--
Attachment: LUCENE-6537.patch

Patch moving NearSpansOrdered to a lazy algorithm.  This makes span collection 
much simpler, and it will also slightly improve performance when scores are not 
required, as nextStartPosition() can return as soon as it finds a match, 
without having to shrink the spans.

 Make NearSpansOrdered use lazy iteration
 

 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6537.patch









[jira] [Commented] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-09 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578870#comment-14578870
 ] 

Alan Woodward commented on LUCENE-6537:
---

NearSpansOrdered uses an eager algorithm to only return the shortest possible 
matches from a document.  This means that its subspans are out of position when 
nextStartPosition() returns, making it difficult to collect any information 
about the subspans.
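
To show the difference in miniature, here is a deliberately simplified model of an ordered near match over two single-term subspans.  It is a sketch under assumptions, not the NearSpansOrdered code: the class and method names are hypothetical, positions are plain sorted int arrays, and the "eager" variant only mimics the shrink-to-shortest behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of eager vs lazy ordered-near matching on two terms. */
public class OrderedNearSketch {

  /** Lazy: report a match for every position of the first term that has one. */
  public static List<int[]> lazyMatches(int[] first, int[] second, int slop) {
    List<int[]> matches = new ArrayList<>();
    int j = 0;
    for (int i = 0; i < first.length; i++) {
      while (j < second.length && second[j] <= first[i]) {
        j++; // the second term must come after the first
      }
      if (j == second.length) {
        break;
      }
      if (second[j] - first[i] - 1 <= slop) {
        matches.add(new int[] {first[i], second[j] + 1}); // {start, end}, end exclusive
      }
    }
    return matches;
  }

  /** Eager: shrink each match by advancing the first term as far as it can go. */
  public static List<int[]> eagerMatches(int[] first, int[] second, int slop) {
    List<int[]> matches = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < first.length) {
      while (j < second.length && second[j] <= first[i]) {
        j++;
      }
      if (j == second.length) {
        break;
      }
      // Shrink step: skip to the last first-term position still before second[j].
      while (i + 1 < first.length && first[i + 1] < second[j]) {
        i++;
      }
      if (second[j] - first[i] - 1 <= slop) {
        matches.add(new int[] {first[i], second[j] + 1});
      }
      i++;
    }
    return matches;
  }

  public static void main(String[] args) {
    int[] first = {1, 3};  // positions of the first term
    int[] second = {5};    // positions of the second term
    System.out.println(lazyMatches(first, second, 5).size());  // prints 2
    System.out.println(eagerMatches(first, second, 5).size()); // prints 1
  }
}
```

In the eager variant the first term has already been moved past position 1 by the time a match is reported, which is the "out of position" problem in small.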

 Make NearSpansOrdered use lazy iteration
 

 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor








[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-06-09 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578880#comment-14578880
 ] 

Alan Woodward commented on LUCENE-6371:
---

I opened LUCENE-6537

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (SOLR-7341) xjoin - join data from external sources

2015-06-24 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14599507#comment-14599507
 ] 

Alan Woodward commented on SOLR-7341:
-

Hi Tom,

This looks great!  A couple of questions:
* This looks like it's built against 4.x.  Could you make a patch that compiles 
against trunk?
* Do you have any performance numbers for this?

I'm not sure if we have any kind of general policy about accepting contrib/ 
patches.  This looks sufficiently generally useful that it would be worth 
committing though.

 xjoin - join data from external sources
 ---

 Key: SOLR-7341
 URL: https://issues.apache.org/jira/browse/SOLR-7341
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.10.3
Reporter: Tom Winch
Priority: Minor
 Fix For: Trunk

 Attachments: SOLR-7341.patch, SOLR-7341.patch


 h2. XJoin
 The xjoin SOLR contrib allows external results to be joined with SOLR 
 results in a query and the SOLR result set to be filtered by the results of 
 an external query. Values from the external results are made available in the 
 SOLR results and may also be used to boost the scores of corresponding 
 documents during the search. The contrib consists of the Java classes 
 XJoinSearchComponent, XJoinValueSourceParser and XJoinQParserPlugin (and 
 associated classes), which must be configured in solrconfig.xml, and the 
 interfaces XJoinResultsFactory and XJoinResults, which are implemented by the 
 user to provide the link between SOLR and the external results source. 
 External results and SOLR documents are matched via a single configurable 
 attribute (the join field). The contrib JAR solr-xjoin-4.10.3.jar contains 
 these classes and interfaces and should be included in SOLR's class path from 
 solrconfig.xml, as should a JAR containing the user implementations of the 
 previously mentioned interfaces. For example:
 {code:xml}
 <config>
   ..
   <!-- XJoin contrib JAR file -->
   <lib dir="${solr.install.dir:../../..}/dist/" regex="solr-xjoin-\d.*\.jar" />
   ..
   <!-- user implementations of XJoin interfaces -->
   <lib path="/path/to/xjoin_test.jar" />
   ..
 </config>
 {code}
 h2. Java classes and interfaces
 h3. XJoinResultsFactory
 The user implementation of this interface is responsible for connecting to an 
 external source to perform a query (or otherwise collect results). Parameters 
 with prefix <component name>.external. are passed from the SOLR query URL 
 to parameterise the search. The interface has the following methods:
 * void init(NamedList args) - this is called during SOLR initialisation, and 
 passed parameters from the search component configuration (see below)
 * XJoinResults getResults(SolrParams params) - this is called during a SOLR 
 search to generate external results, and is passed parameters from the SOLR 
 query URL (as above)
 For example, the implementation might perform queries of an external source 
 based on the 'q' SOLR query URL parameter (in full, 
 <component name>.external.q).
 h3. XJoinResults
 A user implementation of this interface is returned by the getResults() 
 method of the XJoinResultsFactory implementation. It has methods:
 * Object getResult(String joinId) - this should return a particular result 
 given the value of the join attribute
 * Iterable<String> getJoinIds() - this should return the join attribute 
 values for all results of the external search
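
A minimal sketch of what a user implementation of these two interfaces might look like.  The interfaces are re-declared locally from the method lists above so that the example compiles on its own, and plain Map objects stand in for Solr's NamedList and SolrParams; that substitution, the "values" argument name and all class names are assumptions made for illustration, not the contrib's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, self-contained sketch of the XJoin user-side interfaces. */
public class XJoinSketch {

  /** Mirrors the XJoinResults methods listed above. */
  public interface XJoinResults {
    Object getResult(String joinId);
    Iterable<String> getJoinIds();
  }

  /** Mirrors XJoinResultsFactory; Map stands in for NamedList and SolrParams. */
  public interface XJoinResultsFactory {
    void init(Map<String, String> args);
    XJoinResults getResults(Map<String, String> params);
  }

  /** Toy factory: treats a comma-separated "values" argument as the external ids. */
  public static final class ListResultsFactory implements XJoinResultsFactory {
    private String[] values = new String[0];

    @Override
    public void init(Map<String, String> args) {
      String configured = args.get("values");
      if (configured != null && !configured.isEmpty()) {
        values = configured.split(",");
      }
    }

    @Override
    public XJoinResults getResults(Map<String, String> params) {
      // A real implementation would query the external source here using params.
      final Map<String, Object> results = new LinkedHashMap<>();
      for (String id : values) {
        results.put(id, "external result for " + id);
      }
      return new XJoinResults() {
        @Override
        public Object getResult(String joinId) {
          return results.get(joinId);
        }

        @Override
        public Iterable<String> getJoinIds() {
          return results.keySet();
        }
      };
    }
  }
}
```

The factory holds only configuration, and each getResults() call builds a fresh results object, which matches the description of init() running once at startup and getResults() running per search.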
 h3. XJoinSearchComponent
 This is the central Java class of the contrib. It is a SOLR search component, 
 configured in solrconfig.xml and included in one or more SOLR request 
 handlers. There is one XJoin search component per external source, and each 
 has two main responsibilities:
 * Before the SOLR search, it connects to the external source and retrieves 
 results, storing them in the SOLR request context
 * After the SOLR search, it matches SOLR documents in the results set and 
 external results via the join field, adding attributes from the external 
 results to documents in the SOLR results set
 It takes the following initialisation parameters:
 * factoryClass - this specifies the user-supplied class implementing 
 XJoinResultsFactory, used to generate external results
 * joinField - this specifies the attribute on which to join between SOLR 
 documents and external results
 * external - this parameter set is passed to configure the 
 XJoinResultsFactory implementation
 For example, in solrconfig.xml:
 {code:xml}
 <searchComponent name="xjoin_test" 
 class="org.apache.solr.search.xjoin.XJoinSearchComponent">
   <str name="factoryClass">test.TestXJoinResultsFactory</str>
   <str name="joinField">id</str>
   <lst name="external">
     <str name="values">1,2,3</str>
   </lst>
 </searchComponent>
 {code}
 Here, the search component instantiates a new TestXJoinResultsFactory during 
 

[jira] [Created] (LUCENE-6587) Move explain() to Scorer

2015-06-19 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6587:
-

 Summary: Move explain() to Scorer
 Key: LUCENE-6587
 URL: https://issues.apache.org/jira/browse/LUCENE-6587
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward


At the moment, the explanation API is on Weight, rather than on Scorer.  This 
has a number of disadvantages:
* It means that Weights need to know about the scoring algorithms of their 
child scorers, which results in a leaky API (for example, the 
SloppyPhraseScorer has a package-private sloppyFreq() method which is only used 
by PhraseWeight.explain(), and SpanScorer has a similar public method that is 
again only called by explanation functions)
* It leads to lots of duplicated code - more or less every Weight.explain() 
method creates a Scorer, advances to the appropriate doc, and checks for a match
* It's very slow, because we create a new Scorer for every document

I'd like to try moving explain() directly to Scorer.  We can keep the old slow 
IndexSearcher.explain() API, but in addition explanations could now be 
generated efficiently in a Collector.






[jira] [Updated] (LUCENE-6587) Move explain() to Scorer

2015-06-19 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6587:
--
Attachment: LUCENE-6587.patch

Here's a patch.

Weight.explain(LeafReaderContext, int) is final, pulling a Scorer, advancing to 
the relevant doc, and calling scorer.explain().  There's a new explainMiss(LRC, 
int) method which is called if the document doesn't match, with a default 
implementation.  BooleanWeight overrides this.
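
The shape of that change, as a rough sketch.  None of the types below are Lucene classes: SketchWeight, SketchScorer and FixedDocsWeight are hypothetical stand-ins, explanations are plain strings rather than Explanation objects, and the LeafReaderContext argument is dropped, so this only illustrates the template-method structure described above.

```java
import java.util.Arrays;

/** Hypothetical sketch of a final explain() that delegates to the scorer. */
public class ExplainSketch {

  /** Stand-in for a Scorer that can explain the document it is positioned on. */
  public interface SketchScorer {
    /** Advances to the first matching doc >= target, or Integer.MAX_VALUE if none. */
    int advance(int target);

    String explain();
  }

  /** Stand-in for Weight. */
  public abstract static class SketchWeight {
    abstract SketchScorer scorer();

    /** Final: pull a scorer, advance to the doc, and let the scorer explain itself. */
    public final String explain(int doc) {
      SketchScorer scorer = scorer();
      if (scorer != null && scorer.advance(doc) == doc) {
        return scorer.explain();
      }
      return explainMiss(doc);
    }

    /** Default explanation for a non-matching doc; subclasses may override it. */
    protected String explainMiss(int doc) {
      return "no match on doc " + doc;
    }
  }

  /** Toy weight that matches a fixed, sorted set of doc ids with a constant score. */
  public static final class FixedDocsWeight extends SketchWeight {
    private final float score;
    private final int[] docs;

    public FixedDocsWeight(float score, int... docs) {
      this.score = score;
      this.docs = docs;
    }

    @Override
    SketchScorer scorer() {
      return new SketchScorer() {
        private int current = -1;

        @Override
        public int advance(int target) {
          int idx = Arrays.binarySearch(docs, target);
          if (idx < 0) {
            idx = -idx - 1; // insertion point: first doc greater than target
          }
          current = idx < docs.length ? docs[idx] : Integer.MAX_VALUE;
          return current;
        }

        @Override
        public String explain() {
          return score + " = constant score for doc " + current;
        }
      };
    }
  }
}
```

With this layout the advance-and-check logic lives in one place, and a weight only overrides explainMiss() when the default miss message is not enough.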

There's a nocommit around what to do with query rescorers.  I don't really want 
to make Weight.explain() overrideable, which is the easiest way to deal with 
them, but I'm not sure that there's any better way of doing it.

All lucene tests pass, will look at the Solr ones presently.

 Move explain() to Scorer
 

 Key: LUCENE-6587
 URL: https://issues.apache.org/jira/browse/LUCENE-6587
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6587.patch


 At the moment, the explanation API is on Weight, rather than on Scorer.  This 
 has a number of disadvantages:
 * It means that Weights need to know about the scoring algorithms of their 
 child scorers, which results in a leaky API (for example, the 
 SloppyPhraseScorer has a package-private sloppyFreq() method which is only 
 used by PhraseWeight.explain(), and SpanScorer has a similar public method 
 that is again only called by explanation functions)
 * It leads to lots of duplicated code - more or less every Weight.explain() 
 method creates a Scorer, advances to the appropriate doc, and checks for a 
 match
 * It's very slow, because we create a new Scorer for every document
 I'd like to try moving explain() directly to Scorer.  We can keep the old 
 slow IndexSearcher.explain() API, but in addition explanations could now be 
 generated efficiently in a Collector.






[jira] [Commented] (LUCENE-6580) Allow defined-width gaps in SpanNearQuery

2015-06-19 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14593173#comment-14593173
 ] 

Alan Woodward commented on LUCENE-6580:
---

This is because slops mean different things between the two queries.  In 
PhraseQuery, a slop of greater than 0 means we end up with a SloppyPhraseScorer 
that relaxes the ordering constraint (so you can have, in effect, the 'gap' 
appearing after the end of the match).  An ordered SpanNearQuery with a slop, 
however, still requires its clauses to be in order, but allows them to be 
spaced out.

So this issue makes an ordered SpanNearQuery more like a PhraseQuery only in 
the case that the PQ has defined gaps, but zero slop.

 Allow defined-width gaps in SpanNearQuery
 -

 Key: LUCENE-6580
 URL: https://issues.apache.org/jira/browse/LUCENE-6580
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6580.patch, LUCENE-6580.patch


 SpanNearQuery is not quite an exact Spans replacement for PhraseQuery at the 
 moment, because while you can ask for an overall slop in an ordered match, 
 you can't specify exactly where the gaps should appear.






[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-06-10 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371-5x.patch

Here's a patch for 5x.  What with reversions and overlapping commits, it turns 
out the easiest thing to do was to merge the patches for LUCENE-6466, 
LUCENE-6537 and LUCENE-6371 into one.

All tests are passing, but I want to beast this against Java 7 for a bit to 
check that LUCENE-6490 is fixed.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3, Trunk

 Attachments: LUCENE-6371-5x.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-06-16 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587921#comment-14587921
 ] 

Alan Woodward commented on LUCENE-6494:
---

No reason it can't go in sandbox.  I think we need to commit LUCENE-6489 first 
though?

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch, LUCENE-6494.patch, 
 LUCENE-6494.patch, LUCENE-6494.patch


 With the addition of SpanCollectors, we can now get arbitrary postings 
 information from SpanQueries.  PayloadSpanUtil does some rewriting to convert 
 non-span queries into SpanQueries so that it can collect payloads.  It would 
 be good to make this more generic, so that we can collect any postings 
 information from any query (without having to make invasive changes to 
 already optimized Scorers, etc).






[jira] [Resolved] (LUCENE-6537) Make NearSpansOrdered use lazy iteration

2015-06-15 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6537.
---
   Resolution: Fixed
Fix Version/s: 5.3
 Assignee: Alan Woodward

Turned out to be simpler to do this separately.

 Make NearSpansOrdered use lazy iteration
 

 Key: LUCENE-6537
 URL: https://issues.apache.org/jira/browse/LUCENE-6537
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3

 Attachments: LUCENE-6537.patch, LUCENE-6537.patch









[jira] [Resolved] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-06-15 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6466.
---
Resolution: Fixed

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3, Trunk

 Attachments: LUCENE-6466-2.patch, LUCENE-6466-2.patch, 
 LUCENE-6466-2.patch, LUCENE-6466-branch5x.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to have this being called from SpanWeight






[jira] [Updated] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-06-15 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6466:
--
Attachment: LUCENE-6466-branch5x.patch

Patch for branch 5x (before LUCENE-6371 is added)

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3, Trunk

 Attachments: LUCENE-6466-2.patch, LUCENE-6466-2.patch, 
 LUCENE-6466-2.patch, LUCENE-6466-branch5x.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to have this being called from SpanWeight






[jira] [Created] (LUCENE-6567) No need for out-of-order payload checks in SpanPayloadCheckQuery

2015-06-15 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6567:
-

 Summary: No need for out-of-order payload checks in 
SpanPayloadCheckQuery
 Key: LUCENE-6567
 URL: https://issues.apache.org/jira/browse/LUCENE-6567
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor


Since LUCENE-6537, all composite Spans implementations collect their payloads 
in-order, so we don't need special logic for that case anymore.






[jira] [Updated] (LUCENE-6567) No need for out-of-order payload checks in SpanPayloadCheckQuery

2015-06-15 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6567:
--
Attachment: LUCENE-6567.patch

Patch.

 No need for out-of-order payload checks in SpanPayloadCheckQuery
 

 Key: LUCENE-6567
 URL: https://issues.apache.org/jira/browse/LUCENE-6567
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6567.patch


 Since LUCENE-6537, all composite Spans implementations collect their payloads 
 in-order, so we don't need special logic for that case anymore.






[jira] [Commented] (LUCENE-6561) Add a TermContextCache to IndexSearcher

2015-06-15 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586048#comment-14586048
 ] 

Alan Woodward commented on LUCENE-6561:
---

bq. I don't think we should even encourage caching of this?

Well at the moment we don't even make caching possible - TermContext.build() is 
a static method, so there's no way to do what Mike suggests and build an 
external cache.  I agree that IndexSearcher is the wrong place to expose it 
though.  And a package-private method wouldn't work, because SpanTermQuery is 
in a different package.

The concurrency problems arise because the cache is shared between multiple 
queries.  Maybe what's really needed is a QueryContext that's passed to 
createWeight() and holds this kind of info?

 Add a TermContextCache to IndexSearcher
 ---

 Key: LUCENE-6561
 URL: https://issues.apache.org/jira/browse/LUCENE-6561
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6561.patch, LUCENE-6561.patch


 TermContexts can be quite expensive to build, and if you have fairly complex 
 queries that re-use the same terms you can end up spending a lot of time 
 re-building the same ones over and over again.  It would be nice to be able 
 to cache them on an IndexSearcher, so that they can be re-used.






[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-06-15 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371-5x.patch

Smaller patch, after LUCENE-6466 and LUCENE-6537

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3, Trunk

 Attachments: LUCENE-6371-5x.patch, LUCENE-6371-5x.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Resolved] (LUCENE-6371) Improve Spans payload collection

2015-06-15 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6371.
---
Resolution: Fixed

Thanks everyone!

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3, Trunk

 Attachments: LUCENE-6371-5x.patch, LUCENE-6371-5x.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Resolved] (LUCENE-6473) Make Spans an interface

2015-06-15 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6473.
---
Resolution: Won't Fix

This isn't necessary due to LUCENE-6371

 Make Spans an interface
 ---

 Key: LUCENE-6473
 URL: https://issues.apache.org/jira/browse/LUCENE-6473
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6473.patch


 Spans is currently an abstract class, extending DocIdSetIterator.  This 
 restricts what we can do with implementations of Spans.  For example, in 
 LUCENE-6371, it would be useful to have PayloadSpan classes that extend 
 existing Spans implementations, but that also implement a PayloadSpans 
 interface that extends Spans.  This isn't possible if Spans is not an 
 interface itself.
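
To illustrate the restriction described above, here is a minimal sketch in which Spans is modelled as an interface.  The types below are toy stand-ins that only borrow the names from the issue (Spans, PayloadSpans); TermSpans and PayloadTermSpans are hypothetical and none of this is Lucene's actual code.  The point is only that PayloadTermSpans can both extend an existing implementation and implement the payload-aware interface, which would not compile if Spans were an abstract class.

```java
/** Toy stand-in for the Spans type, modelled here as an interface (an assumption). */
interface Spans {
  int nextStartPosition();
}

/** Payload-aware extension of the toy Spans interface. */
interface PayloadSpans extends Spans {
  byte[] payload();
}

/** A hypothetical existing implementation that iterates over fixed positions. */
class TermSpans implements Spans {
  private final int[] positions;
  private int index = -1;

  TermSpans(int... positions) {
    this.positions = positions;
  }

  @Override
  public int nextStartPosition() {
    index++;
    // Integer.MAX_VALUE plays the role of a "no more positions" sentinel here.
    return index < positions.length ? positions[index] : Integer.MAX_VALUE;
  }
}

/** Reuses TermSpans' iteration and adds payload access on top of it. */
class PayloadTermSpans extends TermSpans implements PayloadSpans {
  private final byte[] payload;

  PayloadTermSpans(byte[] payload, int... positions) {
    super(positions);
    this.payload = payload;
  }

  @Override
  public byte[] payload() {
    return payload;
  }
}
```

With an abstract Spans class, PayloadSpans would itself have to be a class, and PayloadTermSpans could not extend both it and TermSpans.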






[jira] [Updated] (LUCENE-6561) Add a TermContextCache to IndexSearcher

2015-06-12 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6561:
--
Attachment: LUCENE-6561.patch

Patch that just moves TermContext.build() to IndexSearcher.

 Add a TermContextCache to IndexSearcher
 ---

 Key: LUCENE-6561
 URL: https://issues.apache.org/jira/browse/LUCENE-6561
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6561.patch, LUCENE-6561.patch


 TermContexts can be quite expensive to build, and if you have fairly complex 
 queries that re-use the same terms you can end up spending a lot of time 
 re-building the same ones over and over again.  It would be nice to be able 
 to cache them on an IndexSearcher, so that they can be re-used.






[jira] [Created] (LUCENE-6561) Add a TermContextCache to IndexSearcher

2015-06-12 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6561:
-

 Summary: Add a TermContextCache to IndexSearcher
 Key: LUCENE-6561
 URL: https://issues.apache.org/jira/browse/LUCENE-6561
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward


TermContexts can be quite expensive to build, and if you have fairly complex 
queries that re-use the same terms you can end up spending a lot of time 
re-building the same ones over and over again.  It would be nice to be able to 
cache them on an IndexSearcher, so that they can be re-used.






[jira] [Updated] (LUCENE-6561) Add a TermContextCache to IndexSearcher

2015-06-12 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6561:
--
Attachment: LUCENE-6561.patch

Very rough patch that adds a TermContextCache to IndexSearcher, and moves 
TermContext.build() to IndexSearcher.getTermContext(Term).  There are three 
cache implementations, one no-op (the default), one that caches everything, and 
a dumb LRU implementation that uses LinkedHashMap.  
LuceneTestCase.newSearcher() randomly selects the various implementations.

Needs some proper tests, and probably the concurrency on the LRUCache needs 
more thought.  And benchmarking! (do our benchmark tests re-use queries at 
all?) But I thought I'd put this up to see what people think.
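
For readers unfamiliar with the LinkedHashMap approach, a minimal sketch of a "dumb LRU" cache follows.  It is not the code from the patch: LruCache is a hypothetical name, the key and value types are left generic rather than tied to Term and TermContext, and wrapping the map in Collections.synchronizedMap is just one blunt way of addressing the concurrency concern.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache; a sketch of the LinkedHashMap technique, not the patch. */
final class LruCache<K, V> {
  private final Map<K, V> map;

  LruCache(final int maxSize) {
    // accessOrder=true keeps entries ordered from least to most recently used.
    // In that mode even get() reorders the map, so reads need synchronization too.
    this.map = Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the eldest entry.
        return size() > maxSize;
      }
    });
  }

  V get(K key) {
    return map.get(key);
  }

  void put(K key, V value) {
    map.put(key, value);
  }

  int size() {
    return map.size();
  }
}
```

A single lock around every access is simple but serialises all lookups, which may well cancel out the benefit under concurrent search load.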

 Add a TermContextCache to IndexSearcher
 ---

 Key: LUCENE-6561
 URL: https://issues.apache.org/jira/browse/LUCENE-6561
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6561.patch


 TermContexts can be quite expensive to build, and if you have fairly complex 
 queries that re-use the same terms you can end up spending a lot of time 
 re-building the same ones over and over again.  It would be nice to be able 
 to cache them on an IndexSearcher, so that they can be re-used.






[jira] [Commented] (LUCENE-6561) Add a TermContextCache to IndexSearcher

2015-06-12 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583963#comment-14583963
 ] 

Alan Woodward commented on LUCENE-6561:
---

Concurrency was one of the things that I was unsure of, and it may be that this 
is only useful in specific cases (in my case, very large complex queries with 
lots of redundancy in them being run in series).  

Would a good compromise be to keep the move of TermContext.build(), so that if 
anybody does want to implement a cache, they can subclass IndexSearcher?  It's 
difficult to implement your own cache when queries are all calling a static 
method.
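
The difference between a static builder and an overridable instance method can be sketched as follows.  Searcher and CachingSearcher are hypothetical stand-ins (they are not IndexSearcher), and strings take the place of Term and TermContext; the sketch only shows that an instance method gives subclasses a hook that a static method cannot.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a searcher that builds per-term state. */
class Searcher {
  /** An instance method, so subclasses can intercept it (a static method could not be overridden). */
  String termContext(String term) {
    return "ctx(" + term + ")"; // stand-in for the expensive build
  }
}

/** A user-supplied subclass that adds caching without touching the queries. */
class CachingSearcher extends Searcher {
  private final Map<String, String> cache = new HashMap<>();
  int misses = 0;

  @Override
  String termContext(String term) {
    String cached = cache.get(term);
    if (cached == null) {
      misses++;
      cached = super.termContext(term);
      cache.put(term, cached);
    }
    return cached;
  }
}
```

Queries that ask the searcher for the term state pick up the cache automatically; queries that call a static builder directly never see it.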

 Add a TermContextCache to IndexSearcher
 ---

 Key: LUCENE-6561
 URL: https://issues.apache.org/jira/browse/LUCENE-6561
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6561.patch


 TermContexts can be quite expensive to build, and if you have fairly complex 
 queries that re-use the same terms you can end up spending a lot of time 
 re-building the same ones over and over again.  It would be nice to be able 
 to cache them on an IndexSearcher, so that they can be re-used.






[jira] [Created] (LUCENE-6580) Allow defined-width gaps in SpanNearQuery

2015-06-18 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6580:
-

 Summary: Allow defined-width gaps in SpanNearQuery
 Key: LUCENE-6580
 URL: https://issues.apache.org/jira/browse/LUCENE-6580
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward


SpanNearQuery is not quite an exact Spans replacement for PhraseQuery at the 
moment, because while you can ask for an overall slop in an ordered match, you 
can't specify exactly where the gaps should appear.






[jira] [Updated] (LUCENE-6580) Allow defined-width gaps in SpanNearQuery

2015-06-18 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6580:
--
Attachment: LUCENE-6580.patch

Patch adding a SpanGapQuery that can be added as part of SpanNearQuery's 
constructor.

This also adds a Spans.skipToPosition(int) method which GapSpans overrides to 
make seeking forward more efficient.

One thing I don't like here is that SpanGapQuery is a top-level SpanQuery, when 
it only really makes sense to be used within SpanNearQuery.  An alternative 
could be to add a builder to SpanNearQuery with an .addGap(int) method, and 
make SpanGapQuery a private class.
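
To make the intended matching semantics concrete, here is a toy sketch of an ordered match with fixed-width gaps over an array of tokens.  It is not the patch: GappedPhrase is a hypothetical class, addClause() is an assumed name, and only addGap(int) is taken from the builder idea above.  It models the zero-slop case, where each gap must cover exactly the given number of positions.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an ordered phrase with defined-width gaps; not Lucene code. */
final class GappedPhrase {
  // One entry per position; null marks a position where any token is accepted.
  private final List<String> slots = new ArrayList<>();

  GappedPhrase addClause(String term) {
    slots.add(term);
    return this;
  }

  GappedPhrase addGap(int width) {
    for (int i = 0; i < width; i++) {
      slots.add(null);
    }
    return this;
  }

  /** Returns the start position of the first match in tokens, or -1 if there is none. */
  int firstMatch(String[] tokens) {
    for (int start = 0; start + slots.size() <= tokens.length; start++) {
      boolean ok = true;
      for (int i = 0; i < slots.size() && ok; i++) {
        String want = slots.get(i);
        ok = (want == null) || want.equals(tokens[start + i]);
      }
      if (ok) {
        return start;
      }
    }
    return -1;
  }
}
```

For example, "quick", a gap of 1, then "fox" matches "the quick brown fox" at position 1, but matches neither "the quick fox" nor "quick very brown fox", which an overall slop of 1 or 2 could not distinguish.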

 Allow defined-width gaps in SpanNearQuery
 -

 Key: LUCENE-6580
 URL: https://issues.apache.org/jira/browse/LUCENE-6580
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6580.patch


 SpanNearQuery is not quite an exact Spans replacement for PhraseQuery at the 
 moment, because while you can ask for an overall slop in an ordered match, 
 you can't specify exactly where the gaps should appear.






[jira] [Commented] (LUCENE-6580) Allow defined-width gaps in SpanNearQuery

2015-06-18 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591551#comment-14591551
 ] 

Alan Woodward commented on LUCENE-6580:
---

Fair enough.  How about {{advanceToPosition}}?

 Allow defined-width gaps in SpanNearQuery
 -

 Key: LUCENE-6580
 URL: https://issues.apache.org/jira/browse/LUCENE-6580
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6580.patch


 SpanNearQuery is not quite an exact Spans replacement for PhraseQuery at the 
 moment, because while you can ask for an overall slop in an ordered match, 
 you can't specify exactly where the gaps should appear.






[jira] [Updated] (LUCENE-6580) Allow defined-width gaps in SpanNearQuery

2015-06-18 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6580:
--
Attachment: LUCENE-6580.patch

New patch:
* skipToPosition is now scanToPosition()
* SpanNearQuery has a Builder, and SpanGapQuery is a private subclass

 Allow defined-width gaps in SpanNearQuery
 -

 Key: LUCENE-6580
 URL: https://issues.apache.org/jira/browse/LUCENE-6580
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
 Attachments: LUCENE-6580.patch, LUCENE-6580.patch


 SpanNearQuery is not quite an exact Spans replacement for PhraseQuery at the 
 moment, because while you can ask for an overall slop in an ordered match, 
 you can't specify exactly where the gaps should appear.






[jira] [Comment Edited] (LUCENE-2880) SpanQuery scoring inconsistencies

2015-06-17 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589943#comment-14589943
 ] 

Alan Woodward edited comment on LUCENE-2880 at 6/17/15 3:45 PM:


+1

Maybe {{width}} rather than {{distance}} as the method name?


was (Author: romseygeek):
+1

Maybe {{{width}}} rather than {{{distance}}} as the method name?

 SpanQuery scoring inconsistencies
 -

 Key: LUCENE-2880
 URL: https://issues.apache.org/jira/browse/LUCENE-2880
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 4.9, Trunk

 Attachments: LUCENE-2880.patch, LUCENE-2880.patch


 Spinoff of LUCENE-2879.
 You can see a full description there, but the gist is that SpanQuery sums up 
 freqs with sloppyFreq.
 However this slop is simply spans.end() - spans.start()
 For a SpanTermQuery, for example, this means it's scoring 0.5 for TF versus 
 TermQuery's 1.0.
 As you can imagine, I think in practical situations this would make it 
 difficult for SpanQuery users to
 really use SpanQueries for effective ranking, especially in combination with 
 non-SpanQueries (maybe via DisjunctionMaxQuery, etc.).
 The problem is more general than this simple example: for example 
 SpanNearQuery should be consistent with PhraseQuery's slop.
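
The 0.5 figure can be reproduced with a short calculation.  The sketch below assumes that the similarity's sloppyFreq is 1 / (distance + 1), which is how the default similarity defined it at the time as far as I know; SloppyFreqDemo is a hypothetical helper, not Lucene code.

```java
/** Worked arithmetic for the SpanTermQuery case; an illustration, not Lucene code. */
final class SloppyFreqDemo {
  /** Assumed default formula: 1 / (distance + 1). */
  static float sloppyFreq(int distance) {
    return 1.0f / (distance + 1);
  }

  /** The "slop" that span scoring passes in, as described in the issue: end - start. */
  static int spanSlop(int start, int end) {
    return end - start;
  }
}
```

A single term occupying position 5 has start = 5 and end = 6, so the slop is 1 and the contribution is 1 / (1 + 1) = 0.5, whereas an exact match (distance 0) would contribute 1.0, which is what a TermQuery effectively counts for one occurrence.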






[jira] [Commented] (LUCENE-2880) SpanQuery scoring inconsistencies

2015-06-17 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589943#comment-14589943
 ] 

Alan Woodward commented on LUCENE-2880:
---

+1

Maybe {{{width}}} rather than {{{distance}}} as the method name?

 SpanQuery scoring inconsistencies
 -

 Key: LUCENE-2880
 URL: https://issues.apache.org/jira/browse/LUCENE-2880
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 4.9, Trunk

 Attachments: LUCENE-2880.patch, LUCENE-2880.patch


 Spinoff of LUCENE-2879.
 You can see a full description there, but the gist is that SpanQuery sums up 
 freqs with sloppyFreq.
 However this slop is simply spans.end() - spans.start()
 For a SpanTermQuery, for example, this means it's scoring 0.5 for TF versus 
 TermQuery's 1.0.
 As you can imagine, I think in practical situations this would make it 
 difficult for SpanQuery users to
 really use SpanQueries for effective ranking, especially in combination with 
 non-SpanQueries (maybe via DisjunctionMaxQuery, etc.).
 The problem is more general than this simple example: for example 
 SpanNearQuery should be consistent with PhraseQuery's slop.






[jira] [Updated] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-06-16 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6494:
--
Attachment: LUCENE-6494.patch

Here's another go at this:
* SpanQueryWrapper takes a query and a searcher, and translates the query into 
a spanquery equivalent
* SQW.advanceTo(doc) returns a Spans object (that will only support iteration 
over Span positions, not .nextDoc() or .advance()), which can be used to 
collect information via a SpanCollector
* PayloadSpanUtil is rewritten to use SQW under the hood

I'm not overly keen on SpanQueryWrapper as a name, but other than that I think 
this API is reasonable?

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch, LUCENE-6494.patch, 
 LUCENE-6494.patch, LUCENE-6494.patch


 With the addition of SpanCollectors, we can now get arbitrary postings 
 information from SpanQueries.  PayloadSpanUtil does some rewriting to convert 
 non-span queries into SpanQueries so that it can collect payloads.  It would 
 be good to make this more generic, so that we can collect any postings 
 information from any query (without having to make invasive changes to 
 already optimized Scorers, etc).






[jira] [Resolved] (LUCENE-6567) No need for out-of-order payload checks in SpanPayloadCheckQuery

2015-06-16 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6567.
---
Resolution: Fixed
  Assignee: Alan Woodward

 No need for out-of-order payload checks in SpanPayloadCheckQuery
 

 Key: LUCENE-6567
 URL: https://issues.apache.org/jira/browse/LUCENE-6567
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6567.patch


 Since LUCENE-6537, all composite Spans implementations collect their payloads 
 in-order, so we don't need special logic for that case anymore.






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-05-27 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560771#comment-14560771
 ] 

Alan Woodward commented on LUCENE-6371:
---

I'm not sure I'm following what you mean by having two APIs.  
SpanQuery.createWeight() has exactly the same signature as Query.createWeight() 
- replacing needsScores with the termcontexts map in the SpanWeight constructor 
just allows us to pass in the needed information to build the Similarity at the 
same time as indicating whether or not it should be built at all.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-05-27 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560783#comment-14560783
 ] 

Alan Woodward commented on LUCENE-6371:
---

Ah, ok.  This is more to do with LUCENE-6466.  Moving .getSpans() and 
.extractTerms() to SpanWeight means that we can't collect the termcontexts in 
the constructor any more, because we can't call them on a partially constructed 
object.  And the TermContexts map was already in the API, but it was in 
getSpans() rather than in the SpanWeight constructor.  You're right that it's 
messy though.  I'm just not sure there's a cleaner way of doing it that doesn't 
involve completely changing how the SpanScorer works.
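
The "partially constructed object" problem is a general Java pitfall, and a minimal sketch of it follows.  BaseWeight and NearWeight are hypothetical classes, not SpanWeight or its subclasses; the sketch only shows what goes wrong when a superclass constructor calls a method that a subclass overrides.

```java
/** Hypothetical base class that calls an overridable method from its constructor. */
class BaseWeight {
  final int termCount;

  BaseWeight() {
    // Runs before the subclass constructor body, so subclass fields are not set yet.
    this.termCount = countTerms();
  }

  int countTerms() {
    return 0;
  }
}

/** Hypothetical subclass whose override depends on its own field. */
class NearWeight extends BaseWeight {
  private final String[] clauses;

  NearWeight(String... clauses) {
    super();
    this.clauses = clauses;
  }

  @Override
  int countTerms() {
    // During super(), clauses is still null even though it is final.
    return clauses == null ? -1 : clauses.length;
  }
}
```

Constructing new NearWeight("a", "b") leaves termCount at -1, because the override ran before clauses was assigned.  Passing the already-collected information into the constructor, as with the termcontexts map, sidesteps the problem.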

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (SOLR-7570) Config APIs should not modify the ConfigSet

2015-05-29 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564398#comment-14564398
 ] 

Alan Woodward commented on SOLR-7570:
-

bq. will the LOCAL changes make sense for SolrCloud mode?

I was thinking it might come in useful for things like PeerSync, or possibly 
per-core roles.

I'll change the collection-specific znode to conform with the existing setup.

Back-compatibility shouldn't be a problem, as existing installations will have 
their overlays read from the shared config, up until they make a change, at 
which point an overlay will be written to the collection config, which takes 
precedence.  I still need to work out how this works with ConfListeners though.
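
The precedence rule described above can be sketched with two in-memory maps standing in for the ZooKeeper nodes.  LayeredResourceLoader, openResource() and the map-based storage are assumptions made for illustration; only the writeResource() name comes from the suggestion quoted in the issue, and none of this is the actual SolrResourceLoader API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-layer loader: collection-specific overrides on top of a shared configset. */
final class LayeredResourceLoader {
  private final Map<String, String> sharedConfigSet;                             // read-only layer
  private final Map<String, String> collectionOverrides = new HashMap<>();  // mutable layer

  LayeredResourceLoader(Map<String, String> sharedConfigSet) {
    this.sharedConfigSet =
        Collections.unmodifiableMap(new HashMap<String, String>(sharedConfigSet));
  }

  /** Looks in the collection-specific layer first, then falls back to the shared configset. */
  String openResource(String name) {
    String own = collectionOverrides.get(name);
    return own != null ? own : sharedConfigSet.get(name);
  }

  /** Writes only ever go to the collection-specific layer. */
  void writeResource(String name, String content) {
    collectionOverrides.put(name, content);
  }
}
```

An existing installation keeps reading its overlay from the shared layer until the first write, after which the collection-specific copy wins, and other collections built from the same configset are unaffected.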

Changes shared between collections should be done through a different API, I 
think.  Something like the configset API being discussed on SOLR-5955 would be 
more appropriate for that.

 Config APIs should not modify the ConfigSet
 ---

 Key: SOLR-7570
 URL: https://issues.apache.org/jira/browse/SOLR-7570
 Project: Solr
  Issue Type: Improvement
Reporter: Tomás Fernández Löbbe
 Attachments: SOLR-7570.patch


 Originally discussed here: 
 http://mail-archives.apache.org/mod_mbox/lucene-dev/201505.mbox/%3CCAMJgJxSXCHxDzJs5-C-pKFDEBQD6JbgxB=-xp7u143ekmgp...@mail.gmail.com%3E
 The ConfigSet used to create a collection should be read-only. Changes made 
 via any of the Config APIs should only be applied to the collection where the 
 operation is done and not to other collections that may be using the same 
 ConfigSet. As discussed in the dev list: 
 When a collection is created we should have two things, an immutable part 
 (the ConfigSet) and a mutable part (configoverlay, generated schema, etc). 
 The ConfigSet will still be placed in ZooKeeper under /configs but the 
 mutable part should be placed under /collections/$COLLECTION_NAME/…
 [~romseygeek] suggested: 
 {quote}
 A nice way of doing it would be to make it part of the SolrResourceLoader 
 interface.  The ZK resource loader could check in the collection-specific 
 zknode first, and then under configs/, and we could add a writeResource() 
 method that writes to the collection-specific node as well.  Then all config 
 I/O goes via the resource loader, and we have a way of keeping certain parts 
 immutable.
 {quote}






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-05-29 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564727#comment-14564727
 ] 

Alan Woodward commented on LUCENE-6371:
---

I *think* leniency should be fixed now, because the TermContext for each leaf 
is built by SpanTermQuery.createWeight(), and then collected for IDF via a new 
SpanWeight.extractTermContexts() method, rather than being built by the parent 
Weight via extractTerms().  So there should only be one visit to the terms 
dictionary per term in normal use.

There's still an extra visit in SpanMTQWrapper, but I think we can fix that by 
adding a SpanTermQuery(Term, TermContext) constructor, like we have with the 
standard TermQuery.  Maybe we should carry this over to LUCENE-6466, as here 
it's getting mixed up with the span collection API, which is a separate thing.  
I'll put up a patch.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-05-29 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564768#comment-14564768
 ] 

Alan Woodward commented on LUCENE-6371:
---

Could it be because SpanWeight was previously using a TreeMap to collect terms, 
which was enforcing an ordering?  I'm a bit confused by how it would affect 
things, though, because the test that failed was running the exact same query 
in different ways, which would suggest that Java 7 was iterating over the exact 
same map non-deterministically.
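
As background to the TreeMap question, the sketch below shows the ordering guarantee a TreeMap provides, using LinkedHashMap as a contrast whose iteration order depends on insertion order.  TermOrderDemo is a hypothetical helper with strings standing in for terms; it says nothing about what the failing test actually did.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustration of map iteration order; not related to the actual SpanWeight code. */
final class TermOrderDemo {
  /** Inserts the terms into the given map and returns the resulting iteration order. */
  static List<String> iterationOrder(Map<String, Integer> map, String... terms) {
    for (int i = 0; i < terms.length; i++) {
      map.put(terms[i], i);
    }
    return new ArrayList<>(map.keySet());
  }
}
```

Two TreeMaps filled with the same terms in different orders iterate identically (sorted by key), while two LinkedHashMaps do not.  A plain HashMap makes no ordering promise at all, so code that relies on a particular order only works by accident.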

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Updated] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6466:
--
Attachment: LUCENE-6466-2.patch

Patch part 2, following discussion on LUCENE-6371.
* removes SpanSimilarity, in favour of a map of terms to termcontexts
* SpanTermQuery can take an optional TermContext in its constructor, similar to 
TermQuery
* SpanMTQWrapper now preserves term states when rewriting to SpanTermQueries

What would be nice would be to try and write an asserting TermsEnum that could 
check how many times seekExact(BytesRef) was called, to ensure that the various 
queries are re-using their term states properly.
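
A rough sketch of the counting idea follows.  CountingTermsDictionary is a hypothetical stand-in backed by a map, not a real TermsEnum wrapper, and its seekExact(String) only mirrors the role of seekExact(BytesRef); a real version would presumably wrap the enum handed out by the test framework.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a terms dictionary that counts lookups. */
final class CountingTermsDictionary {
  private final Map<String, Integer> docFreqs = new TreeMap<>();
  private int seekCount = 0;

  void add(String term, int docFreq) {
    docFreqs.put(term, docFreq);
  }

  /** Looks a term up and records that a lookup happened. */
  boolean seekExact(String term) {
    seekCount++;
    return docFreqs.containsKey(term);
  }

  int seekCount() {
    return seekCount;
  }

  /** Fails if more lookups happened than the caller expected. */
  void assertSeeksAtMost(int expected) {
    if (seekCount > expected) {
      throw new AssertionError(
          "expected at most " + expected + " seeks but saw " + seekCount);
    }
  }
}
```

A test would run a query against the counting dictionary and then assert one seek per distinct term, which would catch a query that rebuilds its term state instead of re-using it.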

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6466-2.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to have this being called from SpanWeight






[jira] [Assigned] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward reassigned LUCENE-6466:
-

Assignee: Alan Woodward

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6466-2.patch, LUCENE-6466-2.patch, 
 LUCENE-6466-2.patch, LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to have this being called from SpanWeight






[jira] [Commented] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-29 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14565026#comment-14565026
 ] 

Alan Woodward commented on LUCENE-6466:
---

I'll clean up LUCENE-6371 next, before we put this all back into 5.x

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6466-2.patch, LUCENE-6466-2.patch, 
 LUCENE-6466-2.patch, LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to have this being called from SpanWeight






[jira] [Updated] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6466:
--
Attachment: LUCENE-6466-2.patch

Nits appropriately picked :-)

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6466-2.patch, LUCENE-6466-2.patch, 
 LUCENE-6466-2.patch, LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to have this being called from SpanWeight






[jira] [Updated] (LUCENE-6466) Move SpanQuery.getSpans() to SpanWeight

2015-05-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6466:
--
Attachment: LUCENE-6466-2.patch

Oops, yes, I missed TopTermsSpanBooleanQueryRewrite.  Final patch with the 
changes there, plus some assertions copied from TermWeight/TermScorer.

 Move SpanQuery.getSpans() to SpanWeight
 ---

 Key: LUCENE-6466
 URL: https://issues.apache.org/jira/browse/LUCENE-6466
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6466-2.patch, LUCENE-6466-2.patch, 
 LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, LUCENE-6466.patch, 
 LUCENE-6466.patch


 SpanQuery.getSpans() should only be called on rewritten queries, so it seems 
 to make more sense to have this being called from SpanWeight






[jira] [Commented] (SOLR-7570) Config APIs should not modify the ConfigSet

2015-06-01 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567249#comment-14567249
 ] 

Alan Woodward commented on SOLR-7570:
-

I think that's making things overcomplicated.  Whether or not a specific config 
setting is updated via overlay or via XML file changes is an implementation 
detail.  I'd rather we just made configs immutable (at least through this API), 
and overlays always written to the collection-specific config.  If you want to 
update the overlay for multiple collections, you could use aliases, or set 
collection=coll1,coll2 etc on the query?

 Config APIs should not modify the ConfigSet
 ---

 Key: SOLR-7570
 URL: https://issues.apache.org/jira/browse/SOLR-7570
 Project: Solr
  Issue Type: Improvement
Reporter: Tomás Fernández Löbbe
 Attachments: SOLR-7570.patch


 Originally discussed here: 
 http://mail-archives.apache.org/mod_mbox/lucene-dev/201505.mbox/%3CCAMJgJxSXCHxDzJs5-C-pKFDEBQD6JbgxB=-xp7u143ekmgp...@mail.gmail.com%3E
 The ConfigSet used to create a collection should be read-only. Changes made 
 via any of the Config APIs should only be applied to the collection where the 
 operation is done and not to other collections that may be using the same 
 ConfigSet. As discussed in the dev list: 
 When a collection is created we should have two things, an immutable part 
 (the ConfigSet) and a mutable part (configoverlay, generated schema, etc). 
 The ConfigSet will still be placed in ZooKeeper under /configs but the 
 mutable part should be placed under /collections/$COLLECTION_NAME/…
 [~romseygeek] suggested: 
 {quote}
 A nice way of doing it would be to make it part of the SolrResourceLoader 
 interface.  The ZK resource loader could check in the collection-specific 
 zknode first, and then under configs/, and we could add a writeResource() 
 method that writes to the collection-specific node as well.  Then all config 
 I/O goes via the resource loader, and we have a way of keeping certain parts 
 immutable.
 {quote}






[jira] [Commented] (LUCENE-6513) Allow limits on SpanMultiTermQueryWrapper expansion

2015-06-01 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567242#comment-14567242
 ] 

Alan Woodward commented on LUCENE-6513:
---

The problem with using TopTerms is that it requires that terms have the same 
comparison value across segments (which makes it more efficient, because it 
only needs to build TermContexts for candidate terms), but that doesn't hold 
for termFreq.

 Allow limits on SpanMultiTermQueryWrapper expansion
 ---

 Key: LUCENE-6513
 URL: https://issues.apache.org/jira/browse/LUCENE-6513
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6513.patch


 SpanMultiTermQueryWrapper currently rewrites to a SpanOrQuery with as many 
 clauses as there are matching terms.  It would be nice to be able to limit 
 this in a slightly nicer way than using TopTerms, which for most queries just 
 translates to a lexicographical ordering.






[jira] [Updated] (LUCENE-6513) Allow limits on SpanMultiTermQueryWrapper expansion

2015-06-01 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6513:
--
Attachment: LUCENE-6513.patch

Patch.  This adds a FrequentTerms rewrite to SpanMTQWrapper, which will rewrite 
the query using the terms with highest term frequencies.
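
As a rough illustration of the selection step (not the code in the patch): the 
sketch below assumes per-segment frequencies are summed before the most 
frequent terms are kept.  FrequentTermsSketch and topTerms are invented names, 
and plain maps stand in for per-segment term statistics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class FrequentTermsSketch {
  // Sums per-segment frequencies, then keeps the maxTerms most frequent terms.
  static List<String> topTerms(List<Map<String, Integer>> segments, int maxTerms) {
    Map<String, Long> totals = new HashMap<>();
    for (Map<String, Integer> segment : segments) {
      for (Map.Entry<String, Integer> e : segment.entrySet()) {
        totals.merge(e.getKey(), e.getValue().longValue(), Long::sum);
      }
    }
    // Min-heap on total frequency: the head is always the weakest candidate.
    PriorityQueue<Map.Entry<String, Long>> heap =
        new PriorityQueue<>((a, b) -> Long.compare(a.getValue(), b.getValue()));
    for (Map.Entry<String, Long> e : totals.entrySet()) {
      heap.add(e);
      if (heap.size() > maxTerms) {
        heap.poll(); // drop the least frequent term seen so far
      }
    }
    List<String> result = new ArrayList<>();
    while (!heap.isEmpty()) {
      result.add(heap.poll().getKey());
    }
    Collections.reverse(result); // most frequent first
    return result;
  }
}
```

Because a term's frequency differs from segment to segment, every segment has 
to be visited before any term can be discarded, which is the cost that 
TopTerms avoids.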

 Allow limits on SpanMultiTermQueryWrapper expansion
 ---

 Key: LUCENE-6513
 URL: https://issues.apache.org/jira/browse/LUCENE-6513
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6513.patch


 SpanMultiTermQueryWrapper currently rewrites to a SpanOrQuery with as many 
 clauses as there are matching terms.  It would be nice to be able to limit 
 this in a slightly nicer way than using TopTerms, which for most queries just 
 translates to a lexicographical ordering.






[jira] [Created] (LUCENE-6513) Allow limits on SpanMultiTermQueryWrapper expansion

2015-06-01 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6513:
-

 Summary: Allow limits on SpanMultiTermQueryWrapper expansion
 Key: LUCENE-6513
 URL: https://issues.apache.org/jira/browse/LUCENE-6513
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Priority: Minor


SpanMultiTermQueryWrapper currently rewrites to a SpanOrQuery with as many 
clauses as there are matching terms.  It would be nice to be able to limit this 
in a slightly nicer way than using TopTerms, which for most queries just 
translates to a lexicographical ordering.






[jira] [Commented] (SOLR-7555) Display total space and available space in Admin

2015-06-01 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567394#comment-14567394
 ] 

Alan Woodward commented on SOLR-7555:
-

How about adding the methods to DirectoryFactory instead?
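
For reference, both figures can be read with the JDK alone.  The sketch below 
is not the attached patch and does not touch DirectoryFactory; DiskSpaceSketch 
and spaceFor are names invented for illustration, while Files.getFileStore(), 
getTotalSpace() and getUsableSpace() are standard java.nio.file calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;

final class DiskSpaceSketch {
  // Returns {totalBytes, usableBytes} for the volume that holds dir.
  static long[] spaceFor(Path dir) {
    try {
      FileStore store = Files.getFileStore(dir);
      return new long[] {store.getTotalSpace(), store.getUsableSpace()};
    } catch (IOException e) {
      // Surface I/O problems without forcing callers to declare them.
      throw new UncheckedIOException(e);
    }
  }
}
```

A directory implementation that is not backed by a local file system would need 
its own answer, which is one argument for putting the methods on the factory.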

 Display total space and available space in Admin
 

 Key: SOLR-7555
 URL: https://issues.apache.org/jira/browse/SOLR-7555
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 5.1
Reporter: Eric Pugh
Assignee: Erik Hatcher
Priority: Minor
 Fix For: 5.2

 Attachments: DiskSpaceAwareDirectory.java, 
 SOLR-7555-display_disk_space.patch, SOLR-7555-display_disk_space_v2.patch, 
 SOLR-7555-display_disk_space_v3.patch, SOLR-7555.patch


 Frequently I have access to the Solr Admin console, but not the underlying 
 server, and I'm curious how much space remains available.   This little patch 
 exposes total Volume size as well as the usable space remaining:
 !https://monosnap.com/file/VqlReekCFwpK6utI3lP18fbPqrGI4b.png!
 I'm not sure if this is the best place to put this, as every shard will share 
 the same data, so maybe it should be on the top level Dashboard?  Also not 
 sure what to call the fields! 






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-05-27 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560857#comment-14560857
 ] 

Alan Woodward commented on LUCENE-6371:
---

It's the same logic as https://issues.apache.org/jira/browse/LUCENE-6425 
really.  Spans should only be pulled from a SpanQuery after it has been 
rewritten against a searcher, so it makes sense that they should be on 
SpanWeight rather than SpanQuery.  And it simplifies pulling Spans for use in 
things like highlighting (or payload collection, as here) - before you had to 
rewrite, extract terms, build the termcontexts map and then call getSpans(); 
now you just call IndexSearcher.getNormalizedWeight(), cast to a SpanWeight and 
call getSpans.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Updated] (SOLR-7570) Config APIs should not modify the ConfigSet

2015-05-27 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated SOLR-7570:

Attachment: SOLR-7570.patch

Here's the first cut of an idea.

* Adds a writeResource(String, byte[], Location) method to SolrResourceLoader 
(Location can be LOCAL, COLLECTION, CONFIGSET)
* SolrResourceLoader takes both a local instance dir and a shared config dir.  
These can be the same if you're not using a configset.
* The standard resource loader looks in three places for resources:
** core instance dir
** configset
** classpath
* The ZK resource loader looks in four places:
** core instance dir
** collection-specific config
** zk config
** classpath
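
The lookup order above can be pictured with a small sketch.  This is an 
assumption-laden illustration, not the patch: LayeredResourceLookup is an 
invented name, and in-memory maps stand in for the instance dir, the 
collection-specific config, the ZK config and the classpath.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class LayeredResourceLookup {
  // Each layer maps resource name -> contents; layers are searched in order.
  private final List<Map<String, String>> layers;

  @SafeVarargs
  LayeredResourceLookup(Map<String, String>... layersInPriorityOrder) {
    this.layers = Arrays.asList(layersInPriorityOrder);
  }

  // Returns the contents from the first layer that holds the resource.
  Optional<String> open(String name) {
    for (Map<String, String> layer : layers) {
      String found = layer.get(name);
      if (found != null) {
        return Optional.of(found);
      }
    }
    return Optional.empty();
  }

  // Demo: the collection-specific copy shadows the shared one.
  static String demo() {
    Map<String, String> instanceDir = new HashMap<>();
    Map<String, String> collectionConfig = new HashMap<>();
    Map<String, String> zkConfig = new HashMap<>();
    Map<String, String> classpath = new HashMap<>();
    zkConfig.put("schema.xml", "shared schema");
    collectionConfig.put("schema.xml", "collection schema");
    LayeredResourceLookup loader =
        new LayeredResourceLookup(instanceDir, collectionConfig, zkConfig, classpath);
    return loader.open("schema.xml").orElse("missing");
  }
}
```

Earlier layers shadow later ones, so a collection-level override hides the 
shared copy without modifying it.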

You can write to either the local core instance dir, or to the 
collection-specific config (I added CONFIGSET as a location in case we want to 
use that later for things like specifying where a particular resource was 
found, but that can be taken out if it's not adding anything now).

Writing to the collection-specific config uses version-tracking to implement 
optimistic concurrency.
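
To make the version-tracking idea concrete, here is a minimal in-memory sketch. 
It is not the patch: VersionedConfigStore and its methods are hypothetical, and 
the scheme is only similar in spirit to a versioned write against ZooKeeper.

```java
import java.util.HashMap;
import java.util.Map;

final class VersionedConfigStore {
  private static final class Entry {
    final byte[] data;
    final int version;

    Entry(byte[] data, int version) {
      this.data = data;
      this.version = version;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();

  // Current version of a resource, or -1 if it does not exist yet.
  synchronized int version(String name) {
    Entry e = entries.get(name);
    return e == null ? -1 : e.version;
  }

  // Writes only if the caller saw the latest version; returns false when
  // another writer got there first, so the caller can re-read and retry.
  synchronized boolean write(String name, byte[] data, int expectedVersion) {
    if (version(name) != expectedVersion) {
      return false;
    }
    entries.put(name, new Entry(data.clone(), expectedVersion + 1));
    return true;
  }

  // Demo: a fresh write succeeds, a stale one fails, a retry succeeds.
  static boolean demo() {
    VersionedConfigStore store = new VersionedConfigStore();
    boolean first = store.write("configoverlay.json", new byte[] {1}, -1);
    boolean stale = store.write("configoverlay.json", new byte[] {2}, -1);
    boolean retry = store.write("configoverlay.json", new byte[] {2}, 0);
    return first && !stale && retry;
  }
}
```

A writer that loses the race re-reads the resource, reapplies its change and 
tries again with the new version number.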

There are tests for the standard resource loader and the ZK resource loader.

This is still pretty rough around the edges, and I haven't run the full test 
suite or started cutting over existing code to using the new API, but it's a 
start.  What do people think?

 Config APIs should not modify the ConfigSet
 ---

 Key: SOLR-7570
 URL: https://issues.apache.org/jira/browse/SOLR-7570
 Project: Solr
  Issue Type: Improvement
Reporter: Tomás Fernández Löbbe
 Attachments: SOLR-7570.patch


 Originally discussed here: 
 http://mail-archives.apache.org/mod_mbox/lucene-dev/201505.mbox/%3CCAMJgJxSXCHxDzJs5-C-pKFDEBQD6JbgxB=-xp7u143ekmgp...@mail.gmail.com%3E
 The ConfigSet used to create a collection should be read-only. Changes made 
 via any of the Config APIs should only be applied to the collection where the 
 operation is done and not to other collections that may be using the same 
 ConfigSet. As discussed in the dev list: 
 When a collection is created we should have two things, an immutable part 
 (the ConfigSet) and a mutable part (configoverlay, generated schema, etc). 
 The ConfigSet will still be placed in ZooKeeper under /configs but the 
 mutable part should be placed under /collections/$COLLECTION_NAME/…
 [~romseygeek] suggested: 
 {quote}
 A nice way of doing it would be to make it part of the SolrResourceLoader 
 interface.  The ZK resource loader could check in the collection-specific 
 zknode first, and then under configs/, and we could add a writeResource() 
 method that writes to the collection-specific node as well.  Then all config 
 I/O goes via the resource loader, and we have a way of keeping certain parts 
 immutable.
 {quote}






[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-05-29 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371.patch

Patch updated to trunk.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch, LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-05-28 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562834#comment-14562834
 ] 

Alan Woodward commented on LUCENE-6371:
---

I think it's still useful though - I use it all the time!  It would be nice if 
you could restrict the number of SpanOr clauses it rewrites to, but that's a 
separate issue.

If you really think that moving .getSpans() and .extractTerms() to SpanWeight 
doesn't gain anything, then I can back it out.  But I think it does simplify 
the API and brings it more into line with our other standard queries.  And I 
really don't see that exposing the termcontexts map on the SpanWeight 
constructor is any worse than exposing it directly in .getSpans().  In fact, 
I'd say that it's hiding it better - very few users of lucene are going to be 
looking at SpanWeights, as they're an implementation detail, but anyone using 
an IDE is going to be shown SpanQuery.getSpans() when they try and autocomplete 
on a SpanQuery object, and it's not something that most users need to worry 
about.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6371) Improve Spans payload collection

2015-05-27 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560603#comment-14560603
 ] 

Alan Woodward commented on LUCENE-6371:
---

It's because of the way scoring works on Spans.  We build a span tree, calling 
.getSpans() on the various weights as we go down the hierarchy, but scoring is 
only done by the top level, so we only want to build a Similarity in the 
top-level weight.  And because the Similarity is built in the constructor, we 
need to collect all the terms and termcontexts of the various leaves before the 
Weight itself is built, so we can't just pass needsScores.

Ideally scoring would be done on the Spans themselves (making them even more 
like just a specialized Scorer), but that's a bigger change.

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Updated] (LUCENE-6371) Improve Spans payload collection

2015-05-25 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6371:
--
Attachment: LUCENE-6371.patch

Here's a patch taking into account all the comments here and on LUCENE-6494.
* SpanCollector becomes an interface again, so payload collection is entirely 
defined in the .payloads package
* BufferedSpanCollector is removed, replaced by a simple array of 
SpanCollectors in NearSpansOrdered.  SpanCollector has two methods to deal with 
this, newSubCollectors() and collectedComposite(), to create and then replay.
* SpanCollectors are passed through in getSpans().  A null passed here means no 
collection, and there's a default getSpans() call on SpanWeight that always 
passes a null collector.
* I've removed SpanSimilarity, in favour of passing a map of Terms to 
TermContexts to the SpanWeight constructor.  If this is null, then scoring 
isn't required; if not, then SpanWeight builds a SimScorer and passes that to 
its scorer.
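
The last point, where a null map means no scoring, can be sketched as follows. 
Everything here is a hypothetical stand-in: SketchSpanWeight is not the real 
SpanWeight, the Scorer interface is not SimScorer, a map of term to document 
frequency plays the role of the TermContexts, and the scoring formula is 
arbitrary.

```java
import java.util.Map;

final class SketchSpanWeight {
  // Hypothetical stand-in for a SimScorer.
  interface Scorer {
    float score(int freq);
  }

  private final Scorer scorer; // null when scoring is not required

  // termStats stands in for the term -> TermContext map; null disables scoring.
  SketchSpanWeight(Map<String, Integer> termStats) {
    if (termStats == null) {
      this.scorer = null;
    } else {
      int docFreqSum = 0;
      for (int docFreq : termStats.values()) {
        docFreqSum += docFreq;
      }
      // Arbitrary toy formula: rarer terms give a larger weight.
      final float weight = 1f / (1 + docFreqSum);
      this.scorer = freq -> freq * weight;
    }
  }

  boolean needsScores() {
    return scorer != null;
  }

  float score(int freq) {
    return scorer == null ? 0f : scorer.score(freq);
  }
}
```

Only the top-level weight would be handed a non-null map, so the sub-weights 
in a span tree skip building scoring state altogether.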

 Improve Spans payload collection
 

 Key: LUCENE-6371
 URL: https://issues.apache.org/jira/browse/LUCENE-6371
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Alan Woodward
Priority: Minor
 Fix For: Trunk, 5.3

 Attachments: LUCENE-6371.patch, LUCENE-6371.patch, LUCENE-6371.patch, 
 LUCENE-6371.patch


 Spin off from LUCENE-6308, see the comments there from around 23 March 2015.






[jira] [Commented] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-25 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558370#comment-14558370
 ] 

Alan Woodward commented on LUCENE-6494:
---

Thanks for dealing with this, guys.  I've tried to address some of the comments 
on the general shape of the API over on LUCENE-6371, so that we can keep this 
issue for PayloadSpanUtil itself.

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch, LUCENE-6494.patch, 
 LUCENE-6494.patch


 With the addition of SpanCollectors, we can now get arbitrary postings 
 information from SpanQueries.  PayloadSpanUtil does some rewriting to convert 
 non-span queries into SpanQueries so that it can collect payloads.  It would 
 be good to make this more generic, so that we can collect any postings 
 information from any query (without having to make invasive changes to 
 already optimized Scorers, etc).






[jira] [Commented] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554756#comment-14554756
 ] 

Alan Woodward commented on LUCENE-6494:
---

Yeah, it's NearSpansOrdered that makes everything complicated, because by the 
time you call collect() its child Spans have moved on.  So you need to either 
have just a generic collector type, or pass the collector type to the 
NearSpansOrdered constructor and have a way of creating a buffered collector 
from that, which leads to the overcomplicated generics that Robert didn't like.
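
A small sketch of why buffering is needed, under stated assumptions: 
BufferedCollectSketch and PositionCollector are invented names, and plain ints 
stand in for whatever postings data a real collector would read from the leaf.

```java
import java.util.ArrayList;
import java.util.List;

final class BufferedCollectSketch {
  // Hypothetical stand-in for a collector receiving leaf positions.
  interface PositionCollector {
    void collectLeaf(int position);
  }

  private final List<Integer> buffered = new ArrayList<>();

  // Called while the child is still positioned on the matching leaf.
  void buffer(int position) {
    buffered.add(position);
  }

  // Called later, after the child iterators may have advanced.
  void replay(PositionCollector target) {
    for (int position : buffered) {
      target.collectLeaf(position);
    }
    buffered.clear();
  }

  // Demo: positions are captured as each child is visited, then replayed.
  static List<Integer> demo() {
    BufferedCollectSketch buffer = new BufferedCollectSketch();
    int[] childPositions = {3, 4, 7};
    for (int position : childPositions) {
      buffer.buffer(position); // the child moves on after this point
    }
    List<Integer> seen = new ArrayList<>();
    buffer.replay(position -> seen.add(position));
    return seen;
  }
}
```

The data is captured at the moment each child matches and handed to the real 
collector only when collect() is finally called.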

There might be a way of doing this by passing a SpanCollector to the SpanWeight 
somehow?  And just doing some ugly brute-force casting in replay(), risking the 
ire of the Generics Policeman.  But in the meantime, please do try this API out 
- feedback from people other than me who are using it will be very useful!

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch, LUCENE-6494.patch, 
 LUCENE-6494.patch


 With the addition of SpanCollectors, we can now get arbitrary postings 
 information from SpanQueries.  PayloadSpanUtil does some rewriting to convert 
 non-span queries into SpanQueries so that it can collect payloads.  It would 
 be good to make this more generic, so that we can collect any postings 
 information from any query (without having to make invasive changes to 
 already optimized Scorers, etc).






[jira] [Updated] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6494:
--
Attachment: LUCENE-6494.patch

This patch makes MatchData concrete and package-private, and removes all the 
generics.  SpanCollectorFactory is also gone.

I'll see if I can fold BufferedSpanCollector entirely into NearSpansOrdered, as 
that's the only class that actually makes use of it.

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch


 With the addition of SpanCollectors, we can now get arbitrary postings 
 information from SpanQueries.  PayloadSpanUtil does some rewriting to convert 
 non-span queries into SpanQueries so that it can collect payloads.  It would 
 be good to make this more generic, so that we can collect any postings 
 information from any query (without having to make invasive changes to 
 already optimized Scorers, etc).






[jira] [Updated] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6494:
--
Attachment: LUCENE-6494.patch

Here is a patch.

SpanCollector is changed from an interface to a concrete implementation, 
parametrized by a MatchData data type defining the type of postings information 
to collect.  The no-op implementation is a specialised subclass.

Most of the functionality from PayloadSpanUtil is moved to MatchDataCollector.  
This will take an arbitrary query, convert it to a Span query, run it over any 
document in a searcher, and return a MatchDataIterator that iterates over the 
matches for that query within that doc.  This ignores things like SpanNot 
exclusions and Boolean MUST_NOT clauses, so you should make sure that you 
already know a document is a match before passing it in.  PayloadSpanUtil 
retains its existing methods and constructor for backwards compatibility.

MatchData implementations for positions, offsets and payloads are all provided, 
although at the moment you can only collect one of these at a time - a 
composite collector is something I want to look at in another issue.

The MatchDataIterator<T> interface is a bit clunky at the moment.  I might look 
at moving the field information directly into MatchData and changing this to 
look more like other iterators (either lucene ones or Java ones).

There are still lots of javadocs to add, and I'm writing more tests, but I 
thought I'd put this up for comment.  It should allow things like luwak's 
exact-match highlighter to work without requiring all the low-level changes in 
LUCENE-2878.

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch


 With the addition of SpanCollectors, we can now get arbitrary postings 
 information from SpanQueries.  PayloadSpanUtil does some rewriting to convert 
 non-span queries into SpanQueries so that it can collect payloads.  It would 
 be good to make this more generic, so that we can collect any postings 
 information from any query (without having to make invasive changes to 
 already optimized Scorers, etc).






[jira] [Commented] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554133#comment-14554133
 ] 

Alan Woodward commented on LUCENE-6494:
---

bq. Please use regular names like get() and set()

Sure, that's easy enough.

On the number of classes and generics, maybe it would fit more in with the rest 
of the lucene API if there was just a single MatchData class taking an int 
saying what to collect?

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch


 With the addition of SpanCollectors, we can now get arbitrary postings 
 information from SpanQueries.  PayloadSpanUtil does some rewriting to convert 
 non-span queries into SpanQueries so that it can collect payloads.  It would 
 be good to make this more generic, so that we can collect any postings 
 information from any query (without having to make invasive changes to 
 already optimized Scorers, etc).






[jira] [Created] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)
Alan Woodward created LUCENE-6494:
-

 Summary: Make PayloadSpanUtil apply to other postings information
 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2


With the addition of SpanCollectors, we can now get arbitrary postings 
information from SpanQueries.  PayloadSpanUtil does some rewriting to convert 
non-span queries into SpanQueries so that it can collect payloads.  It would be 
good to make this more generic, so that we can collect any postings information 
from any query (without having to make invasive changes to already optimized 
Scorers, etc).






[jira] [Updated] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6494:
--
Attachment: LUCENE-6494.patch

Removing the generics and integrating the buffered collector directly into 
NearSpansOrdered also allows a bunch of other cleanups.  So now:
* SpanWeight.getSpans() looks exactly like Weight.scorer()
* SpanPositionCheckQuery.accept() doesn't need to take a collector any more

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch, LUCENE-6494.patch, 
 LUCENE-6494.patch





[jira] [Updated] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6494:
--
Attachment: LUCENE-6494.patch

BufferedSpanCollector is now a private class in NearSpansOrdered, and buffer() 
and bufferedCollector() are removed from SpanCollector.

So the public API is now:
* SpanCollector (for collecting data from Spans, for generic use in queries)
* MatchDataCollector (for collecting data from a specific document and query)
* MatchDataIterator (for iterating over the results of a MatchDataCollector)

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch, LUCENE-6494.patch





[jira] [Commented] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555236#comment-14555236
 ] 

Alan Woodward commented on LUCENE-6494:
---

bq. I personally think it makes sense for this to not be a blocker for 5.2 and 
shipping this with 5.3 instead. Do you think that makes sense?
+1

I'm not sure I'll have svn access for the next couple of days; will you be able 
to back this out of the 5.2 branch, Anshum?  If not, I can probably get to it on 
Monday UK time (so before the release).

Regarding Rob's concerns:
* javadocs and commented-out code I can sort out asap; the API kept changing 
every time I typed this afternoon, so I didn't get round to getting this into 
precommit-passing shape.
* I'll replace the no-op collector with null checks; that's a leftover from the 
generics stuff
* MatchDataIterator and MatchDataCollector can be moved to the highlighter 
package, I think (or sandbox initially, but the idea is to eventually build a 
highlighter from them)
* I'll add stuff to AssertingSpans to enforce the contracts
* I think the boolean works best ('enablePositionCollection' maybe?) - will see 
what works
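
To make the "null checks instead of a no-op collector" point concrete, here is 
a small self-contained sketch.  LeafCollector, SketchSpans and advanceAll() are 
hypothetical names invented for this example, not classes from the patch; the 
only thing it is meant to show is that a caller who needs no postings data can 
pass null rather than an object that does nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback; stands in for whatever the real collector interface looks like.
interface LeafCollector {
  void collectLeaf(int position);
}

// Hypothetical stand-in for a Spans-like iterator over match positions.
final class SketchSpans {
  private final int[] positions;

  SketchSpans(int... positions) {
    this.positions = positions;
  }

  // Visits every position; the collector may be null when the caller
  // only needs the number of matches and no per-position data.
  int advanceAll(LeafCollector collector) {
    int matches = 0;
    for (int position : positions) {
      matches++;
      if (collector != null) {
        collector.collectLeaf(position);
      }
    }
    return matches;
  }
}

final class NullCheckDemo {
  public static void main(String[] args) {
    SketchSpans spans = new SketchSpans(2, 5, 9);
    System.out.println(spans.advanceAll(null));

    List<Integer> seen = new ArrayList<>();
    spans.advanceAll(position -> seen.add(position));
    System.out.println(seen);
  }
}
```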

Sorry about the timing here, I don't want to mess the release of 5.2 up!

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch, LUCENE-6494.patch, 
 LUCENE-6494.patch





[jira] [Commented] (LUCENE-6494) Make PayloadSpanUtil apply to other postings information

2015-05-21 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554587#comment-14554587
 ] 

Alan Woodward commented on LUCENE-6494:
---

We could add a CollectionTerm to MatchData as well, to collect all terms from 
a Spans.  I'm not sure I see why you need the Term for highlighting though - 
can't you just use offsets?

 Make PayloadSpanUtil apply to other postings information
 

 Key: LUCENE-6494
 URL: https://issues.apache.org/jira/browse/LUCENE-6494
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alan Woodward
Assignee: Alan Woodward
 Fix For: 5.2

 Attachments: LUCENE-6494.patch, LUCENE-6494.patch, LUCENE-6494.patch, 
 LUCENE-6494.patch





[jira] [Commented] (LUCENE-6490) TestPayloadNearQuery fails with NPE

2015-05-19 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550931#comment-14550931
 ] 

Alan Woodward commented on LUCENE-6490:
---

We're going to need a different SpanCollector per-scorer to deal with 
multithreaded search.  Working on a patch now.

Randomized testing ftw...

 TestPayloadNearQuery fails with NPE
 ---

 Key: LUCENE-6490
 URL: https://issues.apache.org/jira/browse/LUCENE-6490
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Alan Woodward


[jira] [Assigned] (LUCENE-6490) TestPayloadNearQuery fails with NPE

2015-05-19 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward reassigned LUCENE-6490:
-

Assignee: Alan Woodward

 TestPayloadNearQuery fails with NPE
 ---

 Key: LUCENE-6490
 URL: https://issues.apache.org/jira/browse/LUCENE-6490
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Alan Woodward

 ant test  -Dtestcase=TestPayloadNearQuery -Dtests.method=test 
 -Dtests.seed=24743B1132665845 -Dtests.slow=true -Dtests.locale=es_NI 
 -Dtests.timezone=Israel -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
 {noformat}
[junit4] Started J0 PID(19895@localhost).
[junit4] Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
[junit4]   2 NOTE: reproduce with: ant test  
 -Dtestcase=TestPayloadNearQuery -Dtests.method=test 
 -Dtests.seed=24743B1132665845 -Dtests.slow=true -Dtests.locale=es_NI 
 -Dtests.timezone=Israel -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR   0.09s | TestPayloadNearQuery.test 
[junit4] Throwable #1: java.lang.RuntimeException: 
 java.util.concurrent.ExecutionException: java.lang.NullPointerException
[junit4]  at 
 __randomizedtesting.SeedInfo.seed([24743B1132665845:AC2004CB9C9A35BD]:0)
[junit4]  at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:669)
[junit4]  at 
 org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:353)
[junit4]  at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:382)
[junit4]  at 
 org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:144)
[junit4]  at java.lang.Thread.run(Thread.java:745)
[junit4] Caused by: java.util.concurrent.ExecutionException: 
 java.lang.NullPointerException
[junit4]  at 
 java.util.concurrent.FutureTask.report(FutureTask.java:122)
[junit4]  at 
 java.util.concurrent.FutureTask.get(FutureTask.java:192)
[junit4]  at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:665)
[junit4]  ... 39 more
[junit4] Caused by: java.lang.NullPointerException
[junit4]  at 
 org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanScorer.processPayloads(PayloadNearQuery.java:202)
[junit4]  at 
 org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanScorer.setFreqCurrentDoc(PayloadNearQuery.java:223)
[junit4]  at 
 org.apache.lucene.search.spans.SpanScorer.ensureFreq(SpanScorer.java:65)
[junit4]  at 
 org.apache.lucene.search.spans.SpanScorer.score(SpanScorer.java:118)
[junit4]  at 
 org.apache.lucene.search.AssertingScorer.score(AssertingScorer.java:67)
[junit4]  at 
 org.apache.lucene.search.TopScoreDocCollector$SimpleTopScoreDocCollector$1.collect(TopScoreDocCollector.java:64)
[junit4]  at 
 org.apache.lucene.search.AssertingLeafCollector.collect(AssertingLeafCollector.java:53)
[junit4]  at 
 org.apache.lucene.search.AssertingCollector$1.collect(AssertingCollector.java:57)
[junit4]  at 
 org.apache.lucene.search.AssertingLeafCollector.collect(AssertingLeafCollector.java:53)
[junit4]  at 
 org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:203)
[junit4]  at 
 org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:174)
[junit4]  at 
 org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
[junit4]  at 
 org.apache.lucene.search.AssertingBulkScorer.score(AssertingBulkScorer.java:69)
[junit4]  at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:714)
[junit4]  at 
 org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:93)
[junit4]  at 
 org.apache.lucene.search.IndexSearcher$4.call(IndexSearcher.java:656)
[junit4]  at 
 org.apache.lucene.search.IndexSearcher$4.call(IndexSearcher.java:653)
[junit4]  at 
 java.util.concurrent.FutureTask.run(FutureTask.java:265)
[junit4]  at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[junit4]  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[junit4]  ... 1 more
[junit4]   2 NOTE: test params are: 
 codec=FastDecompressionCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=FAST_DECOMPRESSION,
  chunkSize=25825, maxDocsPerChunk=709, blockSize=459), 
 termVectorsFormat=CompressingTermVectorsFormat(compressionMode=FAST_DECOMPRESSION,
  chunkSize=25825, blockSize=459)), sim=DefaultSimilarity, locale=es_NI, 
 timezone=Israel
[junit4]   2 

[jira] [Updated] (LUCENE-6490) TestPayloadNearQuery fails with NPE

2015-05-19 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6490:
--
Attachment: LUCENE-6490.patch

This patch introduces a SpanCollectorFactory interface, and SpanWeight now 
takes this instead of SpanCollector, generating a new collector each time 
scorer() is called.
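
For readers following along, the shape of the fix can be sketched roughly as 
below.  This is a simplified, self-contained model and not the code in the 
patch: the method signatures, CountingCollector and SketchWeight are 
assumptions for illustration.  The point is only that the weight holds a 
factory, so each scorer() call gets its own collector and no mutable collector 
state is shared between search threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical, simplified shapes; not the interfaces from the patch.
interface SpanCollector {
  void collectLeaf(int position);

  int count();
}

interface SpanCollectorFactory {
  SpanCollector newCollector();
}

final class CountingCollector implements SpanCollector {
  // Unsynchronized mutable state: safe only while confined to a single scorer.
  private int count;

  @Override
  public void collectLeaf(int position) {
    count++;
  }

  @Override
  public int count() {
    return count;
  }
}

// Stand-in for a weight: it keeps the factory, never a shared collector.
final class SketchWeight {
  private final SpanCollectorFactory factory;

  SketchWeight(SpanCollectorFactory factory) {
    this.factory = factory;
  }

  // Stand-in for scorer(): every call creates a fresh collector.
  SpanCollector scorer(int positionsInSegment) {
    SpanCollector collector = factory.newCollector();
    for (int position = 0; position < positionsInSegment; position++) {
      collector.collectLeaf(position);
    }
    return collector;
  }
}

final class FactoryDemo {
  public static void main(String[] args) throws Exception {
    SketchWeight weight = new SketchWeight(CountingCollector::new);
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<SpanCollector>> results = new ArrayList<>();
      for (int segment = 1; segment <= 4; segment++) {
        final int size = segment * 1000;
        // One task per simulated segment, as with an executor-backed searcher.
        results.add(pool.submit(() -> weight.scorer(size)));
      }
      for (Future<SpanCollector> result : results) {
        System.out.println(result.get().count());
      }
    } finally {
      pool.shutdown();
    }
  }
}
```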

Running tests and precommit now.

 TestPayloadNearQuery fails with NPE
 ---

 Key: LUCENE-6490
 URL: https://issues.apache.org/jira/browse/LUCENE-6490
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Alan Woodward
 Attachments: LUCENE-6490.patch



[jira] [Updated] (LUCENE-6706) Support Payload scoring for all SpanQueries

2015-08-03 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6706:
--
Summary: Support Payload scoring for all SpanQueries  (was: Support Payload 
scoring for SpanOrQuery)

 Support Payload scoring for all SpanQueries
 ---

 Key: LUCENE-6706
 URL: https://issues.apache.org/jira/browse/LUCENE-6706
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/query/scoring
Affects Versions: 5.2.1
Reporter: Jamie Johnson
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3

 Attachments: LUCENE-6706.patch, PayloadSpanOrQuery.java


 I need a way to have payloads influence the score of SpanOrQuery's.  






[jira] [Resolved] (LUCENE-6706) Support Payload scoring for all SpanQueries

2015-08-03 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved LUCENE-6706.
---
   Resolution: Fixed
Fix Version/s: 5.3

Thanks Jamie!

 Support Payload scoring for all SpanQueries
 ---

 Key: LUCENE-6706
 URL: https://issues.apache.org/jira/browse/LUCENE-6706
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/query/scoring
Affects Versions: 5.2.1
Reporter: Jamie Johnson
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3

 Attachments: LUCENE-6706.patch, PayloadSpanOrQuery.java


 I need a way to have payloads influence the score of SpanOrQuery's.  






[jira] [Updated] (LUCENE-6706) Support Payload scoring for SpanOrQuery

2015-07-30 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated LUCENE-6706:
--
Attachment: LUCENE-6706.patch

Here's a patch adding a new PayloadScoreQuery that takes a SpanQuery and a 
PayloadFunction.  PayloadTermQuery and PayloadNearQuery are deprecated.
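
The idea of pairing any span match with a payload-combining function can be 
modelled in a few lines.  The sketch below is deliberately standalone and does 
not use Lucene's classes: SketchPayloadFunction, MaxPayload, AveragePayload and 
PayloadScoring are hypothetical names, the two-method interface is a 
simplification, and multiplying the span score by the payload factor is an 
assumption made for the example rather than a statement about what the patch 
does.

```java
// Hypothetical, simplified model; this is not Lucene's PayloadFunction signature.
interface SketchPayloadFunction {
  // Folds the next payload score into the running value.
  float currentScore(int numPayloadsSeen, float currentScore, float payloadScore);

  // Turns the running value into the final per-document payload factor.
  float docScore(int numPayloadsSeen, float payloadScore);
}

final class MaxPayload implements SketchPayloadFunction {
  @Override
  public float currentScore(int numPayloadsSeen, float currentScore, float payloadScore) {
    return numPayloadsSeen == 0 ? payloadScore : Math.max(currentScore, payloadScore);
  }

  @Override
  public float docScore(int numPayloadsSeen, float payloadScore) {
    return numPayloadsSeen > 0 ? payloadScore : 1f;
  }
}

final class AveragePayload implements SketchPayloadFunction {
  @Override
  public float currentScore(int numPayloadsSeen, float currentScore, float payloadScore) {
    return currentScore + payloadScore;
  }

  @Override
  public float docScore(int numPayloadsSeen, float payloadScore) {
    return numPayloadsSeen > 0 ? payloadScore / numPayloadsSeen : 1f;
  }
}

final class PayloadScoring {
  // The payload values could come from any kind of span match, which is why
  // the function is supplied separately from the thing being scored.
  static float score(float spanScore, float[] payloads, SketchPayloadFunction function) {
    float running = 0f;
    int seen = 0;
    for (float payload : payloads) {
      running = function.currentScore(seen, running, payload);
      seen++;
    }
    // Assumption for this sketch: the payload factor multiplies the span score.
    return spanScore * function.docScore(seen, running);
  }

  public static void main(String[] args) {
    float[] payloads = {2f, 4f};
    System.out.println(score(1f, payloads, new MaxPayload()));
    System.out.println(score(1f, payloads, new AveragePayload()));
  }
}
```

Because the function is a separate argument, the same wrapper works for term, 
near and or-style span matches without a dedicated payload query class per 
span type.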

 Support Payload scoring for SpanOrQuery
 ---

 Key: LUCENE-6706
 URL: https://issues.apache.org/jira/browse/LUCENE-6706
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/query/scoring
Affects Versions: 5.2.1
Reporter: Jamie Johnson
Assignee: Alan Woodward
Priority: Minor
 Attachments: LUCENE-6706.patch, PayloadSpanOrQuery.java


 I need a way to have payloads influence the score of SpanOrQuery's.  






[jira] [Commented] (LUCENE-6706) Support Payload scoring for all SpanQueries

2015-08-04 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653581#comment-14653581
 ] 

Alan Woodward commented on LUCENE-6706:
---

The tests were expecting docids to be stable, but some were being re-ordered by 
the random merge policy.  I set them to use NoMergePolicy instead.

 Support Payload scoring for all SpanQueries
 ---

 Key: LUCENE-6706
 URL: https://issues.apache.org/jira/browse/LUCENE-6706
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/query/scoring
Affects Versions: 5.2.1
Reporter: Jamie Johnson
Assignee: Alan Woodward
Priority: Minor
 Fix For: 5.3

 Attachments: LUCENE-6706.patch, PayloadSpanOrQuery.java


 I need a way to have payloads influence the score of SpanOrQuery's.  





