[jira] [Updated] (LUCENE-6376) Spatial PointVectorStrategy should use DocValues

2015-04-03 Thread Aditya Dhulipala (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Dhulipala updated LUCENE-6376:
-
Attachment: LUCENE-6376.patch

I've added a call to setDocValuesType in the
PointVectorStrategy.createIndexableFields method

> Spatial PointVectorStrategy should use DocValues 
> -
>
> Key: LUCENE-6376
> URL: https://issues.apache.org/jira/browse/LUCENE-6376
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/spatial
>Reporter: David Smiley
> Attachments: LUCENE-6376.patch
>
>
> PointVectorStrategy.createIndexableFields should be using DocValues, like 
> BBoxStrategy does.  Without this, UninvertingReader is required.
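For context (this is not the patch itself): without DocValues, per-document point values have to be reconstructed by "uninverting" the term index, which is what UninvertingReader does at search time. A plain-Java sketch of the two access patterns, using made-up data rather than Lucene's API:

```java
import java.util.*;

public class DocValuesSketch {
    public static void main(String[] args) {
        // Forward "DocValues"-style column: doc id -> x coordinate.
        double[] xByDoc = {1.5, -3.0, 1.5};

        // Inverted index: term (x coordinate) -> posting list of doc ids.
        Map<Double, List<Integer>> inverted = new HashMap<>();
        for (int doc = 0; doc < xByDoc.length; doc++) {
            inverted.computeIfAbsent(xByDoc[doc], k -> new ArrayList<>()).add(doc);
        }

        // "Uninverting": rebuild doc -> value by walking every posting list.
        double[] uninverted = new double[xByDoc.length];
        for (Map.Entry<Double, List<Integer>> e : inverted.entrySet()) {
            for (int doc : e.getValue()) uninverted[doc] = e.getKey();
        }

        // With a DocValues-style column the lookup is a direct read; the
        // rebuild above is the cost UninvertingReader pays instead.
        System.out.println(uninverted[1] == xByDoc[1]);
    }
}
```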



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2887 - Still Failing

2015-04-03 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2887/

4 tests failed.
FAILED:  org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test

Error Message:
IOException occured when talking to server at: 
http://127.0.0.1:64721/xiwg/br/c8n_1x3_commits_shard1_replica3

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: 
http://127.0.0.1:64721/xiwg/br/c8n_1x3_commits_shard1_replica3
at 
__randomizedtesting.SeedInfo.seed([E317CCBEAFC08265:6B43F364013CEF9D]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:570)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:483)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:464)
at 
org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.oneShardTest(LeaderInitiatedRecoveryOnCommitTest.java:132)
at 
org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test(LeaderInitiatedRecoveryOnCommitTest.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  

[jira] [Commented] (LUCENE-5579) Spatial, enhance RPT to differentiate confirmed from non-confirmed hits, then validate with SDV

2015-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395531#comment-14395531
 ] 

David Smiley commented on LUCENE-5579:
--

I fixed a bug (some unfinished code I overlooked) and I finally see the 
performance numbers I've been expecting to see.  With the new approx/exact 
differentiated Intersects predicate, the benchmarked queries were ~83% faster 
compared to without.  YMMV a ton.  These shapes were all geodetic circles, 
which do involve some trig, but I bet a polygon, esp. a non-trivial polygon, 
would see more improvement.  This test used distErrPct=0.2, which yields a tiny 
index & fast indexing but super-approximated shapes (very blocky looking).  By 
using distErrPct=0.1, the relative improvement became 100% (2x), since more 
detail allows more hits to land in the "exact" bucket.  The index grew 93% in 
size, though.  Note that even at 0.1, this index is about 1/4th the size of the 
default RPT configuration.

Now I need to wrap up the TODOs, including testing a bit more.  Maybe re-think 
the name of this thing, although CompositeSpatialStrategy ain't bad.  Perhaps 
this could all go right into SerializedDVStrategy, making the index portion 
being added here optional?  On the other hand... SerializedDVStrategy is but 
one specific way (BinaryDocValues) to retrieve the shape.  Granted, we don't 
have any similar alternative, nor do I plan to come up with one.  Or this code 
could go into RPT, so that you could optionally add the precision of the 
serialized geometry if you so choose.  Hmmm.
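The approx/exact split can be illustrated generically (plain Java with hypothetical shapes, not the patch's code): candidates that fall in a region fully inside the query shape are confirmed hits without any precise geometry work, and only candidates in the "edge" zone pay for the exact predicate. Here the confirmed region is a deliberately crude inscribed diamond inside a circle of radius R:

```java
public class TwoPhaseSketch {
    // Query: a circle of radius R centered at the origin, planar coordinates.
    static final double R = 10.0;

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {9.9, 0.5}, {7, 7}, {20, 0}};
        int exactChecks = 0, hits = 0;
        for (double[] p : points) {
            double ax = Math.abs(p[0]), ay = Math.abs(p[1]);
            if (ax > R || ay > R) continue;          // outside bounding box: cheap reject
            if (ax + ay <= R) { hits++; continue; }  // inside inscribed diamond: cheap confirm
            exactChecks++;                           // edge zone: pay for the exact test
            if (p[0] * p[0] + p[1] * p[1] <= R * R) hits++;
        }
        // Only the edge-zone candidates needed the exact distance computation.
        System.out.println(hits + " " + exactChecks);
    }
}
```

The speedup reported above comes from the same effect: the finer the grid (lower distErrPct), the more hits land in the cheap "confirmed" bucket and skip the exact check.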

> Spatial, enhance RPT to differentiate confirmed from non-confirmed hits, then 
> validate with SDV
> ---
>
> Key: LUCENE-5579
> URL: https://issues.apache.org/jira/browse/LUCENE-5579
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/spatial
>Reporter: David Smiley
> Attachments: LUCENE-5579_CompositeSpatialStrategy.patch, 
> LUCENE-5579_SPT_leaf_covered.patch
>
>
> If a cell is within the query shape (doesn't straddle the edge), then you can 
> be sure that all documents it matches are a confirmed hit. But if some 
> documents are only on the edge cells, then those documents could be validated 
> against SerializedDVStrategy for precise spatial search. This should be 
> *much* faster than using RPT and SerializedDVStrategy independently on the 
> same search, particularly when a lot of documents match.
> Perhaps this'll be a new RPT subclass, or maybe an optional configuration of 
> RPT.  This issue is just for the Intersects predicate, which will apply to 
> Disjoint.  Until resolved in other issues, the other predicates can be 
> handled in a naive/slow way by creating a filter that combines RPT's filter 
> and SerializedDVStrategy's filter using BitsFilteredDocIdSet.
> One thing I'm not sure of is how to expose to Lucene-spatial users the 
> underlying functionality such that they can put other query/filters 
> in-between RPT and the SerializedDVStrategy.  Maybe that'll be done by simply 
> ensuring the predicate filters have this capability and are public.
> It would be ideal to implement this capability _after_ the PrefixTree term 
> encoding is modified to differentiate edge leaf-cells from non-edge leaf 
> cells. This distinction will allow the code here to make more confirmed 
> matches.






[jira] [Commented] (SOLR-7126) Secure loading of runtime external jars

2015-04-03 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395480#comment-14395480
 ] 

Noble Paul commented on SOLR-7126:
--

It would be helpful if you could post a link to a failed Jenkins build.

> Secure loading of runtime external jars
> ---
>
> Key: SOLR-7126
> URL: https://issues.apache.org/jira/browse/SOLR-7126
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: security
> Fix For: Trunk, 5.1
>
> Attachments: SOLR-7126.patch, SOLR-7126.patch, SOLR-7126.patch
>
>
> We need to ensure that the jars loaded into Solr are trusted.
> We shall use simple PKI to protect the jars/config loaded into the system.
> The following are the steps involved in doing that.
> {noformat}
> #Step 1:
> # generate a 768-bit RSA private key, or whatever strength you need
> $ openssl genrsa -out priv_key.pem 768
> # store your private keys safely (with a password if possible)
> # output public key portion in DER format (so that Java can read it)
> $ openssl rsa -in priv_key.pem -pubout -outform DER -out pub_key.der
> #Step 2:
> #Load the .DER files to ZK under /keys/exe
> #Step 3:
> # start all your servers with -Denable.runtime.lib=true 
> #Step 4:
> # sign the sha1 digest of your jar with one of your private keys and get the
> base64 string of that signature.
> $ openssl dgst -sha1 -sign priv_key.pem myjar.jar | openssl enc -base64 
> #Step 5:
> # load your jars into the blob store. Refer to SOLR-6787
> #Step 6:
> # use the command to add your jar to classpath as follows
> {noformat}
> {code}
> curl http://localhost:8983/solr/collection1/config -H 
> 'Content-type:application/json'  -d '{
> "add-runtimelib" : {"name": "jarname" , "version":2 , 
> "sig":"mW1Gwtz2QazjfVdrLFHfbGwcr8xzFYgUOLu68LHqWRDvLG0uLcy1McQ+AzVmeZFBf1yLPDEHBWJb5KXr8bdbHN/PYgUB1nsr9pk4EFyD9KfJ8TqeH/ijQ9waa/vjqyiKEI9U550EtSzruLVZ32wJ7smvV0fj2YYhrUaaPzOn9g0="
>  }// output of step 4. concatenate the lines 
> }' 
> {code}
> sig is the extra parameter: the base64-encoded value of the jar's sha1 
> signature.
> If no keys are present, the jar is loaded without any checking.
> Before loading a jar from the blob store, each Solr node checks whether 
> keys are present in the keys directory. If so, each jar's signature is 
> verified against all the available public keys. If at least one succeeds, 
> the jar is loaded into memory. If none succeeds, it is rejected.
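The verify-against-any-key step described above can be sketched with only the JDK's java.security classes. This is an illustration, not Solr's implementation: the key pair is generated in-process rather than loaded from ZooKeeper, and DER parsing of the .der public key is omitted. `openssl dgst -sha1 -sign` produces an RSA signature over the SHA-1 digest, which corresponds to the JDK's "SHA1withRSA" algorithm (it digests internally):

```java
import java.security.*;
import java.util.Base64;

public class JarSigSketch {
    public static void main(String[] args) throws Exception {
        // Stand-in for the jar bytes; in Solr this would be the blob content.
        byte[] jar = "fake jar bytes".getBytes("UTF-8");

        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);  // the openssl example used 768; 2048 is safer
        KeyPair pair = gen.generateKeyPair();

        // Signing side: equivalent of `openssl dgst -sha1 -sign ... | openssl enc -base64`.
        Signature signer = Signature.getInstance("SHA1withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(jar);
        String sigBase64 = Base64.getEncoder().encodeToString(signer.sign());

        // Verifying side: try each available public key; accept if any succeeds.
        Signature verifier = Signature.getInstance("SHA1withRSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(jar);
        boolean ok = verifier.verify(Base64.getDecoder().decode(sigBase64));
        System.out.println(ok);
    }
}
```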






[jira] [Resolved] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-6271.
-
Resolution: Fixed

> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>Assignee: Robert Muir
> Fix For: 5.1
>
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch
>
>
> When asking for flags like OFFSETS or PAYLOADS with DocsAndPositionsEnum, the 
> behavior was to always return an enum, even if offsets or payloads were not 
> indexed.  They would just not be available from the enum if they were not 
> present.  This behavior was carried over to PostingsEnum, which is good.
> However, the new POSITIONS flag has different behavior.  If positions are not 
> available, null is returned, instead of a PostingsEnum that just gives access 
> to freqs.  This behavior is confusing, as it means you have to special case 
> asking for positions (only ask if you know they were indexed) which sort of 
> defeats the purpose of the unified PostingsEnum.
> We should make POSITIONS have the same behavior as other flags. The trickiest 
> part will be maintaining backcompat for DocsAndPositionsEnum in 5.x, but I 
> think it can be done.
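The proposed fix amounts to the null-object pattern: return an enum that still serves whatever was indexed, instead of returning null when one capability is missing. A generic plain-Java sketch of that design (the interface and names here are made up for illustration, not Lucene's actual PostingsEnum API):

```java
interface Postings {
    int freq();
    int nextPosition();  // -1 when positions were not indexed or are exhausted
}

public class NullObjectSketch {
    // Instead of returning null when positions are unavailable, return an
    // object that still serves freqs; callers need no special case.
    static Postings postings(boolean hasPositions) {
        int[] positions = {4, 9};
        return new Postings() {
            int i = 0;
            public int freq() { return 2; }
            public int nextPosition() {
                return hasPositions && i < positions.length ? positions[i++] : -1;
            }
        };
    }

    public static void main(String[] args) {
        Postings p = postings(false);  // positions not indexed
        System.out.println(p.freq() + " " + p.nextPosition());
    }
}
```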






[jira] [Commented] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395459#comment-14395459
 ] 

ASF subversion and git services commented on LUCENE-6271:
-

Commit 1671239 from [~rcmuir] in branch 'dev/branches/lucene_solr_5_1'
[ https://svn.apache.org/r1671239 ]

LUCENE-6271: PostingsEnum should have consistent flags behavior

> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>Assignee: Robert Muir
> Fix For: 5.1
>
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch
>
>
> When asking for flags like OFFSETS or PAYLOADS with DocsAndPositionsEnum, the 
> behavior was to always return an enum, even if offsets or payloads were not 
> indexed.  They would just not be available from the enum if they were not 
> present.  This behavior was carried over to PostingsEnum, which is good.
> However, the new POSITIONS flag has different behavior.  If positions are not 
> available, null is returned, instead of a PostingsEnum that just gives access 
> to freqs.  This behavior is confusing, as it means you have to special case 
> asking for positions (only ask if you know they were indexed) which sort of 
> defeats the purpose of the unified PostingsEnum.
> We should make POSITIONS have the same behavior as other flags. The trickiest 
> part will be maintaining backcompat for DocsAndPositionsEnum in 5.x, but I 
> think it can be done.






[jira] [Resolved] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved SOLR-7290.
--
Resolution: Fixed
  Assignee: Steve Rowe

Committed the renaming patch to trunk, branch_5x and lucene_solr_5_1.

Note that this leaves the copyField behavior intact.

I'll resolve this issue - if people want to continue discussing the copyField 
behavior, please create another issue.

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
>Assignee: Steve Rowe
> Fix For: Trunk, 5.1, 5.2
>
> Attachments: SOLR-7290.patch, SOLR-7290.patch
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[jira] [Updated] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-7290:
-
Fix Version/s: 5.2
   Trunk

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: Trunk, 5.1, 5.2
>
> Attachments: SOLR-7290.patch, SOLR-7290.patch
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[jira] [Commented] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395430#comment-14395430
 ] 

ASF subversion and git services commented on SOLR-7290:
---

Commit 1671236 from [~steve_rowe] in branch 'dev/branches/lucene_solr_5_1'
[ https://svn.apache.org/r1671236 ]

SOLR-7290: Rename catchall _text field in data_driven_schema_configs to _text_ 
(merged trunk r1671234)

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
> Attachments: SOLR-7290.patch, SOLR-7290.patch
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[jira] [Commented] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395429#comment-14395429
 ] 

ASF subversion and git services commented on SOLR-7290:
---

Commit 1671235 from [~steve_rowe] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671235 ]

SOLR-7290: Rename catchall _text field in data_driven_schema_configs to _text_ 
(merged trunk r1671234)

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
> Attachments: SOLR-7290.patch, SOLR-7290.patch
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[jira] [Commented] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395422#comment-14395422
 ] 

ASF subversion and git services commented on SOLR-7290:
---

Commit 1671234 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1671234 ]

SOLR-7290: Rename catchall _text field in data_driven_schema_configs to _text_

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
> Attachments: SOLR-7290.patch, SOLR-7290.patch
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[jira] [Commented] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395418#comment-14395418
 ] 

Steve Rowe commented on SOLR-7290:
--

Tests passed.

Committing shortly.

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
> Attachments: SOLR-7290.patch, SOLR-7290.patch
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2886 - Still Failing

2015-04-03 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2886/

1 tests failed.
REGRESSION:  
org.apache.lucene.analysis.uima.UIMABaseAnalyzerTest.testRandomStringsWithConfigurationParameters

Error Message:
some thread(s) failed

Stack Trace:
java.lang.RuntimeException: some thread(s) failed
at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:531)
at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:428)
at 
org.apache.lucene.analysis.uima.UIMABaseAnalyzerTest.testRandomStringsWithConfigurationParameters(UIMABaseAnalyzerTest.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 3939 lines...]
   [junit4] Suite: org.apache.lucene.analysis.uima.UIMABaseAnalyzerTest
   [junit4]   2> Abr 03, 2015 8:02:12 PM WhitespaceTokenizer initialize
   [junit4]   2> INFO: "Whitespace tokenizer successfully initialized"
   [junit4]   2> Abr 03, 2015 8:02:12 PM WhitespaceTokenizer typeSystemInit
   [junit4]   2> INFO: "Whitespace tokenizer typesystem initialized"
   [junit4]   2> Abr 03, 2015 8:02:12 PM WhitespaceTokenizer process
   [junit4]   2> INFO: "Whitespace tokenizer starts processing"
   [junit4]   2> Abr 03, 2015 8:02:12 PM WhitespaceTokenizer process
   [junit4]  

[jira] [Commented] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395403#comment-14395403
 ] 

ASF subversion and git services commented on LUCENE-6271:
-

Commit 1671228 from [~rcmuir] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671228 ]

LUCENE-6271: PostingsEnum should have consistent flags behavior

> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>Assignee: Robert Muir
> Fix For: 5.1
>
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch
>
>
> When asking for flags like OFFSETS or PAYLOADS with DocsAndPositionsEnum, the 
> behavior was to always return an enum, even if offsets or payloads were not 
> indexed.  They would just not be available from the enum if they were not 
> present.  This behavior was carried over to PostingsEnum, which is good.
> However, the new POSITIONS flag has different behavior.  If positions are not 
> available, null is returned, instead of a PostingsEnum that just gives access 
> to freqs.  This behavior is confusing, as it means you have to special case 
> asking for positions (only ask if you know they were indexed) which sort of 
> defeats the purpose of the unified PostingsEnum.
> We should make POSITIONS have the same behavior as other flags. The trickiest 
> part will be maintaining backcompat for DocsAndPositionsEnum in 5.x, but I 
> think it can be done.






[jira] [Updated] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-7290:
-
Attachment: SOLR-7290.patch

Added note to "Upgrading from 5.0" section in {{CHANGES.txt}}

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
> Attachments: SOLR-7290.patch, SOLR-7290.patch
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[jira] [Updated] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-7290:
-
Attachment: SOLR-7290.patch

Patch.

I'm running tests now.

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
> Attachments: SOLR-7290.patch
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[jira] [Commented] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395280#comment-14395280
 ] 

Erick Erickson commented on SOLR-7290:
--

+1 for changing this in 5.1



> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
>
> schemaless configs should remove copyField to _text
> or at least change _text name.
> http://markmail.org/message/v6djadk5azx6k4gv
> This default led to bad indexing performance.






[jira] [Commented] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395279#comment-14395279
 ] 

Yonik Seeley commented on SOLR-7290:


bq. +1 for _text_ - any objections to changing this in 5.1? If we're going to 
change it, I think we should do it sooner rather than later.

+1

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
>






Solr cloud-dev scripts

2015-04-03 Thread Ramkumar R. Aiyengar
I started looking at porting cloud-dev scripts to the new startup scripts
after the discussion at SOLR-7240, but wasn't quite sure of what the
behaviour should be, having never used them myself. Some of the scripts
there have syntax errors, and I am not sure if some of the others are doing
what was intended even on branch_5x where Jetty 8 is still used. I have a
feeling many of them assume that the stock start.jar starts with a single
"collection1" core because of how the solr home used to be set up before,
which is no longer true.

So how do people use these scripts? Which scripts are used, and for what
purpose?


[jira] [Commented] (SOLR-7290) Change schemaless _text and copyField

2015-04-03 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395272#comment-14395272
 ] 

Steve Rowe commented on SOLR-7290:
--

+1 for {{\_text\_}} - any objections to changing this in 5.1?  If we're going 
to change it, I think we should do it sooner rather than later.

> Change schemaless _text and copyField
> -
>
> Key: SOLR-7290
> URL: https://issues.apache.org/jira/browse/SOLR-7290
> Project: Solr
>  Issue Type: Bug
>Reporter: Mike Murphy
> Fix For: 5.1
>
>






[jira] [Commented] (SOLR-7338) A reloaded core will never register itself as active after a ZK session expiration

2015-04-03 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395200#comment-14395200
 ] 

Mark Miller commented on SOLR-7338:
---

bq. I can combine them and commit.

Go ahead. I think that's the right current fix and it also addresses SOLR-6583.

> A reloaded core will never register itself as active after a ZK session 
> expiration
> --
>
> Key: SOLR-7338
> URL: https://issues.apache.org/jira/browse/SOLR-7338
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Timothy Potter
>Assignee: Mark Miller
> Attachments: SOLR-7338.patch, SOLR-7338_test.patch
>
>
> If a collection gets reloaded, then a core's isReloaded flag is always true. 
> If a core experiences a ZK session expiration after a reload, then it won't 
> ever be able to set itself to active because of the check in 
> {{ZkController#register}}:
> {code}
> UpdateLog ulog = core.getUpdateHandler().getUpdateLog();
> if (!core.isReloaded() && ulog != null) {
>   // disable recovery in case shard is in construction state (for 
> shard splits)
>   Slice slice = getClusterState().getSlice(collection, shardId);
>   if (slice.getState() != Slice.State.CONSTRUCTION || !isLeader) {
> Future recoveryFuture = 
> core.getUpdateHandler().getUpdateLog().recoverFromLog();
> if (recoveryFuture != null) {
>   log.info("Replaying tlog for " + ourUrl + " during startup... 
> NOTE: This can take a while.");
>   recoveryFuture.get(); // NOTE: this could potentially block for
>   // minutes or more!
>   // TODO: public as recovering in the mean time?
>   // TODO: in the future we could do peersync in parallel with 
> recoverFromLog
> } else {
>   log.info("No LogReplay needed for core=" + core.getName() + " 
> baseURL=" + baseUrl);
> }
>   }
>   boolean didRecovery = checkRecovery(coreName, desc, 
> recoverReloadedCores, isLeader, cloudDesc,
>   collection, coreZkNodeName, shardId, leaderProps, core, cc);
>   if (!didRecovery) {
> publish(desc, ZkStateReader.ACTIVE);
>   }
> }
> {code}
> I can easily simulate this on trunk by doing:
> {code}
> bin/solr -c -z localhost:2181
> bin/solr create -c foo
> bin/post -c foo example/exampledocs/*.xml
> curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=foo";
> kill -STOP  && sleep  && kill -CONT 
> {code}
> Where  is the process ID of the Solr node. Here are the logs after the 
> CONT command. As you can see below, the core never gets to setting itself as 
> active again. I think the bug is that the isReloaded flag needs to get set 
> back to false once the reload is successful, but I don't understand what this 
> flag is needed for anyway???
> {code}
> INFO  - 2015-04-01 17:28:50.962; 
> org.apache.solr.common.cloud.ConnectionManager; Watcher 
> org.apache.solr.common.cloud.ConnectionManager@5519dba0 
> name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent 
> state:Disconnected type:None path:null path:null type:None
> INFO  - 2015-04-01 17:28:50.963; 
> org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
> INFO  - 2015-04-01 17:28:51.107; 
> org.apache.solr.common.cloud.ConnectionManager; Watcher 
> org.apache.solr.common.cloud.ConnectionManager@5519dba0 
> name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent 
> state:Expired type:None path:null path:null type:None
> INFO  - 2015-04-01 17:28:51.107; 
> org.apache.solr.common.cloud.ConnectionManager; Our previous ZooKeeper 
> session was expired. Attempting to reconnect to recover relationship with 
> ZooKeeper...
> INFO  - 2015-04-01 17:28:51.108; org.apache.solr.cloud.Overseer; Overseer 
> (id=93579450724974592-192.168.1.2:8983_solr-n_00) closing
> INFO  - 2015-04-01 17:28:51.108; 
> org.apache.solr.cloud.ZkController$WatcherImpl; A node got unwatched for 
> /configs/foo
> INFO  - 2015-04-01 17:28:51.108; 
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Overseer Loop exiting : 
> 192.168.1.2:8983_solr
> INFO  - 2015-04-01 17:28:51.109; 
> org.apache.solr.cloud.OverseerCollectionProcessor; According to ZK I 
> (id=93579450724974592-192.168.1.2:8983_solr-n_00) am no longer a 
> leader.
> INFO  - 2015-04-01 17:28:51.108; org.apache.solr.cloud.ZkController$4; 
> Running listeners for /configs/foo
> INFO  - 2015-04-01 17:28:51.109; 
> org.apache.solr.common.cloud.DefaultConnectionStrategy; Connection expired - 
> starting a new one...
> INFO  - 2015-04-01 17:28:51.109; org.apache.solr.core.SolrCore$11; config 
> update listener called for core foo_shard1_replica1
> I
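The sticky-flag problem described in the issue can be modeled in isolation. The sketch below is a stand-alone illustration, not Solr code; the class and method names are hypothetical, and it only mirrors the shape of the guard quoted from {{ZkController#register}}: once a core has ever been reloaded, the branch that publishes ACTIVE is skipped forever.

```java
// Stand-alone model of the sticky isReloaded flag (hypothetical names,
// not Solr code). Nothing ever clears the flag after a reload.
public class StickyReloadFlag {
    private boolean reloaded = false;
    String state = "down";

    // A reload marks the core as reloaded -- permanently.
    void reload() { reloaded = true; }

    // Mirrors the shape of the guard in ZkController#register:
    // the ACTIVE publish only happens for never-reloaded cores.
    void register() {
        if (!reloaded) {
            state = "active";
        }
    }

    public static void main(String[] args) {
        StickyReloadFlag core = new StickyReloadFlag();
        core.register();
        System.out.println(core.state);  // active

        core.reload();
        core.state = "down";             // simulate a ZK session expiration
        core.register();                 // guard skips the publish
        System.out.println(core.state);  // still down -- stuck
    }
}
```

This matches the behavior in the logs above: after the reload plus session expiration, registration runs but the core never returns to active, which is why resetting the flag after a successful reload is the suggested direction.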

Re: [JENKINS] Lucene-Solr-NightlyTests-5.x - Build # 806 - Still Failing

2015-04-03 Thread Areek Zillur
I committed a fix for this.
-Areek

On Fri, Apr 3, 2015 at 11:52 AM, Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-5.x/806/
>
> 1 tests failed.
> REGRESSION:
> org.apache.lucene.search.suggest.document.SuggestFieldTest.testDupSuggestFieldValues
>
> Error Message:
> MockDirectoryWrapper: cannot close: there are still open files:
> {_ak_completion_0.lkp=1, _aj_completion_0.lkp=1, _ak_completion_0.tim=1,
> _ak_completion_0.doc=1, _ak.nvd=1, _aj_completion_0.pay=1,
> _aj_completion_0.pos=1, _ak_completion_0.pos=1, _ak.fdt=1, _aj.nvd=1,
> _aj_completion_0.tim=1, _aj_completion_0.doc=1, _ak_completion_0.pay=1,
> _aj.fdt=1}
>
> Stack Trace:
> java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are
> still open files: {_ak_completion_0.lkp=1, _aj_completion_0.lkp=1,
> _ak_completion_0.tim=1, _ak_completion_0.doc=1, _ak.nvd=1,
> _aj_completion_0.pay=1, _aj_completion_0.pos=1, _ak_completion_0.pos=1,
> _ak.fdt=1, _aj.nvd=1, _aj_completion_0.tim=1, _aj_completion_0.doc=1,
> _ak_completion_0.pay=1, _aj.fdt=1}
> at
> org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:747)
> at
> org.apache.lucene.search.suggest.document.SuggestFieldTest.after(SuggestFieldTest.java:80)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:894)
> at
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> at
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
> at
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
> at
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
> at
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
> at
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
> at
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: unclosed IndexInput:
> _aj_completion_0.p

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0_40) - Build # 12193 - Failure!

2015-04-03 Thread Areek Zillur
I committed a fix for this

On Fri, Apr 3, 2015 at 3:55 AM, Areek Zillur  wrote:

> I am looking into this failure.
> -Areek
>
> On Fri, Apr 3, 2015 at 12:59 AM, Policeman Jenkins Server <
> jenk...@thetaphi.de> wrote:
>
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/12193/
>> Java: 32bit/jdk1.8.0_40 -client -XX:+UseParallelGC
>>
>> 1 tests failed.
>> FAILED:
>> org.apache.lucene.search.suggest.document.SuggestFieldTest.testDupSuggestFieldValues
>>
>> Error Message:
>> MockDirectoryWrapper: cannot close: there are still open files:
>> {_yp.cfs=1, _yo_completion_0.pos=1, _yo_completion_0.pay=1,
>> _yo_completion_0.tim=1, _yo_completion_0.lkp=1, _yq.cfs=1, _yo.fdt=1,
>> _yo_completion_0.doc=1, _yo.nvd=1}
>>
>> Stack Trace:
>> java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are
>> still open files: {_yp.cfs=1, _yo_completion_0.pos=1,
>> _yo_completion_0.pay=1, _yo_completion_0.tim=1, _yo_completion_0.lkp=1,
>> _yq.cfs=1, _yo.fdt=1, _yo_completion_0.doc=1, _yo.nvd=1}
>> at
>> org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:747)
>> at
>> org.apache.lucene.search.suggest.document.SuggestFieldTest.after(SuggestFieldTest.java:81)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:497)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:894)
>> at
>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>> at
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at
>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>> at
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>> at
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
>> at
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>> at
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
>> at
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>> at
>> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.RuntimeException: unclosed IndexInput:
>> _yo_completion_0.pos
>> at
>>

[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395194#comment-14395194
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671196 from [~areek] in branch 'dev/trunk'
[ https://svn.apache.org/r1671196 ]

LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)

> [suggest] Near real time Document Suggester
> ---
>
> Key: LUCENE-6339
> URL: https://issues.apache.org/jira/browse/LUCENE-6339
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Affects Versions: 5.0
>Reporter: Areek Zillur
>Assignee: Areek Zillur
> Fix For: Trunk, 5.1
>
> Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
> LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch
>
>
> The idea is to index documents with one or more *SuggestField*(s) and be able 
> to suggest documents with a *SuggestField* value that matches a given key.
> A SuggestField can be assigned a numeric weight to be used to score the 
> suggestion at query time.
> Document suggestion can be done on an indexed *SuggestField*. The document 
> suggester can filter out deleted documents in near real-time. The suggester 
> can filter out documents based on a Filter (note: may change to a non-scoring 
> query?) at query time.
> A custom postings format (CompletionPostingsFormat) is used to index 
> SuggestField(s) and perform document suggestions.
> h4. Usage
> {code:java}
>   // hook up custom postings format
>   // indexAnalyzer for SuggestField
>   Analyzer analyzer = ...
>   IndexWriterConfig config = new IndexWriterConfig(analyzer);
>   Codec codec = new Lucene50Codec() {
> PostingsFormat completionPostingsFormat = new 
> Completion50PostingsFormat();
> @Override
> public PostingsFormat getPostingsFormatForField(String field) {
>   if (isSuggestField(field)) {
> return completionPostingsFormat;
>   }
>   return super.getPostingsFormatForField(field);
> }
>   };
>   config.setCodec(codec);
>   IndexWriter writer = new IndexWriter(dir, config);
>   // index some documents with suggestions
>   Document doc = new Document();
>   doc.add(new SuggestField("suggest_title", "title1", 2));
>   doc.add(new SuggestField("suggest_name", "name1", 3));
>   writer.addDocument(doc)
>   ...
>   // open an nrt reader for the directory
>   DirectoryReader reader = DirectoryReader.open(writer, false);
>   // SuggestIndexSearcher is a thin wrapper over IndexSearcher
>   // queryAnalyzer will be used to analyze the query string
>   SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader, 
> queryAnalyzer);
>   
>   // suggest 10 documents for "titl" on "suggest_title" field
>   TopSuggestDocs suggest = indexSearcher.suggest("suggest_title", "titl", 10);
> {code}
> h4. Indexing
> Index analyzer set through *IndexWriterConfig*
> {code:java}
> SuggestField(String name, String value, long weight) 
> {code}
> h4. Query
> Query analyzer set through *SuggestIndexSearcher*.
> Hits are collected in descending order of the suggestion's weight 
> {code:java}
> // full options for TopSuggestDocs (TopDocs)
> TopSuggestDocs suggest(String field, CharSequence key, int num, Filter filter)
> // full options for Collector
> // note: only collects does not score
> void suggest(String field, CharSequence key, int num, Filter filter, 
> TopSuggestDocsCollector collector) 
> {code}
> h4. Analyzer
> *CompletionAnalyzer* can be used instead to wrap another analyzer to tune 
> suggest field only parameters. 
> {code:java}
> CompletionAnalyzer(Analyzer analyzer, boolean preserveSep, boolean 
> preservePositionIncrements, int maxGraphExpansions)
> {code}






[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395176#comment-14395176
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671189 from [~areek] in branch 'dev/branches/lucene_solr_5_1'
[ https://svn.apache.org/r1671189 ]

LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)

> [suggest] Near real time Document Suggester
> ---
>
> Key: LUCENE-6339
> URL: https://issues.apache.org/jira/browse/LUCENE-6339
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Affects Versions: 5.0
>Reporter: Areek Zillur
>Assignee: Areek Zillur
> Fix For: Trunk, 5.1
>
> Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
> LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch
>
>






[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395174#comment-14395174
 ] 

ASF subversion and git services commented on LUCENE-6339:
-

Commit 1671187 from [~areek] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671187 ]

LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)

> [suggest] Near real time Document Suggester
> ---
>
> Key: LUCENE-6339
> URL: https://issues.apache.org/jira/browse/LUCENE-6339
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Affects Versions: 5.0
>Reporter: Areek Zillur
>Assignee: Areek Zillur
> Fix For: Trunk, 5.1
>
> Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, 
> LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch
>
>






[jira] [Commented] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395118#comment-14395118
 ] 

Robert Muir commented on LUCENE-6271:
-

for the backport, I will add a deprecated flag to simulate the null behavior of 
before. 

This way the Docs/DocsAndPositionsEnum have the old semantics.

TestLegacyPostings tests will move to BasePostingsFormatTestCase and 
BaseTermVectorsFormatTestCase so all codecs (especially backwards-codecs/) are 
explicitly tested with both the old Docs/DocsAndPositions and PostingsEnum api.

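The uniform contract the issue asks for can be sketched with a stand-alone model (hypothetical names, not the Lucene API): requesting a feature that was not indexed should degrade to an enum without that feature, never to null.

```java
import java.util.EnumSet;

// Stand-alone model (not Lucene code) of the uniform flags contract:
// asking for a feature that was not indexed still yields an enum,
// just without that feature -- never null.
public class FlagsModel {
    enum Feature { FREQS, POSITIONS, OFFSETS, PAYLOADS }

    // Pretend only frequencies were indexed for this field.
    static final EnumSet<Feature> INDEXED = EnumSet.of(Feature.FREQS);

    // Returns whatever subset of the requested features is available.
    static EnumSet<Feature> postings(EnumSet<Feature> requested) {
        EnumSet<Feature> available = EnumSet.copyOf(requested);
        available.retainAll(INDEXED);
        return available; // POSITIONS silently absent, but never null
    }

    public static void main(String[] args) {
        // No special-casing needed: the caller always gets an enum back.
        System.out.println(postings(EnumSet.of(Feature.FREQS, Feature.POSITIONS)));
    }
}
```

Under this contract callers never need to know in advance whether positions were indexed, which is the point of the unified PostingsEnum.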
> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>Assignee: Robert Muir
> Fix For: 5.1
>
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch
>
>
> When asking for flags like OFFSETS or PAYLOADS with DocsAndPositionsEnum, the 
> behavior was to always return an enum, even if offsets or payloads were not 
> indexed.  They would just not be available from the enum if they were not 
> present.  This behavior was carried over to PostingsEnum, which is good.
> However, the new POSITIONS flag has different behavior.  If positions are not 
> available, null is returned, instead of a PostingsEnum that just gives access 
> to freqs.  This behavior is confusing, as it means you have to special case 
> asking for positions (only ask if you know they were indexed) which sort of 
> defeats the purpose of the unified PostingsEnum.
> We should make POSITIONS have the same behavior as other flags. The trickiest 
> part will be maintaining backcompat for DocsAndPositionsEnum in 5.x, but I 
> think it can be done.






[jira] [Commented] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395108#comment-14395108
 ] 

ASF subversion and git services commented on LUCENE-6271:
-

Commit 1671163 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1671163 ]

LUCENE-6271: PostingsEnum should have consistent flags behavior

> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>Assignee: Robert Muir
> Fix For: 5.1
>
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch






[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2885 - Still Failing

2015-04-03 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2885/

3 tests failed.
FAILED:  org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test

Error Message:
IOException occured when talking to server at: 
http://127.0.0.1:16808/c8n_1x3_commits_shard1_replica1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: http://127.0.0.1:16808/c8n_1x3_commits_shard1_replica1
at 
__randomizedtesting.SeedInfo.seed([8632F28962AA5D4B:E66CD53CC5630B3]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:570)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:483)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:464)
at 
org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.oneShardTest(LeaderInitiatedRecoveryOnCommitTest.java:132)
at 
org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test(LeaderInitiatedRecoveryOnCommitTest.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apac

[jira] [Commented] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395077#comment-14395077
 ] 

Robert Muir commented on LUCENE-6271:
-

As Ryan mentioned on the mailing list, this one needs to be in 5.1, or we can 
never fix it without a tricky semantics-only change.

I will help get this in; I've been hammering tests at it for the last few days 
and I'm satisfied there. We added lots of tests that run across all codecs 
(including older ones), so we know there aren't sneaky bugs. Basically, this is 
what we are saving our users from.

Please just give me a few hours for each branch.

> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>Assignee: Robert Muir
> Fix For: 5.1
>
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch






[jira] [Assigned] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reassigned LUCENE-6271:
---

Assignee: Robert Muir

> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>Assignee: Robert Muir
> Fix For: 5.1
>
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch






[jira] [Updated] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-6271:

Fix Version/s: 5.1

> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
> Fix For: 5.1
>
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch






[jira] [Commented] (LUCENE-6271) PostingsEnum should have consistent flags behavior

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395069#comment-14395069
 ] 

ASF subversion and git services commented on LUCENE-6271:
-

Commit 1671160 from [~rcmuir] in branch 'dev/branches/lucene6271'
[ https://svn.apache.org/r1671160 ]

LUCENE-6271: add vectors tests for postings enum api

> PostingsEnum should have consistent flags behavior
> --
>
> Key: LUCENE-6271
> URL: https://issues.apache.org/jira/browse/LUCENE-6271
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
> Attachments: LUCENE-6271.patch, LUCENE-6271.patch






Re: Test infrastructure: is a running counter of tests completed possible?

2015-04-03 Thread Dawid Weiss
> When there are hundreds of test suites to do, it's extremely difficult
> to know how many have been completed and how many are left.  Knowing that
> information at a glance would be extremely helpful in the time
> management arena.

I get your point. The exact notion of "time" left is somewhat tricky
because the time each test takes depends on the seed (and is in
general random), but I think some notion of "progress" would be
helpful. I filed this ticket to track this. Once I add it I'll move it
to Lucene/Solr too.

https://github.com/carrotsearch/randomizedtesting/issues/188

Dawid
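
The kind of "total progress" indicator being discussed could look roughly like this (a hypothetical sketch, not the randomizedtesting implementation): a shared counter bumped as each suite finishes, rendered as completed/total.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a running test-suite counter: the total is known up
// front, and any worker's event listener increments the shared counter as a
// suite completes.
public class ProgressDemo {
    private final int totalSuites;
    private final AtomicInteger completed = new AtomicInteger();

    public ProgressDemo(int totalSuites) {
        this.totalSuites = totalSuites;
    }

    /** Called when a suite completes; returns the progress line to print. */
    public String suiteCompleted(String suiteName) {
        int done = completed.incrementAndGet();
        return String.format("Completed [%d/%d]: %s", done, totalSuites, suiteName);
    }

    public static void main(String[] args) {
        ProgressDemo p = new ProgressDemo(3);
        System.out.println(p.suiteCompleted("TestFoo")); // Completed [1/3]: TestFoo
        System.out.println(p.suiteCompleted("TestBar")); // Completed [2/3]: TestBar
    }
}
```

An AtomicInteger keeps the count correct even when suite-completion events arrive concurrently from several forked JVMs.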




[jira] [Commented] (SOLR-4656) Add hl.maxMultiValuedToExamine to limit the number of multiValued entries examined while highlighting

2015-04-03 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395034#comment-14395034
 ] 

Erick Erickson commented on SOLR-4656:
--

[~dsmiley] You're right on both counts. The intent of maxMultiValuedToMatch is 
indeed that it should stop after matching N _fragments_, so the name is 
unfortunate. It should trip if it was set to, say, 3 and a single MV entry had 
3 snippets. Maybe maxSnippetsToMatch? Deprecate and use new terms IMO, but up 
to you.

Right, if there is no snippet it shouldn't be decremented.

Good catch!

Erick

> Add hl.maxMultiValuedToExamine to limit the number of multiValued entries 
> examined while highlighting
> -
>
> Key: SOLR-4656
> URL: https://issues.apache.org/jira/browse/SOLR-4656
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Affects Versions: 4.3, Trunk
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.3, Trunk
>
> Attachments: SOLR-4656-4x.patch, SOLR-4656-4x.patch, 
> SOLR-4656-trunk.patch, SOLR-4656.patch
>
>
> I'm looking at an admittedly pathological case of many, many entries in a 
> multiValued field, and trying to implement a way to limit the number 
> examined, analogous to maxAnalyzedChars, see the patch.
> Along the way, I noticed that we do what looks like unnecessary copying of 
> the fields to be examined. We call Document.getFields, which copies all of 
> the fields and values to the returned array. Then we copy all of those to 
> another array, converting them to Strings. Then we actually examine them. a> 
> this doesn't seem very efficient and b> reduces the benefit from limiting the 
> number of mv values examined.
> So the attached does two things:
> 1> attempts to fix this
> 2> implements hl.maxMultiValuedToExamine
> I'd _really_ love it if someone who knows the highlighting code takes a peek 
> at the fix to see if I've messed things up, the changes are actually pretty 
> minimal.
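
The cap described in <2> can be sketched as follows (hypothetical names, not the actual Solr highlighter code): iteration over a field's values stops once the limit is reached, instead of examining every entry of a huge multiValued field.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of hl.maxMultiValuedToExamine: bail out of the value
// loop once the cap is hit, so a pathologically large multiValued field does
// not force every entry through the highlighter.
public class MvCapDemo {
    /** Returns how many values were actually examined, honoring the cap. */
    public static int examine(List<String> values, int maxToExamine) {
        int examined = 0;
        for (String v : values) {
            if (examined >= maxToExamine) {
                break; // cap reached: skip the (possibly huge) remainder
            }
            examined++;
            // ... highlighting of v would happen here ...
        }
        return examined;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(examine(values, 3));  // 3
        System.out.println(examine(values, 10)); // 5
    }
}
```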






Re: Test infrastructure: is a running counter of tests completed possible?

2015-04-03 Thread Shawn Heisey
On 4/3/2015 1:28 PM, Dawid Weiss wrote:
> I think this may be confusing as it seems to imply a given suite
> number is somehow connected to a given forked JVM. This isn't the
> case, there is typically dynamic job stealing involved. You're
> probably looking for some sort of "total" progress indicator, aren't
> you?

That's exactly what I'd like.  The precise formatting/wording of the
output probably needs adjustment, that was just the first thing that
came to mind.

When there are hundreds of test suites to do, it's extremely difficult
to know how many have been completed and how many are left.  Knowing that
information at a glance would be extremely helpful in the time
management arena.

Thanks,
Shawn





[jira] [Updated] (SOLR-7333) Make the poll queue time configurable and use knowledge that a batch is being processed to poll efficiently

2015-04-03 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-7333:
-
Attachment: SOLR-7333.patch

Thanks for the suggestion, Mark! Updated patch with a unit test added and the 
ability to set the poll time using a Java system property; the default is 25 
ms. I think this one is ready to go.

> Make the poll queue time configurable and use knowledge that a batch is being 
> processed to poll efficiently
> ---
>
> Key: SOLR-7333
> URL: https://issues.apache.org/jira/browse/SOLR-7333
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Timothy Potter
>Assignee: Timothy Potter
> Attachments: SOLR-7333.patch, SOLR-7333.patch
>
>
> {{StreamingSolrClients}} uses {{ConcurrentUpdateSolrServer}} to stream 
> documents from leader to replica, by default it sets the {{pollQueueTime}} 
> for CUSS to 0 so that we don't impose an unnecessary wait when processing 
> single document updates or the last doc in a batch. However, the downside is 
> that replicas receive many more update requests than leaders; I've seen up to 
> 40x number of update requests between replica and leader.
> If we're processing a batch of docs, then ideally the poll queue time should 
> be greater than 0 up until the last doc is pulled off the queue. If we're 
> processing a single doc, then the poll queue time should always be 0 as we 
> don't want the thread to wait unnecessarily for another doc that won't come.
> Rather than force indexing applications to provide this optional parameter in 
> an update request, it would be better for server-side code that can detect 
> whether an update request is a single document or batch of documents to 
> override this value internally, i.e. it'll be 0 by default, but since 
> {{JavaBinUpdateRequestCodec}} can determine when it's seen the last doc in a 
> batch, it can override the pollQueueTime to something greater than 0.
> This means that current indexing clients will see a boost when doing batch 
> updates without making any changes on their side.
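
The polling strategy described above can be sketched as follows (hypothetical code, not ConcurrentUpdateSolrServer itself): use a positive poll timeout while a batch is known to be in flight, and a zero timeout otherwise, so single-document updates never wait.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the pollQueueTime idea: while a batch is in flight,
// wait briefly for the next doc so several docs go out in one request; for
// single docs (or after the last doc), poll with a zero timeout.
public class PollDemo {
    /** Drains the queue, returning how many docs were "streamed". */
    public static int drain(BlockingQueue<String> queue,
                            boolean batchInFlight,
                            long pollQueueTimeMillis) {
        int sent = 0;
        while (true) {
            // 0 ms for single docs / after the last doc; >0 while batching.
            long wait = batchInFlight ? pollQueueTimeMillis : 0;
            String doc;
            try {
                doc = queue.poll(wait, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            if (doc == null) {
                break; // nothing more arrived within the window
            }
            sent++; // the real code would stream doc to the replica here
        }
        return sent;
    }

    public static void main(String[] args) {
        BlockingQueue<String> q = new LinkedBlockingQueue<>();
        q.add("doc1");
        q.add("doc2");
        System.out.println(drain(q, true, 25)); // prints 2
    }
}
```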






[jira] [Created] (SOLR-7348) Dashboard screen for a core in the Admin UI shows -1 for heap usage

2015-04-03 Thread Timothy Potter (JIRA)
Timothy Potter created SOLR-7348:


 Summary: Dashboard screen for a core in the Admin UI shows -1 for 
heap usage
 Key: SOLR-7348
 URL: https://issues.apache.org/jira/browse/SOLR-7348
 Project: Solr
  Issue Type: Bug
  Components: e
Reporter: Timothy Potter
Priority: Minor


Spin off from SOLR-7334 ...

The Luke request handler is reporting the heap usage of a core as -1; this is 
not a UI issue.






[jira] [Resolved] (SOLR-7334) Admin UI does not show "Num Docs" and "Deleted Docs"

2015-04-03 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter resolved SOLR-7334.
--
   Resolution: Fixed
Fix Version/s: 5.1
   Trunk
 Assignee: Timothy Potter

> Admin UI does not show "Num Docs" and "Deleted Docs"
> 
>
> Key: SOLR-7334
> URL: https://issues.apache.org/jira/browse/SOLR-7334
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 5.0, Trunk
>Reporter: Erick Erickson
>Assignee: Timothy Potter
>Priority: Blocker
> Fix For: Trunk, 5.1
>
> Attachments: SOLR-7334.patch, SOLR-7334.patch
>
>
> I'm calling this a blocker, but I won't argue the point too much. Mostly I'm 
> making sure we make a conscious decision here.
> Steps to reproduce:
> bin/solr start -e techproducts
> Just to go to the admin UI and select the core.
> On a chat, Upayavira volunteered, so I'm assigning it to him. I'm sure if 
> anyone wants to jump on it he wouldn't mind.
> [~thelabdude] What's your opinion?






[jira] [Commented] (SOLR-7334) Admin UI does not show "Num Docs" and "Deleted Docs"

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395014#comment-14395014
 ] 

ASF subversion and git services commented on SOLR-7334:
---

Commit 1671150 from [~thelabdude] in branch 'dev/branches/lucene_solr_5_1'
[ https://svn.apache.org/r1671150 ]

SOLR-7334: Admin UI does not show Num Docs and Deleted Docs

> Admin UI does not show "Num Docs" and "Deleted Docs"
> 
>
> Key: SOLR-7334
> URL: https://issues.apache.org/jira/browse/SOLR-7334
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 5.0, Trunk
>Reporter: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-7334.patch, SOLR-7334.patch






[jira] [Commented] (SOLR-7334) Admin UI does not show "Num Docs" and "Deleted Docs"

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395013#comment-14395013
 ] 

ASF subversion and git services commented on SOLR-7334:
---

Commit 1671149 from [~thelabdude] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671149 ]

SOLR-7334: Admin UI does not show Num Docs and Deleted Docs

> Admin UI does not show "Num Docs" and "Deleted Docs"
> 
>
> Key: SOLR-7334
> URL: https://issues.apache.org/jira/browse/SOLR-7334
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 5.0, Trunk
>Reporter: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-7334.patch, SOLR-7334.patch






[jira] [Commented] (SOLR-7334) Admin UI does not show "Num Docs" and "Deleted Docs"

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395011#comment-14395011
 ] 

ASF subversion and git services commented on SOLR-7334:
---

Commit 1671148 from [~thelabdude] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671148 ]

SOLR-7334: Admin UI does not show Num Docs and Deleted DocsC

> Admin UI does not show "Num Docs" and "Deleted Docs"
> 
>
> Key: SOLR-7334
> URL: https://issues.apache.org/jira/browse/SOLR-7334
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 5.0, Trunk
>Reporter: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-7334.patch, SOLR-7334.patch






[jira] [Commented] (SOLR-7334) Admin UI does not show "Num Docs" and "Deleted Docs"

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395009#comment-14395009
 ] 

ASF subversion and git services commented on SOLR-7334:
---

Commit 1671147 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1671147 ]

SOLR-7334: Admin UI does not show Num Docs and Deleted Docs

> Admin UI does not show "Num Docs" and "Deleted Docs"
> 
>
> Key: SOLR-7334
> URL: https://issues.apache.org/jira/browse/SOLR-7334
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 5.0, Trunk
>Reporter: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-7334.patch, SOLR-7334.patch






[jira] [Updated] (SOLR-7334) Admin UI does not show "Num Docs" and "Deleted Docs"

2015-04-03 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-7334:
-
Summary: Admin UI does not show "Num Docs" and "Deleted Docs"  (was: Admin 
UI does not show "Num Docs" and "Deleted Docs", and "Heap Memory Usage is -1")

> Admin UI does not show "Num Docs" and "Deleted Docs"
> 
>
> Key: SOLR-7334
> URL: https://issues.apache.org/jira/browse/SOLR-7334
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 5.0, Trunk
>Reporter: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-7334.patch, SOLR-7334.patch






[jira] [Updated] (SOLR-7334) Admin UI does not show "Num Docs" and "Deleted Docs", and "Heap Memory Usage is -1"

2015-04-03 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-7334:
-
Attachment: SOLR-7334.patch

Oops, too fast ... fix for deleted docs too. The heap -1 is not UI related, so 
I will open another non-blocker for that issue.

> Admin UI does not show "Num Docs" and "Deleted Docs", and "Heap Memory Usage 
> is -1"
> ---
>
> Key: SOLR-7334
> URL: https://issues.apache.org/jira/browse/SOLR-7334
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 5.0, Trunk
>Reporter: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-7334.patch, SOLR-7334.patch
>






[jira] [Updated] (SOLR-7334) Admin UI does not show "Num Docs" and "Deleted Docs", and "Heap Memory Usage is -1"

2015-04-03 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-7334:
-
Attachment: SOLR-7334.patch

Here's a quick fix for the numDocs problem. The -1 heap usage is coming back 
from the server via Luke, so that is not a UI issue.

> Admin UI does not show "Num Docs" and "Deleted Docs", and "Heap Memory Usage 
> is -1"
> ---
>
> Key: SOLR-7334
> URL: https://issues.apache.org/jira/browse/SOLR-7334
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 5.0, Trunk
>Reporter: Erick Erickson
>Priority: Blocker
> Attachments: SOLR-7334.patch






[jira] [Updated] (SOLR-7346) Stored XSS in Admin UI Schema-Browser page and Analysis page

2015-04-03 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-7346:
-
Description: 
Like CVE-2014-3628, the vulnerability also exists in the Admin UI 
Schema-Browser page and Analysis page. It is caused by improper validation of 
user-supplied input, for example, field names created via the Schema API. When 
the Schema-Browser page or Analysis page URL is visited, the XSS is triggered. 
An attacker could use this vulnerability to steal the victim's cookie-based 
authentication credentials.

Patch for Solr 5.0.0:
{noformat}
solr/webapp/web/js/scripts/schema-browser.js
--- schema-browser.js   2015-04-03 14:42:19.0 +0800
+++ schema-browser_patch.js 2015-04-03 14:42:59.0 +0800
@@ -596,7 +596,7 @@
 {
   fields.push
   (
-'' + 
field_name + ''
+'' + 
field_name.esc() + ''
   );
 }
 if( 0 !== fields.length )

solr/webapp/web/js/scripts/analysis.js
--- analysis.js 2015-04-03 14:22:34.0 +0800
+++ analysis_patch.js   2015-04-03 14:23:09.0 +0800
@@ -80,7 +80,7 @@
   {
 fields.push
 (
-  '' + field_name 
+ ''
+  '' + 
field_name.esc() + ''
 );
   }
   if( 0 !== fields.length )
{noformat}

  was:
Like CVE-2014-3628 , the vulnerability also exists in Admin UI Schema-Browser 
page and Analysis page, which was caused by  improper validation of 
user-supplied input, for example, create fields by Schema API.  When the 
Schema-Browser page or Analysis page url is clicked,  an XSS will be triggered. 
An attacker could use this vulnerability to steal the victim's cookie-based 
authentication credentials. 
patch for solr5.0.0
solr/webapp/web/js/scripts/schema-browser.js
--- schema-browser.js   2015-04-03 14:42:19.0 +0800
+++ schema-browser_patch.js 2015-04-03 14:42:59.0 +0800
@@ -596,7 +596,7 @@
 {
   fields.push
   (
-'' + 
field_name + ''
+'' + 
field_name.esc() + ''
   );
 }
 if( 0 !== fields.length )

solr/webapp/web/js/scripts/analysis.js
--- analysis.js 2015-04-03 14:22:34.0 +0800
+++ analysis_patch.js   2015-04-03 14:23:09.0 +0800
@@ -80,7 +80,7 @@
   {
 fields.push
 (
-  '' + field_name 
+ ''
+  '' + 
field_name.esc() + ''
 );
   }
   if( 0 !== fields.length )


> Stored XSS in Admin UI Schema-Browser page and Analysis page
> 
>
> Key: SOLR-7346
> URL: https://issues.apache.org/jira/browse/SOLR-7346
> Project: Solr
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 4.10.2, 5.0
> Environment: linux x86_64
> jdk 1.7.0.75
> apache tomcat-7.0.57
> solr 5.0.0
>Reporter: Mei Wang
>  Labels: patch, security
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Like CVE-2014-3628 , the vulnerability also exists in Admin UI Schema-Browser 
> page and Analysis page, which was caused by  improper validation of 
> user-supplied input, for example, create fields by Schema API.  When the 
> Schema-Browser page or Analysis page url is clicked,  an XSS will be 
> triggered. An attacker could use this vulnerability to steal the victim's 
> cookie-based authentication credentials. 
> patch for solr5.0.0
> {noformat}
> solr/webapp/web/js/scripts/schema-browser.js
> --- schema-browser.js   2015-04-03 14:42:19.0 +0800
> +++ schema-browser_patch.js 2015-04-03 14:42:59.0 +0800
> @@ -596,7 +596,7 @@
>  {
>fields.push
>(
> -'' + 
> field_name + ''
> +'' + 
> field_name.esc() + ''
>);
>  }
>  if( 0 !== fields.length )
> solr/webapp/web/js/scripts/analysis.js
> --- analysis.js 2015-04-03 14:22:34.0 +0800
> +++ analysis_patch.js   2015-04-03 14:23:09.0 +0800
> @@ -80,7 +80,7 @@
>{
>  fields.push
>  (
> -  '' + 
> field_name + ''
> +  '' + 
> field_name.esc() + ''
>  );
>}
>if( 0 !== fields.length )
> {noformat}
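For context, the root cause is unescaped interpolation of a user-controlled field name into HTML. Below is a minimal, hypothetical escaper in the same spirit as the esc() call used in the patch — an illustrative sketch, not Solr's actual esc() implementation:

```java
public class HtmlEscapeSketch {
    // Illustrative HTML escaper (hypothetical helper, not Solr's esc()):
    // replaces the five characters that let attacker input break out of
    // an HTML text or attribute context.
    static String esc(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A malicious field name created via the Schema API is rendered inert:
        String fieldName = "<script>alert(1)</script>";
        System.out.println(esc(fieldName)); // prints &lt;script&gt;alert(1)&lt;/script&gt;
    }
}
```

With this in place, the field name is displayed as text instead of being parsed as markup, which is exactly what the field_name.esc() change in the patch achieves.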




[jira] [Commented] (LUCENE-6393) SpanFirstQuery sometimes returns Spans without any positions

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394991#comment-14394991
 ] 

ASF subversion and git services commented on LUCENE-6393:
-

Commit 1671139 from [~rcmuir] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671139 ]

LUCENE-6393: add equivalence tests for SpanFirstQuery.

> SpanFirstQuery sometimes returns Spans without any positions
> 
>
> Key: LUCENE-6393
> URL: https://issues.apache.org/jira/browse/LUCENE-6393
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Fix For: Trunk, 5.2
>
>
> This hits an assert in SpanScorer because it breaks the javadocs contract of 
> Spans.nextStartPosition():
>* Returns the next start position for the current doc.
>* There is always *at least one start/end position* per doc.
>* After the last start/end position at the current doc this returns 
> NO_MORE_POSITIONS.






[jira] [Commented] (LUCENE-6393) SpanFirstQuery sometimes returns Spans without any positions

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394987#comment-14394987
 ] 

Robert Muir commented on LUCENE-6393:
-

I committed tests marked with \@AwaitsFix to TestSpanSearchEquivalence for now. 
I was working to beef those up a bit and cover more of the spans so we could 
have more confidence.

> SpanFirstQuery sometimes returns Spans without any positions
> 
>
> Key: LUCENE-6393
> URL: https://issues.apache.org/jira/browse/LUCENE-6393
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Fix For: Trunk, 5.2
>
>
> This hits an assert in SpanScorer because it breaks the javadocs contract of 
> Spans.nextStartPosition():
>* Returns the next start position for the current doc.
>* There is always *at least one start/end position* per doc.
>* After the last start/end position at the current doc this returns 
> NO_MORE_POSITIONS.






[jira] [Commented] (LUCENE-6393) SpanFirstQuery sometimes returns Spans without any positions

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394982#comment-14394982
 ] 

ASF subversion and git services commented on LUCENE-6393:
-

Commit 1671137 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1671137 ]

LUCENE-6393: add equivalence tests for SpanFirstQuery.

> SpanFirstQuery sometimes returns Spans without any positions
> 
>
> Key: LUCENE-6393
> URL: https://issues.apache.org/jira/browse/LUCENE-6393
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Fix For: Trunk, 5.2
>
>
> This hits an assert in SpanScorer because it breaks the javadocs contract of 
> Spans.nextStartPosition():
>* Returns the next start position for the current doc.
>* There is always *at least one start/end position* per doc.
>* After the last start/end position at the current doc this returns 
> NO_MORE_POSITIONS.






[jira] [Commented] (LUCENE-6393) SpanFirstQuery sometimes returns Spans without any positions

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394969#comment-14394969
 ] 

Robert Muir commented on LUCENE-6393:
-

This is pretty easy to find by just doing a SpanFirstQuery(someNearQuery, 0).

It only happens when it wraps a span-near query; when it wraps a simple 
termquery it never does the wrong thing.

> SpanFirstQuery sometimes returns Spans without any positions
> 
>
> Key: LUCENE-6393
> URL: https://issues.apache.org/jira/browse/LUCENE-6393
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Fix For: Trunk, 5.2
>
>
> This hits an assert in SpanScorer because it breaks the javadocs contract of 
> Spans.nextStartPosition():
>* Returns the next start position for the current doc.
>* There is always *at least one start/end position* per doc.
>* After the last start/end position at the current doc this returns 
> NO_MORE_POSITIONS.






[jira] [Resolved] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-6391.
-
   Resolution: Fixed
Fix Version/s: 5.2
   Trunk

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Fix For: Trunk, 5.2
>
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Reopened] (LUCENE-6393) SpanFirstQuery sometimes returns Spans without any positions

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reopened LUCENE-6393:
-

Oops, I resolved the wrong issue by accident.

> SpanFirstQuery sometimes returns Spans without any positions
> 
>
> Key: LUCENE-6393
> URL: https://issues.apache.org/jira/browse/LUCENE-6393
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Fix For: Trunk, 5.2
>
>
> This hits an assert in SpanScorer because it breaks the javadocs contract of 
> Spans.nextStartPosition():
>* Returns the next start position for the current doc.
>* There is always *at least one start/end position* per doc.
>* After the last start/end position at the current doc this returns 
> NO_MORE_POSITIONS.






[jira] [Resolved] (LUCENE-6393) SpanFirstQuery sometimes returns Spans without any positions

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-6393.
-
   Resolution: Fixed
Fix Version/s: 5.2
   Trunk

> SpanFirstQuery sometimes returns Spans without any positions
> 
>
> Key: LUCENE-6393
> URL: https://issues.apache.org/jira/browse/LUCENE-6393
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Fix For: Trunk, 5.2
>
>
> This hits an assert in SpanScorer because it breaks the javadocs contract of 
> Spans.nextStartPosition():
>* Returns the next start position for the current doc.
>* There is always *at least one start/end position* per doc.
>* After the last start/end position at the current doc this returns 
> NO_MORE_POSITIONS.






[jira] [Commented] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394963#comment-14394963
 ] 

ASF subversion and git services commented on LUCENE-6391:
-

Commit 1671131 from [~rcmuir] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671131 ]

LUCENE-6391: Give SpanScorer two-phase iterator support

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






Re: Test infrastructure: is a running counter of tests completed possible?

2015-04-03 Thread Dawid Weiss
There's virtually no way to please everyone... so the default
TextReport would quickly become bloated with a lot of options. It
already kind of has.

Anyway, I also looked again at what you asked for:

> Suite 279/478 completed on J0 in 2.18s, 2 tests

I think this may be confusing, as it seems to imply a given suite
number is somehow connected to a given forked JVM. This isn't the
case; there is typically dynamic job stealing involved. You're
probably looking for some sort of "total" progress indicator, aren't
you?

Dawid

On Fri, Apr 3, 2015 at 8:27 PM, Shawn Heisey  wrote:
> On 4/3/2015 11:13 AM, Dawid Weiss wrote:
>> You can have anything you like. The output is one example of a test
>> output listener:
>> https://github.com/carrotsearch/randomizedtesting/blob/master/junit4-ant/src/main/java/com/carrotsearch/ant/tasks/junit4/listeners/TextReport.java
>>
>> You can attach any listeners to the junit:junit4 ant task.
>
> For me, this sort of thing is very advanced, so I will apologize for
> seeming like a complete idiot.
>
> It seems as though what I would want to do here is make a new class that
> extends TextReport ... but then I'm completely unfamiliar with what's
> involved in attaching the new class to the ant task you mentioned (or
> even where that's defined), how I would need to package it, etc.
>
> A possibly better idea is to give you a pull request so the feature
> makes it to all users of randomizedtesting.
>
> Thanks,
> Shawn
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




[jira] [Commented] (LUCENE-6393) SpanFirstQuery sometimes returns Spans without any positions

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394962#comment-14394962
 ] 

Robert Muir commented on LUCENE-6393:
-

I think SpanPayloadCheckQuery, SpanPositionRangeQuery, or any other subclasses 
of SpanPositionCheckQuery are likely impacted.

> SpanFirstQuery sometimes returns Spans without any positions
> 
>
> Key: LUCENE-6393
> URL: https://issues.apache.org/jira/browse/LUCENE-6393
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>
> This hits an assert in SpanScorer because it breaks the javadocs contract of 
> Spans.nextStartPosition():
>* Returns the next start position for the current doc.
>* There is always *at least one start/end position* per doc.
>* After the last start/end position at the current doc this returns 
> NO_MORE_POSITIONS.






[jira] [Created] (LUCENE-6393) SpanFirstQuery sometimes returns Spans without any positions

2015-04-03 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-6393:
---

 Summary: SpanFirstQuery sometimes returns Spans without any 
positions
 Key: LUCENE-6393
 URL: https://issues.apache.org/jira/browse/LUCENE-6393
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


This hits an assert in SpanScorer because it breaks the javadocs contract of 
Spans.nextStartPosition():
   * Returns the next start position for the current doc.
   * There is always *at least one start/end position* per doc.
   * After the last start/end position at the current doc this returns 
NO_MORE_POSITIONS.
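The contract above can be modeled with a tiny stand-in class — an illustrative toy, not Lucene's Spans API. A consumer such as SpanScorer may assume the first nextStartPosition() call on a matching doc returns a real position; the LUCENE-6393 bug is a Spans that matched a doc yet had none:

```java
import java.util.List;

public class SpansContractSketch {
    static final int NO_MORE_POSITIONS = Integer.MAX_VALUE;

    // Toy stand-in for a Spans positioned on a single document (hypothetical,
    // not Lucene's API). The contract: at least one start position must be
    // available before NO_MORE_POSITIONS is returned.
    static class DocSpans {
        private final List<Integer> starts;
        private int idx = -1;

        DocSpans(List<Integer> starts) {
            if (starts.isEmpty()) {
                // This is the invariant SpanFirstQuery broke: a doc that
                // "matches" but exposes no positions at all.
                throw new IllegalStateException("contract: at least one position per doc");
            }
            this.starts = starts;
        }

        int nextStartPosition() {
            idx++;
            return idx < starts.size() ? starts.get(idx) : NO_MORE_POSITIONS;
        }
    }

    public static void main(String[] args) {
        DocSpans spans = new DocSpans(List.of(0, 4));
        int start = spans.nextStartPosition();
        // Guaranteed by the contract; SpanScorer's assert relies on this.
        if (start == NO_MORE_POSITIONS) throw new AssertionError("contract violated");
        while (start != NO_MORE_POSITIONS) {
            System.out.println("start=" + start);
            start = spans.nextStartPosition();
        }
    }
}
```

A Spans that returns NO_MORE_POSITIONS on the first call for a matching doc breaks every consumer written against this iteration pattern.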







Please, please, please look at this patch

2015-04-03 Thread Walter Underwood
This is the third time I’ve asked someone to look at this. Do I need to bribe 
someone? What kind of beer? Or would SSDs be more effective?

The patch adds fuzzy search to the edismax specs. Now that fuzzy is 100X 
faster, this is even more useful.

I know this is useful, because this is the third time it has been implemented. 
I didn’t submit it the first two times, which is my bad.

The issue says dismax but the patch is for edismax.

https://issues.apache.org/jira/browse/SOLR-629

I would love to see this in a 4.x release, but 5.x would be OK. We need to run 
4.x for a while because we have a servlet filter to provide actually useful 
metrics. Solr has really lame metrics.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)




Re: [jira] [Commented] (LUCENE-6390) WeightedSpansTermExtractor has a broken IndexReader

2015-04-03 Thread Timothy Potter
Cool - just waiting on SOLR-7126, which Noble is looking into and then
I'll cut an RC ...

Have a great weekend everyone!

On Fri, Apr 3, 2015 at 11:43 AM, Robert Muir (JIRA)  wrote:
>
> [ 
> https://issues.apache.org/jira/browse/LUCENE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394897#comment-14394897
>  ]
>
> Robert Muir commented on LUCENE-6390:
> -
>
> I unset blocker. I can work to fix this, that is going to be by removing the 
> optimization. The current optimization is incorrect so just pretend like it 
> does not exist.
>
>> WeightedSpansTermExtractor has a broken IndexReader
>> ---
>>
>> Key: LUCENE-6390
>> URL: https://issues.apache.org/jira/browse/LUCENE-6390
>> Project: Lucene - Core
>>  Issue Type: Bug
>>Reporter: Robert Muir
>>Priority: Critical
>> Fix For: 5.1
>>
>>
>> The DelegatingLeafReader there is broken, it does not implement 
>> getFieldInfos. This is not an optional method, and this is blocking 
>> performance improvements to spans.
>> I'm gonna work around it for now, but if it won't be fixed, then this 
>> DelegatingLeafReader optimization should be removed.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




[jira] [Commented] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394909#comment-14394909
 ] 

Robert Muir commented on LUCENE-6391:
-

Bug is just an accident since its the default value of a JIRA issue.

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Updated] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-6391:

Issue Type: Improvement  (was: Bug)

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Commented] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394889#comment-14394889
 ] 

David Smiley commented on LUCENE-6391:
--

Awesome!

(IMO this is an optimization/improvement, not a bug... but I see it is now 
filed correctly in CHANGES.txt, unlike the classification of the JIRA issue).

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Commented] (LUCENE-6390) WeightedSpansTermExtractor has a broken IndexReader

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394897#comment-14394897
 ] 

Robert Muir commented on LUCENE-6390:
-

I unset blocker. I can work to fix this; the fix will be to remove the 
optimization. The current optimization is incorrect, so just pretend it 
does not exist.

> WeightedSpansTermExtractor has a broken IndexReader
> ---
>
> Key: LUCENE-6390
> URL: https://issues.apache.org/jira/browse/LUCENE-6390
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Critical
> Fix For: 5.1
>
>
> The DelegatingLeafReader there is broken, it does not implement 
> getFieldInfos. This is not an optional method, and this is blocking 
> performance improvements to spans.
> I'm gonna work around it for now, but if it won't be fixed, then this 
> DelegatingLeafReader optimization should be removed.






[jira] [Updated] (LUCENE-6390) WeightedSpansTermExtractor has a broken IndexReader

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-6390:

Priority: Critical  (was: Blocker)

> WeightedSpansTermExtractor has a broken IndexReader
> ---
>
> Key: LUCENE-6390
> URL: https://issues.apache.org/jira/browse/LUCENE-6390
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Critical
> Fix For: 5.1
>
>
> The DelegatingLeafReader there is broken, it does not implement 
> getFieldInfos. This is not an optional method, and this is blocking 
> performance improvements to spans.
> I'm gonna work around it for now, but if it won't be fixed, then this 
> DelegatingLeafReader optimization should be removed.






[jira] [Commented] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394881#comment-14394881
 ] 

ASF subversion and git services commented on LUCENE-6391:
-

Commit 1671123 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1671123 ]

LUCENE-6391: Give SpanScorer two-phase iterator support

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Commented] (LUCENE-6390) WeightedSpansTermExtractor has a broken IndexReader

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394862#comment-14394862
 ] 

Robert Muir commented on LUCENE-6390:
-

I think our query stack should have the ability to check fieldinfos and use 
everything it knows to make things efficient.

It bothers me that I can make a perfectly valid improvement to what queries do 
behind the scenes, but highlighter tests will fail, because they have broken 
code.
Maybe I should have just disabled their tests?

We cannot let the highlighter hold back our core search code.

> WeightedSpansTermExtractor has a broken IndexReader
> ---
>
> Key: LUCENE-6390
> URL: https://issues.apache.org/jira/browse/LUCENE-6390
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Blocker
> Fix For: 5.1
>
>
> The DelegatingLeafReader there is broken, it does not implement 
> getFieldInfos. This is not an optional method, and this is blocking 
> performance improvements to spans.
> I'm gonna work around it for now, but if it won't be fixed, then this 
> DelegatingLeafReader optimization should be removed.






Re: Test infrastructure: is a running counter of tests completed possible?

2015-04-03 Thread Shawn Heisey
On 4/3/2015 11:13 AM, Dawid Weiss wrote:
> You can have anything you like. The output is one example of a test
> output listener:
> https://github.com/carrotsearch/randomizedtesting/blob/master/junit4-ant/src/main/java/com/carrotsearch/ant/tasks/junit4/listeners/TextReport.java
>
> You can attach any listeners to the junit:junit4 ant task.

For me, this sort of thing is very advanced, so I will apologize for
seeming like a complete idiot.

It seems as though what I would want to do here is make a new class that
extends TextReport ... but then I'm completely unfamiliar with what's
involved in attaching the new class to the ant task you mentioned (or
even where that's defined), how I would need to package it, etc.
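For reference, listeners are declared as a nested element of the junit4 Ant task. A rough build.xml sketch — element and attribute names follow the junit4-ant examples and may vary by version, so treat this as an assumption rather than Lucene's actual configuration:

```xml
<!-- assumes xmlns:junit4="antlib:com.carrotsearch.ant.tasks.junit4" on the project -->
<junit4:junit4 dir="${build.dir}/test" maxmemory="512m">
  <classpath refid="test.classpath"/>
  <fileset dir="${build.dir}/classes/test" includes="**/Test*.class"/>
  <listeners>
    <!-- built-in console listener; a custom listener (e.g. a TextReport
         subclass printing a running counter) would be registered here -->
    <junit4:report-text showThrowable="true" showStatusOk="false"/>
  </listeners>
</junit4:junit4>
```

A custom class would have to be compiled onto the task's classpath before it could be referenced from the listeners block.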

A possibly better idea is to give you a pull request so the feature
makes it to all users of randomizedtesting.

Thanks,
Shawn





[jira] [Commented] (SOLR-7338) A reloaded core will never register itself as active after a ZK session expiration

2015-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394840#comment-14394840
 ] 

David Smiley commented on SOLR-7338:


Here's a question for you [~markrmil...@gmail.com]: If every core were to be 
reloaded, would that change anything?  What if I go and do that to all my 
cores?  Can we just assume that all cores may have been reloaded at some point 
in the past?  If we do assume that, do we lose anything?  -- other than 
complexity :-)

> A reloaded core will never register itself as active after a ZK session 
> expiration
> --
>
> Key: SOLR-7338
> URL: https://issues.apache.org/jira/browse/SOLR-7338
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Timothy Potter
>Assignee: Mark Miller
> Attachments: SOLR-7338.patch, SOLR-7338_test.patch
>
>
> If a collection gets reloaded, then a core's isReloaded flag is always true. 
> If a core experiences a ZK session expiration after a reload, then it won't 
> ever be able to set itself to active because of the check in 
> {{ZkController#register}}:
> {code}
> UpdateLog ulog = core.getUpdateHandler().getUpdateLog();
> if (!core.isReloaded() && ulog != null) {
>   // disable recovery in case shard is in construction state (for 
> shard splits)
>   Slice slice = getClusterState().getSlice(collection, shardId);
>   if (slice.getState() != Slice.State.CONSTRUCTION || !isLeader) {
> Future recoveryFuture = 
> core.getUpdateHandler().getUpdateLog().recoverFromLog();
> if (recoveryFuture != null) {
>   log.info("Replaying tlog for " + ourUrl + " during startup... 
> NOTE: This can take a while.");
>   recoveryFuture.get(); // NOTE: this could potentially block for
>   // minutes or more!
>   // TODO: public as recovering in the mean time?
>   // TODO: in the future we could do peersync in parallel with 
> recoverFromLog
> } else {
>   log.info("No LogReplay needed for core=" + core.getName() + " 
> baseURL=" + baseUrl);
> }
>   }
>   boolean didRecovery = checkRecovery(coreName, desc, 
> recoverReloadedCores, isLeader, cloudDesc,
>   collection, coreZkNodeName, shardId, leaderProps, core, cc);
>   if (!didRecovery) {
> publish(desc, ZkStateReader.ACTIVE);
>   }
> }
> {code}
> I can easily simulate this on trunk by doing:
> {code}
> bin/solr -c -z localhost:2181
> bin/solr create -c foo
> bin/post -c foo example/exampledocs/*.xml
> curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=foo";
> kill -STOP  && sleep  && kill -CONT 
> {code}
> Where  is the process ID of the Solr node. Here are the logs after the 
> CONT command. As you can see below, the core never gets to setting itself as 
> active again. I think the bug is that the isReloaded flag needs to get set 
> back to false once the reload is successful, but I don't understand what this 
> flag is needed for anyway???
> {code}
> INFO  - 2015-04-01 17:28:50.962; 
> org.apache.solr.common.cloud.ConnectionManager; Watcher 
> org.apache.solr.common.cloud.ConnectionManager@5519dba0 
> name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent 
> state:Disconnected type:None path:null path:null type:None
> INFO  - 2015-04-01 17:28:50.963; 
> org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
> INFO  - 2015-04-01 17:28:51.107; 
> org.apache.solr.common.cloud.ConnectionManager; Watcher 
> org.apache.solr.common.cloud.ConnectionManager@5519dba0 
> name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent 
> state:Expired type:None path:null path:null type:None
> INFO  - 2015-04-01 17:28:51.107; 
> org.apache.solr.common.cloud.ConnectionManager; Our previous ZooKeeper 
> session was expired. Attempting to reconnect to recover relationship with 
> ZooKeeper...
> INFO  - 2015-04-01 17:28:51.108; org.apache.solr.cloud.Overseer; Overseer 
> (id=93579450724974592-192.168.1.2:8983_solr-n_00) closing
> INFO  - 2015-04-01 17:28:51.108; 
> org.apache.solr.cloud.ZkController$WatcherImpl; A node got unwatched for 
> /configs/foo
> INFO  - 2015-04-01 17:28:51.108; 
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Overseer Loop exiting : 
> 192.168.1.2:8983_solr
> INFO  - 2015-04-01 17:28:51.109; 
> org.apache.solr.cloud.OverseerCollectionProcessor; According to ZK I 
> (id=93579450724974592-192.168.1.2:8983_solr-n_00) am no longer a 
> leader.
> INFO  - 2015-04-01 17:28:51.108; org.apache.solr.cloud.ZkController$4; 
> Running listeners for /configs/foo
> INFO  - 2015-04-01 17:28:51.109; 
> org.apache.solr.common.cl

[jira] [Commented] (LUCENE-6390) WeightedSpansTermExtractor has a broken IndexReader

2015-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394811#comment-14394811
 ] 

David Smiley commented on LUCENE-6390:
--

I just noticed LUCENE-6388 (though that's targeting 5.2) and see it has a 
perfectly fine work-around.
Maybe I'm splitting hairs; 5.1 doesn't seem to be held up at the moment on 
account of this anyhow.

> WeightedSpansTermExtractor has a broken IndexReader
> ---
>
> Key: LUCENE-6390
> URL: https://issues.apache.org/jira/browse/LUCENE-6390
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Blocker
> Fix For: 5.1
>
>
> The DelegatingLeafReader there is broken, it does not implement 
> getFieldInfos. This is not an optional method, and this is blocking 
> performance improvements to spans.
> I'm gonna work around it for now, but if it won't be fixed, then this 
> DelegatingLeafReader optimization should be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6390) WeightedSpansTermExtractor has a broken IndexReader

2015-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394806#comment-14394806
 ] 

David Smiley commented on LUCENE-6390:
--

I agree it should be fixed, but I don't see this as a blocker, unless I'm 
misunderstanding the scope.  This feature was added back in 4.2 by [~simonw] 
and has been there since.  I've scanned through the usages of the method and I 
didn't see an occurrence from a Query.  Of course a user might have something 
custom, but that seems awfully rare.  Is there something I'm missing?  Anyway, 
if you can get this fixed then great; I just don't see it as blocking.

> WeightedSpansTermExtractor has a broken IndexReader
> ---
>
> Key: LUCENE-6390
> URL: https://issues.apache.org/jira/browse/LUCENE-6390
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Blocker
> Fix For: 5.1
>
>
> The DelegatingLeafReader there is broken, it does not implement 
> getFieldInfos. This is not an optional method, and this is blocking 
> performance improvements to spans.
> I'm gonna work around it for now, but if it won't be fixed, then this 
> DelegatingLeafReader optimization should be removed.






[jira] [Commented] (SOLR-6692) hl.maxAnalyzedChars should apply cumulatively on a multi-valued field

2015-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394764#comment-14394764
 ] 

David Smiley commented on SOLR-6692:


bq. Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit 
early from its field value loop if it reaches hl.snippets.

Actually it can never do that, because it takes the top hl.snippets from every 
value and then takes the top hl.snippets of that.  Anyway, there are multiple 
mechanisms to exit early now -- hl.maxAnalyzedChars, and both hl.multiValued* 
options.

> hl.maxAnalyzedChars should apply cumulatively on a multi-valued field
> -
>
> Key: SOLR-6692
> URL: https://issues.apache.org/jira/browse/SOLR-6692
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 5.2
>
> Attachments: 
> SOLR-6692_hl_maxAnalyzedChars_cumulative_multiValued,_and_more.patch
>
>
> in DefaultSolrHighlighter, the hl.maxAnalyzedChars figure is used to 
> constrain how much text is analyzed before the highlighter stops, in the 
> interests of performance.  For a multi-valued field, it effectively treats 
> each value anew, no matter how much text it was previously analyzed for other 
> values for the same field for the current document. The PostingsHighlighter 
> doesn't work this way -- hl.maxAnalyzedChars is effectively the total budget 
> for a field for a document, no matter how many values there might be.  It's 
> not reset for each value.  I think this makes more sense.  When we loop over 
> the values, we should subtract from hl.maxAnalyzedChars the length of the 
> value just checked.  The motivation here is consistency with 
> PostingsHighlighter, and to allow for hl.maxAnalyzedChars to be pushed down 
> to term vector uninversion, which wouldn't be possible for multi-valued 
> fields based on the current way this parameter is used.
> Interestingly, I noticed Solr's use of FastVectorHighlighter doesn't honor 
> hl.maxAnalyzedChars as the FVH doesn't have a knob for that.  It does have 
> hl.phraseLimit which is a limit that could be used for a similar purpose, 
> albeit applied differently.
> Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit 
> early from its field value loop if it reaches hl.snippets, and if 
> hl.preserveMulti=true






[jira] [Commented] (LUCENE-6392) Add offset limit to Highlighter's TokenStreamFromTermVector

2015-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394749#comment-14394749
 ] 

David Smiley commented on LUCENE-6392:
--

Oh, and before I forget: obviously Solr's DefaultSolrHighlighter should 
propagate the offset limit. It's not in this patch to avoid interference with my 
other WIP.

> Add offset limit to Highlighter's TokenStreamFromTermVector
> ---
>
> Key: LUCENE-6392
> URL: https://issues.apache.org/jira/browse/LUCENE-6392
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 5.2
>
> Attachments: LUCENE-6392_highlight_term_vector_maxStartOffset.patch
>
>
> The Highlighter's TokenStreamFromTermVector utility, typically accessed via 
> TokenSources, should have the ability to filter out tokens beyond a 
> configured offset. There is a TODO there already, and this issue addresses 
> it.  New methods in TokenSources now propagate a limit.
> This patch also includes some memory saving optimizations, to be described 
> shortly.






[jira] [Updated] (LUCENE-6392) Add offset limit to Highlighter's TokenStreamFromTermVector

2015-04-03 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-6392:
-
Attachment: LUCENE-6392_highlight_term_vector_maxStartOffset.patch

(Patch attached).
Elaborating on the description:

This patch includes a tweak to the TokenLL[] array size initialization to 
consider this new limit when guessing a good size.

This patch includes memory saving optimizations to the information it 
accumulates.  Before the patch, each TokenLL had a char[], so there were a 
total of 2 objects per token (including the token itself).  Now I use a shared 
CharsRefBuilder with a pointer & length into it, so there's just 1 object now, 
plus byte savings by avoiding a char[] header.  I also reduced the bytes needed 
for a TokenLL instance from 40 to 32.  *It does assume that the char offset 
delta (endOffset - startOffset) can fit within a short*, which seems like a 
reasonable assumption to me. For safety I guard against overflow and substitute 
Short.MAX_VALUE.

Finally, to encourage users to supply a limit (even if "-1" to mean no limit), 
I decided to deprecate many of the methods in TokenSources for new ones that 
include a limit parameter.  But for those methods that fall-back to a provided 
Analyzer, _I have to wonder now if it makes sense for these methods to filter 
the analyzers_.  I think it does -- if you want to limit the tokens then it 
shouldn't matter where you got them from -- you want to limit them.  I haven't 
added that but I'm looking for feedback first.
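To make that layout concrete, here is a minimal, self-contained sketch of the 
two ideas -- a single shared character buffer with per-token pointer/length 
fields, and a short offset delta clamped at Short.MAX_VALUE on overflow. The 
class and field names below are illustrative only, not the actual patch code:

```java
// Sketch: token text lives in one shared buffer (one object total, not one
// char[] per token), and the offset delta is squeezed into a short with an
// overflow guard. Names are illustrative, not Lucene's.
class CompactTokens {
    private final StringBuilder chars = new StringBuilder(); // shared storage

    static final class TokenRef {
        int charsStart;    // pointer into the shared buffer
        short charsLen;    // token text length
        int startOffset;   // absolute start offset in the source text
        short offsetDelta; // endOffset - startOffset, clamped to Short.MAX_VALUE
    }

    TokenRef add(String text, int startOffset, int endOffset) {
        TokenRef t = new TokenRef();
        t.charsStart = chars.length();
        chars.append(text);
        t.charsLen = (short) text.length();
        t.startOffset = startOffset;
        int delta = endOffset - startOffset;
        // For safety, clamp rather than let the cast wrap to a negative short.
        t.offsetDelta = delta > Short.MAX_VALUE ? Short.MAX_VALUE : (short) delta;
        return t;
    }

    String text(TokenRef t) {
        return chars.substring(t.charsStart, t.charsStart + t.charsLen);
    }

    int endOffset(TokenRef t) {
        return t.startOffset + t.offsetDelta;
    }
}
```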

> Add offset limit to Highlighter's TokenStreamFromTermVector
> ---
>
> Key: LUCENE-6392
> URL: https://issues.apache.org/jira/browse/LUCENE-6392
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 5.2
>
> Attachments: LUCENE-6392_highlight_term_vector_maxStartOffset.patch
>
>
> The Highlighter's TokenStreamFromTermVector utility, typically accessed via 
> TokenSources, should have the ability to filter out tokens beyond a 
> configured offset. There is a TODO there already, and this issue addresses 
> it.  New methods in TokenSources now propagate a limit.
> This patch also includes some memory saving optimizations, to be described 
> shortly.






[jira] [Created] (LUCENE-6392) Add offset limit to Highlighter's TokenStreamFromTermVector

2015-04-03 Thread David Smiley (JIRA)
David Smiley created LUCENE-6392:


 Summary: Add offset limit to Highlighter's 
TokenStreamFromTermVector
 Key: LUCENE-6392
 URL: https://issues.apache.org/jira/browse/LUCENE-6392
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 5.2


The Highlighter's TokenStreamFromTermVector utility, typically accessed via 
TokenSources, should have the ability to filter out tokens beyond a configured 
offset. There is a TODO there already, and this issue addresses it.  New 
methods in TokenSources now propagate a limit.

This patch also includes some memory saving optimizations, to be described 
shortly.






[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2884 - Still Failing

2015-04-03 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2884/

3 tests failed.
FAILED:  org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test

Error Message:
IOException occured when talking to server at: 
http://127.0.0.1:20844/c8n_1x3_commits_shard1_replica1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: http://127.0.0.1:20844/c8n_1x3_commits_shard1_replica1
at 
__randomizedtesting.SeedInfo.seed([691015E6B7971E47:E1442A3C196B73BF]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:570)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:483)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:464)
at 
org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.oneShardTest(LeaderInitiatedRecoveryOnCommitTest.java:132)
at 
org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test(LeaderInitiatedRecoveryOnCommitTest.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apa

Re: Test infrastructure: is a running counter of tests completed possible?

2015-04-03 Thread Dawid Weiss
You can have anything you like. The output is one example of a test
output listener:
https://github.com/carrotsearch/randomizedtesting/blob/master/junit4-ant/src/main/java/com/carrotsearch/ant/tasks/junit4/listeners/TextReport.java

You can attach any listeners to the junit:junit4 ant task.

Dawid
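For illustration, the bookkeeping such a counter needs is small. The callback 
below is hypothetical -- the real extension point is the randomizedtesting 
listener API linked above -- but the formatted line matches the output Shawn 
proposed:

```java
import java.util.Locale;

// Hypothetical sketch of a suite-progress counter. A real implementation
// would be a junit4 report listener (like TextReport); onSuiteCompleted here
// stands in for whatever per-suite event that API delivers.
class SuiteCounter {
    private final int totalSuites;
    private int completed;

    SuiteCounter(int totalSuites) { this.totalSuites = totalSuites; }

    // Called once per finished suite; returns the formatted progress line.
    synchronized String onSuiteCompleted(String jvmId, double seconds, int tests) {
        completed++;
        return String.format(Locale.ROOT,
                "Suite %d/%d completed on %s in %.2fs, %d test%s",
                completed, totalSuites, jvmId, seconds, tests,
                tests == 1 ? "" : "s");
    }
}
```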


On Fri, Apr 3, 2015 at 6:06 PM, Shawn Heisey  wrote:
> Would it be possible to have a running total of the number of test
> suites completed?  This is how I would envision the test output with
> this addition:
>
>[junit4] Suite:
> org.apache.solr.rest.schema.TestManagedSchemaFieldTypeResource
>[junit4] Suite 278/478 completed on J1 in 1.28s, 1 test
>[junit4]
>[junit4] Suite: org.apache.solr.spelling.suggest.TestAnalyzedSuggestions
>[junit4] Suite 279/478 completed on J0 in 2.18s, 2 tests
>[junit4]
>[junit4] Suite:
> org.apache.solr.handler.component.DistributedExpandComponentTest
>[junit4] Suite 280/478 completed on J0 in 4.28s, 1 test
>
> Thanks,
> Shawn
>




[jira] [Updated] (SOLR-6692) hl.maxAnalyzedChars should apply cumulatively on a multi-valued field

2015-04-03 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-6692:
---
Fix Version/s: (was: 5.0)
   5.2

> hl.maxAnalyzedChars should apply cumulatively on a multi-valued field
> -
>
> Key: SOLR-6692
> URL: https://issues.apache.org/jira/browse/SOLR-6692
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 5.2
>
> Attachments: 
> SOLR-6692_hl_maxAnalyzedChars_cumulative_multiValued,_and_more.patch
>
>
> in DefaultSolrHighlighter, the hl.maxAnalyzedChars figure is used to 
> constrain how much text is analyzed before the highlighter stops, in the 
> interests of performance.  For a multi-valued field, it effectively treats 
> each value anew, no matter how much text it was previously analyzed for other 
> values for the same field for the current document. The PostingsHighlighter 
> doesn't work this way -- hl.maxAnalyzedChars is effectively the total budget 
> for a field for a document, no matter how many values there might be.  It's 
> not reset for each value.  I think this makes more sense.  When we loop over 
> the values, we should subtract from hl.maxAnalyzedChars the length of the 
> value just checked.  The motivation here is consistency with 
> PostingsHighlighter, and to allow for hl.maxAnalyzedChars to be pushed down 
> to term vector uninversion, which wouldn't be possible for multi-valued 
> fields based on the current way this parameter is used.
> Interestingly, I noticed Solr's use of FastVectorHighlighter doesn't honor 
> hl.maxAnalyzedChars as the FVH doesn't have a knob for that.  It does have 
> hl.phraseLimit which is a limit that could be used for a similar purpose, 
> albeit applied differently.
> Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit 
> early from its field value loop if it reaches hl.snippets, and if 
> hl.preserveMulti=true






[jira] [Assigned] (SOLR-6692) hl.maxAnalyzedChars should apply cumulatively on a multi-valued field

2015-04-03 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley reassigned SOLR-6692:
--

Assignee: David Smiley

> hl.maxAnalyzedChars should apply cumulatively on a multi-valued field
> -
>
> Key: SOLR-6692
> URL: https://issues.apache.org/jira/browse/SOLR-6692
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 5.2
>
> Attachments: 
> SOLR-6692_hl_maxAnalyzedChars_cumulative_multiValued,_and_more.patch
>
>
> in DefaultSolrHighlighter, the hl.maxAnalyzedChars figure is used to 
> constrain how much text is analyzed before the highlighter stops, in the 
> interests of performance.  For a multi-valued field, it effectively treats 
> each value anew, no matter how much text it was previously analyzed for other 
> values for the same field for the current document. The PostingsHighlighter 
> doesn't work this way -- hl.maxAnalyzedChars is effectively the total budget 
> for a field for a document, no matter how many values there might be.  It's 
> not reset for each value.  I think this makes more sense.  When we loop over 
> the values, we should subtract from hl.maxAnalyzedChars the length of the 
> value just checked.  The motivation here is consistency with 
> PostingsHighlighter, and to allow for hl.maxAnalyzedChars to be pushed down 
> to term vector uninversion, which wouldn't be possible for multi-valued 
> fields based on the current way this parameter is used.
> Interestingly, I noticed Solr's use of FastVectorHighlighter doesn't honor 
> hl.maxAnalyzedChars as the FVH doesn't have a knob for that.  It does have 
> hl.phraseLimit which is a limit that could be used for a similar purpose, 
> albeit applied differently.
> Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit 
> early from its field value loop if it reaches hl.snippets, and if 
> hl.preserveMulti=true






[jira] [Updated] (SOLR-6692) hl.maxAnalyzedChars should apply cumulatively on a multi-valued field

2015-04-03 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-6692:
---
Attachment: 
SOLR-6692_hl_maxAnalyzedChars_cumulative_multiValued,_and_more.patch

This patch includes several things. I'm sorry if it's doing too many things; I 
will break it down into separate patches and/or JIRA issues if advised to. I 
will certainly point out the separate parts in CHANGES.txt so users know what's 
going on.  Nevertheless the patch is small IMO and the net LOC is practically 0.

* hl.maxAnalyzedChars is budgeted across all values for the field being 
highlighted.  The budget is decremented by each field value's length at each 
iteration, and so progressively smaller limits are passed on through until the 
budget is <= 0, at which point we exit the loop. I added a test for this which 
randomly chooses between a field with term vectors and one without.
* Refactor/extensibility:
** All methods that were private are now protected.  This widens the scope of 
possibilities for subclassing without having to fork this code.
** The no-arg constructor isn't used; I removed it and made the SolrCore field 
final as a clarification.  If anyone ever tried to use the no-arg constructor 
(I have), they would have soon realized that was not an option since an NPE 
would be thrown from init().
** I extracted a method getFieldValues whose sole job is to get the field 
values (Strings) to be highlighted given the provided field name & some other 
parameters. This is a useful extension point so that a subclass can get the 
field values from another field (i.e. the source of a copyField).  Till now, 
people had to use hl.requireFieldMatch=false which had its drawbacks in terms 
of highlight precision.  A side-benefit is that this method is aware of 
hl.maxMultiValuedToMatch and hl.maxAnalyzedChars, which will limit the values 
it returns. This aids the term-vector code path which can now in more 
circumstances see when there is only one value to highlight, and thus forgo 
wrapping the term vector stream with an OffsetWindow filter, which is a big 
penalty to avoid.
* hl.usePhraseHighlighter can now be toggled per-field.
* It includes a nocommit to confirm from SOLR-4656 (Erick) that the intention 
of hl.maxMultiValuedToMatch is to limit _fragments_, not matching values, 
despite the parameter name hinting otherwise.
* I fixed a bug with hl.maxMultiValuedToMatch in which it would decrement its 
counter when in fact the fragment didn't match. This bug would only occur when 
hl.preserveMulti=true.
* I fixed a small bug in ordering the fragments by score.  It used Math.round() 
which will coalesce values close to 0 to appear as the same weighting. Now it 
simply uses Float.compare(a,b).
* note: the code changes below the field value loop, except for the small score 
order bug I just mentioned, are purely code clean-up and don't change behavior. 
The code was more complex due to it thinking a fragment could be null when in 
fact by that point it's impossible.
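The budgeting in the first bullet can be sketched in isolation (illustrative 
names only; the real logic lives in the field-value loop of 
DefaultSolrHighlighter):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: spend one cumulative maxAnalyzedChars budget across all values of a
// multi-valued field, instead of resetting the limit for each value.
class AnalyzeBudget {
    // Returns how many chars of each value would be analyzed under the budget;
    // values past the point of exhaustion are skipped entirely.
    static List<Integer> charsToAnalyze(List<String> values, int maxAnalyzedChars) {
        List<Integer> analyzed = new ArrayList<>();
        int budget = maxAnalyzedChars; // shared across values, not per value
        for (String value : values) {
            if (budget <= 0) {
                break; // budget exhausted: exit the loop
            }
            analyzed.add(Math.min(value.length(), budget));
            budget -= value.length(); // progressively smaller limits
        }
        return analyzed;
    }
}
```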

> hl.maxAnalyzedChars should apply cumulatively on a multi-valued field
> -
>
> Key: SOLR-6692
> URL: https://issues.apache.org/jira/browse/SOLR-6692
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: David Smiley
> Fix For: 5.0
>
> Attachments: 
> SOLR-6692_hl_maxAnalyzedChars_cumulative_multiValued,_and_more.patch
>
>
> in DefaultSolrHighlighter, the hl.maxAnalyzedChars figure is used to 
> constrain how much text is analyzed before the highlighter stops, in the 
> interests of performance.  For a multi-valued field, it effectively treats 
> each value anew, no matter how much text it was previously analyzed for other 
> values for the same field for the current document. The PostingsHighlighter 
> doesn't work this way -- hl.maxAnalyzedChars is effectively the total budget 
> for a field for a document, no matter how many values there might be.  It's 
> not reset for each value.  I think this makes more sense.  When we loop over 
> the values, we should subtract from hl.maxAnalyzedChars the length of the 
> value just checked.  The motivation here is consistency with 
> PostingsHighlighter, and to allow for hl.maxAnalyzedChars to be pushed down 
> to term vector uninversion, which wouldn't be possible for multi-valued 
> fields based on the current way this parameter is used.
> Interestingly, I noticed Solr's use of FastVectorHighlighter doesn't honor 
> hl.maxAnalyzedChars as the FVH doesn't have a knob for that.  It does have 
> hl.phraseLimit which is a limit that could be used for a similar purpose, 
> albeit applied differently.
> Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit 
> early from its field value loop if it reaches hl.snippets, and if 
> hl.preserveMulti=true




[jira] [Commented] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394705#comment-14394705
 ] 

Robert Muir commented on LUCENE-6391:
-

{quote}
The patch looks good. As far as testing is concerned I guess we're already 
covered by TestSpanSearchEquivalence.
{quote}

Yes, I looked at coverage and this test covers all the current approximation 
support we have (SpanNear).
After this patch, we can begin implementing two-phase support for other spans 
pretty easily, I think, and we just need to ensure those queries are in this test.

{quote}
Since the approximation and the scorer are supposed to be views of each other, 
it is usually wrong to eg. cache the current doc id.
{quote}

This is why i made some things final in SpanScorer, and tried to clarify the 
API so it wouldn't be a trap to any subclass like PayloadTermQuery.
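For readers unfamiliar with the pattern, here is a simplified, self-contained 
illustration of the approximation/verification split (not Lucene's actual 
TwoPhaseIterator or Spans API): a cheap first phase enumerates candidate docs, 
an expensive second phase confirms them, and the consumer reads the 
approximation's current doc rather than caching its own copy.

```java
// Simplified two-phase sketch: Approximation is the cheap candidate iterator,
// Verifier is the expensive positional check. The consumer treats the
// approximation as the single source of truth for the current doc id.
class TwoPhase {
    interface Approximation {
        int nextDoc(); // cheap: advance to next candidate, or -1 when exhausted
        int docID();   // current candidate
    }

    interface Verifier {
        boolean matches(int doc); // expensive: confirm the match really exists
    }

    // Advance to the next doc that passes both phases, or -1 if none remain.
    static int nextMatching(Approximation approx, Verifier verifier) {
        while (approx.nextDoc() != -1) {
            if (verifier.matches(approx.docID())) {
                return approx.docID();
            }
        }
        return -1;
    }
}
```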


> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Commented] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394688#comment-14394688
 ] 

Adrien Grand commented on LUCENE-6391:
--

bq. Since SpanScorer is a "bridge" from Spans to Scorer, most things except 
scoring should be final and just go to the spans. 

Yes... it is also important for approximations. Since the approximation and the 
scorer are supposed to be views of each other, it is usually wrong to eg. cache 
the current doc id.

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Commented] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394685#comment-14394685
 ] 

Adrien Grand commented on LUCENE-6391:
--

The patch looks good. As far as testing is concerned I guess we're already 
covered by TestSpanSearchEquivalence.

Thanks!

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Commented] (LUCENE-6083) Span containing/contained queries

2015-04-03 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394672#comment-14394672
 ] 

Paul Elschot commented on LUCENE-6083:
--

 bq. not really any logic is left in NearSpans after the patch. Maybe we should 
remove the abstraction and just store slop/query in Ordered/UnOrdered?

If that improves performance or might reduce future maintenance, then yes.
Otherwise, the NearSpans here nicely shows what it is: a conjunction of spans 
for a query with an allowed slop.


> Span containing/contained queries
> -
>
> Key: LUCENE-6083
> URL: https://issues.apache.org/jira/browse/LUCENE-6083
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Paul Elschot
>Priority: Minor
> Attachments: LUCENE-6083.patch, LUCENE-6083.patch, LUCENE-6083.patch
>
>
> SpanContainingQuery reducing a spans to where it is containing another spans.
> SpanContainedQuery reducing a spans to where it is contained in another spans.






[jira] [Updated] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-6391:

Attachment: LUCENE-6391.patch

See attached patch. Since SpanScorer is a "bridge" from Spans to Scorer, most 
things except scoring should be final and just go to the spans. 

Seems ok in benchmarks, though I think with more work in the future we can do 
better:
{noformat}
Report after iter 5:
Chart saved to out.png... (wd: /home/rmuir/workspace/util/src/python)
Task            QPS trunk  StdDev  QPS patch  StdDev   Pct diff
SpanNearF100.0      22.64  (1.4%)      22.57  (1.7%)      -0.3% ( -3% -  2%)
SpanNearF10.0       34.12  (1.4%)      34.13  (1.4%)       0.0% ( -2% -  2%)
SpanNearF0.1        40.71  (0.2%)      42.51  (0.3%)       4.4% (  3% -  4%)
SpanNearF0.5        34.96  (0.4%)      39.43  (0.3%)      12.8% ( 12% - 13%)
SpanNearF1.0        32.66  (0.6%)      37.98  (0.4%)      16.3% ( 15% - 17%)
{noformat}

> Give SpanScorer two-phase iterator support.
> ---
>
> Key: LUCENE-6391
> URL: https://issues.apache.org/jira/browse/LUCENE-6391
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-6391.patch
>
>
> Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
> This means e.g. a spans in a booleanquery, or a spans with a filter can be 
> faster.
> In order to do this, we have to clean up this class a little bit:
> * forward most methods directly to the underlying spans.
> * ensure positions are only iterated at most once.






[jira] [Resolved] (SOLR-6865) Upgrade HttpClient to 4.4.1

2015-04-03 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey resolved SOLR-6865.

Resolution: Fixed
  Assignee: Shawn Heisey

Tests and precommit passed on trunk before that change was committed.

Precommit passed on 5x before that change was committed.  Tests passed 
successfully shortly after the commit.


> Upgrade HttpClient to 4.4.1
> ---
>
> Key: SOLR-6865
> URL: https://issues.apache.org/jira/browse/SOLR-6865
> Project: Solr
>  Issue Type: Task
>Affects Versions: 5.0
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Fix For: Trunk, 5.2
>
> Attachments: SOLR-6865.patch, SOLR-6865.patch
>
>
> HttpClient 4.4 has been released.  5.0 seems like a good time to upgrade.






[jira] [Created] (LUCENE-6391) Give SpanScorer two-phase iterator support.

2015-04-03 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-6391:
---

 Summary: Give SpanScorer two-phase iterator support.
 Key: LUCENE-6391
 URL: https://issues.apache.org/jira/browse/LUCENE-6391
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


Fix SpanScorer to use any two-phase iterator support of the underlying Spans. 
This means e.g. a spans in a booleanquery, or a spans with a filter can be 
faster.

In order to do this, we have to clean up this class a little bit:
* forward most methods directly to the underlying spans.
* ensure positions are only iterated at most once.






[jira] [Commented] (SOLR-4656) Add hl.maxMultiValuedToExamine to limit the number of multiValued entries examined while highlighting

2015-04-03 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394644#comment-14394644
 ] 

David Smiley commented on SOLR-4656:


While working on SOLR-6692 I noticed this again, and I'm wondering two things:
* Are the semantics of maxMultiValuedToMatch intentional in that it counts 
snippets (i.e. fragments) as opposed to values?  It's unfortunate that the 
parameter name doesn't make this clear, since it suggests the parameter counts 
values (maxMultiValuedToExamine counts values).  There's a difference when 
hl.snippets isn't 1.
* I don't believe mvToMatch should be decremented when 
bestTextFragment.getScore() is <= 0 since _there actually was no match_.  This 
can happen often when hl.preserveMulti=true.  I think this is a bug.

I can fix but I'd like your thoughts, [~erickerickson].
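
The second point can be sketched generically (hypothetical names, not Solr's actual highlighter code; this shows the *proposed* behavior): the match budget is consumed only by fragments that actually matched, so zero-score values passed through by hl.preserveMulti=true no longer eat into it.

```java
import java.util.Arrays;
import java.util.List;

public class MatchBudgetDemo {
    // Counts how many values get highlighted before the budget runs out,
    // consuming the budget only for fragments that actually matched (score > 0).
    static int highlight(List<Double> fragmentScores, int maxToMatch) {
        int matched = 0;
        for (double score : fragmentScores) {
            if (matched >= maxToMatch) break; // budget exhausted
            if (score > 0) {
                matched++; // proposed: only real matches consume the budget
            }
            // score <= 0: with hl.preserveMulti=true the value is kept verbatim;
            // decrementing the budget here (the current behavior) is the suspected bug.
        }
        return matched;
    }

    public static void main(String[] args) {
        // Non-matching values interleaved with two matches, budget of 2:
        // both matches are still found because misses no longer consume budget.
        System.out.println(highlight(Arrays.asList(0.0, 1.5, 0.0, 0.0, 2.0), 2)); // prints 2
    }
}
```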

> Add hl.maxMultiValuedToExamine to limit the number of multiValued entries 
> examined while highlighting
> -
>
> Key: SOLR-4656
> URL: https://issues.apache.org/jira/browse/SOLR-4656
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Affects Versions: 4.3, Trunk
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 4.3, Trunk
>
> Attachments: SOLR-4656-4x.patch, SOLR-4656-4x.patch, 
> SOLR-4656-trunk.patch, SOLR-4656.patch
>
>
> I'm looking at an admittedly pathological case of many, many entries in a 
> multiValued field, and trying to implement a way to limit the number 
> examined, analogous to maxAnalyzedChars, see the patch.
> Along the way, I noticed that we do what looks like unnecessary copying of 
> the fields to be examined. We call Document.getFields, which copies all of 
> the fields and values to the returned array. Then we copy all of those to 
> another array, converting them to Strings. Then we actually examine them. a> 
> this doesn't seem very efficient and b> reduces the benefit from limiting the 
> number of mv values examined.
> So the attached does two things:
> 1> attempts to fix this
> 2> implements hl.maxMultiValuedToExamine
> I'd _really_ love it if someone who knows the highlighting code takes a peek 
> at the fix to see if I've messed things up, the changes are actually pretty 
> minimal.






[jira] [Commented] (SOLR-7338) A reloaded core will never register itself as active after a ZK session expiration

2015-04-03 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394640#comment-14394640
 ] 

Timothy Potter commented on SOLR-7338:
--

Hi [~markrmil...@gmail.com], do you think anything else needs to be done on 
this one? I'd actually like to get this into the 5.1 release - patch looks good 
to me. If you're comfortable with the unit test I posted, I can combine them 
and commit. Thanks.

> A reloaded core will never register itself as active after a ZK session 
> expiration
> --
>
> Key: SOLR-7338
> URL: https://issues.apache.org/jira/browse/SOLR-7338
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Timothy Potter
>Assignee: Mark Miller
> Attachments: SOLR-7338.patch, SOLR-7338_test.patch
>
>
> If a collection gets reloaded, then a core's isReloaded flag is always true. 
> If a core experiences a ZK session expiration after a reload, then it won't 
> ever be able to set itself to active because of the check in 
> {{ZkController#register}}:
> {code}
> UpdateLog ulog = core.getUpdateHandler().getUpdateLog();
> if (!core.isReloaded() && ulog != null) {
>   // disable recovery in case shard is in construction state (for 
> shard splits)
>   Slice slice = getClusterState().getSlice(collection, shardId);
>   if (slice.getState() != Slice.State.CONSTRUCTION || !isLeader) {
> Future recoveryFuture = 
> core.getUpdateHandler().getUpdateLog().recoverFromLog();
> if (recoveryFuture != null) {
>   log.info("Replaying tlog for " + ourUrl + " during startup... 
> NOTE: This can take a while.");
>   recoveryFuture.get(); // NOTE: this could potentially block for
>   // minutes or more!
>   // TODO: public as recovering in the mean time?
>   // TODO: in the future we could do peersync in parallel with 
> recoverFromLog
> } else {
>   log.info("No LogReplay needed for core=" + core.getName() + " 
> baseURL=" + baseUrl);
> }
>   }
>   boolean didRecovery = checkRecovery(coreName, desc, 
> recoverReloadedCores, isLeader, cloudDesc,
>   collection, coreZkNodeName, shardId, leaderProps, core, cc);
>   if (!didRecovery) {
> publish(desc, ZkStateReader.ACTIVE);
>   }
> }
> {code}
> I can easily simulate this on trunk by doing:
> {code}
> bin/solr -c -z localhost:2181
> bin/solr create -c foo
> bin/post -c foo example/exampledocs/*.xml
> curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=foo";
> kill -STOP  && sleep  && kill -CONT 
> {code}
> Where  is the process ID of the Solr node. Here are the logs after the 
> CONT command. As you can see below, the core never gets to setting itself as 
> active again. I think the bug is that the isReloaded flag needs to get set 
> back to false once the reload is successful, but I don't understand what this 
> flag is needed for anyway???
> {code}
> INFO  - 2015-04-01 17:28:50.962; 
> org.apache.solr.common.cloud.ConnectionManager; Watcher 
> org.apache.solr.common.cloud.ConnectionManager@5519dba0 
> name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent 
> state:Disconnected type:None path:null path:null type:None
> INFO  - 2015-04-01 17:28:50.963; 
> org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
> INFO  - 2015-04-01 17:28:51.107; 
> org.apache.solr.common.cloud.ConnectionManager; Watcher 
> org.apache.solr.common.cloud.ConnectionManager@5519dba0 
> name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent 
> state:Expired type:None path:null path:null type:None
> INFO  - 2015-04-01 17:28:51.107; 
> org.apache.solr.common.cloud.ConnectionManager; Our previous ZooKeeper 
> session was expired. Attempting to reconnect to recover relationship with 
> ZooKeeper...
> INFO  - 2015-04-01 17:28:51.108; org.apache.solr.cloud.Overseer; Overseer 
> (id=93579450724974592-192.168.1.2:8983_solr-n_00) closing
> INFO  - 2015-04-01 17:28:51.108; 
> org.apache.solr.cloud.ZkController$WatcherImpl; A node got unwatched for 
> /configs/foo
> INFO  - 2015-04-01 17:28:51.108; 
> org.apache.solr.cloud.Overseer$ClusterStateUpdater; Overseer Loop exiting : 
> 192.168.1.2:8983_solr
> INFO  - 2015-04-01 17:28:51.109; 
> org.apache.solr.cloud.OverseerCollectionProcessor; According to ZK I 
> (id=93579450724974592-192.168.1.2:8983_solr-n_00) am no longer a 
> leader.
> INFO  - 2015-04-01 17:28:51.108; org.apache.solr.cloud.ZkController$4; 
> Running listeners for /configs/foo
> INFO  - 2015-04-01 17:28:51.109; 
> org.apache.solr.common.cloud.DefaultConnectionStrategy; Connection expired - 
> starting a

Test infrastructure: is a running counter of tests completed possible?

2015-04-03 Thread Shawn Heisey
Would it be possible to have a running total of the number of test
suites completed?  This is how I would envision the test output with
this addition:

   [junit4] Suite:
org.apache.solr.rest.schema.TestManagedSchemaFieldTypeResource
   [junit4] Suite 278/478 completed on J1 in 1.28s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.spelling.suggest.TestAnalyzedSuggestions
   [junit4] Suite 279/478 completed on J0 in 2.18s, 2 tests
   [junit4]
   [junit4] Suite:
org.apache.solr.handler.component.DistributedExpandComponentTest
   [junit4] Suite 280/478 completed on J0 in 4.28s, 1 test

Thanks,
Shawn




[JENKINS] Lucene-Solr-NightlyTests-5.x - Build # 806 - Still Failing

2015-04-03 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-5.x/806/

1 tests failed.
REGRESSION:  
org.apache.lucene.search.suggest.document.SuggestFieldTest.testDupSuggestFieldValues

Error Message:
MockDirectoryWrapper: cannot close: there are still open files: 
{_ak_completion_0.lkp=1, _aj_completion_0.lkp=1, _ak_completion_0.tim=1, 
_ak_completion_0.doc=1, _ak.nvd=1, _aj_completion_0.pay=1, 
_aj_completion_0.pos=1, _ak_completion_0.pos=1, _ak.fdt=1, _aj.nvd=1, 
_aj_completion_0.tim=1, _aj_completion_0.doc=1, _ak_completion_0.pay=1, 
_aj.fdt=1}

Stack Trace:
java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 
open files: {_ak_completion_0.lkp=1, _aj_completion_0.lkp=1, 
_ak_completion_0.tim=1, _ak_completion_0.doc=1, _ak.nvd=1, 
_aj_completion_0.pay=1, _aj_completion_0.pos=1, _ak_completion_0.pos=1, 
_ak.fdt=1, _aj.nvd=1, _aj_completion_0.tim=1, _aj_completion_0.doc=1, 
_ak_completion_0.pay=1, _aj.fdt=1}
at 
org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:747)
at 
org.apache.lucene.search.suggest.document.SuggestFieldTest.after(SuggestFieldTest.java:80)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:894)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: unclosed IndexInput: _aj_completion_0.pos
at 
org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:622)
at 
org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:666)
at 
org.apache.lucene.codecs.lucene50.Lucene50Posting

[jira] [Updated] (SOLR-7332) Seed version buckets with max version from index

2015-04-03 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-7332:
-
Attachment: SOLR-7332.patch

Ignore that previous patch - it had a deadlock when doing core reloads :-(

I think this updated patch is close to commit for trunk. I've added a 
distributed test that uses multiple threads to send docs, reload the 
collection, and commit data - beast passes 20 of 20. However, there are 2 areas 
that need review:

1) How I'm calling {{UpdateLog.onFirstSearcher}} in SolrCore. I was getting a 
multiple on-deck searcher warning during a core reload because the getSearcher 
method gets called twice during a reload, and if the max version lookup took a 
little time, the warning would occur. So I'm calling this from the main thread 
instead of the background executor. This will of course block the reload until 
it finishes, but given the importance of getting the version buckets seeded 
correctly, I think that's OK. Let me know if there's a better way.

2) Originally, I was synchronizing the seedBucketVersionHighestFromIndex method 
in UpdateLog, but that led to deadlock when doing reloads because updates 
continue to flow in while the reload occurs (and DistributedUpdateProcessor 
versionAdd gets the lock on versionBuckets and calls synchronized methods on 
UpdateLog). So I've switched to using versionInfo.blockUpdates while looking up 
the max version from the index, see {{UpdateLog.onFirstSearcher}}. My thinking 
here is that we actually want to block updates briefly after a reload while 
getting the max from the index, so that we don't end up setting the version 
too low.

Also, minor, but I removed the SortedNumericDocValues stuff from the 
{{VersionInfo#seedBucketVersionHighestFromIndex}} method in the previous patch, 
since Solr doesn't support that yet and it was a misunderstanding on my part of 
how that field type works. So now the lookup of the max either uses terms if 
the version field is indexed, or a range query if not.
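
The blockUpdates approach in point 2 can be sketched with a read-write lock (a generic sketch under made-up names, not Solr's actual VersionInfo API): ordinary updates share the read lock and proceed concurrently, while seeding briefly takes the write lock so no in-flight update can be interleaved with the index lookup.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class VersionSeedDemo {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Atomic so that concurrent read-lock holders can race safely with each other.
    private final AtomicLong highestVersion = new AtomicLong();

    // Ordinary updates share the read lock, so many proceed concurrently.
    void applyUpdate(long version) {
        lock.readLock().lock();
        try {
            highestVersion.accumulateAndGet(version, Math::max);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Seeding takes the write lock, briefly excluding all updates so the max
    // read from the index cannot be interleaved with an in-flight update.
    void seedFromIndex(long maxFromIndex) {
        lock.writeLock().lock();
        try {
            highestVersion.accumulateAndGet(maxFromIndex, Math::max);
        } finally {
            lock.writeLock().unlock();
        }
    }

    long highest() { return highestVersion.get(); }

    public static void main(String[] args) {
        VersionSeedDemo demo = new VersionSeedDemo();
        demo.applyUpdate(5);
        demo.seedFromIndex(10);             // blocks updates for the duration
        System.out.println(demo.highest()); // prints 10
    }
}
```

The trade-off matches the description: seeding blocks updates only for the duration of one max lookup, which seems acceptable given how important a correctly seeded high-water mark is.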

> Seed version buckets with max version from index
> 
>
> Key: SOLR-7332
> URL: https://issues.apache.org/jira/browse/SOLR-7332
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Timothy Potter
>Assignee: Timothy Potter
> Attachments: SOLR-7332.patch, SOLR-7332.patch, SOLR-7332.patch
>
>
> See full discussion with Yonik and I in SOLR-6816.
> The TL;DR of that discussion is that we should initialize highest for each 
> version bucket to the MAX value of the {{__version__}} field in the index as 
> early as possible, such as after the first soft- or hard- commit. This will 
> ensure that bulk adds where the docs don't exist avoid an unnecessary lookup 
> for a non-existent document in the index.






[jira] [Commented] (SOLR-7336) Add State enum to Replica

2015-04-03 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394580#comment-14394580
 ] 

Mark Miller commented on SOLR-7336:
---

Looks good to me.

> Add State enum to Replica
> -
>
> Key: SOLR-7336
> URL: https://issues.apache.org/jira/browse/SOLR-7336
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrJ
>Reporter: Shai Erera
>Assignee: Shai Erera
> Attachments: SOLR-7336.patch, SOLR-7336.patch, SOLR-7336.patch
>
>
> Following SOLR-7325, this issue adds a State enum to Replica.






[jira] [Commented] (SOLR-6865) Upgrade HttpClient to 4.4.1

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394578#comment-14394578
 ] 

ASF subversion and git services commented on SOLR-6865:
---

Commit 1671092 from [~elyograg] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671092 ]

SOLR-6865: Upgrade HttpClient/Core/Mime to 4.4.1. (merge trunk r1671085)

> Upgrade HttpClient to 4.4.1
> ---
>
> Key: SOLR-6865
> URL: https://issues.apache.org/jira/browse/SOLR-6865
> Project: Solr
>  Issue Type: Task
>Affects Versions: 5.0
>Reporter: Shawn Heisey
>Priority: Minor
> Fix For: Trunk, 5.2
>
> Attachments: SOLR-6865.patch, SOLR-6865.patch
>
>
> HttpClient 4.4 has been released.  5.0 seems like a good time to upgrade.






[jira] [Commented] (SOLR-7347) clock skew can cause data loss

2015-04-03 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394574#comment-14394574
 ] 

Yonik Seeley commented on SOLR-7347:


The work in SOLR-7332 may be useful to fix this. 

> clock skew can cause data loss
> --
>
> Key: SOLR-7347
> URL: https://issues.apache.org/jira/browse/SOLR-7347
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Yonik Seeley
>
> The high bits of versions are created using the system clock.
> System clock skew on the order of magnitude of the time it takes for one 
> leader to receive its last update to the time it takes another replica to 
> become a leader can cause data loss for any updates to the same document 
> until the new leader's clock catches up with the old leader's clock.
> 1) replica1 is the leader and indexes document A, choosing version X (and 
> forwards to replicas)
> 2) replica1 goes down
> 3) replica2 becomes the new leader
> 4) replica2 indexes an update for document A, and chooses version Y (which is 
> less than X due to clock skew) and forwards to replica3
> 5) replica3 checks for reordered updates, finds version X and thus drops 
> version Y
> This should be rare... you need a big enough clock skew and updates to the 
> same document with different leaders within that time window.  We should 
> still fix this of course.






[jira] [Created] (SOLR-7347) clock skew can cause data loss

2015-04-03 Thread Yonik Seeley (JIRA)
Yonik Seeley created SOLR-7347:
--

 Summary: clock skew can cause data loss
 Key: SOLR-7347
 URL: https://issues.apache.org/jira/browse/SOLR-7347
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Yonik Seeley


The high bits of versions are created using the system clock.
System clock skew on the order of magnitude of the time it takes for one leader 
to receive its last update to the time it takes another replica to become a 
leader can cause data loss for any updates to the same document until the new 
leader's clock catches up with the old leader's clock.

1) replica1 is the leader and indexes document A, choosing version X (and 
forwards to replicas)
2) replica1 goes down
3) replica2 becomes the new leader
4) replica2 indexes an update for document A, and chooses version Y (which is 
less than X due to clock skew) and forwards to replica3
5) replica3 checks for reordered updates, finds version X and thus drops 
version Y

This should be rare... you need a big enough clock skew and updates to the same 
document with different leaders within that time window.  We should still fix 
this of course.
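
The failure mode is easy to simulate. A minimal sketch (only the clock-derived high bits of the version are modeled; the helper names are made up, and Solr's real scheme also packs counter bits into the low end):

```java
public class ClockSkewDemo {
    // Version whose high bits come from wall-clock millis (low counter bits elided).
    static long version(long clockMillis) { return clockMillis << 20; }

    // The reordered-update check from step 5: a replica drops any update whose
    // version is not newer than the one it has already seen for the document.
    static boolean accepted(long lastSeen, long incoming) { return incoming > lastSeen; }

    public static void main(String[] args) {
        long leader1Clock = 1_000_000L;               // replica1's (fast) clock
        long leader2Clock = leader1Clock - 5_000;     // replica2's clock, 5s behind

        long versionX = version(leader1Clock);        // step 1: replica1 indexes doc A
        long versionY = version(leader2Clock);        // step 4: new leader updates doc A

        // step 5: replica3 already saw X, so the newer-in-real-time update Y
        // looks reordered and is dropped -- silent data loss.
        System.out.println(accepted(versionX, versionY)); // prints false
    }
}
```

Until replica2's clock passes replica1's last timestamp, every version it assigns compares as older, so all updates to that document in the skew window are lost.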








[jira] [Commented] (SOLR-6865) Upgrade HttpClient to 4.4.1

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394541#comment-14394541
 ] 

ASF subversion and git services commented on SOLR-6865:
---

Commit 1671085 from [~elyograg] in branch 'dev/trunk'
[ https://svn.apache.org/r1671085 ]

SOLR-6865: Upgrade HttpClient/Core/Mime to 4.4.1.

> Upgrade HttpClient to 4.4.1
> ---
>
> Key: SOLR-6865
> URL: https://issues.apache.org/jira/browse/SOLR-6865
> Project: Solr
>  Issue Type: Task
>Affects Versions: 5.0
>Reporter: Shawn Heisey
>Priority: Minor
> Fix For: Trunk, 5.2
>
> Attachments: SOLR-6865.patch, SOLR-6865.patch
>
>
> HttpClient 4.4 has been released.  5.0 seems like a good time to upgrade.






[jira] [Resolved] (LUCENE-6388) Optimize SpanNearQuery

2015-04-03 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-6388.
-
   Resolution: Fixed
Fix Version/s: 5.x
   Trunk

For now the check is implemented via Terms.getPayloads() until LUCENE-6390 is 
fixed.

> Optimize SpanNearQuery
> --
>
> Key: LUCENE-6388
> URL: https://issues.apache.org/jira/browse/LUCENE-6388
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Fix For: Trunk, 5.x
>
> Attachments: LUCENE-6388.patch
>
>
> After the big spans overhaul in LUCENE-6308, we can speed up SpanNearQuery a 
> little more:
> * SpanNearQuery defaults to collectPayloads=true, but this requires a slower 
> implementation, for an uncommon case. Use the faster no-payloads impl if the 
> field doesn't actually have any payloads.
> * Use a simple array of Spans rather than List in NearSpans classes. This is 
> iterated over often in the logic.






[jira] [Commented] (LUCENE-6388) Optimize SpanNearQuery

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394528#comment-14394528
 ] 

ASF subversion and git services commented on LUCENE-6388:
-

Commit 1671081 from [~rcmuir] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1671081 ]

LUCENE-6388: Optimize SpanNearQuery

> Optimize SpanNearQuery
> --
>
> Key: LUCENE-6388
> URL: https://issues.apache.org/jira/browse/LUCENE-6388
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-6388.patch
>
>
> After the big spans overhaul in LUCENE-6308, we can speed up SpanNearQuery a 
> little more:
> * SpanNearQuery defaults to collectPayloads=true, but this requires a slower 
> implementation, for an uncommon case. Use the faster no-payloads impl if the 
> field doesn't actually have any payloads.
> * Use a simple array of Spans rather than List in NearSpans classes. This is 
> iterated over often in the logic.






[jira] [Commented] (LUCENE-6388) Optimize SpanNearQuery

2015-04-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394494#comment-14394494
 ] 

ASF subversion and git services commented on LUCENE-6388:
-

Commit 1671078 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1671078 ]

LUCENE-6388: Optimize SpanNearQuery

> Optimize SpanNearQuery
> --
>
> Key: LUCENE-6388
> URL: https://issues.apache.org/jira/browse/LUCENE-6388
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-6388.patch
>
>
> After the big spans overhaul in LUCENE-6308, we can speed up SpanNearQuery a 
> little more:
> * SpanNearQuery defaults to collectPayloads=true, but this requires a slower 
> implementation, for an uncommon case. Use the faster no-payloads impl if the 
> field doesn't actually have any payloads.
> * Use a simple array of Spans rather than List in NearSpans classes. This is 
> iterated over often in the logic.





