[JENKINS] Lucene-Solr-Tests-trunk-Java8 - Build # 555 - Failure

2015-10-29 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java8/555/

1 tests failed.
FAILED:  org.apache.solr.cloud.ZkSolrClientTest.testMultipleWatchesAsync

Error Message:


Stack Trace:
java.lang.AssertionError
at __randomizedtesting.SeedInfo.seed([3C7F898582C821:688AAC2B1066C9BF]:0)
at org.junit.Assert.fail(Assert.java:92)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertTrue(Assert.java:54)
at org.apache.solr.cloud.ZkSolrClientTest.testMultipleWatchesAsync(ZkSolrClientTest.java:266)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1660)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:866)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:902)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:875)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:777)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:811)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:822)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 9831 lines...]
   [junit4] Suite: org.apache.solr.cloud.ZkSolrClientTest
   [junit4]   2> Creating dataDir: /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-trunk-Java8/solr/build/solr-core/test/J1/temp/solr.cloud.ZkSolrClientTest_3C7F898582C821-001/init-core-data-001
   [junit4]   2> 473955 INFO  (SUITE-ZkSolrClientTest-se

[jira] [Commented] (LUCENE-6865) BooleanQuery2ModifierNodeProcessor breaks the query node hierarchy

2015-10-29 Thread Trejkaz (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981682#comment-14981682 ]

Trejkaz commented on LUCENE-6865:
-

The underlying issue here seems like it might be LUCENE-6506 ... but to get 
that fix we have to update to 5.3. :(


> BooleanQuery2ModifierNodeProcessor breaks the query node hierarchy
> --
>
> Key: LUCENE-6865
> URL: https://issues.apache.org/jira/browse/LUCENE-6865
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Trejkaz
>
> We discovered that one of our own implementations of QueryNodeProcessor was 
> seeing node.getParent() returning null for nodes other than the root of the 
> query tree.
> I put a diagnostic processor around every processor which runs and found that 
> BooleanQuery2ModifierNodeProcessor (and possibly others, although it isn't 
> clear) are mysteriously setting some of the node references to null.
> Example query tree before:
> {noformat}
> GroupQueryNode, parent = null
>   WithinQueryNode, parent = GroupQueryNode
> QuotedFieldQueryNode, parent = WithinQueryNode
> GroupQueryNode, parent = WithinQueryNode
>   AndQueryNode, parent = GroupQueryNode
> GroupQueryNode, parent = AndQueryNode
>   OrQueryNode, parent = GroupQueryNode
> QuotedFieldQueryNode, parent = OrQueryNode
> QuotedFieldQueryNode, parent = OrQueryNode
> GroupQueryNode, parent = AndQueryNode
>   OrQueryNode, parent = GroupQueryNode
> QuotedFieldQueryNode, parent = OrQueryNode
> QuotedFieldQueryNode, parent = OrQueryNode
> {noformat}
> And after BooleanQuery2ModifierNodeProcessor.process():
> {noformat}
> GroupQueryNode, parent = null
>   WithinQueryNode, parent = GroupQueryNode
> QuotedFieldQueryNode, parent = WithinQueryNode
> GroupQueryNode, parent = WithinQueryNode
>   AndQueryNode, parent = GroupQueryNode
> BooleanModifierNode, parent = AndQueryNode
>   GroupQueryNode, parent = null
> OrQueryNode, parent = GroupQueryNode
>   QuotedFieldQueryNode, parent = OrQueryNode
>   QuotedFieldQueryNode, parent = OrQueryNode
> BooleanModifierNode, parent = AndQueryNode
>   GroupQueryNode, parent = null
> OrQueryNode, parent = GroupQueryNode
>   QuotedFieldQueryNode, parent = OrQueryNode
>   QuotedFieldQueryNode, parent = OrQueryNode
> {noformat}
> Looking at QueryNodeImpl, there is a lot of fiddly logic in there. Removing 
> children can trigger setting the parent to null, but setting the parent can 
> also trigger the child removing itself, so it's near impossible to figure out 
> why this could be happening, but I'm closing in on it at least. My initial 
> suspicion is that cloneTree() is responsible, because ironically the number 
> of failures of this sort _increase_ if I try to use cloneTree to defend 
> against mutability bugs.
> The fix I have come up with is to clone the whole API but making QueryNode 
> immutable. This removes the ability for processors to mess with nodes that 
> don't belong to them, but also obviates the need for a parent reference in 
> the first place, which I think is the entire source of the problem - keeping 
> the parent and child in sync correctly is obviously going to be hard, and 
> indeed we find that there is at least one bug of this sort lurking in there.
> But even if we rewrite it, I figured I would report the issue so that maybe 
> it can be fixed for others.
> Code to use for diagnostics:
> {code}
> import java.util.List;
> import org.apache.lucene.queryparser.flexible.core.QueryNodeException;
> import org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler;
> import org.apache.lucene.queryparser.flexible.core.nodes.QueryNode;
> import org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessor;
> public class DiagnosticQueryNodeProcessor implements QueryNodeProcessor
> {
>     private final QueryNodeProcessor delegate;
>     public DiagnosticQueryNodeProcessor(QueryNodeProcessor delegate)
>     {
>         this.delegate = delegate;
>     }
>     @Override
>     public QueryConfigHandler getQueryConfigHandler()
>     {
>         return delegate.getQueryConfigHandler();
>     }
>     @Override
>     public void setQueryConfigHandler(QueryConfigHandler queryConfigHandler)
>     {
>         delegate.setQueryConfigHandler(queryConfigHandler);
>     }
>     @Override
>     public QueryNode process(QueryNode queryNode) throws QueryNodeException
>     {
>         System.out.println("Before " + delegate.getClass().getSimpleName() + ".process():");
>         dumpTree(queryNode);
>         queryNode = delegate.process(queryNode);
>         System.out.println("After " + delegate.getClass()

[JENKINS] Lucene-Solr-NightlyTests-5.x - Build # 1001 - Still Failing

2015-10-29 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-5.x/1001/

4 tests failed.
FAILED:  org.apache.solr.cloud.CollectionsAPIDistributedZkTest.test

Error Message:
KeeperErrorCode = Session expired for /clusterstate.json

Stack Trace:
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /clusterstate.json
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:342)
at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:342)
at org.apache.solr.common.cloud.ZkStateReader.refreshLegacyClusterState(ZkStateReader.java:477)
at org.apache.solr.common.cloud.ZkStateReader.updateClusterState(ZkStateReader.java:257)
at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForCollectionToDisappear(AbstractDistribZkTestBase.java:196)
at org.apache.solr.cloud.AbstractFullDistribZkTestBase.assertCollectionNotExists(AbstractFullDistribZkTestBase.java:1772)
at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.deletePartiallyCreatedCollection(CollectionsAPIDistributedZkTest.java:234)
at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.test(CollectionsAPIDistributedZkTest.java:171)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1660)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:866)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:902)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:916)
at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:963)
at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:938)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:875)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:777)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:811)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:822)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMe

[jira] [Comment Edited] (SOLR-7525) Add ComplementStream to the Streaming API and Streaming Expressions

2015-10-29 Thread Dennis Gove (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981511#comment-14981511 ]

Dennis Gove edited comment on SOLR-7525 at 10/29/15 10:59 PM:
--

Includes both ComplementStream and IntersectStream. All tests pass.

Depends on SOLR-8198.


was (Author: dpgove):
Includes both ComplementStream and IntersectStream. All tests pass.

> Add ComplementStream to the Streaming API and Streaming Expressions
> ---
>
> Key: SOLR-7525
> URL: https://issues.apache.org/jira/browse/SOLR-7525
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrJ
>Reporter: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-7525.patch
>
>
> This ticket adds a ComplementStream to the Streaming API and Streaming 
> Expression language.
> The ComplementStream will wrap two TupleStreams (StreamA, StreamB) and emit 
> Tuples from StreamA that are not in StreamB.
> Streaming API Syntax:
> {code}
> ComplementStream cstream = new ComplementStream(streamA, streamB, comp);
> {code}
> Streaming Expression syntax:
> {code}
> complement(search(...), search(...), on(...))
> {code}
> Internal implementation will rely on the ReducerStream. The ComplementStream 
> can be parallelized using the ParallelStream.
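(Editorial aside: the complement semantics described above, emitting tuples from StreamA whose key is absent from StreamB, can be sketched as follows. This is an illustrative, in-memory sketch only, not the Solr ReducerStream-based implementation; the class and method names are hypothetical.)

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ComplementSketch {
    // Emit ids from streamA whose key does not appear in streamB.
    // The real ComplementStream streams over sorted TupleStreams instead
    // of materializing a HashSet, but the result is the same.
    static List<String> complement(List<String> streamA, List<String> streamB) {
        Set<String> bKeys = new HashSet<>(streamB);
        return streamA.stream()
                      .filter(id -> !bKeys.contains(id))
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("doc1", "doc2", "doc3");
        List<String> b = Arrays.asList("doc2");
        System.out.println(complement(a, b)); // [doc1, doc3]
    }
}
```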



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7525) Add ComplementStream to the Streaming API and Streaming Expressions

2015-10-29 Thread Dennis Gove (JIRA)

 [ https://issues.apache.org/jira/browse/SOLR-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Gove updated SOLR-7525:
--
Attachment: SOLR-7525.patch

Includes both ComplementStream and IntersectStream. All tests pass.

> Add ComplementStream to the Streaming API and Streaming Expressions
> ---
>
> Key: SOLR-7525
> URL: https://issues.apache.org/jira/browse/SOLR-7525
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrJ
>Reporter: Joel Bernstein
>Priority: Minor
> Attachments: SOLR-7525.patch
>
>
> This ticket adds a ComplementStream to the Streaming API and Streaming 
> Expression language.
> The ComplementStream will wrap two TupleStreams (StreamA, StreamB) and emit 
> Tuples from StreamA that are not in StreamB.
> Streaming API Syntax:
> {code}
> ComplementStream cstream = new ComplementStream(streamA, streamB, comp);
> {code}
> Streaming Expression syntax:
> {code}
> complement(search(...), search(...), on(...))
> {code}
> Internal implementation will rely on the ReducerStream. The ComplementStream 
> can be parallelized using the ParallelStream.






[jira] [Commented] (SOLR-7928) Improve CheckIndex to work against HdfsDirectory

2015-10-29 Thread Uwe Schindler (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981407#comment-14981407 ]

Uwe Schindler commented on SOLR-7928:
-

bq. TestCheckIndex isn't visible from the Solr test classes unless we start 
publishing Lucene test artifacts, which I don't think we want to do.

You could make an abstract TestCheckIndexBase in Lucene's test framework.

> Improve CheckIndex to work against HdfsDirectory
> 
>
> Key: SOLR-7928
> URL: https://issues.apache.org/jira/browse/SOLR-7928
> Project: Solr
>  Issue Type: New Feature
>  Components: hdfs
>Reporter: Mike Drob
>Assignee: Gregory Chanan
> Fix For: 5.4, Trunk
>
> Attachments: SOLR-7928.patch, SOLR-7928.patch, SOLR-7928.patch
>
>
> CheckIndex is very useful for testing an index for corruption. However, it 
> can only work with an index on an FSDirectory, meaning that if you need to 
> check an Hdfs Index, then you have to download it to local disk (which can be 
> very large).
> We should have a way to natively check index on hdfs for corruption.






[jira] [Updated] (SOLR-7928) Improve CheckIndex to work against HdfsDirectory

2015-10-29 Thread Mike Drob (JIRA)

 [ https://issues.apache.org/jira/browse/SOLR-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Drob updated SOLR-7928:

Attachment: SOLR-7928.patch

New patch that addresses a few of Greg's concerns.

> Improve CheckIndex to work against HdfsDirectory
> 
>
> Key: SOLR-7928
> URL: https://issues.apache.org/jira/browse/SOLR-7928
> Project: Solr
>  Issue Type: New Feature
>  Components: hdfs
>Reporter: Mike Drob
>Assignee: Gregory Chanan
> Fix For: 5.4, Trunk
>
> Attachments: SOLR-7928.patch, SOLR-7928.patch, SOLR-7928.patch
>
>
> CheckIndex is very useful for testing an index for corruption. However, it 
> can only work with an index on an FSDirectory, meaning that if you need to 
> check an Hdfs Index, then you have to download it to local disk (which can be 
> very large).
> We should have a way to natively check index on hdfs for corruption.






[jira] [Commented] (SOLR-7928) Improve CheckIndex to work against HdfsDirectory

2015-10-29 Thread Mike Drob (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981388#comment-14981388 ]

Mike Drob commented on SOLR-7928:
-

bq. You just need to read these publicly right? Perhaps just write public 
accessors?
Done.
bq. Testing of the HdfsCheckIndex looks pretty minimal...can we reuse 
TestCheckIndex in some way? I'm thinking like changing each test in there to 
just take a directory that you pass in. In lucene we use newDirectory, in your 
test we use an HdfsDirectory. Thoughts?
So... this is a good idea in theory, but in practice it gets really difficult 
to do. TestCheckIndex isn't visible from the Solr test classes unless we start 
publishing Lucene test artifacts, which I don't think we want to do. I think we 
can get away with minimal testing here because we aren't changing any of the 
functionality, and that's all covered in the Lucene test suite. For our 
purposes, I think it is enough to establish that if you have an HDFS cluster, 
you can point this tool at it, and it will run. 
bq. Any plans to write a MapReduce Tool to do this?
Sure, after this gets committed I'll open up a new JIRA and we can discuss 
there.

> Improve CheckIndex to work against HdfsDirectory
> 
>
> Key: SOLR-7928
> URL: https://issues.apache.org/jira/browse/SOLR-7928
> Project: Solr
>  Issue Type: New Feature
>  Components: hdfs
>Reporter: Mike Drob
>Assignee: Gregory Chanan
> Fix For: 5.4, Trunk
>
> Attachments: SOLR-7928.patch, SOLR-7928.patch
>
>
> CheckIndex is very useful for testing an index for corruption. However, it 
> can only work with an index on an FSDirectory, meaning that if you need to 
> check an Hdfs Index, then you have to download it to local disk (which can be 
> very large).
> We should have a way to natively check index on hdfs for corruption.






[jira] [Commented] (SOLR-8215) SolrCloud can select a core not in active state for querying

2015-10-29 Thread Ishan Chattopadhyaya (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981266#comment-14981266 ]

Ishan Chattopadhyaya commented on SOLR-8215:


Ah, I knew I was missing something. ;-) Sorry for the noise, please go ahead!

> SolrCloud can select a core not in active state for querying
> 
>
> Key: SOLR-8215
> URL: https://issues.apache.org/jira/browse/SOLR-8215
> Project: Solr
>  Issue Type: Bug
>Reporter: Varun Thacker
> Attachments: SOLR-8215.patch
>
>
> A query can be served by a core which is not in active state if the request 
> hits the node which hosts these non active cores.
> We explicitly check for only active cores to search against  in 
> {{CloudSolrClient#sendRequest}} Line 1043 on trunk.
> But we don't check this if someone uses the REST APIs 
> {{HttpSolrCall#getCoreByCollection}} should only pick cores which are active 
> on line 794 on trunk. 
> We however check it on line 882/883 in HttpSolrCall, when we try to find 
> cores on other nodes when it's not present locally.
> So let's fix {{HttpSolrCall#getCoreByCollection}} to make the active check as 
> well.






[jira] [Updated] (SOLR-8223) Take care not to accidentally swallow OOMErrors

2015-10-29 Thread Mike Drob (JIRA)

 [ https://issues.apache.org/jira/browse/SOLR-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Drob updated SOLR-8223:

Attachment: SOLR-8223.patch

Attaching a patch that fixes instances in {{CoreContainer}} and {{LIRThread}}.

> Take care not to accidentally swallow OOMErrors
> ---
>
> Key: SOLR-8223
> URL: https://issues.apache.org/jira/browse/SOLR-8223
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.10.3
>Reporter: Mike Drob
> Fix For: Trunk
>
> Attachments: SOLR-8223.patch
>
>
> This was first noticed with 4.10.3, but it looks like it still applies to 
> trunk. There are a few places in the code where we catch {{Throwable}} and 
> then don't check for OOM or rethrow it. This behaviour means that OOM kill 
> scripts won't run, and the JVM can get into an undesirable state.






[jira] [Created] (SOLR-8223) Take care not to accidentally swallow OOMErrors

2015-10-29 Thread Mike Drob (JIRA)
Mike Drob created SOLR-8223:
---

 Summary: Take care not to accidentally swallow OOMErrors
 Key: SOLR-8223
 URL: https://issues.apache.org/jira/browse/SOLR-8223
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.10.3
Reporter: Mike Drob
 Fix For: Trunk


This was first noticed with 4.10.3, but it looks like it still applies to 
trunk. There are a few places in the code where we catch {{Throwable}} and then 
don't check for OOM or rethrow it. This behaviour means that OOM kill scripts 
won't run, and the JVM can get into an undesirable state.
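(Editorial aside: the bug class described above, and a common defensive pattern for it, can be sketched as follows. This is an illustrative sketch, not the attached patch; the class and method names are hypothetical.)

```java
public class RethrowErrors {
    // Runs a task, logging ordinary exceptions but letting Errors
    // (e.g. OutOfMemoryError) propagate.
    public static void runGuarded(Runnable task) {
        try {
            task.run();
        } catch (Error e) {
            // Never swallow Errors: rethrow so JVM-level handling
            // (such as an OOM kill script) can still run.
            throw e;
        } catch (Throwable t) {
            System.err.println("Task failed: " + t);
        }
    }

    public static void main(String[] args) {
        // An ordinary exception is swallowed and logged...
        runGuarded(() -> { throw new RuntimeException("recoverable"); });

        // ...but an OutOfMemoryError propagates to the caller.
        boolean propagated = false;
        try {
            runGuarded(() -> { throw new OutOfMemoryError("simulated"); });
        } catch (OutOfMemoryError expected) {
            propagated = true;
        }
        System.out.println("error propagated: " + propagated);
    }
}
```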






[jira] [Commented] (SOLR-8215) SolrCloud can select a core not in active state for querying

2015-10-29 Thread Varun Thacker (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981241#comment-14981241 ]

Varun Thacker commented on SOLR-8215:
-

Hi Ishan,

This code is only triggered when we issue a request against a collection, for 
example {{/gettingstarted/update}} or {{/gettingstarted/select}}. It should not 
affect any core admin / collection API calls.

> SolrCloud can select a core not in active state for querying
> 
>
> Key: SOLR-8215
> URL: https://issues.apache.org/jira/browse/SOLR-8215
> Project: Solr
>  Issue Type: Bug
>Reporter: Varun Thacker
> Attachments: SOLR-8215.patch
>
>
> A query can be served by a core which is not in active state if the request 
> hits the node which hosts these non active cores.
> We explicitly check for only active cores to search against  in 
> {{CloudSolrClient#sendRequest}} Line 1043 on trunk.
> But we don't check this if someone uses the REST APIs 
> {{HttpSolrCall#getCoreByCollection}} should only pick cores which are active 
> on line 794 on trunk. 
> We however check it on line 882/883 in HttpSolrCall, when we try to find 
> cores on other nodes when it's not present locally.
> So let's fix {{HttpSolrCall#getCoreByCollection}} to make the active check as 
> well.






[jira] [Comment Edited] (SOLR-8215) SolrCloud can select a core not in active state for querying

2015-10-29 Thread Ishan Chattopadhyaya (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981223#comment-14981223 ]

Ishan Chattopadhyaya edited comment on SOLR-8215 at 10/29/15 8:43 PM:
--

I'm just wondering if this would mean that if a replica gets marked as down 
(due to bugs / by mistake), one wouldn't be able to issue core admin commands 
to bring it back up if this patch (and please correct me if I misunderstand 
this) short circuits the requests at the HttpSolrCall layer. One such command 
is under discussion / development in SOLR-7569 (last few comments), which will 
let the replica change its last published state. I'm not suggesting right away 
that we don't do this patch, but do you have any thoughts around it (and 
recovery of such replicas, in general)? Fyi, [~markrmil...@gmail.com].


was (Author: ichattopadhyaya):
I'm just wondering if this would mean that if a replica gets marked as down 
(due to bugs / by mistake), one wouldn't be able to issue core admin commands 
to bring it back up if this patch (and please correct me if I misunderstand 
this) short circuits the requests at the HttpSolrCall layer. One such command 
is under discussion / development in SOLR-7569 (last few comments), which will 
let the replica change its last published state. I'm not suggesting right away 
that we don't do this, but do you have any thoughts around it (and recovery of 
such replicas, in general)? Fyi, [~markrmil...@gmail.com].

> SolrCloud can select a core not in active state for querying
> 
>
> Key: SOLR-8215
> URL: https://issues.apache.org/jira/browse/SOLR-8215
> Project: Solr
>  Issue Type: Bug
>Reporter: Varun Thacker
> Attachments: SOLR-8215.patch
>
>
> A query can be served by a core which is not in active state if the request 
> hits the node which hosts these non active cores.
> We explicitly check for only active cores to search against  in 
> {{CloudSolrClient#sendRequest}} Line 1043 on trunk.
> But we don't check this if someone uses the REST APIs 
> {{HttpSolrCall#getCoreByCollection}} should only pick cores which are active 
> on line 794 on trunk. 
> We however check it on line 882/883 in HttpSolrCall, when we try to find 
> cores on other nodes when it's not present locally.
> So let's fix {{HttpSolrCall#getCoreByCollection}} to make the active check as 
> well.






[jira] [Comment Edited] (SOLR-8215) SolrCloud can select a core not in active state for querying

2015-10-29 Thread Ishan Chattopadhyaya (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981223#comment-14981223 ]

Ishan Chattopadhyaya edited comment on SOLR-8215 at 10/29/15 8:41 PM:
--

I'm just wondering if this would mean that if a replica gets marked as down 
(due to bugs / by mistake), one wouldn't be able to issue core admin commands 
to bring it back up if this patch (and please correct me if I misunderstand 
this) short circuits the requests at the HttpSolrCall layer. One such command 
is under discussion / development in SOLR-7569 (last few comments), which will 
let the replica change its last published state. I'm not suggesting right away 
that we don't do this, but do you have any thoughts around it (and recovery of 
such replicas, in general)? Fyi, [~markrmil...@gmail.com].


was (Author: ichattopadhyaya):
I'm just wondering if this would mean that if a replica gets marked as down 
(due to bugs / by mistake), one wouldn't be able to issue core admin commands 
to bring it back up if this patch (and please correct me if I misunderstand 
this) short circuits the requests at the HttpSolrCall layer. One such command 
is under discussion / development in SOLR-7569 (last few comments). I'm not 
suggesting right away that we don't do this, but do you have any thoughts 
around it? Fyi, [~markrmil...@gmail.com].

> SolrCloud can select a core not in active state for querying
> 
>
> Key: SOLR-8215
> URL: https://issues.apache.org/jira/browse/SOLR-8215
> Project: Solr
>  Issue Type: Bug
>Reporter: Varun Thacker
> Attachments: SOLR-8215.patch
>
>
> A query can be served by a core which is not in active state if the request 
> hits the node which hosts these non-active cores.
> We explicitly check for only active cores to search against in 
> {{CloudSolrClient#sendRequest}} (line 1043 on trunk).
> But we don't check this if someone uses the REST APIs: 
> {{HttpSolrCall#getCoreByCollection}} should only pick cores which are active 
> (line 794 on trunk).
> We do, however, check it on lines 882/883 in HttpSolrCall, when we try to find 
> cores on other nodes when the core is not present locally.
> So let's fix {{HttpSolrCall#getCoreByCollection}} to make the active check as 
> well.
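The fix described above amounts to filtering the collection's replicas by state before picking a core to serve the request. A minimal standalone sketch of that selection logic, under illustrative types (the `Replica` record and `pickLocalCore` method are stand-ins, not Solr's actual API):

```java
import java.util.List;
import java.util.Optional;

public class ActiveCoreSelection {
    // Illustrative stand-in for a replica entry in cluster state.
    record Replica(String coreName, String nodeName, String state) {}

    // Pick a core hosted on this node, but only if its replica is "active".
    // This mirrors the active-state check the patch adds to
    // HttpSolrCall#getCoreByCollection.
    static Optional<String> pickLocalCore(List<Replica> replicas, String thisNode) {
        return replicas.stream()
                .filter(r -> r.nodeName().equals(thisNode))
                .filter(r -> "active".equals(r.state()))   // the missing check
                .map(Replica::coreName)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Replica> replicas = List.of(
                new Replica("shard1_replica1", "node1", "recovering"),
                new Replica("shard1_replica2", "node1", "active"));
        // Without the state filter, the recovering core would be picked first.
        System.out.println(pickLocalCore(replicas, "node1").orElse("none"));
    }
}
```

The same filter already exists on the client side in {{CloudSolrClient#sendRequest}}; the patch simply applies it on the server-side routing path too.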



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8215) SolrCloud can select a core not in active state for querying

2015-10-29 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981223#comment-14981223
 ] 

Ishan Chattopadhyaya commented on SOLR-8215:


I'm just wondering if this would mean that if a replica gets marked as down 
(due to bugs / by mistake), one wouldn't be able to issue core admin commands 
to bring it back up if this patch (and please correct me if I misunderstand 
this) short circuits the requests at the HttpSolrCall layer. One such command 
is under discussion / development in SOLR-7569 (last few comments). I'm not 
suggesting right away that we don't do this, but do you have any thoughts 
around it? Fyi, [~markrmil...@gmail.com].

> SolrCloud can select a core not in active state for querying
> 
>
> Key: SOLR-8215
> URL: https://issues.apache.org/jira/browse/SOLR-8215
> Project: Solr
>  Issue Type: Bug
>Reporter: Varun Thacker
> Attachments: SOLR-8215.patch
>
>
> A query can be served by a core which is not in active state if the request 
> hits the node which hosts these non-active cores.
> We explicitly check for only active cores to search against in 
> {{CloudSolrClient#sendRequest}} (line 1043 on trunk).
> But we don't check this if someone uses the REST APIs: 
> {{HttpSolrCall#getCoreByCollection}} should only pick cores which are active 
> (line 794 on trunk).
> We do, however, check it on lines 882/883 in HttpSolrCall, when we try to find 
> cores on other nodes when the core is not present locally.
> So let's fix {{HttpSolrCall#getCoreByCollection}} to make the active check as 
> well.






Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_60) - Build # 14711 - Failure!

2015-10-29 Thread Michael McCandless
I committed a fix.

Mike McCandless

http://blog.mikemccandless.com


On Thu, Oct 29, 2015 at 10:08 AM, Michael McCandless
 wrote:
> Ooh, I'll dig.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Oct 29, 2015 at 9:38 AM, Policeman Jenkins Server
>  wrote:
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/14711/
>> Java: 64bit/jdk1.8.0_60 -XX:-UseCompressedOops -XX:+UseSerialGC
>>
>> 1 tests failed.
>> FAILED:  org.apache.lucene.index.TestDimensionalValues.testRandomBinaryMedium
>>
>> Error Message:
>> docID=857 expected: but was:
>>
>> Stack Trace:
>> java.lang.AssertionError: docID=857 expected: but was:
>> at 
>> __randomizedtesting.SeedInfo.seed([831907F8E7511A35:F43480B833FD6FE2]:0)
>> at org.junit.Assert.fail(Assert.java:93)
>> at org.junit.Assert.failNotEquals(Assert.java:647)
>> at org.junit.Assert.assertEquals(Assert.java:128)
>> at 
>> org.apache.lucene.index.TestDimensionalValues.verify(TestDimensionalValues.java:994)
>> at 
>> org.apache.lucene.index.TestDimensionalValues.verify(TestDimensionalValues.java:791)
>> at 
>> org.apache.lucene.index.TestDimensionalValues.doTestRandomBinary(TestDimensionalValues.java:781)
>> at 
>> org.apache.lucene.index.TestDimensionalValues.testRandomBinaryMedium(TestDimensionalValues.java:391)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:497)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1660)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:866)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:902)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:916)
>> at 
>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>> at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:875)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:777)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:811)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:822)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
>> at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
>> 

[jira] [Updated] (LUCENE-6869) When executing MoreLikeThis with multiple fields, it should create a query considering all fieldNames

2015-10-29 Thread Pedro Rosanes (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pedro Rosanes updated LUCENE-6869:
--
Component/s: core/queryparser

> When executing MoreLikeThis with multiple fields, it should create a query 
> considering all fieldNames
> -
>
> Key: LUCENE-6869
> URL: https://issues.apache.org/jira/browse/LUCENE-6869
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/query/scoring, core/queryparser
>Affects Versions: 5.3
>Reporter: Pedro Rosanes
>  Labels: morelikethis
>
> When executing MLT with multiple fields, it should 
> consider them all.
> If a document has the same term in multiple fields, 
> MLT generates a query considering only the field with 
> the highest idf. This commit changes the behaviour to 
> include all field names in the query.
> E.g., given a document with:
> Doc ("fieldName1", "value")
> ("fieldName2", "value")
> Old behaviour generates the query: "fieldName1:value"
> New behaviour generates the query: "fieldName1:value 
> fieldName2:value"
> Proposed solution:
> https://github.com/prosanes/lucene-solr/pull/1/files
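The behaviour change can be sketched independently of Lucene's MoreLikeThis internals. A toy model under stated assumptions (the `buildClauses` helper and the explicit field-to-idf map are illustrative; real MLT derives idf from index statistics):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MltAllFields {
    // Old behaviour: for a term that appears in several fields, keep only the
    // field with the highest idf. New behaviour: emit one clause per field.
    static List<String> buildClauses(String term, Map<String, Double> idfByField,
                                     boolean allFields) {
        List<String> clauses = new ArrayList<>();
        if (allFields) {
            for (String field : idfByField.keySet()) {   // one clause per field
                clauses.add(field + ":" + term);
            }
        } else {
            idfByField.entrySet().stream()               // highest-idf field only
                    .max(Map.Entry.comparingByValue())
                    .ifPresent(e -> clauses.add(e.getKey() + ":" + term));
        }
        return clauses;
    }

    public static void main(String[] args) {
        Map<String, Double> idf = new LinkedHashMap<>();
        idf.put("fieldName1", 2.0);
        idf.put("fieldName2", 1.0);
        System.out.println(buildClauses("value", idf, false)); // old: one clause
        System.out.println(buildClauses("value", idf, true));  // new: both clauses
    }
}
```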






[jira] [Updated] (LUCENE-6869) When executing MoreLikeThis with multiple fields, it should create a query considering all fieldNames

2015-10-29 Thread Pedro Rosanes (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pedro Rosanes updated LUCENE-6869:
--
Priority: Minor  (was: Major)

> When executing MoreLikeThis with multiple fields, it should create a query 
> considering all fieldNames
> -
>
> Key: LUCENE-6869
> URL: https://issues.apache.org/jira/browse/LUCENE-6869
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/query/scoring, core/queryparser
>Affects Versions: 5.3
>Reporter: Pedro Rosanes
>Priority: Minor
>  Labels: morelikethis
>
> When executing MLT with multiple fields, it should 
> consider them all.
> If a document has the same term in multiple fields, 
> MLT generates a query considering only the field with 
> the highest idf. This commit changes the behaviour to 
> include all field names in the query.
> E.g., given a document with:
> Doc ("fieldName1", "value")
> ("fieldName2", "value")
> Old behaviour generates the query: "fieldName1:value"
> New behaviour generates the query: "fieldName1:value 
> fieldName2:value"
> Proposed solution:
> https://github.com/prosanes/lucene-solr/pull/1/files






[jira] [Created] (LUCENE-6869) When executing MoreLikeThis with multiple fields, it should create a query considering all fieldNames

2015-10-29 Thread Pedro Rosanes (JIRA)
Pedro Rosanes created LUCENE-6869:
-

 Summary: When executing MoreLikeThis with multiple fields, it 
should create a query considering all fieldNames
 Key: LUCENE-6869
 URL: https://issues.apache.org/jira/browse/LUCENE-6869
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/query/scoring
Affects Versions: 5.3
Reporter: Pedro Rosanes


When executing MLT with multiple fields, it should
consider them all.

If a document has the same term in multiple fields,
MLT generates a query considering only the field with
the highest idf. This commit changes the behaviour to
include all field names in the query.

E.g., given a document with:
Doc ("fieldName1", "value")
("fieldName2", "value")

Old behaviour generates the query: "fieldName1:value"
New behaviour generates the query: "fieldName1:value
fieldName2:value"

Proposed solution:
https://github.com/prosanes/lucene-solr/pull/1/files






[JENKINS] Lucene-Solr-Tests-trunk-Java8 - Build # 553 - Failure

2015-10-29 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java8/553/

1 tests failed.
FAILED:  org.apache.solr.client.solrj.impl.CloudSolrClientTest.test

Error Message:
There should be one document because overwrite=true expected:<1> but was:<0>

Stack Trace:
java.lang.AssertionError: There should be one document because overwrite=true 
expected:<1> but was:<0>
at 
__randomizedtesting.SeedInfo.seed([1CF0BA95FCEA8164:94A4854F5216EC9C]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.client.solrj.impl.CloudSolrClientTest.testOverwriteOption(CloudSolrClientTest.java:156)
at 
org.apache.solr.client.solrj.impl.CloudSolrClientTest.test(CloudSolrClientTest.java:120)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1660)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:866)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:902)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:916)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:110)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:963)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:938)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:875)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:777)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:811)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:822)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
  

Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_60) - Build # 14711 - Failure!

2015-10-29 Thread Steve Rowe
Hi Martin,

You should start another thread to discuss this topic; replying to a jenkins 
build failure message means lots of people will never look at your email.

Do you know about the Maven build that is already included with the project?
If not, start here: 


Steve

> On Oct 29, 2015, at 1:46 PM, Martin Gainty  wrote:
> 
> With a staggering 65 build.xmls, it seems the ivy build situation is due for an 
> overhaul.
> I also noticed a lot of ping-pong: a grandchild calls a target on the parent, 
> which calls a target on the grandparent, which then passes that back to the 
> grandchild... the flow is almost impossible to trace, so
> 
> I am attempting to move the ivy build.xmls over to maven, but at 200+ 
> dependencies I am still picking up more dependencies.
> 
> Using maven's hierarchical parent would allow all dependencies to be 
> predefined at the parent and passed down to the children.
> lucene folder count (children): 32 
> solr folder count (children): 16 
> This doesn't include grandchildren, most notably contrib.
> 
> lucene/common-build.xml is now referencing maven artifacts located in the maven 
> repo on localhost for classpath resolution.
> lucene/common-build.xml's project.class.path now includes these dependencies 
> required by lucene and solr:
> 
> <pathelement location="${user.home}/.m2/repository/aopalliance/aopalliance/1.0/aopalliance-1.0.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/adobe/xmp/xmpcore/5.1.2/xmpcore-5.1.2.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/carrotsearch/hppc/0.7.7.1/hppc-0.7.7.1.jar"/>
> <pathelement location="${user.home}/.m2/repository/com.carrotsearch.randomizedtesting/junit4-ant/2.1.17/junit4-ant-2.1.17.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/codahale/metrics-core/1.1.0/metrics-core-1.1.0.jar"/>
> <pathelement location="${user.home}/.m2/repository/com.codahale.metrics/metrics-healthchecks/3.0.1/metrics-healthchecks-3.0.1.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/cybozu/labs/langdetect/1.1.20120112/langdetect-1.1.20120112.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/drewnoakes/metadata-extractor/3.1.4/metadata-extractor-3.1.4.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/facebook/presto/presto-parser/0.107/presto-parser-0.107.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.5.1/jackson-core-2.5.1.jar"/>
> <pathelement location="${user.home}/.m2/repository/com.fasterxml.jackson.dataformat/jackson-dataformat-smile/2.5.4/jackson-dataformat-smile-2.5.4.jar"/>
> <pathelement location="${user.home}/.m2/repository/com.github.ben-manes.caffeine/caffeine/1.0.1/caffeine-1.0.1.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/google/guava/guava/18.0/guava-18.0.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/google/inject/guice/3.0/guice-3.0.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/google/inject/extensions/guice-servlet/3.0/guice-servlet-3.0.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/googlecode/juniversalchardet/1.0.3/juniversalchardet-1.0.3.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/googlecode/mp4parser/isoparser/1.0.2/isoparser-1.0.2.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/norconex/language/langdetect/1.3.0/langdetect-1.3.0.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/pff/java-libpst/0.8.1/java-libpst-0.8.1.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/rometools/rome/1.0/rome-1.0.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/sun/jersey/jersey-bundle/1.9/jersey-bundle-1.9.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-1.9.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/sun/jersey/contribs/jersey-guice/1.9/jersey-guice-1.9.jar"/>
> <pathelement location="${user.home}/.m2/repository/com.sun.mail/gimap/1.5.1/gimap-1.5.1.jar"/>
> <pathelement location="${user.home}/.m2/repository/com.sun.mail/javax.mail/1.5.1/javax.mail-1.5.1.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/tdunning/t-digest/3.0/t-digest-3.0.jar"/>
> <pathelement location="${user.home}/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3/paranamer-2.3.jar"/>
> <pathelement location="${user.home}/.m2/repository/com.typesafe/config/1.0.2/config-1.0.2.jar"/>
> <pathelement location="${user.home}/.m2/repository/commons-cli/commons-cli/1.3/commons-cli-1.3.jar"/>
> <pathelement location="${user.home}/.m2/repository/commons-digester/commons-digester/2.1/commons-digester-2.1.jar"/>
> location="${user.hom

[jira] [Commented] (LUCENE-6276) Add matchCost() api to TwoPhaseDocIdSetIterator

2015-10-29 Thread Paul Elschot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980894#comment-14980894
 ] 

Paul Elschot commented on LUCENE-6276:
--

I basically agree with all of these.

bq. ... move the utility methods to compute costs of phrases from 
TwoPhaseIterator into PhraseWeight/SpanNearQuery. I don't like leaking 
implementation details of specific TwoPhaseIterators into TwoPhaseIterator.

and make them (package) private, I assume? The only disadvantage of that is that 
some duplication of these methods is needed in the spans package.

The easiest way to avoid such duplication would be for Spans to move from 
o.a.l.search.spans to o.a.l.search.
Iirc there was some talk of that not so long ago (Alan's plans for spans, iirc), 
so how about waiting for that, possibly as a separate issue?

It will take a while (at least a week) before I can continue with this. Please 
feel free to take it on.

> Add matchCost() api to TwoPhaseDocIdSetIterator
> ---
>
> Key: LUCENE-6276
> URL: https://issues.apache.org/jira/browse/LUCENE-6276
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-6276-ExactPhraseOnly.patch, 
> LUCENE-6276-NoSpans.patch, LUCENE-6276-NoSpans2.patch, LUCENE-6276.patch, 
> LUCENE-6276.patch, LUCENE-6276.patch, LUCENE-6276.patch, LUCENE-6276.patch
>
>
> We could add a method like TwoPhaseDISI.matchCost() defined as something like 
> estimate of nanoseconds or similar. 
> ConjunctionScorer could use this method to sort its 'twoPhaseIterators' array 
> so that cheaper ones are called first. Today it has no idea if one scorer is 
> a simple phrase scorer on a short field vs another that might do some geo 
> calculation or more expensive stuff.
> PhraseScorers could implement this based on index statistics (e.g. 
> totalTermFreq/maxDoc)
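The ordering idea in this issue — sort a conjunction's two-phase iterators so the cheapest verification runs first and can veto a candidate before the expensive ones are consulted — can be sketched without Lucene's real classes. The `TwoPhase` record and `matchesAll` helper below are illustrative stand-ins, not the actual TwoPhaseIterator API:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.IntPredicate;

public class MatchCostOrdering {
    // Illustrative stand-in for a two-phase iterator: a verification step
    // plus an estimated cost of running it (e.g. nanoseconds per call).
    record TwoPhase(String name, float matchCost, IntPredicate matches) {}

    // Conjunction-style verification: sort by matchCost so the cheap checks
    // run first and can short-circuit before the costly ones execute.
    static boolean matchesAll(TwoPhase[] phases, int docId) {
        Arrays.sort(phases, Comparator.comparingDouble(TwoPhase::matchCost));
        for (TwoPhase p : phases) {
            if (!p.matches().test(docId)) {
                return false;   // short-circuit: skip costlier phases
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TwoPhase[] phases = {
                new TwoPhase("geo-distance", 100f, d -> d % 2 == 0),
                new TwoPhase("short-phrase", 1f, d -> d > 5),
        };
        System.out.println(matchesAll(phases, 8));
        System.out.println(matchesAll(phases, 3)); // cheap phase vetoes first
    }
}
```

In the real patch the sort happens once when ConjunctionScorer builds its 'twoPhaseIterators' array, not per document as in this toy version.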






Index Metatags in Nutch site.xml

2015-10-29 Thread Salonee Rege
We have finished running the bin/nutch solrindex command on our Nutch
segments. The data is getting indexed. I followed this link:
https://wiki.apache.org/nutch/IndexMetatags . The metatags "description" and
"keywords" were the sample ones we used, but they are not getting indexed.
What could be the problem with this?

Thanks and Regards,
*Salonee Rege*
USC Viterbi School of Engineering
University of Southern California
Master of Computer Science - Student
Computer Science - B.E
salon...@usc.edu  *||* *619-709-6756 <619-709-6756>*


[JENKINS-MAVEN] Lucene-Solr-Maven-5.x #1088: POMs out of sync

2015-10-29 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-5.x/1088/

No tests ran.

Build Log:
[...truncated 24716 lines...]
BUILD FAILED
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/build.xml:801: The 
following error occurred while executing this line:
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/build.xml:290: The 
following error occurred while executing this line:
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/lucene/build.xml:409: 
The following error occurred while executing this line:
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/lucene/common-build.xml:592:
 Error deploying artifact 'org.apache.lucene:lucene-solr-grandparent:pom': 
Error retrieving previous build number for artifact 
'org.apache.lucene:lucene-solr-grandparent:pom': repository metadata for: 
'snapshot org.apache.lucene:lucene-solr-grandparent:5.4.0-SNAPSHOT' could not 
be retrieved from repository: apache.snapshots.https due to an error: Error 
transferring file: Server returned HTTP response code: 502 for URL: 
https://repository.apache.org/content/repositories/snapshots/org/apache/lucene/lucene-solr-grandparent/5.4.0-SNAPSHOT/maven-metadata.xml.sha1

Total time: 10 minutes 32 seconds
Build step 'Invoke Ant' marked build as failure
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[jira] [Commented] (SOLR-8215) SolrCloud can select a core not in active state for querying

2015-10-29 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980838#comment-14980838
 ] 

Varun Thacker commented on SOLR-8215:
-

Thanks Mark for the review!

I'll commit this shortly.

> SolrCloud can select a core not in active state for querying
> 
>
> Key: SOLR-8215
> URL: https://issues.apache.org/jira/browse/SOLR-8215
> Project: Solr
>  Issue Type: Bug
>Reporter: Varun Thacker
> Attachments: SOLR-8215.patch
>
>
> A query can be served by a core which is not in active state if the request 
> hits the node which hosts these non-active cores.
> We explicitly check for only active cores to search against in 
> {{CloudSolrClient#sendRequest}} (line 1043 on trunk).
> But we don't check this if someone uses the REST APIs: 
> {{HttpSolrCall#getCoreByCollection}} should only pick cores which are active 
> (line 794 on trunk).
> We do, however, check it on lines 882/883 in HttpSolrCall, when we try to find 
> cores on other nodes when the core is not present locally.
> So let's fix {{HttpSolrCall#getCoreByCollection}} to make the active check as 
> well.






[jira] [Updated] (SOLR-8222) Optimize count-only faceting when there are many expected matches-per-ord

2015-10-29 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-8222:
---
Attachment: SOLR-8222.patch

Here's a draft patch that implements this optimization for single valued fields.

> Optimize count-only faceting when there are many expected matches-per-ord
> -
>
> Key: SOLR-8222
> URL: https://issues.apache.org/jira/browse/SOLR-8222
> Project: Solr
>  Issue Type: Improvement
>  Components: Facet Module
>Reporter: Yonik Seeley
> Attachments: SOLR-8222.patch
>
>
> This optimization for the JSON Facet API came up a few months ago on the 
> mailing list (I think by Toke).
> Basically, if one expects many hits per bucket, use a temporary array to 
> accumulate segment ords and map them all at the end to global ords.  This 
> saves redundant segOrd->globalOrd mappings at the cost of having to scan the 
> temp array.
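The accumulation strategy the issue describes can be shown in isolation: count hits against a per-segment array first, then pay one segOrd-to-globalOrd lookup per distinct ordinal at the end. The array-based `segToGlobal` map below is an illustrative stand-in for Lucene's global-ordinal map:

```java
public class SegOrdAccumulate {
    // Count-only faceting sketch: instead of mapping every hit's segment
    // ordinal to a global ordinal immediately, accumulate counts in a
    // per-segment array and map each *distinct* ordinal once at the end.
    static void accumulate(int[] segOrdsOfHits, int[] segToGlobal, int[] globalCounts) {
        int[] segCounts = new int[segToGlobal.length];
        for (int segOrd : segOrdsOfHits) {
            segCounts[segOrd]++;                 // cheap per-hit work
        }
        for (int segOrd = 0; segOrd < segCounts.length; segOrd++) {
            if (segCounts[segOrd] > 0) {         // one mapping per distinct ord
                globalCounts[segToGlobal[segOrd]] += segCounts[segOrd];
            }
        }
    }

    public static void main(String[] args) {
        int[] globalCounts = new int[4];
        // Three hits land on segment ords 0, 0 and 2; map them once at the end.
        accumulate(new int[] {0, 0, 2}, new int[] {3, 1, 0}, globalCounts);
        System.out.println(java.util.Arrays.toString(globalCounts)); // [1, 0, 0, 2]
    }
}
```

This trades one scan over the temp array (plus its memory) for skipping the redundant per-hit ordinal mappings, which pays off exactly when many hits share each bucket.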






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980812#comment-14980812
 ] 

Yonik Seeley commented on SOLR-6406:


OK, I was able to reproduce... Interestingly, this is pretty easy to hit (and I 
also saw 2 threads stuck at the same point... which as you say must be 2 
different client objects).  There must be something more here than a 
subtle/little race condition.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Commented] (SOLR-8215) SolrCloud can select a core not in active state for querying

2015-10-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980795#comment-14980795
 ] 

Mark Miller commented on SOLR-8215:
---

Cool, good catch!

Patch LGTM.

> SolrCloud can select a core not in active state for querying
> 
>
> Key: SOLR-8215
> URL: https://issues.apache.org/jira/browse/SOLR-8215
> Project: Solr
>  Issue Type: Bug
>Reporter: Varun Thacker
> Attachments: SOLR-8215.patch
>
>
> A query can be served by a core which is not in active state if the request 
> hits the node which hosts these non-active cores.
> We explicitly check for only active cores to search against in 
> {{CloudSolrClient#sendRequest}} (line 1043 on trunk).
> But we don't check this if someone uses the REST APIs: 
> {{HttpSolrCall#getCoreByCollection}} should only pick cores which are active 
> (line 794 on trunk).
> We do, however, check it on lines 882/883 in HttpSolrCall, when we try to find 
> cores on other nodes when the core is not present locally.
> So let's fix {{HttpSolrCall#getCoreByCollection}} to make the active check as 
> well.






[jira] [Updated] (LUCENE-6868) ParallelLeafReader.getTermVectors can indirectly load TVs multiple times

2015-10-29 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-6868:
-
Description: 
ParallelLeafReader has a getTermVectors(docId) implementation that loops over 
each field it has in a loop and calls getTermVector(docId,fieldName).  But the 
implementation of that will load all term vectors for all fields in that 
reader, yet ParallelLeafReader only wants one.  The effect is an O(n^2) where 
'n' is the number of fields, when we could get O\(n) if we do it right. PLR 
should call getTermVectors(docId) (not referring to a specific field) for each 
of it's readers and then aggregate them.

This wouldn't be such a problem if our term vector API/Codec was improved to 
not load all term vectors for all fields from disk at once.

Found via randomized-testing of IndexWriter auto-picking ParallelAtomicReader 
along with a test I have that asserts TVs aren't fetched for a doc more than 
once.

  was:
ParallelLeafReader has a getTermVectors(docId) implementation that loops over 
each field it has in a loop and calls getTermVector(docId,fieldName).  But the 
implementation of that will load all term vectors for all fields in that 
reader, yet ParallelLeafReader only wants one.  The effect is an O(n^2) where 
'n' is the number of fields, when we could get O(n) if we do it right. PLR 
should call getTermVectors(docId) (not referring to a specific field) for each 
of it's readers and then aggregate them.

This wouldn't be such a problem if our term vector API/Codec was improved to 
not load all term vectors for all fields from disk at once.

Found via randomized-testing of IndexWriter auto-picking ParallelAtomicReader 
along with a test I have that asserts TVs aren't fetched for a doc more than 
once.


> ParallelLeafReader.getTermVectors can indirectly load TVs multiple times
> 
>
> Key: LUCENE-6868
> URL: https://issues.apache.org/jira/browse/LUCENE-6868
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index, core/termvectors
>Reporter: David Smiley
>
> ParallelLeafReader has a getTermVectors(docId) implementation that loops over 
> each of its fields and calls getTermVector(docId,fieldName).  But the 
> implementation of that loads all term vectors for all fields in that reader, 
> yet ParallelLeafReader only wants one.  The effect is O(n^2) where 'n' is the 
> number of fields, when we could get O(n) if we do it right. PLR should call 
> getTermVectors(docId) (not referring to a specific field) for each of its 
> readers and then aggregate them.
> This wouldn't be such a problem if our term vector API/Codec was improved to 
> not load all term vectors for all fields from disk at once.
> Found via randomized-testing of IndexWriter auto-picking ParallelAtomicReader 
> along with a test I have that asserts TVs aren't fetched for a doc more than 
> once.
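The fix described above (one getTermVectors call per parallel sub-reader, then aggregate) can be sketched with plain collections standing in for Lucene's Fields API; the `SubReader` type and its shape here are invented for illustration, not Lucene types:

```java
import java.util.*;

public class ParallelTermVectorsSketch {
    /** Hypothetical stand-in for one parallel sub-reader: docId -> (field -> terms). */
    static class SubReader {
        final Map<Integer, Map<String, List<String>>> tvs;
        SubReader(Map<Integer, Map<String, List<String>>> tvs) { this.tvs = tvs; }
        /** One load returns the vectors for ALL fields of this reader at once. */
        Map<String, List<String>> getTermVectors(int docId) {
            return tvs.getOrDefault(docId, Map.of());
        }
    }

    /** O(n) in the number of fields: one bulk load per sub-reader, then merge. */
    static Map<String, List<String>> getTermVectors(List<SubReader> readers, int docId) {
        Map<String, List<String>> merged = new TreeMap<>();
        for (SubReader r : readers) {
            // Fields are disjoint across parallel readers, so a plain merge suffices.
            merged.putAll(r.getTermVectors(docId));
        }
        return merged;
    }
}
```

The per-field variant would trigger one full load per field of each sub-reader, which is where the O(n^2) comes from.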






[jira] [Issue Comment Deleted] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Keith Laban (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Laban updated SOLR-8220:
--
Comment: was deleted

(was: I believe sorting already reaps the benefits of doc values at least 
according to [this 
documentation|https://cwiki.apache.org/confluence/display/solr/DocValues])

> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior






[jira] [Commented] (SOLR-6406) ConcurrentUpdateSolrServer hang in blockUntilFinished.

2015-10-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980772#comment-14980772
 ] 

Mark Miller commented on SOLR-6406:
---

Just two threads stuck - not necessarily from the same client. Previously I had 
only ever seen 1 thread stuck. Just noting it, may not mean much.

> ConcurrentUpdateSolrServer hang in blockUntilFinished.
> --
>
> Key: SOLR-6406
> URL: https://issues.apache.org/jira/browse/SOLR-6406
> Project: Solr
>  Issue Type: Bug
>Reporter: Mark Miller
> Fix For: 5.0, Trunk
>
> Attachments: CPU Sampling.png, SOLR-6406.patch, SOLR-6406.patch
>
>
> Not sure what is causing this, but SOLR-6136 may have taken us a step back 
> here. I see this problem occasionally pop up in ChaosMonkeyNothingIsSafeTest 
> now - test fails because of a thread leak, thread leak is due to a 
> ConcurrentUpdateSolrServer hang in blockUntilFinished. Only started popping 
> up recently.






[jira] [Updated] (LUCENE-6863) Store sparse doc values more efficiently

2015-10-29 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6863:
-
Attachment: LUCENE-6863.patch

Here is an updated patch that makes the sparse impl a bit more efficient when 
consumed in sequential order by keeping track of the upper bound of the current 
window. This is what has been used in the above benchmark.
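The sequential-order optimization (tracking the upper bound of the current window so in-order lookups avoid repeating a binary search) can be illustrated like this; all names are made up and this is not the patch's code:

```java
public class SparseLookupSketch {
    static class SparseValues {
        final int[] docsWithValue; // sorted docIds that have a value
        final long[] values;       // value for each entry in docsWithValue
        int index = 0;             // current window position

        SparseValues(int[] docsWithValue, long[] values) {
            this.docsWithValue = docsWithValue;
            this.values = values;
        }

        /** Returns the value for docId, or -1 if the doc has no value. */
        long get(int docId) {
            if (index > 0 && docId < docsWithValue[index - 1]) {
                index = 0; // out-of-order access: restart (a real impl would binary search)
            }
            while (index < docsWithValue.length && docsWithValue[index] < docId) {
                index++;   // advance the window: amortized O(1) for sequential consumption
            }
            if (index < docsWithValue.length && docsWithValue[index] == docId) {
                return values[index];
            }
            return -1;
        }
    }
}
```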

I also updated the heuristic to require at least 1024 docs in the segment and 
that less than 1% of docs have a value, in order to be on the safe side and to 
only slow down abuse/exceptional cases. Even if/when this gets used for some 
fields, I think the slowdown is acceptable insofar as it would only slow down 
fast queries: if you look at the above benchmarks, when the query matches many 
docs (such as a MatchAllDocsQuery) this encoding is actually faster than 
regular delta encoding. Only queries that match a small fraction of the index 
(so that most dv lookups will require a binary search) would become slower.
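The updated heuristic (at least 1024 docs in the segment, under 1% of docs with a value) amounts to a check like the following; the method name is illustrative, not the patch's actual API:

```java
public class SparseHeuristicSketch {
    /** Use sparse encoding only for large-enough segments with very low density. */
    static boolean useSparseEncoding(int maxDoc, int docsWithValue) {
        return maxDoc >= 1024 && docsWithValue < maxDoc / 100; // < 1% density
    }
}
```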

Opinions?

> Store sparse doc values more efficiently
> 
>
> Key: LUCENE-6863
> URL: https://issues.apache.org/jira/browse/LUCENE-6863
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
> Attachments: LUCENE-6863.patch, LUCENE-6863.patch
>
>
> For both NUMERIC fields and ordinals of SORTED fields, we store data in a 
> dense way. As a consequence, if you have only 1000 documents out of 1B that 
> have a value, and 8 bits are required to store those 1000 numbers, we will 
> not require 1KB of storage, but 1GB.
> I suspect this mostly happens in abuse cases, but still it's a pity that we 
> explode storage requirements. We could try to detect sparsity and compress 
> accordingly.
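To make the quoted numbers concrete: a dense encoding pays per document in the segment, a sparse one per value. A back-of-the-envelope sketch (ignoring the docId index a sparse format would also need):

```java
public class SparseStorageMathSketch {
    public static void main(String[] args) {
        long maxDoc = 1_000_000_000L; // 1B docs in the segment
        long docsWithValue = 1_000;   // only 1000 of them have a value
        long bitsPerValue = 8;

        long denseBytes = maxDoc * bitsPerValue / 8;         // one slot per doc
        long sparseBytes = docsWithValue * bitsPerValue / 8; // one slot per value

        System.out.println(denseBytes);  // 1000000000 (~1GB)
        System.out.println(sparseBytes); // 1000 (~1KB)
    }
}
```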






[jira] [Comment Edited] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Keith Laban (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980731#comment-14980731
 ] 

Keith Laban edited comment on SOLR-8220 at 10/29/15 4:40 PM:
-

There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[SolrIndexSearcher.doc|https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the reader not to load fields which are 
known to be in docValues.


thoughts?

edit: Would approach number 2 affect how fields are loaded lazily 
(LazyDocument)?
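Approach 2 above (two passes over the requested fields) might look roughly like this; `FieldInfo` and the reader functions are illustrative stand-ins, not SolrIndexSearcher's actual types:

```java
import java.util.*;
import java.util.function.Function;

public class TwoPassDocLoadSketch {
    /** Hypothetical per-field schema info. */
    static class FieldInfo {
        final String name; final boolean stored; final boolean hasDocValues;
        FieldInfo(String name, boolean stored, boolean hasDocValues) {
            this.name = name; this.stored = stored; this.hasDocValues = hasDocValues;
        }
    }

    static Map<String, Object> loadDoc(Set<String> requested, List<FieldInfo> schema,
                                       Function<String, Object> storedReader,
                                       Function<String, Object> docValuesReader) {
        Map<String, Object> doc = new LinkedHashMap<>();
        // Pass 1: stored fields.
        for (FieldInfo f : schema) {
            if (requested.contains(f.name) && f.stored) {
                doc.put(f.name, storedReader.apply(f.name));
            }
        }
        // Pass 2: requested fields that exist only as docValues.
        for (FieldInfo f : schema) {
            if (requested.contains(f.name) && !f.stored && f.hasDocValues) {
                doc.put(f.name, docValuesReader.apply(f.name));
            }
        }
        return doc;
    }
}
```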


was (Author: k317h):
There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[SolrIndexSearcher.doc|https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the reader not to load fields which are 
known to be in docValues.


thoughts?

edit: Would approach number 2 affect how fields are loaded lazily (LazyDocument)?

> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior






[jira] [Comment Edited] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Keith Laban (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980731#comment-14980731
 ] 

Keith Laban edited comment on SOLR-8220 at 10/29/15 4:39 PM:
-

There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[SolrIndexSearcher.doc|https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the reader not to load fields which are 
known to be in docValues.


thoughts?

edit: Would approach number 2 affect how fields are loaded lazily (LazyDocument)?


was (Author: k317h):
There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[SolrIndexSearcher.doc|https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the reader not to load fields which are 
known to be in docValues.


thoughts?

> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior






[jira] [Comment Edited] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Keith Laban (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980731#comment-14980731
 ] 

Keith Laban edited comment on SOLR-8220 at 10/29/15 4:34 PM:
-

There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[SolrIndexSearcher.doc|https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the reader not to load fields which are 
known to be in docValues.


thoughts?


was (Author: k317h):
There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[doc|https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the reader not to load fields which are 
known to be in docValues.


thoughts?

> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior






[jira] [Comment Edited] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Keith Laban (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980731#comment-14980731
 ] 

Keith Laban edited comment on SOLR-8220 at 10/29/15 4:34 PM:
-

There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[doc|https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the the reader to not load fields which are 
known to be in docValues.


thoughts?


was (Author: k317h):
There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[doc:https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the reader not to load fields which are 
known to be in docValues.


thoughts?

> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior






[jira] [Commented] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Keith Laban (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980731#comment-14980731
 ] 

Keith Laban commented on SOLR-8220:
---

There are two approaches I can see:

1) implement a new type of StoredFieldReader which is aware of field type (i.e. 
does it have docValues, is it stored). This reader would delegate between 
reading from docValues or the stored fields

2) in the 
[doc:https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L736]
 function do two passes. first pass to get stored fields, a second pass to get 
docValues. This can go a step further to make the SetNonLazyFieldSelector aware 
of docValues fields and instruct the reader not to load fields which are 
known to be in docValues.


thoughts?

> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior






[jira] [Created] (SOLR-8222) Optimize count-only faceting when there are many expected matches-per-ord

2015-10-29 Thread Yonik Seeley (JIRA)
Yonik Seeley created SOLR-8222:
--

 Summary: Optimize count-only faceting when there are many expected 
matches-per-ord
 Key: SOLR-8222
 URL: https://issues.apache.org/jira/browse/SOLR-8222
 Project: Solr
  Issue Type: Improvement
  Components: Facet Module
Reporter: Yonik Seeley


This optimization for the JSON Facet API came up a few months ago on the 
mailing list (I think by Toke).
Basically, if one expects many hits per bucket, use a temporary array to 
accumulate segment ords and map them all at the end to global ords.  This saves 
redundant segOrd->globalOrd mappings at the cost of having to scan the temp 
array.
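The accumulation idea described above can be sketched with plain arrays (names and shapes invented for illustration, not the JSON Facet API internals): count per segment ordinal during collection, then roll the counts up to global ordinals exactly once at the end.

```java
public class SegOrdAccumulatorSketch {
    /**
     * Instead of mapping segOrd -> globalOrd once per matching doc,
     * accumulate counts per segment ord and map each distinct ord once.
     */
    static int[] countByGlobalOrd(int[] segOrdPerHit, int[] segToGlobalOrd, int numGlobalOrds) {
        int[] segCounts = new int[segToGlobalOrd.length];
        for (int segOrd : segOrdPerHit) {
            segCounts[segOrd]++; // cheap per-hit work: no ord mapping here
        }
        int[] globalCounts = new int[numGlobalOrds];
        for (int segOrd = 0; segOrd < segCounts.length; segOrd++) {
            if (segCounts[segOrd] > 0) {
                // One segOrd -> globalOrd mapping per distinct ord, not per hit.
                globalCounts[segToGlobalOrd[segOrd]] += segCounts[segOrd];
            }
        }
        return globalCounts;
    }
}
```

This trades one pass over the temporary per-segment array for the per-hit mapping cost, which pays off when there are many hits per ordinal.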






[jira] [Comment Edited] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980479#comment-14980479
 ] 

Yonik Seeley edited comment on SOLR-8220 at 10/29/15 4:22 PM:
--

bq. +1 to doing this. I think this will be useful for SOLR-5944, and was anyway 
planning to split this functionality out into its own issue.

Ah, right, atomic updates need this functionality as well if we are to allow 
docValues fields that aren't stored.
In that case I'll amend my previous comments around ResultContext... that's 
appropriate for decorating documents as they are being returned, but perhaps 
not low enough level for other use cases.

edit: I'm basically agreeing with Keith's original observation - "I think it 
should live closer to where stored fields are loaded in the SolrIndexSearcher."


was (Author: ysee...@gmail.com):
bq. +1 to doing this. I think this will be useful for SOLR-5944, and was anyway 
planning to split this functionality out into its own issue.

Ah, right, atomic updates needs this functionality as well if we are to allow 
docValues fields that aren't stored.
In that case I'll amend my previous comments around ResultContext... that's 
appropriate for decorating documents as they are being returned, but perhaps 
not low enough level for other use cases.


> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior






[jira] [Created] (LUCENE-6868) ParallelLeafReader.getTermVectors can indirectly load TVs multiple times

2015-10-29 Thread David Smiley (JIRA)
David Smiley created LUCENE-6868:


 Summary: ParallelLeafReader.getTermVectors can indirectly load TVs 
multiple times
 Key: LUCENE-6868
 URL: https://issues.apache.org/jira/browse/LUCENE-6868
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/termvectors
Reporter: David Smiley


ParallelLeafReader has a getTermVectors(docId) implementation that loops over 
each of its fields and calls getTermVector(docId,fieldName).  But the 
implementation of that loads all term vectors for all fields in that reader, 
yet ParallelLeafReader only wants one.  The effect is O(n^2) where 'n' is the 
number of fields, when we could get O(n) if we do it right. PLR should call 
getTermVectors(docId) (not referring to a specific field) for each of its 
readers and then aggregate them.

This wouldn't be such a problem if our term vector API/Codec was improved to 
not load all term vectors for all fields from disk at once.

Found via randomized-testing of IndexWriter auto-picking ParallelAtomicReader 
along with a test I have that asserts TVs aren't fetched for a doc more than 
once.






[jira] [Commented] (LUCENE-6863) Store sparse doc values more efficiently

2015-10-29 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980687#comment-14980687
 ] 

Adrien Grand commented on LUCENE-6863:
--

I ran some benchmarks with the geoname dataset which has a few sparse fields:
 - cc2: 3.2% of documents have this field, which has 573 unique values
 - admin4: 4.3% of documents have this field, which has 102950 unique values
 - admin3: 10.2% of documents have this field, which has 73120 unique values
 - admin2: 45.3% of documents have this field, which has 30603 unique values

First I enabled sparse compression on all fields, regardless of density to see 
how this compares to the delta compression that we use by default, and then ran 
two kinds of queries:
 - queries on a random partition of the index, which I guess would be the case 
when you have true sparse fields
 - a query only on documents that have a value, which I guess would be more 
realistic if you store several types of data in the same index that don't have 
the same fields

||Field||Disk usage for ordinals||Memory usage with sparse compression||Sort: MatchAllDocsQuery||Sort: term query matching 10% of docs||Sort: term query matching 1% of docs||Sort: term query matching only docs that have the field||
|cc2|-88%|1680 bytes|-27%|+25%|+58%|+208%|
|admin4|-86%|568 bytes|-20%|+7%|-20%|+214%|
|admin3|-67%|1312 bytes|+11%|+57%|+42%|+236%|
|admin2|+17%|2904 bytes|+132%|+275%|+331%|+221%|

The reduction in disk usage is significant, but so is the slowdown, especially 
when running a query that only matches docs that have a value. However memory 
usage looks acceptable to me for 10M docs.

I couldn't test with 3% as even the rarest field is contained by 3.2% of 
documents, but I updated the heuristic to require at least 1024 docs in the 
segment (like Robert suggested) and that less than 5% of docs have a value:

||Field||memory usage due to sparse compression||sort performance on a MatchAllDocsQuery||sort performance on a term query that matches 10% of docs||sort performance on a term query that matches 1% of docs||sort performance on a term query that matches docs that have the field||
|cc2|1680 bytes|-10%|+34%|+62%|+214%|
|admin4|568 bytes|-7%|+20%|-14%|+241%|
|admin3|576 bytes|+9%|+7%|+11%|+10%|
|admin2|1008 bytes|+1%|+8%|+9%|+11%|

To my surprise, admin2 and admin3 still used sparse compression on some 
segments. The reason is that documents with sparse values are not uniformly 
distributed in the dataset but rather clustered. I suspect this partially 
explains the slowdown for admin2/admin3; HotSpot may also dislike having more 
implementations to deal with.
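A minimal sketch of the density heuristic described above, using the two
thresholds from this comment (at least 1024 docs in the segment, less than 5%
of docs with a value). The class and method names are mine for illustration,
not Lucene's actual API:

```java
// Hypothetical sketch: decide whether a segment should use sparse
// compression for a field, based on segment size and field density.
public class SparseHeuristic {
    static final int MIN_SEGMENT_SIZE = 1024; // minimum docs, as Robert suggested
    static final double MAX_DENSITY = 0.05;   // less than 5% of docs have a value

    static boolean useSparseCompression(int docsWithField, int maxDoc) {
        return maxDoc >= MIN_SEGMENT_SIZE
            && (double) docsWithField / maxDoc < MAX_DENSITY;
    }

    public static void main(String[] args) {
        // cc2: 3.2% density on a 10M-doc segment -> sparse
        System.out.println(useSparseCompression(320_000, 10_000_000));   // true
        // admin2: 45.3% density -> stay with dense delta compression
        System.out.println(useSparseCompression(4_530_000, 10_000_000)); // false
    }
}
```

Note that, as observed above, the check runs per segment, so a field that is
dense overall can still be sparse in individual segments when its values are
clustered.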

> Store sparse doc values more efficiently
> 
>
> Key: LUCENE-6863
> URL: https://issues.apache.org/jira/browse/LUCENE-6863
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
> Attachments: LUCENE-6863.patch
>
>
> For both NUMERIC fields and ordinals of SORTED fields, we store data in a 
> dense way. As a consequence, if you have only 1000 documents out of 1B that 
> have a value, and 8 bits are required to store those 1000 numbers, we will 
> not require 1KB of storage, but 1GB.
> I suspect this mostly happens in abuse cases, but still it's a pity that we 
> explode storage requirements. We could try to detect sparsity and compress 
> accordingly.
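The storage blow-up in the description can be checked with quick arithmetic:
with dense encoding every document pays the per-value cost, even when only a
tiny fraction of documents have a value. The class below is a toy illustration,
not Lucene code:

```java
// Toy check of the numbers in the issue description: 1B docs, 8 bits per
// value, but only 1000 docs actually have a value.
public class SparseStorageMath {
    static long bytesNeeded(long docs, long bitsPerValue) {
        return docs * bitsPerValue / 8;
    }

    public static void main(String[] args) {
        // dense encoding pays for every doc in the index
        System.out.println(bytesNeeded(1_000_000_000L, 8)); // 1000000000 (~1GB)
        // an ideal sparse encoding pays only for docs that have a value
        System.out.println(bytesNeeded(1_000L, 8));         // 1000 (~1KB)
    }
}
```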



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7569) Create an API to force a leader election between nodes

2015-10-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980581#comment-14980581
 ] 

Mark Miller commented on SOLR-7569:
---

I think there are two main things that prevent a replica from becoming a 
leader: its last published state on the CloudDescriptor not being ACTIVE, and 
LIR. I thought we would want to clear LIR and perhaps add an ADMIN command that 
sets the last published state on the CloudDescriptor to ACTIVE for each replica.
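The preparation step described above can be sketched as a toy model: clear the
leader-initiated-recovery (LIR) flag and publish ACTIVE for each replica so it
becomes eligible for election. All names here are illustrative, not Solr's
actual classes:

```java
import java.util.*;

// Toy model of the force-leader preparation discussed above.
public class ForceLeaderSketch {
    enum State { ACTIVE, DOWN, RECOVERING }

    static class Replica {
        State lastPublished;
        boolean inLIR; // leader-initiated-recovery marker kept in ZooKeeper
        Replica(State s, boolean lir) { lastPublished = s; inLIR = lir; }
        boolean eligibleForLeadership() {
            return lastPublished == State.ACTIVE && !inLIR;
        }
    }

    // the hypothetical ADMIN command: clear LIR and publish ACTIVE everywhere
    static void forceLeaderPrep(List<Replica> replicas) {
        for (Replica r : replicas) {
            r.inLIR = false;
            r.lastPublished = State.ACTIVE;
        }
    }
}
```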

> Create an API to force a leader election between nodes
> --
>
> Key: SOLR-7569
> URL: https://issues.apache.org/jira/browse/SOLR-7569
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>  Labels: difficulty-medium, impact-high
> Attachments: SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, 
> SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, 
> SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, 
> SOLR-7569.patch, SOLR-7569.patch, SOLR-7569_lir_down_state_test.patch
>
>
> There are many reasons why Solr will not elect a leader for a shard e.g. all 
> replicas' last published state was recovery or due to bugs which cause a 
> leader to be marked as 'down'. While the best solution is that they never get 
> into this state, we need a manual way to fix this when it does get into this  
> state. Right now we can do a series of dance involving bouncing the node 
> (since recovery paths between bouncing and REQUESTRECOVERY are different), 
> but that is difficult when running a large cluster. Although it is possible 
> that such a manual API may lead to some data loss but in some cases, it is 
> the only possible option to restore availability.
> This issue proposes to build a new collection API which can be used to force 
> replicas into recovering a leader while avoiding data loss on a best effort 
> basis.






[JENKINS-EA] Lucene-Solr-trunk-Linux (64bit/jdk1.9.0-ea-b85) - Build # 14712 - Still Failing!

2015-10-29 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/14712/
Java: 64bit/jdk1.9.0-ea-b85 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

3 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.TestAuthenticationFramework

Error Message:
89 threads leaked from SUITE scope at 
org.apache.solr.cloud.TestAuthenticationFramework: 1) Thread[id=1288, 
name=qtp540269018-1288-selector-ServerConnectorManager@316ec176/0, 
state=RUNNABLE, group=TGRP-TestAuthenticationFramework] at 
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at 
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) at 
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) at 
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at 
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at 
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101) at 
org.eclipse.jetty.io.SelectorManager$ManagedSelector.select(SelectorManager.java:600)
 at 
org.eclipse.jetty.io.SelectorManager$ManagedSelector.run(SelectorManager.java:549)
 at 
org.eclipse.jetty.util.thread.NonBlockingThread.run(NonBlockingThread.java:52)  
   at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
 at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555) 
at java.lang.Thread.run(Thread.java:747)2) Thread[id=1338, 
name=org.eclipse.jetty.server.session.HashSessionManager@be2bfcaTimer, 
state=TIMED_WAITING, group=TGRP-TestAuthenticationFramework] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
 at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)   
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:747)3) Thread[id=1336, 
name=org.eclipse.jetty.server.session.HashSessionManager@4cc8be8Timer, 
state=TIMED_WAITING, group=TGRP-TestAuthenticationFramework] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
 at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)   
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:747)4) Thread[id=1390, 
name=Thread-561, state=WAITING, group=TGRP-TestAuthenticationFramework] 
at java.lang.Object.wait(Native Method) at 
java.lang.Object.wait(Object.java:516) at 
org.apache.solr.core.CloserThread.run(CoreContainer.java:1155)5) 
Thread[id=1374, name=jetty-launcher-198-thread-3-SendThread(127.0.0.1:33236), 
state=TIMED_WAITING, group=TGRP-TestAuthenticationFramework] at 
java.lang.Thread.sleep(Native Method) at 
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:994)6) 
Thread[id=1379, name=jetty-launcher-198-thread-4-SendThread(127.0.0.1:33236), 
state=TIMED_WAITING, group=TGRP-TestAuthenticationFramework] at 
java.lang.Thread.sleep(Native Method) at 
org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:101)
 at 
org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:940)
 at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1003)
7) Thread[id=1325, name=qtp723096975-1325, state=TIMED_WAITING, 
group=TGRP-TestAuthenticationFramework] at sun.misc.Unsafe.park(Native 
Method) at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) 
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
 at 
org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:389) 
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:531)
   

[jira] [Commented] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980479#comment-14980479
 ] 

Yonik Seeley commented on SOLR-8220:


bq. +1 to doing this. I think this will be useful for SOLR-5944, and was anyway 
planning to split this functionality out into its own issue.

Ah, right, atomic updates need this functionality as well if we are to allow 
docValues fields that aren't stored.
In that case I'll amend my previous comments around ResultContext: it's 
appropriate for decorating documents as they are being returned, but perhaps 
not low-level enough for other use cases.


> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior
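The fl semantics proposed in the description above can be sketched with a toy
resolver. The schema encoding and all names are mine for illustration; this is
not Solr's actual field-loading code:

```java
import java.util.*;

// Toy model of the proposed fl behavior. Field flags: "s" = stored,
// "d" = docValues (a field may have both).
public class FlResolver {
    static List<String> resolve(String fl, Map<String, String> schema) {
        List<String> out = new ArrayList<>();
        if (fl.equals("*")) {            // current behavior: stored fields only
            for (Map.Entry<String, String> e : schema.entrySet())
                if (e.getValue().contains("s")) out.add(e.getKey());
        } else if (fl.equals("+")) {     // proposed: stored plus docValues fields
            for (Map.Entry<String, String> e : schema.entrySet())
                if (e.getValue().contains("s") || e.getValue().contains("d"))
                    out.add(e.getKey());
        } else {                         // explicit field name
            String flags = schema.get(fl);
            // serve from stored fields if stored; otherwise fall back to docValues
            if (flags != null && (flags.contains("s") || flags.contains("d")))
                out.add(fl);
        }
        return out;
    }
}
```

In this model a docValues-only field is invisible to fl="*" but returned for
fl="+" and when requested by name, matching the proposal above.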






Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_60) - Build # 14711 - Failure!

2015-10-29 Thread Michael McCandless
Ooh, I'll dig.

Mike McCandless

http://blog.mikemccandless.com


On Thu, Oct 29, 2015 at 9:38 AM, Policeman Jenkins Server
 wrote:
> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/14711/
> Java: 64bit/jdk1.8.0_60 -XX:-UseCompressedOops -XX:+UseSerialGC
>
> 1 tests failed.
> FAILED:  org.apache.lucene.index.TestDimensionalValues.testRandomBinaryMedium
>
> Error Message:
> docID=857 expected: but was:
>
> Stack Trace:
> java.lang.AssertionError: docID=857 expected: but was:
> at 
> __randomizedtesting.SeedInfo.seed([831907F8E7511A35:F43480B833FD6FE2]:0)
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.failNotEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:128)
> at 
> org.apache.lucene.index.TestDimensionalValues.verify(TestDimensionalValues.java:994)
> at 
> org.apache.lucene.index.TestDimensionalValues.verify(TestDimensionalValues.java:791)
> at 
> org.apache.lucene.index.TestDimensionalValues.doTestRandomBinary(TestDimensionalValues.java:781)
> at 
> org.apache.lucene.index.TestDimensionalValues.testRandomBinaryMedium(TestDimensionalValues.java:391)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1660)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:866)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:902)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:916)
> at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:875)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:777)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:811)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:822)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
> at 
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
> at j

[jira] [Commented] (SOLR-8220) Read field from docValues for non stored fields

2015-10-29 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980449#comment-14980449
 ] 

Ishan Chattopadhyaya commented on SOLR-8220:


+1 to doing this. I think this will be useful for SOLR-5944, and I was anyway 
planning to split this functionality out into its own issue. Also, if we go 
forward with the \_version\_ field as a docValues field, then this becomes 
important. In current Solr, the way to read non-stored docValues fields is to 
use a function query, field(mydvfield).

[~k317h] Are you planning to work on this / do you have a patch for it? If 
not, I can give it a try and have SOLR-5944 depend on it.

> Read field from docValues for non stored fields
> ---
>
> Key: SOLR-8220
> URL: https://issues.apache.org/jira/browse/SOLR-8220
> Project: Solr
>  Issue Type: Improvement
>Reporter: Keith Laban
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values for docValues, 
> facets, analytics, streaming, etc, all seem to be doing their own ways, 
> perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>-- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior






[jira] [Commented] (LUCENE-6276) Add matchCost() api to TwoPhaseDocIdSetIterator

2015-10-29 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980447#comment-14980447
 ] 

Adrien Grand commented on LUCENE-6276:
--

Some suggestions:
 - could the match costs be computed eagerly instead of lazily, like we compute 
the costs of DocIdSetIterators?
 - can you move the utility methods that compute the costs of phrases from 
TwoPhaseIterator into PhraseWeight/SpanNearQuery? I don't like leaking 
implementation details of specific TwoPhaseIterators into TwoPhaseIterator.
 - some implementations of matchCost return 0 (e.g. RandomAccessWeight); I'm 
fine with not implementing every match cost for now, but could you leave a TODO 
and use a higher constant (e.g. 100)? I think it's safer to assume that such 
implementations are costly until they are implemented correctly.
 - re: ReqExclScorer: indeed, I think we should try to take into account the 
cost of advancing the excluded scorer. If this is complicated, I'm fine with 
tackling this problem later.
 - re: DisjunctionScorer: I think your current definition of the match cost is 
fine. Maybe we could improve it by weighting the match cost of each 
TwoPhaseIterator by the cost of its approximation. That would make sense since 
we call TwoPhaseIterator.matches() more often when the approximation matches 
more documents.
 - AssertingSpans/AssertingScorer ensure that the match cost is >= 0. Maybe we 
should also check for NaN and document the acceptable return values of 
#matchCost?
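The DisjunctionScorer weighting idea above amounts to a weighted average: each
sub-iterator's matchCost is weighted by its approximation's cost (roughly, how
many documents the approximation matches), since matches() is called more often
for approximations that match more documents. A minimal sketch with
illustrative names, not Lucene's actual code:

```java
// Hypothetical weighted match cost for a disjunction of two-phase iterators.
public class WeightedMatchCost {
    static float weightedMatchCost(float[] matchCosts, long[] approxCosts) {
        double num = 0, den = 0;
        for (int i = 0; i < matchCosts.length; i++) {
            num += matchCosts[i] * approxCosts[i]; // cost weighted by frequency
            den += approxCosts[i];
        }
        return den == 0 ? 0f : (float) (num / den);
    }

    public static void main(String[] args) {
        // a cheap check that runs 99% of the time dominates an expensive rare one
        System.out.println(weightedMatchCost(
            new float[] {1f, 100f}, new long[] {99, 1})); // 1.99
    }
}
```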

> Add matchCost() api to TwoPhaseDocIdSetIterator
> ---
>
> Key: LUCENE-6276
> URL: https://issues.apache.org/jira/browse/LUCENE-6276
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
> Attachments: LUCENE-6276-ExactPhraseOnly.patch, 
> LUCENE-6276-NoSpans.patch, LUCENE-6276-NoSpans2.patch, LUCENE-6276.patch, 
> LUCENE-6276.patch, LUCENE-6276.patch, LUCENE-6276.patch, LUCENE-6276.patch
>
>
> We could add a method like TwoPhaseDISI.matchCost() defined as something like 
> estimate of nanoseconds or similar. 
> ConjunctionScorer could use this method to sort its 'twoPhaseIterators' array 
> so that cheaper ones are called first. Today it has no idea if one scorer is 
> a simple phrase scorer on a short field vs another that might do some geo 
> calculation or more expensive stuff.
> PhraseScorers could implement this based on index statistics (e.g. 
> totalTermFreq/maxDoc)
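The ConjunctionScorer idea in the description can be sketched as sorting the
two-phase iterators by matchCost so the cheapest confirmation runs first and
can reject a candidate before the expensive checks run. The arrays and names
below are illustrative, not Lucene's actual data structures:

```java
// Hypothetical cheapest-first ordering of two-phase checks in a conjunction.
public class CheapestFirst {
    // paired insertion sort keyed on matchCost; names follow their costs
    static void sortByMatchCost(float[] matchCosts, String[] names) {
        for (int i = 1; i < matchCosts.length; i++) {
            float c = matchCosts[i];
            String n = names[i];
            int j = i - 1;
            while (j >= 0 && matchCosts[j] > c) {
                matchCosts[j + 1] = matchCosts[j];
                names[j + 1] = names[j];
                j--;
            }
            matchCosts[j + 1] = c;
            names[j + 1] = n;
        }
    }
}
```

With this ordering, a cheap phrase check on a short field would be consulted
before, say, an expensive geo calculation.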






[jira] [Commented] (SOLR-7569) Create an API to force a leader election between nodes

2015-10-29 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980440#comment-14980440
 ] 

Ishan Chattopadhyaya commented on SOLR-7569:


bq. It seems like what we really want is to make sure the last published state 
for each replica does not prevent it from becoming the leader?
It seems to me that there's no easy way to set the last published state of a 
replica without the replicas doing it themselves. Do you think we should be 
doing that instead of marking them as active? Or do you think that just 
clearing the LIR is enough?

> Create an API to force a leader election between nodes
> --
>
> Key: SOLR-7569
> URL: https://issues.apache.org/jira/browse/SOLR-7569
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>  Labels: difficulty-medium, impact-high
> Attachments: SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, 
> SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, 
> SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, 
> SOLR-7569.patch, SOLR-7569.patch, SOLR-7569_lir_down_state_test.patch
>
>
> There are many reasons why Solr will not elect a leader for a shard e.g. all 
> replicas' last published state was recovery or due to bugs which cause a 
> leader to be marked as 'down'. While the best solution is that they never get 
> into this state, we need a manual way to fix this when it does get into this  
> state. Right now we can do a series of dance involving bouncing the node 
> (since recovery paths between bouncing and REQUESTRECOVERY are different), 
> but that is difficult when running a large cluster. Although it is possible 
> that such a manual API may lead to some data loss but in some cases, it is 
> the only possible option to restore availability.
> This issue proposes to build a new collection API which can be used to force 
> replicas into recovering a leader while avoiding data loss on a best effort 
> basis.






[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_60) - Build # 14711 - Failure!

2015-10-29 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/14711/
Java: 64bit/jdk1.8.0_60 -XX:-UseCompressedOops -XX:+UseSerialGC

1 tests failed.
FAILED:  org.apache.lucene.index.TestDimensionalValues.testRandomBinaryMedium

Error Message:
docID=857 expected: but was:

Stack Trace:
java.lang.AssertionError: docID=857 expected: but was:
at 
__randomizedtesting.SeedInfo.seed([831907F8E7511A35:F43480B833FD6FE2]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at 
org.apache.lucene.index.TestDimensionalValues.verify(TestDimensionalValues.java:994)
at 
org.apache.lucene.index.TestDimensionalValues.verify(TestDimensionalValues.java:791)
at 
org.apache.lucene.index.TestDimensionalValues.doTestRandomBinary(TestDimensionalValues.java:781)
at 
org.apache.lucene.index.TestDimensionalValues.testRandomBinaryMedium(TestDimensionalValues.java:391)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1660)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:866)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:902)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:916)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:875)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:777)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:811)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:822)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 655 lines...]
   [junit4] Suite: org.apache.lucene.index.TestDimensionalValues
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestDimensionalValues -Dtests.method=testRandomBinaryMedium 
-Dtests.seed=831907F8E7511A35 -Dtests.multiplier

[jira] [Commented] (LUCENE-6863) Store sparse doc values more efficiently

2015-10-29 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980418#comment-14980418
 ] 

Adrien Grand commented on LUCENE-6863:
--

bq. Is it okay if we mark LUCENE-5688 and LUCENE-4921 as duplicates of this 
Jira, or could there still be plans for a specialized doc values format?

I think it would be a bit premature to say the other JIRAs are superseded by 
this one; it's not yet clear whether the approach proposed here is actually a 
better idea and/or could make it into the default codec. I suggest we wait to 
see how happy the benchmarks are with this patch first.

> Store sparse doc values more efficiently
> 
>
> Key: LUCENE-6863
> URL: https://issues.apache.org/jira/browse/LUCENE-6863
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
> Attachments: LUCENE-6863.patch
>
>
> For both NUMERIC fields and ordinals of SORTED fields, we store data in a 
> dense way. As a consequence, if you have only 1000 documents out of 1B that 
> have a value, and 8 bits are required to store those 1000 numbers, we will 
> not require 1KB of storage, but 1GB.
> I suspect this mostly happens in abuse cases, but still it's a pity that we 
> explode storage requirements. We could try to detect sparsity and compress 
> accordingly.






[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 836 - Still Failing

2015-10-29 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/836/

3 tests failed.
FAILED:  
org.apache.solr.cloud.OverseerTest.testExternalClusterStateChangeBehavior

Error Message:
Illegal state, was: down expected:active clusterState:live 
nodes:[]collections:{c1=DocCollection(c1)={   "shards":{"shard1":{   
"parent":null,   "range":null,   "state":"active",   
"replicas":{"core_node1":{   "base_url":"http://127.0.0.1/solr";,
   "node_name":"node1",   "core":"core1",   "roles":"", 
  "state":"down",   "router":{"name":"implicit"}}, 
test=LazyCollectionRef(test)}

Stack Trace:
java.lang.AssertionError: Illegal state, was: down expected:active 
clusterState:live nodes:[]collections:{c1=DocCollection(c1)={
  "shards":{"shard1":{
  "parent":null,
  "range":null,
  "state":"active",
  "replicas":{"core_node1":{
  "base_url":"http://127.0.0.1/solr";,
  "node_name":"node1",
  "core":"core1",
  "roles":"",
  "state":"down",
  "router":{"name":"implicit"}}, test=LazyCollectionRef(test)}
at 
__randomizedtesting.SeedInfo.seed([AA04CF969E3D1668:C21ACC7A7CAD4C26]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.OverseerTest.verifyStatus(OverseerTest.java:601)
at 
org.apache.solr.cloud.OverseerTest.testExternalClusterStateChangeBehavior(OverseerTest.java:1261)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1660)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:866)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:902)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:875)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:777)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:811)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:822)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.e

[JENKINS-EA] Lucene-Solr-5.x-Linux (32bit/jdk1.9.0-ea-b85) - Build # 14420 - Failure!

2015-10-29 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/14420/
Java: 32bit/jdk1.9.0-ea-b85 -client -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at 
__randomizedtesting.SeedInfo.seed([6EF26D526516A41B:5F49D367C029B4CB]:0)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:766)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:733)
at 
org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts(SpellCheckCollatorTest.java:530)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:520)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1665)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:864)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:900)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:914)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:873)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:775)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:809)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:820)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:747)
Caused by: java.lang.RuntimeException: REQUEST FAILED: 
xpath=//lst[@name='spellcheck']/lst[@name='collations']/lst[@name='collation']/int[@name='hits'
 and 6 <= . and . <= 10]
xml response was: 

031918everyotherteststop:everyother12eve

[jira] [Resolved] (LUCENE-6862) Upgrade of RandomizedRunner to version 2.2.0

2015-10-29 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-6862.
-
Resolution: Fixed

> Upgrade of RandomizedRunner to version 2.2.0
> 
>
> Key: LUCENE-6862
> URL: https://issues.apache.org/jira/browse/LUCENE-6862
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6862.patch
>
>







[jira] [Resolved] (LUCENE-6478) Test execution can hang with java.security.debug

2015-10-29 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-6478.
-
   Resolution: Fixed
Fix Version/s: 5.4
   Trunk

> Test execution can hang with java.security.debug
> 
>
> Key: LUCENE-6478
> URL: https://issues.apache.org/jira/browse/LUCENE-6478
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
> Fix For: Trunk, 5.4
>
>
> As reported by Robert:
> {code}
> # clone trunk
> cd lucene/core/
> ant test -Dargs="-Djava.security.debug=access:failure" -Dtestcase=TestDemo
> {code}
> Hangs the test runner. The same problem appears to be present in ES builds 
> too. It seems like some kind of weird stream buffer problem, the security 
> framework seems to be writing to the native descriptors directly. Will have 
> to dig (deep...).
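The suspected mechanism, output written straight to the native stderr descriptor slipping past a `System.setErr(...)` capture, can be demonstrated in isolation. This is a sketch of the hypothesis only; it is not the runner's or the security framework's actual code.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;

// Demonstrates how writes to FileDescriptor.err bypass a System.setErr
// capture, the kind of stream plumbing mismatch suspected in this hang.
class DirectStderrDemo {
    static String captureWhileWriting() {
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        PrintStream original = System.err;
        System.setErr(new PrintStream(captured, true));
        try {
            System.err.println("via System.err"); // lands in the capture
            new PrintStream(new FileOutputStream(FileDescriptor.err), true)
                .println("via FileDescriptor.err"); // goes to real stderr
        } finally {
            System.setErr(original);
        }
        return captured.toString();
    }
}
```

Only the first line ends up in the captured buffer; the second reaches the process's real stderr, invisible to whatever is reading the redirected stream.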






[jira] [Commented] (LUCENE-6862) Upgrade of RandomizedRunner to version 2.2.0

2015-10-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980146#comment-14980146
 ] 

ASF subversion and git services commented on LUCENE-6862:
-

Commit 1711205 from [~dawidweiss] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1711205 ]

LUCENE-6862: Upgrade of RandomizedRunner to version 2.2.0

> Upgrade of RandomizedRunner to version 2.2.0
> 
>
> Key: LUCENE-6862
> URL: https://issues.apache.org/jira/browse/LUCENE-6862
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6862.patch
>
>







[jira] [Commented] (LUCENE-6867) Unmark ignored geo tests (due to RR incompatibility)

2015-10-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980143#comment-14980143
 ] 

ASF subversion and git services commented on LUCENE-6867:
-

Commit 1711204 from [~dawidweiss] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1711204 ]

LUCENE-6867: marking certain tests as ignored because of incompatibility in 
RandomizedTesting versions.

> Unmark ignored geo tests (due to RR incompatibility)
> 
>
> Key: LUCENE-6867
> URL: https://issues.apache.org/jira/browse/LUCENE-6867
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: David Smiley
>Priority: Minor
>







[jira] [Commented] (LUCENE-6862) Upgrade of RandomizedRunner to version 2.2.0

2015-10-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980127#comment-14980127
 ] 

ASF subversion and git services commented on LUCENE-6862:
-

Commit 1711203 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1711203 ]

LUCENE-6862: Upgrade of RandomizedRunner to version 2.2.0

> Upgrade of RandomizedRunner to version 2.2.0
> 
>
> Key: LUCENE-6862
> URL: https://issues.apache.org/jira/browse/LUCENE-6862
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6862.patch
>
>







[jira] [Commented] (LUCENE-6867) Unmark ignored geo tests (due to RR incompatibility)

2015-10-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980110#comment-14980110
 ] 

ASF subversion and git services commented on LUCENE-6867:
-

Commit 1711201 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1711201 ]

LUCENE-6867: marking certain tests as ignored because of incompatibility in 
RandomizedTesting versions.

> Unmark ignored geo tests (due to RR incompatibility)
> 
>
> Key: LUCENE-6867
> URL: https://issues.apache.org/jira/browse/LUCENE-6867
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: David Smiley
>Priority: Minor
>







[jira] [Updated] (LUCENE-6867) Unmark ignored geo tests (due to RR incompatibility)

2015-10-29 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-6867:

Assignee: David Smiley

> Unmark ignored geo tests (due to RR incompatibility)
> 
>
> Key: LUCENE-6867
> URL: https://issues.apache.org/jira/browse/LUCENE-6867
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: David Smiley
>Priority: Minor
>







[jira] [Created] (LUCENE-6867) Unmark ignored geo tests (due to RR incompatibility)

2015-10-29 Thread Dawid Weiss (JIRA)
Dawid Weiss created LUCENE-6867:
---

 Summary: Unmark ignored geo tests (due to RR incompatibility)
 Key: LUCENE-6867
 URL: https://issues.apache.org/jira/browse/LUCENE-6867
 Project: Lucene - Core
  Issue Type: Task
Reporter: Dawid Weiss
Priority: Minor









[jira] [Commented] (SOLR-8194) Improve error reporting UpdateRequest

2015-10-29 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980067#comment-14980067
 ] 

Markus Jelsma commented on SOLR-8194:
-

Agreed! If the call immediately throws an NPE, it is clear enough.

> Improve error reporting UpdateRequest
> -
>
> Key: SOLR-8194
> URL: https://issues.apache.org/jira/browse/SOLR-8194
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.3
>Reporter: Markus Jelsma
>Assignee: Alan Woodward
>Priority: Trivial
> Fix For: 5.4
>
>
> SolrJ throws an NPE if null documents are added to UpdateRequest. It should 
> report a proper error message so I don't get confused the next time I screw 
> up. Please see: 
> https://www.mail-archive.com/solr-user@lucene.apache.org/msg115074.html
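The fix being agreed on above could look roughly like this: reject null documents up front with a descriptive message rather than letting a bare NPE surface deep inside SolrJ. The class and method names here are illustrative stand-ins, not the actual SolrJ API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch of a fail-fast null guard for an update request.
class UpdateRequestSketch {
    private final List<Object> documents = new ArrayList<>();

    void add(Object doc) {
        // Objects.requireNonNull throws an NPE immediately, but with a
        // message that tells the caller exactly what went wrong.
        documents.add(Objects.requireNonNull(
            doc, "Cannot add a null document to an UpdateRequest"));
    }

    int size() { return documents.size(); }
}
```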






[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_60) - Build # 14708 - Still Failing!

2015-10-29 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/14708/
Java: 64bit/jdk1.8.0_60 -XX:-UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED:  org.apache.solr.cloud.CdcrReplicationHandlerTest.doTest

Error Message:
There are still nodes recoverying - waited for 330 seconds

Stack Trace:
java.lang.AssertionError: There are still nodes recoverying - waited for 330 
seconds
at 
__randomizedtesting.SeedInfo.seed([322E670852B90B0D:956ADFAC3F0218B4]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:172)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:133)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:128)
at 
org.apache.solr.cloud.BaseCdcrDistributedZkTest.waitForRecoveriesToFinish(BaseCdcrDistributedZkTest.java:465)
at 
org.apache.solr.cloud.BaseCdcrDistributedZkTest.clearSourceCollection(BaseCdcrDistributedZkTest.java:319)
at 
org.apache.solr.cloud.CdcrReplicationHandlerTest.doTestPartialReplicationWithTruncatedTlog(CdcrReplicationHandlerTest.java:121)
at 
org.apache.solr.cloud.CdcrReplicationHandlerTest.doTest(CdcrReplicationHandlerTest.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1665)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:864)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:900)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:914)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:963)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:938)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:873)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:775)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:809)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:820)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.Statement