[jira] [Created] (SOLR-11440) TestLeaderElectionZkExpiry failures after autoscaling merges
Noble Paul created SOLR-11440: - Summary: TestLeaderElectionZkExpiry failures after autoscaling merges Key: SOLR-11440 URL: https://issues.apache.org/jira/browse/SOLR-11440 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Noble Paul {code} [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestLeaderElectionZkExpiry -Dtests.method=testLeaderElectionWithZkExpiry -Dtests.seed=936BBD073C4C1EE2 -Dtests.slow=true -Dtests.locale=fi -Dtests.timezone=Africa/Sao_Tome -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1 [junit4] ERROR 13.6s J11 | TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry <<< [junit4]> Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=7154, name=OverseerAutoScalingTriggerThread-98770164405436418-dummy.host.com:8984_solr-n_00, state=RUNNABLE, group=Overseer autoscaling triggers] [junit4]> Caused by: org.apache.solr.common.SolrException: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling/events/.auto_add_replicas [junit4]>at __randomizedtesting.SeedInfo.seed([936BBD073C4C1EE2]:0) [junit4]>at org.apache.solr.cloud.ZkDistributedQueue.<init>(ZkDistributedQueue.java:107) [junit4]>at org.apache.solr.cloud.autoscaling.TriggerEventQueue.<init>(TriggerEventQueue.java:44) [junit4]>at org.apache.solr.cloud.autoscaling.ScheduledTriggers$ScheduledTrigger.<init>(ScheduledTriggers.java:398) [junit4]>at org.apache.solr.cloud.autoscaling.ScheduledTriggers.add(ScheduledTriggers.java:149) [junit4]>at org.apache.solr.cloud.autoscaling.OverseerTriggerThread.run(OverseerTriggerThread.java:220) [junit4]>at java.lang.Thread.run(Thread.java:745) [junit4]> Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling/events/.auto_add_replicas [junit4]>at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) 
[junit4]>at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) [junit4]>at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1102) [junit4]>at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:323) [junit4]>at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:320) [junit4]>at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60) [junit4]>at org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:320) [junit4]>at org.apache.solr.common.cloud.ZkCmdExecutor.ensureExists(ZkCmdExecutor.java:93) [junit4]>at org.apache.solr.common.cloud.ZkCmdExecutor.ensureExists(ZkCmdExecutor.java:78) [junit4]>at org.apache.solr.cloud.ZkDistributedQueue.<init>(ZkDistributedQueue.java:105) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-7.x-Windows (64bit/jdk1.8.0_144) - Build # 232 - Still Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Windows/232/ Java: 64bit/jdk1.8.0_144 -XX:+UseCompressedOops -XX:+UseG1GC 3 tests failed. FAILED: org.apache.solr.core.TestLazyCores.testNoCommit Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([24ED3E226C262865:FB8D9FF3A7014BC0]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:884) at org.apache.solr.core.TestLazyCores.check10(TestLazyCores.java:847) at org.apache.solr.core.TestLazyCores.testNoCommit(TestLazyCores.java:829) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//result[@numFound='10'] xml response was: 00*:* request was:q=*:* at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:877) ... 41 more FAILED:
[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-9) - Build # 20609 - Still Failing!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20609/ Java: 64bit/jdk-9 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC --illegal-access=deny 3 tests failed. FAILED: org.apache.solr.cloud.TestCollectionsAPIViaSolrCloudCluster.testCollectionCreateSearchDelete Error Message: Error from server at https://127.0.0.1:34373/solr/testcollection_shard1_replica_n2: Expected mime type application/octet-stream but got text/html. Error 404 HTTP ERROR: 404 Problem accessing /solr/testcollection_shard1_replica_n2/update. Reason: Can not find: /solr/testcollection_shard1_replica_n2/update Powered by Jetty:// 9.3.20.v20170531 Stack Trace: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at https://127.0.0.1:34373/solr/testcollection_shard1_replica_n2: Expected mime type application/octet-stream but got text/html. Error 404 HTTP ERROR: 404 Problem accessing /solr/testcollection_shard1_replica_n2/update. Reason: Can not find: /solr/testcollection_shard1_replica_n2/update Powered by Jetty:// 9.3.20.v20170531 at __randomizedtesting.SeedInfo.seed([28D3DB2A20700A5D:8B29758FA798E0F8]:0) at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:539) at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:993) at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:862) at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:793) at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178) at org.apache.solr.client.solrj.request.UpdateRequest.commit(UpdateRequest.java:233) at org.apache.solr.cloud.TestCollectionsAPIViaSolrCloudCluster.testCollectionCreateSearchDelete(TestCollectionsAPIViaSolrCloudCluster.java:167) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at
[JENKINS] Lucene-Solr-7.0-Windows (64bit/jdk-9) - Build # 186 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.0-Windows/186/ Java: 64bit/jdk-9 -XX:+UseCompressedOops -XX:+UseG1GC --illegal-access=deny 2 tests failed. FAILED: org.apache.solr.cloud.TestTlogReplica.testRecovery Error Message: Replica core_node1 not up to date after 0 seconds expected:<4> but was:<2> Stack Trace: java.lang.AssertionError: Replica core_node1 not up to date after 0 seconds expected:<4> but was:<2> at __randomizedtesting.SeedInfo.seed([242CC9592BB21A6F:E5DCB0F506E2D0C8]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.solr.cloud.TestTlogReplica.waitForNumDocsInAllReplicas(TestTlogReplica.java:743) at org.apache.solr.cloud.TestTlogReplica.waitForNumDocsInAllReplicas(TestTlogReplica.java:731) at org.apache.solr.cloud.TestTlogReplica.testRecovery(TestTlogReplica.java:538) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192424#comment-16192424 ] Yonik Seeley commented on LUCENE-7976: -- There are plenty of use-cases for a forceMerge or optimize to be done either in special cases or on a fixed schedule. It's a deficiency that the default merge policy can't deal more intelligently with that. Merge policies are pluggable though, so we may be able to deal with this at either the Lucene or Solr level. No need for 100% of all devs to agree ;-) bq. any segment with > X% deleted documents would be merged or rewritten NO MATTER HOW LARGE. +1 for the idea... I haven't thought about all the ways it might interact with other things, but I like it in general. Segments with X% deleted docs will be candidates for merging. Max segment sizes will still be targeted of course, so if its estimated size after merging with smaller segments is less than the max seg size, we're good. If not, merge it by itself (i.e. expungeDeletes). > Add a parameter to TieredMergePolicy to merge segments that have more than X > percent deleted documents > -- > > Key: LUCENE-7976 > URL: https://issues.apache.org/jira/browse/LUCENE-7976 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Erick Erickson > > We're seeing situations "in the wild" where there are very large indexes (on > disk) handled quite easily in a single Lucene index. This is particularly > true as features like docValues move data into MMapDirectory space. The > current TMP algorithm allows on the order of 50% deleted documents as per a > dev list conversation with Mike McCandless (and his blog here: > https://www.elastic.co/blog/lucenes-handling-of-deleted-documents). > Especially in the current era of very large indexes in aggregate, (think many > TB) solutions like "you need to distribute your collection over more shards" > become very costly. 
Additionally, the tempting "optimize" button exacerbates > the issue since once you form, say, a 100G segment (by > optimizing/forceMerging) it is not eligible for merging until 97.5G of the > docs in it are deleted (current default 5G max segment size). > The proposal here would be to add a new parameter to TMP, something like > (no, that's not a serious name, suggestions > welcome) which would default to 100 (or the same behavior we have now). > So if I set this parameter to, say, 20%, and the max segment size stays at > 5G, the following would happen when segments were selected for merging: > > any segment with > 20% deleted documents would be merged or rewritten NO > > MATTER HOW LARGE. There are two cases, > >> the segment has < 5G "live" docs. In that case it would be merged with > >> smaller segments to bring the resulting segment up to 5G. If no smaller > >> segments exist, it would just be rewritten > >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). > >> It would be rewritten into a single segment removing all deleted docs no > >> matter how big it is to start. The 100G example above would be rewritten > >> to an 80G segment for instance. > Of course this would lead to potentially much more I/O which is why the > default would be the same behavior we see now. As it stands now, though, > there's no way to recover from an optimize/forceMerge except to re-index from > scratch. We routinely see 200G-300G Lucene indexes at this point "in the > wild" with 10s of shards replicated 3 or more times. And that doesn't even > include having these over HDFS. > Alternatives welcome! Something like the above seems minimally invasive. A > new merge policy is certainly an alternative.
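The two-case rule in the proposal above (an over-threshold segment is merged with smaller segments when its live docs fit under the max segment size, otherwise rewritten by itself) can be sketched roughly as follows. This is an illustrative model only: {{isRewriteCandidate}}, {{action}}, and their parameters are hypothetical names, not TieredMergePolicy's actual API.

```java
public class DeletePctMergeSketch {

    /** True when a segment's deleted-doc percentage exceeds the threshold (default 100 = today's behavior). */
    static boolean isRewriteCandidate(int maxDoc, int delCount, double deletesPctAllowed) {
        double pctDeleted = 100.0 * delCount / maxDoc;
        return pctDeleted > deletesPctAllowed;
    }

    /** The two cases from the proposal, keyed on the segment's "live" size. */
    static String action(long liveBytes, long maxSegmentBytes) {
        if (liveBytes < maxSegmentBytes) {
            // Case 1: merge with smaller segments to bring the result up toward the max size.
            return "merge-with-smaller";
        }
        // Case 2: e.g. a 100G forceMerged segment; rewrite it alone, dropping deleted docs.
        return "rewrite-alone";
    }

    public static void main(String[] args) {
        // 25% deleted vs. a 20% threshold: a candidate no matter how large.
        System.out.println(isRewriteCandidate(1_000_000, 250_000, 20.0));
        // 80G of live docs vs. a 5G cap: rewritten by itself (the 100G -> 80G example).
        System.out.println(action(80L << 30, 5L << 30));
    }
}
```

The default threshold of 100 preserves current behavior, matching the "default to 100" point in the quoted proposal.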
[jira] [Commented] (LUCENE-7985) Update forbiddenapis to 2.4.1
[ https://issues.apache.org/jira/browse/LUCENE-7985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192417#comment-16192417 ] ASF subversion and git services commented on LUCENE-7985: - Commit 080232f3d1766da66bb1883378692409c89e986b in lucene-solr's branch refs/heads/master from [~steve_rowe] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=080232f ] LUCENE-7985: maven build: update forbiddenapis to 2.4.1 > Update forbiddenapis to 2.4.1 > - > > Key: LUCENE-7985 > URL: https://issues.apache.org/jira/browse/LUCENE-7985 > Project: Lucene - Core > Issue Type: Improvement > Components: general/build >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Labels: Java9, forbiddenapis > Fix For: 7.1, master (8.0) > > Attachments: LUCENE-7985.patch > > > Forbidden APIs 2.4.1 was released a minute ago: It mainly adds full support > for Java 9 by upgrading to ASM 6.0. It also adds Gradle 4 support (not > relevant to Lucene/Solr). > This will also update Groovy to latest 2.4.12 version (also for Java 9 > support).
[jira] [Commented] (LUCENE-7985) Update forbiddenapis to 2.4.1
[ https://issues.apache.org/jira/browse/LUCENE-7985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192416#comment-16192416 ] ASF subversion and git services commented on LUCENE-7985: - Commit 26adc10445138e2a3488ea7b6ecf81534d3d4005 in lucene-solr's branch refs/heads/branch_7x from [~steve_rowe] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=26adc10 ] LUCENE-7985: maven build: update forbiddenapis to 2.4.1 > Update forbiddenapis to 2.4.1 > - > > Key: LUCENE-7985 > URL: https://issues.apache.org/jira/browse/LUCENE-7985 > Project: Lucene - Core > Issue Type: Improvement > Components: general/build >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Labels: Java9, forbiddenapis > Fix For: 7.1, master (8.0) > > Attachments: LUCENE-7985.patch > > > Forbidden APIs 2.4.1 was released a minute ago: It mainly adds full support > for Java 9 by upgrading to ASM 6.0. It also adds Gradle 4 support (not > relevant to Lucene/Solr). > This will also update Groovy to latest 2.4.12 version (also for Java 9 > support).
[JENKINS] Lucene-Solr-Tests-7.x - Build # 175 - Still unstable
Build: https://builds.apache.org/job/Lucene-Solr-Tests-7.x/175/ 6 tests failed. FAILED: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.deletePartiallyCreatedCollection Error Message: Stack Trace: java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([F2738E8F07AEDDA:25A7540D47A0CD12]:0) at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertFalse(Assert.java:68) at org.junit.Assert.assertFalse(Assert.java:79) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.deletePartiallyCreatedCollection(CollectionsAPIDistributedZkTest.java:171) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:748) FAILED: org.apache.solr.cloud.ShardRoutingCustomTest.test Error Message: KeeperErrorCode = Session expired for /clusterstate.json Stack Trace: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for
[JENKINS] Lucene-Solr-7.x-Linux (64bit/jdk1.8.0_144) - Build # 546 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/546/ Java: 64bit/jdk1.8.0_144 -XX:+UseCompressedOops -XX:+UseG1GC 3 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation Error Message: 2 threads leaked from SUITE scope at org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation: 1) Thread[id=18177, name=jetty-launcher-1215-thread-1-EventThread, state=TIMED_WAITING, group=TGRP-TestSolrCloudWithSecureImpersonation] at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323) at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105) at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288) at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279) at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41) at org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:244) at org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:44) at org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:61) at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505) 2) Thread[id=18170, name=jetty-launcher-1215-thread-2-EventThread, state=TIMED_WAITING, 
group=TGRP-TestSolrCloudWithSecureImpersonation] at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323) at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105) at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288) at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279) at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41) at org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:244) at org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:44) at org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:61) at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE scope at org.apache.solr.cloud.TestSolrCloudWithSecureImpersonation: 1) Thread[id=18177, name=jetty-launcher-1215-thread-1-EventThread, state=TIMED_WAITING, group=TGRP-TestSolrCloudWithSecureImpersonation] at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:323) at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:105) at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288) at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279) at
[jira] [Commented] (SOLR-11391) JoinQParser for non point fields should use the GraphTermsCollector
[ https://issues.apache.org/jira/browse/SOLR-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192388#comment-16192388 ]

Hoss Man commented on SOLR-11391:
---------------------------------

Comments on latest patch (broader thoughts below)...

* JoinQParserPlugin.java
** chooseJoinMethod
*** javadocs should be valid html; docs need to come before @param, etc...
*** IIUC, you've deleted the only code path that will give the user an error if the field has points==true but docValues==false. validateJoinMethod will fail if the user explicitly asks for ENUM, but here in this method you'll attempt to use ENUM if the user sends {{method=null|SMART}}
*** treating "!= -1" as magic is dangerous; what if some code accidentally passes -2? ... "< 0" is safer, either that or {{assert -1 <= domainSize}}
*** if we just want the LeafReaderContext then why ask for {{getSlowAtomicReader()}}? why not just use {{searcher.getLeafContexts()}}?
*** {{sumRatio = + segmentCardinality / segmentDocs;}} ... 100% certain that should be {{+=}}
*** the existence of the {{+=}} typo, plus the "pick ENUM for points w/o docValues w/o error" issue mentioned above, tells me this method should most certainly have some very targeted unit tests: a unit test of this specific method that asserts it gets ENUM back for some fields and DV back for others based on the individual segments. Ideally, instead of needing to actually build up tons of large indexes for this, this helper method should be refactored to take in some arrays of the data -- ie: {{chooseJoinMethod(SolrIndexSearcher, SchemaField, int)}} should delegate to some {{chooseJoinMethodByDvStats(long[] numDocsPerLeaf, long[] cardinalityPerLeaf)}} that we have direct unit tests for
** JoinQuery
*** constructor still needs jdocs; in particular it needs to point out that if domainFilters != null it will have elements added to it when the query is executed (either that, or refactor the defensive copy down from createJoinQuery)
*** {{toString}} should definitely use the {{method}} in its output
*** {{hashCode}} and {{equals}} currently don't consider {{method}} or {{domainFilters}}. Should different {{method}} values make 2 {{JoinQuery}} objects not-equal? what about {{SMART}} values? ... not sure. Different {{domainFilters}} should *definitely* make 2 {{JoinQuery}} objects not-equal, otherwise we're going to get weird caching behavior -- but this raises interesting questions about what happens when the {{JoinMethod}} type doesn't even support/use the {{domainFilters}} ... then should 2 {{JoinQuery}} objects be considered equal? No matter what, these questions make it clear we should add some more robust equal/not-equal checks for the join parser (and for using createJoinQuery with different domainFilters)
*** we should probably call {{method = chooseJoinMethod}} before {{validateJoinMethod(...)}} to help ensure we never introduce bugs that cause our heuristic to pick broken options for the specified field. Ideally this code path would/should even catch/rethrow the SolrException and wrap it in a "code is broken, please file a bug" message if the original method was SMART. Either that, or just refactor validateJoinMethod and chooseJoinMethod into a single method that: 1) if SMART, attempts to narrow down by field props, or errors if nothing is legal; 2) if options remain, picks based on field stats. Either way, this choosing & validating should be done as early as possible -- why not put it in the {{JoinQueryWeight}} constructor instead of in the {{JoinQueryWeight.getDocSet()}} method?
*** then we could also set the "actual" {{method}} in the {{dbg}} info created by {{JoinQueryWeight.scorer()}}. Actually -- what would probably be best is if the {{JoinQuery}} constructor rejected {{null}} or {{SMART}} as input values, and required the caller to call this new merged "choose+validate" helper (both the {{JoinQParser.parse()}} method and the facet code using {{createJoinQuery()}} are guaranteed to have access to the necessary SolrIndexSearcher) to resolve the user-specified "method" into an "executable" method. That way the {{JoinQuery}} constructor could explicitly set {{this.domainFilters=null}} unless the {{method}} supported them (which would simplify the questions about how {{hashCode}} and {{equals()}} should work)
** validateJoinMethod
*** once points is checked, why is there a redundant nested if check for {{isPointField()}}?
** executeJoinQuery
*** jdocs still need to mention that this method modifies domainFilters. Or better still: preserve the original {{domainFilter}} passed by the client all the way into the {{JoinQuery}} and assign it to {{this.domainFilters}} (which should be final), but only make a copy here inside {{executeJoinQuery}} (to add the {{uncachedJoinQ}} to) and then let the copy be GCed after the {{toSearcher.getDocSet(...)}} call
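The {{chooseJoinMethodByDvStats}} refactor suggested above could be unit tested directly, since it would be a pure function over per-segment stats. A rough sketch (the class name, the averaging heuristic, and the 0.1 cutoff are assumptions for illustration, not code from the patch):

```java
// Hypothetical sketch of the suggested refactor: chooseJoinMethod(SolrIndexSearcher,
// SchemaField, int) would gather per-leaf stats and delegate to this pure helper,
// which unit tests can call with hand-built arrays. The 0.1 cutoff is invented.
public class JoinMethodHeuristic {
  enum JoinMethod { ENUM, DV }

  static JoinMethod chooseByDvStats(long[] numDocsPerLeaf, long[] cardinalityPerLeaf) {
    double sumRatio = 0.0;
    for (int i = 0; i < numDocsPerLeaf.length; i++) {
      if (numDocsPerLeaf[i] > 0) {
        // note +=, not "= +": this is exactly the typo called out above
        sumRatio += (double) cardinalityPerLeaf[i] / numDocsPerLeaf[i];
      }
    }
    double avgRatio = sumRatio / numDocsPerLeaf.length;
    // few distinct terms relative to doc count favors the term-at-a-time ENUM path
    return avgRatio < 0.1 ? JoinMethod.ENUM : JoinMethod.DV;
  }

  public static void main(String[] args) {
    System.out.println(chooseByDvStats(new long[]{1000, 2000}, new long[]{10, 20}));    // ENUM
    System.out.println(chooseByDvStats(new long[]{1000, 2000}, new long[]{900, 1900})); // DV
  }
}
```

A direct test of a helper like this avoids building large test indexes and would have caught both the {{+=}} bug and the points-without-docValues case.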
[JENKINS] Lucene-Solr-master-Windows (64bit/jdk1.8.0_144) - Build # 6941 - Still Failing!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/6941/ Java: 64bit/jdk1.8.0_144 -XX:+UseCompressedOops -XX:+UseSerialGC 3 tests failed. FAILED: org.apache.solr.core.TestLazyCores.testNoCommit Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([70D521E7C33CFF1A:AFB58036081B9CBF]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:884) at org.apache.solr.core.TestLazyCores.check10(TestLazyCores.java:847) at org.apache.solr.core.TestLazyCores.testNoCommit(TestLazyCores.java:829) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//result[@numFound='10'] xml response was: 01*:* request was:q=*:* at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:877) ... 41 more FAILED:
[jira] [Updated] (SOLR-11293) HttpPartitionTest fails often
[ https://issues.apache.org/jira/browse/SOLR-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cao Manh Dat updated SOLR-11293: Priority: Major (was: Blocker) > HttpPartitionTest fails often > - > > Key: SOLR-11293 > URL: https://issues.apache.org/jira/browse/SOLR-11293 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Cao Manh Dat > Fix For: 7.1 > > Attachments: SOLR-11293.patch, SOLR-11293.patch, SOLR-11293.patch, > SOLR-11293.patch > > > https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4140/testReport/org.apache.solr.cloud/HttpPartitionTest/test/ > {code} > Error Message > Doc with id=1 not found in http://127.0.0.1:60897/b/xj/collMinRf_1x3 due to: > Path not found: /id; rsp={doc=null} > Stacktrace > java.lang.AssertionError: Doc with id=1 not found in > http://127.0.0.1:60897/b/xj/collMinRf_1x3 due to: Path not found: /id; > rsp={doc=null} > at > __randomizedtesting.SeedInfo.seed([ACF841744A332569:24AC7EAEE4CF4891]:0) > at org.junit.Assert.fail(Assert.java:93) > at org.junit.Assert.assertTrue(Assert.java:43) > at > org.apache.solr.cloud.HttpPartitionTest.assertDocExists(HttpPartitionTest.java:603) > at > org.apache.solr.cloud.HttpPartitionTest.assertDocsExistInAllReplicas(HttpPartitionTest.java:558) > at > org.apache.solr.cloud.HttpPartitionTest.testMinRf(HttpPartitionTest.java:249) > at > org.apache.solr.cloud.HttpPartitionTest.test(HttpPartitionTest.java:127) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-7.0-Linux (64bit/jdk-9) - Build # 421 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.0-Linux/421/ Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC --illegal-access=deny 1 tests failed. FAILED: org.apache.solr.cloud.TestTlogReplica.testRecovery Error Message: Can not find doc 3 in https://127.0.0.1:42291/solr Stack Trace: java.lang.AssertionError: Can not find doc 3 in https://127.0.0.1:42291/solr at __randomizedtesting.SeedInfo.seed([AAB57E423B8616DF:6B4507EE16D6DC78]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertNotNull(Assert.java:526) at org.apache.solr.cloud.TestTlogReplica.checkRTG(TestTlogReplica.java:868) at org.apache.solr.cloud.TestTlogReplica.testRecovery(TestTlogReplica.java:559) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.base/java.lang.Thread.run(Thread.java:844) Build Log: [...truncated 1693 lines...] [junit4] JVM J1: stderr was not empty, see:
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192266#comment-16192266 ] Erick Erickson commented on LUCENE-7976: There are two issues here that are a bit conflated; the consequences of forceMerge and having up to 50% of your index space used up by deleted docs: 1> If they _do_ optimize/forcemerge/expungeDeletes, they're stuck. Totally agree that having a big red button makes that way too tempting. Even if removed, users can still use the optimize call from the SolrJ client and/or via the update handler. So one issue is if there are ways to prevent the unfortunate consequences (the freeze idea, only optimize into segments max segment size etc) or recover somehow (some of the proposals above). Keeping the number of deleted docs lower would make pressing that button less tempting, but the button still should be removed. There are ways to forceMerge even if removed though. 2> Even if they _don't_ forcemerge/expungeDeletes, having 50% of the index consumed by deleted docs can be quite costly. Telling users that they have only two choices, 1> start and keep optimizing or 2> buy enough hardware that they can meet their SLAs with half their index space wasted is a hard sell. We have people who need 100s of machines in their clusters to hit their SLAs. Accepting up to 50% deleted docs as the norm means potentially millions of dollars in unnecessary hardware. > Add a parameter to TieredMergePolicy to merge segments that have more than X > percent deleted documents > -- > > Key: LUCENE-7976 > URL: https://issues.apache.org/jira/browse/LUCENE-7976 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Erick Erickson > > We're seeing situations "in the wild" where there are very large indexes (on > disk) handled quite easily in a single Lucene index. This is particularly > true as features like docValues move data into MMapDirectory space. 
The > current TMP algorithm allows on the order of 50% deleted documents as per a > dev list conversation with Mike McCandless (and his blog here: > https://www.elastic.co/blog/lucenes-handling-of-deleted-documents). > Especially in the current era of very large indexes in aggregate (think many > TB), solutions like "you need to distribute your collection over more shards" > become very costly. Additionally, the tempting "optimize" button exacerbates > the issue since once you form, say, a 100G segment (by > optimizing/forceMerging) it is not eligible for merging until 97.5G of the > docs in it are deleted (current default 5G max segment size). > The proposal here would be to add a new parameter to TMP, something like > (no, that's not a serious name, suggestions > welcome) which would default to 100 (or the same behavior we have now). > So if I set this parameter to, say, 20%, and the max segment size stays at > 5G, the following would happen when segments were selected for merging: > > any segment with > 20% deleted documents would be merged or rewritten NO > > MATTER HOW LARGE. There are two cases, > >> the segment has < 5G "live" docs. In that case it would be merged with > >> smaller segments to bring the resulting segment up to 5G. If no smaller > >> segments exist, it would just be rewritten > >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). > >> It would be rewritten into a single segment removing all deleted docs no > >> matter how big it is to start. The 100G example above would be rewritten > >> to an 80G segment for instance. > Of course this would lead to potentially much more I/O which is why the > default would be the same behavior we see now. As it stands now, though, > there's no way to recover from an optimize/forceMerge except to re-index from > scratch. We routinely see 200G-300G Lucene indexes at this point "in the > wild" with 10s of shards replicated 3 or more times. 
And that doesn't even > include having these over HDFS. > Alternatives welcome! Something like the above seems minimally invasive. A > new merge policy is certainly an alternative.
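The segment-eligibility arithmetic in the proposal above can be sketched numerically. This is illustrative only: the helper names and the "live docs must fit within half the max segment size" reading of current behavior are assumptions drawn from the 97.5G-of-100G example, not actual TieredMergePolicy code:

```java
public class DeletePctDemo {
  // Assumed reading of current TMP behavior for an oversized segment: it only
  // becomes merge-eligible once its live data would fit within half the max
  // merged segment size (hence 97.5G of a 100G segment must be deleted first).
  static boolean eligibleToday(double totalGb, double liveGb, double maxSegGb) {
    return totalGb > maxSegGb && liveGb <= maxSegGb / 2;
  }

  // The proposed parameter: any segment whose deleted percentage exceeds the
  // threshold is merged or rewritten no matter how large it is.
  static boolean eligibleUnderProposal(double totalGb, double liveGb, double maxPctDeleted) {
    double pctDeleted = 100.0 * (totalGb - liveGb) / totalGb;
    return pctDeleted > maxPctDeleted;
  }

  public static void main(String[] args) {
    // 100G optimized segment, 25G of deletes (75G live), 5G max segment size:
    System.out.println(eligibleToday(100, 75, 5));            // false: 97.5G must be deleted first
    System.out.println(eligibleUnderProposal(100, 75, 20.0)); // true: 25% deleted > 20% threshold
  }
}
```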
[jira] [Commented] (SOLR-11438) Solr should return rf when min_rf is specified for delete-by-id and delete-by-query.
[ https://issues.apache.org/jira/browse/SOLR-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192230#comment-16192230 ]

Erick Erickson commented on SOLR-11438:
---------------------------------------

Hmm, I just found this by accident. This code in DistributedUpdateProcessor.processAdd:
{code}
if (minRf > 1) {
  String myShardId = forwardToLeader ? null : cloudDesc.getShardId();
  replicationTracker = new RequestReplicationTracker(myShardId, minRf);
}
{code}
throws an NPE if you specify min_rf in stand-alone mode. Specifying min_rf doesn't make any _sense_ in stand-alone mode, but we can be more graceful about it. This is even without the hack I attached. There are no safeguards around dereferencing cloudDesc, I think (haven't pursued it much).

> Solr should return rf when min_rf is specified for delete-by-id and
> delete-by-query.
> -------------------------------------------------------------------
>
>            Key: SOLR-11438
>            URL: https://issues.apache.org/jira/browse/SOLR-11438
>        Project: Solr
>     Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 6.6.1, 7.0, master (8.0)
>       Reporter: Erick Erickson
>       Assignee: Erick Erickson
>    Attachments: SOLR-11438.patch
>
>
> When we add documents and specify min_rf we get back an rf parameter in the
> response which is the number of replicas that successfully received the add.
> However, for delete-by-id or delete-by-query we do not return this data. Is
> there any harm in it?
> Assigning to myself to track, anyone else who wants it feel free.
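A more graceful shape for the quoted snippet might look like the following. This is a sketch with stand-in types, not actual DistributedUpdateProcessor code, and the premise that cloudDesc is the null reference in stand-alone mode is an assumption made to illustrate the guard:

```java
public class MinRfGuard {
  // Stand-in for Solr's CloudDescriptor, which (by assumption here) is only
  // available when running in SolrCloud mode.
  static class CloudDescriptor {
    String getShardId() { return "shard1"; }
  }

  // Mirrors the quoted snippet, but ignores min_rf gracefully in stand-alone
  // mode (cloudDesc == null) instead of throwing an NPE.
  static String shardIdForTracker(int minRf, boolean forwardToLeader, CloudDescriptor cloudDesc) {
    if (minRf > 1) {
      if (cloudDesc == null) {
        return null; // min_rf is meaningless without SolrCloud; don't track
      }
      return forwardToLeader ? null : cloudDesc.getShardId();
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(shardIdForTracker(2, false, null));                  // null (no NPE)
    System.out.println(shardIdForTracker(2, false, new CloudDescriptor())); // shard1
  }
}
```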
[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_144) - Build # 20608 - Still Failing!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20608/ Java: 64bit/jdk1.8.0_144 -XX:-UseCompressedOops -XX:+UseSerialGC 1 tests failed. FAILED: org.apache.solr.cloud.TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry Error Message: Captured an uncaught exception in thread: Thread[id=7095, name=OverseerAutoScalingTriggerThread-98773144295571458-dummy.host.com:8984_solr-n_00, state=RUNNABLE, group=Overseer autoscaling triggers] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=7095, name=OverseerAutoScalingTriggerThread-98773144295571458-dummy.host.com:8984_solr-n_00, state=RUNNABLE, group=Overseer autoscaling triggers] at __randomizedtesting.SeedInfo.seed([FC5CF73E695F42E6:2A83D6645F42517E]:0) Caused by: org.apache.solr.common.SolrException: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling/events/.auto_add_replicas at __randomizedtesting.SeedInfo.seed([FC5CF73E695F42E6]:0) at org.apache.solr.cloud.ZkDistributedQueue.(ZkDistributedQueue.java:107) at org.apache.solr.cloud.autoscaling.TriggerEventQueue.(TriggerEventQueue.java:44) at org.apache.solr.cloud.autoscaling.ScheduledTriggers$ScheduledTrigger.(ScheduledTriggers.java:398) at org.apache.solr.cloud.autoscaling.ScheduledTriggers.add(ScheduledTriggers.java:149) at org.apache.solr.cloud.autoscaling.OverseerTriggerThread.run(OverseerTriggerThread.java:220) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling/events/.auto_add_replicas at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1102) at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:323) at 
org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:320) at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60) at org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:320) at org.apache.solr.common.cloud.ZkCmdExecutor.ensureExists(ZkCmdExecutor.java:93) at org.apache.solr.common.cloud.ZkCmdExecutor.ensureExists(ZkCmdExecutor.java:78) at org.apache.solr.cloud.ZkDistributedQueue.(ZkDistributedQueue.java:105) ... 5 more Build Log: [...truncated 12396 lines...] [junit4] Suite: org.apache.solr.cloud.TestLeaderElectionZkExpiry [junit4] 2> 1024382 INFO (SUITE-TestLeaderElectionZkExpiry-seed#[FC5CF73E695F42E6]-worker) [] o.a.s.SolrTestCaseJ4 SecureRandom sanity checks: test.solr.allowed.securerandom=null & java.security.egd=file:/dev/./urandom [junit4] 2> Creating dataDir: /home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-core/test/J0/temp/solr.cloud.TestLeaderElectionZkExpiry_FC5CF73E695F42E6-001/init-core-data-001 [junit4] 2> 1024383 WARN (SUITE-TestLeaderElectionZkExpiry-seed#[FC5CF73E695F42E6]-worker) [] o.a.s.SolrTestCaseJ4 startTrackingSearchers: numOpens=1 numCloses=1 [junit4] 2> 1024383 INFO (SUITE-TestLeaderElectionZkExpiry-seed#[FC5CF73E695F42E6]-worker) [] o.a.s.SolrTestCaseJ4 Using PointFields (NUMERIC_POINTS_SYSPROP=true) w/NUMERIC_DOCVALUES_SYSPROP=true [junit4] 2> 1024385 INFO (SUITE-TestLeaderElectionZkExpiry-seed#[FC5CF73E695F42E6]-worker) [] o.a.s.SolrTestCaseJ4 Randomized ssl (true) and clientAuth (false) via: @org.apache.solr.util.RandomizeSSL(reason=, ssl=NaN, value=NaN, clientAuth=NaN) [junit4] 2> 1024386 INFO (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[FC5CF73E695F42E6]) [] o.a.s.SolrTestCaseJ4 ###Starting testLeaderElectionWithZkExpiry [junit4] 2> 1024390 INFO (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[FC5CF73E695F42E6]) [] o.a.s.c.SolrXmlConfig MBean server found: com.sun.jmx.mbeanserver.JmxMBeanServer@76602c3f, but no 
JMX reporters were configured - adding default JMX reporter. [junit4] 2> 1024418 INFO (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[FC5CF73E695F42E6]) [] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.node' (registry 'solr.node') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@76602c3f [junit4] 2> 1024424 INFO (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[FC5CF73E695F42E6]) [] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.jvm' (registry 'solr.jvm') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@76602c3f [junit4] 2> 1024424 INFO (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[FC5CF73E695F42E6]) []
[jira] [Comment Edited] (SOLR-11438) Solr should return rf when min_rf is specified for delete-by-id and delete-by-query.
[ https://issues.apache.org/jira/browse/SOLR-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191871#comment-16191871 ]

Erick Erickson edited comment on SOLR-11438 at 10/4/17 11:52 PM:
-----------------------------------------------------------------

A bare sketch of a patch, just enough to show that it's _possible_ to get "rf" back from deletes. I just mechanically passed replicationTracker to cmdDistrib.distribDelete (sometimes) without really understanding what I was doing, and there are zero tests. I did note that processAdd returned an rf of 3 when there were 4 replicas, so my bit of refactoring may have messed up processAdd. Like I said, this was only to see what the level of effort required would be. I didn't even run the test suite on this patch, BTW.

was (Author: erickerickson):
A bare sketch of a patch, just enough to show that it's _possible_ to get "rf" back from deletes. I just mechanically passed replicationTracker to cmdDistrib.distribDelete (sometimes) without really understanding what I was doing, and there are zero tests. I did note that processAdd returned an rf of 3 when there were 4 replicas, so my bit of refactoring may have messed up processAdd. Like I said, this was only to see if it could be done.

> Solr should return rf when min_rf is specified for delete-by-id and
> delete-by-query.
> -------------------------------------------------------------------
>
>            Key: SOLR-11438
>            URL: https://issues.apache.org/jira/browse/SOLR-11438
>        Project: Solr
>     Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 6.6.1, 7.0, master (8.0)
>       Reporter: Erick Erickson
>       Assignee: Erick Erickson
>    Attachments: SOLR-11438.patch
>
>
> When we add documents and specify min_rf we get back an rf parameter in the
> response which is the number of replicas that successfully received the add.
> However, for delete-by-id or delete-by-query we do not return this data. Is
> there any harm in it?
> Assigning to myself to track, anyone else who wants it feel free.
[jira] [Commented] (SOLR-7316) API to create a core is broken
[ https://issues.apache.org/jira/browse/SOLR-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192194#comment-16192194 ] Jason Gerlowski commented on SOLR-7316: --- From my reading of the JIRA, there are two aspects of Mark's concerns here: 1. There should be a default configSet. As you pointed out, Steve, this is done. 2. For user-friendliness, the {{CreateCores}} API should use the default configset {{_default}} when the configSet parameter isn't specified. This hasn't been addressed, one way or another. The create-cores API still fails with the same error message originally reported by Mark: {code} [~/c/l/solr] $ curl -ilk 'http://localhost:8983/solr/admin/cores?action=CREATE=new_core=new_core' { "responseHeader":{ "status":400, "error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException", "msg":"Error CREATEing SolrCore 'new_core': Unable to create core [new_core] Caused by: Can't find resource 'solrconfig.xml' in classpath or '/home/asdf/checkouts/lucene-solr/solr/server/solr/new_core'", "code":400}} {code} Not that I necessarily agree with Mark's points. Quietly allowing a core to be created with the {{_default}} configset _would_ make getting started a little less painful. But it also introduces trappy behavior for those who just messed up their API call and forgot the parameter. There are tradeoffs here. So I don't necessarily agree, but it's tough to close this, unless you were going to close it as "wont-fix". > API to create a core is broken > -- > > Key: SOLR-7316 > URL: https://issues.apache.org/jira/browse/SOLR-7316 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 5.0 >Reporter: Mark Haase > > *Steps To Reproduce* > {code} > curl > 'http://localhost:8983/solr/admin/cores?action=CREATE=new_core=new_core' > {code} > *Expected Result* > Create a core called "new_core". 
> *Actual Result* > {quote} > Error CREATEing SolrCore 'new_core': Unable to create core [new_core] Caused > by: Can't find resource 'solrconfig.xml' in classpath or > '/var/solr/data/new_core/conf' > {quote} > Somebody on solr-users tells me: > {quote} > The CoreAdmin API requires that the instanceDir already exist, with a > conf directory inside it that contains solrconfig.xml, schema.xml, and > any other necessary config files. > {quote} > Huh? Where is this magical knowledge mentioned in the [API > documentation|https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API]? > Another user on the list serve says: > {quote} > In fact, yes. The thing to remember here is that you're using a much > older approach that had its roots in the pre-cloud days. > {quote} > *The whole point of creating APIs is to abstract out details that the caller > doesn't need to know, and yet this API requires an understanding of Solr's > internal file structure and history of the project?* I'm speechless.
[jira] [Assigned] (SOLR-8335) HdfsLockFactory does not allow core to come up after a node was killed
[ https://issues.apache.org/jira/browse/SOLR-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-8335: - Assignee: Mark Miller > HdfsLockFactory does not allow core to come up after a node was killed > -- > > Key: SOLR-8335 > URL: https://issues.apache.org/jira/browse/SOLR-8335 > Project: Solr > Issue Type: Bug >Affects Versions: 5.0, 5.1, 5.2, 5.2.1, 5.3, 5.3.1 >Reporter: Varun Thacker >Assignee: Mark Miller > Attachments: SOLR-8335.patch > > > When using HdfsLockFactory if a node gets killed instead of a graceful > shutdown the write.lock file remains in HDFS . The next time you start the > node the core doesn't load up because of LockObtainFailedException . > I was able to reproduce this in all 5.x versions of Solr . The problem wasn't > there when I tested it in 4.10.4 > Steps to reproduce this on 5.x > 1. Create directory in HDFS : {{bin/hdfs dfs -mkdir /solr}} > 2. Start Solr: {{bin/solr start -Dsolr.directoryFactory=HdfsDirectoryFactory > -Dsolr.lock.type=hdfs -Dsolr.data.dir=hdfs://localhost:9000/solr > -Dsolr.updatelog=hdfs://localhost:9000/solr}} > 3. Create core: {{./bin/solr create -c test -n data_driven}} > 4. Kill solr > 5. The lock file is there in HDFS and is called {{write.lock}} > 6. Start Solr again and you get a stack trace like this: > {code} > 2015-11-23 13:28:04.287 ERROR (coreLoadExecutor-6-thread-1) [ x:test] > o.a.s.c.CoreContainer Error creating core [test]: Index locked for write for > core 'test'. Solr now longer supports forceful unlocking via > 'unlockOnStartup'. Please verify locks manually! > org.apache.solr.common.SolrException: Index locked for write for core 'test'. > Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please > verify locks manually! 
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:820) > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:659) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:723) > at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:443) > at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:434) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.lucene.store.LockObtainFailedException: Index locked > for write for core 'test'. Solr now longer supports forceful unlocking via > 'unlockOnStartup'. Please verify locks manually! > at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:528) > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:761) > ... 
9 more > 2015-11-23 13:28:04.289 ERROR (coreContainerWorkExecutor-2-thread-1) [ ] > o.a.s.c.CoreContainer Error waiting for SolrCore to be created > java.util.concurrent.ExecutionException: > org.apache.solr.common.SolrException: Unable to create core [test] > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at org.apache.solr.core.CoreContainer$2.run(CoreContainer.java:472) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.solr.common.SolrException: Unable to create core [test] > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:737) > at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:443) > at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:434) > ... 5 more > Caused by: org.apache.solr.common.SolrException: Index locked for write for > core 'test'. Solr now longer supports forceful unlocking via > 'unlockOnStartup'. Please verify locks manually! > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:820) > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:659) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:723) > ... 7 more > Caused by: org.apache.lucene.store.LockObtainFailedException: Index locked > for write for core
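Until a fix lands, the usual recovery is removing the stale lock by hand; a hedged sketch, with the lock path assumed from the reproduction steps above (verify the real path with {{hdfs dfs -ls}} before deleting anything):

```shell
# Manual cleanup sketch for a stale HDFS write.lock left by a killed node.
# The path below is an assumption based on the reproduction steps; verify first.
LOCK_PATH="hdfs://localhost:9000/solr/test/data/index/write.lock"
# Uncomment to remove the stale lock against a live HDFS:
# bin/hdfs dfs -rm "${LOCK_PATH}"
echo "would remove: ${LOCK_PATH}"
```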
[jira] [Reopened] (SOLR-8335) HdfsLockFactory does not allow core to come up after a node was killed
[ https://issues.apache.org/jira/browse/SOLR-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reopened SOLR-8335: --- > HdfsLockFactory does not allow core to come up after a node was killed > -- > > Key: SOLR-8335 > URL: https://issues.apache.org/jira/browse/SOLR-8335 > Project: Solr > Issue Type: Bug >Affects Versions: 5.0, 5.1, 5.2, 5.2.1, 5.3, 5.3.1 >Reporter: Varun Thacker >Assignee: Mark Miller > Attachments: SOLR-8335.patch > > > When using HdfsLockFactory if a node gets killed instead of a graceful > shutdown the write.lock file remains in HDFS . The next time you start the > node the core doesn't load up because of LockObtainFailedException . > I was able to reproduce this in all 5.x versions of Solr . The problem wasn't > there when I tested it in 4.10.4 > Steps to reproduce this on 5.x > 1. Create directory in HDFS : {{bin/hdfs dfs -mkdir /solr}} > 2. Start Solr: {{bin/solr start -Dsolr.directoryFactory=HdfsDirectoryFactory > -Dsolr.lock.type=hdfs -Dsolr.data.dir=hdfs://localhost:9000/solr > -Dsolr.updatelog=hdfs://localhost:9000/solr}} > 3. Create core: {{./bin/solr create -c test -n data_driven}} > 4. Kill solr > 5. The lock file is there in HDFS and is called {{write.lock}} > 6. Start Solr again and you get a stack trace like this: > {code} > 2015-11-23 13:28:04.287 ERROR (coreLoadExecutor-6-thread-1) [ x:test] > o.a.s.c.CoreContainer Error creating core [test]: Index locked for write for > core 'test'. Solr now longer supports forceful unlocking via > 'unlockOnStartup'. Please verify locks manually! > org.apache.solr.common.SolrException: Index locked for write for core 'test'. > Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please > verify locks manually! 
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:820) > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:659) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:723) > at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:443) > at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:434) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.lucene.store.LockObtainFailedException: Index locked > for write for core 'test'. Solr now longer supports forceful unlocking via > 'unlockOnStartup'. Please verify locks manually! > at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:528) > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:761) > ... 
9 more > 2015-11-23 13:28:04.289 ERROR (coreContainerWorkExecutor-2-thread-1) [ ] > o.a.s.c.CoreContainer Error waiting for SolrCore to be created > java.util.concurrent.ExecutionException: > org.apache.solr.common.SolrException: Unable to create core [test] > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at org.apache.solr.core.CoreContainer$2.run(CoreContainer.java:472) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.solr.common.SolrException: Unable to create core [test] > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:737) > at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:443) > at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:434) > ... 5 more > Caused by: org.apache.solr.common.SolrException: Index locked for write for > core 'test'. Solr now longer supports forceful unlocking via > 'unlockOnStartup'. Please verify locks manually! > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:820) > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:659) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:723) > ... 7 more > Caused by: org.apache.lucene.store.LockObtainFailedException: Index locked > for write for core 'test'. Solr now longer supports
[jira] [Reopened] (SOLR-11293) HttpPartitionTest fails often
[ https://issues.apache.org/jira/browse/SOLR-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomás Fernández Löbbe reopened SOLR-11293: -- > HttpPartitionTest fails often > - > > Key: SOLR-11293 > URL: https://issues.apache.org/jira/browse/SOLR-11293 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Cao Manh Dat >Priority: Blocker > Fix For: 7.1 > > Attachments: SOLR-11293.patch, SOLR-11293.patch, SOLR-11293.patch, > SOLR-11293.patch > > > https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4140/testReport/org.apache.solr.cloud/HttpPartitionTest/test/ > {code} > Error Message > Doc with id=1 not found in http://127.0.0.1:60897/b/xj/collMinRf_1x3 due to: > Path not found: /id; rsp={doc=null} > Stacktrace > java.lang.AssertionError: Doc with id=1 not found in > http://127.0.0.1:60897/b/xj/collMinRf_1x3 due to: Path not found: /id; > rsp={doc=null} > at > __randomizedtesting.SeedInfo.seed([ACF841744A332569:24AC7EAEE4CF4891]:0) > at org.junit.Assert.fail(Assert.java:93) > at org.junit.Assert.assertTrue(Assert.java:43) > at > org.apache.solr.cloud.HttpPartitionTest.assertDocExists(HttpPartitionTest.java:603) > at > org.apache.solr.cloud.HttpPartitionTest.assertDocsExistInAllReplicas(HttpPartitionTest.java:558) > at > org.apache.solr.cloud.HttpPartitionTest.testMinRf(HttpPartitionTest.java:249) > at > org.apache.solr.cloud.HttpPartitionTest.test(HttpPartitionTest.java:127) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-7986) start setting some max-age cache control headers on our website via htaccess
Hoss Man created LUCENE-7986: Summary: start setting some max-age cache control headers on our website via htaccess Key: LUCENE-7986 URL: https://issues.apache.org/jira/browse/LUCENE-7986 Project: Lucene - Core Issue Type: Task Components: general/website Reporter: Hoss Man just by the nature of using Apache httpd our website is pretty well behaved in terms of Last-Modified & ETag headers -- but different browsers use different heuristics for how long they will cache a page before they even bother to do a validation request, and that can cause many people to see "stale" pages after we do release announcements. Example: Chrome apparently uses this heuristic -- w/o any upper bound -- to decide how long to keep an item in its cache w/o revalidation... {{(date_item_was_last_fetched - last_mod_date_when_item_was_last_fetched) / 10}} ...that means that if it's been 100 days since the last time we updated & published a page, when someone loads our website in Chrome, their browser will cache that page for up to 10 days w/o bothering to do a cache-validation request to see if the page has changed. We should consider taking advantage of {{mod_headers}} in our htaccess file to set {{Cache-Control: max-age ...}} headers on various file extensions, and perhaps set lower max-ages (or must-revalidate options) on some of the pages we use specifically for announcements & releases (ie: news, download, doc landing pages, etc...) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
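One possible shape for the proposed rules, as an untested sketch only (the file extensions and max-age values are placeholders for discussion, not recommendations):

```apache
# Sketch: long-lived caching for static assets, short max-age plus
# revalidation for HTML pages used for announcements and releases.
<IfModule mod_headers.c>
  <FilesMatch "\.(css|js|png|gif|svg)$">
    Header set Cache-Control "max-age=604800"
  </FilesMatch>
  <FilesMatch "\.html$">
    Header set Cache-Control "max-age=3600, must-revalidate"
  </FilesMatch>
</IfModule>
```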
[jira] [Commented] (SOLR-11293) HttpPartitionTest fails often
[ https://issues.apache.org/jira/browse/SOLR-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192171#comment-16192171 ] Tomás Fernández Löbbe commented on SOLR-11293: -- There are a number of Jenkins failures with: {noformat} FAILED: org.apache.solr.cloud.TestTlogReplica.testOutOfOrderDBQWithInPlaceUpdates Error Message: Can not find doc 1 in https://127.0.0.1:42663/solr {noformat} since this was committed; I believe it may be related > HttpPartitionTest fails often > - > > Key: SOLR-11293 > URL: https://issues.apache.org/jira/browse/SOLR-11293 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Cao Manh Dat >Priority: Blocker > Fix For: 7.1 > > Attachments: SOLR-11293.patch, SOLR-11293.patch, SOLR-11293.patch, > SOLR-11293.patch > > > https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4140/testReport/org.apache.solr.cloud/HttpPartitionTest/test/ > {code} > Error Message > Doc with id=1 not found in http://127.0.0.1:60897/b/xj/collMinRf_1x3 due to: > Path not found: /id; rsp={doc=null} > Stacktrace > java.lang.AssertionError: Doc with id=1 not found in > http://127.0.0.1:60897/b/xj/collMinRf_1x3 due to: Path not found: /id; > rsp={doc=null} > at > __randomizedtesting.SeedInfo.seed([ACF841744A332569:24AC7EAEE4CF4891]:0) > at org.junit.Assert.fail(Assert.java:93) > at org.junit.Assert.assertTrue(Assert.java:43) > at > org.apache.solr.cloud.HttpPartitionTest.assertDocExists(HttpPartitionTest.java:603) > at > org.apache.solr.cloud.HttpPartitionTest.assertDocsExistInAllReplicas(HttpPartitionTest.java:558) > at > org.apache.solr.cloud.HttpPartitionTest.testMinRf(HttpPartitionTest.java:249) > at > org.apache.solr.cloud.HttpPartitionTest.test(HttpPartitionTest.java:127) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > {code} -- This message was sent by 
Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-master-Solaris (64bit/jdk1.8.0) - Build # 1454 - Still Failing!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Solaris/1454/ Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseG1GC 6 tests failed. FAILED: org.apache.solr.cloud.CdcrBootstrapTest.testConvertClusterToCdcrAndBootstrap Error Message: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 127.0.0.1:42976 within 3 ms Stack Trace: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 127.0.0.1:42976 within 3 ms at __randomizedtesting.SeedInfo.seed([AE584CE27515D63E:798F6395C14A4E79]:0) at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:183) at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:117) at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:112) at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:99) at org.apache.solr.cloud.MiniSolrCloudCluster.<init>(MiniSolrCloudCluster.java:232) at org.apache.solr.cloud.MiniSolrCloudCluster.<init>(MiniSolrCloudCluster.java:195) at org.apache.solr.cloud.MiniSolrCloudCluster.<init>(MiniSolrCloudCluster.java:122) at org.apache.solr.cloud.CdcrBootstrapTest.testConvertClusterToCdcrAndBootstrap(CdcrBootstrapTest.java:61) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at
Re: [VOTE] Release Lucene/Solr 7.0.1 RC1
+1 SUCCESS! [1:59:31.704000] On Wed, Oct 4, 2017 at 2:46 PM, Anshum Gupta wrote: > +1 > > SUCCESS! [1:03:55.801076] > > Changes and docs look good. > > -Anshum > > > > On Oct 2, 2017, at 12:43 PM, Steve Rowe wrote: > > Please vote for release candidate 1 for Lucene/Solr 7.0.1 > > The artifacts can be downloaded from: > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.0.1-RC1-rev8d6c3889aa543954424d8ac1dbb3f03bf207140b > > You can run the smoke tester directly with this command: > > python3 -u dev-tools/scripts/smokeTestRelease.py \ > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.0.1-RC1-rev8d6c3889aa543954424d8ac1dbb3f03bf207140b > > Here's my +1 > [0:28:08.126321] > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > >
Re: [VOTE] Release Lucene/Solr 7.0.1 RC1
+1 SUCCESS! [1:03:55.801076] Changes and docs look good. -Anshum > On Oct 2, 2017, at 12:43 PM, Steve Rowe wrote: > > Please vote for release candidate 1 for Lucene/Solr 7.0.1 > > The artifacts can be downloaded from: > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.0.1-RC1-rev8d6c3889aa543954424d8ac1dbb3f03bf207140b > > You can run the smoke tester directly with this command: > > python3 -u dev-tools/scripts/smokeTestRelease.py \ > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.0.1-RC1-rev8d6c3889aa543954424d8ac1dbb3f03bf207140b > > Here's my +1 > [0:28:08.126321] > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org >
[JENKINS] Lucene-Solr-7.x-MacOSX (64bit/jdk1.8.0) - Build # 227 - Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-MacOSX/227/ Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseG1GC 1 tests failed. FAILED: org.apache.solr.cloud.TestTlogReplica.testOutOfOrderDBQWithInPlaceUpdates Error Message: Can not find doc 1 in http://127.0.0.1:58477/solr Stack Trace: java.lang.AssertionError: Can not find doc 1 in http://127.0.0.1:58477/solr at __randomizedtesting.SeedInfo.seed([35824FB124FFF497:B343B75C7BAE2277]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertNotNull(Assert.java:526) at org.apache.solr.cloud.TestTlogReplica.checkRTG(TestTlogReplica.java:861) at org.apache.solr.cloud.TestTlogReplica.testOutOfOrderDBQWithInPlaceUpdates(TestTlogReplica.java:664) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:748) Build Log: [...truncated 11644 lines...] [junit4] Suite:
[jira] [Commented] (SOLR-11435) Replicas failed to be deleted can overwrite replicas of recreated collections
[ https://issues.apache.org/jira/browse/SOLR-11435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191998#comment-16191998 ] Varun Thacker commented on SOLR-11435: -- I think we should add it back when users are not using the autoAddReplica feature? > Replicas failed to be deleted can overwrite replicas of recreated collections > - > > Key: SOLR-11435 > URL: https://issues.apache.org/jira/browse/SOLR-11435 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 7.0 >Reporter: Mano Kovacs > > When a replica comes up that was deleted from ZK while it was gone, it can > replace replicas in ZK if the collection and core names are equal. > Reproduction: > # Create {{collection1}} with 1 shard, 2 replicas on {{node1}} and {{node2}} > # Shut down {{node2}} > # Delete {{collection1}} > # Create {{collection1}} on {{node1}} and {{node3}} > # Start {{node2}} > Expected: > {{node2}} will not initialize the core, as it is not assigned to it in ZK > ({{legacyCloud=false}}) > Actual: > {{node2}} will overwrite the {{baseurl}} in {{state.json}} for one of the > replicas as the {{coreNodeName}} and the collection name will match the core > it has. > Note: SOLR-7248 introduced a {{baseurl}} comparison which was removed in > SOLR-10279. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
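The delete/recreate portion of the quoted steps can be sketched against the Collections API like this (ports and collection layout are assumptions; the mutating calls are left commented out so nothing runs against a live cluster):

```shell
# Collections API sketch of steps 3-4 above (run while node2 is down).
BASE="http://localhost:8983/solr/admin/collections"
# 3. Delete the collection:
# curl "${BASE}?action=DELETE&name=collection1"
# 4. Recreate it on the remaining nodes:
# curl "${BASE}?action=CREATE&name=collection1&numShards=1&replicationFactor=2"
echo "${BASE}?action=DELETE&name=collection1"
```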
[jira] [Commented] (SOLR-11391) JoinQParser for non point fields should use the GraphTermsCollector
[ https://issues.apache.org/jira/browse/SOLR-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191976#comment-16191976 ] Varun Thacker commented on SOLR-11391: -- Here is another datapoint with a test set to tune NUM_DOCS_THRESHOLD. Indexed 33M documents in a single shard collection. {{doc_type_s:X AND year:2007}} matches 240397 documents. The following query matches 673000 documents: {code}{!join to=join_key from=join_key cache=false}(doc_type_s:X AND year:2007){code} method=enum executes in 8 seconds and method=dv executes in 2 seconds. > JoinQParser for non point fields should use the GraphTermsCollector > > > Key: SOLR-11391 > URL: https://issues.apache.org/jira/browse/SOLR-11391 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Varun Thacker > Attachments: SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, > SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, > SOLR-11391.patch, SOLR-11391.patch > > > The Join Query Parser uses the GraphPointsCollector for point fields. > For non point fields if we use the GraphTermsCollector instead of the current > algorithm I am seeing quite a bit of performance gains. > I'm going to attach a quick patch which I cooked up , making sure TestJoin > and TestCloudJSONFacetJoinDomain passed. > More tests, benchmarking and code cleanup to follow -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
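The two strategies being compared can be exercised with a request along these lines (collection and field names follow the comment above; the {{method}} flag is assumed to be available only with the attached patch applied):

```shell
# Benchmark sketch: run once with method=dv and once with method=enum.
BASE="http://localhost:8983/solr/collection1/select"
Q='{!join to=join_key from=join_key cache=false method=dv}(doc_type_s:X AND year:2007)'
# Uncomment to run against a live Solr (repeat with method=enum to compare):
# curl -G "${BASE}" --data-urlencode "q=${Q}" --data-urlencode "rows=0"
echo "${Q}"
```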
[jira] [Resolved] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-10842. --- Resolution: Fixed > Move quickstart.html to Ref Guide > - > > Key: SOLR-10842 > URL: https://issues.apache.org/jira/browse/SOLR-10842 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Minor > Fix For: 7.0 > > Attachments: SOLR-10842-part2.patch, SOLR-10842.patch > > > The Solr Quick Start at https://lucene.apache.org/solr/quickstart.html has > been problematic to keep up to date - until Ishan just updated it yesterday > for 6.6, it said "6.2.1", so hadn't been updated for several releases. > Now that the Ref Guide is in AsciiDoc format, we can easily use variables for > package versions, and it could be released as part of the Ref Guide and kept > up to date. It could also integrate links to more information on topics, and > users would already be IN the docs, so they would not need to wonder where > the docs are. > There are a few places on the site that will need to be updated to point to > the new location, but I can also put a redirect rule into .htaccess so people > are redirected to the new location if there are other links "in the wild" > that we cannot control. This allows it to be versioned also, if that becomes > necessary. > As part of this, I would like to also update the entire "Getting Started" > section of the Ref Guide, which is effectively identical to what was in the > first release of the Ref Guide back in 2009 for Solr 1.4 and is in serious > need of reconsideration. > My thought is that moving the page + redoing the Getting Started section > would be for 7.0, but if folks are excited about this idea I could move the > page for 6.6 and hold off redoing the larger section until 7.0. 
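The .htaccess rule mentioned above could be as small as the following sketch (the target path is an assumption; the real versioned Ref Guide URL would be filled in at publication time):

```apache
# Sketch: permanently redirect the old quickstart to its Ref Guide home.
# The destination below is a placeholder, not the final URL.
Redirect permanent /solr/quickstart.html /solr/guide/solr-tutorial.html
```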
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release Lucene/Solr 7.0.1 RC1
+1 SUCCESS! [0:46:49.127245] On Wed, Oct 4, 2017 at 11:11 AM, David Smiley wrote: > +1 > > SUCCESS! [0:52:40.786442] > > On Wed, Oct 4, 2017 at 4:28 AM Tommaso Teofili > wrote: > >> +1 >> >> SUCCESS! [2:16:41.101128] >> >> On Tue, Oct 3, 2017 at 10:56, Dawid Weiss < >> dawid.we...@gmail.com> wrote: >> >>> +1. >>> >>> SUCCESS! [0:56:08.500826] >>> >>> >>> On Mon, Oct 2, 2017 at 9:43 PM, Steve Rowe wrote: >>> > Please vote for release candidate 1 for Lucene/Solr 7.0.1 >>> > >>> > The artifacts can be downloaded from: >>> > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.0.1-RC1- >>> rev8d6c3889aa543954424d8ac1dbb3f03bf207140b >>> > >>> > You can run the smoke tester directly with this command: >>> > >>> > python3 -u dev-tools/scripts/smokeTestRelease.py \ >>> > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.0.1-RC1- >>> rev8d6c3889aa543954424d8ac1dbb3f03bf207140b >>> > >>> > Here's my +1 >>> > [0:28:08.126321] >>> > - >>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >>> > For additional commands, e-mail: dev-h...@lucene.apache.org >>> > >>> >>> - >>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >>> For additional commands, e-mail: dev-h...@lucene.apache.org >>> >>> -- > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker > LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com >
[jira] [Commented] (SOLR-11423) Overseer queue needs a hard cap (maximum size) that clients respect
[ https://issues.apache.org/jira/browse/SOLR-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191962#comment-16191962 ]

Scott Blum commented on SOLR-11423:
-----------------------------------

[~shalinmangar] [~noble.paul] would either of you like to take a look at this change and +1 or -1?

> Overseer queue needs a hard cap (maximum size) that clients respect
> -------------------------------------------------------------------
>
>                 Key: SOLR-11423
>                 URL: https://issues.apache.org/jira/browse/SOLR-11423
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
>       Components: SolrCloud
>         Reporter: Scott Blum
>         Assignee: Scott Blum
>
> When Solr gets into pathological GC thrashing states, it can fill the
> overseer queue with literally thousands and thousands of queued state
> changes. Many of these end up being duplicated up/down state updates. Our
> production cluster has gotten to the 100k queued items level many times, and
> there's nothing useful you can do at this point except manually purge the
> queue in ZK. Recently, it hit 3 million queued items, at which point our
> entire ZK cluster exploded.
> I propose a hard cap. Any client trying to enqueue an item when a queue is
> full would throw an exception. I was thinking maybe 10,000 items would be a
> reasonable limit. Thoughts?
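The mechanism proposed above — clients get an exception when they try to enqueue into a queue that has hit a hard cap — can be sketched in plain Java. This is a hypothetical illustration only: the class and method names below are invented for the example and are not Solr's actual ZkDistributedQueue API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Illustrative sketch (not Solr's ZkDistributedQueue): a queue that
 * enforces a hard cap by rejecting offers once the cap is reached,
 * as the issue proposes for the Overseer state-update queue.
 */
class BoundedStateQueue {
    private final Deque<String> items = new ArrayDeque<>();
    private final int maxSize; // e.g. the proposed 10,000-item limit

    BoundedStateQueue(int maxSize) {
        this.maxSize = maxSize;
    }

    /** Enqueue an item, or fail fast when the queue is already full. */
    synchronized void offer(String item) {
        if (items.size() >= maxSize) {
            // The client sees an error instead of silently flooding ZooKeeper.
            throw new IllegalStateException(
                "Queue full (" + maxSize + " items); rejecting enqueue");
        }
        items.addLast(item);
    }

    synchronized int size() {
        return items.size();
    }
}
```

The design point is that back-pressure lands on the misbehaving client rather than on the ZooKeeper ensemble.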
[jira] [Updated] (SOLR-11439) Add harmonicFit Stream Evaluator
[ https://issues.apache.org/jira/browse/SOLR-11439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein updated SOLR-11439:
----------------------------------
    Fix Version/s: master (8.0)
                   7.1

> Add harmonicFit Stream Evaluator
> --------------------------------
>
>                 Key: SOLR-11439
>                 URL: https://issues.apache.org/jira/browse/SOLR-11439
>             Project: Solr
>          Issue Type: New Feature
>   Security Level: Public (Default Security Level. Issues are Public)
>         Reporter: Joel Bernstein
>         Assignee: Joel Bernstein
>          Fix For: 7.1, master (8.0)
>
>      Attachments: SOLR-11439.patch
>
> This ticket adds the harmonicFit Stream Evaluator to support curve fitting of
> sinusoidal curves.
[jira] [Updated] (SOLR-11439) Add harmonicFit Stream Evaluator
[ https://issues.apache.org/jira/browse/SOLR-11439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein updated SOLR-11439:
----------------------------------
    Attachment: SOLR-11439.patch
[jira] [Assigned] (SOLR-11439) Add harmonicFit Stream Evaluator
[ https://issues.apache.org/jira/browse/SOLR-11439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein reassigned SOLR-11439:
-------------------------------------
    Assignee: Joel Bernstein
[jira] [Created] (SOLR-11439) Add harmonicFit Stream Evaluator
Joel Bernstein created SOLR-11439:
-------------------------------------

             Summary: Add harmonicFit Stream Evaluator
                 Key: SOLR-11439
                 URL: https://issues.apache.org/jira/browse/SOLR-11439
             Project: Solr
          Issue Type: New Feature
   Security Level: Public (Default Security Level. Issues are Public)
         Reporter: Joel Bernstein

This ticket adds the harmonicFit Stream Evaluator to support curve fitting of
sinusoidal curves.
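For readers unfamiliar with the underlying idea: for a known angular frequency w, fitting y ≈ a·sin(w·x) + b·cos(w·x) to sampled data is a linear least-squares problem (solve the 2x2 normal equations for a and b). The sketch below shows that basic idea in plain Java. It is an illustration of sinusoidal curve fitting in general, not the harmonicFit evaluator's actual algorithm or API, which live in the attached patch.

```java
/**
 * Illustrative sketch only -- not the harmonicFit implementation.
 * Fits y ~ a*sin(w*x) + b*cos(w*x) for a known angular frequency w
 * by linear least squares via the 2x2 normal equations.
 */
class SinusoidFit {
    /** Returns {a, b} minimizing the sum of squared residuals. */
    static double[] fit(double[] x, double[] y, double w) {
        double ss = 0, cc = 0, sc = 0, ys = 0, yc = 0;
        for (int i = 0; i < x.length; i++) {
            double s = Math.sin(w * x[i]);
            double c = Math.cos(w * x[i]);
            ss += s * s;           // sum of sin^2
            cc += c * c;           // sum of cos^2
            sc += s * c;           // cross term
            ys += y[i] * s;        // data projected on sin
            yc += y[i] * c;        // data projected on cos
        }
        // Solve [ss sc; sc cc] * [a; b] = [ys; yc] by Cramer's rule.
        double det = ss * cc - sc * sc;
        return new double[] { (ys * cc - yc * sc) / det,
                              (ss * yc - sc * ys) / det };
    }
}
```

With noise-free synthetic data the fit recovers the generating coefficients exactly, which makes the approach easy to sanity-check.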
[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-9) - Build # 20607 - Still Failing!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20607/
Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC --illegal-access=deny

1 tests failed.

FAILED:  org.apache.solr.cloud.TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry

Error Message:
Captured an uncaught exception in thread: Thread[id=10610, name=OverseerAutoScalingTriggerThread-98772196019077128-dummy.host.com:8984_solr-n_06, state=RUNNABLE, group=Overseer autoscaling triggers]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=10610, name=OverseerAutoScalingTriggerThread-98772196019077128-dummy.host.com:8984_solr-n_06, state=RUNNABLE, group=Overseer autoscaling triggers]
        at __randomizedtesting.SeedInfo.seed([F0351A93AF8B68E8:26EA3BC67B70]:0)
Caused by: org.apache.solr.common.SolrException: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling/events/.auto_add_replicas
        at __randomizedtesting.SeedInfo.seed([F0351A93AF8B68E8]:0)
        at org.apache.solr.cloud.ZkDistributedQueue.<init>(ZkDistributedQueue.java:107)
        at org.apache.solr.cloud.autoscaling.TriggerEventQueue.<init>(TriggerEventQueue.java:44)
        at org.apache.solr.cloud.autoscaling.ScheduledTriggers$ScheduledTrigger.<init>(ScheduledTriggers.java:398)
        at org.apache.solr.cloud.autoscaling.ScheduledTriggers.add(ScheduledTriggers.java:149)
        at org.apache.solr.cloud.autoscaling.OverseerTriggerThread.run(OverseerTriggerThread.java:220)
        at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling/events/.auto_add_replicas
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1102)
        at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:323)
        at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:320)
        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
        at org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:320)
        at org.apache.solr.common.cloud.ZkCmdExecutor.ensureExists(ZkCmdExecutor.java:93)
        at org.apache.solr.common.cloud.ZkCmdExecutor.ensureExists(ZkCmdExecutor.java:78)
        at org.apache.solr.cloud.ZkDistributedQueue.<init>(ZkDistributedQueue.java:105)
        ... 5 more

Build Log:
[...truncated 1707 lines...]
   [junit4] JVM J2: stderr was not empty, see: /home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/build/core/test/temp/junit4-J2-20171004_183121_9593898907984711683179.syserr
   [junit4] >>> JVM J2 emitted unexpected output (verbatim)
   [junit4] Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
   [junit4] <<< JVM J2: EOF
[...truncated 3 lines...]
   [junit4] JVM J0: stderr was not empty, see: /home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/build/core/test/temp/junit4-J0-20171004_183121_9596985565554041436554.syserr
   [junit4] >>> JVM J0 emitted unexpected output (verbatim)
   [junit4] Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
   [junit4] <<< JVM J0: EOF
[...truncated 9 lines...]
   [junit4] JVM J1: stderr was not empty, see: /home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/build/core/test/temp/junit4-J1-20171004_183121_95915338059355260645160.syserr
   [junit4] >>> JVM J1 emitted unexpected output (verbatim)
   [junit4] Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
   [junit4] <<< JVM J1: EOF
[...truncated 290 lines...]
   [junit4] JVM J0: stderr was not empty, see: /home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/build/test-framework/test/temp/junit4-J0-20171004_183805_18016694421302162680865.syserr
   [junit4] >>> JVM J0 emitted unexpected output (verbatim)
   [junit4] Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
   [junit4] <<< JVM J0: EOF
[...truncated 9 lines...]
   [junit4] JVM J1: stderr was not empty, see: /home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/build/test-framework/test/temp/junit4-J1-20171004_183805_1803316907571050366585.syserr
   [junit4] >>> JVM J1 emitted unexpected output (verbatim)
   [junit4] Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed
[JENKINS] Lucene-Solr-master-MacOSX (64bit/jdk-9) - Build # 4207 - Failure!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4207/
Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseParallelGC --illegal-access=deny

1 tests failed.

FAILED:  org.apache.solr.cloud.TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry

Error Message:
Captured an uncaught exception in thread: Thread[id=25490, name=OverseerAutoScalingTriggerThread-98772199430356994-dummy.host.com:8984_solr-n_00, state=RUNNABLE, group=Overseer autoscaling triggers]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=25490, name=OverseerAutoScalingTriggerThread-98772199430356994-dummy.host.com:8984_solr-n_00, state=RUNNABLE, group=Overseer autoscaling triggers]
        at __randomizedtesting.SeedInfo.seed([E9B9CF785279AD0C:3F66EE226464BE94]:0)
Caused by: org.apache.solr.common.SolrException: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling/events/.auto_add_replicas
        at __randomizedtesting.SeedInfo.seed([E9B9CF785279AD0C]:0)
        at org.apache.solr.cloud.ZkDistributedQueue.<init>(ZkDistributedQueue.java:107)
        at org.apache.solr.cloud.autoscaling.TriggerEventQueue.<init>(TriggerEventQueue.java:44)
        at org.apache.solr.cloud.autoscaling.ScheduledTriggers$ScheduledTrigger.<init>(ScheduledTriggers.java:398)
        at org.apache.solr.cloud.autoscaling.ScheduledTriggers.add(ScheduledTriggers.java:149)
        at org.apache.solr.cloud.autoscaling.OverseerTriggerThread.run(OverseerTriggerThread.java:220)
        at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling/events/.auto_add_replicas
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1102)
        at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:323)
        at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:320)
        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
        at org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:320)
        at org.apache.solr.common.cloud.ZkCmdExecutor.ensureExists(ZkCmdExecutor.java:93)
        at org.apache.solr.common.cloud.ZkCmdExecutor.ensureExists(ZkCmdExecutor.java:78)
        at org.apache.solr.cloud.ZkDistributedQueue.<init>(ZkDistributedQueue.java:105)
        ... 5 more

Build Log:
[...truncated 13284 lines...]
   [junit4] Suite: org.apache.solr.cloud.TestLeaderElectionZkExpiry
   [junit4]   2> Creating dataDir: /Users/jenkins/workspace/Lucene-Solr-master-MacOSX/solr/build/solr-core/test/J0/temp/solr.cloud.TestLeaderElectionZkExpiry_E9B9CF785279AD0C-001/init-core-data-001
   [junit4]   2> 2516353 WARN  (SUITE-TestLeaderElectionZkExpiry-seed#[E9B9CF785279AD0C]-worker) [] o.a.s.SolrTestCaseJ4 startTrackingSearchers: numOpens=2 numCloses=2
   [junit4]   2> 2516353 INFO  (SUITE-TestLeaderElectionZkExpiry-seed#[E9B9CF785279AD0C]-worker) [] o.a.s.SolrTestCaseJ4 Using TrieFields (NUMERIC_POINTS_SYSPROP=false) w/NUMERIC_DOCVALUES_SYSPROP=true
   [junit4]   2> 2516354 INFO  (SUITE-TestLeaderElectionZkExpiry-seed#[E9B9CF785279AD0C]-worker) [] o.a.s.SolrTestCaseJ4 Randomized ssl (true) and clientAuth (false) via: @org.apache.solr.util.RandomizeSSL(reason="", value=0.0/0.0, ssl=0.0/0.0, clientAuth=0.0/0.0) w/ MAC_OS_X supressed clientAuth
   [junit4]   2> 2516354 INFO  (SUITE-TestLeaderElectionZkExpiry-seed#[E9B9CF785279AD0C]-worker) [] o.a.s.SolrTestCaseJ4 SecureRandom sanity checks: test.solr.allowed.securerandom=null & java.security.egd=file:/dev/./urandom
   [junit4]   2> 2516355 INFO  (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[E9B9CF785279AD0C]) [] o.a.s.SolrTestCaseJ4 ###Starting testLeaderElectionWithZkExpiry
   [junit4]   2> 2516357 INFO  (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[E9B9CF785279AD0C]) [] o.a.s.c.SolrXmlConfig MBean server found: com.sun.jmx.mbeanserver.JmxMBeanServer@4cf80e12, but no JMX reporters were configured - adding default JMX reporter.
   [junit4]   2> 2516389 INFO  (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[E9B9CF785279AD0C]) [] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.node' (registry 'solr.node') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@4cf80e12
   [junit4]   2> 2516398 INFO  (TEST-TestLeaderElectionZkExpiry.testLeaderElectionWithZkExpiry-seed#[E9B9CF785279AD0C]) [] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.jvm' (registry 'solr.jvm') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@4cf80e12
   [junit4]   2> 2516398 INFO
[jira] [Commented] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191883#comment-16191883 ]

ASF subversion and git services commented on SOLR-10842:
--------------------------------------------------------

Commit 91659938610036034d0bf1abc7786d781c4f661c in lucene-solr's branch refs/heads/branch_7x from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9165993 ]

SOLR-10842: Convert all remaining {{quickstart.html}} links to {{guide/solr-tutorial.html}}; remove all references to quickstart from the build; and version the link to the ref guide's tutorial in Solr's versioned top-level documentation page.
[jira] [Commented] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191884#comment-16191884 ]

ASF subversion and git services commented on SOLR-10842:
--------------------------------------------------------

Commit 93d8e428ea4643c6641d8a9b2c73827683b831a4 in lucene-solr's branch refs/heads/master from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=93d8e42 ]

SOLR-10842: Convert all remaining {{quickstart.html}} links to {{guide/solr-tutorial.html}}; remove all references to quickstart from the build; and version the link to the ref guide's tutorial in Solr's versioned top-level documentation page.
[jira] [Updated] (SOLR-11438) Solr should return rf when min_rf is specified for delete-by-id and delete-by-query.
[ https://issues.apache.org/jira/browse/SOLR-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-11438:
----------------------------------
    Attachment: SOLR-11438.patch

A bare sketch of a patch, just enough to show that it's _possible_ to get "rf" back from deletes. I just mechanically passed replicationTracker to cmdDistrib.distribDelete (sometimes) without really understanding what I was doing, and there are zero tests.

I did note that processAdd returned an rf of 3 when there were 4 replicas, so my bit of refactoring may have messed up processAdd. Like I said, this was only to see if it could be done.

> Solr should return rf when min_rf is specified for delete-by-id and
> delete-by-query.
> -------------------------------------------------------------------
>
>                 Key: SOLR-11438
>                 URL: https://issues.apache.org/jira/browse/SOLR-11438
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 6.6.1, 7.0, master (8.0)
>         Reporter: Erick Erickson
>         Assignee: Erick Erickson
>      Attachments: SOLR-11438.patch
>
> When we add documents and specify min_rf, we get back an rf parameter in the
> response, which is the number of replicas that successfully received the add.
> However, for delete-by-id or delete-by-query we do not return this data. Is
> there any harm in it?
> Assigning to myself to track; anyone else who wants it, feel free.
[jira] [Commented] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191865#comment-16191865 ]

ASF subversion and git services commented on SOLR-10842:
--------------------------------------------------------

Commit a29a08716766706bd913792cfd3a5dc1cd970de9 in lucene-solr's branch refs/heads/branch_7_0 from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a29a087 ]

SOLR-10842: Convert all remaining {{quickstart.html}} links to {{guide/solr-tutorial.html}}; remove all references to quickstart from the build; and version the link to the ref guide's tutorial in Solr's versioned top-level documentation page.
[jira] [Created] (SOLR-11438) Solr should return rf when min_rf is specified for delete-by-id and delete-by-query.
Erick Erickson created SOLR-11438:
-------------------------------------

             Summary: Solr should return rf when min_rf is specified for delete-by-id and delete-by-query.
                 Key: SOLR-11438
                 URL: https://issues.apache.org/jira/browse/SOLR-11438
             Project: Solr
          Issue Type: Improvement
   Security Level: Public (Default Security Level. Issues are Public)
 Affects Versions: 7.0, 6.6.1, master (8.0)
         Reporter: Erick Erickson
         Assignee: Erick Erickson

When we add documents and specify min_rf, we get back an rf parameter in the
response, which is the number of replicas that successfully received the add.
However, for delete-by-id or delete-by-query we do not return this data. Is
there any harm in it?

Assigning to myself to track; anyone else who wants it, feel free.
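The rf/min_rf contract described above — rf is the number of replicas that acknowledged an update, which the client compares against the min_rf it requested — can be sketched in plain Java. This is a hypothetical illustration of the bookkeeping only; the class and method names are invented for the example and are not Solr's actual distributed-update code.

```java
import java.util.List;

/**
 * Illustrative sketch of the rf/min_rf contract from the issue
 * (not Solr's implementation): rf counts the replicas that
 * acknowledged an update; callers compare it to their min_rf.
 */
class ReplicationFactorCheck {
    /** rf = the leader (assumed to have applied the update) + replica acks. */
    static int achievedRf(List<Boolean> replicaAcks) {
        int rf = 1; // count the leader itself
        for (boolean acked : replicaAcks) {
            if (acked) {
                rf++;
            }
        }
        return rf;
    }

    /** True when the achieved rf meets the requested minimum. */
    static boolean metMinRf(List<Boolean> replicaAcks, int minRf) {
        return achievedRf(replicaAcks) >= minRf;
    }
}
```

The issue's point is that this same rf value, already returned for adds, could also be reported for delete-by-id and delete-by-query responses.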
[jira] [Commented] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191859#comment-16191859 ]

Cassandra Targett commented on SOLR-10842:
------------------------------------------

+1 Steve, thanks.
[jira] [Commented] (SOLR-11392) StreamExpressionTest.testParallelExecutorStream fails too frequently
[ https://issues.apache.org/jira/browse/SOLR-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191834#comment-16191834 ]

ASF subversion and git services commented on SOLR-11392:
--------------------------------------------------------

Commit 8ac381a6a0ce7fae1d50896f15dd4fe8307c79d6 in lucene-solr's branch refs/heads/branch_7x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8ac381a ]

SOLR-11392: Change collection names in test case

> StreamExpressionTest.testParallelExecutorStream fails too frequently
> --------------------------------------------------------------------
>
>                 Key: SOLR-11392
>                 URL: https://issues.apache.org/jira/browse/SOLR-11392
>             Project: Solr
>          Issue Type: New Feature
>   Security Level: Public (Default Security Level. Issues are Public)
>         Reporter: Joel Bernstein
>
> I've never been able to reproduce the failure, but Jenkins fails frequently
> with the following error:
> {code}
> Stack Trace:
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from
> server at http://127.0.0.1:38180/solr/workQueue_shard2_replica_n3: Expected
> mime type application/octet-stream but got text/html.
>
> Error 404
> HTTP ERROR: 404
> Problem accessing /solr/workQueue_shard2_replica_n3/update. Reason:
> Can not find: /solr/workQueue_shard2_replica_n3/update
> Powered by Jetty:// 9.3.20.v20170531
> {code}
> What appears to be happening is that the test framework is having trouble
> setting up the collection.
> Here is the test code:
> {code}
> @Test
> public void testParallelExecutorStream() throws Exception {
>   CollectionAdminRequest.createCollection("workQueue", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("workQueue", cluster.getSolrClient().getZkStateReader(), false, true, TIMEOUT);
>   CollectionAdminRequest.createCollection("mainCorpus", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("mainCorpus", cluster.getSolrClient().getZkStateReader(), false, true, TIMEOUT);
>   CollectionAdminRequest.createCollection("destination", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("destination", cluster.getSolrClient().getZkStateReader(), false, true, TIMEOUT);
>
>   UpdateRequest workRequest = new UpdateRequest();
>   UpdateRequest dataRequest = new UpdateRequest();
>
>   for (int i = 0; i < 500; i++) {
>     workRequest.add(id, String.valueOf(i), "expr_s", "update(destination, batchSize=50, search(mainCorpus, q=id:"+i+", rows=1, sort=\"id asc\", fl=\"id, body_t, field_i\"))");
>     dataRequest.add(id, String.valueOf(i), "body_t", "hello world "+i, "field_i", Integer.toString(i));
>   }
>
>   workRequest.commit(cluster.getSolrClient(), "workQueue");
>   dataRequest.commit(cluster.getSolrClient(), "mainCorpus");
> {code}
[jira] [Commented] (SOLR-11285) Support simulations at scale in the autoscaling framework
[ https://issues.apache.org/jira/browse/SOLR-11285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191826#comment-16191826 ]

ASF subversion and git services commented on SOLR-11285:
--------------------------------------------------------

Commit 5c62fb56f7a7a86214cbff7e171f461603b9b0fe in lucene-solr's branch refs/heads/master from [~ab]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5c62fb5 ]

SOLR-11285: Fix a bug in Policy modifier methods.

> Support simulations at scale in the autoscaling framework
> ---------------------------------------------------------
>
>                 Key: SOLR-11285
>                 URL: https://issues.apache.org/jira/browse/SOLR-11285
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
>       Components: AutoScaling
>         Reporter: Andrzej Bialecki
>         Assignee: Andrzej Bialecki
>      Attachments: SOLR-11285.patch
>
> This is a spike to investigate how difficult it would be to modify the
> autoscaling framework so that it's possible to run simulated large-scale
> experiments and test its dynamic behavior without actually spinning up a
> large cluster.
> Currently many components rely heavily on actual Solr, ZK, and the behavior
> of ZK watches, or insist on making actual HTTP calls. A notable exception is
> the core Policy framework, where most of the ZK / Solr details are abstracted.
> As the algorithms for autoscaling that we implement become more and more
> complex, the ability to effectively run multiple large simulations will be
> crucial - it's very easy to unknowingly introduce catastrophic instabilities
> that don't manifest themselves in regular unit tests.
[jira] [Updated] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-10842: -- Attachment: SOLR-10842-part2.patch Patch. Converts all remaining {{quickstart.html}} links to {{guide/solr-tutorial.html}}; removes all references to quickstart from the build; and versions the link to the ref guide's tutorial in Solr's versioned top-level documentation page. Committing shortly. > Move quickstart.html to Ref Guide > - > > Key: SOLR-10842 > URL: https://issues.apache.org/jira/browse/SOLR-10842 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Minor > Fix For: 7.0 > > Attachments: SOLR-10842-part2.patch, SOLR-10842.patch > > > The Solr Quick Start at https://lucene.apache.org/solr/quickstart.html has > been problematic to keep up to date - until Ishan just updated it yesterday > for 6.6, it said "6.2.1", so hadn't been updated for several releases. > Now that the Ref Guide is in AsciiDoc format, we can easily use variables for > package versions, and it could be released as part of the Ref Guide and kept > up to date. It could also integrate links to more information on topics, and > users would already be IN the docs, so they would not need to wonder where > the docs are. > There are a few places on the site that will need to be updated to point to > the new location, but I can also put a redirect rule into .htaccess so people > are redirected to the new location if there are other links "in the wild" > that we cannot control. This allows it to be versioned also, if that becomes > necessary. > As part of this, I would like to also update the entire "Getting Started" > section of the Ref Guide, which is effectively identical to what was in the > first release of the Ref Guide back in 2009 for Solr 1.4 and is in serious > need of reconsideration. 
> My thought is that moving the page + redoing the Getting Started section > would be for 7.0, but if folks are excited about this idea I could move the > page for 6.6 and hold off redoing the larger section until 7.0. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
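The .htaccess redirect mentioned in the issue above can be a single mod_alias rule. Here is a hypothetical sketch (the paths are assumptions for illustration, not the rule actually deployed on the site):

```apache
# Hypothetical rule for links "in the wild": permanently forward the old
# quickstart page to the Ref Guide tutorial. Paths are assumed, not the
# actual lucene.apache.org configuration.
RedirectMatch 301 ^/solr/quickstart\.html$ /solr/guide/solr-tutorial.html
```

A 301 (permanent) redirect also tells search engines to update their index to the new URL, which matters for a page this widely linked.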
[jira] [Commented] (SOLR-11392) StreamExpressionTest.testParallelExecutorStream fails too frequently
[ https://issues.apache.org/jira/browse/SOLR-11392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191788#comment-16191788 ] ASF subversion and git services commented on SOLR-11392: Commit 070d6d3748341a955d807570d96896068a933f3e in lucene-solr's branch refs/heads/master from [~joel.bernstein] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=070d6d3 ] SOLR-11392: Change collection names in test case > StreamExpressionTest.testParallelExecutorStream fails too frequently > > > Key: SOLR-11392 > URL: https://issues.apache.org/jira/browse/SOLR-11392 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein > > I've never been able to reproduce the failure but jenkins fails frequently > with the following error: > {code} > Stack Trace: > org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from > server at http://127.0.0.1:38180/solr/workQueue_shard2_replica_n3: Expected > mime type application/octet-stream but got text/html. > > > Error 404 > > > HTTP ERROR: 404 > Problem accessing /solr/workQueue_shard2_replica_n3/update. Reason: > Can not find: /solr/workQueue_shard2_replica_n3/update > http://eclipse.org/jetty;>Powered by Jetty:// > 9.3.20.v20170531 > > > {code} > What appears to be happening is that the test framework is having trouble > setting up the collection. 
> Here is the test code:
> {code}
> @Test
> public void testParallelExecutorStream() throws Exception {
>   CollectionAdminRequest.createCollection("workQueue", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("workQueue", cluster.getSolrClient().getZkStateReader(), false, true, TIMEOUT);
>   CollectionAdminRequest.createCollection("mainCorpus", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("mainCorpus", cluster.getSolrClient().getZkStateReader(), false, true, TIMEOUT);
>   CollectionAdminRequest.createCollection("destination", "conf", 2, 1).process(cluster.getSolrClient());
>   AbstractDistribZkTestBase.waitForRecoveriesToFinish("destination", cluster.getSolrClient().getZkStateReader(), false, true, TIMEOUT);
>   UpdateRequest workRequest = new UpdateRequest();
>   UpdateRequest dataRequest = new UpdateRequest();
>   for (int i = 0; i < 500; i++) {
>     workRequest.add(id, String.valueOf(i), "expr_s", "update(destination, batchSize=50, search(mainCorpus, q=id:"+i+", rows=1, sort=\"id asc\", fl=\"id, body_t, field_i\"))");
>     dataRequest.add(id, String.valueOf(i), "body_t", "hello world "+i, "field_i", Integer.toString(i));
>   }
>   workRequest.commit(cluster.getSolrClient(), "workQueue");
>   dataRequest.commit(cluster.getSolrClient(), "mainCorpus");
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191756#comment-16191756 ] ASF subversion and git services commented on SOLR-11436: Commit 5a123ab5c1738f78ded0bb3b1c31272712bc5c69 in lucene-solr's branch refs/heads/branch_7x from [~joel.bernstein] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5a123ab ] SOLR-11436: Add polyfit and polyfitDerivative Stream Evaluators > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 7.1, master (8.0) > > Attachments: SOLR-11436.patch, SOLR-11436.patch > > > The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. > Implementation provided by the Apache Commons Math polynomial curve fitting > implementation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
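For readers who haven't used these evaluators, the underlying math is ordinary least-squares polynomial fitting plus term-by-term differentiation of the resulting coefficients. The sketch below is a dependency-free illustration of that idea only; Solr's actual implementation delegates to Apache Commons Math's polynomial curve fitter, and the function names here are invented for the example.

```python
# Illustrative only: least-squares polynomial fit via the normal equations
# (A^T A c = A^T y with A the Vandermonde matrix), then differentiation of
# the coefficients. Not Solr's code, which uses Apache Commons Math.

def polyfit(xs, ys, degree):
    n = degree + 1
    # Normal equations: (A^T A)[i][j] = sum x^(i+j), (A^T y)[i] = sum y*x^i
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            f = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= f * ata[col][k]
            aty[row] -= f * aty[col]
    # Back substitution yields [c0, c1, ...] for c0 + c1*x + c2*x^2 + ...
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(ata[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aty[row] - s) / ata[row][row]
    return coeffs

def polyfit_derivative(coeffs):
    # d/dx (c0 + c1*x + c2*x^2 + ...) = c1 + 2*c2*x + ...
    return [i * c for i, c in enumerate(coeffs)][1:]

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x * x - 2 * x + 1 for x in xs]   # exact quadratic: x^2 - 2x + 1
coeffs = polyfit(xs, ys, 2)            # ~[1.0, -2.0, 1.0]
deriv = polyfit_derivative(coeffs)     # ~[-2.0, 2.0], i.e. 2x - 2
```

In Solr these evaluators are invoked from streaming expressions over numeric arrays; the sketch only shows the math they perform.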
Re: [VOTE] Release Lucene/Solr 7.0.1 RC1
+1 SUCCESS! [0:52:40.786442] On Wed, Oct 4, 2017 at 4:28 AM Tommaso Teofili wrote: > +1 > > SUCCESS! [2:16:41.101128] > > On Tue, Oct 3, 2017 at 10:56 AM Dawid Weiss wrote: >> +1. >> >> SUCCESS! [0:56:08.500826] >> >> >> On Mon, Oct 2, 2017 at 9:43 PM, Steve Rowe wrote: >> > Please vote for release candidate 1 for Lucene/Solr 7.0.1 >> > >> > The artifacts can be downloaded from: >> > >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.0.1-RC1-rev8d6c3889aa543954424d8ac1dbb3f03bf207140b >> > >> > You can run the smoke tester directly with this command: >> > >> > python3 -u dev-tools/scripts/smokeTestRelease.py \ >> > >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.0.1-RC1-rev8d6c3889aa543954424d8ac1dbb3f03bf207140b >> > >> > Here's my +1 >> > [0:28:08.126321] >> > - >> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> > For additional commands, e-mail: dev-h...@lucene.apache.org >> > >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> -- Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
[jira] [Commented] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191748#comment-16191748 ] ASF subversion and git services commented on SOLR-11436: Commit 1782dd9ca901eced0adbe9932cdbc50ad5792663 in lucene-solr's branch refs/heads/master from [~joel.bernstein] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1782dd9 ] SOLR-11436: Add polyfit and polyfitDerivative Stream Evaluators > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 7.1, master (8.0) > > Attachments: SOLR-11436.patch, SOLR-11436.patch > > > The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. > Implementation provided by the Apache Commons Math polynomial curve fitting > implementation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-7.x-Windows (64bit/jdk-9) - Build # 231 - Still Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Windows/231/ Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseParallelGC --illegal-access=deny 2 tests failed. FAILED: org.apache.solr.cloud.TestTlogReplica.testOutOfOrderDBQWithInPlaceUpdates Error Message: Can not find doc 1 in http://127.0.0.1:59312/solr Stack Trace: java.lang.AssertionError: Can not find doc 1 in http://127.0.0.1:59312/solr at __randomizedtesting.SeedInfo.seed([A6EE7F86C565808C:202F876B9A34566C]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertNotNull(Assert.java:526) at org.apache.solr.cloud.TestTlogReplica.checkRTG(TestTlogReplica.java:861) at org.apache.solr.cloud.TestTlogReplica.testOutOfOrderDBQWithInPlaceUpdates(TestTlogReplica.java:664) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:564) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.base/java.lang.Thread.run(Thread.java:844) FAILED: org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication Error
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191712#comment-16191712 ] Uwe Schindler commented on LUCENE-7976: --- bq. "I think the main issue" ... I disagree; this issue is about freeing up many deleted docs. Uwe, feel free of course to create a Solr issue to rename "optimize" to "forceMerge" and to suggest where the Solr Ref Guide's wording is either bad or needs improvement. I think these are clearly separate from this issue. Sorry, it is always caused by calling "optimize" or "forceMerge" at some point in the past. Doing this always brings the index into a state where the deletes sum up, because it's no longer in an ideal state for deleting and adding new documents. If you never call forceMerge/"optimize", the deletes won't automatically sum up, as TieredMergePolicy will merge them away. The deleted documents ratio is in most cases between 30 and 40% on the whole index in that case. But if you force merge, it gets bad and you sometimes sum up 80% deletes. The reason was described before. And for that reason it is very important to remove "optimize" from Solr; THIS issue won't happen without "optimize"! PERIOD. > Add a parameter to TieredMergePolicy to merge segments that have more than X > percent deleted documents > -- > > Key: LUCENE-7976 > URL: https://issues.apache.org/jira/browse/LUCENE-7976 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Erick Erickson > > We're seeing situations "in the wild" where there are very large indexes (on > disk) handled quite easily in a single Lucene index. This is particularly > true as features like docValues move data into MMapDirectory space. The > current TMP algorithm allows on the order of 50% deleted documents as per a > dev list conversation with Mike McCandless (and his blog here: > https://www.elastic.co/blog/lucenes-handling-of-deleted-documents). 
> Especially in the current era of very large indexes in aggregate, (think many > TB) solutions like "you need to distribute your collection over more shards" > become very costly. Additionally, the tempting "optimize" button exacerbates > the issue since once you form, say, a 100G segment (by > optimizing/forceMerging) it is not eligible for merging until 97.5G of the > docs in it are deleted (current default 5G max segment size). > The proposal here would be to add a new parameter to TMP, something like > (no, that's not serious name, suggestions > welcome) which would default to 100 (or the same behavior we have now). > So if I set this parameter to, say, 20%, and the max segment size stays at > 5G, the following would happen when segments were selected for merging: > > any segment with > 20% deleted documents would be merged or rewritten NO > > MATTER HOW LARGE. There are two cases, > >> the segment has < 5G "live" docs. In that case it would be merged with > >> smaller segments to bring the resulting segment up to 5G. If no smaller > >> segments exist, it would just be rewritten > >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). > >> It would be rewritten into a single segment removing all deleted docs no > >> matter how big it is to start. The 100G example above would be rewritten > >> to an 80G segment for instance. > Of course this would lead to potentially much more I/O which is why the > default would be the same behavior we see now. As it stands now, though, > there's no way to recover from an optimize/forceMerge except to re-index from > scratch. We routinely see 200G-300G Lucene indexes at this point "in the > wild" with 10s of shards replicated 3 or more times. And that doesn't even > include having these over HDFS. > Alternatives welcome! Something like the above seems minimally invasive. A > new merge policy is certainly an alternative. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11434) Solr 4.10 sharded collection issues when SSL is enabled
[ https://issues.apache.org/jira/browse/SOLR-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191705#comment-16191705 ] Erick Erickson commented on SOLR-11434: --- There have been lots of changes in SSL support since the 4.x days, and I know for certain that there are installations using SSL in sharded situations on more recent versions, so I don't think it's still a problem. Of course if anybody sees it in 6.6 or 7.x we can reopen this. > Solr 4.10 sharded collection issues when SSL is enabled > --- > > Key: SOLR-11434 > URL: https://issues.apache.org/jira/browse/SOLR-11434 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: security >Affects Versions: 4.10.3 >Reporter: Magesh Tarala > > We have a 3 node solr cloud installation running on version 4.10. There is > one collection that’s sharded. After enabling SSL, we are unable to query the > sharded collection. Other non sharded collections are ok. We are getting this > error: > “no servers hosting shard:” > I’ve googled and seen reports of this issue, but have not seen a resolution. > Thanks in advance for your help! -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-6205) Make SolrCloud Data-center, rack or zone aware
[ https://issues.apache.org/jira/browse/SOLR-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191698#comment-16191698 ] jefferyyuan edited comment on SOLR-6205 at 10/4/17 5:53 PM: Making Solr rack-aware can help prevent data loss and improve query performance. Elasticsearch already supports it: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/allocation-awareness.html And many other projects support this as well: Hadoop, Cassandra, Kafka, etc. was (Author: yuanyun.cn): Making Solr rack-aware can help prevent data loss and improve query performance. Elasticsearch already supports it: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/allocation-awareness.html > Make SolrCloud Data-center, rack or zone aware > -- > > Key: SOLR-6205 > URL: https://issues.apache.org/jira/browse/SOLR-6205 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Affects Versions: 4.8.1 >Reporter: Arcadius Ahouansou >Assignee: Noble Paul > > Use case: > Let's say we have SolrCloud deployed across 2 Datacenters, racks or zones A > and B > There is a need to have a SolrCloud deployment that will make it possible to > have a working system even if one of the Datacenter/rack/zone A or B is lost. > - This has been discussed on the mailing list at > http://lucene.472066.n3.nabble.com/SolrCloud-multiple-data-center-support-td4115097.html > and there are many workarounds that require adding more moving parts to the > system. > - On the above thread, Daniel Collins mentioned > https://issues.apache.org/jira/browse/ZOOKEEPER-107 > which could help solve this issue. > - Note that this is a very important feature that is overlooked most of the > time. > - Note that this feature is available in ElasticSearch. 
> See > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#allocation-awareness > and > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#forced-awareness -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191703#comment-16191703 ] Erick Erickson commented on LUCENE-7976: I linked in SOLR-7733 for dealing with the admin UI optimize button (I favor removing it entirely: make people put in some _effort_ to back themselves into a corner). re: read-only rather than optimize. It may be that the cases I've seen where users think optimize gives a speed improvement are really the result of squeezing out the deleted documents. Question for the Lucene folks: what would you guess the performance difference would be between a single 200G segment and 40 5G segments, with no deleted documents? I see indexes on disk at that size in the wild. If the perf in the two cases above is "close enough" then freezing rather than optimize is an easier sell. The rest of this JIRA is about keeping the % deleted documents small, which, if we do, would handle the perf issues people get currently from forceMerge, assuming the above. [~msoko...@gmail.com] The delete percentage isn't really the issue currently: if TMP respects max segment size it can't merge two segments > 50% live docs. If TMP were tweaked to merge _unlike_ size segments when some % deleted docs is exceeded in the large one (i.e. merge a segment with 4.75G live docs with a segment with 0.25G live docs) we could get there. [~mikemccand]: bq: Right, but that's a normal/acceptable index state, where up to 50% of your docs are deleted Gotta disagree with acceptable, normal I'll grant. We're way over indexes being terabytes and on our way to petabytes. I have cases where they're running out of physical room to add more disks. Saying that half your disk space can be occupied by deleted documents is a hard sell. 
> Add a parameter to TieredMergePolicy to merge segments that have more than X > percent deleted documents > -- > > Key: LUCENE-7976 > URL: https://issues.apache.org/jira/browse/LUCENE-7976 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Erick Erickson > > We're seeing situations "in the wild" where there are very large indexes (on > disk) handled quite easily in a single Lucene index. This is particularly > true as features like docValues move data into MMapDirectory space. The > current TMP algorithm allows on the order of 50% deleted documents as per a > dev list conversation with Mike McCandless (and his blog here: > https://www.elastic.co/blog/lucenes-handling-of-deleted-documents). > Especially in the current era of very large indexes in aggregate, (think many > TB) solutions like "you need to distribute your collection over more shards" > become very costly. Additionally, the tempting "optimize" button exacerbates > the issue since once you form, say, a 100G segment (by > optimizing/forceMerging) it is not eligible for merging until 97.5G of the > docs in it are deleted (current default 5G max segment size). > The proposal here would be to add a new parameter to TMP, something like > (no, that's not serious name, suggestions > welcome) which would default to 100 (or the same behavior we have now). > So if I set this parameter to, say, 20%, and the max segment size stays at > 5G, the following would happen when segments were selected for merging: > > any segment with > 20% deleted documents would be merged or rewritten NO > > MATTER HOW LARGE. There are two cases, > >> the segment has < 5G "live" docs. In that case it would be merged with > >> smaller segments to bring the resulting segment up to 5G. If no smaller > >> segments exist, it would just be rewritten > >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no > >> matter how big it is to start. The 100G example above would be rewritten > >> to an 80G segment for instance. > Of course this would lead to potentially much more I/O which is why the > default would be the same behavior we see now. As it stands now, though, > there's no way to recover from an optimize/forceMerge except to re-index from > scratch. We routinely see 200G-300G Lucene indexes at this point "in the > wild" with 10s of shards replicated 3 or more times. And that doesn't even > include having these over HDFS. > Alternatives welcome! Something like the above seems minimally invasive. A > new merge policy is certainly an alternative. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
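The eligibility arithmetic in the description above can be made concrete. Under a simplified model of TieredMergePolicy (an assumption for illustration, not the actual merge-selection code), an oversized segment is reconsidered for merging only once its live data would fit in half the max merged segment size:

```python
# Back-of-the-envelope check of the "97.5G of 100G deleted" claim in the
# issue description, using a simplified model of TieredMergePolicy: an
# oversized segment only becomes a merge candidate again once its *live*
# data drops below half the max merged segment size.

MAX_SEGMENT_GB = 5.0  # TMP's default max merged segment size

def deletes_needed_before_eligible(segment_gb, max_segment_gb=MAX_SEGMENT_GB):
    """GB of deletions a force-merged segment must accumulate before this
    simplified model treats it as mergeable again."""
    live_threshold = max_segment_gb / 2  # live data must shrink below this
    return segment_gb - live_threshold

print(deletes_needed_before_eligible(100.0))  # 97.5, matching the description
```

This is why the proposal targets percent-deleted directly: without it, the larger the force-merged segment, the larger the fraction of dead data it must carry before natural merging ever touches it again.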
[jira] [Commented] (SOLR-6205) Make SolrCloud Data-center, rack or zone aware
[ https://issues.apache.org/jira/browse/SOLR-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191698#comment-16191698 ] jefferyyuan commented on SOLR-6205: --- Making Solr rack-aware can help prevent data loss and improve query performance. Elasticsearch already supports it: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/allocation-awareness.html > Make SolrCloud Data-center, rack or zone aware > -- > > Key: SOLR-6205 > URL: https://issues.apache.org/jira/browse/SOLR-6205 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Affects Versions: 4.8.1 >Reporter: Arcadius Ahouansou >Assignee: Noble Paul > > Use case: > Let's say we have SolrCloud deployed across 2 Datacenters, racks or zones A > and B > There is a need to have a SolrCloud deployment that will make it possible to > have a working system even if one of the Datacenter/rack/zone A or B is lost. > - This has been discussed on the mailing list at > http://lucene.472066.n3.nabble.com/SolrCloud-multiple-data-center-support-td4115097.html > and there are many workarounds that require adding more moving parts to the > system. > - On the above thread, Daniel Collins mentioned > https://issues.apache.org/jira/browse/ZOOKEEPER-107 > which could help solve this issue. > - Note that this is a very important feature that is overlooked most of the > time. > - Note that this feature is available in ElasticSearch. > See > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#allocation-awareness > and > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#forced-awareness -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-7.x-Linux (64bit/jdk1.8.0_144) - Build # 544 - Failure!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/544/ Java: 64bit/jdk1.8.0_144 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 60485 lines...] -documentation-lint: [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory /home/jenkins/workspace/Lucene-Solr-7.x-Linux/lucene/build/jtidy_tmp [echo] Checking for broken links... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [exec] [exec] file:///home/jenkins/workspace/Lucene-Solr-7.x-Linux/solr/build/docs/quickstart.html [exec] BAD EXTERNAL LINK: https://lucene.apache.org/solr/guide/solr-tutorial.html [exec] [exec] Broken javadocs links were found! Common root causes: [exec] * A typo of some sort for manually created links. [exec] * Public methods referencing non-public classes in their signature. BUILD FAILED /home/jenkins/workspace/Lucene-Solr-7.x-Linux/build.xml:826: The following error occurred while executing this line: /home/jenkins/workspace/Lucene-Solr-7.x-Linux/build.xml:101: The following error occurred while executing this line: /home/jenkins/workspace/Lucene-Solr-7.x-Linux/solr/build.xml:669: The following error occurred while executing this line: /home/jenkins/workspace/Lucene-Solr-7.x-Linux/solr/build.xml:682: The following error occurred while executing this line: /home/jenkins/workspace/Lucene-Solr-7.x-Linux/lucene/common-build.xml:2570: exec returned: 1 Total time: 96 minutes 48 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts [WARNINGS] Skipping publisher since build result is FAILURE Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11434) Solr 4.10 sharded collection issues when SSL is enabled
[ https://issues.apache.org/jira/browse/SOLR-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191632#comment-16191632 ] Magesh Tarala commented on SOLR-11434: -- Erick - Thanks for the super fast response. Looking at the JIRA tickets, there's nothing that indicates this has been reported/fixed. So, am I correct to assume this is still an issue in the 5.x and 6.x code lines? > Solr 4.10 sharded collection issues when SSL is enabled > --- > > Key: SOLR-11434 > URL: https://issues.apache.org/jira/browse/SOLR-11434 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: security >Affects Versions: 4.10.3 >Reporter: Magesh Tarala > > We have a 3 node solr cloud installation running on version 4.10. There is > one collection that’s sharded. After enabling SSL, we are unable to query the > sharded collection. Other non sharded collections are ok. We are getting this > error: > “no servers hosting shard:” > I’ve googled and seen reports of this issue, but have not seen a resolution. > Thanks in advance for your help! -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11372) enable refinement testing in TestCloudJSONFacetJoinDomain
[ https://issues.apache.org/jira/browse/SOLR-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191626#comment-16191626 ] Hoss Man commented on SOLR-11372: - no worries, i just didn't want it to slip through the cracks -- especially since the work varun is doing in SOLR-11391 is going to involve more updates to this test. I wanted to ensure he had a solid foundation to build on. > enable refinement testing in TestCloudJSONFacetJoinDomain > - > > Key: SOLR-11372 > URL: https://issues.apache.org/jira/browse/SOLR-11372 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Yonik Seeley >Assignee: Yonik Seeley > Fix For: 7.1 > > Attachments: SOLR-11372_fixup.patch, SOLR-11372_fixup.patch, > SOLR-11372.patch > > > This test has great random tests that ensure that the count returned for a > bucket matches the number of documents returned from an equivalent filter. > We should enable randomly testing smaller limits in conjunction with > refinement to ensure we still get accurate counts for buckets that are > returned. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-11436: -- Attachment: SOLR-11436.patch > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 7.1, master (8.0) > > Attachments: SOLR-11436.patch, SOLR-11436.patch > > > The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. > Implementation provided by the Apache Commons Math polynomial curve fitting > implementation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11372) enable refinement testing in TestCloudJSONFacetJoinDomain
[ https://issues.apache.org/jira/browse/SOLR-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191614#comment-16191614 ] Yonik Seeley commented on SOLR-11372: - Thanks Hoss, I really didn't mean to push the work off on you, but I'm still tied down with something else for the next couple of weeks probably. > enable refinement testing in TestCloudJSONFacetJoinDomain > - > > Key: SOLR-11372 > URL: https://issues.apache.org/jira/browse/SOLR-11372 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Yonik Seeley >Assignee: Yonik Seeley > Fix For: 7.1 > > Attachments: SOLR-11372_fixup.patch, SOLR-11372_fixup.patch, > SOLR-11372.patch > > > This test has great random tests that ensure that the count returned for a > bucket matches the number of documents returned from an equivalent filter. > We should enable randomly testing smaller limits in conjunction with > refinement to ensure we still get accurate counts for buckets that are > returned. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-11391) JoinQParser for non point fields should use the GraphTermsCollector
[ https://issues.apache.org/jira/browse/SOLR-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-11391: Attachment: SOLR-11391.patch reviewing varun's latest patch now ... here's a quick updated version to get it to compile (after the SOLR-11372 tweaks i just committed) > JoinQParser for non point fields should use the GraphTermsCollector > > > Key: SOLR-11391 > URL: https://issues.apache.org/jira/browse/SOLR-11391 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Varun Thacker > Attachments: SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, > SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, > SOLR-11391.patch, SOLR-11391.patch > > > The Join Query Parser uses the GraphPointsCollector for point fields. > For non point fields if we use the GraphTermsCollector instead of the current > algorithm I am seeing quite a bit of performance gains. > I'm going to attach a quick patch which I cooked up , making sure TestJoin > and TestCloudJSONFacetJoinDomain passed. > More tests, benchmarking and code cleanup to follow -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe reopened SOLR-10842: --- > Move quickstart.html to Ref Guide > - > > Key: SOLR-10842 > URL: https://issues.apache.org/jira/browse/SOLR-10842 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Minor > Fix For: 7.0 > > Attachments: SOLR-10842.patch > > > The Solr Quick Start at https://lucene.apache.org/solr/quickstart.html has > been problematic to keep up to date - until Ishan just updated it yesterday > for 6.6, it said "6.2.1", so hadn't been updated for several releases. > Now that the Ref Guide is in AsciiDoc format, we can easily use variables for > package versions, and it could be released as part of the Ref Guide and kept > up to date. It could also integrate links to more information on topics, and > users would already be IN the docs, so they would not need to wonder where > the docs are. > There are a few places on the site that will need to be updated to point to > the new location, but I can also put a redirect rule into .htaccess so people > are redirected to the new location if there are other links "in the wild" > that we cannot control. This allows it to be versioned also, if that becomes > necessary. > As part of this, I would like to also update the entire "Getting Started" > section of the Ref Guide, which is effectively identical to what was in the > first release of the Ref Guide back in 2009 for Solr 1.4 and is in serious > need of reconsideration. > My thought is that moving the page + redoing the Getting Started section > would be for 7.0, but if folks are excited about this idea I could move the > page for 6.6 and hold off redoing the larger section until 7.0. 
[jira] [Updated] (SOLR-11391) JoinQParser for non point fields should use the GraphTermsCollector
[ https://issues.apache.org/jira/browse/SOLR-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Thacker updated SOLR-11391: - Attachment: SOLR-11391.patch small update to the patch. Still need to address the last comment > JoinQParser for non point fields should use the GraphTermsCollector > > > Key: SOLR-11391 > URL: https://issues.apache.org/jira/browse/SOLR-11391 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Varun Thacker > Attachments: SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, > SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, SOLR-11391.patch, > SOLR-11391.patch > > > The Join Query Parser uses the GraphPointsCollector for point fields. > For non point fields if we use the GraphTermsCollector instead of the current > algorithm I am seeing quite a bit of performance gains. > I'm going to attach a quick patch which I cooked up , making sure TestJoin > and TestCloudJSONFacetJoinDomain passed. > More tests, benchmarking and code cleanup to follow -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11437) double open ended range queries should be optimized to DocValuesFieldExistsQuery or NormsFieldExistsQuery
[ https://issues.apache.org/jira/browse/SOLR-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191570#comment-16191570 ] Hoss Man commented on SOLR-11437: - straw man idea: a new static helper utility in FieldType that takes the same args as {{getRangeQuery(..)}} and returns an "exists" query if and only if it's applicable for the given field, lower, upper, inclusion args -- else returns null. impls of {{getRangeQuery(..)}} could call this method as their first line, and return the result if non-null -- else continue processing.
{noformat}
public Query getRangeQuery(QParser parser, SchemaField field, String part1, String part2,
                           boolean minInclusive, boolean maxInclusive) {
  Query simpleExists = getExistsQueryIfApplicable(parser, field, part1, part2,
                                                  minInclusive, maxInclusive);
  if (null != simpleExists) {
    return simpleExists;
  }
  // existing method body
}

public static Query getExistsQueryIfApplicable(QParser parser, SchemaField field,
                                               String part1, String part2,
                                               boolean minInclusive, boolean maxInclusive) {
  if (null != part1 || null != part2 || !minInclusive || !maxInclusive) {
    return null;
  }
  if (field.hasDocValues()) {
    // ...
  }
  // ...else check norms
  // else return null
}
{noformat}
> double open ended range queries should be optimized to > DocValuesFieldExistsQuery or NormsFieldExistsQuery > - > > Key: SOLR-11437 > URL: https://issues.apache.org/jira/browse/SOLR-11437 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Hoss Man > > DocValuesFieldExistsQuery & NormsFieldExistsQuery are efficient ways to > determine if a doc has a value in a given field (assuming the field has > docValues or norms respectively) > Since Solr's schema knows if/when these properties are true for a given > field, we should be able to optimize some of the {{field:[* TO *]}} use cases > to use these queries under the covers -- notably in > {{FieldType.getRangeQuery}} and subclasses that override it, but there may be > other cases where they could come in handy as well
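[Editor's note: the applicability test in the straw man above can be sketched stand-alone. This is a hypothetical simplification, not the Solr API -- QParser/SchemaField are replaced with plain strings and booleans, and the class/method names here are made up for illustration:]

```java
// Hypothetical, simplified version of the "is this a double open-ended
// range?" test from the straw-man getExistsQueryIfApplicable() above.
// Real Solr code would take QParser/SchemaField args and return a Query.
public class RangeQueryOptimizer {

    // A range query can be rewritten to a field-exists query only when
    // both endpoints are wildcards (null here) and both ends are
    // inclusive, i.e. the query is field:[* TO *].
    static boolean isDoubleOpenEnded(String lower, String upper,
                                     boolean minInclusive, boolean maxInclusive) {
        return lower == null && upper == null && minInclusive && maxInclusive;
    }

    public static void main(String[] args) {
        System.out.println(isDoubleOpenEnded(null, null, true, true));  // true
        System.out.println(isDoubleOpenEnded("0", null, true, true));   // false: lower bound present
        System.out.println(isDoubleOpenEnded(null, null, false, true)); // false: exclusive endpoint
    }
}
```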
[jira] [Commented] (SOLR-11372) enable refinement testing in TestCloudJSONFacetJoinDomain
[ https://issues.apache.org/jira/browse/SOLR-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191562#comment-16191562 ] ASF subversion and git services commented on SOLR-11372: Commit b10eb1172a76ad877dece87893fec80895562968 in lucene-solr's branch refs/heads/master from Chris Hostetter [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b10eb11 ] SOLR-11372: addemdum, refactor refinement randomization to be reproducible when re-using the same TermFacet instance > enable refinement testing in TestCloudJSONFacetJoinDomain > - > > Key: SOLR-11372 > URL: https://issues.apache.org/jira/browse/SOLR-11372 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Yonik Seeley >Assignee: Yonik Seeley > Fix For: 7.1 > > Attachments: SOLR-11372_fixup.patch, SOLR-11372_fixup.patch, > SOLR-11372.patch > > > This test has great random tests that ensure that the count returned for a > bucket matches the number of documents returned from an equivalent filter. > We should enable randomly testing smaller limits in conjunction with > refinement to ensure we still get accurate counts for buckets that are > returned. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11372) enable refinement testing in TestCloudJSONFacetJoinDomain
[ https://issues.apache.org/jira/browse/SOLR-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191561#comment-16191561 ] ASF subversion and git services commented on SOLR-11372: Commit 85bd0afaf816e36969f6715805ce2d4e4907f0de in lucene-solr's branch refs/heads/branch_7x from Chris Hostetter [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=85bd0af ] SOLR-11372: addemdum, refactor refinement randomization to be reproducible when re-using the same TermFacet instance (cherry picked from commit b10eb1172a76ad877dece87893fec80895562968) > enable refinement testing in TestCloudJSONFacetJoinDomain > - > > Key: SOLR-11372 > URL: https://issues.apache.org/jira/browse/SOLR-11372 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Yonik Seeley >Assignee: Yonik Seeley > Fix For: 7.1 > > Attachments: SOLR-11372_fixup.patch, SOLR-11372_fixup.patch, > SOLR-11372.patch > > > This test has great random tests that ensure that the count returned for a > bucket matches the number of documents returned from an equivalent filter. > We should enable randomly testing smaller limits in conjunction with > refinement to ensure we still get accurate counts for buckets that are > returned. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-11437) double open ended range queries should be optimized to DocValuesFieldExistsQuery or NormsFieldExistsQuery
Hoss Man created SOLR-11437: --- Summary: double open ended range queries should be optimized to DocValuesFieldExistsQuery or NormsFieldExistsQuery Key: SOLR-11437 URL: https://issues.apache.org/jira/browse/SOLR-11437 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Hoss Man DocValuesFieldExistsQuery & NormsFieldExistsQuery are efficient ways to determine if a doc has a value in a given field (assuming the field has docValues or norms respectively) Since Solr's schema knows if/when these properties are true for a given field, we should be able to optimize some of the {{field:[* TO *]}} usecases to use these queries under the covers -- notably in {{FieldType.getRangeQuery}} and subclasses that override it, but there may be other cases where they could come in handy as well -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-11436: -- Description: The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for polynomial curve fitting and calculating the derivative of the fitted curve. Implementation provided by the Apache Commons Math polynomial curve fitting implementation. was:The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for polynomial curve fitting and calculating the derivative of the fitted curve. > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 7.1, master (8.0) > > Attachments: SOLR-11436.patch > > > The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. > Implementation provided by the Apache Commons Math polynomial curve fitting > implementation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-11436: -- Attachment: SOLR-11436.patch Initial implementation, tests to come shortly. > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 7.1, master (8.0) > > Attachments: SOLR-11436.patch > > > The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein reassigned SOLR-11436: - Assignee: Joel Bernstein > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 7.1, master (8.0) > > > The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-11436: -- Description: The polyfit and polyfitDerivative Stream Evaluators provide support for polynomial curve fitting and calculating the derivative of the fitted curve. (was: The polyfit and polyfitDerivative Stream Evaluators provide support for curve fitting and calculating the derivative of the fitted curve.) > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein > Fix For: 7.1, master (8.0) > > > The polyfit and polyfitDerivative Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-11436: -- Fix Version/s: master (8.0) 7.1 > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 7.1, master (8.0) > > > The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
[ https://issues.apache.org/jira/browse/SOLR-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-11436: -- Description: The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for polynomial curve fitting and calculating the derivative of the fitted curve. (was: The polyfit and polyfitDerivative Stream Evaluators provide support for polynomial curve fitting and calculating the derivative of the fitted curve.) > Add polyfit and polyfitDerivative Stream Evaluators > --- > > Key: SOLR-11436 > URL: https://issues.apache.org/jira/browse/SOLR-11436 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein > Fix For: 7.1, master (8.0) > > > The *polyfit* and *polyfitDerivative* Stream Evaluators provide support for > polynomial curve fitting and calculating the derivative of the fitted curve. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-11436) Add polyfit and polyfitDerivative Stream Evaluators
Joel Bernstein created SOLR-11436: - Summary: Add polyfit and polyfitDerivative Stream Evaluators Key: SOLR-11436 URL: https://issues.apache.org/jira/browse/SOLR-11436 Project: Solr Issue Type: New Feature Security Level: Public (Default Security Level. Issues are Public) Reporter: Joel Bernstein The polyfit and polyfitDerivative Stream Evaluators provide support for curve fitting and calculating the derivative of the fitted curve. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
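[Editor's note: the patch attached later in this thread delegates the actual work to Apache Commons Math. As a sketch of just the math a polyfitDerivative-style evaluator performs -- not the Solr implementation -- differentiating a fitted polynomial reduces to simple coefficient arithmetic:]

```java
import java.util.Arrays;

// Illustration only (NOT the SOLR-11436 implementation, which uses
// Apache Commons Math): the derivative of a fitted polynomial, given its
// coefficients in ascending order of power: c[0] + c[1]*x + c[2]*x^2 + ...
public class PolyDerivative {

    static double[] derivative(double[] c) {
        if (c.length <= 1) {
            return new double[] { 0.0 }; // derivative of a constant is 0
        }
        double[] d = new double[c.length - 1];
        for (int i = 1; i < c.length; i++) {
            d[i - 1] = i * c[i]; // d/dx of c[i]*x^i is i*c[i]*x^(i-1)
        }
        return d;
    }

    public static void main(String[] args) {
        // 1 + 2x + 3x^2  ->  2 + 6x
        System.out.println(Arrays.toString(derivative(new double[] { 1, 2, 3 })));
    }
}
```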
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191541#comment-16191541 ] Mike Sokolov commented on LUCENE-7976: -- Is it reasonable to modify the delete percentage in the policy while leaving the max in place? -- Sent from my Android device with K-9 Mail. Please excuse my brevity. > Add a parameter to TieredMergePolicy to merge segments that have more than X > percent deleted documents > -- > > Key: LUCENE-7976 > URL: https://issues.apache.org/jira/browse/LUCENE-7976 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Erick Erickson > > We're seeing situations "in the wild" where there are very large indexes (on > disk) handled quite easily in a single Lucene index. This is particularly > true as features like docValues move data into MMapDirectory space. The > current TMP algorithm allows on the order of 50% deleted documents as per a > dev list conversation with Mike McCandless (and his blog here: > https://www.elastic.co/blog/lucenes-handling-of-deleted-documents). > Especially in the current era of very large indexes in aggregate, (think many > TB) solutions like "you need to distribute your collection over more shards" > become very costly. Additionally, the tempting "optimize" button exacerbates > the issue since once you form, say, a 100G segment (by > optimizing/forceMerging) it is not eligible for merging until 97.5G of the > docs in it are deleted (current default 5G max segment size). > The proposal here would be to add a new parameter to TMP, something like > (no, that's not serious name, suggestions > welcome) which would default to 100 (or the same behavior we have now). > So if I set this parameter to, say, 20%, and the max segment size stays at > 5G, the following would happen when segments were selected for merging: > > any segment with > 20% deleted documents would be merged or rewritten NO > > MATTER HOW LARGE. 
There are two cases, > >> the segment has < 5G "live" docs. In that case it would be merged with > >> smaller segments to bring the resulting segment up to 5G. If no smaller > >> segments exist, it would just be rewritten > >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). > >> It would be rewritten into a single segment removing all deleted docs no > >> matter how big it is to start. The 100G example above would be rewritten > >> to an 80G segment for instance. > Of course this would lead to potentially much more I/O which is why the > default would be the same behavior we see now. As it stands now, though, > there's no way to recover from an optimize/forceMerge except to re-index from > scratch. We routinely see 200G-300G Lucene indexes at this point "in the > wild" with 10s of shards replicated 3 or more times. And that doesn't even > include having these over HDFS. > Alternatives welcome! Something like the above seems minimally invasive. A > new merge policy is certainly an alternative. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
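[Editor's note: the eligibility rule the proposal describes can be sketched in isolation. The parameter name below is made up -- the issue deliberately leaves the real name open -- and this is only the threshold check, not a merge policy:]

```java
// Sketch of the proposed check: a segment whose deleted-document
// percentage exceeds a configurable threshold becomes a merge/rewrite
// candidate regardless of its size. "maxAllowedPctDeleted" is a
// hypothetical name; the issue has not settled on one.
public class DeletePctEligibility {

    static boolean eligibleForMerge(long liveDocs, long deletedDocs,
                                    double maxAllowedPctDeleted) {
        long total = liveDocs + deletedDocs;
        if (total == 0) {
            return false; // empty segment: nothing to reclaim
        }
        double pctDeleted = 100.0 * deletedDocs / total;
        return pctDeleted > maxAllowedPctDeleted;
    }

    public static void main(String[] args) {
        // With a 20% threshold, a segment that is 25% deletes qualifies
        // even if it is far above the 5G max segment size...
        System.out.println(eligibleForMerge(75, 25, 20.0)); // true
        // ...while one that is only 5% deletes does not.
        System.out.println(eligibleForMerge(95, 5, 20.0));  // false
        // The proposed default of 100 never triggers, preserving
        // current behavior.
        System.out.println(eligibleForMerge(40, 60, 100.0)); // false
    }
}
```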
[JENKINS] Lucene-Solr-NightlyTests-7.x - Build # 56 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.x/56/ 9 tests failed. FAILED: org.apache.solr.cloud.ChaosMonkeyNothingIsSafeWithPullReplicasTest.test Error Message: Could not load collection from ZK: collection1 Stack Trace: org.apache.solr.common.SolrException: Could not load collection from ZK: collection1 at __randomizedtesting.SeedInfo.seed([93F883ACB19AA413:1BACBC761F66C9EB]:0) at org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader.java:1115) at org.apache.solr.common.cloud.ZkStateReader$LazyCollectionRef.get(ZkStateReader.java:648) at org.apache.solr.common.cloud.ClusterState.getCollectionOrNull(ClusterState.java:128) at org.apache.solr.common.cloud.ClusterState.getCollection(ClusterState.java:108) at org.apache.solr.cloud.ChaosMonkey.logCollectionStateSummary(ChaosMonkey.java:709) at org.apache.solr.cloud.ChaosMonkey.wait(ChaosMonkey.java:703) at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeWithPullReplicasTest.test(ChaosMonkeyNothingIsSafeWithPullReplicasTest.java:220) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at
[jira] [Resolved] (LUCENE-7911) checkJavadocLinks.py doesn't allow links to new Ref Guide in javadocs
[ https://issues.apache.org/jira/browse/LUCENE-7911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved LUCENE-7911. Resolution: Fixed Assignee: Steve Rowe Fix Version/s: master (8.0) 7.1 > checkJavadocLinks.py doesn't allow links to new Ref Guide in javadocs > - > > Key: LUCENE-7911 > URL: https://issues.apache.org/jira/browse/LUCENE-7911 > Project: Lucene - Core > Issue Type: Bug >Reporter: Cassandra Targett >Assignee: Steve Rowe > Fix For: 7.1, master (8.0) > > Attachments: LUCENE-7911.patch > > > In SOLR-11135 I'm fixing a number of URLs in source that point to the old > Solr Reference Guide location > (https://cwiki.apache.org/confluence/display/solr/...). The new base URL for > the Ref Guide is {{https://lucene.apache.org/solr/guide...}} which is the > same as the javadocs. > Several of these references are in Java classes, but changing those to the > new URLs causes precommit to fail because {{checkJavadocLinks.py}} doesn't > allow links in javadocs to contain URLs starting with {{lucene.apache.org}} > unless they are explicitly allowed. > Fixing this may not be as simple as just allowing any URL starting with > {{https://lucene.apache.org/solr/guide...}}. For javadocs we want to only use > non-versioned urls, but someone could accidentally insert a versioned URL > (say, for 7.0) that would be invalid in later releases. > Since javadocs & ref guide are on the same server, perhaps some sort of > relative link is preferable, but I honestly don't know enough about URL > construction in javadocs to know what sorts of options are available. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7911) checkJavadocLinks.py doesn't allow links to new Ref Guide in javadocs
[ https://issues.apache.org/jira/browse/LUCENE-7911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191532#comment-16191532 ]

ASF subversion and git services commented on LUCENE-7911:
----------------------------------------------------------

Commit 4392500a3b19b2cf7111f2914daf6f23fce985d5 in lucene-solr's branch refs/heads/master from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4392500 ]

LUCENE-7911: allow javadoc links containing 'lucene.apache.org/solr/guide/'
[jira] [Commented] (LUCENE-7911) checkJavadocLinks.py doesn't allow links to new Ref Guide in javadocs
[ https://issues.apache.org/jira/browse/LUCENE-7911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191527#comment-16191527 ]

ASF subversion and git services commented on LUCENE-7911:
----------------------------------------------------------

Commit f9b30c12dd4544953529428f13b7a02a4df8bfac in lucene-solr's branch refs/heads/branch_7x from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f9b30c1 ]

LUCENE-7911: allow javadoc links containing 'lucene.apache.org/solr/guide/'
[jira] [Updated] (LUCENE-7911) checkJavadocLinks.py doesn't allow links to new Ref Guide in javadocs
[ https://issues.apache.org/jira/browse/LUCENE-7911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe updated LUCENE-7911:
-------------------------------
    Attachment: LUCENE-7911.patch

Attaching a patch that makes {{checkJavadocLinks.py}} allow any link containing {{lucene.apache.org/solr/guide/}}. {{ant precommit}} passes for me with this change. Committing shortly.
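The intent of the patch can be sketched as a small predicate. This is not the actual {{checkJavadocLinks.py}} source, just an illustrative approximation of the rule it adds: any link containing the non-versioned Ref Guide path passes, while other absolute {{lucene.apache.org}} links are still rejected.

```python
def is_allowed_link(url: str) -> bool:
    """Sketch of the allow-rule described in the patch comment above."""
    # Links into the new Ref Guide location are explicitly allowed.
    if 'lucene.apache.org/solr/guide/' in url:
        return True
    # Other absolute links to the project site remain disallowed, since
    # they tend to be versioned and break in later releases.
    if 'lucene.apache.org' in url:
        return False
    # External and relative links are handled elsewhere; pass them here.
    return True

assert is_allowed_link('https://lucene.apache.org/solr/guide/solr-tutorial.html')
assert not is_allowed_link('https://lucene.apache.org/core/7_0_0/index.html')
assert is_allowed_link('https://cwiki.apache.org/confluence/display/solr/')
```

Note that, as the issue description warns, a substring check like this would still accept an accidentally versioned Ref Guide URL; a stricter pattern would be needed to catch that case.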
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191510#comment-16191510 ]

Michael Braun commented on LUCENE-7976:
---------------------------------------

[~mikemccand] I thought this issue was about the case where you have segments that are effectively unmergeable and that stick around at < 50% deletes? We have seen this in our production systems, where these segments, which are at the segment size limit, stick around and not only waste disk resources but also throw off term frequencies, because the policy does not merge at the lower delete level. Would love a way to specify that segments which would normally be unmergeable should still be considered for operations in the event the number of deletes passes a (lower) threshold.

> Add a parameter to TieredMergePolicy to merge segments that have more than X
> percent deleted documents
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-7976
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7976
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Erick Erickson
>
> We're seeing situations "in the wild" where there are very large indexes (on
> disk) handled quite easily in a single Lucene index. This is particularly
> true as features like docValues move data into MMapDirectory space. The
> current TMP algorithm allows on the order of 50% deleted documents as per a
> dev list conversation with Mike McCandless (and his blog here:
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate (think many
> TB), solutions like "you need to distribute your collection over more shards"
> become very costly. Additionally, the tempting "optimize" button exacerbates
> the issue since once you form, say, a 100G segment (by
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like
> (no, that's not a serious name, suggestions welcome) which would default to
> 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with
> >> smaller segments to bring the resulting segment up to 5G. If no smaller
> >> segments exist, it would just be rewritten.
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize).
> >> It would be rewritten into a single segment removing all deleted docs, no
> >> matter how big it is to start. The 100G example above would be rewritten
> >> to an 80G segment, for instance.
> Of course this would lead to potentially much more I/O, which is why the
> default would be the same behavior we see now. As it stands now, though,
> there's no way to recover from an optimize/forceMerge except to re-index from
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the
> wild" with 10s of shards replicated 3 or more times. And that doesn't even
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A
> new merge policy is certainly an alternative.
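The selection rule proposed above can be sketched in a few lines. This is illustrative only: the real policy is Java inside Lucene's TieredMergePolicy, and the parameter name `max_allowed_pct_deleted` is invented here since the issue leaves the name open.

```python
from dataclasses import dataclass

MAX_SEGMENT_GB = 5.0  # current TMP default max merged segment size

@dataclass
class Segment:
    size_gb: float      # total on-disk size, including deleted docs
    deleted_pct: float  # percentage of docs that are deleted (0..100)

    @property
    def live_gb(self) -> float:
        # Size of the "live" (non-deleted) portion of the segment.
        return self.size_gb * (1 - self.deleted_pct / 100)

def segments_to_rewrite(segments, max_allowed_pct_deleted=100.0):
    """Select segments whose deleted percentage exceeds the threshold,
    no matter how large they are (the 'NO MATTER HOW LARGE' rule)."""
    return [s for s in segments if s.deleted_pct > max_allowed_pct_deleted]

segs = [Segment(100.0, 20.0), Segment(5.0, 10.0), Segment(4.0, 40.0)]
# With the default of 100, nothing qualifies: today's behavior is preserved.
assert segments_to_rewrite(segs) == []
# At a 20% threshold the 40%-deleted segment qualifies; the 100G segment at
# exactly 20% does not, since the proposal says strictly "> 20%".
assert segments_to_rewrite(segs, max_allowed_pct_deleted=20.0) == [Segment(4.0, 40.0)]
```

A qualifying oversized segment with more than 5G of live docs would then be rewritten in place (the 100G/20%-deleted example shrinks to roughly `Segment(100.0, 20.0).live_gb` = 80G), matching the second case in the proposal.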
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191494#comment-16191494 ]

Michael McCandless commented on LUCENE-7976:
--------------------------------------------

bq. If a collection has many 5GB segments, it's possible for many of them to be at less than 50% but still accumulate a fair amount of deletes. Increasing the max segment helps, but increases the amount of churn on disk through large merges.

Right, but that's a normal/acceptable index state, where up to 50% of your docs are deleted. What this bug is about is cases where it's way over 50% of your docs that are deleted, and as far as I know, the only way to get yourself into that state is by doing a {{forceMerge}} and then continuing to update/delete documents.
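The arithmetic behind the "way over 50%" state is worth making explicit. The numbers below come from the issue description (5G default max segment size, the 100G forceMerged example); the halving rule is a simplified reading of how TMP decides a segment is too big to merge naturally.

```python
MAX_SEGMENT_GB = 5.0    # default TMP max merged segment size
segment_gb = 100.0      # a segment produced by forceMerge/optimize

# Per the issue description, the 100G segment is not eligible for natural
# merging until 97.5G of its docs are deleted, i.e. until its live data
# would fit in half the max segment size (the ~50%-deletes budget).
deletes_needed_gb = segment_gb - MAX_SEGMENT_GB / 2
assert deletes_needed_gb == 97.5

# By contrast, a segment at the 5G cap can never exceed ~50% deletes before
# becoming merge-eligible, which is the "normal/acceptable" state above.
normal_worst_case_pct = (MAX_SEGMENT_GB / 2) / MAX_SEGMENT_GB * 100
assert normal_worst_case_pct == 50.0
```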
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191488#comment-16191488 ]

Michael McCandless commented on LUCENE-7976:
--------------------------------------------

bq. How about having forceMerge() obey max segment size. If you really want to merge down to one segment, you have to change the policy to increase the max size.

+1, that makes a lot of sense. Basically TMP is buggy today because it allows {{forceMerge}} to create too-big segments.
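The suggestion that {{forceMerge}} obey the max segment size amounts to choosing the smallest output segment count that keeps every segment under the cap, rather than always producing one segment. A minimal sketch of that sizing math (illustrative; the real change would live in TieredMergePolicy's Java code):

```python
import math

def force_merge_segment_count(total_live_gb: float, max_segment_gb: float = 5.0) -> int:
    """Smallest number of output segments keeping each under the size cap."""
    return max(1, math.ceil(total_live_gb / max_segment_gb))

# An index with 80G of live docs would forceMerge to sixteen 5G segments
# instead of one unmergeable 80G segment.
assert force_merge_segment_count(80.0) == 16
# A small index still merges down to a single segment.
assert force_merge_segment_count(3.0) == 1
# Raising the cap in the policy restores the old single-segment behavior.
assert force_merge_segment_count(80.0, max_segment_gb=100.0) == 1
```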
[JENKINS] Lucene-Solr-7.0-Linux (64bit/jdk1.8.0_144) - Build # 419 - Still Unstable!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.0-Linux/419/
Java: 64bit/jdk1.8.0_144 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

2 tests failed.

FAILED: org.apache.solr.cloud.TestTlogReplica.testOutOfOrderDBQWithInPlaceUpdates

Error Message:
Can not find doc 1 in https://127.0.0.1:44783/solr

Stack Trace:
java.lang.AssertionError: Can not find doc 1 in https://127.0.0.1:44783/solr
	at __randomizedtesting.SeedInfo.seed([9B0F3883806AC77:8F710B6567577A97]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.junit.Assert.assertNotNull(Assert.java:526)
	at org.apache.solr.cloud.TestTlogReplica.checkRTG(TestTlogReplica.java:868)
	at org.apache.solr.cloud.TestTlogReplica.testOutOfOrderDBQWithInPlaceUpdates(TestTlogReplica.java:671)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.lang.Thread.run(Thread.java:748)

FAILED:
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191470#comment-16191470 ]

Timothy M. Rodriguez commented on LUCENE-7976:
----------------------------------------------

If a collection has many 5GB segments, it's possible for many of them to be at less than 50% but still accumulate a fair amount of deletes. Increasing the max segment size helps, but increases the amount of churn on disk through large merges.
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191454#comment-16191454 ]

Mike Sokolov commented on LUCENE-7976:
--------------------------------------

How about having forceMerge() obey max segment size. If you *really* want to merge down to one segment, you have to change the policy to increase the max size.
[jira] [Commented] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191445#comment-16191445 ]

Steve Rowe commented on SOLR-10842:
-----------------------------------

bq. The only solution I know of right now is to remove {{solr/site/quickstart.mdtext}} and remove any link to the tutorial from {{solr/site/index.xsl}}. If anyone has a better solution, I humbly request your assistance.

+1 to remove {{solr/site/quickstart.mdtext}} and clean up references to it. I think we can point to the new location in {{solr/site/index.xsl}}. I'll work on it.

> Move quickstart.html to Ref Guide
> ---------------------------------
>
>                 Key: SOLR-10842
>                 URL: https://issues.apache.org/jira/browse/SOLR-10842
>             Project: Solr
>          Issue Type: Improvement
>   Security Level: Public (Default Security Level. Issues are Public)
>       Components: documentation
>            Reporter: Cassandra Targett
>            Assignee: Cassandra Targett
>         Priority: Minor
>             Fix For: 7.0
>
>      Attachments: SOLR-10842.patch
>
>
> The Solr Quick Start at https://lucene.apache.org/solr/quickstart.html has
> been problematic to keep up to date - until Ishan just updated it yesterday
> for 6.6, it said "6.2.1", so it hadn't been updated for several releases.
> Now that the Ref Guide is in AsciiDoc format, we can easily use variables for
> package versions, and it could be released as part of the Ref Guide and kept
> up to date. It could also integrate links to more information on topics, and
> users would already be IN the docs, so they would not need to wonder where
> the docs are.
> There are a few places on the site that will need to be updated to point to
> the new location, but I can also put a redirect rule into .htaccess so people
> are redirected to the new location if there are other links "in the wild"
> that we cannot control. This allows it to be versioned also, if that becomes
> necessary.
> As part of this, I would like to also update the entire "Getting Started"
> section of the Ref Guide, which is effectively identical to what was in the
> first release of the Ref Guide back in 2009 for Solr 1.4 and is in serious
> need of reconsideration.
> My thought is that moving the page + redoing the Getting Started section
> would be for 7.0, but if folks are excited about this idea I could move the
> page for 6.6 and hold off redoing the larger section until 7.0.
[JENKINS] Lucene-Solr-master-Windows (64bit/jdk-9) - Build # 6940 - Failure!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/6940/
Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseParallelGC --illegal-access=deny

1 tests failed.

FAILED: org.apache.solr.client.solrj.io.stream.StreamExpressionTest.testParallelExecutorStream

Error Message:
Error from server at http://127.0.0.1:54473/solr/mainCorpus_shard2_replica_n3: Expected mime type application/octet-stream but got text/html.
Error 404 - HTTP ERROR: 404
Problem accessing /solr/mainCorpus_shard2_replica_n3/update. Reason: Can not find: /solr/mainCorpus_shard2_replica_n3/update
Powered by Jetty:// 9.3.20.v20170531

Stack Trace:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://127.0.0.1:54473/solr/mainCorpus_shard2_replica_n3: Expected mime type application/octet-stream but got text/html.
	at __randomizedtesting.SeedInfo.seed([DC5296EB745A9843:6145E3F24D76A51E]:0)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:539)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:993)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:862)
	at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:793)
	at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:178)
	at org.apache.solr.client.solrj.request.UpdateRequest.commit(UpdateRequest.java:233)
	at org.apache.solr.client.solrj.io.stream.StreamExpressionTest.testParallelExecutorStream(StreamExpressionTest.java:7541)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191428#comment-16191428 ] Robert Muir commented on LUCENE-7976: - {quote} It's too bad people call forceMerge and get themselves into this situation to begin with Maybe we should remove that method! Or maybe the index should be put into a read-only state after you call it? {quote} {quote} I know the issue, so the first thing I tell solr customers is: "never ever call optimize unless your index is static." {quote} The read-only idea is really cool, maybe consider deprecating forceMerge() and adding freeze()? I think this removes the trap completely and still allows for use-cases where people just want less segments for the read-only case. > Add a parameter to TieredMergePolicy to merge segments that have more than X > percent deleted documents > -- > > Key: LUCENE-7976 > URL: https://issues.apache.org/jira/browse/LUCENE-7976 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Erick Erickson > > We're seeing situations "in the wild" where there are very large indexes (on > disk) handled quite easily in a single Lucene index. This is particularly > true as features like docValues move data into MMapDirectory space. The > current TMP algorithm allows on the order of 50% deleted documents as per a > dev list conversation with Mike McCandless (and his blog here: > https://www.elastic.co/blog/lucenes-handling-of-deleted-documents). > Especially in the current era of very large indexes in aggregate, (think many > TB) solutions like "you need to distribute your collection over more shards" > become very costly. Additionally, the tempting "optimize" button exacerbates > the issue since once you form, say, a 100G segment (by > optimizing/forceMerging) it is not eligible for merging until 97.5G of the > docs in it are deleted (current default 5G max segment size). 
> The proposal here would be to add a new parameter to TMP, something like > (no, that's not serious name, suggestions > welcome) which would default to 100 (or the same behavior we have now). > So if I set this parameter to, say, 20%, and the max segment size stays at > 5G, the following would happen when segments were selected for merging: > > any segment with > 20% deleted documents would be merged or rewritten NO > > MATTER HOW LARGE. There are two cases, > >> the segment has < 5G "live" docs. In that case it would be merged with > >> smaller segments to bring the resulting segment up to 5G. If no smaller > >> segments exist, it would just be rewritten > >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). > >> It would be rewritten into a single segment removing all deleted docs no > >> matter how big it is to start. The 100G example above would be rewritten > >> to an 80G segment for instance. > Of course this would lead to potentially much more I/O which is why the > default would be the same behavior we see now. As it stands now, though, > there's no way to recover from an optimize/forceMerge except to re-index from > scratch. We routinely see 200G-300G Lucene indexes at this point "in the > wild" with 10s of shards replicated 3 or more times. And that doesn't even > include having these over HDFS. > Alternatives welcome! Something like the above seems minimally invasive. A > new merge policy is certainly an alternative. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
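[Editor's note: the selection rule proposed in LUCENE-7976 above can be sketched as follows. This is a hypothetical illustration, not Lucene code; the method names, the percentage threshold parameter, and the 5 GB constant mirror the description in the issue but are invented for this sketch.]

```java
// Hypothetical sketch of the proposed TieredMergePolicy rule: any segment whose
// deleted-document percentage exceeds the configured threshold becomes eligible
// for merging or rewriting regardless of its size; otherwise today's max-segment
// size rule applies. Names are illustrative only.
public class DeletePctMergeSketch {

    /** Current default max merged segment size, per the issue description. */
    static final double MAX_SEGMENT_GB = 5.0;

    /**
     * Returns true if a segment should be merged or rewritten under the
     * proposed rule. A threshold of 100 reproduces current behavior.
     */
    static boolean eligible(double segmentGB, double pctDeleted, double maxAllowedPctDeleted) {
        if (pctDeleted > maxAllowedPctDeleted) {
            return true; // merged or rewritten NO MATTER HOW LARGE
        }
        // Otherwise only segments below the max size are normally considered.
        return segmentGB < MAX_SEGMENT_GB;
    }

    public static void main(String[] args) {
        // A 100G force-merged segment with ~20% deletes: eligible once the
        // threshold is set to 20, untouched at the default of 100.
        System.out.println(eligible(100.0, 20.1, 20.0));  // proposed: rewrite it
        System.out.println(eligible(100.0, 20.1, 100.0)); // today: left alone
    }
}
```

With the default of 100 the large segment is skipped, matching the "97.5G of the docs deleted" trap described above.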
[jira] [Commented] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191390#comment-16191390 ] Cassandra Targett commented on SOLR-10842: -- bq. anyone object to the idea that I just remove the {{solr/site/quickstart.mdtext}} entirely since it's replaced by {{solr/solr-ref-guide/solr-tutorial.adoc}}. That's not going to work since precommit checks that all the links on a page like https://lucene.apache.org/solr/7_0_0/ are "valid" links, but it's rules of "valid" are no longer true as described in LUCENE-7911. I tried to change the reference in {{solr/site/index.xsl}} to point to where the tutorial will be in production, but that fails the link checker since according to it, that file doesn't exist (it's not wrong in its own context). So, we can't have a hard-link using {{lucene.apache.org}} and we can't use a relative link that we know works. The only solution I know of right now is to remove {{solr/site/quickstart.mdtext}} and remove any link to the tutorial from {{solr/site/index.xsl}}. If anyone has a better solution, I humbly request your assistance. > Move quickstart.html to Ref Guide > - > > Key: SOLR-10842 > URL: https://issues.apache.org/jira/browse/SOLR-10842 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Minor > Fix For: 7.0 > > Attachments: SOLR-10842.patch > > > The Solr Quick Start at https://lucene.apache.org/solr/quickstart.html has > been problematic to keep up to date - until Ishan just updated it yesterday > for 6.6, it said "6.2.1", so hadn't been updated for several releases. > Now that the Ref Guide is in AsciiDoc format, we can easily use variables for > package versions, and it could be released as part of the Ref Guide and kept > up to date. 
It could also integrate links to more information on topics, and > users would already be IN the docs, so they would not need to wonder where > the docs are. > There are a few places on the site that will need to be updated to point to > the new location, but I can also put a redirect rule into .htaccess so people > are redirected to the new location if there are other links "in the wild" > that we cannot control. This allows it to be versioned also, if that becomes > necessary. > As part of this, I would like to also update the entire "Getting Started" > section of the Ref Guide, which is effectively identical to what was in the > first release of the Ref Guide back in 2009 for Solr 1.4 and is in serious > need of reconsideration. > My thought is that moving the page + redoing the Getting Started section > would be for 7.0, but if folks are excited about this idea I could move the > page for 6.6 and hold off redoing the larger section until 7.0. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7972) DirectoryTaxonomyReader should implement Accountable
[ https://issues.apache.org/jira/browse/LUCENE-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191383#comment-16191383 ] ASF subversion and git services commented on LUCENE-7972: - Commit db95888effb14b5600106e91d21d3adb090fbd96 in lucene-solr's branch refs/heads/branch_7x from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=db95888 ] LUCENE-7972: DirectoryTaxonomyReader now implements Accountable > DirectoryTaxonomyReader should implement Accountable > > > Key: LUCENE-7972 > URL: https://issues.apache.org/jira/browse/LUCENE-7972 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 7.1, master (8.0) > > Attachments: LUCENE-7972.patch > > > This class is a concrete instance of {{TaxonomyReader}} that uses a Lucene > index to map facet labels to ordinals. > It uses a fair amount of heap, e.g. to hold parent/sibling/child int arrays, > to cache recent lookups, and in the underlying {{IndexReader}}. I think we > should have it implement {{Accountable}} so people can track its heap usage. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-7972) DirectoryTaxonomyReader should implement Accountable
[ https://issues.apache.org/jira/browse/LUCENE-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-7972. Resolution: Fixed > DirectoryTaxonomyReader should implement Accountable > > > Key: LUCENE-7972 > URL: https://issues.apache.org/jira/browse/LUCENE-7972 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 7.1, master (8.0) > > Attachments: LUCENE-7972.patch > > > This class is a concrete instance of {{TaxonomyReader}} that uses a Lucene > index to map facet labels to ordinals. > It uses a fair amount of heap, e.g. to hold parent/sibling/child int arrays, > to cache recent lookups, and in the underlying {{IndexReader}}. I think we > should have it implement {{Accountable}} so people can track its heap usage. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7972) DirectoryTaxonomyReader should implement Accountable
[ https://issues.apache.org/jira/browse/LUCENE-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191381#comment-16191381 ] ASF subversion and git services commented on LUCENE-7972: - Commit b9a51a16869cf516853fabb6b4904aa6e2332586 in lucene-solr's branch refs/heads/master from Mike McCandless [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b9a51a1 ] LUCENE-7972: DirectoryTaxonomyReader now implements Accountable > DirectoryTaxonomyReader should implement Accountable > > > Key: LUCENE-7972 > URL: https://issues.apache.org/jira/browse/LUCENE-7972 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 7.1, master (8.0) > > Attachments: LUCENE-7972.patch > > > This class is a concrete instance of {{TaxonomyReader}} that uses a Lucene > index to map facet labels to ordinals. > It uses a fair amount of heap, e.g. to hold parent/sibling/child int arrays, > to cache recent lookups, and in the underlying {{IndexReader}}. I think we > should have it implement {{Accountable}} so people can track its heap usage. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-10904) Unnecessary waiting during failover in case of failed core creation
[ https://issues.apache.org/jira/browse/SOLR-10904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191347#comment-16191347 ] Mihaly Toth commented on SOLR-10904: [~markrmil...@gmail.com], I will try to put up a patch for this tonight. > Unnecessary waiting during failover in case of failed core creation > --- > > Key: SOLR-10904 > URL: https://issues.apache.org/jira/browse/SOLR-10904 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 7.0 >Reporter: Mihaly Toth >Assignee: Mark Miller > > Background failover thread checks for bad replicas. In case one is found it > tries to create it on another node. Then it waits for the new replica to show > up in the cluster state. It waits even if the core creation (initiated by > itself) fails. > This situation does not occur on the happy path of the failover cases because > the new node was marked as alive. But in case the cluster is in an instable > state, or user is restarting the new node, or overseer is overloaded this > extra wait will result in holding up this failover thread. > Proposed solution may be > # wait for the result of the core creation > # only if previous step is successful proceed to wait for cluster state change > In code: > {code} > try { > Future future = updateExecutor.submit(() -> > createSolrCore(collection, createUrl, dataDir, ulogDir, coreNodeName, > coreName, shardId)); > future.get(3L, TimeUnit.MILLISECONDS); > } catch (InterruptedException | ExecutionException | TimeoutException e) { > log.error("Error creating core", e); > return false; > } finally { > MDC.remove("OverseerAutoReplicaFailoverThread.createUrl"); > } > {code} > In such case we could consider moving core creation into the failover thread > from the updateExecutor. > I can post a patch with these changes if the solution seems appropriate. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
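[Editor's note: the two-step fix proposed in SOLR-10904 above (wait for the core-creation result first, and only then wait for the cluster state) can be sketched generically as below. This is not Solr code; the helper names are invented, and the 30-second timeout is an assumption standing in for the snippet's `3L, TimeUnit.MILLISECONDS`, which is presumably meant to be a much larger value.]

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: block on the core-creation Future first; skip the cluster-state wait
// entirely if creation fails, so the failover thread is never held up waiting
// for a replica that was never created.
public class FailoverSketch {

    static boolean createThenWait(ExecutorService updateExecutor,
                                  Callable<Boolean> createCore,
                                  Runnable waitForClusterState) {
        try {
            Future<Boolean> future = updateExecutor.submit(createCore);
            if (!future.get(30, TimeUnit.SECONDS)) {
                return false; // creation reported failure: no point waiting
            }
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            return false; // creation failed or timed out: bail out early
        }
        waitForClusterState.run(); // only reached after a successful creation
        return true;
    }

    public static void main(String[] args) {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        System.out.println(createThenWait(exec, () -> true, () -> {}));
        System.out.println(createThenWait(
                exec, () -> { throw new RuntimeException("create failed"); }, () -> {}));
        exec.shutdown();
    }
}
```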
[jira] [Commented] (SOLR-10842) Move quickstart.html to Ref Guide
[ https://issues.apache.org/jira/browse/SOLR-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191341#comment-16191341 ] Cassandra Targett commented on SOLR-10842: -- I just discovered that the commit I made to modify the {{solr/site/quickstart.mdtext}} file is causing precommit to fail, which is due to LUCENE-7911 (bad external link in "javadocs", even though it's a valid link to the Ref Guide). I don't know what to do about that issue, and don't have a ton of time to deal with it right now - anyone object to the idea that I just remove the {{solr/site/quickstart.mdtext}} entirely since it's replaced by {{solr/solr-ref-guide/solr-tutorial.adoc}}? > Move quickstart.html to Ref Guide > - > > Key: SOLR-10842 > URL: https://issues.apache.org/jira/browse/SOLR-10842 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Minor > Fix For: 7.0 > > Attachments: SOLR-10842.patch > > > The Solr Quick Start at https://lucene.apache.org/solr/quickstart.html has > been problematic to keep up to date - until Ishan just updated it yesterday > for 6.6, it said "6.2.1", so hadn't been updated for several releases. > Now that the Ref Guide is in AsciiDoc format, we can easily use variables for > package versions, and it could be released as part of the Ref Guide and kept > up to date. It could also integrate links to more information on topics, and > users would already be IN the docs, so they would not need to wonder where > the docs are. > There are a few places on the site that will need to be updated to point to > the new location, but I can also put a redirect rule into .htaccess so people > are redirected to the new location if there are other links "in the wild" > that we cannot control. This allows it to be versioned also, if that becomes > necessary. 
> As part of this, I would like to also update the entire "Getting Started" > section of the Ref Guide, which is effectively identical to what was in the > first release of the Ref Guide back in 2009 for Solr 1.4 and is in serious > need of reconsideration. > My thought is that moving the page + redoing the Getting Started section > would be for 7.0, but if folks are excited about this idea I could move the > page for 6.6 and hold off redoing the larger section until 7.0. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-9) - Build # 20606 - Still Failing!
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20606/ Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseParallelGC --illegal-access=deny 4 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.core.TestLazyCores Error Message: 1 thread leaked from SUITE scope at org.apache.solr.core.TestLazyCores: 1) Thread[id=14903, name=searcherExecutor-6469-thread-1, state=WAITING, group=TGRP-TestLazyCores] at java.base@9/jdk.internal.misc.Unsafe.park(Native Method) at java.base@9/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at java.base@9/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2062) at java.base@9/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435) at java.base@9/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1092) at java.base@9/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152) at java.base@9/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) at java.base@9/java.lang.Thread.run(Thread.java:844) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.core.TestLazyCores: 1) Thread[id=14903, name=searcherExecutor-6469-thread-1, state=WAITING, group=TGRP-TestLazyCores] at java.base@9/jdk.internal.misc.Unsafe.park(Native Method) at java.base@9/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at java.base@9/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2062) at java.base@9/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435) at java.base@9/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1092) at java.base@9/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152) at java.base@9/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) 
at java.base@9/java.lang.Thread.run(Thread.java:844) at __randomizedtesting.SeedInfo.seed([859FC622C520A965]:0) FAILED: junit.framework.TestSuite.org.apache.solr.core.TestLazyCores Error Message: There are still zombie threads that couldn't be terminated:1) Thread[id=14903, name=searcherExecutor-6469-thread-1, state=WAITING, group=TGRP-TestLazyCores] at java.base@9/jdk.internal.misc.Unsafe.park(Native Method) at java.base@9/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at java.base@9/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2062) at java.base@9/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435) at java.base@9/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1092) at java.base@9/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152) at java.base@9/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) at java.base@9/java.lang.Thread.run(Thread.java:844) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie threads that couldn't be terminated: 1) Thread[id=14903, name=searcherExecutor-6469-thread-1, state=WAITING, group=TGRP-TestLazyCores] at java.base@9/jdk.internal.misc.Unsafe.park(Native Method) at java.base@9/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) at java.base@9/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2062) at java.base@9/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435) at java.base@9/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1092) at java.base@9/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152) at java.base@9/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) at java.base@9/java.lang.Thread.run(Thread.java:844) at 
__randomizedtesting.SeedInfo.seed([859FC622C520A965]:0) FAILED: org.apache.solr.core.TestLazyCores.testNoCommit Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([859FC622C520A965:5AFF67F30E07CAC0]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:884) at org.apache.solr.core.TestLazyCores.check10(TestLazyCores.java:847) at org.apache.solr.core.TestLazyCores.testNoCommit(TestLazyCores.java:829) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
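[Editor's note: the TestLazyCores failures above report a `searcherExecutor` thread left WAITING in `LinkedBlockingQueue.take()`. A common cause of that exact signature is an `ExecutorService` that is never shut down. The generic illustration below is not Solr code.]

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Generic illustration: an idle pool worker parks in LinkedBlockingQueue.take()
// (the state shown in the leak report) until the pool is shut down explicitly.
// A thread-leak checker such as randomizedtesting's flags any such survivor.
public class ExecutorLeakSketch {

    public static void main(String[] args) throws InterruptedException {
        ExecutorService searcherExecutor = Executors.newSingleThreadExecutor();
        searcherExecutor.submit(() -> {});

        // Omitting these two lines leaves the worker thread WAITING forever.
        searcherExecutor.shutdown();
        boolean terminated = searcherExecutor.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(terminated);
    }
}
```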
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191339#comment-16191339 ] Michael McCandless commented on LUCENE-7976: bq. It can happen for large collections or with many updates to existing documents. Hmm can you explain how? TMP should produce max sized segments of ~5 GB, and allow at most 50% deleted documents in them, at which point they are eligible for merging. Doing a {{forceMerge}} yet then continuing to add documents to your index can result in a large (> 5 GB) segment with more than 50% deletions not being merged away. But I don't see how this can happen if you didn't do a {{forceMerge}} in the past? > Add a parameter to TieredMergePolicy to merge segments that have more than X > percent deleted documents > -- > > Key: LUCENE-7976 > URL: https://issues.apache.org/jira/browse/LUCENE-7976 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Erick Erickson > > We're seeing situations "in the wild" where there are very large indexes (on > disk) handled quite easily in a single Lucene index. This is particularly > true as features like docValues move data into MMapDirectory space. The > current TMP algorithm allows on the order of 50% deleted documents as per a > dev list conversation with Mike McCandless (and his blog here: > https://www.elastic.co/blog/lucenes-handling-of-deleted-documents). > Especially in the current era of very large indexes in aggregate, (think many > TB) solutions like "you need to distribute your collection over more shards" > become very costly. Additionally, the tempting "optimize" button exacerbates > the issue since once you form, say, a 100G segment (by > optimizing/forceMerging) it is not eligible for merging until 97.5G of the > docs in it are deleted (current default 5G max segment size). 
> The proposal here would be to add a new parameter to TMP, something like > (no, that's not serious name, suggestions > welcome) which would default to 100 (or the same behavior we have now). > So if I set this parameter to, say, 20%, and the max segment size stays at > 5G, the following would happen when segments were selected for merging: > > any segment with > 20% deleted documents would be merged or rewritten NO > > MATTER HOW LARGE. There are two cases, > >> the segment has < 5G "live" docs. In that case it would be merged with > >> smaller segments to bring the resulting segment up to 5G. If no smaller > >> segments exist, it would just be rewritten > >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). > >> It would be rewritten into a single segment removing all deleted docs no > >> matter how big it is to start. The 100G example above would be rewritten > >> to an 80G segment for instance. > Of course this would lead to potentially much more I/O which is why the > default would be the same behavior we see now. As it stands now, though, > there's no way to recover from an optimize/forceMerge except to re-index from > scratch. We routinely see 200G-300G Lucene indexes at this point "in the > wild" with 10s of shards replicated 3 or more times. And that doesn't even > include having these over HDFS. > Alternatives welcome! Something like the above seems minimally invasive. A > new merge policy is certainly an alternative. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents
[ https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191319#comment-16191319 ] Timothy M. Rodriguez commented on LUCENE-7976: -- Agreed, it's not strictly a result of optimizations. It can happen for large collections or with many updates to existing documents. > Add a parameter to TieredMergePolicy to merge segments that have more than X > percent deleted documents > -- > > Key: LUCENE-7976 > URL: https://issues.apache.org/jira/browse/LUCENE-7976 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Erick Erickson > > We're seeing situations "in the wild" where there are very large indexes (on > disk) handled quite easily in a single Lucene index. This is particularly > true as features like docValues move data into MMapDirectory space. The > current TMP algorithm allows on the order of 50% deleted documents as per a > dev list conversation with Mike McCandless (and his blog here: > https://www.elastic.co/blog/lucenes-handling-of-deleted-documents). > Especially in the current era of very large indexes in aggregate, (think many > TB) solutions like "you need to distribute your collection over more shards" > become very costly. Additionally, the tempting "optimize" button exacerbates > the issue since once you form, say, a 100G segment (by > optimizing/forceMerging) it is not eligible for merging until 97.5G of the > docs in it are deleted (current default 5G max segment size). > The proposal here would be to add a new parameter to TMP, something like > (no, that's not serious name, suggestions > welcome) which would default to 100 (or the same behavior we have now). > So if I set this parameter to, say, 20%, and the max segment size stays at > 5G, the following would happen when segments were selected for merging: > > any segment with > 20% deleted documents would be merged or rewritten NO > > MATTER HOW LARGE. There are two cases, > >> the segment has < 5G "live" docs. 
In that case it would be merged with > >> smaller segments to bring the resulting segment up to 5G. If no smaller > >> segments exist, it would just be rewritten > >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). > >> It would be rewritten into a single segment removing all deleted docs no > >> matter how big it is to start. The 100G example above would be rewritten > >> to an 80G segment for instance. > Of course this would lead to potentially much more I/O which is why the > default would be the same behavior we see now. As it stands now, though, > there's no way to recover from an optimize/forceMerge except to re-index from > scratch. We routinely see 200G-300G Lucene indexes at this point "in the > wild" with 10s of shards replicated 3 or more times. And that doesn't even > include having these over HDFS. > Alternatives welcome! Something like the above seems minimally invasive. A > new merge policy is certainly an alternative. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org