[JENKINS] Lucene-Solr-master-MacOSX (64bit/jdk1.8.0) - Build # 3422 - Unstable!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/3422/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.common.cloud.TestCollectionStateWatchers.testCanWaitForNonexistantCollection

Error Message:
waitForState was not triggered by collection creation

Stack Trace:
java.lang.AssertionError: waitForState was not triggered by collection creation
        at __randomizedtesting.SeedInfo.seed([E4122F559A4CA2A6:4F32B4E3F712668D]:0)
        at org.junit.Assert.fail(Assert.java:93)
        at org.junit.Assert.assertTrue(Assert.java:43)
        at org.apache.solr.common.cloud.TestCollectionStateWatchers.testCanWaitForNonexistantCollection(TestCollectionStateWatchers.java:182)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
        at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
        at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
        at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 13013 lines...]
   [junit4] Suite: org.apache.solr.common.cloud.TestCollectionStateWatchers
   [junit4]   2> Creating dataDir: 
/Users/jenkins/wor

[jira] [Commented] (LUCENE-7381) Add new RangeField

2016-07-19 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385420#comment-15385420
 ] 

Adrien Grand commented on LUCENE-7381:
--

The bounds checks for getMin and getMax apply to 
{{type.pointDimensionCount()}}, but shouldn't it really be 
{{type.pointDimensionCount()/2}}?
I think I would find it a bit easier to read if each query type had a 
different RangeFieldComparator impl; the fact that all cases are handled in a 
single method makes it a bit hard to follow for me.
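For illustration only (this is not the actual RangeField code; the method and
class names are hypothetical): the distinction matters because a d-dimensional
range is encoded as a 2*d-dimensional point (d min values followed by d max
values), so {{pointDimensionCount()}} is twice the number of range dimensions.

```java
// Hypothetical sketch of the bounds check being discussed. A 2-d range is
// stored as a 4-d point, so valid getMin(dim)/getMax(dim) arguments are 0..1,
// not 0..3.
public class RangeBoundsCheck {
    static void checkDimension(int dim, int pointDimensionCount) {
        int rangeDims = pointDimensionCount / 2;   // d, not 2*d
        if (dim < 0 || dim >= rangeDims) {
            throw new IllegalArgumentException(
                "dimension must be in [0, " + rangeDims + "), got " + dim);
        }
    }

    public static void main(String[] args) {
        int pointDims = 4;                 // a 2-d range stored as a 4-d point
        checkDimension(1, pointDims);      // ok: valid range dimensions are 0 and 1
        try {
            checkDimension(2, pointDims);  // would wrongly pass if checked against 4
        } catch (IllegalArgumentException e) {
            System.out.println("rejected dimension 2");
        }
    }
}
```

Checking against the full point dimension count would silently accept
dimensions 2 and 3, which index into the max values rather than a real range
dimension.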

> Add new RangeField
> --
>
> Key: LUCENE-7381
> URL: https://issues.apache.org/jira/browse/LUCENE-7381
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
> Attachments: LUCENE-7381.patch, LUCENE-7381.patch, LUCENE-7381.patch, 
> LUCENE-7381.patch, LUCENE-7381.patch
>
>
> I've been tinkering with a new Point-based {{RangeField}} for indexing 
> numeric ranges that could be useful for a number of applications.
> For example, a single dimension represents a span along a single axis (such 
> as a calendar entry's start and end times), a 2d range could represent 
> bounding boxes for geometric applications (e.g., supporting Point-based geo 
> shapes), 3d ranges bounding cubes for 3d geometric applications (collision 
> detection, 3d geospatial), and 4d ranges space-time applications. I'm sure 
> there's applicability for 5d+ ranges, but a first incarnation should likely 
> limit the dimensionality for performance.
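As a toy model of the single-dimension case described above (not the Lucene
API; Lucene would encode the pair as a 2-dimensional point and compare encoded
bytes), a 1-d range is just a (min, max) pair plus the relations a range query
would evaluate:

```java
// Toy 1-d range: a span along one axis, e.g. a calendar entry's start/end.
public class Range1D {
    final long min, max;

    Range1D(long min, long max) {
        if (min > max) throw new IllegalArgumentException("min > max");
        this.min = min;
        this.max = max;
    }

    // True if the two spans share at least one point (overlapping entries).
    boolean intersects(Range1D o) { return min <= o.max && o.min <= max; }

    // True if this span fully contains the other.
    boolean contains(Range1D o) { return min <= o.min && o.max <= max; }
}
```

The 2d/3d/4d cases generalize this to arrays of mins and maxes, one pair per
axis.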



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7386) Flatten nested disjunctions

2016-07-19 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385415#comment-15385415
 ] 

David Smiley commented on LUCENE-7386:
--

Yes, thanks; I've yet to use that.

> Flatten nested disjunctions
> ---
>
> Key: LUCENE-7386
> URL: https://issues.apache.org/jira/browse/LUCENE-7386
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7386.patch, LUCENE-7386.patch
>
>
> Now that coords are gone it became easier to flatten nested disjunctions. It 
> might sound weird to write nested disjunctions in the first place, but 
> disjunctions can be created implicitly by other queries such as 
> more-like-this, LatLonPoint.newBoxQuery, non-scoring synonym queries, etc.
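The rewrite in the quoted description can be sketched with a toy model (this
is not Lucene's BooleanQuery code; the types below are invented for
illustration): an OR whose clauses are themselves ORs collapses into a single
flat OR.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of flattening nested SHOULD-only disjunctions.
public class FlattenOr {
    interface Query {}
    record Term(String t) implements Query {}
    record Or(List<Query> clauses) implements Query {}

    // Recursively inline nested Or nodes into one flat clause list.
    static Or flatten(Or or) {
        List<Query> flat = new ArrayList<>();
        for (Query q : or.clauses()) {
            if (q instanceof Or nested) {
                flat.addAll(flatten(nested).clauses());
            } else {
                flat.add(q);
            }
        }
        return new Or(flat);
    }

    // (a OR (b OR c)) flattens to (a OR b OR c): three clauses.
    static int demoFlattenedSize() {
        Or nested = new Or(List.of(new Term("a"),
                new Or(List.of(new Term("b"), new Term("c")))));
        return flatten(nested).clauses().size();
    }
}
```

With scoring coords gone, the flat form scores identically to the nested one,
which is what makes the rewrite safe.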






[jira] [Commented] (SOLR-9320) A REPLACENODE command to decommission an existing node with another new node

2016-07-19 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385408#comment-15385408
 ] 

Noble Paul commented on SOLR-9320:
--

bq.That will leave uneven naming conventions in the cluster.

Replica names do not matter. You can create a new replica with any name you 
want.

> A REPLACENODE command to decommission an existing node with another new node
> 
>
> Key: SOLR-9320
> URL: https://issues.apache.org/jira/browse/SOLR-9320
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
> Fix For: 6.1
>
>
> The command should accept a source node and a target node, recreate the 
> replicas from the source node on the target, and then do a DELETENODE of 
> the source node.






[jira] [Updated] (SOLR-9319) DELETEREPLICA should be able to accept just a count and remove replicas intelligently

2016-07-19 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-9319:
-
Summary: DELETEREPLICA should be able to accept just a count and remove 
replicas intelligently  (was: DELETEREPLICA should accept just a count and 
remove replicas intelligently)

> DELETEREPLICA should be able to accept just a count and remove replicas 
> intelligently
> ---
>
> Key: SOLR-9319
> URL: https://issues.apache.org/jira/browse/SOLR-9319
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
> Fix For: 6.1
>
>







[jira] [Updated] (LUCENE-7386) Flatten nested disjunctions

2016-07-19 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-7386:
-
Attachment: LUCENE-7386.patch

> Flatten nested disjunctions
> ---
>
> Key: LUCENE-7386
> URL: https://issues.apache.org/jira/browse/LUCENE-7386
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7386.patch, LUCENE-7386.patch
>
>
> Now that coords are gone it became easier to flatten nested disjunctions. It 
> might sound weird to write nested disjunctions in the first place, but 
> disjunctions can be created implicitly by other queries such as 
> more-like-this, LatLonPoint.newBoxQuery, non-scoring synonym queries, etc.






[jira] [Commented] (LUCENE-7386) Flatten nested disjunctions

2016-07-19 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385398#comment-15385398
 ] 

Adrien Grand commented on LUCENE-7386:
--

Sorry David, I don't get your question. The attached patch needs to be applied 
to the master branch, and the inlined patch for the task file needs to be 
applied to a master checkout of https://github.com/mikemccand/luceneutil. Is 
that what you were asking?

> Flatten nested disjunctions
> ---
>
> Key: LUCENE-7386
> URL: https://issues.apache.org/jira/browse/LUCENE-7386
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7386.patch
>
>
> Now that coords are gone it became easier to flatten nested disjunctions. It 
> might sound weird to write nested disjunctions in the first place, but 
> disjunctions can be created implicitly by other queries such as 
> more-like-this, LatLonPoint.newBoxQuery, non-scoring synonym queries, etc.






[jira] [Commented] (SOLR-9320) A REPLACENODE command to decommission an existing node with another new node

2016-07-19 Thread Nitin Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385323#comment-15385323
 ] 

Nitin Sharma commented on SOLR-9320:


Clarification on this: when you say recreate, does the naming matter? Let's 
say x_shard1_replica1 is on the source node. If we want to move it to the 
destination node, we can:
a) create a new replica (x_shard1_replica2) and delete the source. That will 
leave uneven naming conventions in the cluster (there will not be a replica1, 
but a replica2).
b) preserve the exact same name as the replica on the source node. We can 
achieve this by creating a temp replica on the destination first, deleting 
the replica on the source, recreating the replica (with the same name) on the 
destination, and then cleaning up the temp.

Option (b) can be thought of as a core migration.

Let me know which sounds more usable.
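Option (a) maps onto two Collections API calls. A minimal sketch that only
builds the request URLs (nothing is sent; the host, collection, shard, and
replica names are hypothetical placeholders):

```java
// Sketch of option (a): ADDREPLICA on the target node, then DELETEREPLICA of
// the original copy. This is what leaves the uneven replica1/replica2 naming.
public class ReplaceNodeSketch {
    static String addReplicaUrl(String base, String coll, String shard, String targetNode) {
        return base + "/admin/collections?action=ADDREPLICA"
                + "&collection=" + coll + "&shard=" + shard + "&node=" + targetNode;
    }

    static String deleteReplicaUrl(String base, String coll, String shard, String replica) {
        return base + "/admin/collections?action=DELETEREPLICA"
                + "&collection=" + coll + "&shard=" + shard + "&replica=" + replica;
    }

    public static void main(String[] args) {
        String base = "http://localhost:8983/solr";   // hypothetical host
        System.out.println(addReplicaUrl(base, "x", "shard1", "target:8983_solr"));
        System.out.println(deleteReplicaUrl(base, "x", "shard1", "core_node1"));
    }
}
```

Option (b) would interleave two more such calls (create temp, delete source,
recreate with the original name, delete temp).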

> A REPLACENODE command to decommission an existing node with another new node
> 
>
> Key: SOLR-9320
> URL: https://issues.apache.org/jira/browse/SOLR-9320
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
> Fix For: 6.1
>
>
> The command should accept a source node and a target node, recreate the 
> replicas from the source node on the target, and then do a DELETENODE of 
> the source node.






[jira] [Comment Edited] (SOLR-8344) Decide default when requested fields are both column and row stored.

2016-07-19 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385318#comment-15385318
 ] 

David Smiley edited comment on SOLR-8344 at 7/20/16 4:49 AM:
-

bq. Is it possible to preferentially return DV when doing the first pass of a 
distributed search?

Yes, with customizations.  I have a patch in 
https://issues.apache.org/jira/browse/SOLR-5478 which is used by one of my 
clients, and I have another deployed implementation in which a custom request 
handler extending SearchHandler rewrites the id field in {{fl}} to be 
{{field(id)}} in certain circumstances, and that didn't require patching Solr.  
I'd love to see SOLR-5478 finally put to bed.  Oh, and I wrote tests.  Hmmm.
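The fl rewrite mentioned above can be sketched as a plain string transform
(hypothetical helper, not the actual client code): replace the bare id field
in the {{fl}} parameter with {{field(id)}} so the value is read from docValues
instead of the stored (compressed) document.

```java
// Hypothetical fl-parameter rewrite, as a custom SearchHandler might do it
// before forwarding the first-pass request.
public class FlRewrite {
    static String rewriteFl(String fl, String idField) {
        StringBuilder out = new StringBuilder();
        for (String f : fl.split(",")) {
            if (out.length() > 0) out.append(',');
            // Only the bare id field is wrapped; other fields pass through.
            out.append(f.trim().equals(idField) ? "field(" + idField + ")" : f.trim());
        }
        return out.toString();
    }
}
```

For example, an incoming {{fl=id,score}} would be forwarded as
{{fl=field(id),score}}.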


was (Author: dsmiley):
bq. Is it possible to preferentially return DV when doing the first pass of a 
distributed search?

Yes with customizations.  I have a patch in 
https://issues.apache.org/jira/browse/SOLR-5478 which is used by one of my 
clients, and I have another deployed implementation in which a custom request 
handler extending SearchHandler the id in {{fl}} to be {{field(id)}} in certain 
circumstances, and that didn't require patching Solr.  I'd love to see 
SOLR-5478 get finally put to bed.  Oh and I wrote tests.  Hmmm.

> Decide default when requested fields are both column and row stored.
> 
>
> Key: SOLR-8344
> URL: https://issues.apache.org/jira/browse/SOLR-8344
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>
> This issue was discussed in the comments at SOLR-8220. Splitting it out to a 
> separate issue so that we can have a focused discussion on whether/how to do 
> this.
> If a given set of requested fields are all stored and have docValues (column 
> stored), we can retrieve the values from either place.  What should the 
> default be?






[jira] [Commented] (SOLR-8344) Decide default when requested fields are both column and row stored.

2016-07-19 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385318#comment-15385318
 ] 

David Smiley commented on SOLR-8344:


bq. Is it possible to preferentially return DV when doing the first pass of a 
distributed search?

Yes, with customizations.  I have a patch in 
https://issues.apache.org/jira/browse/SOLR-5478 which is used by one of my 
clients, and I have another deployed implementation in which a custom request 
handler extending SearchHandler rewrites the id field in {{fl}} to be 
{{field(id)}} in certain circumstances, and that didn't require patching Solr.  
I'd love to see SOLR-5478 finally put to bed.  Oh, and I wrote tests.  Hmmm.

> Decide default when requested fields are both column and row stored.
> 
>
> Key: SOLR-8344
> URL: https://issues.apache.org/jira/browse/SOLR-8344
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>
> This issue was discussed in the comments at SOLR-8220. Splitting it out to a 
> separate issue so that we can have a focused discussion on whether/how to do 
> this.
> If a given set of requested fields are all stored and have docValues (column 
> stored), we can retrieve the values from either place.  What should the 
> default be?






[JENKINS] Lucene-Solr-6.x-Windows (32bit/jdk1.8.0_92) - Build # 331 - Still Unstable!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Windows/331/
Java: 32bit/jdk1.8.0_92 -server -XX:+UseParallelGC

2 tests failed.
FAILED:  org.apache.solr.handler.TestReplicationHandler.doTestStressReplication

Error Message:
[index.20160720112945206, index.20160720113009129, index.properties, replication.properties] expected:<1> but was:<2>

Stack Trace:
java.lang.AssertionError: [index.20160720112945206, index.20160720113009129, index.properties, replication.properties] expected:<1> but was:<2>
        at __randomizedtesting.SeedInfo.seed([45FED61E52539376:9E55D6D8577BFAC5]:0)
        at org.junit.Assert.fail(Assert.java:93)
        at org.junit.Assert.failNotEquals(Assert.java:647)
        at org.junit.Assert.assertEquals(Assert.java:128)
        at org.junit.Assert.assertEquals(Assert.java:472)
        at org.apache.solr.handler.TestReplicationHandler.checkForSingleIndex(TestReplicationHandler.java:907)
        at org.apache.solr.handler.TestReplicationHandler.doTestStressReplication(TestReplicationHandler.java:874)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
        at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
        at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedt

[jira] [Updated] (SOLR-9310) PeerSync fails on a node restart due to IndexFingerPrint mismatch

2016-07-19 Thread Pushkar Raste (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pushkar Raste updated SOLR-9310:

Attachment: (was: SOLR9310.patch)

> PeerSync fails on a node restart due to IndexFingerPrint mismatch
> -
>
> Key: SOLR-9310
> URL: https://issues.apache.org/jira/browse/SOLR-9310
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
> Attachments: SOLR-9310.patch
>
>
> I found that PeerSync fails if a node restarts and documents were indexed 
> while the node was down. The IndexFingerPrint check fails after the 
> recovering node applies updates.
> This happens only when a node restarts, not when a node merely misses 
> updates for a reason other than being down.
> Please see the attached patch for the test.






[jira] [Updated] (SOLR-9310) PeerSync fails on a node restart due to IndexFingerPrint mismatch

2016-07-19 Thread Pushkar Raste (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pushkar Raste updated SOLR-9310:

Attachment: (was: PeerSyncReplicationTest.patch)

> PeerSync fails on a node restart due to IndexFingerPrint mismatch
> -
>
> Key: SOLR-9310
> URL: https://issues.apache.org/jira/browse/SOLR-9310
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
> Attachments: SOLR-9310.patch
>
>
> I found that PeerSync fails if a node restarts and documents were indexed 
> while the node was down. The IndexFingerPrint check fails after the 
> recovering node applies updates.
> This happens only when a node restarts, not when a node merely misses 
> updates for a reason other than being down.
> Please see the attached patch for the test.






[jira] [Updated] (SOLR-9310) PeerSync fails on a node restart due to IndexFingerPrint mismatch

2016-07-19 Thread Pushkar Raste (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pushkar Raste updated SOLR-9310:

Attachment: SOLR-9310.patch

Updated patch and tests for scenarios I have described

> PeerSync fails on a node restart due to IndexFingerPrint mismatch
> -
>
> Key: SOLR-9310
> URL: https://issues.apache.org/jira/browse/SOLR-9310
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
> Attachments: PeerSyncReplicationTest.patch, SOLR-9310.patch, 
> SOLR9310.patch
>
>
> I found that PeerSync fails if a node restarts and documents were indexed 
> while the node was down. The IndexFingerPrint check fails after the 
> recovering node applies updates.
> This happens only when a node restarts, not when a node merely misses 
> updates for a reason other than being down.
> Please see the attached patch for the test.






[jira] [Commented] (SOLR-9310) PeerSync fails on a node restart due to IndexFingerPrint mismatch

2016-07-19 Thread Pushkar Raste (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385278#comment-15385278
 ] 

Pushkar Raste commented on SOLR-9310:
-

There is another problem in {{RealTimeGetComponent.processGetVersions()}}: 
since asking for a fingerprint causes a new RealTime searcher to open, we 
should first get the fingerprint and then get the versions from the ulog.
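The ordering fix can be sketched in pseudocode-style Java (the method names
mimic the discussion; this is not Solr's actual implementation, and the bodies
are stubs that only record the call order):

```java
import java.util.List;

// Sketch: computing the fingerprint opens a new realtime searcher, so it must
// happen before the version list is read from the update log, or the two can
// disagree about which updates are visible.
public class GetVersionsOrder {
    static StringBuilder log = new StringBuilder();

    static long computeFingerprint() {        // stub: would open a realtime searcher
        log.append("fingerprint;");
        return 42L;
    }

    static List<Long> versionsFromUlog() {    // stub: would read recent versions
        log.append("versions;");
        return List.of(3L, 2L, 1L);
    }

    // Corrected order: fingerprint first, then versions.
    static void processGetVersions() {
        computeFingerprint();
        versionsFromUlog();
    }
}
```

In the buggy ordering, the versions are collected first and the fingerprint
then reflects a newer view of the index than the version list does.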

> PeerSync fails on a node restart due to IndexFingerPrint mismatch
> -
>
> Key: SOLR-9310
> URL: https://issues.apache.org/jira/browse/SOLR-9310
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
> Attachments: PeerSyncReplicationTest.patch, SOLR9310.patch
>
>
> I found that PeerSync fails if a node restarts and documents were indexed 
> while the node was down. The IndexFingerPrint check fails after the 
> recovering node applies updates.
> This happens only when a node restarts, not when a node merely misses 
> updates for a reason other than being down.
> Please see the attached patch for the test.






[jira] [Commented] (SOLR-9310) PeerSync fails on a node restart due to IndexFingerPrint mismatch

2016-07-19 Thread Pushkar Raste (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385265#comment-15385265
 ] 

Pushkar Raste commented on SOLR-9310:
-

Although my patch works when there is no active indexing going on during 
PeerSync, the fingerprint check may still fail if there is active indexing. 
There are too many race conditions here.

In my opinion, people who are continuously indexing data should disable the 
fingerprint check.

> PeerSync fails on a node restart due to IndexFingerPrint mismatch
> -
>
> Key: SOLR-9310
> URL: https://issues.apache.org/jira/browse/SOLR-9310
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
> Attachments: PeerSyncReplicationTest.patch, SOLR9310.patch
>
>
> I found that PeerSync fails if a node restarts and documents were indexed 
> while the node was down. The IndexFingerPrint check fails after the 
> recovering node applies updates.
> This happens only when a node restarts, not when a node merely misses 
> updates for a reason other than being down.
> Please see the attached patch for the test.






[jira] [Updated] (LUCENE-7381) Add new RangeField

2016-07-19 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-7381:
---
Attachment: LUCENE-7381.patch

Minor clean up of some exception messages and javadocs.

> Add new RangeField
> --
>
> Key: LUCENE-7381
> URL: https://issues.apache.org/jira/browse/LUCENE-7381
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
> Attachments: LUCENE-7381.patch, LUCENE-7381.patch, LUCENE-7381.patch, 
> LUCENE-7381.patch, LUCENE-7381.patch
>
>
> I've been tinkering with a new Point-based {{RangeField}} for indexing 
> numeric ranges that could be useful for a number of applications.
> For example, a single dimension represents a span along a single axis (such 
> as a calendar entry's start and end times), a 2d range could represent 
> bounding boxes for geometric applications (e.g., supporting Point-based geo 
> shapes), 3d ranges bounding cubes for 3d geometric applications (collision 
> detection, 3d geospatial), and 4d ranges space-time applications. I'm sure 
> there's applicability for 5d+ ranges, but a first incarnation should likely 
> limit the dimensionality for performance.






[jira] [Commented] (SOLR-8344) Decide default when requested fields are both column and row stored.

2016-07-19 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385236#comment-15385236
 ] 

Erick Erickson commented on SOLR-8344:
--

Is it possible to preferentially return DV when doing the first pass of a 
distributed search? IIRC, Yonik said at one point that we already get the 
sort values from the index (or perhaps we can pull them from DV fields). But 
I just verified in 6.2 that the logic in DocStreamer decompresses the doc 
pretty much no matter what. The test for whether all the fields are DV fields 
fails due to the presence of the "score" pseudo-field.

I added a hack patch to 6810 to illustrate one path that sidesteps 
decompression.

Anyway, I was wondering if we could somehow detect that this was the first 
pass and return the DV fields all the time, although I do wonder whether some 
obscure case exists where the sorted order of multiValued DV fields would 
mess things up.

> Decide default when requested fields are both column and row stored.
> 
>
> Key: SOLR-8344
> URL: https://issues.apache.org/jira/browse/SOLR-8344
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>
> This issue was discussed in the comments at SOLR-8220. Splitting it out to a 
> separate issue so that we can have a focused discussion on whether/how to do 
> this.
> If a given set of requested fields are all stored and have docValues (column 
> stored), we can retrieve the values from either place.  What should the 
> default be?






[JENKINS-EA] Lucene-Solr-master-Linux (32bit/jdk-9-ea+127) - Build # 17306 - Unstable!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/17306/
Java: 32bit/jdk-9-ea+127 -server -XX:+UseSerialGC

1 tests failed.
FAILED:  org.apache.solr.cloud.ForceLeaderTest.testReplicasInLIRNoLeader

Error Message:
No live SolrServers available to handle this 
request:[http://127.0.0.1:39416/cy_/fb/forceleader_test_collection_shard1_replica1]

Stack Trace:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: No live 
SolrServers available to handle this 
request:[http://127.0.0.1:39416/cy_/fb/forceleader_test_collection_shard1_replica1]
at 
__randomizedtesting.SeedInfo.seed([EE5B7F2E14174DDC:8CC4BEE2D95B4BD]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:739)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1151)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1040)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:976)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.sendDocsWithRetry(AbstractFullDistribZkTestBase.java:753)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.sendDocsWithRetry(AbstractFullDistribZkTestBase.java:741)
at 
org.apache.solr.cloud.ForceLeaderTest.sendDoc(ForceLeaderTest.java:424)
at 
org.apache.solr.cloud.ForceLeaderTest.testReplicasInLIRNoLeader(ForceLeaderTest.java:131)
at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native 
Method)
at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:533)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:985)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:960)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.Sta

[JENKINS] Lucene-Solr-master-Windows (64bit/jdk1.8.0_92) - Build # 5994 - Still Unstable!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/5994/
Java: 64bit/jdk1.8.0_92 -XX:+UseCompressedOops -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.cloud.TestLocalFSCloudBackupRestore.test

Error Message:
Error from server at https://127.0.0.1:49301/solr: The backup directory already 
exists: 
file:///C:/Users/jenkins/workspace/Lucene-Solr-master-Windows/solr/build/solr-core/test/J1/temp/solr.cloud.TestLocalFSCloudBackupRestore_29622E2A8D9B2921-001/tempDir-002/mytestbackup/

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at https://127.0.0.1:49301/solr: The backup directory already 
exists: 
file:///C:/Users/jenkins/workspace/Lucene-Solr-master-Windows/solr/build/solr-core/test/J1/temp/solr.cloud.TestLocalFSCloudBackupRestore_29622E2A8D9B2921-001/tempDir-002/mytestbackup/
at 
__randomizedtesting.SeedInfo.seed([29622E2A8D9B2921:A13611F0236744D9]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:606)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:259)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:413)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:366)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1270)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1040)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:976)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
at 
org.apache.solr.cloud.AbstractCloudBackupRestoreTestCase.testBackupAndRestore(AbstractCloudBackupRestoreTestCase.java:206)
at 
org.apache.solr.cloud.AbstractCloudBackupRestoreTestCase.test(AbstractCloudBackupRestoreTestCase.java:126)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.

[jira] [Updated] (LUCENE-7381) Add new RangeField

2016-07-19 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-7381:
---
Attachment: LUCENE-7381.patch

Thanks [~jpountz]! 

bq. Should the field be called something like DoubleRange

+1. I was on the fence about this but I think it's the right way to go. Not only 
does it make sense for consistency, but it also reduces index size for ints and 
floats.

bq. The reuse of fieldsData in setRangeValues worries me a bit, is it safe?

We did this in {{LatLonPoint}}. I don't think it's any less safe than creating a 
new BytesRef (and it's more efficient?). Maybe [~mikemccand] has a better answer?

bq. QueryType does not need to be public?

+1

bq. Why do you replace infinities with +/-MAX_VALUE?

Good catch, I think that was in there from unrelated debugging funzies.

Updated patch is attached.
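
To make the per-dimension semantics concrete, here is a hedged sketch of the 
INTERSECTS relation a RangeField query supports: an indexed range and a query 
range intersect iff their intervals overlap on every dimension. The names here 
are illustrative, not the patch's actual API.

```java
public class RangeIntersects {
    // min/max: the indexed range, qMin/qMax: the query range, one entry per dimension.
    static boolean intersects(double[] min, double[] max, double[] qMin, double[] qMax) {
        for (int d = 0; d < min.length; d++) {
            if (max[d] < qMin[d] || min[d] > qMax[d]) {
                return false; // disjoint on one axis => disjoint overall
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 2-d "bounding box" style check, e.g. calendar spans or geo boxes.
        double[] min = {0, 0}, max = {10, 10};
        System.out.println(intersects(min, max, new double[]{5, 5}, new double[]{20, 20}));  // true
        System.out.println(intersects(min, max, new double[]{11, 0}, new double[]{20, 5}));  // false
    }
}
```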

> Add new RangeField
> --
>
> Key: LUCENE-7381
> URL: https://issues.apache.org/jira/browse/LUCENE-7381
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
> Attachments: LUCENE-7381.patch, LUCENE-7381.patch, LUCENE-7381.patch, 
> LUCENE-7381.patch
>
>
> I've been tinkering with a new Point-based {{RangeField}} for indexing 
> numeric ranges that could be useful for a number of applications.
> For example, a single dimension represents a span along a single axis such as 
> indexing calendar entries start and end time, 2d range could represent 
> bounding boxes for geometric applications (e.g., supporting Point based geo 
> shapes), 3d ranges bounding cubes for 3d geometric applications (collision 
> detection, 3d geospatial), and 4d ranges for space time applications. I'm 
> sure there's applicability for 5d+ ranges but a first incarnation should 
> likely limit for performance.






[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

2016-07-19 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385215#comment-15385215
 ] 

Shalin Shekhar Mangar commented on SOLR-5944:
-

bq. Btw, I've surrounded only the lookupVersion() calls with the acquiring and 
releasing of the lock, instead of surrounding the entire wait loop with the 
acquiring/releasing of the lock: I reasoned that while we are waiting in that 
wait loop, other threads need to have indexed the update that we're waiting on, 
and hence I released the lock as soon as it was not needed, only to re-acquire 
it after 100ms. Does that sound like a valid reason?

The read lock is for safe publication of fields in the update log, and it is 
acquired by indexing threads that only want to read from the update log. Also, 
read locks can be held by multiple readers. Therefore, acquiring this lock does 
not prevent other threads from indexing.

Also, please be very careful when changing the order in which locks are 
acquired, because it can result in deadlocks. It is good practice to acquire 
them in the same sequence as everywhere else in the code. So synchronizing on 
the bucket before vinfo.lockForUpdate for a small optimization doesn't seem 
worthwhile to me.
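
A minimal sketch of the shared-read-lock point above (illustrative names, not 
Solr's actual UpdateLog/VersionInfo classes): many indexing threads can hold the 
read lock at once, so taking it does not serialize indexing; only the rare 
exclusive writer blocks readers.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class UlogLockSketch {
    private final ReentrantReadWriteLock vinfoLock = new ReentrantReadWriteLock();
    private volatile long lastVersion = 0;

    // Many indexing threads may run this concurrently: the read lock is shared.
    void indexDoc(long version) {
        vinfoLock.readLock().lock();
        try {
            lastVersion = version; // publish update-log state
        } finally {
            vinfoLock.readLock().unlock();
        }
    }

    // Exclusive section, e.g. switching the update log into buffering mode.
    void blockUpdates(Runnable exclusiveWork) {
        vinfoLock.writeLock().lock();
        try {
            exclusiveWork.run();
        } finally {
            vinfoLock.writeLock().unlock();
        }
    }

    long currentVersion() {
        return lastVersion;
    }
}
```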

bq. Since this method enters the wait loop for every in-place update that has 
arrived out of order at a replica (an event, that I think is frequent under 
multithreaded load), I don't want every such update to be waiting for the full 
timeout period (5s here), but instead check back from time to time. In most of 
the cases, the dependent update would've been written (by another thread) 
within the first 100ms, after which we can bail out. Do you think that makes 
sense?

You misunderstand. A wait(5000) does not mean that you wait the full 5 
seconds. Any notifyAll() will wake up the waiting thread, and when it does, it 
will check lastFoundVersion and proceed accordingly. In practice wait(100) 
may not be so bad, but if an update doesn't arrive for more than 100ms the 
thread will wake up and look up the version needlessly with your current patch.

A few more comments:
# In your latest patch, acquiring the read lock to call versionAdd is not 
necessary -- it will do that anyway. You can re-acquire it for reading the 
version after the method call returns.
# I don't think the case of {{vinfo.lookupVersion}} returning a negative value 
(for deletes) is handled here at all.
# What happens if the document has been deleted already (due to reordering on 
the replica) when you enter waitForDependentUpdates? i.e. what if re-ordering 
leads to new_doc (v1) -> del_doc (v3) -> dv_update (v2) on the replica?
# Similarly, how do we handle the case when the doc has been deleted on the 
leader when you execute fetchMissingUpdateFromLeader? Does RTG return the 
requested version even if the doc has been deleted already? I suspect it does, 
but it would be nice to confirm.
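
A standalone sketch of the wait/notifyAll point above (illustrative names, not 
Solr's version-bucket code): the waiter passes the full timeout to wait(), but 
a notifyAll() from the thread that indexes the dependent update wakes it as 
soon as the version advances, so it does not sleep anywhere near 5 seconds.

```java
public class WaitNotifyDemo {
    private final Object bucket = new Object();
    private long lastFoundVersion = 0;

    // Wait until the wanted version is published or the timeout elapses.
    long waitForVersion(long wanted, long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        synchronized (bucket) {
            while (lastFoundVersion < wanted) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) break;   // timed out
                bucket.wait(remaining);      // woken early by notifyAll()
            }
            return lastFoundVersion;
        }
    }

    // Called by the thread that indexed the dependent update.
    void publish(long version) {
        synchronized (bucket) {
            lastFoundVersion = version;
            bucket.notifyAll();              // wake all waiters immediately
        }
    }

    public static void main(String[] args) throws Exception {
        WaitNotifyDemo demo = new WaitNotifyDemo();
        Thread writer = new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) {}
            demo.publish(42);
        });
        writer.start();
        long v = demo.waitForVersion(42, 5000); // returns well before the 5s timeout
        System.out.println("found version " + v);
        writer.join();
    }
}
```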

> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.






[jira] [Comment Edited] (LUCENE-7387) Something wrong with how "File Formats" link is generated in docs/index.html - can cause precommit to fail on some systems

2016-07-19 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385177#comment-15385177
 ] 

Hoss Man edited comment on LUCENE-7387 at 7/20/16 1:57 AM:
---

The source of the newline is the original newline in {{Codec.java}} ... the way 
we're using {{containsregexp}} to only pass through the line we want, and to 
replace the entire line with only the codec name, doesn't do anything to remove 
the newline ... oddly enough, removing {{$}} from the pattern and using 
{{flags="s"}} to get the final {{.}} to match (and thus ignore) the line ending 
doesn't seem to help.
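
The newline-survives behavior can be reproduced with a plain Java regex (a 
hedged sketch only; the build does this through Ant's filter chain, and 
{{extractCodecName}} is an illustrative name): in Java regexes, {{$}} matches 
*before* a final line terminator, so replacing the whole line still leaves the 
{{\n}} behind.

```java
public class TrailingNewline {
    // Replace the entire matched line with just the codec name.
    static String extractCodecName(String line) {
        return line.replaceAll("^.*(lucene\\d+).*$", "$1");
    }

    public static void main(String[] args) {
        String line = "  return LOADER.lookup(\"lucene62\");\n";
        String name = extractCodecName(line);
        // '$' matched before the final '\n', so the newline was never part of
        // the replaced region -- the brackets print on separate lines:
        System.out.println("[" + name + "]");
        System.out.println(name.trim()); // an explicit trim removes it
    }
}
```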

In this patch I've added an {{}} to remove the newline, 
preceded by an explicit {{}} to ensure {{\n}} is the *only* thing we 
might have at the end of that line, regardless of the platform defaults.



This doesn't explain why Ant 1.9.4 was converting the newline to a non-breaking 
space (probably something changed in the xslt task?) but honestly I don't care 
as long as we fix the root problem.

My bigger concern is why documentation-lint isn't failing if/when our links 
have newlines in them like this?


was (Author: hossman):
The source of the newline is the original newline in {{Codec.java}} ... the way 
we're using {{containsregex}} only passing through the line we want, and to 
replace the entire line with only the codec name doesn't do anything to remove 
the newline ... oddly enough removing {{$}} from the pattern and using 
{{flags="s"}} to get the final {{.}} to match (and thus ignore) the line ending 
doesn't seem to help.

In this patch I've added an {{}} to ensure {{\n}} is the *only* thing we 
might have at the end of that line, regardless of the platform defaults.

This doesn't explain why Ant 1.9.4 was converting the newline to a non-breaking 
space (probably something changed in the xslt task?) but honestly I don't care 
as long as we fix the root problem.

My bigger concern is why documentation-lint isn't failing if/when our links 
have newlines in them like this?

> Something wrong with how "File Formats" link is generated in docs/index.html 
> - can cause precommit to fail on some systems
> --
>
> Key: LUCENE-7387
> URL: https://issues.apache.org/jira/browse/LUCENE-7387
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Hoss Man
> Attachments: LUCENE-7387.patch
>
>
> I'm not sure what's going on, but here's what I've figured out while poking 
> at things with Ishan to try and figure out why {{ant precommit}} fails for 
> him on a clean checkout of master...
> * on my machine, with a clean checkout, the generated index.html file has 
> lines that look like this...{noformat}
> 
> File Formats: Guide to the 
> supported index format used by Lucene.  This can be customized by using  href="core/org/apache/lucene/codecs/package-summary.html#package.description">an
>  alternate codec.
> 
> {noformat}...note there is a newline in the href after {{lucene62}}
> * on ishan's machine, with a clean checkout, the same line looks like 
> this...{noformat}
> 
>  href="core/org/apache/lucene/codecs/lucene62%0A/package-summary.html#package.description">File
>  Formats: Guide to the supported index format used by Lucene.  This can 
> be customized by using  href="core/org/apache/lucene/codecs/package-summary.html#package.description">an
>  alternate codec.
> 
> {noformat}...note that he has a URL escaped {{'NO-BREAK SPACE' (U+00A0)}} 
> character in href attribute.
> * on my machine, {{ant documentation-lint}} doesn't complain about the 
> newline in the href attribute when checking links.
> * on ishan's machine, {{ant documentation-lint}} most certainly complains 
> about the 'NO-BREAK SPACE'...{noformat}
> ...
> -documentation-lint:
>  [echo] checking for broken html...
> [jtidy] Checking for broken html (such as invalid tags)...
>[delete] Deleting directory 
> /home/ishan/code/chatman-lucene-solr/lucene/build/jtidy_tmp
>  [echo] Checking for broken links...
>  [exec] 
>  [exec] Crawl/parse...
>  [exec] 
>  [exec] Verify...
>  [exec] 
>  [exec] file:///build/docs/index.html
>  [exec]   BROKEN LINK: 
> file:///build/docs/core/org/apache/lucene/codecs/lucene62%0A/package-summary.html
>  [exec] 
>  [exec] Broken javadocs links were found!
> BUILD FAILED
> {noformat}
> Raising the following questions...
> * How is *either* a newline or a 'NO-BREAK SPACE' getting introduced into the 
> {{$defaultCodecPackage}} variable that index.xsl uses to generate that href 
> attribute?
> * why doesn't {{documentation-lint}} complain that the href has a newline in 
> it on my system?




[jira] [Updated] (LUCENE-7387) Something wrong with how "File Formats" link is generated in docs/index.html - can cause precommit to fail on some systems

2016-07-19 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-7387:
-
Attachment: LUCENE-7387.patch

The source of the newline is the original newline in {{Codec.java}} ... the way 
we're using {{containsregex}} only passing through the line we want, and to 
replace the entire line with only the codec name doesn't do anything to remove 
the newline ... oddly enough removing {{$}} from the pattern and using 
{{flags="s"}} to get the final {{.}} to match (and thus ignore) the line ending 
doesn't seem to help.

In this patch I've added an {{}} to ensure {{\n}} is the *only* thing we 
might have at the end of that line, regardless of the platform defaults.

This doesn't explain why Ant 1.9.4 was converting the newline to a non-breaking 
space (probably something changed in the xslt task?) but honestly I don't care 
as long as we fix the root problem.

My bigger concern is why documentation-lint isn't failing if/when our links 
have newlines in them like this?







[jira] [Comment Edited] (LUCENE-7387) Something wrong with how "File Formats" link is generated in docs/index.html - can cause precommit to fail on some systems

2016-07-19 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385162#comment-15385162
 ] 

Ishan Chattopadhyaya edited comment on LUCENE-7387 at 7/20/16 1:28 AM:
---

Just downgraded from ant 1.9.6 (that was preinstalled with Fedora 23) to 1.9.4, 
and {{ant documentation-lint}} passed. However, it seems like a genuine bug and 
shouldn't have passed. I see a newline with 1.9.4 (doc lint passes), and the 
NO-BREAK SPACE character with 1.9.6 (doc lint fails).


was (Author: ichattopadhyaya):
Just downgraded from ant 1.9.6 (that was preinstalled with Fedora 23) to 1.9.4, 
and {{ant documentation-lint}} passed. However, it seems like a genuine bug and 
shouldn't have passed. 







[jira] [Commented] (LUCENE-7387) Something wrong with how "File Formats" link is generated in docs/index.html - can cause precommit to fail on some systems

2016-07-19 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385162#comment-15385162
 ] 

Ishan Chattopadhyaya commented on LUCENE-7387:
--

Just downgraded from ant 1.9.6 (that was preinstalled with Fedora 23) to 1.9.4, 
and {{ant documentation-lint}} passed. However, it seems like a genuine bug and 
shouldn't have passed. 

> Something wrong with how "File Formats" link is generated in docs/index.html 
> - can cause precommit to fail on some systems
> --
>
> Key: LUCENE-7387
> URL: https://issues.apache.org/jira/browse/LUCENE-7387
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Hoss Man
>
> I'm not sure what's going on, but here's what I've figured out while poking 
> at things with Ishan to try and figure out why {{ant precommit}} fails for 
> him on a clean checkout of master...
> * on my machine, with a clean checkout, the generated index.html file has 
> lines that look like this...{noformat}
> 
> File Formats: Guide to the 
> supported index format used by Lucene.  This can be customized by using  href="core/org/apache/lucene/codecs/package-summary.html#package.description">an
>  alternate codec.
> 
> {noformat}...note there is a newline in the href after {{lucene62}}
> * on ishan's machine, with a clean checkout, the same line looks like 
> this...{noformat}
> 
>  href="core/org/apache/lucene/codecs/lucene62%0A/package-summary.html#package.description">File
>  Formats: Guide to the supported index format used by Lucene.  This can 
> be customized by using  href="core/org/apache/lucene/codecs/package-summary.html#package.description">an
>  alternate codec.
> 
> {noformat}...note that he has a URL escaped {{'NO-BREAK SPACE' (U+00A0)}} 
> character in href attribute.
> * on my machine, {{ant documentation-lint}} doesn't complain about the 
> newline in the href attribute when checking links.
> * on ishan's machine, {{ant documentation-lint}} most certainly complains 
> about the 'NO-BREAK SPACE'...{noformat}
> ...
> -documentation-lint:
>  [echo] checking for broken html...
> [jtidy] Checking for broken html (such as invalid tags)...
>[delete] Deleting directory 
> /home/ishan/code/chatman-lucene-solr/lucene/build/jtidy_tmp
>  [echo] Checking for broken links...
>  [exec] 
>  [exec] Crawl/parse...
>  [exec] 
>  [exec] Verify...
>  [exec] 
>  [exec] file:///build/docs/index.html
>  [exec]   BROKEN LINK: 
> file:///build/docs/core/org/apache/lucene/codecs/lucene62%0A/package-summary.html
>  [exec] 
>  [exec] Broken javadocs links were found!
> BUILD FAILED
> {noformat}
> Raising the following questions...
> * How is *either* a newline or a 'NO-BREAK SPACE' getting introduced into the 
> {{$defaultCodecPackage}} variable that index.xsl uses to generate that href 
> attribute?
> * Why doesn't {{documentation-lint}} complain that the href has a newline in 
> it on my system?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-7387) Something wrong with how "File Formats" link is generated in docs/index.html - can cause precommit to fail on some systems

2016-07-19 Thread Hoss Man (JIRA)
Hoss Man created LUCENE-7387:


 Summary: Something wrong with how "File Formats" link is generated 
in docs/index.html - can cause precommit to fail on some systems
 Key: LUCENE-7387
 URL: https://issues.apache.org/jira/browse/LUCENE-7387
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man



I'm not sure what's going on, but here's what I've figured out while poking at 
things with Ishan to try and figure out why {{ant precommit}} fails for him on 
a clean checkout of master...

* on my machine, with a clean checkout, the generated index.html file has lines 
that look like this...{noformat}

<a href="core/org/apache/lucene/codecs/lucene62
/package-summary.html#package.description">File Formats</a>: Guide to the 
supported index format used by Lucene.  This can be customized by using <a 
href="core/org/apache/lucene/codecs/package-summary.html#package.description">an
 alternate codec</a>.

{noformat}...note there is a newline in the href after {{lucene62}}
* on ishan's machine, with a clean checkout, the same line looks like 
this...{noformat}

<a href="core/org/apache/lucene/codecs/lucene62%0A/package-summary.html#package.description">File
 Formats</a>: Guide to the supported index format used by Lucene.  This can be 
customized by using <a href="core/org/apache/lucene/codecs/package-summary.html#package.description">an
 alternate codec</a>.

{noformat}...note that he has a URL-escaped {{'NO-BREAK SPACE' (U+00A0)}} 
character in the href attribute.
* on my machine, {{ant documentation-lint}} doesn't complain about the newline 
in the href attribute when checking links.
* on ishan's machine, {{ant documentation-lint}} most certainly complains about 
the 'NO-BREAK SPACE'...{noformat}
...
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/home/ishan/code/chatman-lucene-solr/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [exec] 
 [exec] file:///build/docs/index.html
 [exec]   BROKEN LINK: 
file:///build/docs/core/org/apache/lucene/codecs/lucene62%0A/package-summary.html
 [exec] 
 [exec] Broken javadocs links were found!
BUILD FAILED

{noformat}

Raising the following questions...

* How is *either* a newline or a 'NO-BREAK SPACE' getting introduced into the 
{{$defaultCodecPackage}} variable that index.xsl uses to generate that href 
attribute?
* Why doesn't {{documentation-lint}} complain that the href has a newline in it 
on my system?
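A possible factor behind the second question: in Java, a literal newline is ordinary whitespace that trim-style normalization silently removes, while NO-BREAK SPACE (U+00A0) is a space *character* but not Java whitespace, so it survives into the generated href and gets URL-escaped. A small self-contained demonstration (my own illustration, not the actual documentation-lint code):

```java
public class WhitespaceDemo {
    public static void main(String[] args) {
        // '\n' is ordinary whitespace: trim() and many parsers silently drop it.
        System.out.println(Character.isWhitespace('\n'));       // true
        // U+00A0 NO-BREAK SPACE is a space character but NOT Java whitespace,
        // so trim-style cleanup leaves it in place.
        System.out.println(Character.isWhitespace('\u00A0'));   // false
        System.out.println("lucene62\n".trim());                // "lucene62"
        System.out.println("lucene62\u00A0".trim().length());   // 9 (NBSP kept)
    }
}
```

That asymmetry could let a stray newline in {{$defaultCodecPackage}} be normalized away on one machine while a NO-BREAK SPACE survives on another; how either character gets into the variable in the first place is still the open question above.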







[jira] [Updated] (SOLR-9308) SolrCloud RTG doesn't forward any params to shards, causes fqs & non-default fl params to be ignored

2016-07-19 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-9308:
---
Attachment: SOLR-9308.patch


Ugh...

* TestStressCloudBlindAtomicUpdates has been using RTG + filter queries to 
assert that atomic updates work -- but because of this issue the filter queries 
were getting silently ignored, and the test wasn't as strong as I thought when 
I wrote it.
* TestStressCloudBlindAtomicUpdates evidently had a bug in how it formatted the 
{{fq}} params when trying to filter on negative numbers -- but again: because 
of SOLR-9308 those filter queries were never getting parsed, and the test bug 
went unnoticed until now.

Latest patch updated to also fix the bug in TestStressCloudBlindAtomicUpdates 
now that the filter queries are getting parsed & used correctly.



> SolrCloud RTG doesn't forward any params to shards, causes fqs & non-default 
> fl params to be ignored
> 
>
> Key: SOLR-9308
> URL: https://issues.apache.org/jira/browse/SOLR-9308
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9308.patch, SOLR-9308.patch, SOLR-9308.patch
>
>
> While working on a robust randomized test for SOLR-9285, I can't seem to get 
> filter queries on RTG to work at all -- even when the docs are fully 
> committed.
> steps to reproduce to follow in comment...






[jira] [Commented] (SOLR-7495) Unexpected docvalues type NUMERIC when grouping by a int facet

2016-07-19 Thread Nick Coult (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385097#comment-15385097
 ] 

Nick Coult commented on SOLR-7495:
--

Can you supply your patch for 6.1?
Thanks


> Unexpected docvalues type NUMERIC when grouping by a int facet
> --
>
> Key: SOLR-7495
> URL: https://issues.apache.org/jira/browse/SOLR-7495
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.0, 5.1, 5.2, 5.3
>Reporter: Fabio Batista da Silva
> Attachments: SOLR-7495.patch, SOLR-7495.patch
>
>
> Hey All,
> After upgrading from solr 4.10 to 5.1 with solr cloud
> I'm getting an IllegalStateException when I try to facet an int field.
> IllegalStateException: unexpected docvalues type NUMERIC for field 'year' 
> (expected=SORTED). Use UninvertingReader or index with docvalues.
> schema.xml
> {code}
> 
> 
> 
> 
> 
> 
>  multiValued="false" required="true"/>
>  multiValued="false" required="true"/>
> 
> 
>  stored="true"/>
> 
> 
> 
>  />
>  sortMissingLast="true"/>
>  positionIncrementGap="0"/>
>  positionIncrementGap="0"/>
>  positionIncrementGap="0"/>
>  precisionStep="0" positionIncrementGap="0"/>
>  positionIncrementGap="0"/>
>  positionIncrementGap="100">
> 
> 
>  words="stopwords.txt" />
> 
>  maxGramSize="15"/>
> 
> 
> 
>  words="stopwords.txt" />
>  synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> 
> 
> 
>  positionIncrementGap="100">
> 
> 
>  words="stopwords.txt" />
> 
>  maxGramSize="15"/>
> 
> 
> 
>  words="stopwords.txt" />
>  synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> 
> 
> 
>  class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" 
> distErrPct="0.025" maxDistErr="0.09" units="degrees" />
> 
> id
> name
> 
> 
> {code}
> query :
> {code}
> http://solr.dev:8983/solr/my_collection/select?wt=json&fl=id&fq=index_type:foobar&group=true&group.field=year_make_model&group.facet=true&facet=true&facet.field=year
> {code}
> Exception :
> {code}
> null:org.apache.solr.common.SolrException: Exception during facet.field: year
> at org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:627)
> at org.apache.solr.request.SimpleFacets$3.call(SimpleFacets.java:612)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at org.apache.solr.request.SimpleFacets$2.execute(SimpleFacets.java:566)
> at 
> org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:637)
> at 
> org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:280)
> at 
> org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:106)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:222)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:1

[jira] [Updated] (SOLR-5944) Support updates of numeric DocValues

2016-07-19 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-5944:
---
Attachment: SOLR-5944.patch

Thanks Shalin, I've fixed it in my updated patch, mostly along the lines of 
what you suggested in that snippet. Can you please take a look?

Btw, I've surrounded only the lookupVersion() calls with the acquiring and 
releasing of the lock, instead of surrounding the entire wait loop: I reasoned 
that while we are waiting in that loop, other threads need to be able to index 
the update that we're waiting on, and hence I release the lock as soon as it is 
not needed, only to re-acquire it after 100ms. Does that sound like a valid 
reason?

bq. Secondly, this method can be made more efficient. It currently wakes up 
every 100ms and reads the new "lastFoundVersion" from the update log or index. 
This is wasteful. A better way would be to wait for the timeout period directly 
after calling vinfo.lookupVersion() inside the synchronized block.

Since this method enters the wait loop for every in-place update that has 
arrived out of order at a replica (an event that I think is frequent under 
multithreaded load), I don't want every such update to be waiting for the full 
timeout period (5s here), but instead check back from time to time. In most of 
the cases, the dependent update would've been written (by another thread) 
within the first 100ms, after which we can bail out. Do you think that makes 
sense?
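The strategy described above -- hold the lock only around each lookup and release it while sleeping, so other threads can index the update being waited on -- can be sketched roughly as follows. This is an illustrative model only; the class, method names, and the LongSupplier lookup are stand-ins, not Solr's actual API:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;

public class VersionWaiter {
    private final ReentrantLock lock = new ReentrantLock();

    /** Polls every 100ms until lookupVersion reports at least 'wanted'. */
    public boolean waitForVersion(long wanted, LongSupplier lookupVersion, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() < deadline) {
            lock.lock();                 // hold the lock only for the lookup...
            try {
                if (lookupVersion.getAsLong() >= wanted) {
                    return true;         // dependent update has been written
                }
            } finally {
                lock.unlock();           // ...release before going back to sleep
            }
            try {
                Thread.sleep(100);       // re-check every 100ms instead of
            } catch (InterruptedException e) {   // waiting the full timeout
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;                    // timed out
    }
}
```

The point of the sketch is that another thread can acquire the lock and index the awaited update during the sleep, which is exactly the reasoning given above for not holding the lock across the whole loop.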

> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.






[JENKINS] Lucene-Solr-master-Solaris (64bit/jdk1.8.0) - Build # 728 - Unstable!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Solaris/728/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC

1 tests failed.
FAILED:  
org.apache.solr.cloud.DistributedVersionInfoTest.testReplicaVersionHandling

Error Message:
Captured an uncaught exception in thread: Thread[id=2363, name=Thread-457, 
state=RUNNABLE, group=TGRP-DistributedVersionInfoTest]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=2363, name=Thread-457, state=RUNNABLE, 
group=TGRP-DistributedVersionInfoTest]
at 
__randomizedtesting.SeedInfo.seed([18738F097A62141C:C48A58F3D819DE5D]:0)
Caused by: java.lang.IllegalArgumentException: bound must be positive
at __randomizedtesting.SeedInfo.seed([18738F097A62141C]:0)
at java.util.Random.nextInt(Random.java:388)
at 
org.apache.solr.cloud.DistributedVersionInfoTest$3.run(DistributedVersionInfoTest.java:204)
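The root cause in the trace is the documented contract of java.util.Random.nextInt(int): the bound must be strictly positive, so any test code that can derive a zero bound throws exactly this IllegalArgumentException. A minimal reproduction plus the usual guard (my illustration, not the test's actual code):

```java
import java.util.Random;

public class BoundDemo {
    public static void main(String[] args) {
        Random r = new Random();
        try {
            r.nextInt(0);  // bound must be > 0, always throws
            throw new AssertionError("unreachable");
        } catch (IllegalArgumentException expected) {
            // message matches the stack trace above: "bound must be positive"
            System.out.println(expected.getMessage());
        }
        // Typical guard when the bound comes from a possibly-empty range:
        int maxDoc = 0;
        int pick = (maxDoc > 0) ? r.nextInt(maxDoc) : 0;
        System.out.println(pick);
    }
}
```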




Build Log:
[...truncated 10750 lines...]
   [junit4] Suite: org.apache.solr.cloud.DistributedVersionInfoTest
   [junit4]   2> Creating dataDir: 
/export/home/jenkins/workspace/Lucene-Solr-master-Solaris/solr/build/solr-core/test/J0/temp/solr.cloud.DistributedVersionInfoTest_18738F097A62141C-001/init-core-data-001
   [junit4]   2> 289648 INFO  
(SUITE-DistributedVersionInfoTest-seed#[18738F097A62141C]-worker) [] 
o.a.s.SolrTestCaseJ4 Randomized ssl (false) and clientAuth (false) via: 
@org.apache.solr.SolrTestCaseJ4$SuppressSSL(bugUrl=https://issues.apache.org/jira/browse/SOLR-5776)
   [junit4]   2> 289651 INFO  
(SUITE-DistributedVersionInfoTest-seed#[18738F097A62141C]-worker) [] 
o.a.s.c.ZkTestServer STARTING ZK TEST SERVER
   [junit4]   2> 289651 INFO  (Thread-423) [] o.a.s.c.ZkTestServer client 
port:0.0.0.0/0.0.0.0:0
   [junit4]   2> 289651 INFO  (Thread-423) [] o.a.s.c.ZkTestServer Starting 
server
   [junit4]   2> 289752 INFO  
(SUITE-DistributedVersionInfoTest-seed#[18738F097A62141C]-worker) [] 
o.a.s.c.ZkTestServer start zk server on port:43861
   [junit4]   2> 289752 INFO  
(SUITE-DistributedVersionInfoTest-seed#[18738F097A62141C]-worker) [] 
o.a.s.c.c.SolrZkClient Using default ZkCredentialsProvider
   [junit4]   2> 289753 INFO  
(SUITE-DistributedVersionInfoTest-seed#[18738F097A62141C]-worker) [] 
o.a.s.c.c.ConnectionManager Waiting for client to connect to ZooKeeper
   [junit4]   2> 289757 INFO  (zkCallback-461-thread-1) [] 
o.a.s.c.c.ConnectionManager Watcher 
org.apache.solr.common.cloud.ConnectionManager@abd6ecc name:ZooKeeperConnection 
Watcher:127.0.0.1:43861 got event WatchedEvent state:SyncConnected type:None 
path:null path:null type:None
   [junit4]   2> 289757 INFO  
(SUITE-DistributedVersionInfoTest-seed#[18738F097A62141C]-worker) [] 
o.a.s.c.c.ConnectionManager Client is connected to ZooKeeper
   [junit4]   2> 289757 INFO  
(SUITE-DistributedVersionInfoTest-seed#[18738F097A62141C]-worker) [] 
o.a.s.c.c.SolrZkClient Using default ZkACLProvider
   [junit4]   2> 289757 INFO  
(SUITE-DistributedVersionInfoTest-seed#[18738F097A62141C]-worker) [] 
o.a.s.c.c.SolrZkClient makePath: /solr/solr.xml
   [junit4]   2> 289775 INFO  (jetty-launcher-460-thread-1) [] 
o.e.j.s.Server jetty-9.3.8.v20160314
   [junit4]   2> 289776 INFO  (jetty-launcher-460-thread-1) [] 
o.e.j.s.h.ContextHandler Started 
o.e.j.s.ServletContextHandler@2bc11fc8{/solr,null,AVAILABLE}
   [junit4]   2> 289778 INFO  (jetty-launcher-460-thread-1) [] 
o.e.j.s.ServerConnector Started 
ServerConnector@223de35c{HTTP/1.1,[http/1.1]}{127.0.0.1:42836}
   [junit4]   2> 289778 INFO  (jetty-launcher-460-thread-1) [] 
o.e.j.s.Server Started @292836ms
   [junit4]   2> 289778 INFO  (jetty-launcher-460-thread-1) [] 
o.a.s.c.s.e.JettySolrRunner Jetty properties: {hostContext=/solr, 
hostPort=42836}
   [junit4]   2> 289779 INFO  (jetty-launcher-460-thread-1) [] 
o.a.s.s.SolrDispatchFilter SolrDispatchFilter.init(): 
sun.misc.Launcher$AppClassLoader@6d06d69c
   [junit4]   2> 289779 INFO  (jetty-launcher-460-thread-1) [] 
o.a.s.c.SolrResourceLoader new SolrResourceLoader for directory: 
'/export/home/jenkins/workspace/Lucene-Solr-master-Solaris/solr/build/solr-core/test/J0/temp/solr.cloud.DistributedVersionInfoTest_18738F097A62141C-001/tempDir-001/node1'
   [junit4]   2> 289779 INFO  (jetty-launcher-460-thread-1) [] 
o.a.s.c.SolrResourceLoader JNDI not configured for solr (NoInitialContextEx)
   [junit4]   2> 289779 INFO  (jetty-launcher-460-thread-1) [] 
o.a.s.c.SolrResourceLoader solr home defaulted to 'solr/' (could not find 
system property or JNDI)
   [junit4]   2> 289779 INFO  (jetty-launcher-460-thread-1) [] 
o.a.s.c.c.SolrZkClient Using default ZkCredentialsProvider
   [junit4]   2> 289780 INFO  (jetty-launcher-460-thread-1) [] 
o.a.s.c.c.ConnectionManager Waiting for client to connect to ZooKeeper
   [junit4]   2> 289781 INFO  (jetty-launcher-460-thread-2) [] 
o.e.j.s.Server jetty-9.3.8.v20160

[jira] [Updated] (SOLR-6810) Faster searching limited but high rows across many shards all with many hits

2016-07-19 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-6810:
-
Attachment: SOLR-6810-hack-eoe.patch

SOLR-8220 does NOT resolve this, but I think it lays the groundwork for a much 
smaller implementation.

I've attached a patch that is a PoC; note there are //nocommits where I write 
to System.out from CompressingStoredFieldsReader, just for easy verification 
of whether we're decompressing or not.

Also see the nocommit in DocsStreamer. To make this work you need to define 
your id field as stored=false, dv=true. I don't think I understand 
useDocValuesAsStored, because setting stored=true useDocValuesAsStored=true 
still gets the stored field; I'll have to figure that out.

I'm sure this isn't an optimal implementation, but maybe it'll prompt some more 
carefully thought-out approaches.

Mostly putting this up for comment; I'm probably not going to pursue this in 
the near future, though.
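For context, the two-phase distributed flow described in the quoted issue -- every shard returns only (id, score) pairs, the coordinator merges them into the global top-k, and full stored documents are fetched for just those k ids -- can be modeled roughly like this. A simplified sketch; class and method names are mine, not Solr's internals:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TwoPhaseTopK {
    static class Hit {
        final String id; final float score;
        Hit(String id, float score) { this.id = id; this.score = score; }
    }

    /** Phase 1: merge the cheap per-shard (id, score) lists into the global
     *  top-k. Only these k ids proceed to the expensive phase 2 of fetching
     *  full stored documents, instead of shards*k stored reads. */
    static List<Hit> globalTopK(List<List<Hit>> perShard, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> shard : perShard) all.addAll(shard);          // up to shards*k entries
        all.sort(Comparator.comparingDouble((Hit h) -> -h.score));   // best first
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }
}
```

The cost the issue complains about is that the store (and hence decompression) is touched for all shards*k candidates rather than only the k survivors of this merge.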

> Faster searching limited but high rows across many shards all with many hits
> 
>
> Key: SOLR-6810
> URL: https://issues.apache.org/jira/browse/SOLR-6810
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Per Steffensen
>Assignee: Shalin Shekhar Mangar
>  Labels: distributed_search, performance
> Attachments: SOLR-6810-hack-eoe.patch, SOLR-6810-trunk.patch, 
> SOLR-6810-trunk.patch, SOLR-6810-trunk.patch, branch_5x_rev1642874.patch, 
> branch_5x_rev1642874.patch, branch_5x_rev1645549.patch
>
>
> Searching "limited but high rows across many shards all with many hits" is 
> slow
> E.g.
> * Query from outside client: q=something&rows=1000
> * Resulting in sub-requests to each shard, something like this
> ** 1) q=something&rows=1000&fl=id,score
> ** 2) Request the full documents with ids in the global-top-1000 found among 
> the top-1000 from each shard
> What the subject means
> * "limited but high rows" means 1000 in the example above
> * "many shards" means 200-1000 in our case
> * "all with many hits" means that each of the shards has a significant 
> number of hits on the query
> The problem grows with all three factors above
> Doing such a query on our system takes between 5 min and 1 hour - depending on 
> a lot of things. It ought to be much faster, so let's make it faster.
> Profiling shows that the problem is that it takes lots of time to access the 
> store to get ids for (up to) 1000 docs (value of rows parameter) per shard. 
> With 1000 shards, that is up to 1 million ids that have to be fetched. There is 
> really no good reason to ever read information from the store for more than 
> the overall top-1000 documents that have to be returned to the client.
> For further detail see mail-thread "Slow searching limited but high rows 
> across many shards all with high hits" started 13/11-2014 on 
> dev@lucene.apache.org






[jira] [Commented] (SOLR-9252) Feature selection and logistic regression on text

2016-07-19 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385038#comment-15385038
 ] 

Cao Manh Dat commented on SOLR-9252:


+1
That can help make the expression cleaner. 

> Feature selection and logistic regression on text
> -
>
> Key: SOLR-9252
> URL: https://issues.apache.org/jira/browse/SOLR-9252
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Joel Bernstein
> Attachments: SOLR-9252.patch, SOLR-9252.patch, SOLR-9252.patch, 
> SOLR-9252.patch, SOLR-9252.patch, enron1.zip
>
>
> SOLR-9186 came up with a challenge: for each iteration we have to 
> rebuild the tf-idf vector for each document. That is a costly computation if we 
> represent docs by a lot of terms. Feature selection can help reduce the 
> computation.
> Due to its computational efficiency and simple interpretation, information 
> gain is one of the most popular feature selection methods. It is used to 
> measure the dependence between features and labels and calculates the 
> information gain between the i-th feature and the class labels 
> (http://www.jiliang.xyz/publication/feature_selection_for_classification.pdf).
> I confirmed that by running logistic regression on the Enron mail dataset (in 
> which each email is represented by the top 100 terms that have the highest 
> information gain) and got accuracy of 92% and precision of 82%.
> This ticket will create two new streaming expressions. Both of them use the 
> same *parallel iterative framework* as SOLR-8492.
> {code}
> featuresSelection(collection1, q="*:*",  field="tv_text", outcome="out_i", 
> positiveLabel=1, numTerms=100)
> {code}
> featuresSelection will emit the top terms that have the highest information 
> gain scores. It can be combined with the new tlogit stream.
> {code}
> tlogit(collection1, q="*:*",
>  featuresSelection(collection1, 
>   q="*:*",  
>   field="tv_text", 
>   outcome="out_i", 
>   positiveLabel=1, 
>   numTerms=100),
>  field="tv_text",
>  outcome="out_i",
>  maxIterations=100)
> {code}
> In iteration n, the text logistic regression will emit the nth model and 
> compute the error of the (n-1)th model, because the error would be wrong if we 
> computed it dynamically in each iteration. 
> In each iteration tlogit will change the learning rate based on the error of 
> the previous iteration: it will increase the learning rate by 5% if the error 
> is going down and decrease it by 50% if the error is going up.
> This will support use cases such as building models for spam detection, 
> sentiment analysis and threat detection. 
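The adaptive learning-rate rule quoted above (+5% when the error improved, -50% when it got worse) amounts to a one-line update; a purely illustrative sketch, not code from the SOLR-9252 patch:

```java
public class LearningRate {
    /** Next learning rate given the previous and current iteration errors. */
    static double next(double rate, double previousError, double currentError) {
        return currentError < previousError
                ? rate * 1.05   // error going down: speed up slightly (+5%)
                : rate * 0.50;  // error going up (or flat): back off hard (-50%)
    }
}
```

The asymmetry (gentle growth, aggressive shrink) is a common way to keep iterative training from diverging after an overshoot.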






[jira] [Comment Edited] (SOLR-9103) Restore ability for users to add custom Streaming Expressions

2016-07-19 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385016#comment-15385016
 ] 

Joel Bernstein edited comment on SOLR-9103 at 7/19/16 10:56 PM:


Ok, then let's use the solrconfig.xml for adding custom expressions. In that 
case I think this patch looks like a good approach. We can use the same 
approach as other pluggables and register the standard expressions in a class. 


was (Author: joel.bernstein):
Ok, then let's use the solrconfig.xml for adding custom expressions. In that 
case I think this patch looks fine. We can use the same approaches as other 
pluggables and register the standard expressions in a class. 

> Restore ability for users to add custom Streaming Expressions
> -
>
> Key: SOLR-9103
> URL: https://issues.apache.org/jira/browse/SOLR-9103
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Joel Bernstein
> Attachments: HelloStream.class, SOLR-9103.PATCH, SOLR-9103.PATCH
>
>
> StreamHandler is an implicit handler. So to make it extensible, we can 
> introduce the below syntax in solrconfig.xml. 
> {code}
> 
> {code}
> This will add hello function to streamFactory of StreamHandler.






[jira] [Comment Edited] (SOLR-9103) Restore ability for users to add custom Streaming Expressions

2016-07-19 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385016#comment-15385016
 ] 

Joel Bernstein edited comment on SOLR-9103 at 7/19/16 10:54 PM:


Ok, then let's use the solrconfig.xml for adding custom expressions. In that 
case I think this patch looks fine. We can use the same approaches as other 
pluggables and register the standard expressions in a class. 


was (Author: joel.bernstein):
Ok, then let's use the solrconfig for adding custom expressions. In that case I 
think this patch looks fine. We can use the same approaches as other pluggables 
and register the standard expressions in a class. 

> Restore ability for users to add custom Streaming Expressions
> -
>
> Key: SOLR-9103
> URL: https://issues.apache.org/jira/browse/SOLR-9103
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Joel Bernstein
> Attachments: HelloStream.class, SOLR-9103.PATCH, SOLR-9103.PATCH
>
>
> StreamHandler is an implicit handler. So to make it extensible, we can 
> introduce the below syntax in solrconfig.xml. 
> {code}
> 
> {code}
> This will add hello function to streamFactory of StreamHandler.






[jira] [Assigned] (SOLR-9103) Restore ability for users to add custom Streaming Expressions

2016-07-19 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-9103:


Assignee: Joel Bernstein

> Restore ability for users to add custom Streaming Expressions
> -
>
> Key: SOLR-9103
> URL: https://issues.apache.org/jira/browse/SOLR-9103
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Joel Bernstein
> Attachments: HelloStream.class, SOLR-9103.PATCH, SOLR-9103.PATCH
>
>
> StreamHandler is an implicit handler. So to make it extensible, we can 
> introduce the below syntax in solrconfig.xml. 
> {code}
> 
> {code}
> This will add hello function to streamFactory of StreamHandler.






[jira] [Commented] (SOLR-9103) Restore ability for users to add custom Streaming Expressions

2016-07-19 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385016#comment-15385016
 ] 

Joel Bernstein commented on SOLR-9103:
--

Ok, then let's use the solrconfig for adding custom expressions. In that case I 
think this patch looks fine. We can use the same approaches as other pluggables 
and register the standard expressions in a class. 

> Restore ability for users to add custom Streaming Expressions
> -
>
> Key: SOLR-9103
> URL: https://issues.apache.org/jira/browse/SOLR-9103
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
> Attachments: HelloStream.class, SOLR-9103.PATCH, SOLR-9103.PATCH
>
>
> StreamHandler is an implicit handler. So to make it extensible, we can 
> introduce the below syntax in solrconfig.xml. 
> {code}
> 
> {code}
> This will add hello function to streamFactory of StreamHandler.






[jira] [Comment Edited] (SOLR-9252) Feature selection and logistic regression on text

2016-07-19 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384990#comment-15384990
 ] 

Joel Bernstein edited comment on SOLR-9252 at 7/19/16 10:32 PM:


One of the things I've been thinking about is the function names. I think we 
can shorten the featureSelection function to just be *features*.

I think we could change the tlogit function to *train*. So the syntax would 
look like this:

{code}
train(collection1, q="*:*",
  features(collection1, 
   q="*:*",  
   field="tv_text", 
   outcome="out_i", 
   positiveLabel=1, 
   numTerms=100),
  field="tv_text",
  outcome="out_i",
  maxIterations=100)
{code}

In the future both the *features* and the *train* functions can have a 
parameter for setting the algorithm. The default algorithm in the initial 
release will be *information gain* for feature selection, and *logistic 
regression* for training.



was (Author: joel.bernstein):
One of the things I've been thinking about is the function names. I think we 
can shorten the featureSelection function to just be *features*.

I think we could change the tlogit function to *train*. So the syntax would 
look like this:

{code}
train(collection1, q="*:*",
  features(collection1, 
   q="*:*",  
   field="tv_text", 
   outcome="out_i", 
   positiveLabel=1, 
   numTerms=100),
  field="tv_text",
  outcome="out_i",
  maxIterations=100)
{code}

In the future both the *features* and the *train* functions can have a 
parameter for setting the algorithm. 


> Feature selection and logistic regression on text
> -
>
> Key: SOLR-9252
> URL: https://issues.apache.org/jira/browse/SOLR-9252
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Joel Bernstein
> Attachments: SOLR-9252.patch, SOLR-9252.patch, SOLR-9252.patch, 
> SOLR-9252.patch, SOLR-9252.patch, enron1.zip
>
>
> SOLR-9186 came up with a challenge: for each iteration we have to 
> rebuild the tf-idf vector for each document. That is a costly computation if we 
> represent docs by a lot of terms. Feature selection can help reduce the 
> computation.
> Due to its computational efficiency and simple interpretation, information 
> gain is one of the most popular feature selection methods. It is used to 
> measure the dependence between features and labels and calculates the 
> information gain between the i-th feature and the class labels 
> (http://www.jiliang.xyz/publication/feature_selection_for_classification.pdf).
> I confirmed that by running logistic regression on the Enron mail dataset (in 
> which each email is represented by the top 100 terms that have the highest 
> information gain) and got accuracy of 92% and precision of 82%.
> This ticket will create two new streaming expressions. Both of them use the 
> same *parallel iterative framework* as SOLR-8492.
> {code}
> featuresSelection(collection1, q="*:*",  field="tv_text", outcome="out_i", 
> positiveLabel=1, numTerms=100)
> {code}
> featuresSelection will emit the top terms with the highest information gain 
> scores. It can be combined with the new tlogit stream.
> {code}
> tlogit(collection1, q="*:*",
>  featuresSelection(collection1, 
>   q="*:*",  
>   field="tv_text", 
>   outcome="out_i", 
>   positiveLabel=1, 
>   numTerms=100),
>  field="tv_text",
>  outcome="out_i",
>  maxIterations=100)
> {code}
> In iteration n, the text logistic regression will emit the nth model and 
> compute the error of the (n-1)th model, because the error would be wrong if we 
> computed it dynamically within the same iteration.
> In each iteration tlogit will change the learning rate based on the error of 
> the previous iteration: it will increase the learning rate by 5% if the error 
> is going down and decrease it by 50% if the error is going up.
> This will support use cases such as building models for spam detection, 
> sentiment analysis and threat detection. 
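The information-gain scoring described above is easy to sketch outside of Solr. 
The following is a minimal, illustrative Python version of information gain over 
term presence; the documents and terms are made up for the example, and this is 
not the patch's actual implementation:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(term_present, labels):
    """IG = H(labels) - sum over presence values of p(v) * H(labels | v)."""
    n = len(labels)
    cond = 0.0
    for v in (True, False):
        subset = [y for p, y in zip(term_present, labels) if p == v]
        if subset:
            cond += (len(subset) / n) * entropy(subset)
    return entropy(labels) - cond

# Toy corpus: 4 docs with a binary outcome. "offer" perfectly predicts the
# label, while "the" appears in every doc and carries no information.
docs = [({"offer", "the"}, 1), ({"offer", "the"}, 1),
        ({"meeting", "the"}, 0), ({"report", "the"}, 0)]
labels = [y for _, y in docs]
ig = {t: information_gain([t in terms for terms, _ in docs], labels)
      for t in {"offer", "the", "meeting"}}
top = sorted(ig, key=ig.get, reverse=True)  # terms ranked by information gain
```

Here *offer* gets the maximal gain of 1.0 because it perfectly separates the 
labels, while *the* scores 0 because it appears in every document; 
featuresSelection's numTerms parameter would then keep only the top-scoring 
terms.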



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9289) SolrCloud RTG: fl=[docid] silently ignored for all docs

2016-07-19 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384991#comment-15384991
 ] 

Hoss Man commented on SOLR-9289:


Root cause of this issue seems to be the same as SOLR-9308

> SolrCloud RTG: fl=[docid] silently ignored for all docs
> ---
>
> Key: SOLR-9289
> URL: https://issues.apache.org/jira/browse/SOLR-9289
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>
> Found in SOLR-9180 testing.
> In SolrCloud mode, the {{\[docid\]}} transformer is completely ignored when 
> used in an RTG request (even for committed docs) ... this is inconsistent with 
> single-node Solr behavior.






[jira] [Commented] (SOLR-9252) Feature selection and logistic regression on text

2016-07-19 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384990#comment-15384990
 ] 

Joel Bernstein commented on SOLR-9252:
--

One of the things I've been thinking about is the function names. I think we 
can shorten the featureSelection function to just be *features*.

I think we could change the tlogit function to *train*. So the syntax would 
look like this:

{code}
train(collection1, q="*:*",
 features(collection1, 
  q="*:*",  
  field="tv_text", 
  outcome="out_i", 
  positiveLabel=1, 
  numTerms=100),
 field="tv_text",
 outcome="out_i",
 maxIterations=100)
{code}



> Feature selection and logistic regression on text
> -
>
> Key: SOLR-9252
> URL: https://issues.apache.org/jira/browse/SOLR-9252
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Joel Bernstein
> Attachments: SOLR-9252.patch, SOLR-9252.patch, SOLR-9252.patch, 
> SOLR-9252.patch, SOLR-9252.patch, enron1.zip
>
>
> SOLR-9186 came up with a challenge: on each iteration we have to rebuild the 
> tf-idf vector for each document. That is a costly computation if we represent 
> docs by a lot of terms. Feature selection can help reduce the computation.
> Due to its computational efficiency and simple interpretation, information 
> gain is one of the most popular feature selection methods. It is used to 
> measure the dependence between features and labels and calculates the 
> information gain between the i-th feature and the class labels 
> (http://www.jiliang.xyz/publication/feature_selection_for_classification.pdf).
> I confirmed this by running logistic regression on the Enron mail dataset (in 
> which each email is represented by the top 100 terms with the highest 
> information gain) and got 92% accuracy and 82% precision.
> This ticket will create two new streaming expressions. Both of them use the 
> same *parallel iterative framework* as SOLR-8492.
> {code}
> featuresSelection(collection1, q="*:*",  field="tv_text", outcome="out_i", 
> positiveLabel=1, numTerms=100)
> {code}
> featuresSelection will emit the top terms with the highest information gain 
> scores. It can be combined with the new tlogit stream.
> {code}
> tlogit(collection1, q="*:*",
>  featuresSelection(collection1, 
>   q="*:*",  
>   field="tv_text", 
>   outcome="out_i", 
>   positiveLabel=1, 
>   numTerms=100),
>  field="tv_text",
>  outcome="out_i",
>  maxIterations=100)
> {code}
> In iteration n, the text logistic regression will emit the nth model and 
> compute the error of the (n-1)th model, because the error would be wrong if we 
> computed it dynamically within the same iteration.
> In each iteration tlogit will change the learning rate based on the error of 
> the previous iteration: it will increase the learning rate by 5% if the error 
> is going down and decrease it by 50% if the error is going up.
> This will support use cases such as building models for spam detection, 
> sentiment analysis and threat detection. 
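The adaptive learning-rate rule in the description above (increase by 5% when 
the error falls, halve when it rises) is simple enough to sketch directly. This 
is an illustrative Python version with hypothetical error values, not the 
actual tlogit code:

```python
def next_learning_rate(rate, prev_error, curr_error):
    """Heuristic from the description: +5% when error falls, -50% when it rises."""
    if curr_error < prev_error:
        return rate * 1.05
    if curr_error > prev_error:
        return rate * 0.5
    return rate

rate = 0.01                         # hypothetical starting learning rate
errors = [0.40, 0.32, 0.35, 0.30]   # hypothetical per-iteration model errors
for prev, curr in zip(errors, errors[1:]):
    rate = next_learning_rate(rate, prev, curr)
# trace: 0.0105 (error fell), 0.00525 (error rose), 0.0055125 (error fell)
```

The asymmetry (small increases, large decreases) makes the rate cautious: it 
probes for faster convergence but backs off sharply as soon as the error 
regresses.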






[jira] [Updated] (SOLR-9308) SolrCloud RTG doesn't forward any params to shards, causes fqs & non-default fl params to be ignored

2016-07-19 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-9308:
---
Attachment: SOLR-9308.patch

Updated patch:

* Updated to apply clean on master
* I realized this issue has the same root cause as SOLR-9289 (as well as 
SOLR-9286), so I've enabled those tests as well.

> SolrCloud RTG doesn't forward any params to shards, causes fqs & non-default 
> fl params to be ignored
> 
>
> Key: SOLR-9308
> URL: https://issues.apache.org/jira/browse/SOLR-9308
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9308.patch, SOLR-9308.patch
>
>
> While working on a robust randomized test for SOLR-9285, I can't seem to get 
> filter queries on RTG to work at all -- even when the docs are fully 
> committed.
> steps to reproduce to follow in comment...






[jira] [Updated] (SOLR-7036) Faster method for group.facet

2016-07-19 Thread Jamie Swain (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jamie Swain updated SOLR-7036:
--
Attachment: (was: jstack-output.txt)

> Faster method for group.facet
> -
>
> Key: SOLR-7036
> URL: https://issues.apache.org/jira/browse/SOLR-7036
> Project: Solr
>  Issue Type: Improvement
>  Components: faceting
>Affects Versions: 4.10.3
>Reporter: Jim Musil
>Assignee: Erick Erickson
> Fix For: 5.5, 6.0
>
> Attachments: SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, 
> SOLR-7036.patch, jstack-output.txt, performance.txt, source_for_patch.zip
>
>
> This is a patch that speeds up the performance of requests made with 
> group.facet=true. The original code that collects and counts unique facet 
> values for each group does not use the same improved field cache methods that 
> have been added for normal faceting in recent versions.
> Specifically, this approach leverages the UninvertedField class which 
> provides a much faster way to look up docs that contain a term. I've also 
> added a simple grouping map so that when a term is found for a doc, it can 
> quickly look up the group to which it belongs.
> Group faceting was very slow for our data set, and when the number of docs or 
> terms was high, latency spiked to multi-second requests. This solution 
> provides better overall performance -- from an average of 54ms to 32ms. It 
> also dropped our slowest-performing queries way down -- from 6012ms to 991ms.
> I also added a few tests.
> I added an additional parameter so that you can choose between this method and 
> the original. Add group.facet.method=fc to use the improved method, or 
> group.facet.method=original, which is the default if not specified.
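As a rough mental model of what group.facet=true computes (this is not the 
patch's UninvertedField code): for each facet value, the count is the number of 
distinct groups containing that value rather than the number of matching 
documents. In Python terms, over hypothetical doc dicts:

```python
from collections import defaultdict

def grouped_facet_counts(docs, group_field, facet_field):
    """For each facet value, count distinct groups (not docs) containing it."""
    groups_per_value = defaultdict(set)
    for doc in docs:
        groups_per_value[doc[facet_field]].add(doc[group_field])
    return {value: len(groups) for value, groups in groups_per_value.items()}

docs = [  # hypothetical docs with a group key and a single-valued facet field
    {"group": "g1", "color": "red"},
    {"group": "g1", "color": "red"},   # same group: still one group for "red"
    {"group": "g2", "color": "red"},
    {"group": "g2", "color": "blue"},
]
counts = grouped_facet_counts(docs, "group", "color")  # {"red": 2, "blue": 1}
```

The cost in the original implementation comes from tracking those distinct 
groups per term; the patch's term-to-group map serves the same role as the 
sets above, but backed by the faster field-cache lookups.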






[jira] [Updated] (SOLR-7036) Faster method for group.facet

2016-07-19 Thread Jamie Swain (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jamie Swain updated SOLR-7036:
--
Attachment: jstack-output.txt




[jira] [Updated] (SOLR-7036) Faster method for group.facet

2016-07-19 Thread Jamie Swain (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jamie Swain updated SOLR-7036:
--
Attachment: jstack-output.txt




[jira] [Updated] (SOLR-7036) Faster method for group.facet

2016-07-19 Thread Jamie Swain (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jamie Swain updated SOLR-7036:
--
Attachment: (was: jstack-output.txt)




[jira] [Updated] (SOLR-7036) Faster method for group.facet

2016-07-19 Thread Jamie Swain (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jamie Swain updated SOLR-7036:
--
Attachment: jstack-output.txt

[~mdvir1] sorry, just getting back to this now.  I've attached the jstack 
output.

> Faster method for group.facet
> -
>
> Key: SOLR-7036
> URL: https://issues.apache.org/jira/browse/SOLR-7036
> Project: Solr
>  Issue Type: Improvement
>  Components: faceting
>Affects Versions: 4.10.3
>Reporter: Jim Musil
>Assignee: Erick Erickson
> Fix For: 5.5, 6.0
>
> Attachments: SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, 
> SOLR-7036.patch, jstack-output.txt, performance.txt, source_for_patch.zip
>
>
> This is a patch that speeds up the performance of requests made with 
> group.facet=true. The original code that collects and counts unique facet 
> values for each group does not use the same improved field cache methods that 
> have been added for normal faceting in recent versions.
> Specifically, this approach leverages the UninvertedField class which 
> provides a much faster way to look up docs that contain a term. I've also 
> added a simple grouping map so that when a term is found for a doc, it can 
> quickly look up the group to which it belongs.
> Group faceting was very slow for our data set and when the number of docs or 
> terms was high, the latency spiked to multiple second requests. This solution 
> provides better overall performance -- from an average of 54ms to 32ms. It 
> also dropped our slowest performing queries way down -- from 6012ms to 991ms.
> I also added a few tests.
> I added an additional parameter so that you can choose to use this method or 
> the original. Add group.facet.method=fc to use the improved method or 
> group.facet.method=original, which is the default if not specified.
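A minimal standalone sketch of the doc-to-group "grouping map" idea described above (hypothetical names and plain JDK collections, not the actual patch code, which works on Lucene ordinals):

```java
import java.util.HashMap;
import java.util.Map;

// Map each doc to its group once; when a term is found for a doc, the owning
// group is then a single lookup away instead of a per-group collection pass.
public class GroupFacetSketch {

  // Count how many of the docs matching a term fall into each group.
  static Map<Integer, Integer> countGroups(int[] matchingDocs,
                                           Map<Integer, Integer> docToGroup) {
    Map<Integer, Integer> groupCounts = new HashMap<>();
    for (int doc : matchingDocs) {
      groupCounts.merge(docToGroup.get(doc), 1, Integer::sum);
    }
    return groupCounts;
  }

  public static void main(String[] args) {
    Map<Integer, Integer> docToGroup = new HashMap<>();
    docToGroup.put(0, 7); // doc 0 belongs to group 7
    docToGroup.put(1, 7);
    docToGroup.put(2, 3);
    // A term matching docs {0, 1, 2} is seen by group 7 twice and group 3 once.
    System.out.println(countGroups(new int[]{0, 1, 2}, docToGroup));
  }
}
```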






[jira] [Commented] (SOLR-9252) Feature selection and logistic regression on text

2016-07-19 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384914#comment-15384914
 ] 

Joel Bernstein commented on SOLR-9252:
--

I've been working with the latest patch. After putting the enron1.zip file in 
place, all the test methods in StreamExpressionTest pass on their own, but 
running the entire StreamExpressionTest suite produces failures. I'm 
investigating this now and will update the ticket when it's resolved. The 
latest run had the following failures, but different ones fail on each run:

 [junit4] Tests with failures [seed: F53E526DA62A037F]:
 [junit4]   - 
org.apache.solr.client.solrj.io.stream.StreamExpressionTest.testFeaturesSelectionStream
 [junit4]   - 
org.apache.solr.client.solrj.io.stream.StreamExpressionTest.testUpdateStream


> Feature selection and logistic regression on text
> -
>
> Key: SOLR-9252
> URL: https://issues.apache.org/jira/browse/SOLR-9252
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Joel Bernstein
> Attachments: SOLR-9252.patch, SOLR-9252.patch, SOLR-9252.patch, 
> SOLR-9252.patch, SOLR-9252.patch, enron1.zip
>
>
> SOLR-9186 raised the challenge that on each iteration we have to rebuild the 
> tf-idf vector for every document. This computation is costly when a document 
> is represented by many terms. Feature selection can help reduce the 
> computation.
> Due to its computational efficiency and simple interpretation, information 
> gain is one of the most popular feature selection methods. It is used to 
> measure the dependence between features and labels and calculates the 
> information gain between the i-th feature and the class labels 
> (http://www.jiliang.xyz/publication/feature_selection_for_classification.pdf).
> I confirmed this by running logistic regression on the Enron mail dataset (in 
> which each email is represented by the top 100 terms with the highest 
> information gain) and obtained 92% accuracy and 82% precision.
> This ticket will create two new streaming expressions. Both of them use the 
> same *parallel iterative framework* as SOLR-8492.
> {code}
> featuresSelection(collection1, q="*:*",  field="tv_text", outcome="out_i", 
> positiveLabel=1, numTerms=100)
> {code}
> featuresSelection will emit the top terms with the highest information gain 
> scores. It can be combined with the new tlogit stream.
> {code}
> tlogit(collection1, q="*:*",
>  featuresSelection(collection1, 
>   q="*:*",  
>   field="tv_text", 
>   outcome="out_i", 
>   positiveLabel=1, 
>   numTerms=100),
>  field="tv_text",
>  outcome="out_i",
>  maxIterations=100)
> {code}
> In iteration n, the text logistic regression will emit the nth model and 
> compute the error of the (n-1)th model, because the error would be wrong if 
> we computed it dynamically within the same iteration.
> In each iteration tlogit adjusts the learning rate based on the error of the 
> previous iteration: it increases the learning rate by 5% if the error is 
> going down and decreases it by 50% if the error is going up.
> This will support use cases such as building models for spam detection, 
> sentiment analysis and threat detection. 
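The adaptive learning-rate rule described above can be sketched as follows (a simplified standalone illustration with hypothetical names, not the actual tlogit code):

```java
public class LearningRateSchedule {
  // Adjust the learning rate using the previous iteration's error:
  // grow it by 5% when the error went down, halve it when the error went up.
  static double nextRate(double rate, double prevError, double currError) {
    return (currError <= prevError) ? rate * 1.05 : rate * 0.5;
  }

  public static void main(String[] args) {
    double rate = 0.01;
    rate = nextRate(rate, 0.30, 0.25); // error decreased -> rate grows by 5%
    rate = nextRate(rate, 0.25, 0.40); // error increased -> rate is halved
    System.out.println(rate);
  }
}
```

The asymmetry (small increases, large decreases) lets the iteration accelerate cautiously while backing off quickly when it overshoots.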






[jira] [Comment Edited] (SOLR-5944) Support updates of numeric DocValues

2016-07-19 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384773#comment-15384773
 ] 

Shalin Shekhar Mangar edited comment on SOLR-5944 at 7/19/16 8:04 PM:
--

Thanks Ishan. This is great progress since the last time I reviewed this patch.

I've only skimmed the latest patch but in particular I find a few problems in 
DistributedUpdateProcessor#waitForDependentUpdates:
# This method doesn't look correct. All places which call vinfo.lookupVersion() 
do it after acquiring a read lock by calling vinfo.lockForUpdate(). Looking at 
the code for vinfo.lookupVersion() it accesses the map and prevMap in updateLog 
which can be modified by a writer thread while waitForDependentUpdates is 
reading their values. So first of all we need to ensure that we acquire and 
release the read lock. Acquiring this lock and then waiting on a different 
object (the "bucket") will not cause a deadlock condition because it is a read 
lock (which can be held by multiple threads).
# Secondly, this method can be made more efficient. It currently wakes up every 
100ms and reads the new "lastFoundVersion" from the update log or index. This 
is wasteful. A better way would be to wait for the timeout period directly 
after calling {{vinfo.lookupVersion()}} inside the synchronized block.
# Similar to #1 -- calling {{vinfo.lookupVersion()}} after 
{{fetchMissingUpdateFromLeader}} should be done after acquiring a read lock.
# There is no reason to synchronize on bucket when calling the {{versionAdd}} 
method again because it will acquire the monitor anyway.
# DistributedUpdateProcessor#waitForDependentUpdates uses wrong javadoc tag 
'@returns' instead of '@return'
# The debug log message should be moved out of the loop instead of introducing 
a debugMessagePrinted boolean flag
# Use the org.apache.solr.util.TimeOut class for timed wait loops
# Method can be made private

I've attempted to write a better wait-loop here (warning: not tested):
{code}
long prev = cmd.prevVersion;
long lastFoundVersion = 0;


TimeOut timeOut = new TimeOut(5, TimeUnit.SECONDS);
vinfo.lockForUpdate();
try {
  synchronized (bucket) {
lastFoundVersion = vinfo.lookupVersion(cmd.getIndexedId());
while (lastFoundVersion < prev && !timeOut.hasTimedOut())  {
  if (log.isDebugEnabled()) {
log.debug("Re-ordered inplace update. version=" + (cmd.getVersion() 
== 0 ? versionOnUpdate : cmd.getVersion()) +
", prevVersion=" + prev + ", lastVersion=" + lastFoundVersion + 
", replayOrPeerSync=" + isReplayOrPeersync);
  }
  try {
bucket.wait(5000);
  } catch (InterruptedException ie) {
throw new RuntimeException(ie);
  }
  lastFoundVersion = vinfo.lookupVersion(cmd.getIndexedId());
}
  }
} finally {
  vinfo.unlockForUpdate();
}

// check lastFoundVersion against prev again and handle all conditions
{code}

However I think that since the read lock and bucket monitor have to be acquired 
by this method anyway, it might be a good idea to just call it from inside 
versionAdd after acquiring those monitors. Then this method can focus on just 
waiting for dependent updates and nothing else.

A random comment on the changes made to DebugFilter: The setDelay mechanism 
introduced here may be a good candidate for Mark's new 
TestInjection#injectUpdateRandomPause?


was (Author: shalinmangar):
Thanks Ishan. This is great progress since the last time I reviewed this patch.

I've only skimmed the latest patch but in particular I find a few problems in 
DistributedUpdateProcessor#waitForDependentUpdates:
# This method doesn't look correct. All places which call vinfo.lookupVersion() 
do it after acquiring a read lock by calling vinfo.lockForUpdate(). Looking at 
the code for vinfo.lookupVersion() it accesses the map and prevMap in updateLog 
which can be modified by a writer thread while waitForDependentUpdates is 
reading their values. So first of all we need to ensure that we acquire and 
release the read lock. Acquiring this lock and then waiting on a different 
object (the "bucket") will not cause a deadlock condition because it is a read 
lock (which can be held by multiple threads).
# Secondly, this method can be made more efficient. It currently wakes up every 
100ms and reads the new "lastFoundVersion" from the update log or index. This 
is wasteful. A better way would be to wait for the timeout period directly 
before calling {{vinfo.lookupVersion()}} inside the synchronized block.
# Similar to #1 -- calling {{vinfo.lookupVersion()}} after 
{{fetchMissingUpdateFromLeader}} should be done after acquiring a read lock.
# There is no reason to synchronize on bucket when calling the {{versionAdd}} 
method again because it will acquire the monitor anyway.
# DistributedUpdateProcessor

[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

2016-07-19 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384773#comment-15384773
 ] 

Shalin Shekhar Mangar commented on SOLR-5944:
-

Thanks Ishan. This is great progress since the last time I reviewed this patch.

I've only skimmed the latest patch but in particular I find a few problems in 
DistributedUpdateProcessor#waitForDependentUpdates:
# This method doesn't look correct. All places which call vinfo.lookupVersion() 
do it after acquiring a read lock by calling vinfo.lockForUpdate(). Looking at 
the code for vinfo.lookupVersion() it accesses the map and prevMap in updateLog 
which can be modified by a writer thread while waitForDependentUpdates is 
reading their values. So first of all we need to ensure that we acquire and 
release the read lock. Acquiring this lock and then waiting on a different 
object (the "bucket") will not cause a deadlock condition because it is a read 
lock (which can be held by multiple threads).
# Secondly, this method can be made more efficient. It currently wakes up every 
100ms and reads the new "lastFoundVersion" from the update log or index. This 
is wasteful. A better way would be to wait for the timeout period directly 
before calling {{vinfo.lookupVersion()}} inside the synchronized block.
# Similar to #1 -- calling {{vinfo.lookupVersion()}} after 
{{fetchMissingUpdateFromLeader}} should be done after acquiring a read lock.
# There is no reason to synchronize on bucket when calling the {{versionAdd}} 
method again because it will acquire the monitor anyway.
# DistributedUpdateProcessor#waitForDependentUpdates uses wrong javadoc tag 
'@returns' instead of '@return'
# The debug log message should be moved out of the loop instead of introducing 
a debugMessagePrinted boolean flag
# Use the org.apache.solr.util.TimeOut class for timed wait loops
# Method can be made private

I've attempted to write a better wait-loop here (warning: not tested):
{code}
long prev = cmd.prevVersion;
long lastFoundVersion = 0;


TimeOut timeOut = new TimeOut(5, TimeUnit.SECONDS);
vinfo.lockForUpdate();
try {
  synchronized (bucket) {
lastFoundVersion = vinfo.lookupVersion(cmd.getIndexedId());
while (lastFoundVersion < prev && !timeOut.hasTimedOut())  {
  if (log.isDebugEnabled()) {
log.debug("Re-ordered inplace update. version=" + (cmd.getVersion() 
== 0 ? versionOnUpdate : cmd.getVersion()) +
", prevVersion=" + prev + ", lastVersion=" + lastFoundVersion + 
", replayOrPeerSync=" + isReplayOrPeersync);
  }
  try {
bucket.wait(5000);
  } catch (InterruptedException ie) {
throw new RuntimeException(ie);
  }
  lastFoundVersion = vinfo.lookupVersion(cmd.getIndexedId());
}
  }
} finally {
  vinfo.unlockForUpdate();
}

// check lastFoundVersion against prev again and handle all conditions
{code}

However I think that since the read lock and bucket monitor have to be acquired 
by this method anyway, it might be a good idea to just call it from inside 
versionAdd after acquiring those monitors. Then this method can focus on just 
waiting for dependent updates and nothing else.

A random comment on the changes made to DebugFilter: The setDelay mechanism 
introduced here may be a good candidate for Mark's new 
TestInjection#injectUpdateRandomPause?
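The claim in point 1, that holding VersionInfo's read lock while waiting on the bucket cannot deadlock other updater threads because a read lock is shared, can be demonstrated with a small standalone sketch (plain JDK code, not Solr's VersionInfo):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SharedReadLockSketch {
  // Returns true if a second thread can acquire the read lock while the
  // calling thread already holds it -- i.e. the lock is shared, so blocking
  // on another monitor while holding it does not stall other readers.
  static boolean secondReaderCanAcquire() throws InterruptedException {
    ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    rwl.readLock().lock();            // simulate vinfo.lockForUpdate()
    Thread other = new Thread(() -> {
      rwl.readLock().lock();          // a second updater thread
      rwl.readLock().unlock();
    });
    other.start();
    other.join(5000);                 // would still be alive if it had blocked
    boolean acquired = !other.isAlive();
    rwl.readLock().unlock();
    return acquired;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(secondReaderCanAcquire()); // prints true
  }
}
```

Only a writer (the exclusive lock, used for bulk operations) would be excluded while the read lock is held.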

> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.

[jira] [Updated] (SOLR-9321) Removed usage of deprecated clusterstate.getSlicesMap(), getSlices() and getActiveSlices()

2016-07-19 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-9321:
---
Attachment: SOLR-9321.patch

Patch to remove the deprecated usages.

> Removed usage of deprecated clusterstate.getSlicesMap(), getSlices() and 
> getActiveSlices()
> --
>
> Key: SOLR-9321
> URL: https://issues.apache.org/jira/browse/SOLR-9321
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Minor
> Attachments: SOLR-9321.patch
>
>







[jira] [Created] (SOLR-9321) Removed usage of deprecated clusterstate.getSlicesMap(), getSlices() and getActiveSlices()

2016-07-19 Thread Ishan Chattopadhyaya (JIRA)
Ishan Chattopadhyaya created SOLR-9321:
--

 Summary: Removed usage of deprecated clusterstate.getSlicesMap(), 
getSlices() and getActiveSlices()
 Key: SOLR-9321
 URL: https://issues.apache.org/jira/browse/SOLR-9321
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Ishan Chattopadhyaya
Priority: Minor









[jira] [Comment Edited] (SOLR-5944) Support updates of numeric DocValues

2016-07-19 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361862#comment-15361862
 ] 

Ishan Chattopadhyaya edited comment on SOLR-5944 at 7/19/16 6:52 PM:
-

New patch fixing all nocommits. A few additional tests that Hoss mentioned are 
still TODO. Here's a stab at replying to Hoss's comments (I may keep updating 
this comment as and when I fix some of the TODO items here):

 {panel:title=JettySolrRunner}
* javadocs, javadocs, javadocs {color:green}[FIXED]{color}
{panel}

{panel:title=XMLLoader + JavabinLoader}
* why is this param check logic duplicated in these classes? {color:green}[Not 
sure what you mean here, I just set the prevVersion to the cmd here now]{color}
* why not put this in DUP (which already has access to the request params) when 
it's doing its "FROMLEADER" logic? {color:green}[Since commitWithin and 
overwrite were being set here, I thought this is an appropriate place to set the 
prevVersion to the cmd]{color}
{panel}

{panel:title=AddUpdateCommand}
* these variables (like all variables) should have javadocs explaining what 
they are and what they mean {color:green}[FIXED]{color}
** people skimming a class shouldn't have to grep the code for a variable name 
to understand its purpose
* having 2 variables here seems like it might be error prone?  what does it 
mean if {{prevVersion < 0 && isInPlaceUpdate == true}} ? or {{0 < prevVersion 
&& isInPlaceUpdate == false}} ? {color:green}[FIXED: Now just have one 
variable]{color}
** would it make more sense to use a single {{long prevVersion}} variable and 
have a {{public boolean isInPlaceUpdate()}} that simply does {{return (0 < 
prevVersion); }} ? {color:green}[FIXED]{color}
{panel}
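The single-variable suggestion above can be sketched like this (a hypothetical simplification, not the actual AddUpdateCommand):

```java
// One long field encodes both facts: a positive prevVersion means this is an
// in-place update pointing at the previous version; anything else means it is
// not, so the two-variable inconsistency cases can no longer arise.
public class AddUpdateCommandSketch {
  private long prevVersion = -1;

  public void setPrevVersion(long prevVersion) { this.prevVersion = prevVersion; }

  public boolean isInPlaceUpdate() { return 0 < prevVersion; }

  public static void main(String[] args) {
    AddUpdateCommandSketch cmd = new AddUpdateCommandSketch();
    System.out.println(cmd.isInPlaceUpdate()); // false until prevVersion is set
    cmd.setPrevVersion(1234L);
    System.out.println(cmd.isInPlaceUpdate()); // true
  }
}
```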

{panel:title=TransactionLog}
* javadocs for both the new {{write}} method and the existing {{write}} method  
{color:green}[FIXED]{color}
** explain what "prevPointer" means and note in the 2-arg method what the 
effective default "prevPointer" is.
* we should really have some "int" constants for referring to the List indexes 
involved in these records, so instead of code like {{entry.get(3)}} sprinkled 
in various classes like UpdateLog and PeerSync it can be something more readable 
like {{entry.get(PREV_VERSION_IDX)}}  {color:green}[FIXED]{color}
{panel}


{panel:title=UpdateLog}
* javadocs for both the new {{LogPtr}} constructor and the existing 
constructor {color:green}[FIXED]{color}
** explain what "prevPointer" means and note in the 2-arg constructor what the 
effective default "prevPointer" is.  {color:green}[FIXED]{color}
* {{add(AddUpdateCommand, boolean)}}
** this new code for doing lookups in {{map}}, {{prevMap}} and {{prevMap2}} 
seems weird to me (but admittedly I'm not really an expert on UpdateLog in 
general and how these maps are used)
** what primarily concerns me is what the expected behavior is if the "id" 
isn't found in any of these maps -- it looks like prevPointer defaults to "-1" 
regardless of whether this is an inplace update ... is that intentional? ... is 
it possible there are older records we will miss and need to flag that?  
{color:green}[Yes, this was intentional, and I think it doesn't make any 
difference. If an "id" isn't found in any of these maps, it would mean that the 
previous update was committed and should be looked up in the index. ]{color}
** ie: do we need to worry about distinguishing here between "not an in place 
update, therefore prevPointer=-1" vs "is an in place update, but we can't find 
the prevPointer" ?? {color:green}[I think we don't need to worry. Upon 
receiving a prevPointer=-1 by whoever reads this LogPtr, it should be clear why 
it was -1: if the command's {{flags|UpdateLog.UPDATE_INPLACE}} is set, then 
this command is an in-place update whose previous update is in the index and 
not in the tlog; if that flag is not set, it is not an in-place update at all, 
and don't bother about the prevPointer value at all (which is -1 as a dummy 
value).]{color}
** assuming this code is correct, it might be a little easier to read if it 
were refactored into something like:{code}
// nocommit: jdocs
private synchronized long getPrevPointerForUpdate(AddUpdateCommand cmd) {
  // note: sync required to ensure maps aren't changed out from under us
  if (cmd.isInPlaceUpdate) {
    BytesRef indexedId = cmd.getIndexedId();
    for (Map<BytesRef, LogPtr> currentMap : Arrays.asList(map, prevMap, prevMap2)) {
  LogPtr prevEntry = currentMap.get(indexedId);
  if (null != prevEntry) {
return prevEntry.pointer;
  }
}
  }
  return -1; // default when not inplace, or if we can't find a previous entry
}
{code} {color:green}[FIXED: Refactored into something similar to above]{color}
* {{applyPartialUpdates}}
** it seems like this method would be a really good candidate for some direct 
unit testing? {color:green}[Added test to UpdateLogTest]{color}
*** ie: co

[jira] [Updated] (SOLR-5944) Support updates of numeric DocValues

2016-07-19 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-5944:
---
Attachment: SOLR-5944.patch

One more TODO item that had been missed:
# Refactored calls like entry.get(1) etc. (for entries fetched from the 
ulog/tlog) to entry.get(UpdateLog.VERSION_IDX).

> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.






[JENKINS] Lucene-Solr-6.x-Windows (32bit/jdk1.8.0_92) - Build # 330 - Still Unstable!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Windows/330/
Java: 32bit/jdk1.8.0_92 -client -XX:+UseSerialGC

3 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.schema.TestManagedSchemaAPI

Error Message:
ObjectTracker found 8 object(s) that were not released!!! 
[MDCAwareThreadPoolExecutor, TransactionLog, MDCAwareThreadPoolExecutor, 
MockDirectoryWrapper, MockDirectoryWrapper, TransactionLog, 
MockDirectoryWrapper, MockDirectoryWrapper]

Stack Trace:
java.lang.AssertionError: ObjectTracker found 8 object(s) that were not 
released!!! [MDCAwareThreadPoolExecutor, TransactionLog, 
MDCAwareThreadPoolExecutor, MockDirectoryWrapper, MockDirectoryWrapper, 
TransactionLog, MockDirectoryWrapper, MockDirectoryWrapper]
at __randomizedtesting.SeedInfo.seed([923E4E8E783D0E32]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertNull(Assert.java:551)
at 
org.apache.solr.SolrTestCaseJ4.teardownTestCases(SolrTestCaseJ4.java:258)
at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:834)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)


FAILED:  junit.framework.TestSuite.org.apache.solr.schema.TestManagedSchemaAPI

Error Message:
Could not remove the following files (in the order of attempts):
C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\solr\build\solr-core\test\J1\temp\solr.schema.TestManagedSchemaAPI_923E4E8E783D0E32-001\tempDir-001\node1\testschemaapi_shard1_replica2\data\tlog\tlog.001:
 java.nio.file.FileSystemException: 
C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\solr\build\solr-core\test\J1\temp\solr.schema.TestManagedSchemaAPI_923E4E8E783D0E32-001\tempDir-001\node1\testschemaapi_shard1_replica2\data\tlog\tlog.001:
 The process cannot access the file because it is being used by another 
process. 
C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\solr\build\solr-core\test\J1\temp\solr.schema.TestManagedSchemaAPI_923E4E8E783D0E32-001\tempDir-001\node1\testschemaapi_shard1_replica2\data\tlog:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\solr\build\solr-core\test\J1\temp\solr.schema.TestManagedSchemaAPI_923E4E8E783D0E32-001\tempDir-001\node1\testschemaapi_shard1_replica2\data\tlog

C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\solr\build\solr-core\test\J1\temp\solr.schema.TestManagedSchemaAPI_923E4E8E783D0E32-001\tempDir-001\node1\testschemaapi_shard1_replica2\data:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\solr\build\solr-core\test\J1\temp\solr.schema.TestManagedSchemaAPI_923E4E8E783D0E32-001\tempDir-001\node1\testschemaapi_shard1_replica2\data

C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\solr\

[jira] [Commented] (SOLR-9309) SolrCloud RTG with multiple "id" params has inconsistent behavior if only 0 or 1 ids are returned

2016-07-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384618#comment-15384618
 ] 

ASF subversion and git services commented on SOLR-9309:
---

Commit acbe59c70cf862c4f3c452c37e05061e1c939c04 in lucene-solr's branch 
refs/heads/branch_6x from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=acbe59c ]

SOLR-9309: Fix SolrCloud RTG response structure when multi ids requested but 
only 1 found

(cherry picked from commit 9aa639d45e31059bb2910dade6d7728ea075cd57)


> SolrCloud RTG with multiple "id" params has inconsistent behavior if only 0 
> or 1 ids are returned
> -
>
> Key: SOLR-9309
> URL: https://issues.apache.org/jira/browse/SOLR-9309
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Fix For: 6.2, master (7.0)
>
> Attachments: SOLR-9309.patch, SOLR-9309.patch
>
>
> * RTG uses a different response format depending on whether a single id is 
> requested or multiple ids are requested.
> * there are 2 ways to request multiple ids:
> *# multiple {{id}} params
> *# comma-separated ids in one (or more) {{ids}} param(s)
> But in cloud mode, asking for multiple ids using the first method can 
> incorrectly return the "single" doc response structure if 0 or 1 docs are 
> returned (i.e. because the other doc(s) don't exist in the index or were 
> deleted).
> This inconsistency does not seem to exist in single-node Solr RTG
> (Example to follow in comment)






[jira] [Resolved] (SOLR-9309) SolrCloud RTG with multiple "id" params has inconsistent behavior if only 0 or 1 ids are returned

2016-07-19 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-9309.

   Resolution: Fixed
 Assignee: Hoss Man
Fix Version/s: master (7.0)
   6.2

> SolrCloud RTG with multiple "id" params has inconsistent behavior if only 0 
> or 1 ids are returned
> -
>
> Key: SOLR-9309
> URL: https://issues.apache.org/jira/browse/SOLR-9309
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 6.2, master (7.0)
>
> Attachments: SOLR-9309.patch, SOLR-9309.patch
>
>
> * RTG uses a different response format depending on whether a single id is 
> requested or multiple ids are requested.
> * there are 2 ways to request multiple ids:
> *# multiple {{id}} params
> *# comma-separated ids in one (or more) {{ids}} param(s)
> But in cloud mode, asking for multiple ids using the first method can 
> incorrectly return the "single" doc response structure if 0 or 1 docs are 
> returned (i.e. because the other doc(s) don't exist in the index or were 
> deleted).
> This inconsistency does not seem to exist in single-node Solr RTG
> (Example to follow in comment)






[jira] [Commented] (SOLR-9309) SolrCloud RTG with multiple "id" params has inconsistent behavior if only 0 or 1 ids are returned

2016-07-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384619#comment-15384619
 ] 

ASF subversion and git services commented on SOLR-9309:
---

Commit 9aa639d45e31059bb2910dade6d7728ea075cd57 in lucene-solr's branch 
refs/heads/master from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9aa639d ]

SOLR-9309: Fix SolrCloud RTG response structure when multi ids requested but 
only 1 found


> SolrCloud RTG with multiple "id" params has inconsistent behavior if only 0 
> or 1 ids are returned
> -
>
> Key: SOLR-9309
> URL: https://issues.apache.org/jira/browse/SOLR-9309
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Fix For: 6.2, master (7.0)
>
> Attachments: SOLR-9309.patch, SOLR-9309.patch
>
>
> * RTG uses a different response format depending on whether a single id is 
> requested or multiple ids are requested.
> * there are 2 ways to request multiple ids:
> *# multiple {{id}} params
> *# comma-separated ids in one (or more) {{ids}} param(s)
> But in cloud mode, asking for multiple ids using the first method can 
> incorrectly return the "single" doc response structure if 0 or 1 docs are 
> returned (ie: because the other doc(s) don't exist in the index or were 
> deleted).
> This inconsistency does not seem to exist in single node solr RTG
> (Example to follow in comment)






[jira] [Updated] (SOLR-7280) Load cores in sorted order and tweak coreLoadThread counts to improve cluster stability on restarts

2016-07-19 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-7280:
-
Attachment: SOLR-7280-5x.patch

backported the changes I committed to 6x today

> Load cores in sorted order and tweak coreLoadThread counts to improve cluster 
> stability on restarts
> ---
>
> Key: SOLR-7280
> URL: https://issues.apache.org/jira/browse/SOLR-7280
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-7280-5x.patch, SOLR-7280-5x.patch, 
> SOLR-7280-5x.patch, SOLR-7280.patch, SOLR-7280.patch
>
>
> In SOLR-7191, Damien mentioned that by loading solr cores in a sorted order 
> and tweaking some of the coreLoadThread counts, he was able to improve the 
> stability of a cluster with thousands of collections. We should explore some 
> of these changes and fold them into Solr.






[jira] [Commented] (SOLR-9288) RTG: fl=[docid] silently missing for uncommitted docs

2016-07-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384586#comment-15384586
 ] 

ASF subversion and git services commented on SOLR-9288:
---

Commit 08019f42889a537764384429c4184515d233a2cb in lucene-solr's branch 
refs/heads/master from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=08019f4 ]

SOLR-9288: Fix [docid] transformer to return -1 when used in RTG with 
uncommitted doc


> RTG: fl=[docid] silently missing for uncommitted docs
> -
>
> Key: SOLR-9288
> URL: https://issues.apache.org/jira/browse/SOLR-9288
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Fix For: 6.2, master (7.0)
>
> Attachments: SOLR-9288.patch
>
>
> Found in SOLR-9180 testing.
> When using RTG in a single-node Solr install, the {{\[docid\]}} transformer 
> works for committed docs, but is silently missing from uncommitted docs.
> This inconsistency is confusing.  It seems like even if there is no valid 
> docid to return in this case, the key should still be present in the 
> resulting doc.
> I would suggest using either {{null}} or {{-1}} in this case?
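The suggested behavior — always emit the key, with a sentinel when no searcher docid exists — can be sketched like this. The names are hypothetical (this is not the actual transformer code), but it shows the {{-1}} convention the fix adopted:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DocIdSentinel {

    // Build the transformer output for one document. docid is null when the
    // doc is only in the update log (uncommitted), i.e. it has no searcher
    // docid yet. The key is always present; -1 marks "no valid docid".
    public static Map<String, Object> augment(String id, Integer docid) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("[docid]", docid == null ? -1 : docid.intValue());
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(augment("committedDoc", 42)); // {id=committedDoc, [docid]=42}
        System.out.println(augment("pendingDoc", null)); // {id=pendingDoc, [docid]=-1}
    }
}
```

A consistent sentinel means clients can rely on the key being present and distinguish "uncommitted" from "field omitted".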






[jira] [Commented] (SOLR-9288) RTG: fl=[docid] silently missing for uncommitted docs

2016-07-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384585#comment-15384585
 ] 

ASF subversion and git services commented on SOLR-9288:
---

Commit 9f4e2764add63afab8f5b3784274f300a94f in lucene-solr's branch 
refs/heads/branch_6x from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9f4 ]

SOLR-9288: Fix [docid] transformer to return -1 when used in RTG with 
uncommitted doc

(cherry picked from commit 08019f42889a537764384429c4184515d233a2cb)


> RTG: fl=[docid] silently missing for uncommitted docs
> -
>
> Key: SOLR-9288
> URL: https://issues.apache.org/jira/browse/SOLR-9288
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Fix For: 6.2, master (7.0)
>
> Attachments: SOLR-9288.patch
>
>
> Found in SOLR-9180 testing.
> When using RTG in a single-node Solr install, the {{\[docid\]}} transformer 
> works for committed docs, but is silently missing from uncommitted docs.
> This inconsistency is confusing.  It seems like even if there is no valid 
> docid to return in this case, the key should still be present in the 
> resulting doc.
> I would suggest using either {{null}} or {{-1}} in this case?






[jira] [Resolved] (SOLR-9288) RTG: fl=[docid] silently missing for uncommitted docs

2016-07-19 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-9288.

   Resolution: Fixed
 Assignee: Hoss Man
Fix Version/s: master (7.0)
   6.2

> RTG: fl=[docid] silently missing for uncommitted docs
> -
>
> Key: SOLR-9288
> URL: https://issues.apache.org/jira/browse/SOLR-9288
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 6.2, master (7.0)
>
> Attachments: SOLR-9288.patch
>
>
> Found in SOLR-9180 testing.
> When using RTG in a single-node Solr install, the {{\[docid\]}} transformer 
> works for committed docs, but is silently missing from uncommitted docs.
> This inconsistency is confusing.  It seems like even if there is no valid 
> docid to return in this case, the key should still be present in the 
> resulting doc.
> I would suggest using either {{null}} or {{-1}} in this case?






[jira] [Commented] (LUCENE-7386) Flatten nested disjunctions

2016-07-19 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384582#comment-15384582
 ] 

David Smiley commented on LUCENE-7386:
--

What diff does that apply to?

> Flatten nested disjunctions
> ---
>
> Key: LUCENE-7386
> URL: https://issues.apache.org/jira/browse/LUCENE-7386
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7386.patch
>
>
> Now that coords are gone it became easier to flatten nested disjunctions. It 
> might sound weird to write nested disjunctions in the first place, but 
> disjunctions can be created implicitly by other queries such as 
> more-like-this, LatLonPoint.newBoxQuery, non-scoring synonym queries, etc.
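The flattening idea can be illustrated outside Lucene with a toy model: treat a disjunction as a list whose elements are either leaf terms or nested disjunctions, and hoist nested clauses into the parent. This is a sketch of the concept only, not Lucene's actual query-rewrite code, and it relies on the clauses being pure SHOULD clauses (which is what makes the rewrite legal once coords are gone):

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenDisjunction {

    // Toy model: a disjunction is a List whose elements are either String
    // leaves (terms) or nested List disjunctions.
    public static List<Object> flatten(List<?> disjunction) {
        List<Object> flat = new ArrayList<>();
        for (Object clause : disjunction) {
            if (clause instanceof List) {
                // Hoist the nested disjunction's clauses into the parent:
                // (a OR (b OR c)) is equivalent to (a OR b OR c) when no
                // clause is required and no coord factor is involved.
                flat.addAll(flatten((List<?>) clause));
            } else {
                flat.add(clause);
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        List<Object> nested = List.of("a", List.of("b", List.of("c", "d")));
        System.out.println(flatten(nested)); // [a, b, c, d]
    }
}
```

The flattened form gives the scorer one wide disjunction instead of a tree of smaller ones, which is where the performance win comes from.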






[jira] [Commented] (LUCENE-7381) Add new RangeField

2016-07-19 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384579#comment-15384579
 ] 

David Smiley commented on LUCENE-7381:
--

Very cool Nick!

> Add new RangeField
> --
>
> Key: LUCENE-7381
> URL: https://issues.apache.org/jira/browse/LUCENE-7381
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
> Attachments: LUCENE-7381.patch, LUCENE-7381.patch, LUCENE-7381.patch
>
>
> I've been tinkering with a new Point-based {{RangeField}} for indexing 
> numeric ranges that could be useful for a number of applications.
> For example, a single dimension represents a span along a single axis such as 
> indexing calendar entries start and end time, 2d range could represent 
> bounding boxes for geometric applications (e.g., supporting Point based geo 
> shapes), 3d ranges bounding cubes for 3d geometric applications (collision 
> detection, 3d geospatial), and 4d ranges for space time applications. I'm 
> sure there's applicability for 5d+ ranges but a first incarnation should 
> likely limit for performance.






[jira] [Commented] (SOLR-7280) Load cores in sorted order and tweak coreLoadThread counts to improve cluster stability on restarts

2016-07-19 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384572#comment-15384572
 ] 

Mike Drob commented on SOLR-7280:
-

{{+List l = new ArrayList<>();}}
Could init this list to copy.size()

{{+List ret = new ArrayList<>();}}
Same idea here, cc.getCores().size()

Other than that, your unwrapped lambdas look fine to me.
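The presizing suggestion is just the standard {{ArrayList}} capacity constructor; a minimal illustration (the method and variable names here are hypothetical, not the actual patch code):

```java
import java.util.ArrayList;
import java.util.List;

public class PresizeExample {

    // Without a capacity hint, ArrayList starts small and grows by
    // allocating a bigger backing array and copying as elements are added.
    public static List<String> copyDefault(List<String> copy) {
        List<String> l = new ArrayList<>();
        l.addAll(copy);
        return l;
    }

    // Passing the known final size up front allocates the backing array
    // once, avoiding the intermediate resize-and-copy steps.
    public static List<String> copyPresized(List<String> copy) {
        List<String> l = new ArrayList<>(copy.size());
        l.addAll(copy);
        return l;
    }

    public static void main(String[] args) {
        List<String> src = List.of("a", "b", "c");
        System.out.println(copyPresized(src)); // [a, b, c]
    }
}
```

Both methods produce equal lists; the capacity hint is purely an allocation optimization.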

> Load cores in sorted order and tweak coreLoadThread counts to improve cluster 
> stability on restarts
> ---
>
> Key: SOLR-7280
> URL: https://issues.apache.org/jira/browse/SOLR-7280
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-7280-5x.patch, SOLR-7280-5x.patch, SOLR-7280.patch, 
> SOLR-7280.patch
>
>
> In SOLR-7191, Damien mentioned that by loading solr cores in a sorted order 
> and tweaking some of the coreLoadThread counts, he was able to improve the 
> stability of a cluster with thousands of collections. We should explore some 
> of these changes and fold them into Solr.






[jira] [Updated] (SOLR-5944) Support updates of numeric DocValues

2016-07-19 Thread Ishan Chattopadhyaya (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-5944:
---
Attachment: SOLR-5944.patch

Updated the patch:
# Fixed a bug with mixing atomic and in-place updates. The problem was that after 
an in-place update, RTGC.getInputDocument() got only the partial document, and 
hence further atomic updates on it failed. Changed this to return a "resolved" 
document for use during atomic updates.
# Added direct unit tests for AUDM.isInPlaceUpdate() at 
TestInPlaceUpdatesCopyField.java and for applyPartialUpdates() at 
UpdateLogTest.java.
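The "resolved document" fix described in point 1 amounts to overlaying the partial (in-place updated) fields on the last full stored version before applying further atomic updates. A toy sketch with maps — not the actual patch code:

```java
import java.util.HashMap;
import java.util.Map;

public class ResolveDoc {

    // Overlay the partial (in-place updated) fields on top of the last full
    // stored document, so a later atomic update sees every field, not just
    // the ones touched by the in-place update.
    public static Map<String, Object> resolve(Map<String, Object> full,
                                              Map<String, Object> partial) {
        Map<String, Object> resolved = new HashMap<>(full);
        resolved.putAll(partial); // partial values win for updated fields
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, Object> full = new HashMap<>();
        full.put("id", "1");
        full.put("price", 10);
        full.put("title", "book");

        Map<String, Object> partial = new HashMap<>();
        partial.put("id", "1");
        partial.put("price", 12); // in-place docvalues update

        System.out.println(resolve(full, partial));
    }
}
```

Without the overlay, the atomic-update path would see only {{id}} and {{price}} and silently drop {{title}}, which is the failure mode described above.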

> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.






[jira] [Updated] (SOLR-7280) Load cores in sorted order and tweak coreLoadThread counts to improve cluster stability on restarts

2016-07-19 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-7280:
-
Attachment: SOLR-7280-5x.patch

Success! The problem I was having before was that I was _still_ getting OOM 
errors in my setup with the default 24 coreLoadThreads. Reducing it to 8 cured 
the problem; I ran my test setup all last night and there were zero problems.
I've attached the patch for 5x. I had to re-implement the lambda expressions in 
the original; I _think_ I did the right thing in the new CoreSorterTest, but 
any checks are welcome. This patch also sets the default coreLoadThreads to 8, as 
Noble discussed.

[~mdrob] thanks for the pointer on the junit stuff BTW. I didn't incorporate 
your other suggestion, but having it in 6x and 7x suffices I think.

This passes precommit and test as well as my stress test.

So, the question becomes should this be merged into the 5x code line so it'll 
be picked up by any (hypothetical) 5x releases or just left here and we'll deal 
with whether it should be included in any new 5x release when the time comes? 
Any firm opinions? This topic has come up on more than one occasion, but even 
checking it into 5x still means people would have to build it themselves.

> Load cores in sorted order and tweak coreLoadThread counts to improve cluster 
> stability on restarts
> ---
>
> Key: SOLR-7280
> URL: https://issues.apache.org/jira/browse/SOLR-7280
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-7280-5x.patch, SOLR-7280-5x.patch, SOLR-7280.patch, 
> SOLR-7280.patch
>
>
> In SOLR-7191, Damien mentioned that by loading solr cores in a sorted order 
> and tweaking some of the coreLoadThread counts, he was able to improve the 
> stability of a cluster with thousands of collections. We should explore some 
> of these changes and fold them into Solr.






[jira] [Resolved] (LUCENE-7384) Remove ScoringWrapperSpans

2016-07-19 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved LUCENE-7384.
--
   Resolution: Fixed
 Assignee: David Smiley
Fix Version/s: 6.2

> Remove ScoringWrapperSpans
> --
>
> Key: LUCENE-7384
> URL: https://issues.apache.org/jira/browse/LUCENE-7384
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 6.2
>
> Attachments: LUCENE_7384.patch
>
>
> In LUCENE-6919 (Lucene 5.5), ScoringWrapperSpans was modified in such a way 
> that made the existence of this class pointless, and possibly broke anyone 
> who was using it as its SimScorer argument isn't used anymore.  We should 
> now delete it.  SpanWeight has getSimScorer() so people can customize the 
> SimScorer that way.
> Another small change I observe to improve is have SpanWeight.buildSimWeight's 
> last line use the existing Similarity that has already been populated on the 
> field?






[jira] [Commented] (LUCENE-7384) Remove ScoringWrapperSpans

2016-07-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384500#comment-15384500
 ] 

ASF subversion and git services commented on LUCENE-7384:
-

Commit dfa3f61ecf501014836ff8d015a1548715198a05 in lucene-solr's branch 
refs/heads/branch_6x from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=dfa3f61 ]

LUCENE-7384: Tweak SpanWeight.buildSimWeight to reuse the existing similarity.
(cherry picked from commit 180f956)


> Remove ScoringWrapperSpans
> --
>
> Key: LUCENE-7384
> URL: https://issues.apache.org/jira/browse/LUCENE-7384
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: David Smiley
> Attachments: LUCENE_7384.patch
>
>
> In LUCENE-6919 (Lucene 5.5), ScoringWrapperSpans was modified in such a way 
> that made the existence of this class pointless, and possibly broke anyone 
> who was using it as its SimScorer argument isn't used anymore.  We should 
> now delete it.  SpanWeight has getSimScorer() so people can customize the 
> SimScorer that way.
> Another small change I observe to improve is have SpanWeight.buildSimWeight's 
> last line use the existing Similarity that has already been populated on the 
> field?






[jira] [Commented] (LUCENE-7384) Remove ScoringWrapperSpans

2016-07-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384499#comment-15384499
 ] 

ASF subversion and git services commented on LUCENE-7384:
-

Commit 8904c3a952fa9cc56d95161c263096e6a9d5 in lucene-solr's branch 
refs/heads/branch_6x from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8904c3a ]

LUCENE-7384: Remove defunct ScoringWrapperSpans.
(cherry picked from commit abb81e4)


> Remove ScoringWrapperSpans
> --
>
> Key: LUCENE-7384
> URL: https://issues.apache.org/jira/browse/LUCENE-7384
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: David Smiley
> Attachments: LUCENE_7384.patch
>
>
> In LUCENE-6919 (Lucene 5.5), ScoringWrapperSpans was modified in such a way 
> that made the existence of this class pointless, and possibly broke anyone 
> who was using it as its SimScorer argument isn't used anymore.  We should 
> now delete it.  SpanWeight has getSimScorer() so people can customize the 
> SimScorer that way.
> Another small change I observe to improve is have SpanWeight.buildSimWeight's 
> last line use the existing Similarity that has already been populated on the 
> field?






[jira] [Comment Edited] (LUCENE-7381) Add new RangeField

2016-07-19 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384486#comment-15384486
 ] 

Adrien Grand edited comment on LUCENE-7381 at 7/19/16 4:48 PM:
---

This is an exciting feature! I looked at the patch and have some questions:
 - Should the field be called something like DoubleRange rather than RangeField 
so that we still have namespace to have similar fields for other data types? I 
think this would also be more consistent with the names of other fields like 
StringField or DoublePoint?
 - The reuse of {{fieldsData}} in {{setRangeValues}} worries me a bit, is it 
safe? Other fields do not seem to do that?
 - QueryType does not need to be public?
 - Why do you replace infinities with +/-MAX_VALUE?



was (Author: jpountz):
This is an eciting feature! I looked at the patch and have some questions:
 - Should the field be called something like DoubleRange rather than RangeField 
so that we still have namespace to have similar fields for other data types? I 
think this would also be more consistent with the names of other fields like 
StringField or DoublePoint?
 - The reuse of {{fieldsData}} in {{setRangeValues}} worries me a bit, is it 
safe? Other fields do not seem to do that?
 - QueryType does not need to be public?
 - Why do you replace infinities with +/-MAX_VALUE?


> Add new RangeField
> --
>
> Key: LUCENE-7381
> URL: https://issues.apache.org/jira/browse/LUCENE-7381
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
> Attachments: LUCENE-7381.patch, LUCENE-7381.patch, LUCENE-7381.patch
>
>
> I've been tinkering with a new Point-based {{RangeField}} for indexing 
> numeric ranges that could be useful for a number of applications.
> For example, a single dimension represents a span along a single axis such as 
> indexing calendar entries start and end time, 2d range could represent 
> bounding boxes for geometric applications (e.g., supporting Point based geo 
> shapes), 3d ranges bounding cubes for 3d geometric applications (collision 
> detection, 3d geospatial), and 4d ranges for space time applications. I'm 
> sure there's applicability for 5d+ ranges but a first incarnation should 
> likely limit for performance.






[jira] [Commented] (LUCENE-7381) Add new RangeField

2016-07-19 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384486#comment-15384486
 ] 

Adrien Grand commented on LUCENE-7381:
--

This is an exciting feature! I looked at the patch and have some questions:
 - Should the field be called something like DoubleRange rather than RangeField 
so that we still have namespace to have similar fields for other data types? I 
think this would also be more consistent with the names of other fields like 
StringField or DoublePoint?
 - The reuse of {{fieldsData}} in {{setRangeValues}} worries me a bit, is it 
safe? Other fields do not seem to do that?
 - QueryType does not need to be public?
 - Why do you replace infinities with +/-MAX_VALUE?


> Add new RangeField
> --
>
> Key: LUCENE-7381
> URL: https://issues.apache.org/jira/browse/LUCENE-7381
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Nicholas Knize
> Attachments: LUCENE-7381.patch, LUCENE-7381.patch, LUCENE-7381.patch
>
>
> I've been tinkering with a new Point-based {{RangeField}} for indexing 
> numeric ranges that could be useful for a number of applications.
> For example, a single dimension represents a span along a single axis such as 
> indexing calendar entries start and end time, 2d range could represent 
> bounding boxes for geometric applications (e.g., supporting Point based geo 
> shapes), 3d ranges bounding cubes for 3d geometric applications (collision 
> detection, 3d geospatial), and 4d ranges for space time applications. I'm 
> sure there's applicability for 5d+ ranges but a first incarnation should 
> likely limit for performance.






[jira] [Commented] (LUCENE-7384) Remove ScoringWrapperSpans

2016-07-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384480#comment-15384480
 ] 

ASF subversion and git services commented on LUCENE-7384:
-

Commit 180f9562aa9c1e271d8dce48ac5695d0612bf808 in lucene-solr's branch 
refs/heads/master from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=180f956 ]

LUCENE-7384: Tweak SpanWeight.buildSimWeight to reuse the existing similarity.


> Remove ScoringWrapperSpans
> --
>
> Key: LUCENE-7384
> URL: https://issues.apache.org/jira/browse/LUCENE-7384
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: David Smiley
> Attachments: LUCENE_7384.patch
>
>
> In LUCENE-6919 (Lucene 5.5), ScoringWrapperSpans was modified in such a way 
> that made the existence of this class pointless, and possibly broke anyone 
> who was using it as its SimScorer argument isn't used anymore.  We should 
> now delete it.  SpanWeight has getSimScorer() so people can customize the 
> SimScorer that way.
> Another small change I observe to improve is have SpanWeight.buildSimWeight's 
> last line use the existing Similarity that has already been populated on the 
> field?






[JENKINS] Lucene-Solr-master-Windows (32bit/jdk1.8.0_92) - Build # 5993 - Still unstable!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/5993/
Java: 32bit/jdk1.8.0_92 -client -XX:+UseConcMarkSweepGC

2 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.util.TestSolrCLIRunExample

Error Message:
ObjectTracker found 3 object(s) that were not released!!! [SolrCore, 
MockDirectoryWrapper, MockDirectoryWrapper]

Stack Trace:
java.lang.AssertionError: ObjectTracker found 3 object(s) that were not 
released!!! [SolrCore, MockDirectoryWrapper, MockDirectoryWrapper]
at __randomizedtesting.SeedInfo.seed([68CD0FDC2C3CEA5E]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertNull(Assert.java:551)
at 
org.apache.solr.SolrTestCaseJ4.teardownTestCases(SolrTestCaseJ4.java:257)
at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:834)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)


FAILED:  org.apache.solr.cloud.TestLocalFSCloudBackupRestore.test

Error Message:
Error from server at http://127.0.0.1:56823/solr: The backup directory already 
exists: 
file:///C:/Users/jenkins/workspace/Lucene-Solr-master-Windows/solr/build/solr-core/test/J1/temp/solr.cloud.TestLocalFSCloudBackupRestore_68CD0FDC2C3CEA5E-001/tempDir-002/mytestbackup/

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://127.0.0.1:56823/solr: The backup directory already 
exists: 
file:///C:/Users/jenkins/workspace/Lucene-Solr-master-Windows/solr/build/solr-core/test/J1/temp/solr.cloud.TestLocalFSCloudBackupRestore_68CD0FDC2C3CEA5E-001/tempDir-002/mytestbackup/
at 
__randomizedtesting.SeedInfo.seed([68CD0FDC2C3CEA5E:E099300682C087A6]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:606)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:259)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:413)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:366)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1270)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1040)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:976)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
at 
org.apache.solr.cloud.AbstractCloudBackupRestoreTestCase.testBackupAndRestore(AbstractCloudBackupRestoreTestCase.java:206)
  

[jira] [Commented] (LUCENE-7384) Remove ScoringWrapperSpans

2016-07-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384479#comment-15384479
 ] 

ASF subversion and git services commented on LUCENE-7384:
-

Commit abb81e4dedd05606f91be809d702be0ca8be1caf in lucene-solr's branch 
refs/heads/master from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=abb81e4 ]

LUCENE-7384: Remove defunct ScoringWrapperSpans.


> Remove ScoringWrapperSpans
> --
>
> Key: LUCENE-7384
> URL: https://issues.apache.org/jira/browse/LUCENE-7384
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: David Smiley
> Attachments: LUCENE_7384.patch
>
>
> In LUCENE-6919 (Lucene 5.5), ScoringWrapperSpans was modified in such a way 
> that made the existence of this class pointless, and possibly broke anyone 
> who was using it as its SimScorer argument isn't used anymore.  We should 
> now delete it.  SpanWeight has getSimScorer() so people can customize the 
> SimScorer that way.
> Another small change I observe to improve is have SpanWeight.buildSimWeight's 
> last line use the existing Similarity that has already been populated on the 
> field?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9313) Solr 6.1.0 SSL, and Basic Auth - shards index failed

2016-07-19 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384419#comment-15384419
 ] 

Erick Erickson commented on SOLR-9313:
--

Please raise questions like this on the users list before
raising a JIRA. More eyes will see it and you'll likely
get help much more quickly.

We try to keep JIRA for known code issues. I know
there are auth tests in the junit tests, so my first
guess would be that something is not quite right
in your configuration rather than a code problem.

> Solr 6.1.0 SSL, and Basic Auth - shards index failed
> 
>
> Key: SOLR-9313
> URL: https://issues.apache.org/jira/browse/SOLR-9313
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication
>Affects Versions: 6.1
> Environment: RHEL 7.2, Solr 6.1.0, Java 1.8, zk 3.4.8
>Reporter: narayana b
>Priority: Blocker
>  Labels: security
>
> Hi,
> This is a blocker: the sharded collection is asking for auth and failing 
> with a 401 error.
> I have provided auth details in my Java client, but indexing to the sharded 
> collection still fails.
> I have 2 boxes (dev01,dev02)
> Zookeeper with chroot (/solr)
> 
> dev01 - zoo1:2181, zoo2:2182
> dev02 - zoo3:2183
> solr jvm instances:
> ---
> dev01 - solrjvm1 - 8983, solrjvm2 - 8984
> dev02 - solrjvm1 - 8983, solrjvm2 - 8984
> I enabled Solr's SSL channel following the link below, using a self-signed 
> certificate:
> https://cwiki.apache.org/confluence/display/solr/Enabling+SSL
> Basic auth:
> https://cwiki.apache.org/confluence/display/solr/Basic+Authentication+Plugin
> security.json
> 
> {
>   "authentication": {
>     "blockUnknown": true,
>     "class": "solr.BasicAuthPlugin",
>     "credentials": {
>       "solr": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
>     }
>   },
>   "authorization": {
>     "class": "solr.RuleBasedAuthorizationPlugin",
>     "user-role": {"solr": "admin"},
>     "permissions": [
>       {"name": "security-edit", "role": "admin"},
>       {"name": "config-edit", "role": "admin"},
>       {"name": "collection-admin-edit", "role": "admin"},
>       {"name": "all", "collection": null, "path": "/*", "role": "admin"},
>       {"name": "update", "collection": null, "path": "/*", "role": "admin"}
>     ]
>   }
> }
> Collection CREATE/DELETE via browser
> https://pcam-dev-app-01:8983/solr/admin/collections?action=DELETE&name=scdata_test
> https://pcam-dev-app-01:8983/solr/admin/collections?action=CREATE&name=scdata_test&numShards=1&replicationFactor=2&createNodeSet=pcam-dev-app-01:8983_solr,pcam-dev-app-01:8984_solr&collection.configName=scdata
> Two shards created:
> -
> scdata_test_shard1_replica1
> scdata_test_shard1_replica2
> Sample Java client
> 
> package com.test.solr.auth;
>
> import java.util.concurrent.TimeUnit;
> import org.apache.solr.client.solrj.SolrRequest;
> import org.apache.solr.client.solrj.impl.CloudSolrClient;
> import org.apache.solr.client.solrj.request.QueryRequest;
> import org.apache.solr.common.SolrInputDocument;
>
> public class SolrPopulateWithSSLAndBasicAuth {
>
>   public SolrPopulateWithSSLAndBasicAuth() {
>   }
>
>   @SuppressWarnings("rawtypes")
>   public static void main(String[] args) {
>     // https://cwiki.apache.org/confluence/display/solr/Using+SolrJ
>     // Standalone client:
>     // String urlString = "http://localhost:8983/solr/techproducts";
>     // SolrClient solr = new HttpSolrClient.Builder(urlString).build();
>     try {
>       System.setProperty("javax.net.ssl.keyStore",
>           "C:/Users/nbasetty/Desktop/Solr-Dev-Cluster/solr-ssl.keystore.dev01.jks");
>       System.setProperty("javax.net.ssl.keyStorePassword", "secret");
>       System.setProperty("javax.net.ssl.trustStore",
>           "C:/Users/nbasetty/Desktop/Solr-Dev-Cluster/solr-ssl.keystore.dev01.jks");
>       System.setProperty("javax.net.ssl.trustStorePassword", "secret");
>       System.out.println(" Certificates setup done..");
>       String zkHosts =
>           "pcam-dev-app-01:2181,pcam-dev-app-01:2182,pcam-dev-app-02:2183/solr";
>       CloudSolrClient solrClient =
>           new CloudSolrClient.Builder().withZkHost(zkHosts).build();
>       solrClient.setDefaultCollection("scdata_test");
>       System.out.println(" ZooKeeper nodes setup done..");
>       SolrRequest

[jira] [Commented] (SOLR-9315) SchemaSimilarityFactory should delegate queryNorm and coord to the default similarity

2016-07-19 Thread Upayavira (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384407#comment-15384407
 ] 

Upayavira commented on SOLR-9315:
-

This patch did resolve my issue. How should we go about committing it?

I'm happy to commit it, but I wouldn't know how to provide a test for it.

> SchemaSimilarityFactory should delegate queryNorm and coord to the default 
> similarity
> -
>
> Key: SOLR-9315
> URL: https://issues.apache.org/jira/browse/SOLR-9315
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: SOLR-9315.patch
>
>
> This is a follow-up to the discussion with [~upayavira] on LUCENE-6590: 
> SchemaSimilarityFactory can easily build similarities that apply the idf 
> twice.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6590) Explore different ways to apply boosts

2016-07-19 Thread Upayavira (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384402#comment-15384402
 ] 

Upayavira commented on LUCENE-6590:
---

I applied your patch and my problems went away. Many thanks!!

> Explore different ways to apply boosts
> --
>
> Key: LUCENE-6590
> URL: https://issues.apache.org/jira/browse/LUCENE-6590
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: 5.4
>
> Attachments: LUCENE-6590.patch, LUCENE-6590.patch, LUCENE-6590.patch, 
> LUCENE-6590.patch, LUCENE-6590.patch, LUCENE-6590.patch, LUCENE-6590.patch
>
>
> Follow-up from LUCENE-6570: the fact that all queries are mutable in order to 
> allow for applying a boost raises issues, since it makes queries bad cache 
> keys: their hashcode can change at any time. We could just document that 
> queries should never be modified after they have gone through IndexSearcher, 
> but it would be even better if the API made queries impossible to mutate at 
> all.
> I think there are two main options:
>  - either replace "void setBoost(boost)" with something like "Query 
> withBoost(boost)", which would return a clone that has a different boost;
>  - or move boost handling outside of Query: for instance, we could have an 
> (immutable) query impl dedicated to applying boosts, which queries that need 
> to change boosts at rewrite time (such as BooleanQuery) would use as a 
> wrapper.
> The latter idea is from Robert and I like it a lot, given how often I have 
> either introduced or found a bug that was due to the boost parameter being 
> ignored. Maybe there are other options, but I think this is worth exploring.
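The second option above, an immutable wrapper that carries the boost, can be sketched in plain Java. These are stand-in types for illustration only, not Lucene's actual classes:

```java
// Stand-in query hierarchy: immutable, so instances are safe cache keys
// because equals()/hashCode() can never change after construction.
interface Query {}

record TermQueryStub(String term) implements Query {}

// The hypothetical immutable boost wrapper: instead of mutating a query
// with setBoost(), wrap it and leave the original untouched.
record BoostQueryStub(Query wrapped, float boost) implements Query {}

public class ImmutableBoostSketch {
    public static void main(String[] args) {
        Query base = new TermQueryStub("lucene");
        Query boosted = new BoostQueryStub(base, 2.0f);

        // Equal wrappers have equal hash codes, so a cache keyed on the
        // query object keeps working; with a mutable setBoost() it would not.
        Query sameBoost = new BoostQueryStub(new TermQueryStub("lucene"), 2.0f);
        System.out.println(boosted.equals(sameBoost));                  // true
        System.out.println(boosted.hashCode() == sameBoost.hashCode()); // true
        System.out.println(base.equals(boosted));                       // false
    }
}
```

Because the wrapper is a record, equality and hash code are derived from the wrapped query and the boost, which is exactly the property the mutable setBoost() design breaks.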



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7386) Flatten nested disjunctions

2016-07-19 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-7386:
-
Attachment: LUCENE-7386.patch

Here is a patch: scorers are flattened when minShouldMatch == 1 and scores 
need to be summed up. luceneutil seems happy with this patch, with the 
following change applied to the tasks file:

{noformat}
diff --git a/tasks/wikimedium.10M.nostopwords.tasks 
b/tasks/wikimedium.10M.nostopwords.tasks
index 342070c..4b36348 100644
--- a/tasks/wikimedium.10M.nostopwords.tasks
+++ b/tasks/wikimedium.10M.nostopwords.tasks
@@ -3735,6 +3735,22 @@ AndHighLow: +2005 +saad # freq=835460 freq=1184
 AndHighLow: +than +sneaks # freq=676864 freq=1291
 AndHighLow: +see +leveling # freq=1044180 freq=943
 AndHighLow: +page +mandel # freq=681036 freq=1866
+OrHighHighHigh: (several following) publisher
+OrHighHighHigh: (2009 film) http
+OrHighHighHigh: (south county) now
+OrHighHighHigh: called (utc until)
+OrHighHighHigh: most (part used)
+OrHighHighHigh: title (2006 references)
+OrHighHighHigh: known (century references)
+OrHighHighHigh: can (against news)
+AndHighOrHighHighHigh: +http (several following) publisher
+AndHighOrHighHighHigh: +now (2009 film) http
+AndHighOrHighHighHigh: +until (south county) now
+AndHighOrHighHighHigh: +used called (utc until)
+AndHighOrHighHighHigh: +references most (part used)
+AndHighOrHighHighHigh: +news title (2006 references)
+AndHighOrHighHighHigh: +several known (century references)
+AndHighOrHighHighHigh: +film can (against news)
 OrHighHigh: several following # freq=436129 freq=416515
 OrHighHigh: publisher end # freq=1289029 freq=526636
 OrHighHigh: 2009 film # freq=887702 freq=432758
{noformat}

The goal of OrHighHighHigh is to test BS1 and AndHighOrHighHighHigh to test BS2.

{noformat}
Task                    QPS baseline  StdDev    QPS patch  StdDev      Pct diff
Fuzzy2                   73.80 (14.5%)   69.06 (22.5%)   -6.4% ( -37% -   35%)
Fuzzy1                   86.33  (8.1%)   82.89  (9.6%)   -4.0% ( -20% -   14%)
OrNotHighLow           1204.34  (4.0%) 1188.30  (4.0%)   -1.3% (  -8% -    6%)
OrNotHighMed            146.82  (2.7%)  145.94  (2.9%)   -0.6% (  -6% -    5%)
MedTerm                 158.21  (6.7%)  157.58  (6.5%)   -0.4% ( -12% -   13%)
OrNotHighHigh            67.20  (4.8%)   66.99  (4.4%)   -0.3% (  -9% -    9%)
OrHighNotMed            121.66  (8.5%)  121.38  (8.3%)   -0.2% ( -15% -   18%)
Prefix3                  36.48  (7.1%)   36.40  (6.8%)   -0.2% ( -13% -   14%)
OrHighNotLow            136.63  (9.2%)  136.35  (9.5%)   -0.2% ( -17% -   20%)
HighSloppyPhrase         56.20  (6.7%)   56.09  (6.0%)   -0.2% ( -12% -   13%)
MedPhrase                47.37  (2.3%)   47.28  (2.4%)   -0.2% (  -4% -    4%)
LowPhrase                47.39  (2.2%)   47.31  (2.8%)   -0.2% (  -5% -    4%)
Respell                  64.37  (3.1%)   64.26  (3.6%)   -0.2% (  -6% -    6%)
Wildcard                 39.79  (5.9%)   39.72  (6.0%)   -0.2% ( -11% -   12%)
IntNRQ                   11.80 (18.8%)   11.79 (18.6%)   -0.1% ( -31% -   45%)
AndHighHigh              81.62  (3.0%)   81.56  (2.6%)   -0.1% (  -5% -    5%)
HighSpanNear              9.39  (3.8%)    9.38  (3.2%)   -0.1% (  -6% -    7%)
LowSpanNear              17.78  (3.1%)   17.77  (2.9%)   -0.0% (  -5% -    6%)
MedSpanNear              11.97  (3.5%)   11.96  (3.1%)   -0.0% (  -6% -    6%)
HighTerm                102.38  (6.8%)  102.38  (6.2%)    0.0% ( -12% -   13%)
OrHighLow               131.23  (6.6%)  131.26  (6.5%)    0.0% ( -12% -   13%)
MedSloppyPhrase          41.51  (4.3%)   41.57  (3.9%)    0.2% (  -7% -    8%)
LowSloppyPhrase          16.08  (6.2%)   16.11  (5.7%)    0.2% ( -11% -   12%)
HighPhrase               14.70  (2.9%)   14.74  (2.6%)    0.3% (  -5% -    5%)
AndHighMed              154.49  (3.4%)  154.97  (2.5%)    0.3% (  -5% -    6%)
OrHighNotHigh            50.78  (6.6%)   50.96  (6.6%)    0.4% ( -12% -   14%)
AndHighLow              673.89  (4.2%)  677.46  (2.7%)    0.5% (  -6% -    7%)
LowTerm                 599.83  (9.3%)  605.98  (9.4%)    1.0% ( -16% -   21%)
OrHighHigh               31.23  (5.6%)   31.70  (5.5%)    1.5% (  -9% -   13%)
OrHighMed                38.87  (5.3%)   39.60  (5.2%)    1.9% (  -8% -   13%)
AndHighOrHighHighHigh    22.74  (3.2%)   24.10  (3.3%)    6.0% (

[jira] [Created] (LUCENE-7386) Flatten nested disjunctions

2016-07-19 Thread Adrien Grand (JIRA)
Adrien Grand created LUCENE-7386:


 Summary: Flatten nested disjunctions
 Key: LUCENE-7386
 URL: https://issues.apache.org/jira/browse/LUCENE-7386
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor


Now that coords are gone, it has become easier to flatten nested disjunctions. 
It might sound odd to write nested disjunctions in the first place, but 
disjunctions can be created implicitly by other queries such as 
more-like-this, LatLonPoint.newBoxQuery, non-scoring synonym queries, etc.
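The flattening itself can be sketched on a toy query tree. The types below are stand-ins, not Lucene's classes; in Lucene the rewrite only applies when minShouldMatch == 1 and scores are summed, since a sum of sums is associative:

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenDisjunctions {
    interface Q {}
    record Term(String text) implements Q {}
    record Or(List<Q> clauses) implements Q {}

    // Recursively replace any nested Or clause by its sub-clauses.
    static Or flatten(Or or) {
        List<Q> flat = new ArrayList<>();
        for (Q clause : or.clauses()) {
            if (clause instanceof Or nested) {
                flat.addAll(flatten(nested).clauses());
            } else {
                flat.add(clause);
            }
        }
        return new Or(flat);
    }

    public static void main(String[] args) {
        // (a OR (b OR c) OR d)  ->  (a OR b OR c OR d)
        Or nested = new Or(List.of(
            new Term("a"),
            new Or(List.of(new Term("b"), new Term("c"))),
            new Term("d")));
        System.out.println(flatten(nested).clauses().size()); // 4
    }
}
```

A flat disjunction lets a single disjunction scorer (BS1-style) see all clauses at once instead of recursing through wrapper scorers, which is where the AndHighOrHighHighHigh speedup in the benchmark comes from.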



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Near real time search improvement

2016-07-19 Thread Michael McCandless
I think for most users, "near" real time is good enough, especially when
you can control what "near" is for your use case.  E.g., Elasticsearch
defaults to opening a new searcher once per second.

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jul 18, 2016 at 7:27 AM, Konstantin 
wrote:

> It seems that the existing write cache stores data in an unsorted manner
> (a hash table).
> I cannot come up with anything smarter than using a persistent sorted map
> for the write cache, as implemented in my project Rhinodog.
> Persistent, to let readers work without locking; a sorted map, to access
> documentIDs for a particular termID in order.
> My implementation indexes text about 2.5 times slower using the existing
> EnglishAnalyzer, so I'm wondering if this is a good trade-off.
> Probably for some use cases it's desirable, but not for all.
> Also I'm new to Lucene, and don't feel like throwing away code that has
> been here longer than I've been writing code.
> Perhaps real-time search is not very important?
>
>
> 2016-07-14 15:55 GMT+03:00 Michael McCandless :
>
>> Your RAMDirectory option is what NRTCachingDirectory does I think?  Small
>> files are written in RAM, and only on merging them into larger files, do we
>> write those files to the real directory.  It's not clear it's that helpful,
>> though, because the OS does similar write caching, more efficiently.
>>
>> But even with RAMDirectory, you need to periodically open a new searcher
>> ... which makes it *near* real time, not truly real time like the Twitter
>> solution.
>>
>> Unfortunately, the crazy classes like BytesRefHash, TermsHash, etc., do
>> not have any documentation beyond what comments you see in their sources
>> ... maybe try looking at their test cases, or how the classes are used by
>> other classes in Lucene.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Thu, Jul 14, 2016 at 8:14 AM, Konstantin 
>> wrote:
>>
>>> Hello Michael,
>>> Maybe this problem is already solved (or can be solved) on a different
>>> level of abstraction (in Solr or Elasticsearch): write new documents to
>>> both the persistent index and a RAMDirectory, so new docs can be queried
>>> from it immediately.
>>> My motivation for this is to learn from Lucene. Could you please suggest
>>> any source of information on BytesRefHash, TermsHash, and the whole
>>> indexing process?
>>> Changing anything in there looks like a complex task to me too.
>>>
>>>
>>> 2016-07-14 11:54 GMT+03:00 Michael McCandless >> >:
>>>
 Another example is Michael Busch's work while at Twitter, extending
 Lucene so you can do real-time searches of the write cache ... here's a
 paper describing it:
 http://www.umiacs.umd.edu/~jimmylin/publications/Busch_etal_ICDE2012.pdf

 But this was a very heavy modification of Lucene and wasn't ever
 contributed back.

 I do think it should be possible (just complex!) to have real-time
 searching of recently indexed documents, and the sorted terms is really
 only needed if you must support multi-term queries.

 Mike McCandless

 http://blog.mikemccandless.com

 On Tue, Jul 12, 2016 at 12:29 PM, Adrien Grand 
 wrote:

> This is not something I am very familiar with, but this issue
> https://issues.apache.org/jira/browse/LUCENE-2312 tried to improve
> NRT latency by adding the ability to search directly into the indexing
> buffer of the index writer.
>
> Le mar. 12 juil. 2016 à 16:11, Konstantin 
> a écrit :
>
>> Hello everyone,
>> As far as I understand, NRT requires flushing a new segment to disk. Is
>> it correct that the write cache is not searchable?
>>
>> The competing search library groonga claims much smaller real-time search
>> latency (as far as I understand, via a searchable write cache), but loading
>> data into their index takes almost three times longer (benchmark in a blog
>> post in Japanese; the dataset seems to be a wikipedia XML dump, I'm not
>> sure if it's the English one).
>>
>>
>> I've created an incomplete prototype of a searchable write cache in my
>> pet project, and it takes two times longer to index a fraction of
>> wikipedia using the same EnglishAnalyzer from lucene.analysis (probably
>> there is room for optimizations). While loading data into Lucene I didn't
>> reuse Document instances. The searchable write cache was implemented as a
>> bunch of persistent Scala SortedMap[TermKey, Measure]s, one per logical
>> core, where TermKey is defined as TermKey(termID: Int, docID: Long) and
>> Measure is just frequency and norm (but could be extended).
>>
>> Do you think it's worth the slowdown? If so I'm interested to learn
>> how this part of Lucene works while implementing this featur
[jira] [Commented] (LUCENE-7384) Remove ScoringWrapperSpans

2016-07-19 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384357#comment-15384357
 ] 

Alan Woodward commented on LUCENE-7384:
---

When I merged Spans and SpanScorer, I needed a way of setting a docScorer 
object on a Spans after it had been created - for example, if the exclusion 
Spans in a SpanNotQuery is empty, then we just return the inclusion Spans, but 
because the docScorer was only being set on the root Spans object you could end 
up with a null scorer.  Now that SpanScorer and Spans are separate again, and 
SpanScorer holds the similarity objects, we don't need to deal with this any 
more.

> Remove ScoringWrapperSpans
> --
>
> Key: LUCENE-7384
> URL: https://issues.apache.org/jira/browse/LUCENE-7384
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: David Smiley
> Attachments: LUCENE_7384.patch
>
>
> In LUCENE-6919 (Lucene 5.5), ScoringWrapperSpans was modified in such a way 
> that made the existence of this class pointless, and possibly broke anyone 
> who was using it, as its SimScorer argument isn't used anymore.  We should 
> now delete it.  SpanWeight has getSimScorer(), so people can customize the 
> SimScorer that way.
> Another small improvement would be to have SpanWeight.buildSimWeight's last 
> line use the existing Similarity that has already been populated for the 
> field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9319) DELETEREPLICA should accept just count and remove replicas intelligently

2016-07-19 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384351#comment-15384351
 ] 

Shai Erera commented on SOLR-9319:
--

Thanks [~noble.paul]. The issue description is a bit misleading (_should accept 
*just* count_) but thanks for clarifying.

> DELETEREPLICA should accept just count and remove replicas intelligently
> 
>
> Key: SOLR-9319
> URL: https://issues.apache.org/jira/browse/SOLR-9319
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
> Fix For: 6.1
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-07-19 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384346#comment-15384346
 ] 

Noble Paul commented on SOLR-9241:
--

bq. REPLACENODE: Do you want this to be a separate admin action?

We will not have a REBALANCE command; the name is ambiguous. Instead, we will 
have explicit actions for each operation.

bq. What is the default behavior of the replica replacement strategy?

Nothing. It just randomly assigns nodes in the absence of rules.  We should add 
a default strategy and make it kick in when no rules are specified.

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is a work in progress and incremental. We have done a few rounds of 
> code clean-up. We wanted to get the patch going first to get initial 
> feedback.  We will continue to work on making it more open-source friendly 
> and easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance api provides 2 constructs:
> Scaling Strategy:  Decides how to move the data.  Every flavor has multiple 
> options which can be reviewed in the api spec.
> Re-distribute  - Move around data in the cluster based on capacity/allocation.
> Auto Shard  - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merging data from a larger shard setup 
> into smaller one.  (the source should be divisible by destination)
> Scale up -  Add replicas on the fly
> Scale Down - Remove replicas on the fly
> Allocation Strategy:  Decides where to put the data.  (Nodes with least 
> cores, Nodes that do not have this collection etc). Custom implementations 
> can be built on top as well. One other example is Availability Zone aware. 
> Distribute data such that every replica is placed on different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9241) Rebalance API for SolrCloud

2016-07-19 Thread Nitin Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384337#comment-15384337
 ] 

Nitin Sharma commented on SOLR-9241:


[~noble.paul] Thanks for the feedback.  Those are some major changes to be 
made. I will submit a patch only for REDISTRIBUTE/REPLACE with the recommended 
changes and we can iterate from there. 

A few clarifications:

1) REPLACENODE: Do you want this to be a separate admin action, or a sub-action 
of REBALANCE? E.g. /solr/admin?action=REPLACENODE or 
/solr/admin/collections?action=REBALANCE&scaling_strategy=REPLACENODE? 

2) What is the default behavior of the replica replacement strategy? Does it 
pick unused nodes, or just use round robin to pick the next replica? We have 
"unused" and "least used" as 2 allocation strategies. I can fold them in as 
separate replica replacement strategies if we like. 

> Rebalance API for SolrCloud
> ---
>
> Key: SOLR-9241
> URL: https://issues.apache.org/jira/browse/SOLR-9241
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Affects Versions: 6.1
> Environment: Ubuntu, Mac OsX
>Reporter: Nitin Sharma
>  Labels: Cluster, SolrCloud
> Fix For: 6.1
>
> Attachments: Redistribute_After.jpeg, Redistribute_Before.jpeg, 
> Redistribute_call.jpeg, Replace_After.jpeg, Replace_Before.jpeg, 
> Replace_Call.jpeg, SOLR-9241-4.6.patch, SOLR-9241-6.1.patch
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> This is the v1 of the patch for Solrcloud Rebalance api (as described in 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/) , built at 
> Bloomreach by Nitin Sharma and Suruchi Shah. The goal of the API  is to 
> provide a zero downtime mechanism to perform data manipulation and  efficient 
> core allocation in solrcloud. This API was envisioned to be the base layer 
> that enables Solrcloud to be an auto scaling platform. (and work in unison 
> with other complementing monitoring and scaling features).
> Patch Status:
> ===
> The patch is a work in progress and incremental. We have done a few rounds of 
> code clean-up. We wanted to get the patch going first to get initial 
> feedback.  We will continue to work on making it more open-source friendly 
> and easily testable.
>  Deployment Status:
> 
> The platform is deployed in production at bloomreach and has been battle 
> tested for large scale load. (millions of documents and hundreds of 
> collections).
>  Internals:
> =
> The internals of the API and performance : 
> http://engineering.bloomreach.com/solrcloud-rebalance-api/
> It is built on top of the admin collections API as an action (with various 
> flavors). At a high level, the rebalance api provides 2 constructs:
> Scaling Strategy:  Decides how to move the data.  Every flavor has multiple 
> options which can be reviewed in the api spec.
> Re-distribute  - Move around data in the cluster based on capacity/allocation.
> Auto Shard  - Dynamically shard a collection to any size.
> Smart Merge - Distributed Mode - Helps merging data from a larger shard setup 
> into smaller one.  (the source should be divisible by destination)
> Scale up -  Add replicas on the fly
> Scale Down - Remove replicas on the fly
> Allocation Strategy:  Decides where to put the data.  (Nodes with least 
> cores, Nodes that do not have this collection etc). Custom implementations 
> can be built on top as well. One other example is Availability Zone aware. 
> Distribute data such that every replica is placed on different availability 
> zone to support HA.
>  Detailed API Spec:
> 
>   https://github.com/bloomreach/solrcloud-rebalance-api
>  Contributors:
> =
>   Nitin Sharma
>   Suruchi Shah
>  Questions/Comments:
> =
>   You can reach me at nitin.sha...@bloomreach.com



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7384) Remove ScoringWrapperSpans

2016-07-19 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384336#comment-15384336
 ] 

David Smiley commented on LUCENE-7384:
--

Before I commit its removal, [~romseygeek] do you recall why these queries 
needed/used the now-defunct ScoringWrapperSpans?

> Remove ScoringWrapperSpans
> --
>
> Key: LUCENE-7384
> URL: https://issues.apache.org/jira/browse/LUCENE-7384
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: David Smiley
> Attachments: LUCENE_7384.patch
>
>
> In LUCENE-6919 (Lucene 5.5), ScoringWrapperSpans was modified in such a way 
> that made the existence of this class pointless, and possibly broke anyone 
> who was using it, as its SimScorer argument isn't used anymore.  We should 
> now delete it.  SpanWeight has getSimScorer(), so people can customize the 
> SimScorer that way.
> Another small improvement would be to have SpanWeight.buildSimWeight's last 
> line use the existing Similarity that has already been populated for the 
> field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9319) DELETEREPLICA should accept just count and remove replicas intelligently

2016-07-19 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384324#comment-15384324
 ] 

Noble Paul commented on SOLR-9319:
--

The old functionality continues to be there. This kicks in when 'count' is 
specified as a parameter.

> DELETEREPLICA should accept just count and remove replicas intelligently
> 
>
> Key: SOLR-9319
> URL: https://issues.apache.org/jira/browse/SOLR-9319
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
> Fix For: 6.1
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9319) DELETEREPLICA should accept just count and remove replicas intelligently

2016-07-19 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384322#comment-15384322
 ] 

Noble Paul commented on SOLR-9319:
--

The command would take the following parameters:

* count: the number of replicas to be removed
* collection: (required) the collection name
* shard: (optional) if absent, 'count' replicas will be removed from each 
shard
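The per-shard behavior can be sketched in plain Java. This is a hypothetical helper, not Solr's implementation; in particular, the rule of always leaving one replica per shard is an assumption here, and which replicas count as the least needed is exactly the "intelligent" part the command has to decide:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeleteReplicaCountSketch {
    // For each shard, pick 'count' replicas to drop, but keep at least one
    // replica so the shard stays serviceable (an assumption of this sketch).
    static Map<String, List<String>> pickForDeletion(
            Map<String, List<String>> shardToReplicas, int count) {
        Map<String, List<String>> toDelete = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : shardToReplicas.entrySet()) {
            List<String> replicas = e.getValue();
            int n = Math.min(count, Math.max(0, replicas.size() - 1));
            toDelete.put(e.getKey(), new ArrayList<>(replicas.subList(0, n)));
        }
        return toDelete;
    }

    public static void main(String[] args) {
        Map<String, List<String>> cluster = new LinkedHashMap<>();
        cluster.put("shard1", List.of("r1", "r2", "r3"));
        cluster.put("shard2", List.of("r4", "r5"));
        // count=2: two dropped from shard1, only one from shard2 (keep one)
        System.out.println(pickForDeletion(cluster, 2));
        // {shard1=[r1, r2], shard2=[r4]}
    }
}
```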

> DELETEREPLICA should accept just count and remove replicas intelligently
> 
>
> Key: SOLR-9319
> URL: https://issues.apache.org/jira/browse/SOLR-9319
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
> Fix For: 6.1
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9319) DELETEREPLICA should accept just count and remove replicas intelligently

2016-07-19 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384301#comment-15384301
 ] 

Shai Erera commented on SOLR-9319:
--

What does "just count" mean? Will I no longer be able to delete a specific replica, or is this in addition to being able to delete a selected replica? I think having an API like "delete replicas such that only X remain" is fine, but I would also like to be able to specify which replica I want to delete (since in my case I need to control that).

> DELETEREPLICA should accept just count and remove replicas intelligently
> 
>
> Key: SOLR-9319
> URL: https://issues.apache.org/jira/browse/SOLR-9319
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
> Fix For: 6.1
>
>







[jira] [Created] (SOLR-9320) A REPLACENODE command to decommission an existing node with another new node

2016-07-19 Thread Noble Paul (JIRA)
Noble Paul created SOLR-9320:


 Summary: A REPLACENODE command to decommission an existing node 
with another new node
 Key: SOLR-9320
 URL: https://issues.apache.org/jira/browse/SOLR-9320
 Project: Solr
  Issue Type: Sub-task
Reporter: Noble Paul


The command should accept a source node and a target node, recreate the source node's replicas on the target node, and then run a DELETENODE on the source node.
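The proposed flow can be sketched over a simplified cluster-state map (collection → shard → list of node names); this is only an illustration of the two-phase proposal, not Solr's implementation:

```python
def replace_node_plan(cluster_state, source, target):
    """Sketch of the proposed REPLACENODE flow: for every replica hosted
    on the source node, add a copy on the target node, then delete the
    source's replicas (the DELETENODE step).
    """
    adds, deletes = [], []
    for collection, shards in cluster_state.items():
        for shard, nodes in shards.items():
            if source in nodes:
                adds.append(("ADDREPLICA", collection, shard, target))
                deletes.append(("DELETEREPLICA", collection, shard, source))
    # Additions run first so no shard drops below its replica count mid-move.
    return adds + deletes
```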






[jira] [Created] (SOLR-9319) DELETEREPLICA should accept just count and remove replicas intelligently

2016-07-19 Thread Noble Paul (JIRA)
Noble Paul created SOLR-9319:


 Summary: DELETEREPLICA should accept just count and remove replicas intelligently
 Key: SOLR-9319
 URL: https://issues.apache.org/jira/browse/SOLR-9319
 Project: Solr
  Issue Type: Sub-task
Reporter: Noble Paul









[jira] [Created] (SOLR-9318) A DELETENODE command that should delete all replicas in that node

2016-07-19 Thread Noble Paul (JIRA)
Noble Paul created SOLR-9318:


 Summary: A DELETENODE command that should delete all replicas in 
that node
 Key: SOLR-9318
 URL: https://issues.apache.org/jira/browse/SOLR-9318
 Project: Solr
  Issue Type: Sub-task
Reporter: Noble Paul


The command should look across all collections, find the replicas hosted on that node, and remove them.
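The lookup described above amounts to a walk over the cluster state; a minimal sketch, again over a simplified collection → shard → nodes map (not Solr's real data structures):

```python
def replicas_on_node(cluster_state, node):
    """Walk all collections and return the (collection, shard) pairs
    hosting a replica on the given node, i.e. what the proposed
    DELETENODE command would have to remove.
    """
    found = []
    for collection, shards in cluster_state.items():
        for shard, nodes in shards.items():
            if node in nodes:
                found.append((collection, shard))
    return found
```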






[jira] [Created] (SOLR-9317) ADDREPLICA command should be more flexible and add 'n' replicas to a collection,shard

2016-07-19 Thread Noble Paul (JIRA)
Noble Paul created SOLR-9317:


 Summary: ADDREPLICA command should be more flexible and add 'n' 
replicas to a collection,shard
 Key: SOLR-9317
 URL: https://issues.apache.org/jira/browse/SOLR-9317
 Project: Solr
  Issue Type: Sub-task
Reporter: Noble Paul


It should also automatically identify the nodes on which these replicas should be created.
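One plausible placement policy for the node selection mentioned above would be "least-loaded live nodes first"; the issue does not specify the actual policy, so the sketch below is purely hypothetical:

```python
def pick_nodes_for_replicas(node_load, n, exclude=()):
    """Choose 'n' target nodes for new replicas, preferring the nodes
    currently hosting the fewest replicas. node_load maps node name to
    its current replica count; 'exclude' lists ineligible nodes.
    """
    candidates = [node for node in node_load if node not in exclude]
    candidates.sort(key=lambda node: node_load[node])
    if n > len(candidates):
        raise ValueError("not enough eligible nodes for %d replicas" % n)
    return candidates[:n]
```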






[jira] [Created] (SOLR-9316) New Collection Action REDISTRIBUTE

2016-07-19 Thread Noble Paul (JIRA)
Noble Paul created SOLR-9316:


 Summary: New Collection Action REDISTRIBUTE 
 Key: SOLR-9316
 URL: https://issues.apache.org/jira/browse/SOLR-9316
 Project: Solr
  Issue Type: Sub-task
Reporter: Noble Paul


This would redistribute the replicas among the nodes so that the number of replicas on each node is more or less the same.
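The balancing goal can be sketched as repeatedly moving one replica from the most-loaded node to the least-loaded one until the counts differ by at most one (an illustration of the target state only, not a proposed implementation):

```python
def redistribute_moves(node_load):
    """Compute (from_node, to_node) moves that balance replica counts
    so no two nodes differ by more than one replica. node_load maps
    node name to its current replica count.
    """
    load = dict(node_load)
    moves = []
    while True:
        hi = max(load, key=load.get)
        lo = min(load, key=load.get)
        if load[hi] - load[lo] <= 1:
            return moves  # already balanced
        load[hi] -= 1
        load[lo] += 1
        moves.append((hi, lo))
```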






[jira] [Commented] (SOLR-7280) Load cores in sorted order and tweak coreLoadThread counts to improve cluster stability on restarts

2016-07-19 Thread ASF subversion and git services (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384263#comment-15384263 ]

ASF subversion and git services commented on SOLR-7280:
---

Commit 89a1fe661e7b73082d019543a83a7f511e74c9ca in lucene-solr's branch 
refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=89a1fe6 ]

SOLR-7280: refactored to incorporate Mike's suggestions. The default thread count for cloud mode is now limited to 8; in our internal testing, 8 has given us the best stability during restarts.


> Load cores in sorted order and tweak coreLoadThread counts to improve cluster 
> stability on restarts
> ---
>
> Key: SOLR-7280
> URL: https://issues.apache.org/jira/browse/SOLR-7280
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-7280-5x.patch, SOLR-7280.patch, SOLR-7280.patch
>
>
> In SOLR-7191, Damien mentioned that by loading solr cores in a sorted order 
> and tweaking some of the coreLoadThread counts, he was able to improve the 
> stability of a cluster with thousands of collections. We should explore some 
> of these changes and fold them into Solr.






[JENKINS] Lucene-Solr-6.x-Linux (64bit/jdk1.8.0_92) - Build # 1206 - Failure!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1206/
Java: 64bit/jdk1.8.0_92 -XX:+UseCompressedOops -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 9901 lines...]
[javac] Compiling 939 source files to 
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/solr/build/solr-core/classes/java
[javac] 
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/solr/core/src/java/org/apache/solr/core/CoreContainer.java:469:
 error: incompatible types: int cannot be converted to boolean
[javac] cfg.getCoreLoadThreadCount(isZooKeeperAware() ? 
DEFAULT_CORE_LOAD_THREADS_IN_CLOUD : DEFAULT_CORE_LOAD_THREADS),
[javac]   ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] Note: Some messages have been simplified; recompile with 
-Xdiags:verbose to get full output
[javac] 1 error

BUILD FAILED
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:763: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:707: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/build.xml:59: The following error 
occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/solr/build.xml:233: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/solr/common-build.xml:536: The 
following error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/solr/common-build.xml:484: The 
following error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/solr/common-build.xml:385: The 
following error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/lucene/common-build.xml:501: The 
following error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/lucene/common-build.xml:1955: 
Compile failed; see the compiler error output for details.

Total time: 20 minutes 11 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
[WARNINGS] Skipping publisher since build result is FAILURE
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[JENKINS] Lucene-Solr-6.x-Solaris (64bit/jdk1.8.0) - Build # 276 - Failure!

2016-07-19 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Solaris/276/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 9830 lines...]
[javac] Compiling 939 source files to 
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/build/solr-core/classes/java
[javac] 
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/core/src/java/org/apache/solr/core/CoreContainer.java:469:
 error: incompatible types: int cannot be converted to boolean
[javac] cfg.getCoreLoadThreadCount(isZooKeeperAware() ? 
DEFAULT_CORE_LOAD_THREADS_IN_CLOUD : DEFAULT_CORE_LOAD_THREADS),
[javac]   ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] Note: Some messages have been simplified; recompile with 
-Xdiags:verbose to get full output
[javac] 1 error

BUILD FAILED
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/build.xml:763: The 
following error occurred while executing this line:
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/build.xml:707: The 
following error occurred while executing this line:
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/build.xml:59: The 
following error occurred while executing this line:
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/build.xml:233: The 
following error occurred while executing this line:
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/common-build.xml:536:
 The following error occurred while executing this line:
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/common-build.xml:484:
 The following error occurred while executing this line:
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/solr/common-build.xml:385:
 The following error occurred while executing this line:
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/lucene/common-build.xml:501:
 The following error occurred while executing this line:
/export/home/jenkins/workspace/Lucene-Solr-6.x-Solaris/lucene/common-build.xml:1955:
 Compile failed; see the compiler error output for details.

Total time: 23 minutes 55 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
[WARNINGS] Skipping publisher since build result is FAILURE
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



