[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without deprecated enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899976#comment-16899976 ] Alexander S. commented on SOLR-6468: Just wanted to give a small update – we upgraded to Solr 8 over the weekend and search seems to be working well. [MappingCharFilterFactory|http://lucene.apache.org/core/4_8_1/analyzers-common/org/apache/lucene/analysis/charfilter/MappingCharFilterFactory.html] also works. [~steve_rowe], are there any known downsides of replacing the StopFilterFactory with the MappingCharFilterFactory? > Regression: StopFilterFactory doesn't work properly without deprecated > enablePositionIncrements="false" > --- > > Key: SOLR-6468 > URL: https://issues.apache.org/jira/browse/SOLR-6468 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.8.1, 4.9, 5.3.1, 6.6.2, 7.1 >Reporter: Alexander S. >Priority: Major > Attachments: FieldValue.png > > > Setup: > * Schema version is 1.5 > * Field config: > {code} > autoGeneratePhraseQueries="true"> > > > ignoreCase="true" /> > > > > {code} > * Stop words: > {code} > http > https > ftp > www > {code} > So very simple. In the index I have: > * twitter.com/testuser > All these queries do match: > * twitter.com/testuser > * com/testuser > * testuser > But none of these does: > * https://twitter.com/testuser > * https://www.twitter.com/testuser > * www.twitter.com/testuser > Debug output shows: > "parsedquery_toString": "+(url_words_ngram:\"? 
twitter com testuser\")" > But we need: > "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")" > Complete debug outputs: > * a valid search: > http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za > * an invalid search: > http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww > The complete discussion and explanation of the problem is here: > http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html > I didn't find a clear explanation how can we upgrade Solr, there's no any > replacement or a workarround to this, so this is not just a major change but > a major disrespect to all existing Solr users who are using this feature. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
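The char-filter replacement mentioned in the comment above can be sketched roughly as follows. The analyzer chain and the mapping file name are assumptions for illustration only; the reporter's actual schema snippet was lost in the email escaping:

```xml
<!-- Illustrative sketch only: strip URL prefixes with a char filter before
     tokenization instead of removing them as stop words. -->
<fieldType name="url_words_ngram" class="solr.TextField" autoGeneratePhraseQueries="true">
  <analyzer>
    <!-- url-prefixes.txt is an assumed file name. MappingCharFilter mapping
         files use one "source" => "target" rule per line, e.g.:
           "https://" => ""
           "http://"  => ""
           "ftp://"   => ""
           "www."     => ""  -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="url-prefixes.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

One tradeoff to keep in mind: a char filter rewrites the raw character stream, so the mappings fire anywhere in the text, not just at token boundaries. On the other hand it leaves no position hole behind, which is exactly why phrase queries match again.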
[jira] [Commented] (SOLR-13293) org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient - Error consuming and closing http response stream.
[ https://issues.apache.org/jira/browse/SOLR-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899459#comment-16899459 ] Alexander S. commented on SOLR-13293: - I just upgraded from Solr 5 to 8 and am also seeing these errors. > org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient - Error > consuming and closing http response stream. > - > > Key: SOLR-13293 > URL: https://issues.apache.org/jira/browse/SOLR-13293 > Project: Solr > Issue Type: Bug > Components: SolrJ >Affects Versions: 8.0 >Reporter: Karl Stoney >Priority: Minor > > Hi, > Testing out branch_8x, we're randomly seeing the following errors on a simple > 3 node cluster. It doesn't appear to affect replication (the cluster remains > green). > They come in bulk (literally 1000s at a time). > There were no network issues at the time. > {code:java} > 16:53:01.492 [updateExecutor-4-thread-34-processing-x:at-uk_shard1_replica_n1 > r:core_node3 null n:solr-2.search-solr.preprod.k8.atcloud.io:80_solr c:at-uk > s:shard1] ERROR > org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient - Error > consuming and closing http response stream. 
> java.nio.channels.AsynchronousCloseException: null > at > org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read(InputStreamResponseListener.java:316) > ~[jetty-client-9.4.14.v20181114.jar:9.4.14.v20181114] > at java.io.InputStream.read(InputStream.java:101) ~[?:1.8.0_191] > at > org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read(InputStreamResponseListener.java:287) > ~[jetty-client-9.4.14.v20181114.jar:9.4.14.v20181114] > at > org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient$Runner.sendUpdateStream(ConcurrentUpdateHttp2SolrClient.java:283) > ~[solr-solrj-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT > b14748e61fd147ea572f6545265b883fa69ed27f - root > - 2019-03-04 16:30:04] > at > org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient$Runner.run(ConcurrentUpdateHttp2SolrClient.java:176) > ~[solr-solrj-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT > b14748e61fd147ea572f6545265b883fa69ed27f - root - 2019-03-04 > 16:30:04] > at > com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) > ~[metrics-core-3.2.6.jar:3.2.6] > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) > ~[solr-solrj-8.1.0-SNAPSHOT.jar:8.1.0-SNAPSHOT > b14748e61fd147ea572f6545265b883fa69ed27f - root - 2019-03-04 16:30:04] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_191] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_191] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191] > {code}
[jira] [Commented] (SOLR-6769) Election bug
[ https://issues.apache.org/jira/browse/SOLR-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897936#comment-16897936 ] Alexander S. commented on SOLR-6769: Hi, unfortunately I can't test with the latest versions since we are tied to Solr 5. I tuned our caches and didn't see this error any more, so let's close this for now. > Election bug > > > Key: SOLR-6769 > URL: https://issues.apache.org/jira/browse/SOLR-6769 > Project: Solr > Issue Type: Bug >Reporter: Alexander S. >Priority: Major > Attachments: Screenshot 876.png > > > Hello, I have a very simple set up: 2 shards and 2 replicas (4 nodes in > total). > What I did was just stop the shards, but while the first shard stopped immediately, > the second one took about 5 minutes to stop. You can see on the screenshot what > happened next. In short: > 1. Shard 1 stopped normally > 2. Replica 1 became a leader > 3. Shard 2 was still performing some job but wasn't accepting connections > 4. Replica 2 did not become a leader because Shard 2 was still there but > didn't work > 5. The entire cluster went down until Shard 2 stopped and Replica 2 became a > leader > Marked as critical because this shuts down the entire cluster. Please adjust > if I am wrong.
[jira] [Updated] (SOLR-12363) Duplicates with random search, cursors, and fixed seed
[ https://issues.apache.org/jira/browse/SOLR-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-12363: Description: We do have a SolrCloud cluster and just updated one of our views to use cursors with the random order. Our goal was to use an infinite scroll with the random ordering so we can shuffle results once every 24 hours. To do so we save the seed that we use in our random order to the cookies with the 24 hours expiration period, which didn't work as expected: # Results are shuffled with every request (every time we pass the initial cursor value "*" and the same random value for ordering we already used). # Results contain duplicates sometimes. Not a lot of them, but from time to time they appear. In our *schema.xml* we have: {code:java} {code} In our search requests, we order by *random_123 asc, id asc*, where *123* is the seed from cookies. Here is the page [https://awards.wegohealth.com/nominees] Even when I try to get the "next page" URL from google chrome developer console and open it in separate tabs it yields different results: [https://awards.wegohealth.com/nominees?cursor=AoJYmYbyATRBd2FyZDo6Tm9taW5lZSAxMzI0Mg%3D%3D] So it feels like the seed parameter we use is ignored or every shard understands it differently, not sure. On the screenshots, you can see the URL is the same and results are different. was: We do have a SolrCloud cluster and just updated one of our views to use cursors with the random order. Our goal was to use an infinite scroll with the random ordering so we can shuffle results once every 24 hours. To do so we save the seed that we use in our random order to the cookies with the 24 hours expiration period, which didn't work as expected: # Results are shuffled with every request (every time we pass the initial cursor value "*" and the same random value for ordering we already used). # Results contain duplicates sometimes. Not a lot of them, but from time to time they appear. 
In our *schema.xml* we have: {code:java} {code} In our search requests, we order by *random_123 asc, id asc*, where *123* is the seed from cookies. Here is the page [https://awards.wegohealth.com/nominees] -Even when I try to get the "next page" URL from google chrome developer console and open it in separate tabs it yields different results: [https://awards.wegohealth.com/nominees?cursor=AoJYmYbyATRBd2FyZDo6Tm9taW5lZSAxMzI0Mg%3D%3D]- So it feels like the seed parameter we use is ignored or every shard understands it differently, not sure. On the screenshots, you can see the URL is the same and results are different. > Duplicates with random search, cursors, and fixed seed > -- > > Key: SOLR-12363 > URL: https://issues.apache.org/jira/browse/SOLR-12363 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 5.3.1 >Reporter: Alexander S. >Priority: Major > Attachments: Screen shot 2018-05-16 at 14.51.19.png, Screen shot > 2018-05-16 at 14.51.23.png, Screen shot 2018-05-16 at 14.51.26.png > > > We do have a SolrCloud cluster and just updated one of our views to use > cursors with the random order. Our goal was to use an infinite scroll with > the random ordering so we can shuffle results once every 24 hours. > To do so we save the seed that we use in our random order to the cookies with > the 24 hours expiration period, which didn't work as expected: > # Results are shuffled with every request (every time we pass the initial > cursor value "*" and the same random value for ordering we already used). > # Results contain duplicates sometimes. Not a lot of them, but from time to > time they appear. > In our *schema.xml* we have: > {code:java} > > indexed="true"/>{code} > In our search requests, we order by *random_123 asc, id asc*, where *123* is > the seed from cookies. 
> Here is the page [https://awards.wegohealth.com/nominees] > Even when I try to get the "next page" URL from google chrome developer > console and open it in separate tabs it yields different results: > [https://awards.wegohealth.com/nominees?cursor=AoJYmYbyATRBd2FyZDo6Tm9taW5lZSAxMzI0Mg%3D%3D] > So it feels like the seed parameter we use is ignored or every shard > understands it differently, not sure. > On the screenshots, you can see the URL is the same and results are different. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-12363) Duplicates with random search, cursors, and fixed seed
[ https://issues.apache.org/jira/browse/SOLR-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-12363: Description: We do have a SolrCloud cluster and just updated one of our views to use cursors with the random order. Our goal was to use an infinite scroll with the random ordering so we can shuffle results once every 24 hours. To do so we save the seed that we use in our random order to the cookies with the 24 hours expiration period, which didn't work as expected: # Results are shuffled with every request (every time we pass the initial cursor value "*" and the same random value for ordering we already used). # Results contain duplicates sometimes. Not a lot of them, but from time to time they appear. In our *schema.xml* we have: {code:java} {code} In our search requests, we order by *random_123 asc, id asc*, where *123* is the seed from cookies. Here is the page [https://awards.wegohealth.com/nominees] -Even when I try to get the "next page" URL from google chrome developer console and open it in separate tabs it yields different results: [https://awards.wegohealth.com/nominees?cursor=AoJYmYbyATRBd2FyZDo6Tm9taW5lZSAxMzI0Mg%3D%3D]- So it feels like the seed parameter we use is ignored or every shard understands it differently, not sure. On the screenshots, you can see the URL is the same and results are different. was: We do have a SolrCloud cluster and just updated one of our views to use cursors with the random order. Our goal was to use an infinite scroll with the random ordering so we can shuffle results once every 24 hours. To do so we save the seed that we use in our random order to the cookies with the 24 hours expiration period, which didn't work as expected: # Results are shuffled with every request (every time we pass the initial cursor value "*" and the same random value for ordering we already used). # Results contain duplicates sometimes. Not a lot of them, but from time to time they appear. 
In our *schema.xml* we have: {code:java} {code} In our search requests, we order by *random_123 asc, id asc*, where *123* is the seed from cookies. Here is the page [https://awards.wegohealth.com/nominees] Even when I try to get the "next page" URL from google chrome developer console and open it in separate tabs it yields different results: [https://awards.wegohealth.com/nominees?cursor=AoJYmYbyATRBd2FyZDo6Tm9taW5lZSAxMzI0Mg%3D%3D] So it feels like the seed parameter we use is ignored or every shard understands it differently, not sure. On the screenshots, you can see the URL is the same and results are different. > Duplicates with random search, cursors, and fixed seed > -- > > Key: SOLR-12363 > URL: https://issues.apache.org/jira/browse/SOLR-12363 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 5.3.1 >Reporter: Alexander S. >Priority: Major > Attachments: Screen shot 2018-05-16 at 14.51.19.png, Screen shot > 2018-05-16 at 14.51.23.png, Screen shot 2018-05-16 at 14.51.26.png > > > We do have a SolrCloud cluster and just updated one of our views to use > cursors with the random order. Our goal was to use an infinite scroll with > the random ordering so we can shuffle results once every 24 hours. > To do so we save the seed that we use in our random order to the cookies with > the 24 hours expiration period, which didn't work as expected: > # Results are shuffled with every request (every time we pass the initial > cursor value "*" and the same random value for ordering we already used). > # Results contain duplicates sometimes. Not a lot of them, but from time to > time they appear. > In our *schema.xml* we have: > {code:java} > > indexed="true"/>{code} > In our search requests, we order by *random_123 asc, id asc*, where *123* is > the seed from cookies. 
> Here is the page [https://awards.wegohealth.com/nominees] > -Even when I try to get the "next page" URL from google chrome developer > console and open it in separate tabs it yields different results: > [https://awards.wegohealth.com/nominees?cursor=AoJYmYbyATRBd2FyZDo6Tm9taW5lZSAxMzI0Mg%3D%3D]- > So it feels like the seed parameter we use is ignored or every shard > understands it differently, not sure. > On the screenshots, you can see the URL is the same and results are different.
[jira] [Created] (SOLR-12363) Duplicates with random search, cursors, and fixed seed
Alexander S. created SOLR-12363: --- Summary: Duplicates with random search, cursors, and fixed seed Key: SOLR-12363 URL: https://issues.apache.org/jira/browse/SOLR-12363 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Affects Versions: 5.3.1 Reporter: Alexander S. Attachments: Screen shot 2018-05-16 at 14.51.19.png, Screen shot 2018-05-16 at 14.51.23.png, Screen shot 2018-05-16 at 14.51.26.png We do have a SolrCloud cluster and just updated one of our views to use cursors with the random order. Our goal was to use an infinite scroll with the random ordering so we can shuffle results once every 24 hours. To do so we save the seed that we use in our random order to the cookies with the 24 hours expiration period, which didn't work as expected: # Results are shuffled with every request (every time we pass the initial cursor value "*" and the same random value for ordering we already used). # Results contain duplicates sometimes. Not a lot of them, but from time to time they appear. In our *schema.xml* we have: {code:java} {code} In our search requests, we order by *random_123 asc, id asc*, where *123* is the seed from cookies. Here is the page [https://awards.wegohealth.com/nominees] Even when I try to get the "next page" URL from google chrome developer console and open it in separate tabs it yields different results: [https://awards.wegohealth.com/nominees?cursor=AoJYmYbyATRBd2FyZDo6Tm9taW5lZSAxMzI0Mg%3D%3D] So it feels like the seed parameter we use is ignored or every shard understands it differently, not sure. On the screenshots, you can see the URL is the same and results are different.
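One plausible explanation for both symptoms, noted here for future readers: solr.RandomSortField does not take a seed parameter at query time. Its ordering is derived from a hash of the dynamic field name combined with the version of the index being searched, so any commit between two requests, and any version skew between replicas of the same shard, reshuffles the order even though the `random_123` name stays fixed. A sketch of the schema setup being described (the attributes are assumptions, since the report's actual schema.xml snippet was lost in the email escaping):

```xml
<!-- Sketch only: the usual RandomSortField declaration. -->
<fieldType name="random" class="solr.RandomSortField" indexed="true"/>
<dynamicField name="random_*" type="random" indexed="true"/>
<!-- Query side (illustrative): cursorMark=*&sort=random_123 asc, id asc -->
```

If that is the cause, a stable 24-hour shuffle needs a sort key that is a pure function of the document and the seed, for example a hash of the uniqueKey computed at index time.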
[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without deprecated enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363744#comment-16363744 ] Alexander S. commented on SOLR-6468: I think so; Solr and Lucene versions are different things. Solr 5.3.1 still supports Lucene version 4.3, but newer versions of Solr probably don't. I am not absolutely sure which Solr version dropped support for this, just saying that we're on Solr 5.3.1 and it is working; it didn't work in Solr 6 for sure (we tried it) and, if I am not mistaken, it didn't work in Solr 5.5 either. > Regression: StopFilterFactory doesn't work properly without deprecated > enablePositionIncrements="false" > --- > > Key: SOLR-6468 > URL: https://issues.apache.org/jira/browse/SOLR-6468 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.8.1, 4.9, 5.3.1, 7.1, 6.6.2 >Reporter: Alexander S. >Priority: Major > Attachments: FieldValue.png > > > Setup: > * Schema version is 1.5 > * Field config: > {code} > autoGeneratePhraseQueries="true"> > > > ignoreCase="true" /> > > > > {code} > * Stop words: > {code} > http > https > ftp > www > {code} > So very simple. In the index I have: > * twitter.com/testuser > All these queries do match: > * twitter.com/testuser > * com/testuser > * testuser > But none of these does: > * https://twitter.com/testuser > * https://www.twitter.com/testuser > * www.twitter.com/testuser > Debug output shows: > "parsedquery_toString": "+(url_words_ngram:\"? 
twitter com testuser\")" > But we need: > "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")" > Complete debug outputs: > * a valid search: > http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za > * an invalid search: > http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww > The complete discussion and explanation of the problem is here: > http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html > I didn't find a clear explanation how can we upgrade Solr, there's no any > replacement or a workarround to this, so this is not just a major change but > a major disrespect to all existing Solr users who are using this feature.
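For context on the versions being discussed: `enablePositionIncrements="false"` was deprecated in Lucene 4.4 and removed in 5.0, so it is only accepted when `luceneMatchVersion` is 4.3 or earlier, regardless of the Solr release. A sketch of the configuration this issue is about (stopwords.txt is an assumed file name):

```xml
<!-- Only parses with luceneMatchVersion <= 4.3. -->
<filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"
        enablePositionIncrements="false"/>
```

Without the flag, StopFilterFactory always leaves a position gap where a stop word was removed, which is what shows up as the `?` placeholder in the parsed phrase query above and why the exact-phrase match fails.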
[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without deprecated enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363583#comment-16363583 ] Alexander S. commented on SOLR-6468: Hey, we're on 5.3.1 because of this. AFAIK this doesn't work on newer versions. > Regression: StopFilterFactory doesn't work properly without deprecated > enablePositionIncrements="false" > --- > > Key: SOLR-6468 > URL: https://issues.apache.org/jira/browse/SOLR-6468 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.8.1, 4.9, 5.3.1, 7.1, 6.6.2 >Reporter: Alexander S. >Priority: Major > Attachments: FieldValue.png > > > Setup: > * Schema version is 1.5 > * Field config: > {code} > autoGeneratePhraseQueries="true"> > > > ignoreCase="true" /> > > > > {code} > * Stop words: > {code} > http > https > ftp > www > {code} > So very simple. In the index I have: > * twitter.com/testuser > All these queries do match: > * twitter.com/testuser > * com/testuser > * testuser > But none of these does: > * https://twitter.com/testuser > * https://www.twitter.com/testuser > * www.twitter.com/testuser > Debug output shows: > "parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")" > But we need: > "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")" > Complete debug outputs: > * a valid search: > http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za > * an invalid search: > http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww > The complete discussion and explanation of the problem is here: > http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html > I didn't find a clear explanation how can we upgrade Solr, there's no any > replacement or a workarround to this, so this is not just a major change but > a major disrespect to all existing Solr users who are using this feature. 
[jira] [Updated] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without deprecated enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-6468: --- Affects Version/s: 7.1 6.6.2 > Regression: StopFilterFactory doesn't work properly without deprecated > enablePositionIncrements="false" > --- > > Key: SOLR-6468 > URL: https://issues.apache.org/jira/browse/SOLR-6468 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.8.1, 4.9, 5.3.1, 7.1, 6.6.2 >Reporter: Alexander S. >Priority: Major > Attachments: FieldValue.png > > > Setup: > * Schema version is 1.5 > * Field config: > {code} > autoGeneratePhraseQueries="true"> > > > ignoreCase="true" /> > > > > {code} > * Stop words: > {code} > http > https > ftp > www > {code} > So very simple. In the index I have: > * twitter.com/testuser > All these queries do match: > * twitter.com/testuser > * com/testuser > * testuser > But none of these does: > * https://twitter.com/testuser > * https://www.twitter.com/testuser > * www.twitter.com/testuser > Debug output shows: > "parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")" > But we need: > "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")" > Complete debug outputs: > * a valid search: > http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za > * an invalid search: > http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww > The complete discussion and explanation of the problem is here: > http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html > I didn't find a clear explanation how can we upgrade Solr, there's no any > replacement or a workarround to this, so this is not just a major change but > a major disrespect to all existing Solr users who are using this feature.
[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without deprecated enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359924#comment-16359924 ] Alexander S. commented on SOLR-6468: Wondering how we can bring attention to this problem? > Regression: StopFilterFactory doesn't work properly without deprecated > enablePositionIncrements="false" > --- > > Key: SOLR-6468 > URL: https://issues.apache.org/jira/browse/SOLR-6468 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.8.1, 4.9, 5.3.1 >Reporter: Alexander S. >Priority: Major > Attachments: FieldValue.png > > > Setup: > * Schema version is 1.5 > * Field config: > {code} > autoGeneratePhraseQueries="true"> > > > ignoreCase="true" /> > > > > {code} > * Stop words: > {code} > http > https > ftp > www > {code} > So very simple. In the index I have: > * twitter.com/testuser > All these queries do match: > * twitter.com/testuser > * com/testuser > * testuser > But none of these does: > * https://twitter.com/testuser > * https://www.twitter.com/testuser > * www.twitter.com/testuser > Debug output shows: > "parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")" > But we need: > "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")" > Complete debug outputs: > * a valid search: > http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za > * an invalid search: > http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww > The complete discussion and explanation of the problem is here: > http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html > I didn't find a clear explanation how can we upgrade Solr, there's no any > replacement or a workarround to this, so this is not just a major change but > a major disrespect to all existing Solr users who are using this feature. 
[jira] [Updated] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without deprecated enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-6468: --- Summary: Regression: StopFilterFactory doesn't work properly without deprecated enablePositionIncrements="false" (was: Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false") > Regression: StopFilterFactory doesn't work properly without deprecated > enablePositionIncrements="false" > --- > > Key: SOLR-6468 > URL: https://issues.apache.org/jira/browse/SOLR-6468 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.8.1, 4.9, 5.3.1 >Reporter: Alexander S. >Priority: Major > Attachments: FieldValue.png > > > Setup: > * Schema version is 1.5 > * Field config: > {code} > autoGeneratePhraseQueries="true"> > > > ignoreCase="true" /> > > > > {code} > * Stop words: > {code} > http > https > ftp > www > {code} > So very simple. In the index I have: > * twitter.com/testuser > All these queries do match: > * twitter.com/testuser > * com/testuser > * testuser > But none of these does: > * https://twitter.com/testuser > * https://www.twitter.com/testuser > * www.twitter.com/testuser > Debug output shows: > "parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")" > But we need: > "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")" > Complete debug outputs: > * a valid search: > http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za > * an invalid search: > http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww > The complete discussion and explanation of the problem is here: > http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html > I didn't find a clear explanation how can we upgrade Solr, there's no any > replacement or a workarround to this, so this is not just a major change but > a major disrespect to all existing Solr users who are using this feature. 
[jira] [Updated] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-6468: --- Affects Version/s: 5.3.1 > Regression: StopFilterFactory doesn't work properly without > enablePositionIncrements="false" > > > Key: SOLR-6468 > URL: https://issues.apache.org/jira/browse/SOLR-6468 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.8.1, 4.9, 5.3.1 >Reporter: Alexander S. >Priority: Major > Attachments: FieldValue.png > > > Setup: > * Schema version is 1.5 > * Field config: > {code} > autoGeneratePhraseQueries="true"> > > > ignoreCase="true" /> > > > > {code} > * Stop words: > {code} > http > https > ftp > www > {code} > So very simple. In the index I have: > * twitter.com/testuser > All these queries do match: > * twitter.com/testuser > * com/testuser > * testuser > But none of these does: > * https://twitter.com/testuser > * https://www.twitter.com/testuser > * www.twitter.com/testuser > Debug output shows: > "parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")" > But we need: > "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")" > Complete debug outputs: > * a valid search: > http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za > * an invalid search: > http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww > The complete discussion and explanation of the problem is here: > http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html > I didn't find a clear explanation how can we upgrade Solr, there's no any > replacement or a workarround to this, so this is not just a major change but > a major disrespect to all existing Solr users who are using this feature.
[jira] [Commented] (SOLR-11939) Collection API: property.name ignored when creating collections
[ https://issues.apache.org/jira/browse/SOLR-11939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351708#comment-16351708 ] Alexander S. commented on SOLR-11939: - Hi Varun, I am referring to [https://lucene.apache.org/solr/guide/6_6/collections-api.html] |property._name_=_value_|string|No|Set core property _name_ to _value_. See the section [Defining core.properties|https://lucene.apache.org/solr/guide/6_6/defining-core-properties.html#defining-core-properties] for details on supported properties and values.| All shards and replicas are created on separate Solr instances, so a single name for all cores would work in this case. Well, I started working on core names mostly because the web UI (at least in 5.3.1) doesn't work with collections, so I wasn't aware that query requests would also work with collection names. The core name doesn't matter that much then, and we're fine with generic core names. It would be good to mention this in the docs somewhere. Best, Alexander S. > Collection API: property.name ignored when creating collections > --- > > Key: SOLR-11939 > URL: https://issues.apache.org/jira/browse/SOLR-11939 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 5.3.1 >Reporter: Alexander S. 
>Assignee: Varun Thacker >Priority: Major > > Trying to create a collection this way: > {code:java} > /solr/admin/collections?wt=json&action=CREATE&name=carmen-test&replicationFactor=1&numShards=4&shards=shard1,shard2,shard3,shard4&collection.configName=carmen&router.name=compositeId&property.name=carmen_test{code} > This appears in the log: > {code:java} > OverseerCollectionProcessor.processMessage : create , { > "name":"carmen-test", > "fromApi":"true", > "replicationFactor":"1", > "collection.configName":"carmen", > "numShards":"4", > "shards":"shard1,shard2,shard3,shard4", > "stateFormat":"2", > "property.name":"carmen_test", > "router.name":"compositeId", > "operation":"create"}{code} > But the resulting core name is *carmen-test_shard1_replica1* matching > "collection name" + shard name + replica number. > How can I set a custom core name when creating a collection? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-11939) Collection API: property.name ignored when creating collections
[ https://issues.apache.org/jira/browse/SOLR-11939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350473#comment-16350473 ] Alexander S. edited comment on SOLR-11939 at 2/2/18 3:07 PM: - Found this discussion [http://lucene.472066.n3.nabble.com/Core-property-name-ignored-when-creating-collection-using-API-td4183405.html] It seems I don't have to worry about the core name, as Solr is moving towards collections. UPD. But this is still a discrepancy between the docs and the API. I spent an hour figuring this out, patching a Chef cookbook to add these properties, only to discover that it doesn't work as described in the docs. was (Author: aheaven): Found this discussion [http://lucene.472066.n3.nabble.com/Core-property-name-ignored-when-creating-collection-using-API-td4183405.html] It seems I don't have to worry about the core name, as Solr is moving towards collections. > Collection API: property.name ignored when creating collections > --- > > Key: SOLR-11939 > URL: https://issues.apache.org/jira/browse/SOLR-11939 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 5.3.1 >Reporter: Alexander S. >Priority: Major > > Trying to create a collection this way: > {code:java} > /solr/admin/collections?wt=json&action=CREATE&name=carmen-test&replicationFactor=1&numShards=4&shards=shard1,shard2,shard3,shard4&collection.configName=carmen&router.name=compositeId&property.name=carmen_test{code} > This appears in the log: > {code:java} > OverseerCollectionProcessor.processMessage : create , { > "name":"carmen-test", > "fromApi":"true", > "replicationFactor":"1", > "collection.configName":"carmen", > "numShards":"4", > "shards":"shard1,shard2,shard3,shard4", > "stateFormat":"2", > "property.name":"carmen_test", > "router.name":"compositeId", > "operation":"create"}{code} > But the resulting core name is *carmen-test_shard1_replica1* matching > "collection name" + shard name + replica number. > How can I set a custom core name when creating a collection? 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11939) Collection API: property.name ignored when creating collections
[ https://issues.apache.org/jira/browse/SOLR-11939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350473#comment-16350473 ] Alexander S. commented on SOLR-11939: - Found this discussion [http://lucene.472066.n3.nabble.com/Core-property-name-ignored-when-creating-collection-using-API-td4183405.html] It seems I don't have to worry about the core name, as Solr is moving towards collections. > Collection API: property.name ignored when creating collections > --- > > Key: SOLR-11939 > URL: https://issues.apache.org/jira/browse/SOLR-11939 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 5.3.1 >Reporter: Alexander S. >Priority: Major > > Trying to create a collection this way: > {code:java} > /solr/admin/collections?wt=json&action=CREATE&name=carmen-test&replicationFactor=1&numShards=4&shards=shard1,shard2,shard3,shard4&collection.configName=carmen&router.name=compositeId&property.name=carmen_test{code} > This appears in the log: > {code:java} > OverseerCollectionProcessor.processMessage : create , { > "name":"carmen-test", > "fromApi":"true", > "replicationFactor":"1", > "collection.configName":"carmen", > "numShards":"4", > "shards":"shard1,shard2,shard3,shard4", > "stateFormat":"2", > "property.name":"carmen_test", > "router.name":"compositeId", > "operation":"create"}{code} > But the resulting core name is *carmen-test_shard1_replica1* matching > "collection name" + shard name + replica number. > How can I set a custom core name when creating a collection? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-11939) Collection API: property.name ignored when creating collections
Alexander S. created SOLR-11939: --- Summary: Collection API: property.name ignored when creating collections Key: SOLR-11939 URL: https://issues.apache.org/jira/browse/SOLR-11939 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Affects Versions: 5.3.1 Reporter: Alexander S. Trying to create a collection this way: {code:java} /solr/admin/collections?wt=json&action=CREATE&name=carmen-test&replicationFactor=1&numShards=4&shards=shard1,shard2,shard3,shard4&collection.configName=carmen&router.name=compositeId&property.name=carmen_test{code} This appears in the log: {code:java} OverseerCollectionProcessor.processMessage : create , { "name":"carmen-test", "fromApi":"true", "replicationFactor":"1", "collection.configName":"carmen", "numShards":"4", "shards":"shard1,shard2,shard3,shard4", "stateFormat":"2", "property.name":"carmen_test", "router.name":"compositeId", "operation":"create"}{code} But the resulting core name is *carmen-test_shard1_replica1* matching "collection name" + shard name + replica number. How can I set a custom core name when creating a collection? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
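The create request above can be assembled programmatically from the same parameters that show up in the logged Overseer message. This sketch (hypothetical helper functions, not Solr client code) also captures the default core-naming scheme the reporter observed when `property.name` is ignored:

```python
from urllib.parse import urlencode

def create_collection_url(base="/solr/admin/collections", **params):
    """Build a Collections API CREATE request URL from keyword params."""
    query = {"action": "CREATE", "wt": "json", **params}
    return base + "?" + urlencode(query)

def default_core_name(collection, shard, replica):
    """Core name Solr generates when property.name is not honored:
    <collection>_shard<N>_replica<M>."""
    return f"{collection}_shard{shard}_replica{replica}"

# parameters taken verbatim from the logged create message;
# dotted names can't be Python keywords, hence the dict splat
url = create_collection_url(
    name="carmen-test",
    replicationFactor=1,
    numShards=4,
    shards="shard1,shard2,shard3,shard4",
    **{"collection.configName": "carmen",
       "router.name": "compositeId",
       "property.name": "carmen_test"},
)
```

Note that `property.name` travels with the request and still appears in the Overseer log; the bug is that the resulting cores are named by `default_core_name` anyway.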
[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541697#comment-15541697 ] Alexander S. commented on SOLR-6468: We now can't upgrade to Solr 6 because of this. > Regression: StopFilterFactory doesn't work properly without > enablePositionIncrements="false" > > > Key: SOLR-6468 > URL: https://issues.apache.org/jira/browse/SOLR-6468 > Project: Solr > Issue Type: Bug >Affects Versions: 4.8.1, 4.9 >Reporter: Alexander S. > > Setup: > * Schema version is 1.5 > * Field config: > {code} > autoGeneratePhraseQueries="true"> > > > ignoreCase="true" /> > > > > {code} > * Stop words: > {code} > http > https > ftp > www > {code} > So very simple. In the index I have: > * twitter.com/testuser > All of these queries match: > * twitter.com/testuser > * com/testuser > * testuser > But none of these do: > * https://twitter.com/testuser > * https://www.twitter.com/testuser > * www.twitter.com/testuser > Debug output shows: > "parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")" > But we need: > "parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")" > Complete debug outputs: > * a valid search: > http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za > * an invalid search: > http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww > The complete discussion and explanation of the problem is here: > http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html > I didn't find a clear explanation of how we can upgrade Solr, and there's no replacement or workaround for this, so this is not just a major change but a major disrespect to all existing Solr users who rely on this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
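One workaround that avoids position holes entirely is to strip the protocol/host prefixes at the character level before tokenization (the approach a mapping char filter such as Lucene's MappingCharFilterFactory takes), rather than removing them as stop words after tokenization. A rough sketch of the idea (illustrative only, not the MappingCharFilterFactory implementation; the naive substring replace shown here would also hit those strings inside longer words, which real mapping rules handle more carefully):

```python
import re

# character-level replacements applied before tokenization; "https"
# is listed before "http" so the longer string wins
MAPPINGS = [("https", ""), ("http", ""), ("ftp", ""), ("www", "")]

def char_filter(text):
    """Strip the unwanted prefixes from the raw character stream."""
    for src, dst in MAPPINGS:
        text = text.replace(src, dst)
    return text

def tokenize(text):
    """Simplified word tokenizer: split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]
```

Because the unwanted text is gone before the tokenizer assigns positions, `https://www.twitter.com/testuser` and `twitter.com/testuser` analyze to identical token streams with identical positions, so phrase queries match again.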
[jira] [Commented] (SOLR-3274) ZooKeeper related SolrCloud problems
[ https://issues.apache.org/jira/browse/SOLR-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738200#comment-14738200 ] Alexander S. commented on SOLR-3274: Hi, just wanted to let you know that adding 2 new ZK servers (so I have 5 running ZK instances) improved the situation a lot. But I found one weird thing with the ZK: {code} java.net.UnknownHostException: zoo5.devops at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762) 2015-09-10 01:13:21,235 - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address zoo2.devops:3888 java.net.UnknownHostException: zoo2.devops at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762) 2015-09-10 01:13:21,235 - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 1 at election address zoo1.devops:3888 java.net.UnknownHostException: zoo1.devops at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178) at 
java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762) 2015-09-10 01:13:21,236 - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 4 at election address zoo4.devops:3888 java.net.UnknownHostException: zoo4.devops at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762) {code} I just opened 2 ssh sessions to that server and was monitoring the log with tail. While ZK was logging these errors I was able to ping the zoo1/2/4/5.devops servers and to connect to ZK there with telnet, so it seems something went wrong inside ZK itself. At the same time I saw these "cannot talk to ZK" errors in Solr. Eventually I just restarted the broken ZK instance and everything was fine again. So I guess Solr tried to connect to this particular broken ZK instance (I can't say for sure, since it doesn't mention the instance it failed to connect to in its log). 
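Growing the ensemble from 3 to 5 ZooKeeper servers helps because ZooKeeper needs a strict majority of nodes alive to elect a leader and serve requests. The arithmetic is standard quorum math, not Solr-specific code, and can be sketched as:

```python
def quorum_size(ensemble):
    """Minimum number of live nodes for a working ZooKeeper ensemble
    (a strict majority)."""
    return ensemble // 2 + 1

def tolerated_failures(ensemble):
    """How many nodes can fail while the ensemble stays available."""
    return ensemble - quorum_size(ensemble)
```

A 3-node ensemble tolerates only one failure, while 5 nodes tolerate two, which is consistent with the observation above that a single flaky instance (like the one that had to be restarted) no longer takes the cluster down.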
> ZooKeeper related SolrCloud problems > > > Key: SOLR-3274 > URL: https://issues.apache.org/jira/browse/SOLR-3274 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.0-ALPHA > Environment: Any >Reporter: Per Steffensen > > Same setup as in SOLR-3273. Well if I have to tell the entire truth we have 7 > Solr servers, running 28 slices of the same collection (collA) - all slices > have one replica (two shards all in all - leader + replica) - 56 cores all in > all (8 shards on each solr instance). But anyways... > Besides the problem reported in SOLR-3273, the system seems to run fine under > high load for several hours, but eventually errors like the ones shown below > start to occur. I might be wrong, but they all seem to indicate some kind of > unstability in the
[jira] [Comment Edited] (SOLR-3274) ZooKeeper related SolrCloud problems
[ https://issues.apache.org/jira/browse/SOLR-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738200#comment-14738200 ] Alexander S. edited comment on SOLR-3274 at 9/10/15 5:43 AM: - Hi, just wanted to let you know that adding 2 new ZK servers (so I have 5 running ZK instances) improved the situation a lot. But I found one weird thing with the ZK: {code} java.net.UnknownHostException: zoo5.devops at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762) 2015-09-10 01:13:21,235 - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 2 at election address zoo2.devops:3888 java.net.UnknownHostException: zoo2.devops at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762) 2015-09-10 01:13:21,235 - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 1 at election address zoo1.devops:3888 java.net.UnknownHostException: zoo1.devops at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178) at 
java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762) 2015-09-10 01:13:21,236 - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 4 at election address zoo4.devops:3888 java.net.UnknownHostException: zoo4.devops at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762) {code} I just opened 2 ssh sessions to that server and was monitoring the log with tail. While ZK was logging these errors I was able to ping the zoo1/2/4/5.devops servers and to connect to ZK there with telnet, so it seems something went wrong inside ZK itself. At the same time I saw these "cannot talk to ZK" errors in Solr. Eventually I just restarted the broken ZK instance and everything was fine again. So I guess Solr tried to connect to this particular broken ZK instance (I can't say for sure, since it doesn't mention the instance it failed to connect to in its log). 
UPD: but still often see these errors in ZK logs: {code} 2015-09-10 01:31:28,804 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.128.202.22:35990 2015-09-10 01:31:28,847 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang.Thread.run(Thread.java:744) 2015-09-10 01:31:28,847 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.128.202.22:35990 (no session established for client)
[jira] [Comment Edited] (SOLR-6875) No data integrity between replicas
[ https://issues.apache.org/jira/browse/SOLR-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14582960#comment-14582960 ] Alexander S. edited comment on SOLR-6875 at 6/12/15 5:24 AM: - Got another error today on 4 shards set up, each has 2 replicas (8 nodes in total). On the shard 4/replica 1 I see the next error: [^replica1.png] On the shard 4/replica 2 the next: [^replica2.png] Here's the backtrace for the error on the first screenshot: {code} java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283) at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251) at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486) at 
org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57) at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} After all this replica 1 shows: {quote} numDocs: 28 215 608 {quote} And replica 2 shows: {quote} numDocs: 28 215 609 {quote} Everything worked well for a few months until yesterday, when we started to reindex some data (about 1.7m records). Our Solr setup uses large pages and there are enough resources. Here's how we run the instances: {code} exec chpst -u solr java -Xms6G -Xmx8G -XX:+UseConcMarkSweepGC -XX:+UseLargePages -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts -XX:CMSInitiatingOccupancyFraction=75 -DzkHost=zoo5.devops:2181,zoo4.devops:2181,zoo1.devops:2181,zoo2.devops:2181,zoo3.devops:2181 -Dcollection.configName=Carmen -Dbootstrap_confdir=./solr/conf -Dbootstrap_conf=true -DnumShards=4 -jar start.jar etc/jetty.xml {code} The server has 16 CPU cores and SSD RAID 10; the load average is usually between 2 and 3. The charts also don't show anything suspicious in server load, it is very stable. So it seems something went wrong during recovery after the network error. I'm not sure how to debug this further or what those warnings in the log mean, for example the last 2 messages on the first screenshot, from DistributedUpdateProcessor and CoreAdminHandler. 
was (Author: aheaven): Get another error today on 4 shards set up, each has 2 replicas (8 nodes in total). On the shard 4/replica 1 I see the next error: [^replica1.png] On the shard 4/replica 2 the next: [^replica2.png] Here's the backtrace for the error on the first screenshot: {code} java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273) at
[jira] [Updated] (SOLR-6875) No data integrity between replicas
[ https://issues.apache.org/jira/browse/SOLR-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-6875: --- Attachment: replica2.png replica1.png Got another error today on a 4-shard setup, each shard with 2 replicas (8 nodes in total). On shard 4/replica 1 I see the following error: [^replica1.png] On shard 4/replica 2 the following: [^replica2.png] Here's the backtrace for the error on the first screenshot: {code} java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283) at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251) at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486) at 
org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57) at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} After all this replica 1 shows: {quote} numDocs: 28 215 608 {quote} And replica 2 shows: {quote} numDocs: 28 215 609 {quote} Everything worked well for a few months until yesterday, when we started to reindex some data (about 1.7m records). Our Solr setup uses large pages and there are enough resources. Here's how we run the instances: {code} exec chpst -u solr java -Xms6G -Xmx8G -XX:+UseConcMarkSweepGC -XX:+UseLargePages -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts -XX:CMSInitiatingOccupancyFraction=75 -DzkHost=zoo5.devops:2181,zoo4.devops:2181,zoo1.devops:2181,zoo2.devops:2181,zoo3.devops:2181 -Dcollection.configName=Carmen -Dbootstrap_confdir=./solr/conf -Dbootstrap_conf=true -DnumShards=4 -jar start.jar etc/jetty.xml {code} The server has 16 CPU cores and SSD RAID 10; the load average is usually between 2 and 3. The charts also don't show anything suspicious in server load, it is very stable. So it seems something went wrong during recovery after the network error. I'm not sure how to debug this further or what those warnings in the log mean, for example the last 2 messages on the first screenshot, from DistributedUpdateProcessor and CoreAdminHandler. 
No data integrity between replicas -- Key: SOLR-6875 URL: https://issues.apache.org/jira/browse/SOLR-6875 Project: Solr Issue Type: Bug Affects Versions: 4.10.2 Environment: One replica is @ Linux solr1.devops.wegohealth.com 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux Another replica is @ Linux solr2.devops.wegohealth.com 3.16.0-23-generic #30-Ubuntu SMP Thu Oct 16 13:17:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Solr is running with the next options: * -Xms12G * -Xmx16G * -XX:+UseConcMarkSweepGC * -XX:+UseLargePages * -XX:+CMSParallelRemarkEnabled * -XX:+ParallelRefProcEnabled * -XX:+UseLargePages * -XX:+AggressiveOpts * -XX:CMSInitiatingOccupancyFraction=75 Reporter: Alexander S.
[jira] [Updated] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-5332: --- Affects Version/s: 5.1 Add preserve original setting to the EdgeNGramFilterFactory - Key: SOLR-5332 URL: https://issues.apache.org/jira/browse/SOLR-5332 Project: Solr Issue Type: Wish Affects Versions: 4.4, 4.5, 4.5.1, 4.6, 5.1 Reporter: Alexander S. Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings of 2 and 25, search requests for these URLs will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first URL has "1" at the end, which is shorter than the allowed min gram size. In the second URL the user name is longer than the max gram size (27 characters). It would be good to have a preserve-original option that adds the original string to the index when it does not fit the allowed gram sizes, so that the "1" and "someveryandverylongusername" tokens are also added to the index. Best, Alex -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
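The two failure modes are mechanical: a token shorter than minGramSize produces no grams at all, and a token longer than maxGramSize never yields a gram equal to itself. A small simulation of the requested preserve-original behavior (illustrative only, not the Lucene filter; later Lucene releases did eventually add a `preserveOriginal` option to the edge n-gram filter, so check the version you run):

```python
def edge_ngrams(token, min_gram=2, max_gram=25, preserve_original=False):
    """Emit leading-edge n-grams with lengths between min_gram and
    max_gram. With preserve_original, a token whose length falls
    outside that range is also kept verbatim, so exact queries for
    it can still match."""
    grams = [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]
    if preserve_original and not (min_gram <= len(token) <= max_gram):
        grams.append(token)
    return grams
```

With the issue's settings (2, 25), the trailing "1" of `facebook.com/someuser.1` yields no grams and the 27-character user name never yields itself; turning on the preserve-original behavior indexes both, which is exactly what the wish asks for.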
[jira] [Updated] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-5332: --- Fix Version/s: 5.1 Add preserve original setting to the EdgeNGramFilterFactory - Key: SOLR-5332 URL: https://issues.apache.org/jira/browse/SOLR-5332 Project: Solr Issue Type: Wish Affects Versions: 4.4, 4.5, 4.5.1, 4.6 Reporter: Alexander S. Fix For: 5.1 Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings of 2 and 25, search requests for these URLs will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first URL has "1" at the end, which is shorter than the allowed min gram size. In the second URL the user name is longer than the max gram size (27 characters). It would be good to have a preserve-original option that adds the original string to the index when it does not fit the allowed gram sizes, so that the "1" and "someveryandverylongusername" tokens are also added to the index. Best, Alex -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-5332: --- Affects Version/s: (was: 5.1) Add preserve original setting to the EdgeNGramFilterFactory - Key: SOLR-5332 URL: https://issues.apache.org/jira/browse/SOLR-5332 Project: Solr Issue Type: Wish Affects Versions: 4.4, 4.5, 4.5.1, 4.6 Reporter: Alexander S. Fix For: 5.1 Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings of 2 and 25, search requests for these URLs will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first URL has "1" at the end, which is shorter than the allowed min gram size. In the second URL the user name is longer than the max gram size (27 characters). It would be good to have a preserve-original option that adds the original string to the index when it does not fit the allowed gram sizes, so that the "1" and "someveryandverylongusername" tokens are also added to the index. Best, Alex -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-7022) ERROR UpdateHandler java.lang.InterruptedException
Alexander S. created SOLR-7022: -- Summary: ERROR UpdateHandler java.lang.InterruptedException Key: SOLR-7022 URL: https://issues.apache.org/jira/browse/SOLR-7022 Project: Solr Issue Type: Bug Environment: Solr 4.10.2, Ubuntu x86_64 Reporter: Alexander S. What I did: * Updated configs in ZooKeeper with zkcli.sh -cmd upconfig. * Opened the Solr admin interface in the web browser * Went to core admin and reloaded the cores one by one Backtrace: {code} java.lang.InterruptedException at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:400) at java.util.concurrent.FutureTask.get(FutureTask.java:187) at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:654) at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} I already did that before and didn't see such errors, but the previous time I had increased the caches too much, so the warming time for the query results cache was around 30 seconds. This time the core reloads took much longer, and then this error appeared in the log. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-6875) No data integrity between replicas
[ https://issues.apache.org/jira/browse/SOLR-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14272877#comment-14272877 ] Alexander S. edited comment on SOLR-6875 at 1/11/15 11:33 AM: --

Now we have 4 shards, each with 2 replicas (8 nodes in total), and the following picture:
{noformat}
Shard 1: Replica 1: *14 486 089* Replica 2: *14 496 445*
Shard 2: Replica 1: 14 496 609 Replica 2: 14 496 609
Shard 3: Replica 1: 14 492 812 Replica 2: 14 492 812
Shard 4: Replica 1: 14 488 755 Replica 2: 14 488 755
{noformat}
How could this be? We didn't see anything like that before the upgrade from 4.8.1 to 4.10.2. We also enabled checkIntegrityAtMerge; could that be the reason?

was (Author: aheaven):
Now we have 4 shards, each with 2 replicas (8 nodes in total), and the following picture:
{noformat}
Shard 1: Replica 1: 14 486 089 Replica 2: 14 496 445
Shard 2: Replica 1: 14 496 609 Replica 2: 14 496 609
Shard 3: Replica 1: 14 492 812 Replica 2: 14 492 812
Shard 4: Replica 1: 14 488 755 Replica 2: 14 488 755
{noformat}
How could this be? We didn't see anything like that before the upgrade from 4.8.1 to 4.10.2. We also enabled checkIntegrityAtMerge; could that be the reason?

No data integrity between replicas
--
Key: SOLR-6875
URL: https://issues.apache.org/jira/browse/SOLR-6875
Project: Solr
Issue Type: Bug
Affects Versions: 4.10.2
Environment: One replica is @ Linux solr1.devops.wegohealth.com 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Another replica is @ Linux solr2.devops.wegohealth.com 3.16.0-23-generic #30-Ubuntu SMP Thu Oct 16 13:17:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Solr is running with the following options:
* -Xms12G
* -Xmx16G
* -XX:+UseConcMarkSweepGC
* -XX:+UseLargePages
* -XX:+CMSParallelRemarkEnabled
* -XX:+ParallelRefProcEnabled
* -XX:+UseLargePages
* -XX:+AggressiveOpts
* -XX:CMSInitiatingOccupancyFraction=75
Reporter: Alexander S.
Setup: SolrCloud with 2 shards, each with 2 replicas, 4 nodes in total.
Indexing is stopped, one replica of a shard (Solr1) shows 45 574 039 docs, and another (Solr1.1) 45 574 038 docs. Solr1 is the leader; these errors appeared in the logs:
{code}
ERROR - 2014-12-20 09:54:38.783; org.apache.solr.update.StreamingSolrServers$1; error
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:196)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
WARN - 2014-12-20 09:54:38.787; org.apache.solr.update.processor.DistributedUpdateProcessor;
[jira] [Commented] (SOLR-6494) Query filters applied in a wrong order
[ https://issues.apache.org/jira/browse/SOLR-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262969#comment-14262969 ] Alexander S. commented on SOLR-6494:

Correct, and that's exactly my case, because the time is entered by users and differs between queries. I'd love to have something like this working with the standard query parser:
{code}
fq={!cache=false cost=101}field:value
{code}
It seems that `cache=false` does actually work, but `cost` doesn't (some parsers, like the frange one, do treat and apply all queries with a `cost` higher than 100 as post filters).

Query filters applied in a wrong order
--
Key: SOLR-6494
URL: https://issues.apache.org/jira/browse/SOLR-6494
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1
Reporter: Alexander S.

This query:
{code}
{ fq: [type:Award::Nomination], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes just a few milliseconds, but this one:
{code}
{ fq: [ type:Award::Nomination, created_at_d:[* TO 2014-09-08T23:59:59Z] ], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes almost 15 seconds. I have just ≈12k documents with type Award::Nomination, but around half a billion with the created_at_d field set. It seems Solr applies the created_at_d filter first, going through all documents where this field is set, which is not very smart. I think if it can't do anything better than applying filters in alphabetical order, it should apply them in the order they were received.

-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
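For illustration, a request carrying the local-params filter discussed above might be assembled like this. This is a hedged Python sketch; the host, core, and field names are placeholders, and the local-params prefix simply travels inside the fq value:

```python
from urllib.parse import urlencode

# Hedged sketch: assembling a Solr select request with the local-params
# filter discussed above. Host, core, and field names are placeholders.
params = {
    "q": "*:*",
    "fq": "{!cache=false cost=101}field:value",
    "sort": "score desc",
    "start": 0,
    "rows": 20,
}
query_string = urlencode(params)
url = "http://localhost:8983/solr/collection1/select?" + query_string

# The cache/cost hints survive URL encoding as part of the fq value.
assert "cache%3Dfalse" in query_string
assert "cost%3D101" in query_string
```

Whether Solr honors the `cost` hint as a post filter depends on the query parser, which is exactly the limitation this comment describes.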
[jira] [Commented] (SOLR-6494) Query filters applied in a wrong order
[ https://issues.apache.org/jira/browse/SOLR-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259627#comment-14259627 ] Alexander S. commented on SOLR-6494:

As I was told already, Solr does not apply filters incrementally; instead, each filter runs through the entire data set, then Solr caches the results. For filters that contain ranges the cache is not effective, especially when we need NRT search and commits are triggered multiple times per minute. Big caches then make no sense, and large autowarming numbers cause Solr to fail. My point is that the cache is not always efficient, and for such cases Solr needs to use another strategy and apply filters incrementally (read: as post filters). So this:
{quote}
By design, fq clauses like this are calculated for the entire document set and the results cached, there is no ordering for that part. Otherwise, how could they be re-used for a different query?
{quote}
does not work in all cases. Something like this:
{code}
fq={!cache=false cost=101}field:value # to run as a post filter
{code}
would definitely solve the problem, but this is not supported. The frange parser has support for this, but it is not always suitable and fails with different errors, like "can not use FieldCache on multivalued field: type", etc. Does that look like a missing feature? To me it definitely does; could this be considered as a wish and implemented some day? How can the Solr community help with missing features?

Query filters applied in a wrong order
--
Key: SOLR-6494
URL: https://issues.apache.org/jira/browse/SOLR-6494
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1
Reporter: Alexander S.

This query:
{code}
{ fq: [type:Award::Nomination], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes just a few milliseconds, but this one:
{code}
{ fq: [ type:Award::Nomination, created_at_d:[* TO 2014-09-08T23:59:59Z] ], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes almost 15 seconds. I have just ≈12k documents with type Award::Nomination, but around half a billion with the created_at_d field set. It seems Solr applies the created_at_d filter first, going through all documents where this field is set, which is not very smart. I think if it can't do anything better than applying filters in alphabetical order, it should apply them in the order they were received.

-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6494) Query filters applied in a wrong order
[ https://issues.apache.org/jira/browse/SOLR-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259150#comment-14259150 ] Alexander S. commented on SOLR-6494:

Just an idea, but what if Solr detected that a filter uses date ranges like [* TO 2014-09-08T23:59:59Z] (or probably any ranges where the cache is not very efficient) and, if there are other simpler filters in the query, applied such range filters last? And probably to the already fetched results, as a post filter? And probably avoided caching for this filter? That sounds like a good optimization to me. It would avoid evicting more useful filters from the cache, increase warming speed and, most importantly, increase the search speed. [~erickerickson] [~hossman] Best, Alex

Query filters applied in a wrong order
--
Key: SOLR-6494
URL: https://issues.apache.org/jira/browse/SOLR-6494
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1
Reporter: Alexander S.

This query:
{code}
{ fq: [type:Award::Nomination], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes just a few milliseconds, but this one:
{code}
{ fq: [ type:Award::Nomination, created_at_d:[* TO 2014-09-08T23:59:59Z] ], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes almost 15 seconds. I have just ≈12k documents with type Award::Nomination, but around half a billion with the created_at_d field set. It seems Solr applies the created_at_d filter first, going through all documents where this field is set, which is not very smart. I think if it can't do anything better than applying filters in alphabetical order, it should apply them in the order they were received.

-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
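The ordering proposed in the comment above can be sketched in plain Python over a toy document list. This is illustrative only, not Solr internals: the selective filter scans the whole set, and the expensive range check then runs only on the survivors, like a post filter.

```python
# Toy model of the proposed ordering: the cheap, selective filter scans
# the whole set; the expensive date-range check runs only on the
# survivors (a "post filter"). The documents below are illustrative.
docs = [
    {"id": 1, "type": "Award::Nomination", "created_at": "2014-09-01"},
    {"id": 2, "type": "Award::Nomination", "created_at": "2014-09-10"},
    {"id": 3, "type": "Post", "created_at": "2014-09-01"},
]

# Selective filter first (the ~12k-of-200m case from the report above).
selective = [d for d in docs if d["type"] == "Award::Nomination"]

# Range filter as a post filter: only the 2 candidates are checked,
# not the full collection. ISO-8601 dates compare correctly as strings.
result = [d for d in selective if d["created_at"] <= "2014-09-08"]

assert [d["id"] for d in result] == [1]
```

The payoff in the real case is that the range predicate, which caches poorly, is evaluated over thousands of candidates instead of hundreds of millions.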
[jira] [Comment Edited] (SOLR-6494) Query filters applied in a wrong order
[ https://issues.apache.org/jira/browse/SOLR-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259150#comment-14259150 ] Alexander S. edited comment on SOLR-6494 at 12/26/14 5:24 PM: -- Just an idea, but what if Solr detecting that the filter does use date rages like [* TO 2014-09-08T23:59:59Z] (or probably any ranges where cache is not very efficient), and if there are other simpler filters in the query, will apply such range filters at last? And probably to already fetched results as a post filter? And probably avoid caching for this filter? That sounds like a good optimization to me. This will avoid losing of more useful filters from the cache, increase warming speed and which is the most important — increase the search speed. Like in the case above, if you have 200m of docs, but only 12k with type:AwardNomination, and query has 2 filters, one with a date range, Solr definitely can detect this and do the right thing instead simply loop through all 200m documents with this cache-inefficient filter. Could this be at least considered as a wish? [~erickerickson] [~hossman] Best, Alex was (Author: aheaven): Just an idea, but what if Solr detecting that the filter does use date rages like [* TO 2014-09-08T23:59:59Z] (or probably any ranges where cache is not very efficient), and if there are other simpler filters in the query, will apply such range filters at last? And probably to already fetched results as a post filter? And probably avoid caching for this filter? That sounds like a good optimization to me. This will avoid losing of more useful filters from the cache, increase warming speed and which is the most important — increase the search speed. [~erickerickson] [~hossman] Best, Alex Query filters applied in a wrong order -- Key: SOLR-6494 URL: https://issues.apache.org/jira/browse/SOLR-6494 Project: Solr Issue Type: Bug Affects Versions: 4.8.1 Reporter: Alexander S. 
This query: {code} { fq: [type:Award::Nomination], sort: score desc, start: 0, rows: 20, q: *:* } {code} takes just a few milliseconds, but this one: {code} { fq: [ type:Award::Nomination, created_at_d:[* TO 2014-09-08T23:59:59Z] ], sort: score desc, start: 0, rows: 20, q: *:* } {code} takes almost 15 seconds. I have only ≈12k documents with type Award::Nomination, but around half a billion with the created_at_d field set. It seems Solr applies the created_at_d filter first, going through all documents where this field is set, which is not very smart. I think if it can't do anything better than applying filters in alphabetical order, it should apply them in the order they were received. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6875) No data integrity between replicas
Alexander S. created SOLR-6875: -- Summary: No data integrity between replicas Key: SOLR-6875 URL: https://issues.apache.org/jira/browse/SOLR-6875 Project: Solr Issue Type: Bug Reporter: Alexander S. Setup: SolrCloud with 2 shards, each with 2 replicas, 4 nodes in total. Indexing is stopped, one replica of a shard (Solr1) shows 45 574 039 docs, and another (Solr1.1) 45 574 038 docs. Solr1 is the leader, these errors appeared in the logs: {code} ERROR - 2014-12-20 09:54:38.783; org.apache.solr.update.StreamingSolrServers$1; error java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283) at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251) at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486) at 
org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57) at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) WARN - 2014-12-20 09:54:38.787; org.apache.solr.update.processor.DistributedUpdateProcessor; Error sending update java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283) at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251) at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271) at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486) at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
[jira] [Updated] (SOLR-6875) No data integrity between replicas
[ https://issues.apache.org/jira/browse/SOLR-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-6875: --- Environment: One replica is @ Linux solr1.devops.wegohealth.com 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux Another replica is @ Linux solr2.devops.wegohealth.com 3.16.0-23-generic #30-Ubuntu SMP Thu Oct 16 13:17:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Solr is running with the following options: * -Xms12G * -Xmx16G * -XX:+UseConcMarkSweepGC * -XX:+UseLargePages * -XX:+CMSParallelRemarkEnabled * -XX:+ParallelRefProcEnabled * -XX:+AggressiveOpts * -XX:CMSInitiatingOccupancyFraction=75 Affects Version/s: 4.10.2 No data integrity between replicas -- Key: SOLR-6875 URL: https://issues.apache.org/jira/browse/SOLR-6875 Project: Solr Issue Type: Bug Affects Versions: 4.10.2 Reporter: Alexander S. Setup: SolrCloud with 2 shards, each with 2 replicas, 4 nodes in total. Indexing is stopped, one replica of a shard (Solr1) shows 45 574 039 docs, and another (Solr1.1) 45 574 038 docs. 
[jira] [Commented] (SOLR-6769) Election bug
[ https://issues.apache.org/jira/browse/SOLR-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14255146#comment-14255146 ] Alexander S. commented on SOLR-6769: Correct, an endless warming was causing this problem. So this is a bug in Solr: it waits for searchers to finish warming, which can take up to 5 minutes in some cases. The node itself goes down and stops accepting connections, but the election does not happen. Election bug Key: SOLR-6769 URL: https://issues.apache.org/jira/browse/SOLR-6769 Project: Solr Issue Type: Bug Reporter: Alexander S. Attachments: Screenshot 876.png Hello, I have a very simple setup: 2 shards and 2 replicas (4 nodes in total). What I did is just stop the shards, but while the first shard stopped immediately, the second one took about 5 minutes to stop. You can see on the screenshot what happened next. In short: 1. Shard 1 stopped normally. 2. Replica 1 became the leader. 3. Shard 2 was still performing some job but wasn't accepting connections. 4. Replica 2 did not become the leader because Shard 2 was still there but didn't work. 5. The entire cluster went down until Shard 2 stopped and Replica 2 became the leader. Marked as critical because this shuts down the entire cluster. Please adjust if I am wrong.
[jira] [Commented] (SOLR-6769) Election bug
[ https://issues.apache.org/jira/browse/SOLR-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252440#comment-14252440 ] Alexander S. commented on SOLR-6769: This might be related: http://lucene.472066.n3.nabble.com/Endless-100-CPU-usage-on-searcherExecutor-thread-td4175088.html
[jira] [Commented] (SOLR-6769) Election bug
[ https://issues.apache.org/jira/browse/SOLR-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14239203#comment-14239203 ] Alexander S. commented on SOLR-6769: Hi, yes, my terminology about shards and replicas wasn't clear, let me explain this better. * Solr: 4.8.1 * Java: java version 1.7.0_51 Java(TM) SE Runtime Environment (build 1.7.0_51-b13) Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode) * We have 5 servers, 2 of which are big (16 CPU cores, 48G of RAM each) and 3 others are small (1 CPU and 1G of RAM). All servers have rapid SSD RAID 10. Each server runs a ZK instance, so we have 5 ZK instances in total. The big servers also run Solr: the first one runs 2 instances and the second one runs 2 replicas (so each shard has 2 replicas, the simplest SolrCloud setup from the wiki). So the cluster looks like this: {noformat} * Small 1G node: ZK * Small 1G node: ZK * Small 1G node: ZK * Big 16G node: ZK, Solr1, Solr2 * Big 16G node: ZK, Solr1.1, Solr2.1 {noformat} "Stopped manually" means I tried to manually stop Solr1 and Solr2, which were the leaders, by sending a TERM signal (we have service files, so I ran service stop and expected a graceful shutdown). This worked for Solr1: it went down normally and Solr1.1 became the leader instantly. Then I tried to do the same for Solr2, but once I sent the TERM signal it became inoperable yet didn't exit completely (orange on the screenshot); the process kept running for ≈ 5-10 minutes and the election didn't happen. As a result I got "no node hosting shard" errors, but was expecting Solr2.1 to become the leader instantly, as happened with Solr1.1. As I understand this, Solr2 didn't shut down instantly because there could be some background jobs, e.g. 
index merging, an in-process commit, etc., *but then it should not stop accepting connections and should not change its status to down* until all background jobs are finished and it is really ready to go down and pass leadership to Solr2.1. It seems like a bug in Solr, because all services were working normally, all ZK instances were up and operable, and Solr itself wasn't under a heavy load. Otherwise, could you please point me to any information about how to gracefully shut down instances? It would be good to have a button in the web UI to force a replica to become the leader with one click. Then I would be able to force Solr1.1 and Solr2.1 to become the leaders, wait until this happens, and safely reboot the Solr1 and Solr2 instances. Best, Alexander
[jira] [Created] (SOLR-6769) Election bug
Alexander S. created SOLR-6769: -- Summary: Election bug Key: SOLR-6769 URL: https://issues.apache.org/jira/browse/SOLR-6769 Project: Solr Issue Type: Bug Reporter: Alexander S. Priority: Critical
[jira] [Updated] (SOLR-6769) Election bug
[ https://issues.apache.org/jira/browse/SOLR-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-6769: --- Attachment: Screenshot 876.png [^Screenshot 876.png]
[jira] [Commented] (SOLR-6494) Query filters applied in a wrong order
[ https://issues.apache.org/jira/browse/SOLR-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131495#comment-14131495 ] Alexander S. commented on SOLR-6494: So I've added a new field nominated_at_d to all docs with type Award::Nomination. Now this query: {code} { fq: [ type:Award::Nomination, nominated_at_d:[* TO 2014-09-08T23:59:59Z] ], sort: score desc, start: 0, rows: 20, q: *:* } {code} doesn't take longer than a few milliseconds. The new nominated_at_d is the same kind of field as created_at_d; the only difference is that only ≈ 12k documents have the nominated_at_d field while ≈ 100m have created_at_d. So again, I am saying that the current way Solr applies filters is not optimal; sometimes we need to skip the cache and apply filters incrementally, so that each filter doesn't have to go through the entire collection. We could filter this way: {code} 200m docs → filter (type:Award::Nomination) → 12k docs → filter (created_at_d:[* TO 2014-09-08T23:59:59Z]) → 500 docs {code} I don't think the *entire* Solr user community can do anything with this, but a few Solr developers could. Do I have to be a Solr expert to report bugs and missing features?
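[Editor's note] Solr does already expose local params that approximate the incremental filtering sketched in the comment above: a filter query can opt out of the filter cache with cache=false and be given a cost, and non-cached filters are ordered by ascending cost so cheaper, more selective filters run first (for query types that implement post-filtering, a cost of 100 or more pushes the filter to run only against documents that survived the other clauses). A sketch using the field names from this issue; exact behavior depends on the Solr version:
{code}
fq=type:Award::Nomination
fq={!cache=false cost=150}created_at_d:[* TO 2014-09-08T23:59:59Z]
{code}
With these params the selective type filter is evaluated first and the expensive date range is consulted last, which is essentially the 200m → 12k → 500 pipeline the comment proposes.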
[jira] [Commented] (SOLR-6494) Query filters applied in a wrong order
[ https://issues.apache.org/jira/browse/SOLR-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129956#comment-14129956 ] Alexander S. commented on SOLR-6494: Added the schema and debug output here: http://lucene.472066.n3.nabble.com/Help-with-a-slow-filter-query-td4158159.html
[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements=false
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130090#comment-14130090 ] Alexander S. commented on SOLR-6468:
Just tried to add matchVersion but got this error:
{code}
null:org.apache.solr.common.SolrException: Unable to create core: crm-prod
    at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:911)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:568)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:261)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:253)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Could not load core configuration for core crm-prod
    at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:66)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
    ... 8 more
Caused by: org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] fieldType "words_ngram": Plugin init failure for [schema.xml] analyzer/filter: Error instantiating class: 'org.apache.lucene.analysis.core.StopFilterFactory'. Schema file is /etc/solr/core2/schema.xml
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:616)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:166)
    at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
    at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
    at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:89)
    at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:62)
    ... 9 more
Caused by: org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] fieldType "words_ngram": Plugin init failure for [schema.xml] analyzer/filter: Error instantiating class: 'org.apache.lucene.analysis.core.StopFilterFactory'
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:470)
    ... 14 more
Caused by: org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] analyzer/filter: Error instantiating class: 'org.apache.lucene.analysis.core.StopFilterFactory'
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
    at org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:400)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:86)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
    ... 15 more
Caused by: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.analysis.core.StopFilterFactory'
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:606)
    at org.apache.solr.schema.FieldTypePluginLoader$3.create(FieldTypePluginLoader.java:382)
    at org.apache.solr.schema.FieldTypePluginLoader$3.create(FieldTypePluginLoader.java:376)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
    ... 19 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:603)
    ... 22 more
Caused by: java.lang.IllegalArgumentException: Unknown parameters: {matchVersion=4.3}
    at org.apache.lucene.analysis.core.StopFilterFactory.init(StopFilterFactory.java:91)
    ... 27 more
{code}
Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
Key: SOLR-6468
URL: https://issues.apache.org/jira/browse/SOLR-6468
Project: Solr
[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130209#comment-14130209 ] Alexander S. commented on SOLR-6468:
Thanks, it does work with luceneMatchVersion=4.3, but isn't that deprecated? Any chance to bring back enablePositionIncrements?
Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
Key: SOLR-6468
URL: https://issues.apache.org/jira/browse/SOLR-6468
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1, 4.9
Reporter: Alexander S.
Setup:
* Schema version is 1.5
* Field config:
{code}
<fieldType name="words_ngram" class="solr.TextField" omitNorms="false" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
{code}
* Stop words:
{code}
http
https
ftp
www
{code}
So very simple. In the index I have:
* twitter.com/testuser
All these queries do match:
* twitter.com/testuser
* com/testuser
* testuser
But none of these does:
* https://twitter.com/testuser
* https://www.twitter.com/testuser
* www.twitter.com/testuser
Debug output shows:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")"
But we need:
"parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")"
Complete debug outputs:
* a valid search: http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
* an invalid search: http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
The complete discussion and explanation of the problem is here: http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html
I didn't find a clear explanation of how we can upgrade Solr; there's no replacement or workaround for this, so this is not just a major change but a major disrespect to all existing Solr users who are using this feature.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
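As an illustration of the position-gap behavior this issue is about, here is a small Python sketch (not Solr code; the tokenizer regex and stop list mirror the reported schema, and the function name is made up) showing why a position-aware stop filter breaks the phrase query, while the removed enablePositionIncrements=false behavior did not:

```python
import re

STOPWORDS = {"http", "https", "ftp", "www"}

def analyze(text, stopwords=STOPWORDS, keep_gaps=True):
    """Tokenize on non-word runs and drop stop words.

    keep_gaps=True mimics the current StopFilterFactory: a removed token
    still advances the position counter, leaving a hole in the phrase.
    keep_gaps=False mimics the old enablePositionIncrements=false.
    """
    tokens, pos = [], -1
    for tok in re.split(r"[^\w]+", text.lower()):
        if not tok:
            continue
        pos += 1
        if tok in stopwords:
            if not keep_gaps:
                pos -= 1  # pretend the stop word was never there
            continue
        tokens.append((tok, pos))
    return tokens

# Indexed value has no stop words, so terms sit at positions 0, 1, 2:
print(analyze("twitter.com/testuser"))
# Querying with scheme + www keeps a gap: terms start at position 2, so an
# exact phrase match against the indexed positions fails:
print(analyze("https://www.twitter.com/testuser"))
# The old behavior collapsed the gap and the positions line up again:
print(analyze("https://www.twitter.com/testuser", keep_gaps=False))
```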
[jira] [Commented] (SOLR-6494) Query filters applied in a wrong order
[ https://issues.apache.org/jira/browse/SOLR-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128322#comment-14128322 ] Alexander S. commented on SOLR-6494:
Unfortunately that doesn't solve the problem completely; these queries take ≈7 seconds instead of 15:
{code}
{!cache=false}type:Award::Nomination
{!cache=false cost=10}created_at_d:[* TO 2014-09-08T23:59:59Z]
{code}
Which is still not good, since I have only 11 974 docs with type:Award::Nomination and 139 716 883 with created_at_d:[* TO 2014-09-08T23:59:59Z]. If the cost parameter tells Solr to apply the cheapest filters first, why does the query still take so long? It seems that even though it doesn't run them in parallel, the filters still don't know about each other and each goes through all docs. My point is that it would be much faster if it could run the filters one by one, and if each subsequent filter worked not with the entire data set but with the results returned from the previous filter. Also tried cost=100 to apply a filter as a post filter, but nothing changes, same 7 seconds. The filter cache doesn't help here. So this:
bq. By design, fq clauses like this are calculated for the entire document set and the results cached, there is no ordering for that part.
doesn't sound right to me. Sometimes we don't need to reuse filters (and sometimes even can't, e.g. the cost option requires cache=false). In the provided use case the way Solr applies filters is more harmful than useful. I'd even say more than 600 times harmful. A query that wouldn't take more than a second in MySQL takes 15 seconds in a search engine that runs on rapid SSD RAID 10, has a few shards and replicas, uses more than 160G of memory in total and has ≈40 CPU cores. Thus this sounds like a missing feature (at least). Please share your thoughts on this.
Query filters applied in a wrong order
--
Key: SOLR-6494
URL: https://issues.apache.org/jira/browse/SOLR-6494
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1
Reporter: Alexander S.
This query:
{code}
{ fq: [type:Award::Nomination], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes just a few milliseconds, but this one:
{code}
{ fq: [ type:Award::Nomination, created_at_d:[* TO 2014-09-08T23:59:59Z] ], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes almost 15 seconds. I have just ≈12k documents with type Award::Nomination, but around half a billion with the created_at_d field set. And it seems Solr applies the created_at_d filter first, going through all documents where this field is set, which is not very smart. I think if it can't do anything better than applying filters in alphabetical order, it should apply them in the order they were received.
[jira] [Created] (SOLR-6494) Query filters applied in a wrong order
Alexander S. created SOLR-6494:
--
Summary: Query filters applied in a wrong order
Key: SOLR-6494
URL: https://issues.apache.org/jira/browse/SOLR-6494
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1
Reporter: Alexander S.
This query:
{code}
{ fq: [type:Award::Nomination], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes just a few milliseconds, but this one:
{code}
{ fq: [ type:Award::Nomination, created_at_d:[* TO 2014-09-08T23:59:59Z] ], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes almost 15 seconds. I have just ≈12k documents with type Award::Nomination, but around half a billion with the created_at_d field set. And it seems Solr applies the created_at_d filter first, going through all documents where this field is set, which is not very smart. I think if it can't do anything better than applying filters in alphabetical order, it should apply them in the order they were received.
[jira] [Commented] (SOLR-6494) Query filters applied in a wrong order
[ https://issues.apache.org/jira/browse/SOLR-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127304#comment-14127304 ] Alexander S. commented on SOLR-6494:
Hi, thank you for the explanation, but I think sometimes (like in this case) it would be much more efficient to run the filters one by one. It seems that the cost parameter should do what I need, e.g.:
{code}
{!cost=1}type:Award::Nomination
{!cost=10}created_at_d:[* TO 2014-09-08T23:59:59Z]
{code}
Query filters applied in a wrong order
--
Key: SOLR-6494
URL: https://issues.apache.org/jira/browse/SOLR-6494
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1
Reporter: Alexander S.
This query:
{code}
{ fq: [type:Award::Nomination], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes just a few milliseconds, but this one:
{code}
{ fq: [ type:Award::Nomination, created_at_d:[* TO 2014-09-08T23:59:59Z] ], sort: score desc, start: 0, rows: 20, q: *:* }
{code}
takes almost 15 seconds. I have just ≈12k documents with type Award::Nomination, but around half a billion with the created_at_d field set. And it seems Solr applies the created_at_d filter first, going through all documents where this field is set, which is not very smart. I think if it can't do anything better than applying filters in alphabetical order, it should apply them in the order they were received.
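For what it's worth, the ordering being asked for can be sketched in plain Python (hypothetical documents and costs, not Solr internals): predicates run in ascending cost, and each later predicate only sees the survivors of the previous one. As far as I know, in Solr itself the cost order only applies to non-cached filters, and cost >= 100 only turns a filter into a true post filter when the query type implements the PostFilter interface (e.g. {!frange}), which plain term and range queries do not.

```python
# Hypothetical corpus: 100 nominations plus 100,000 other docs.
docs = ([{"type": "Award::Nomination", "day": d} for d in range(100)]
        + [{"type": "Other", "day": d} for d in range(100000)])

# (cost, predicate) pairs, like fq's with {!cache=false cost=N}.
filters = [
    (100, lambda doc: doc["day"] <= 50),                    # expensive range check
    (1,   lambda doc: doc["type"] == "Award::Nomination"),  # cheap and selective
]

# Apply in ascending cost; each predicate sees only the prior survivors,
# so the expensive check runs over 100 docs instead of 100,100.
survivors = docs
for _cost, predicate in sorted(filters, key=lambda f: f[0]):
    survivors = [doc for doc in survivors if predicate(doc)]

print(len(survivors))  # 51
```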
[jira] [Created] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
Alexander S. created SOLR-6468:
--
Summary: Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
Key: SOLR-6468
URL: https://issues.apache.org/jira/browse/SOLR-6468
Project: Solr
Issue Type: Bug
Affects Versions: 4.9, 4.8.1
Reporter: Alexander S.
Setup:
* Schema version is 1.5
* Field config:
{code}
<fieldType name="words_ngram" class="solr.TextField" omitNorms="false" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
{code}
* Stop words:
{code}
http
https
ftp
www
{code}
So very simple. In the index I have:
* twitter.com/testuser
All these queries do match:
* twitter.com/testuser
* com/testuser
* testuser
But none of these does:
* https://twitter.com/testuser
* https://www.twitter.com/testuser
* www.twitter.com/testuser
Debug output shows:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com zer0sleep\")"
But we need:
"parsedquery_toString": "+(url_words_ngram:\"twitter com zer0sleep\")"
Complete debug outputs:
* a valid search: http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
* an invalid search: http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
The complete discussion and explanation of the problem is here: http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html
I didn't find a clear explanation of how we can upgrade Solr; there's no replacement or workaround for this, so this is not just a major change but a major disrespect to all existing Solr users who are using this feature.
[jira] [Updated] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-6468:
---
Description:
Setup:
* Schema version is 1.5
* Field config:
{code}
<fieldType name="words_ngram" class="solr.TextField" omitNorms="false" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
{code}
* Stop words:
{code}
http
https
ftp
www
{code}
So very simple. In the index I have:
* twitter.com/testuser
All these queries do match:
* twitter.com/testuser
* com/testuser
* testuser
But none of these does:
* https://twitter.com/testuser
* https://www.twitter.com/testuser
* www.twitter.com/testuser
Debug output shows:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")"
But we need:
"parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")"
Complete debug outputs:
* a valid search: http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
* an invalid search: http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
The complete discussion and explanation of the problem is here: http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html
I didn't find a clear explanation of how we can upgrade Solr; there's no replacement or workaround for this, so this is not just a major change but a major disrespect to all existing Solr users who are using this feature.

was:
Setup:
* Schema version is 1.5
* Field config:
{code}
<fieldType name="words_ngram" class="solr.TextField" omitNorms="false" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
{code}
* Stop words:
{code}
http
https
ftp
www
{code}
So very simple. In the index I have:
* twitter.com/testuser
All these queries do match:
* twitter.com/testuser
* com/testuser
* testuser
But none of these does:
* https://twitter.com/testuser
* https://www.twitter.com/testuser
* www.twitter.com/testuser
Debug output shows:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com zer0sleep\")"
But we need:
"parsedquery_toString": "+(url_words_ngram:\"twitter com zer0sleep\")"
Complete debug outputs:
* a valid search: http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
* an invalid search: http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
The complete discussion and explanation of the problem is here: http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html
I didn't find a clear explanation of how we can upgrade Solr; there's no replacement or workaround for this, so this is not just a major change but a major disrespect to all existing Solr users who are using this feature.

Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
Key: SOLR-6468
URL: https://issues.apache.org/jira/browse/SOLR-6468
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1, 4.9
Reporter: Alexander S.
Setup:
* Schema version is 1.5
* Field config:
{code}
<fieldType name="words_ngram" class="solr.TextField" omitNorms="false" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
{code}
* Stop words:
{code}
http
https
ftp
www
{code}
So very simple. In the index I have:
* twitter.com/testuser
All these queries do match:
* twitter.com/testuser
* com/testuser
* testuser
But none of these does:
* https://twitter.com/testuser
* https://www.twitter.com/testuser
* www.twitter.com/testuser
Debug output shows:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")"
But we need:
"parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")"
Complete debug outputs:
* a valid search: http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
* an invalid search: http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
The complete discussion and explanation of the problem is here: http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html
I didn't find a clear explanation of how we can upgrade Solr; there's no replacement or workaround for this, so this is not just a major change but a major disrespect to all existing Solr users who are using this feature.
[jira] [Commented] (SOLR-6468) Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
[ https://issues.apache.org/jira/browse/SOLR-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118078#comment-14118078 ] Alexander S. commented on SOLR-6468:
Correct, but isn't this behavior deprecated? I mean matchVersion=4.3? I was told this could get removed from 5.0 as well. If I understand the problem correctly, enablePositionIncrements=false could generate wrong tokens for those who do not know how to use this option correctly? It seems it requires a custom tokenizer, and solr.PatternTokenizerFactory in my example should work properly. So instead of removing the option, the problem with wrong tokens could be explained in the readme and the option could be kept for those who really need it. That makes more sense to me than simply removing it. Anyway, is there any chance the option could be restored? My use case should clearly show how useful it might be. And I was trying to google the problem; there are a lot of complaints about this, but no solutions.
Regression: StopFilterFactory doesn't work properly without enablePositionIncrements="false"
Key: SOLR-6468
URL: https://issues.apache.org/jira/browse/SOLR-6468
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1, 4.9
Reporter: Alexander S.
Setup:
* Schema version is 1.5
* Field config:
{code}
<fieldType name="words_ngram" class="solr.TextField" omitNorms="false" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^\w]+" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
{code}
* Stop words:
{code}
http
https
ftp
www
{code}
So very simple. In the index I have:
* twitter.com/testuser
All these queries do match:
* twitter.com/testuser
* com/testuser
* testuser
But none of these does:
* https://twitter.com/testuser
* https://www.twitter.com/testuser
* www.twitter.com/testuser
Debug output shows:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com testuser\")"
But we need:
"parsedquery_toString": "+(url_words_ngram:\"twitter com testuser\")"
Complete debug outputs:
* a valid search: http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
* an invalid search: http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww
The complete discussion and explanation of the problem is here: http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-td4153839.html
I didn't find a clear explanation of how we can upgrade Solr; there's no replacement or workaround for this, so this is not just a major change but a major disrespect to all existing Solr users who are using this feature.
[jira] [Commented] (SOLR-3274) ZooKeeper related SolrCloud problems
[ https://issues.apache.org/jira/browse/SOLR-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092884#comment-14092884 ] Alexander S. commented on SOLR-3274:
Hi, thanks for the response.
bq. Well you never know
I've checked the nodes' status; that 3rd node was online all the time and there was no load on it.
bq. In a 3-node ZK-cluster you need at least 2 healthy ZK-nodes connected with each other for the cluster to be operational.
That should be the problem, since the 2 other ZK instances might (theoretically) be unavailable because of heavy load (they share the same nodes with the Solr instances). Both nodes have 16 CPU cores, 48G of memory and RAID 10 (SSD); I thought it would be hard to get performance issues there. Anyway, adding a separate node with a 4th zookeeper instance might help, right?
ZooKeeper related SolrCloud problems
Key: SOLR-3274
URL: https://issues.apache.org/jira/browse/SOLR-3274
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.0-ALPHA
Environment: Any
Reporter: Per Steffensen
Same setup as in SOLR-3273. Well, if I have to tell the entire truth, we have 7 Solr servers, running 28 slices of the same collection (collA) - all slices have one replica (two shards all in all - leader + replica) - 56 cores all in all (8 shards on each solr instance). But anyways... Besides the problem reported in SOLR-3273, the system seems to run fine under high load for several hours, but eventually errors like the ones shown below start to occur. I might be wrong, but they all seem to indicate some kind of instability in the collaboration between Solr and ZooKeeper. I have to say that I haven't been there to check ZooKeeper at the moment those exceptions occur, but basically I don't believe the exceptions occur because ZooKeeper is not running stably - at least when I go and check ZooKeeper through other channels (e.g. my eclipse ZK plugin) it is always accepting my connection and generally seems to be doing fine.
Exception 1) Often the first error we see in solr.log is something like this
{code}
Mar 22, 2012 5:06:43 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are disabled.
    at org.apache.solr.update.processor.DistributedUpdateProcessor.zkCheck(DistributedUpdateProcessor.java:678)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:250)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:140)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:80)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:407)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{code}
I believe this error basically occurs because SolrZkClient.isConnected reports false, which means that its internal keeper.getState does not return ZooKeeper.States.CONNECTED.
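The 2-of-3 rule quoted in this thread is plain majority-quorum arithmetic; a quick sketch (not Solr or ZooKeeper code) also shows why a 4th ZooKeeper node would not add fault tolerance — an ensemble of 4 needs 3 live servers, so it still tolerates only 1 failure, and going to 5 is what buys a second:

```python
def quorum(n):
    """Live servers a ZooKeeper ensemble of n needs to stay operational."""
    return n // 2 + 1

def tolerated_failures(n):
    """Servers that may fail while the ensemble keeps serving requests."""
    return n - quorum(n)

for n in (3, 4, 5):
    print(f"{n} servers: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```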
[jira] [Commented] (SOLR-3274) ZooKeeper related SolrCloud problems
[ https://issues.apache.org/jira/browse/SOLR-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090519#comment-14090519 ] Alexander S. commented on SOLR-3274:
Suffering from the same problem; it happens during high load on the nodes. Our setup is pretty simple, 4 nodes: 2 shards, 2 replicas and 3 zookeeper instances. Everything is running on 3 physical nodes:
* 1st node — 1 zookeeper instance
* 2nd node — 2 shards and 1 zookeeper
* 3rd node — 2 replicas and 1 zookeeper
And running solr instances this way:
{code}
java -Xms2G -Xmx16G -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -DzkHost=zoo1.devops:2181,zoo2.devops:2181,zoo3.devops:2181 -Dcollection.configName=Carmen -Dbootstrap_confdir=./solr/conf -Dbootstrap_conf=true -DnumShards=2 -jar start.jar etc/jetty.xml
{code}
And once load increases we get:
{code}
org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are disabled.
    at org.apache.solr.update.processor.DistributedUpdateProcessor.zkCheck(DistributedUpdateProcessor.java:1306)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processDelete(DistributedUpdateProcessor.java:981)
    at org.apache.solr.update.processor.LogUpdateProcessor.processDelete(LogUpdateProcessorFactory.java:121)
    at org.apache.solr.handler.loader.XMLLoader.processDelete(XMLLoader.java:349)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:278)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:744)
{code}
It's simply impossible for all 3 zookeeper instances to go offline simultaneously. I understand that the 2nd and 3rd nodes could be overloaded because of Solr, but the 1st node runs just a single zookeeper instance and the load average on that node is close to zero. Since there's always at least 1 stable ZK node, this seems like a communication/reliability bug in Solr.
ZooKeeper related SolrCloud problems
[jira] [Comment Edited] (SOLR-3274) ZooKeeper related SolrCloud problems
[ https://issues.apache.org/jira/browse/SOLR-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14090519#comment-14090519 ] Alexander S. edited comment on SOLR-3274 at 8/8/14 9:28 AM: Suffering from the same problem, happens during high load on the nodes. Our setup is pretty simple, 4 solr instances: 2 shards, 2 replicas and 3 zookeeper instances. Everything is running on 3 physical nodes: * 1st node — 1 zookeeper instance * 2nd node — 2 solr shards and 1 zookeeper * 3rd node — 2 solr replicas and 1 zookeeper We're running solr instances this way: java -Xms2G -Xmx16G -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -DzkHost=zoo1.devops:2181,zoo2.devops:2181,zoo3.devops:2181 -Dcollection.configName=Carmen -Dbootstrap_confdir=./solr/conf -Dbootstrap_conf=true -DnumShards=2 -jar start.jar etc/jetty.xml And once loading increases we get: {code} org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are disabled. at org.apache.solr.update.processor.DistributedUpdateProcessor.zkCheck(DistributedUpdateProcessor.java:1306) at org.apache.solr.update.processor.DistributedUpdateProcessor.processDelete(DistributedUpdateProcessor.java:981) at org.apache.solr.update.processor.LogUpdateProcessor.processDelete(LogUpdateProcessorFactory.java:121) at org.apache.solr.handler.loader.XMLLoader.processDelete(XMLLoader.java:349) at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:278) at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174) at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774) at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:744) {code} It's simply impossible for all 3 ZooKeeper instances to go offline simultaneously. I understand that the 2nd and 3rd nodes could be overloaded because of Solr, but the 1st node runs just a single ZooKeeper instance and the load average on that node is close to zero. Since there's always at least 1 stable ZK node this seems like a
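Since the comment above argues that at least one ZooKeeper node must always be reachable, a quick way to check that claim is to probe each ZK node directly. A minimal sketch in Python, assuming the zoo1/zoo2/zoo3 hostnames from the startup command above; it uses ZooKeeper's standard four-letter-word `ruok` command, to which a healthy node replies `imok`:

```python
import socket

def zk_ruok(host, port=2181, timeout=2.0):
    # Send ZooKeeper's four-letter-word "ruok" command over a raw socket;
    # a healthy node answers b"imok" and closes the connection.
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(b"ruok")
        return s.recv(16)

# Hostnames taken from the -DzkHost value in the startup command above.
for host in ("zoo1.devops", "zoo2.devops", "zoo3.devops"):
    try:
        print(host, zk_ruok(host))
    except OSError as exc:
        print(host, "unreachable:", exc)
```

If all three nodes answer `imok` while Solr still reports "Cannot talk to ZooKeeper", the problem is more likely on the Solr side, e.g. long GC pauses (a 16G CMS heap as in the command above can pause longer than the ZK session timeout), which makes Solr's ZK session expire even though ZooKeeper itself is healthy.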
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077928#comment-14077928 ] Alexander S. commented on SOLR-4787: It seems join doesn't work as expected, please have a look: http://lucene.472066.n3.nabble.com/Search-results-inconsistency-when-using-joins-td4149810.html Join Contrib Key: SOLR-4787 URL: https://issues.apache.org/jira/browse/SOLR-4787 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.2.1 Reporter: Joel Bernstein Priority: Minor Fix For: 4.9, 5.0 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787-pjoin-long-keys.patch, SOLR-4787-with-testcase-fix.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4797-hjoin-multivaluekeys-nestedJoins.patch, SOLR-4797-hjoin-multivaluekeys-trunk.patch This contrib provides a place where different join implementations can be contributed to Solr. This contrib currently includes 3 join implementations. The initial patch was generated from the Solr 4.3 tag. Because of changes in the FieldCache API this patch will only build with Solr 4.2 or above. *HashSetJoinQParserPlugin aka hjoin* The hjoin provides a join implementation that filters results in one core based on the results of a search in another core. This is similar in functionality to the JoinQParserPlugin, but the implementation differs in a couple of important ways. The first way is that the hjoin is designed to work with int and long join keys only. So, in order to use hjoin, int or long join keys must be included in both the to and from core. The second difference is that the hjoin builds memory structures that are used to quickly connect the join keys. So, the hjoin will need more memory than the JoinQParserPlugin to perform the join.
The main advantage of the hjoin is that it can scale to join millions of keys between cores and provide sub-second response time. The hjoin should work well with up to two million results from the fromIndex and tens of millions of results from the main query. The hjoin supports the following features: 1) Both Lucene query and PostFilter implementations. A *cost* > 99 will turn on the PostFilter. The PostFilter will typically outperform the Lucene query when the main query results have been narrowed down. 2) With the Lucene query implementation there is an option to build the filter with threads. This can greatly improve the performance of the query if the main query index is very large. The threads parameter turns on threading. For example *threads=6* will use 6 threads to build the filter. This will set up a fixed threadpool with six threads to handle all hjoin requests. Once the threadpool is created the hjoin will always use it to build the filter. Threading does not come into play with the PostFilter. 3) The *size* local parameter can be used to set the initial size of the hashset used to perform the join. If this is set above the number of results from the fromIndex then you can avoid hashset resizing, which improves performance. 4) Nested filter queries. The local parameter fq can be used to nest a filter query within the join. The nested fq will filter the results of the join query. This can point to another join to support nested joins. 5) Full caching support for the Lucene query implementation. The filterCache and queryResultCache should work properly even with deep nesting of joins. Only the queryResultCache comes into play with the PostFilter implementation because PostFilters are not cacheable in the filterCache. The syntax of the hjoin is similar to the JoinQParserPlugin except that the plugin is referenced by the string hjoin rather than join.
fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 fq=$qq\}user:customer1&qq=group:5 The example filter query above will search the fromIndex (collection2) for user:customer1, applying the local fq parameter to filter the results. The Lucene filter query will be built using 6 threads. This query will generate a list of values from the from field that will be used to filter the main query. Only records from the main query, where the to field is present in the from list, will be included in the results. The solrconfig.xml in the main query core must contain the reference to the hjoin: <queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/> And the join contrib lib jars must be registered in the solrconfig.xml: <lib dir="../../../contrib/joins/lib" regex=".*\.jar" /> After issuing the ant dist command
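The hjoin mechanics described above can be sketched outside Solr in a few lines of Python: run the fromIndex query, collect the from-field keys into an in-memory hash set, then keep only main-query documents whose to-field value appears in that set. The documents and field values below are invented for illustration:

```python
# Illustrative sketch of the hjoin idea: filter one "core" by the results
# of a search in another core, connecting int join keys via a hash set.

# Hypothetical fromIndex (collection2) documents.
collection2 = [
    {"id_i": 1, "user": "customer1"},
    {"id_i": 2, "user": "customer1"},
    {"id_i": 3, "user": "other"},
]

# Hypothetical main-core documents.
main_core = [
    {"id_i": 1, "group": 5},
    {"id_i": 2, "group": 7},
    {"id_i": 4, "group": 5},
]

def hjoin(from_docs, from_field, to_docs, to_field, from_pred):
    # Step 1: search the fromIndex and build the in-memory key set.
    keys = {d[from_field] for d in from_docs if from_pred(d)}
    # Step 2: keep only main-query docs whose to-field is in the key set.
    return [d for d in to_docs if d[to_field] in keys]

joined = hjoin(collection2, "id_i", main_core, "id_i",
               lambda d: d["user"] == "customer1")
```

This also shows where the extra memory goes: the whole key set from the fromIndex side must fit in RAM, which is the trade-off the description above makes for sub-second joins.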
[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012449#comment-14012449 ] Alexander S. commented on SOLR-5463: I have another idea about the cursors implementation. This is just an idea; I am not sure whether it's possible to do. Is it possible to use cursors together with the start and rows parameters? That would allow using pagination and drawing links for prev, next, 1, 2, 3, n+1 pages, as we can do now. So instead of using cursorMark we'd use cursorName, which could be static. The request start:0, rows:10, cursorName:* would return the first page of results and a static cursor name, which could then be used for all other pages (i.e. start:10, rows:10, cursorName:#{received_cursor_name}). Does that make sense? Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging) -- Key: SOLR-5463 URL: https://issues.apache.org/jira/browse/SOLR-5463 Project: Solr Issue Type: New Feature Reporter: Hoss Man Assignee: Hoss Man Fix For: 4.7, 5.0 Attachments: SOLR-5463-randomized-faceting-test.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man__MissingStringLastComparatorSource.patch I'd like to revisit a solution to the problem of deep paging in Solr, leveraging an HTTP based API similar to how IndexSearcher.searchAfter works at the Lucene level: require the clients to provide back a token indicating the sort values of the last document seen on the previous page.
This is similar to the cursor model I've seen in several other REST APIs that support pagination over large sets of results (notably the Twitter API and its since_id param), except that we'll want something that works with arbitrary multi-level sort criteria that can be either ascending or descending. SOLR-1726 laid some initial ground work here and was committed quite a while ago, but the key bit of argument parsing to leverage it was commented out due to some problems (see comments in that issue). It's also somewhat out of date at this point: at the time it was committed, IndexSearcher only supported searchAfter for simple scores, not arbitrary field sorts; and the params added in SOLR-1726 suffer from this limitation as well. --- I think it would make sense to start fresh with a new issue with a focus on ensuring that we have deep paging which: * supports arbitrary field sorts in addition to sorting by score * works in distributed mode {panel:title=Basic Usage} * send a request with {{sort=X&start=0&rows=N&cursorMark=*}} ** sort can be anything, but must include the uniqueKey field (as a tie breaker) ** N can be any number you want per page ** start must be 0 ** \* denotes you want to use a cursor starting at the beginning mark * parse the response body and extract the (String) {{nextCursorMark}} value * replace the \* value in your initial request params with the {{nextCursorMark}} value from the response in the subsequent request * repeat until the {{nextCursorMark}} value stops changing, or you have collected as many docs as you need {panel} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
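The Basic Usage loop above can be simulated without a Solr server. In this sketch the cursor mark is literally the sort values of the last document seen (score descending, then uniqueKey ascending as the tie breaker), mirroring how IndexSearcher.searchAfter works; the in-memory index and field names are invented:

```python
# In-memory "index"; sort = score desc, id asc (uniqueKey as tie breaker).
docs = [{"id": i, "score": s} for i, s in
        [(1, 5.0), (2, 5.0), (3, 4.0), (4, 3.0), (5, 3.0)]]
ORDER = sorted(docs, key=lambda d: (-d["score"], d["id"]))

def search_after(cursor, rows):
    # cursor None plays the role of "*": start from the beginning mark.
    start = 0
    if cursor is not None:
        # Resume just past the last (score, id) pair the client saw.
        start = next(i for i, d in enumerate(ORDER)
                     if (-d["score"], d["id"]) == cursor) + 1
    page = ORDER[start:start + rows]
    # An unchanged cursor on an empty page signals the end of results.
    next_cursor = (-page[-1]["score"], page[-1]["id"]) if page else cursor
    return page, next_cursor

# Client loop from the Basic Usage panel: repeat until the cursor
# mark stops changing.
cursor, collected = None, []
while True:
    page, nxt = search_after(cursor, 2)
    collected += [d["id"] for d in page]
    if nxt == cursor:
        break
    cursor = nxt
```

Because the cursor encodes a position in the sort order rather than an offset, documents added or deleted between requests cannot shift pages the way start/rows paging does.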
[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010881#comment-14010881 ] Alexander S. commented on SOLR-5463: Inability to use this without sorting by a unique key (e.g. id) makes this feature useless. The same could be achieved previously by sorting by id and searching for docs where id is greater/less than the last received. See how cursors work in MongoDB, that's the right direction.
[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010883#comment-14010883 ] Alexander S. commented on SOLR-5463: http://docs.mongodb.org/manual/core/cursors/
[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010888#comment-14010888 ] Alexander S. commented on SOLR-5463: Sorry for spamming, but I can't edit my previous message. I just found that MongoDB cursors also aren't isolated and can return duplicates; I thought they were. But sorting docs by id is not acceptable in 99% of use cases, especially in Solr, where it is more expected to get results sorted by relevance.
[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011226#comment-14011226 ] Alexander S. commented on SOLR-5463: Oh, that's awesome, thanks for the tip.
[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012084#comment-14012084 ] Alexander S. commented on SOLR-5463: If, as David mentioned, Solr will add it only if it is not there, this should preserve the ability for users to manually specify another key and order when that is required (a rare case, it seems).
[jira] [Commented] (SOLR-5871) Ability to see the list of fields that matched the query with scores
[ https://issues.apache.org/jira/browse/SOLR-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967927#comment-13967927 ] Alexander S. commented on SOLR-5871: I already asked at solr-u...@lucene.apache.org but it seems the only way currently is to read the debug explanation. Unfortunately I am not a Java developer and thus unable to create a patch, but Solr's JIRA has a wish type so I posted my wish here. Ability to see the list of fields that matched the query with scores Key: SOLR-5871 URL: https://issues.apache.org/jira/browse/SOLR-5871 Project: Solr Issue Type: Wish Reporter: Alexander S. Assignee: Erick Erickson Hello, I need the ability to tell users what content matched their query, this way: | Name | Twitter Profile | Topics | Site Title | Site Description | Site content | | John Doe | Yes | No | Yes | No | Yes | | Jane Doe | No | Yes | No | No | Yes | All these columns are indexed text fields and I need to know which content matched the query; it would also be cool to be able to show the score per field. As far as I know, right now there's no way to return this information when running a query request. Debug output is suitable for visual review but has lots of nesting levels and is hard to understand. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
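Until a structured response exists, one workaround implied by the comment above is to run the query with debugQuery=true and scrape per-field score contributions out of the explain text. A rough sketch; the explain fragment and the regex are simplified assumptions (real explain output varies by query type and Solr version):

```python
import re

# A simplified fragment of what debugQuery explain text can look like.
explain = """
0.73 = sum of:
  0.41 = weight(site_title:john in 12) [...], result of: ...
  0.32 = weight(site_content:john in 12) [...], result of: ...
"""

# Pull (score, field) pairs out of lines shaped like
# "<score> = weight(<field>:<term> ...".
pattern = re.compile(r"([0-9.]+) = weight\((\w+):")

matched_fields = {field: float(score)
                  for score, field in pattern.findall(explain)}
```

A real parser would need to recursively descend the explanation tree and handle nested boolean clauses, which is exactly the "lots of nesting levels" complaint in the issue description.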
[jira] [Commented] (SOLR-5871) Ability to see the list of fields that matched the query with scores
[ https://issues.apache.org/jira/browse/SOLR-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961730#comment-13961730 ] Alexander S. commented on SOLR-5871: Any luck this could be reviewed by someone?
[jira] [Comment Edited] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961733#comment-13961733 ] Alexander S. edited comment on SOLR-4787 at 4/7/14 9:10 AM: @Kranti Parisa, hi, any luck with this? was (Author: aheaven): @Kranti Parisa, hi, any lick with this?

Join Contrib

Key: SOLR-4787
URL: https://issues.apache.org/jira/browse/SOLR-4787
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 4.2.1
Reporter: Joel Bernstein
Priority: Minor
Fix For: 4.8
Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787-pjoin-long-keys.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4797-hjoin-multivaluekeys-nestedJoins.patch, SOLR-4797-hjoin-multivaluekeys-trunk.patch

This contrib provides a place where different join implementations can be contributed to Solr. This contrib currently includes 3 join implementations. The initial patch was generated from the Solr 4.3 tag. Because of changes in the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka hjoin*

The hjoin provides a join implementation that filters results in one core based on the results of a search in another core. This is similar in functionality to the JoinQParserPlugin, but the implementation differs in a couple of important ways. The first is that the hjoin is designed to work with int and long join keys only, so in order to use hjoin, int or long join keys must be included in both the to and from cores. The second difference is that the hjoin builds memory structures that are used to quickly connect the join keys, so the hjoin will need more memory than the JoinQParserPlugin to perform the join.

The main advantage of the hjoin is that it can scale to join millions of keys between cores and provide sub-second response time. The hjoin should work well with up to two million results from the fromIndex and tens of millions of results from the main query. The hjoin supports the following features:

1) Both Lucene query and PostFilter implementations. A *cost* above 99 will turn on the PostFilter. The PostFilter will typically outperform the Lucene query when the main query results have been narrowed down.

2) With the Lucene query implementation there is an option to build the filter with threads. This can greatly improve the performance of the query if the main query index is very large. The threads parameter turns on threading; for example, *threads=6* will use 6 threads to build the filter. This will set up a fixed thread pool with six threads to handle all hjoin requests. Once the thread pool is created, the hjoin will always use it to build the filter. Threading does not come into play with the PostFilter.

3) The *size* local parameter can be used to set the initial size of the hashset used to perform the join. If this is set above the number of results from the fromIndex, then you can avoid hashset resizing, which improves performance.

4) Nested filter queries. The local parameter fq can be used to nest a filter query within the join. The nested fq will filter the results of the join query. This can point to another join to support nested joins.

5) Full caching support for the Lucene query implementation. The filterCache and queryResultCache should work properly even with deep nesting of joins. Only the queryResultCache comes into play with the PostFilter implementation, because PostFilters are not cacheable in the filterCache.

The syntax of the hjoin is similar to the JoinQParserPlugin, except that the plugin is referenced by the string hjoin rather than join:

fq={!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 fq=$qq}user:customer1&qq=group:5

The example filter query above will search the fromIndex (collection2) for user:customer1, applying the local fq parameter to filter the results. The Lucene filter query will be built using 6 threads. This query will generate a list of values from the from field that will be used to filter the main query. Only records from the main query where the to field is present in the from list will be included in the results.

The solrconfig.xml in the main query core must contain the reference to the hjoin:

<queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>

And the join contrib lib jars must be registered in solrconfig.xml:

<lib dir="../../../contrib/joins/lib" regex=".*\.jar" />

After issuing the ant dist command from inside the solr directory, the joins contrib jar will appear in the solr/dist directory. Place the solr-joins-4.*-.jar in the WEB-INF/lib directory.
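The hash-set join described above can be sketched conceptually: collect the from-field keys of every document matching the fromIndex query, then keep only main-query documents whose to field appears in that key set. Below is a minimal Python sketch of those semantics only, not Solr's actual implementation; the document dicts and field values are invented for illustration.

```python
# Conceptual sketch of the hjoin semantics: filter main-query results by a
# set of int/long join keys gathered from a search in another core.
# NOT Solr's implementation; documents and values are made up.

def hash_set_join(from_docs, from_field, to_docs, to_field):
    """Keep to_docs whose to_field value appears among from_docs' from_field values."""
    # The in-memory structure that "quickly connects the join keys".
    keys = {doc[from_field] for doc in from_docs if from_field in doc}
    return [doc for doc in to_docs if doc.get(to_field) in keys]

# fromIndex (collection2) results for the query user:customer1
collection2 = [
    {"id_i": 5, "user": "customer1"},
    {"id_i": 9, "user": "customer1"},
]
# Main-query results; only those whose id_i is in the from-key set survive.
main_results = [
    {"id_i": 5, "title": "doc a"},
    {"id_i": 7, "title": "doc b"},
    {"id_i": 9, "title": "doc c"},
]
joined = hash_set_join(collection2, "id_i", main_results, "id_i")
print([d["title"] for d in joined])  # -> ['doc a', 'doc c']
```

This also illustrates the *size* local parameter's role: pre-sizing the key set to the fromIndex result count would avoid rehashing as keys are added.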
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961733#comment-13961733 ] Alexander S. commented on SOLR-4787: @Kranti Parisa, hi, any luck with this?
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943153#comment-13943153 ] Alexander S. commented on SOLR-4787: Kranti Parisa, did you try to apply this patch to 4.7.0? I downloaded it here: http://www.apache.org/dyn/closer.cgi/lucene/solr/4.7.0 and then did the next steps:
* ant compile
* ant ivy-bootstrap
* ant dist
And then created a package for my Linux distribution, but no luck; Solr fails to initialize with <queryParser name="hjoin" class="org.apache.solr.search.joins.HashSetJoinQParserPlugin"/>.
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940740#comment-13940740 ] Alexander S. commented on SOLR-4787: Any query fails; it seems I am doing something wrong (perhaps the patch was applied incorrectly). I see this error when trying to access the web interface: {quote} SolrCore Initialization Failures crm-dev: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.search.joins.HashSetJoinQParserPlugin' {quote}
[jira] [Created] (SOLR-5871) Ability to see the list of fields that matched the query with scores
Alexander S. created SOLR-5871: -- Summary: Ability to see the list of fields that matched the query with scores Key: SOLR-5871 URL: https://issues.apache.org/jira/browse/SOLR-5871 Project: Solr Issue Type: Wish Reporter: Alexander S. Hello, I need the ability to show users what content matched their query, this way:
| Name | Twitter Profile | Topics | Site Title | Site Description | Site content |
| John Doe | Yes | No | Yes | No | Yes |
| Jane Doe | No | Yes | No | No | Yes |
All these columns are indexed text fields; I need to know what content matched the query, and it would also be cool to be able to show the score per field. As far as I know, right now there's no way to return this information when running a query request. Debug output is suitable for visual review but has lots of nesting levels and is hard to understand. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
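Absent such a feature in Solr, the Yes/No table in this wish can be approximated client-side: given the text of each indexed field for a returned document, check which fields contain any of the query terms. A toy Python sketch of that idea follows; the field names and contents are invented, and this naive substring check ignores analysis (tokenization, stemming) that Solr would apply.

```python
# Toy approximation of "which fields matched the query": for each field's
# stored text, report whether any query term occurs in it. Field names and
# values are hypothetical; real matching would go through Solr's analyzers.

def per_field_matches(doc_fields, query_terms):
    """Map each field name to True/False depending on whether any term occurs in it."""
    return {
        field: any(term.lower() in text.lower() for term in query_terms)
        for field, text in doc_fields.items()
    }

john = {
    "name": "John Doe",
    "twitter_profile": "coffee and search engines",
    "site_title": "All about coffee",
}
print(per_field_matches(john, ["coffee"]))
# -> {'name': False, 'twitter_profile': True, 'site_title': True}
```

Per-field scores are harder to approximate this way; that part of the wish really does need server-side support such as the debug explain output mentioned above.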
[jira] [Updated] (SOLR-5871) Ability to see the list of fields that matched the query with scores
[ https://issues.apache.org/jira/browse/SOLR-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-5871: --- Description: Hello, I need the ability to tell users what content matched their query, this way:
| Name | Twitter Profile | Topics | Site Title | Site Description | Site content |
| John Doe | Yes | No | Yes | No | Yes |
| Jane Doe | No | Yes | No | No | Yes |
All these columns are indexed text fields; I need to know what content matched the query, and it would also be cool to be able to show the score per field. As far as I know, right now there's no way to return this information when running a query request. Debug output is suitable for visual review but has lots of nesting levels and is hard to understand.

was: Hello, I need the ability to show users what content matched their query, this way:
| Name | Twitter Profile | Topics | Site Title | Site Description | Site content |
| John Doe | Yes | No | Yes | No | Yes |
| Jane Doe | No | Yes | No | No | Yes |
All these columns are indexed text fields; I need to know what content matched the query, and it would also be cool to be able to show the score per field. As far as I know, right now there's no way to return this information when running a query request. Debug output is suitable for visual review but has lots of nesting levels and is hard to understand.
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937732#comment-13937732 ] Alexander S. commented on SOLR-4787: Thank you, Kranti Parisa. I am far from Java development; how can I apply this patch and build Solr for Linux? I tried to patch (it creates a new folder, joins, in solr/contrib), installed ivy, and launched ant compile, but got this error: {quote}
common.compile-core:
[mkdir] Created dir: /home/heaven/Desktop/solr-4.7.0/solr/build/contrib/solr-joins/classes/java
[javac] Compiling 3 source files to /home/heaven/Desktop/solr-4.7.0/solr/build/contrib/solr-joins/classes/java
[javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6
[javac] /home/heaven/Desktop/solr-4.7.0/solr/contrib/joins/src/java/org/apache/solr/joins/HashSetJoinQParserPlugin.java:883: error: reached end of file while parsing
[javac] return this.delegate.acceptsDocsOutOfOrder();
[javac] ^
[javac] /home/heaven/Desktop/solr-4.7.0/solr/contrib/joins/src/java/org/apache/solr/joins/HashSetJoinQParserPlugin.java:884: error: reached end of file while parsing
[javac] 2 errors
[javac] 1 warning
BUILD FAILED
/home/heaven/Desktop/solr-4.7.0/build.xml:106: The following error occurred while executing this line:
/home/heaven/Desktop/solr-4.7.0/solr/common-build.xml:458: The following error occurred while executing this line:
/home/heaven/Desktop/solr-4.7.0/solr/common-build.xml:449: The following error occurred while executing this line:
/home/heaven/Desktop/solr-4.7.0/lucene/common-build.xml:471: The following error occurred while executing this line:
/home/heaven/Desktop/solr-4.7.0/lucene/common-build.xml:1736: Compile failed; see the compiler error output for details.
Total time: 8 minutes 55 seconds {quote}
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937747#comment-13937747 ] Alexander S. commented on SOLR-4787: Nvm, there were 3 missing } at the end of HashSetJoinQParserPlugin.java. The build was successful, testing now.
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937845#comment-13937845 ] Alexander S. commented on SOLR-4787: Kranti, do I need to update anything in my Solr config/schema? I've just tried the patched version and it still ignores the fq parameter. I was using Solr 4.7.0. Thanks, Alex
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937887#comment-13937887 ] Alexander S. commented on SOLR-4787:

Hi, I am using the simple join, this way: {!join from=profile_ids_im to=id_i fq=$joinFilter1 v=$joinQuery1}.

Join Contrib
Key: SOLR-4787
URL: https://issues.apache.org/jira/browse/SOLR-4787
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 4.2.1
Reporter: Joel Bernstein
Priority: Minor
Fix For: 4.8
Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787-pjoin-long-keys.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4797-hjoin-multivaluekeys-nestedJoins.patch, SOLR-4797-hjoin-multivaluekeys-trunk.patch

This contrib provides a place where different join implementations can be contributed to Solr. This contrib currently includes 3 join implementations. The initial patch was generated from the Solr 4.3 tag. Because of changes in the FieldCache API this patch will only build with Solr 4.2 or above.

*HashSetJoinQParserPlugin aka hjoin*

The hjoin provides a join implementation that filters results in one core based on the results of a search in another core. This is similar in functionality to the JoinQParserPlugin, but the implementation differs in a couple of important ways. The first is that the hjoin is designed to work with int and long join keys only, so in order to use hjoin, int or long join keys must be included in both the to and from core. The second difference is that the hjoin builds memory structures that are used to quickly connect the join keys, so the hjoin will need more memory than the JoinQParserPlugin to perform the join.
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937921#comment-13937921 ] Alexander S. commented on SOLR-4787:

Ok, thx, I'll try with hjoin. And yes, I am trying to do it on the same core.
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938117#comment-13938117 ] Alexander S. commented on SOLR-4787:

Getting this error:
{code}
RSolr::Error::Http - 500 Internal Server Error
Error: {msg=SolrCore 'crm-dev' is not available due to init failure: Error loading class 'org.apache.solr.search.joins.HashSetJoinQParserPlugin',trace=org.apache.solr.common.SolrException: SolrCore 'crm-dev' is not available due to init failure: Error loading class 'org.apache.solr.search.joins.HashSetJoinQParserPlugin'
  at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:827)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:309)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
  at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
  at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
  at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
  at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
  at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
{code}
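One detail worth noting about this stack trace: it fails loading the class org.apache.solr.search.joins.HashSetJoinQParserPlugin, while the registration instructions earlier in this issue use the package org.apache.solr.joins. The class attribute has to match the package actually inside the contrib jar. A minimal solrconfig.xml sketch of the registration as described in this issue (paths and package name are taken from the issue text, not verified against any particular build):

```xml
<!-- load the join contrib jars; dir is relative to the core's instanceDir -->
<lib dir="../../../contrib/joins/lib" regex=".*\.jar" />

<!-- register the parser under the name used in {!hjoin ...} queries;
     the class must match the package inside the contrib jar -->
<queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>
```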
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920798#comment-13920798 ] Alexander S. commented on SOLR-4787:

Hi Joel, thanks. It seems I need to perform a nested join inside a single collection, but I need fq inside the join as shown here: https://issues.apache.org/jira/browse/SOLR-4787?focusedCommentId=13750854&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13750854

I have a single collection with a field type which determines the kind of document. There are 3 types of documents: Profile, Site, and SiteSource. When searching for Profiles I have to look in SiteSource content, so I need something like this:

{code}
q = {!join from=owner_id_im to=id_i fq=$joinFilter1 v=$joinQuery1} # Profile → Site join
joinQuery1 = {!join from=site_id_i to=id_i fq=$joinFilter2 v=$joinQuery2} # Site → SiteSource join
joinQuery2 = {!edismax}my_keywords
joinFilter1 = type:Site
joinFilter2 = type:SiteSource
{code}

Right now this works only partially: fq inside {!join} is ignored. When can we expect this patch to be merged? Also, will it work in the way I've explained, or do I understand it wrong?

Thank you, Alex
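The Profile ← Site ← SiteSource chain described in the comment above can be sketched in plain Python to show how the nested fq filters and join keys compose. The field and type names come from the comment; the documents and the join helper are invented in-memory stand-ins, not Solr code.

```python
# Hypothetical in-memory model of the nested join from the comment:
#   SiteSources matching the keywords -> their Sites -> the owning Profiles.
# Each level applies its nested fq (the type: filter) before joining keys.

docs = [
    {"id_i": 1,   "type": "Profile"},
    {"id_i": 2,   "type": "Profile"},
    {"id_i": 10,  "type": "Site",       "owner_id_im": [1]},
    {"id_i": 100, "type": "SiteSource", "site_id_i": 10, "content": "my_keywords"},
]

def join(from_docs, from_field, candidates):
    """Keep candidates whose id_i appears in from_field of from_docs."""
    keys = set()
    for d in from_docs:
        v = d.get(from_field, [])
        keys.update(v if isinstance(v, list) else [v])
    return [c for c in candidates if c["id_i"] in keys]

# joinQuery2 restricted by joinFilter2: keyword search over type:SiteSource
sources = [d for d in docs
           if d["type"] == "SiteSource" and "my_keywords" in d.get("content", "")]

# joinQuery1 restricted by joinFilter1: SiteSource.site_id_i -> id_i, type:Site
sites = [d for d in join(sources, "site_id_i", docs) if d["type"] == "Site"]

# outer join: Site.owner_id_im -> id_i yields the matching Profiles
profiles = join(sites, "owner_id_im", docs)
```

Note that owner_id_im is treated as multi-valued, matching the _im suffix in the comment; that is exactly the multi-value key case addressed by the SOLR-4797-hjoin-multivaluekeys patches attached to this issue.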
[jira] [Comment Edited] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920798#comment-13920798 ] Alexander S. edited comment on SOLR-4787 at 3/5/14 12:36 PM:

Hi Joel, thanks. It seems I need to perform a nested join inside a single collection, but I need fq inside the join as shown here: https://issues.apache.org/jira/browse/SOLR-4787?focusedCommentId=13750854&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13750854

I have a single collection with a field type which determines the kind of document. There are 3 types of documents: Profile, Site, and SiteSource. When searching for Profiles I have to look in SiteSource content, so I need something like this:

{code}
q = {!join from=owner_id_im to=id_i fq=$joinFilter1 v=$joinQuery1} # Profile → Site join
joinQuery1 = {!join from=site_id_i to=id_i fq=$joinFilter2 v=$joinQuery2} # Site → SiteSource join
joinQuery2 = {!edismax}my_keywords
joinFilter1 = type:Site
joinFilter2 = type:SiteSource
{code}

Right now this works only partially: fq inside \{!join\} is ignored. When can we expect this patch to be merged? Also, will it work in the way I've explained, or do I understand it wrong?

Thank you, Alex
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920805#comment-13920805 ] Alexander S. commented on SOLR-4787:

Hi, 4.4 and 4.7
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915768#comment-13915768 ] Alexander S. commented on SOLR-4787:

Just tried 4.7.0 and it does not work either.
The hjoin should work well with up to two million results from the fromIndex and tens of millions of results from the main query. The hjoin supports the following features: 1) Both lucene query and PostFilter implementations. A *cost* 99 will turn on the PostFilter. The PostFilter will typically outperform the Lucene query when the main query results have been narrowed down. 2) With the lucene query implementation there is an option to build the filter with threads. This can greatly improve the performance of the query if the main query index is very large. The threads parameter turns on threading. For example *threads=6* will use 6 threads to build the filter. This will setup a fixed threadpool with six threads to handle all hjoin requests. Once the threadpool is created the hjoin will always use it to build the filter. Threading does not come into play with the PostFilter. 3) The *size* local parameter can be used to set the initial size of the hashset used to perform the join. If this is set above the number of results from the fromIndex then the you can avoid hashset resizing which improves performance. 4) Nested filter queries. The local parameter fq can be used to nest a filter query within the join. The nested fq will filter the results of the join query. This can point to another join to support nested joins. 5) Full caching support for the lucene query implementation. The filterCache and queryResultCache should work properly even with deep nesting of joins. Only the queryResultCache comes into play with the PostFilter implementation because PostFilters are not cacheable in the filterCache. The syntax of the hjoin is similar to the JoinQParserPlugin except that the plugin is referenced by the string hjoin rather then join. 
fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 fq=$qq\}user:customer1&qq=group:5 The example filter query above will search the fromIndex (collection2) for user:customer1, applying the local fq parameter to filter the results. The Lucene filter query will be built using 6 threads. This query will generate a list of values from the from field that will be used to filter the main query. Only records from the main query where the to field is present in the from list will be included in the results. The solrconfig.xml in the main query core must contain the reference to the hjoin: <queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/> And the join contrib lib jars must be registered in the solrconfig.xml: <lib dir="../../../contrib/joins/lib" regex=".*\.jar" /> After issuing the ant dist command from inside the solr directory, the joins contrib jar will appear in the solr/dist directory. Place the solr-joins-4.*-.jar in the WEB-INF/lib directory of the solr web application. This will
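Putting the registration steps above together, a minimal solrconfig.xml sketch for the main query core (paths and names are illustrative, taken from the description, and may need adjusting to your layout):

```xml
<!-- Sketch only: the lib path is relative to the core's instance dir and
     depends on where the joins contrib was built. -->
<config>
  <lib dir="../../../contrib/joins/lib" regex=".*\.jar" />
  <queryParser name="hjoin"
               class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>
</config>
```

A nested join (feature 4) would then chain a second hjoin through the local fq parameter, e.g. fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i fq=$jq\}user:customer1&jq=\{!hjoin fromIndex=collection3 from=id_i to=id_i\}group:5 (collection and field names here are hypothetical).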
[jira] [Commented] (SOLR-4787) Join Contrib
[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13912786#comment-13912786 ] Alexander S. commented on SOLR-4787: Which release has support for {!join} with the fq parameter? I was trying with 4.5.1 but fq does not seem to have any effect. Join Contrib Key: SOLR-4787 URL: https://issues.apache.org/jira/browse/SOLR-4787 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.2.1 Reporter: Joel Bernstein Priority: Minor Fix For: 4.7 Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787-pjoin-long-keys.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4797-hjoin-multivaluekeys-trunk.patch This contrib provides a place where different join implementations can be contributed to Solr. This contrib currently includes 3 join implementations. The initial patch was generated from the Solr 4.3 tag. Because of changes in the FieldCache API, this patch will only build with Solr 4.2 or above. *HashSetJoinQParserPlugin aka hjoin* The hjoin provides a join implementation that filters results in one core based on the results of a search in another core. This is similar in functionality to the JoinQParserPlugin, but the implementation differs in a couple of important ways. The first is that the hjoin is designed to work with int and long join keys only, so in order to use hjoin, int or long join keys must be included in both the to and from cores. The second difference is that the hjoin builds memory structures that are used to quickly connect the join keys, so the hjoin will need more memory than the JoinQParserPlugin to perform the join. The main advantage of the hjoin is that it can scale to join millions of keys between cores and provide sub-second response time.
The hjoin should work well with up to two million results from the fromIndex and tens of millions of results from the main query. The hjoin supports the following features: 1) Both Lucene query and PostFilter implementations. A *cost* > 99 will turn on the PostFilter. The PostFilter will typically outperform the Lucene query when the main query results have been narrowed down. 2) With the Lucene query implementation there is an option to build the filter with threads. This can greatly improve the performance of the query if the main query index is very large. The threads parameter turns on threading; for example, *threads=6* will use 6 threads to build the filter. This sets up a fixed thread pool with six threads to handle all hjoin requests. Once the thread pool is created, the hjoin will always use it to build the filter. Threading does not come into play with the PostFilter. 3) The *size* local parameter can be used to set the initial size of the hashset used to perform the join. If this is set above the number of results from the fromIndex, then you can avoid hashset resizing, which improves performance. 4) Nested filter queries. The local parameter fq can be used to nest a filter query within the join. The nested fq will filter the results of the join query. This can point to another join to support nested joins. 5) Full caching support for the Lucene query implementation. The filterCache and queryResultCache should work properly even with deep nesting of joins. Only the queryResultCache comes into play with the PostFilter implementation, because PostFilters are not cacheable in the filterCache. The syntax of the hjoin is similar to the JoinQParserPlugin, except that the plugin is referenced by the string hjoin rather than join.
fq=\{!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 fq=$qq\}user:customer1&qq=group:5 The example filter query above will search the fromIndex (collection2) for user:customer1, applying the local fq parameter to filter the results. The Lucene filter query will be built using 6 threads. This query will generate a list of values from the from field that will be used to filter the main query. Only records from the main query where the to field is present in the from list will be included in the results. The solrconfig.xml in the main query core must contain the reference to the hjoin: <queryParser name="hjoin" class="org.apache.solr.joins.HashSetJoinQParserPlugin"/> And the join contrib lib jars must be registered in the solrconfig.xml: <lib dir="../../../contrib/joins/lib" regex=".*\.jar" /> After issuing the ant dist command from inside the solr directory, the joins contrib jar will appear in the solr/dist directory. Place the
[jira] [Commented] (LUCENE-4963) Deprecate broken TokenFilter constructors
[ https://issues.apache.org/jira/browse/LUCENE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841236#comment-13841236 ] Alexander S. commented on LUCENE-4963: -- Hi, how are we now supposed to fix this? http://stackoverflow.com/questions/18668376/solr-4-4-stopfilterfactory-and-enablepositionincrements Deprecate broken TokenFilter constructors - Key: LUCENE-4963 URL: https://issues.apache.org/jira/browse/LUCENE-4963 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Fix For: 4.4 Attachments: LUCENE-4963.patch We have some TokenFilters which are only broken with specific options. This includes: * TrimFilter when updateOffsets=true * StopFilter, JapanesePartOfSpeechStopFilter, KeepWordFilter, LengthFilter, TypeTokenFilter when enablePositionIncrements=false I think we should deprecate these behaviors in 4.4 and remove them in trunk. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4963) Deprecate broken TokenFilter constructors
[ https://issues.apache.org/jira/browse/LUCENE-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841239#comment-13841239 ] Alexander S. commented on LUCENE-4963: -- I index twitter.com/testuser, then search for http://www.twitter.com/testuser. These are in the stopwords list: http https www. No results. Deprecate broken TokenFilter constructors - Key: LUCENE-4963 URL: https://issues.apache.org/jira/browse/LUCENE-4963 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Fix For: 4.4 Attachments: LUCENE-4963.patch We have some TokenFilters which are only broken with specific options. This includes: * TrimFilter when updateOffsets=true * StopFilter, JapanesePartOfSpeechStopFilter, KeepWordFilter, LengthFilter, TypeTokenFilter when enablePositionIncrements=false I think we should deprecate these behaviors in 4.4 and remove them in trunk. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
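One workaround for the removed enablePositionIncrements=false behavior is to strip the URL prefixes before tokenization instead of stopping them out, so no position "holes" are left behind. A sketch using MappingCharFilterFactory; the field type name, the mapping file name, and its contents are assumptions for illustration, not part of this issue:

```xml
<!-- Sketch: remove protocol/host prefixes at the char-filter stage, before
     tokenization, so "http://www.twitter.com/testuser" analyzes the same
     as "twitter.com/testuser". -->
<fieldType name="url_words_ngram" class="solr.TextField" autoGeneratePhraseQueries="true">
  <analyzer>
    <!-- mapping-urls.txt (hypothetical) would contain lines such as:
         "http\://" => ""
         "https\://" => ""
         "www." => ""        -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-urls.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```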
[jira] [Updated] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-5332: --- Affects Version/s: 4.4 Add preserve original setting to the EdgeNGramFilterFactory - Key: SOLR-5332 URL: https://issues.apache.org/jira/browse/SOLR-5332 Project: Solr Issue Type: Wish Affects Versions: 4.4 Reporter: Alexander S. Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings 2 and 25, search requests for these urls will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first url has 1 at the end, which is lower than the allowed min gram size. In the second url the user name is longer than the max gram size (27 characters). It would be good to have a preserve original option that adds the original string to the index if it does not fit the allowed gram size, so that the 1 and someveryandverylongusername tokens will also be added to the index. Best, Alex -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-5332: --- Affects Version/s: 4.6 4.5 4.5.1 Add preserve original setting to the EdgeNGramFilterFactory - Key: SOLR-5332 URL: https://issues.apache.org/jira/browse/SOLR-5332 Project: Solr Issue Type: Wish Affects Versions: 4.4, 4.5, 4.5.1, 4.6 Reporter: Alexander S. Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings 2 and 25, search requests for these urls will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first url has 1 at the end, which is lower than the allowed min gram size. In the second url the user name is longer than the max gram size (27 characters). It would be good to have a preserve original option that adds the original string to the index if it does not fit the allowed gram size, so that the 1 and someveryandverylongusername tokens will also be added to the index. Best, Alex -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory
Alexander S. created SOLR-5332: -- Summary: Add preserve original setting to the EdgeNGramFilterFactory Key: SOLR-5332 URL: https://issues.apache.org/jira/browse/SOLR-5332 Project: Solr Issue Type: Wish Reporter: Alexander S. Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings 2 and 25, search requests for these urls will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first url has 1 at the end, which is lower than the allowed min gram size. In the second url the user name is longer than the max gram size (27 characters). It would be good to have a preserve original option that adds the original string to the index if it does not fit the allowed gram size, so that the 1 and someveryandverylongusername tokens will also be added to the index. Best, Alex -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander S. updated SOLR-5332: --- Description: Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings 2 and 25, search requests for these urls will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first url has 1 at the end, which is lower than the allowed min gram size. In the second url the user name is longer than the max gram size (27 characters). It would be good to have a preserve original option that adds the original string to the index if it does not fit the allowed gram size, so that the 1 and someveryandverylongusername tokens will also be added to the index. Best, Alex was: Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings 2 and 25, search requests for these urls will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first url has 1 at the end, which is lower than the allowed min gram size. In the second url the user name is longer than the max gram size (27 characters). It would be good to have a preserve original option that adds the original string to the index if it does not fit the allowed gram size, so that the 1 and someveryandverylongusername tokens will also be added to the index.
Best, Alex Add preserve original setting to the EdgeNGramFilterFactory - Key: SOLR-5332 URL: https://issues.apache.org/jira/browse/SOLR-5332 Project: Solr Issue Type: Wish Reporter: Alexander S. Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings 2 and 25, search requests for these urls will fail. But search requests for: 1. facebook.com/someuser 2. facebook.com/someveryandverylonguserna will work properly. That's because the first url has 1 at the end, which is lower than the allowed min gram size. In the second url the user name is longer than the max gram size (27 characters). It would be good to have a preserve original option that adds the original string to the index if it does not fit the allowed gram size, so that the 1 and someveryandverylongusername tokens will also be added to the index. Best, Alex -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
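For reference, later Lucene/Solr releases added a preserveOriginal option to the n-gram filter factories, which matches the wish described here. A sketch of how the field could then be configured; whether the attribute is available depends on the Solr version in use, so treat this as an assumption to verify against your release:

```xml
<fieldType name="url_edge_ngram" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- preserveOriginal="true" also emits the untouched token, so "1"
         (below minGramSize) and a 27-character username (above
         maxGramSize) remain searchable alongside the grams. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"
            preserveOriginal="true"/>
  </analyzer>
</fieldType>
```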
[jira] [Commented] (SOLR-874) Dismax parser exceptions on trailing OPERATOR
Alexander S. commented on SOLR-874 Dismax parser exceptions on trailing OPERATOR Hi, sorry for asking this here, but is the next error related to this issue? Aug 26, 2012 8:22:33 AM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:134) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:165) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:929) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:405) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:279) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:515) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:300) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:679) Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse '"admission";"adolescent";"adrenal gland disorders";"adrenocortical carcinoma";"adrenoleukodystrophy see leukodystrophies";"advocacy";"afd";"affordability";"african american health";"africaso";"aga";"aganglionic megacolon";"aggressive mastocytosis";"aging";"agranulocytic angina";"agu";"agyria";"ahc";"ahd";"ahds";"ahus";"aicardi syndrome";"aids";"aids and infections";"aids and pregnancy";"': Lexical error at line 1, column 391. Encountered: EOF after : "" at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:216) at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:79) at org.apache.solr.search.QParser.getQuery(QParser.java:143) at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105) ... 21 more Caused by: org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 391. Encountered: EOF after : "" at org.apache.lucene.queryParser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1229) at org.apache.lucene.queryParser.QueryParser.jj_scan_token(QueryParser.java:1733) at org.apache.lucene.queryParser.QueryParser.jj_3R_2(QueryParser.java:1616) at org.apache.lucene.queryParser.QueryParser.jj_3_1(QueryParser.java:1623) at org.apache.lucene.queryParser.QueryParser.jj_2_1(QueryParser.java:1609) at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1288) at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1274) at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1234) at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:206) ... 
24 more And this one also looks very similar http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201203.mbox/%3C007b01ccf78e$9171c1f0$b45545d0$@gmail.com%3E Best, Alex This message is automatically generated by JIRA. If you think it was
[jira] [Comment Edited] (SOLR-874) Dismax parser exceptions on trailing OPERATOR
Alexander S. edited a comment on SOLR-874 Dismax parser exceptions on trailing OPERATOR Hi, sorry for asking this here, but is the next error related to this issue? Aug 26, 2012 8:36:24 AM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:134) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:165) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:929) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:405) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:279) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:515) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:300) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:679) Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse '"hgps" "hhho" "hhrh" ...truncated... "kidney stones" "kidney transplant" "kidney trafq=type:Tweet': Lexical error at line 1, column 6783. Encountered: EOF after : "\"kidney trafq=type:Tweet" at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:216) at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:79) at org.apache.solr.search.QParser.getQuery(QParser.java:143) at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105) ... 21 more Caused by: org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 6783. Encountered: EOF after : "\"kidney trafq=type:Tweet" at org.apache.lucene.queryParser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1229) at org.apache.lucene.queryParser.QueryParser.jj_ntk(QueryParser.java:1772) at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1555) at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1317) at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1274) at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1234) at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:206) ... 24 more And this one also looks very similar http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201203.mbox/%3C007b01ccf78e$9171c1f0$b45545d0$@gmail.com%3E Best, Alex This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-874) Dismax parser exceptions on trailing OPERATOR
Alexander S. edited a comment on SOLR-874 Dismax parser exceptions on trailing OPERATOR Hi, sorry for asking this here, but is the next error related to this issue? Aug 26, 2012 8:36:24 AM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:134) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:165) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:929) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:405) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:279) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:515) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:300) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:679) Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse '"hgps" "hhho" "hhrh" ...truncated... "kidney stones" "kidney transplant" "kidney trafq=type:Tweet': Lexical error at line 1, column 6783. Encountered: EOF after : "\"kidney trafq=type:Tweet" at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:216) at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:79) at org.apache.solr.search.QParser.getQuery(QParser.java:143) at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105) ... 21 more Caused by: org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 6783. Encountered: EOF after : "\"kidney trafq=type:Tweet" at org.apache.lucene.queryParser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1229) at org.apache.lucene.queryParser.QueryParser.jj_ntk(QueryParser.java:1772) at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1555) at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1317) at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1274) at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1234) at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:206) ... 24 more "kidney trafq=" should be "kidney transplantation" fq='type:Tweet', so it looks like the query string was truncated. And this one also looks very similar http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201203.mbox/%3C007b01ccf78e$9171c1f0$b45545d0$@gmail.com%3E Best, Alex This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
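The fused value "kidney trafq=type:Tweet" above is what hand-concatenated, unescaped request parameters tend to produce: the fq parameter merged into a truncated q value. A minimal sketch (with illustrative terms, not the original query) of building the parameter string safely with Python's standard urlencode:

```python
from urllib.parse import urlencode

# Build Solr parameters as a dict and let urlencode escape them.
# Concatenating 'q=' + terms + 'fq=type:Tweet' by hand without the
# separating '&' (or with an over-long, truncated value) fuses the
# parameters, which is consistent with the lexical error above.
params = {
    "q": '"kidney stones" "kidney transplantation"',  # illustrative terms
    "fq": "type:Tweet",
    "wt": "json",
}
query_string = urlencode(params)
print(query_string)
```

Each value is percent-encoded individually, so embedded quotes, colons, and ampersands can never be mistaken for parameter separators.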
[jira] Created: (SOLR-1862) CLONE -java.io.IOException: read past EOF
CLONE -java.io.IOException: read past EOF - Key: SOLR-1862 URL: https://issues.apache.org/jira/browse/SOLR-1862 Project: Solr Issue Type: Bug Affects Versions: 1.4 Reporter: Alexander S Assignee: Yonik Seeley Priority: Critical Fix For: 1.5 A query with relevancy scores of all zeros produces an invalid doclist that includes sentinel values 2147483647 and causes Solr to request that invalid docid from Lucene which results in a java.io.IOException: read past EOF http://search.lucidimagination.com/search/document/2d5359c0e0d103be/java_io_ioexception_read_past_eof_after_solr_1_4_0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.