[jira] [Commented] (SOLR-10115) Corruption in read-side of SOLR-HDFS stack

2017-02-21 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877188#comment-15877188
 ] 

Yonik Seeley commented on SOLR-10115:
-

OK, after the fixes in SOLR-10121 and SOLR-10141, I can no longer reproduce 
failures with the attached test.
I still need to turn it into a proper unit test before committing it, 
though.

> Corruption in read-side of SOLR-HDFS stack
> --
>
> Key: SOLR-10115
> URL: https://issues.apache.org/jira/browse/SOLR-10115
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 4.4
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Attachments: YCS_HdfsTest.java
>
>
> I've been trying to track down some random AIOOB exceptions in Lucene for a 
> customer, and I've managed to reproduce the issue with a unit test of 
> sufficient size in conjunction with highly concurrent read requests.
> A typical stack trace looks like:
> {code}
> org.apache.solr.common.SolrException; 
> java.lang.ArrayIndexOutOfBoundsException: 172033655
> at org.apache.lucene.codecs.lucene40.BitVector.get(BitVector.java:149)
> at 
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc(Lucene41PostingsReader.java:455)
> at 
> org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:111)
> at 
> org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:157)
> {code}
> The number of unique stack traces is relatively high: mostly AIOOB exceptions, 
> but some EOF.  Most exceptions occur in the term index; however, I believe 
> this may be just an artifact of where highly concurrent access is most likely 
> to occur.  The queries that triggered this had many wildcards and other 
> multi-term queries.
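
The reproduction approach described above (a data set of sufficient size read by many threads at once) can be sketched roughly as follows. This is only an illustration under stated assumptions and is not the contents of YCS_HdfsTest.java: the class name, the ByteSource interface, and the byte pattern are all hypothetical, and the in-memory array stands in for whatever reader is actually under test. Each worker reads random positions and compares the result with a value that can be recomputed from the position, so both wrong bytes and runtime exceptions (such as the AIOOB above) count as failures.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class ConcurrentReadCheckSketch {

  /** Hypothetical abstraction over the reader under test. */
  interface ByteSource {
    byte readByte(int pos);
  }

  /** Deterministic pattern, so any position can be verified without a second copy of the data. */
  static byte expectedByte(int pos) {
    return (byte) (pos * 31 + 7);
  }

  /** Reads random positions from many threads; returns the number of wrong or failed reads. */
  static long countFailures(ByteSource source, int length, int threads, int readsPerThread) {
    AtomicLong failures = new AtomicLong();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int i = 0; i < readsPerThread; i++) {
          int pos = rnd.nextInt(length);
          try {
            if (source.readByte(pos) != expectedByte(pos)) {
              failures.incrementAndGet(); // wrong data returned
            }
          } catch (RuntimeException e) {
            failures.incrementAndGet(); // e.g. ArrayIndexOutOfBoundsException
          }
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
        throw new IllegalStateException("readers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return failures.get();
  }

  public static void main(String[] args) {
    int length = 1 << 20;
    byte[] data = new byte[length];
    for (int i = 0; i < length; i++) {
      data[i] = expectedByte(i);
    }
    // In-memory stand-in; a real test would plug in the cached reader here.
    long failures = countFailures(pos -> data[pos], length, 16, 100_000);
    System.out.println("failed reads: " + failures);
  }
}
```

With a correct source the count stays at zero; a source that returns stale or misplaced blocks under concurrency would show up as a non-zero count rather than depending on which downstream code happens to throw.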



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10115) Corruption in read-side of SOLR-HDFS stack

2017-02-10 Thread Michael Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861936#comment-15861936
 ] 

Michael Sun commented on SOLR-10115:


Yeah, that's right. keysToRelease is a concurrent data structure.

> Corruption in read-side of SOLR-HDFS stack
> --
>
> Key: SOLR-10115
> URL: https://issues.apache.org/jira/browse/SOLR-10115
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 4.10
>Reporter: Yonik Seeley
> Attachments: YCS_HdfsTest.java
>
>
> I've been trying to track down some random AIOOB exceptions in Lucene for a 
> customer, and I've managed to reproduce the issue with a unit test of 
> sufficient size in conjunction with highly concurrent read requests.
> A typical stack trace looks like:
> {code}
> org.apache.solr.common.SolrException; 
> java.lang.ArrayIndexOutOfBoundsException: 172033655
> at org.apache.lucene.codecs.lucene40.BitVector.get(BitVector.java:149)
> at 
> org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.nextDoc(Lucene41PostingsReader.java:455)
> at 
> org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:111)
> at 
> org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:157)
> {code}
> The number of unique stack traces is relatively high: mostly AIOOB exceptions, 
> but some EOF.  Most exceptions occur in the term index; however, I believe 
> this may be just an artifact of where highly concurrent access is most likely 
> to occur.  The queries that triggered this had many wildcards and other 
> multi-term queries.






[jira] [Commented] (SOLR-10115) Corruption in read-side of SOLR-HDFS stack

2017-02-10 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861911#comment-15861911
 ] 

Yonik Seeley commented on SOLR-10115:
-

bq. keysToRelease is not a concurrent data structure and there is no locking in 
adding and removing items.

I haven't worked my way up to BlockDirectoryCache yet, but a quick glance shows 
that keysToRelease is a concurrent data structure:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/store/blockcache/BlockDirectoryCache.java#L52
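
For illustration, here is a minimal sketch of what "a concurrent data structure" means for a key set like this one. It is not a quote of BlockDirectoryCache: the class and method names are hypothetical, String keys stand in for the real key type, and building the set with Collections.newSetFromMap over a ConcurrentHashMap is an assumption about one common Java idiom. The point is only that such a set tolerates add and remove calls from different threads without external locking.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ConcurrentKeySetSketch {

  /** Hypothetical stand-in for a field like keysToRelease: a Set view backed by a ConcurrentHashMap. */
  static Set<String> newConcurrentKeySet() {
    return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
  }

  /** Each worker adds its own keys and then removes them again; returns the final set size. */
  static int addThenRemoveConcurrently(Set<String> keys, int threads, int keysPerThread) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      final int id = t;
      pool.execute(() -> {
        for (int i = 0; i < keysPerThread; i++) {
          keys.add(id + ":" + i);
        }
        for (int i = 0; i < keysPerThread; i++) {
          keys.remove(id + ":" + i);
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return keys.size();
  }

  public static void main(String[] args) {
    int remaining = addThenRemoveConcurrently(newConcurrentKeySet(), 8, 10_000);
    System.out.println("keys remaining: " + remaining);
  }
}
```

Swapping newConcurrentKeySet() for a plain HashSet makes the same run unsafe: the final size is no longer guaranteed to be zero and the set can be left in an inconsistent state, which is the kind of problem the question about locking was probing for.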




[jira] [Commented] (SOLR-10115) Corruption in read-side of SOLR-HDFS stack

2017-02-10 Thread Michael Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861721#comment-15861721
 ] 

Michael Sun commented on SOLR-10115:


In BlockDirectoryCache, keysToRelease is not a concurrent data structure and 
there is no locking around adding and removing items 
(https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/store/blockcache/BlockDirectoryCache.java#L36).
Is there any guarantee that items are not added to and removed from 
keysToRelease concurrently by different threads?
