[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877051#comment-15877051
 ] 

ASF subversion and git services commented on SOLR-10121:


Commit 8dbb1bb3fb64fea4baa672ce82a1b62af22c3571 in lucene-solr's branch 
refs/heads/branch_6x from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8dbb1bb ]

SOLR-10121: enable BlockCacheTest.testBlockCacheConcurrent that now passes


> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Attachments: SOLR-10121.patch
>
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877050#comment-15877050
 ] 

ASF subversion and git services commented on SOLR-10121:


Commit cf1cba66f49c551cddbc6053565c30bf3a8b23bb in lucene-solr's branch 
refs/heads/master from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cf1cba6 ]

SOLR-10121: enable BlockCacheTest.testBlockCacheConcurrent that now passes


> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Attachments: SOLR-10121.patch
>
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-15 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868088#comment-15868088
 ] 

Yonik Seeley commented on SOLR-10121:
-

I'm splitting off the Caffeine issues to SOLR-10141 since the BlockCache race 
conditions that have existed since inception and will need to be 
handled/backported separately.


> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Attachments: SOLR-10121.patch
>
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868008#comment-15868008
 ] 

ASF subversion and git services commented on SOLR-10121:


Commit 65e2d2add68a557b1e628039c328f9346df282f9 in lucene-solr's branch 
refs/heads/branch_6x from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=65e2d2a ]

SOLR-10121: fix race conditions in BlockCache.fetch and BlockCache.store


> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Attachments: SOLR-10121.patch
>
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867962#comment-15867962
 ] 

ASF subversion and git services commented on SOLR-10121:


Commit b71a667d74dfabeaad9584372bded80b0c609add in lucene-solr's branch 
refs/heads/master from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b71a667 ]

SOLR-10121: fix race conditions in BlockCache.fetch and BlockCache.store


> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Attachments: SOLR-10121.patch
>
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-13 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864590#comment-15864590
 ] 

Yonik Seeley commented on SOLR-10121:
-

Thanks for the extra info - running the eviction listener in a separate thread 
shouldn't matter for correctness, but may work better the way this BlockCache 
code is written anyway.

I went back and re-tested right before the Caffeine switch (SOLR-7355) and was 
able to reproduce some fails by bumping up the concurrency.

> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-13 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864352#comment-15864352
 ] 

Ben Manes commented on SOLR-10121:
--

Can you try a local hack of changing Caffeine versions and, if it fails, try 
reverting back to CLHM? Both should be easy changes that could help us isolate 
it.

Also note that CLHM ran the eviction listener on the same thread, whereas 
Caffeine delegates that to the executor. If there is a race due to that, you 
could use `executor(Runnable::run)` in the builder.

> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-13 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864343#comment-15864343
 ] 

Yonik Seeley commented on SOLR-10121:
-

Hmmm, so on further review of BlockCache.java, I think I've found 2 concurrency 
issues.
Unfortunately, fixing those issues does not get my test to pass.
Another "issue" is that my test did pass pre-Caffeine, which means the test is 
not good enough at sussing out issues (since the BlockCache bugs I identified 
should not depend on the underlying map implementation).

> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-13 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863924#comment-15863924
 ] 

Ben Manes commented on SOLR-10121:
--

Yes, a write should constitute a publication. Caffeine decorates a 
ConcurrentHashMap but does bypass it at times. By default eviction is 
asynchronous by delegating to fjp commonPool, but can be configured to use the 
caller instead. That might be useful for testing.

Solr uses an old version of Caffeine. A patch was reviewed and approved, but 
needs someone to merge it in SOLR-8241. I'm not aware of a visibility bug in 
any release, but staying current would be helpful as I have fixed bugs since 
that version.

> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10121) BlockCache corruption with high concurrency

2017-02-13 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863699#comment-15863699
 ] 

Yonik Seeley commented on SOLR-10121:
-

I reviewed the pertinent BlockCache and couldn't see any thread safety issues.
Looking at the history of BlockCache, I reverted to right before SOLR-7355 was 
applied, and the issues went away.  So it looks like a thread safety or usage 
issue with Caffeine?
 [~ben.manes], does putting a key/value in Caffeine constitute safe publication 
to a different thread (as is the case with ConcurrentHashMap for example)?

> BlockCache corruption with high concurrency
> ---
>
> Key: SOLR-10121
> URL: https://issues.apache.org/jira/browse/SOLR-10121
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
>
> Improving the tests of the BlockCache in SOLR-10116 uncovered a corruption 
> bug (either that or the test is flawed... TBD).
> The failing test is TestBlockCache.testBlockCacheConcurrent()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org