[ https://issues.apache.org/jira/browse/CASSANDRA-15726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ZhaoYang updated CASSANDRA-15726:
---------------------------------
     Bug Category: Parent values: Correctness(12982)
       Complexity: Normal
      Component/s: Legacy/Core
    Discovered By: Unit Test
    Fix Version/s: 4.0-alpha
         Severity: Normal
           Status: Open  (was: Triage Needed)

[~aleksey] [~benedict] do you mind reviewing?

> buffer pool may NPE with concurrent release
> -------------------------------------------
>
>                 Key: CASSANDRA-15726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15726
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core
>            Reporter: ZhaoYang
>            Assignee: ZhaoYang
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> This can be reproduced by running {{LongBufferPoolTest}}; it fails in 1 out of 5 runs.
> {code:java}
> java.lang.NullPointerException
>       at org.apache.cassandra.utils.memory.BufferPool$Chunk.access$1300(BufferPool.java:836)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.lambda$remove$1(BufferPool.java:716)
>       at org.apache.cassandra.utils.memory.BufferPool$MicroQueueOfChunks.removeIf(BufferPool.java:460)
>       at org.apache.cassandra.utils.memory.BufferPool$MicroQueueOfChunks.access$1500(BufferPool.java:304)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.remove(BufferPool.java:716)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.put(BufferPool.java:590)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.recycle(BufferPool.java:709)
>       at org.apache.cassandra.utils.memory.BufferPool$Chunk.recycle(BufferPool.java:909)
>       at org.apache.cassandra.utils.memory.BufferPool$Chunk.tryRecycle(BufferPool.java:903)
>       at org.apache.cassandra.utils.memory.BufferPool$Chunk.release(BufferPool.java:896)
>       at org.apache.cassandra.utils.memory.BufferPool$MicroQueueOfChunks.removeIf(BufferPool.java:465)
>       at org.apache.cassandra.utils.memory.BufferPool$MicroQueueOfChunks.access$1500(BufferPool.java:304)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.addChunk(BufferPool.java:736)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.addChunkFromParent(BufferPool.java:725)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGetInternal(BufferPool.java:691)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGet(BufferPool.java:679)
>       at org.apache.cassandra.utils.memory.BufferPool$LocalPool.access$000(BufferPool.java:518)
>       at org.apache.cassandra.utils.memory.BufferPool.tryGet(BufferPool.java:120)
>
>       at org.apache.cassandra.utils.memory.LongBufferPoolTest$2.testOne(LongBufferPoolTest.java:497)
>       at org.apache.cassandra.utils.memory.LongBufferPoolTest$TestUntil.call(LongBufferPoolTest.java:558)
>       at org.apache.cassandra.utils.memory.LongBufferPoolTest$TestUntil.call(LongBufferPoolTest.java:538)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>       at java.lang.Thread.run(Thread.java:748)
> {code}
> The cause is as follows:
> * When evicting a normal chunk from a full MicroQueueOfChunks, the local pool
> tries to remove the corresponding tiny chunks via
> {{MicroQueueOfChunks#removeIf}}.
> * If a matching tiny chunk is found, the tiny {{chunk.release()}} is called
> immediately, before the resulting null slot is moved to the back of the queue.
> * Due to concurrent releases from different threads, the tiny {{chunk.release()}}
> may cause its parent normal chunk (the chunk evicted in the first step) to be
> removed from the local pool, which makes the tiny pool try to remove the
> corresponding tiny chunks again in {{LocalPool#remove()}}.
> * This nested {{MicroQueueOfChunks#removeIf}} runs while the outer
> {{removeIf}} is still in progress, so it throws an NPE: the queue violates
> MicroQueueOfChunks's assumption that null chunks are kept at the back of the queue.
>
> | [patch|https://github.com/apache/cassandra/pull/537] | [CI|https://circleci.com/workflow-run/a97317a0-ef21-4c01-9a97-82eaf28d7faf] |
>
> The fix is to move null chunks to the back of the queue before releasing any chunks.
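
To make the ordering problem in the report concrete, here is a minimal, deterministic sketch of the re-entrancy hazard. It is a toy model and not the BufferPool code or the patch: {{MicroQueueSketch}}, {{removeIfReleaseFirst}}, {{removeIfCompactFirst}}, {{removeChildrenOf}} and {{sample}} are hypothetical names, and the release callback simply re-enters the removal to stand in for the concurrent release path described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Toy model only: names and layout are illustrative, not Cassandra's BufferPool code.
final class MicroQueueSketch {
    static final class Chunk {
        final String name;
        final Chunk parent; // null for a "normal" chunk
        Chunk(String name, Chunk parent) { this.name = name; this.parent = parent; }
    }

    private final Chunk[] slots = new Chunk[3];
    private int count; // invariant: slots[0..count) are non-null, the rest are null

    void add(Chunk c) { slots[count++] = c; }
    int size() { return count; }

    // Problematic order: release runs while a null hole sits inside slots[0..count).
    void removeIfReleaseFirst(Predicate<Chunk> match, Consumer<Chunk> release) {
        int n = count;
        for (int i = 0; i < n; i++) {
            Chunk c = slots[i];
            if (match.test(c)) {      // dereferences c, so a hole left by an outer call throws
                slots[i] = null;
                count--;
                release.accept(c);    // may re-enter this method
            }
        }
        int kept = 0;                 // compaction happens too late
        for (int i = 0; i < n; i++) {
            Chunk c = slots[i];
            slots[i] = null;
            if (c != null) slots[kept++] = c;
        }
    }

    // Safer order: restore the invariant first, then release the removed chunks.
    void removeIfCompactFirst(Predicate<Chunk> match, Consumer<Chunk> release) {
        List<Chunk> removed = new ArrayList<>();
        int kept = 0;
        for (int i = 0; i < count; i++) {
            Chunk c = slots[i];
            if (match.test(c)) removed.add(c);
            else slots[kept++] = c;
        }
        for (int i = kept; i < count; i++) slots[i] = null; // nulls end up at the back
        count = kept;
        for (Chunk c : removed) release.accept(c);          // re-entry now sees a valid queue
    }

    // Assumption: releasing a tiny chunk triggers removal of its siblings again.
    void removeChildrenOf(Chunk parent, boolean compactFirst) {
        Predicate<Chunk> match = c -> c.parent == parent;
        Consumer<Chunk> release = c -> removeChildrenOf(parent, compactFirst);
        if (compactFirst) removeIfCompactFirst(match, release);
        else removeIfReleaseFirst(match, release);
    }

    static MicroQueueSketch sample(Chunk parent) {
        MicroQueueSketch q = new MicroQueueSketch();
        q.add(new Chunk("tiny-1", parent));
        q.add(new Chunk("tiny-2", parent));
        q.add(new Chunk("tiny-3", new Chunk("other-normal", null)));
        return q;
    }

    public static void main(String[] args) {
        Chunk parent = new Chunk("normal", null);
        MicroQueueSketch fixed = sample(parent);
        fixed.removeChildrenOf(parent, true);
        System.out.println("compact first: size=" + fixed.size());
        try {
            sample(parent).removeChildrenOf(parent, false);
            System.out.println("release first: no exception");
        } catch (NullPointerException e) {
            System.out.println("release first: NullPointerException");
        }
    }
}
```

In this sketch the release-first variant fails because the nested call tests a slot that the outer call has already nulled but not yet moved to the back. The compact-first variant completes because the queue invariant holds again before any release callback runs.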