[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679139#comment-17679139 ]

Andres de la Peña edited comment on CASSANDRA-17507 at 1/20/23 1:06 PM:

Oh, right, it's a typo, the last 4.0 affected version is 4.0.7. Just fixed it, thanks!

was (Author: adelapena):
Oh, right, it's a typo, the last 4.0 affected version is 4.0.7.

> IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
> ---
>
> Key: CASSANDRA-17507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17507
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination
> Reporter: Thomas Steinmaurer
> Assignee: Andres de la Peña
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 4.x
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> In a 6 node 3.11.12 test cluster - freshly set up, thus no legacy SSTables
> etc. - with ~ 1TB SSTables on disk per node, I have been running a rolling
> upgrade to 4.0.3. On upgraded 4.0.3 nodes I then have seen the following
> exception regularly, which disappeared once all 6 nodes have been on 4.0.3.
> Is this known? Can this be ignored? As said, just a test drive, but not sure
> if we want to have that in production, especially with a larger number of
> nodes, where it could take some time, until all are upgraded. Thanks!
> {code}
> ERROR [Native-Transport-Requests-8] 2022-03-30 11:30:24,057 ErrorMessage.java:457 - Unexpected exception during request
> java.lang.IllegalArgumentException: newLimit > capacity: (290 > 15)
>     at java.base/java.nio.Buffer.createLimitException(Buffer.java:372)
>     at java.base/java.nio.Buffer.limit(Buffer.java:346)
>     at java.base/java.nio.ByteBuffer.limit(ByteBuffer.java:1107)
>     at java.base/java.nio.ByteBuffer.limit(ByteBuffer.java:262)
>     at org.apache.cassandra.db.marshal.ByteBufferAccessor.slice(ByteBufferAccessor.java:107)
>     at org.apache.cassandra.db.marshal.ByteBufferAccessor.slice(ByteBufferAccessor.java:39)
>     at org.apache.cassandra.db.marshal.ValueAccessor.sliceWithShortLength(ValueAccessor.java:225)
>     at org.apache.cassandra.db.marshal.CompositeType.splitName(CompositeType.java:222)
>     at org.apache.cassandra.service.pager.PagingState$RowMark.decodeClustering(PagingState.java:434)
>     at org.apache.cassandra.service.pager.PagingState$RowMark.clustering(PagingState.java:388)
>     at org.apache.cassandra.service.pager.SinglePartitionPager.nextPageReadQuery(SinglePartitionPager.java:88)
>     at org.apache.cassandra.service.pager.SinglePartitionPager.nextPageReadQuery(SinglePartitionPager.java:32)
>     at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:69)
>     at org.apache.cassandra.service.pager.SinglePartitionPager.fetchPage(SinglePartitionPager.java:32)
>     at org.apache.cassandra.cql3.statements.SelectStatement$Pager$NormalPager.fetchPage(SelectStatement.java:352)
>     at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:400)
>     at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:250)
>     at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:88)
>     at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:244)
>     at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:723)
>     at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:701)
>     at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:159)
>     at org.apache.cassandra.transport.Message$Request.execute(Message.java:242)
>     at org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:86)
>     at org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:106)
>     at org.apache.cassandra.transport.Dispatcher.lambda$dispatch$0(Dispatcher.java:70)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:165)
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---
[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679122#comment-17679122 ]

Andres de la Peña edited comment on CASSANDRA-17507 at 1/20/23 12:02 PM:

Note that the mistake in the serialisation of protocol v3 that was introduced with 4.0.0 has effectively meant that we have two versions of protocol v3: the one used by 3.0/3.x, and the one used by 4.0/4.1/4.x. The proposed fix will make all new 4.0/4.1 minors use the same version of protocol v3 that 3.0/3.x have always used. However, we will hit the same problem in a cluster that has both an unpatched 4.0/4.1/4.x node and a patched 4.0/4.1/4.x node. In other words, we are trading upgrade issues on 3.0.x -> 4.0.7 for upgrade issues on 4.0.8 -> 4.0.9, etc. I'm not sure how we could know which version of v3 (broken or unbroken) we should use, if we can.

That said, the problem occurs only when using v3, and I'd say that the old v3 protocol is much more likely to be used on a major 3.0/3.x -> 4.0/4.1 upgrade than in a 4.0/4.1 -> 4.0/4.1 upgrade. If that assumption is true, we are improving things with this fix, and stopping the spread of the broken version of the v3 protocol. What do you think?

[~brandon.williams] any thoughts on this?
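One way to see why choosing between the two v3 variants from the bytes alone is hard: a raw clustering value can, by coincidence, have the same shape as a single-element composite. The following is a minimal Java sketch of that ambiguity, not Cassandra code. The class and method names are hypothetical, and the composite layout (2-byte length, value bytes, 1 end-of-component byte) is an assumption made for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class AmbiguousClustering {
    // Returns true when the bytes are shaped like a single-element composite:
    // a 2-byte unsigned length, exactly that many value bytes, and one trailing
    // end-of-component byte (layout assumed for this sketch).
    static boolean looksLikeSingleElementComposite(ByteBuffer bytes) {
        ByteBuffer in = bytes.duplicate();
        if (in.remaining() < 3)
            return false;
        int length = in.getShort() & 0xFFFF;
        return in.remaining() == length + 1;
    }

    public static void main(String[] args) {
        // A raw 4-byte value whose first two bytes happen to read as length 1.
        ByteBuffer raw = ByteBuffer.wrap(new byte[] {0x00, 0x01, 0x61, 0x00});
        System.out.println(looksLikeSingleElementComposite(raw));
    }
}
```

Under these assumptions, a shape check on the paging state would accept some raw values as composites, so it could not reliably tell the broken v3 variant from the unbroken one.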
[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679122#comment-17679122 ]

Andres de la Peña edited comment on CASSANDRA-17507 at 1/20/23 12:01 PM:

Note that the mistake in the serialisation of protocol v3 that was introduced with 4.0.0 has effectively meant that we have two versions of protocol v3: the one used by 3.0/3.x, and the one used by 4.0/4.1/4.x. The proposed fix will make all new 4.0/4.1 minors use the same version of protocol v3 that 3.0/3.x have always used. However, we will hit the same problem in a cluster that has both an unpatched 4.0/4.1/4.x node and a patched 4.0/4.1/4.x node. In other words, we are trading upgrade issues on 3.0.x -> 4.0.7 for upgrade issues on 4.0.8 -> 4.0.9, etc. I'm not sure how we could know which version of v3 (broken or unbroken) we should use, if we can.

That said, the problem occurs only when using v3, and I'd say that the old v3 protocol is much more likely to be used on a major [3.0 | 3.x] -> [4.0 | 4.x] upgrade than in a [4.0 | 4.1 | 4.x] -> [4.0 | 4.1 | 4.x] upgrade. If that assumption is true, we are improving things with this fix, and stopping the spread of the broken version of the v3 protocol. What do you think?

[~brandon.williams] any thoughts on this?
[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17675629#comment-17675629 ]

Andres de la Peña edited comment on CASSANDRA-17507 at 1/11/23 3:48 PM:

I can confirm that 4.1 and trunk are also affected. Here are the patches for all the branches:

||PR||CI||
|[4.0|https://github.com/apache/cassandra/pull/2082]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2543/workflows/cb16ec9d-8ec6-4914-a08a-92715bd15ff0]|
|[4.1|https://github.com/apache/cassandra/pull/2083]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2544/workflows/924c41ce-accb-44eb-be07-1fc678b1f4b2]|
|[trunk|https://github.com/apache/cassandra/pull/2084]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2542/workflows/16be8e22-b8d1-440f-93bd-a599b56e3093]|

I think that CI will fail for the new tests on the [4.0, 4.1] -> [4.1, trunk] upgrade paths. That's because our CI script [generates the dtest artifacts from the main repo|https://github.com/apache/cassandra/blob/trunk/.circleci/config-2_1.yml#L2684-L2693], and the branches there don't contain the serialization fix that we are proposing here. Reviewers can test it locally by generating the dtest artifacts of each patched branch with {{ant dtest-jar}}, and copying all the generated {{dtest-*.jar}} files into the {{build}} directory of the tested branch.
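For reviewers who would rather script the copy step than do it by hand, here is a minimal Java sketch. It only covers copying the {{dtest-*.jar}} files; running {{ant dtest-jar}} in each patched branch is still a separate step. The class name and both directory arguments are hypothetical, and it assumes the generated jars sit directly in the patched branch's {{build}} directory.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Cassandra build.
final class DtestJarCopier {
    // Copies every file matching dtest-*.jar from sourceBuildDir into
    // targetBuildDir and returns the names of the copied files.
    static List<String> copyDtestJars(Path sourceBuildDir, Path targetBuildDir) throws IOException {
        Files.createDirectories(targetBuildDir);
        List<String> copied = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(sourceBuildDir, "dtest-*.jar")) {
            for (Path jar : jars) {
                Files.copy(jar, targetBuildDir.resolve(jar.getFileName()), StandardCopyOption.REPLACE_EXISTING);
                copied.add(jar.getFileName().toString());
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Expected arguments (assumed layout): <patched-branch>/build <tested-branch>/build
        if (args.length != 2) {
            System.err.println("usage: DtestJarCopier <source build dir> <target build dir>");
            return;
        }
        System.out.println(copyDtestJars(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

Run it once per patched branch so that the tested branch's {{build}} directory ends up with the dtest jars of every version on the upgrade path.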
[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17675575#comment-17675575 ]

Andres de la Peña edited comment on CASSANDRA-17507 at 1/11/23 1:55 PM:

I think I have found the cause of the bug when using protocol v3.

Cassandra 3.0 and 3.x with protocol v3 and compact storage don't serialize single-column clusterings as single-element composites. Instead, single-column clustering values are written as they are, as can be seen [here|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/db/LegacyLayout.java#L477-L486].

However, Cassandra 4.0 always reads and writes single-column clusterings as composites. This can be seen [here|https://github.com/apache/cassandra/blob/cassandra-4.0.3/src/java/org/apache/cassandra/service/pager/PagingState.java#L434], exactly where the reported exception is thrown.

I think the solution is modifying the code that reads legacy formats in Cassandra 4.0 so that it special-cases single-column clusterings for compact storage:

||PR||CI||
|[4.0|https://github.com/apache/cassandra/pull/2082]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/2540/workflows/22cfa989-31df-4fcc-a896-f46c8d77d364]|

If the approach looks good I'll prepare patches for 4.1 and trunk, which are probably also affected.
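The mismatch described in this comment can be reproduced in isolation. The sketch below is not the Cassandra implementation or the proposed patch; it is a self-contained illustration under stated assumptions. The class and method names are hypothetical, and the composite layout (2-byte length, value bytes, 1 end-of-component byte) is assumed. It shows a raw value being misread as a composite, which fails in {{ByteBuffer.limit}} as in the reported stack trace, and a decoder that special-cases single-column clusterings for compact storage.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class ClusteringDecodeSketch {
    // Wraps one value as a single-element composite (assumed layout):
    // 2-byte unsigned length, the value bytes, 1 end-of-component byte.
    static ByteBuffer asSingleElementComposite(ByteBuffer value) {
        ByteBuffer out = ByteBuffer.allocate(2 + value.remaining() + 1);
        out.putShort((short) value.remaining());
        out.put(value.duplicate());
        out.put((byte) 0);
        out.flip();
        return out;
    }

    // Reads the first component assuming the composite layout. If the input is
    // actually a raw value, its first two bytes are taken as a length and
    // limit() can throw IllegalArgumentException.
    static ByteBuffer readFirstComponent(ByteBuffer encoded) {
        ByteBuffer copy = encoded.duplicate();
        int length = copy.getShort() & 0xFFFF;
        copy.limit(copy.position() + length);
        return copy.slice();
    }

    // Special case: with compact storage and a single clustering column the
    // bytes are the value itself; otherwise parse them as a composite.
    static ByteBuffer decode(ByteBuffer bytes, boolean compactStorage, int clusteringColumns) {
        if (compactStorage && clusteringColumns == 1)
            return bytes.duplicate();
        return readFirstComponent(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer raw = ByteBuffer.wrap("row-key-0000042".getBytes(StandardCharsets.UTF_8));
        System.out.println(readFirstComponent(asSingleElementComposite(raw)).equals(raw));
        System.out.println(decode(raw, true, 1).equals(raw));
        try {
            readFirstComponent(raw); // raw bytes misread as a composite
        } catch (IllegalArgumentException e) {
            System.out.println("misread raw value as composite: " + e);
        }
    }
}
```

The special case needs schema information (compact storage, number of clustering columns) because, in this sketch at least, the bytes themselves carry no marker of which encoding was used.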
[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17656119#comment-17656119 ]

Andres de la Peña edited comment on CASSANDRA-17507 at 1/11/23 12:51 PM:

[This JVM dtest|https://github.com/apache/cassandra/compare/cassandra-4.0...adelapena:cassandra:17507-4.0-repro] reproduces the bug. It tests a 3.x -> 4.0 rolling upgrade scenario with a table with {{COMPACT STORAGE}} and a query over it that uses paging. The bug only seems to manifest itself when the driver uses native protocol v3, instead of the default (v5 for 4.0 and v4 for 3.11).

The test results can be found [here|https://app.circleci.com/pipelines/github/adelapena/cassandra/2536/workflows/5791569d-8ea1-42b5-bacd-bd8716afaee8/jobs/25163]. The artifacts stored for each test contain an identical stack trace, for example [this one|https://output.circle-artifacts.com/output/job/f4cbecbc-92dd-49c8-a75d-a5a7b53bcd21/artifacts/0/stdout/fails/1/org.apache.cassandra.distributed.upgrade.CompactStoragePagingTest%23testPagingWithCompactStorageAndProtocolVersion.txt].

If this is actually caused by the combination of {{COMPACT STORAGE}}, paging and an old protocol version, probably the easiest workaround until we get a fix is setting the driver to use a more recent version of the native transport protocol.
[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17656119#comment-17656119 ]

Andres de la Peña edited comment on CASSANDRA-17507 at 1/9/23 3:49 PM:

[This JVM dtest|https://github.com/apache/cassandra/compare/cassandra-4.0...adelapena:cassandra:17507-4.0] reproduces the bug. It tests a 3.x -> 4.0 rolling upgrade scenario with a table with {{COMPACT STORAGE}} and a query over it that uses paging. The bug only seems to manifest itself when the driver uses native protocol v3, instead of the default (v5 for 4.0 and v4 for 3.11).

The test results can be found [here|https://app.circleci.com/pipelines/github/adelapena/cassandra/2536/workflows/5791569d-8ea1-42b5-bacd-bd8716afaee8/jobs/25163]. The artifacts stored for each test contain an identical stack trace, for example [this one|https://output.circle-artifacts.com/output/job/f4cbecbc-92dd-49c8-a75d-a5a7b53bcd21/artifacts/0/stdout/fails/1/org.apache.cassandra.distributed.upgrade.CompactStoragePagingTest%23testPagingWithCompactStorageAndProtocolVersion.txt].

If this is actually caused by the combination of {{COMPACT STORAGE}}, paging and an old protocol version, probably the easiest workaround until we get a fix is setting the driver to use a more recent version of the native transport protocol.
[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17656119#comment-17656119 ] Andres de la Peña edited comment on CASSANDRA-17507 at 1/9/23 3:17 PM: --- [This JVM dtest|https://github.com/apache/cassandra/compare/cassandra-4.0...adelapena:cassandra:17507-4.0] reproduces the bug. It tests a 3.x -> 4.0 rolling upgrade scenario with a table with {{COMPACT STORAGE}} and a query over it that uses paging. The bug only seems to manifest itself when the driver uses native protocol v3 instead of the default (v5 for 4.0 and v4 for 3.11). The test results can be found [here|https://app.circleci.com/pipelines/github/adelapena/cassandra/2536/workflows/5791569d-8ea1-42b5-bacd-bd8716afaee8/jobs/25163]. The artifacts stored for each test contain an identical stack trace, for example [this one|https://output.circle-artifacts.com/output/job/f4cbecbc-92dd-49c8-a75d-a5a7b53bcd21/artifacts/0/stdout/fails/1/org.apache.cassandra.distributed.upgrade.CompactStoragePagingTest%23testPagingWithCompactStorageAndProtocolVersion.txt]. If this is actually caused by the combination of {{COMPACT STORAGE}}, paging and an old protocol version, probably the easiest workaround until we get a fix is setting the driver to use a more recent version of the native transport protocol. was (Author: adelapena): [This JVM dtest|https://github.com/apache/cassandra/compare/trunk...adelapena:cassandra:17507-4.0?expand=1] reproduces the bug. It tests a 3.x -> 4.0 rolling upgrade scenario with a table with {{COMPACT STORAGE}} and a query over it that uses paging. The bug only seems to manifest itself when the driver uses native protocol v3 instead of the default (v5 for 4.0 and v4 for 3.11). The test results can be found [here|https://app.circleci.com/pipelines/github/adelapena/cassandra/2536/workflows/5791569d-8ea1-42b5-bacd-bd8716afaee8/jobs/25163].
The artifacts stored for each test contain an identical stack trace, for example [this one|https://output.circle-artifacts.com/output/job/f4cbecbc-92dd-49c8-a75d-a5a7b53bcd21/artifacts/0/stdout/fails/1/org.apache.cassandra.distributed.upgrade.CompactStoragePagingTest%23testPagingWithCompactStorageAndProtocolVersion.txt]. If this is actually caused by the combination of {{COMPACT STORAGE}}, paging and an old protocol version, probably the easiest workaround until we get a fix is setting the driver to use a more recent version of the native transport protocol. > IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling > upgrade > --- > > Key: CASSANDRA-17507 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17507 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination >Reporter: Thomas Steinmaurer >Priority: Normal > Fix For: 4.0.x > > > In a 6 node 3.11.12 test cluster - freshly set up, thus no legacy SSTables > etc. - with ~ 1TB SSTables on disk per node, I have been running a rolling > upgrade to 4.0.3. On upgraded 4.0.3 nodes I then have seen the following > exception regularly, which disappeared once all 6 nodes have been on 4.0.3. > Is this known? Can this be ignored? As said, just a test drive, but not sure > if we want to have that in production, especially with a larger number of > nodes, where it could take some time, until all are upgraded. Thanks!
> {code}
> ERROR [Native-Transport-Requests-8] 2022-03-30 11:30:24,057 ErrorMessage.java:457 - Unexpected exception during request
> java.lang.IllegalArgumentException: newLimit > capacity: (290 > 15)
>     at java.base/java.nio.Buffer.createLimitException(Buffer.java:372)
>     at java.base/java.nio.Buffer.limit(Buffer.java:346)
>     at java.base/java.nio.ByteBuffer.limit(ByteBuffer.java:1107)
>     at java.base/java.nio.ByteBuffer.limit(ByteBuffer.java:262)
>     at org.apache.cassandra.db.marshal.ByteBufferAccessor.slice(ByteBufferAccessor.java:107)
>     at org.apache.cassandra.db.marshal.ByteBufferAccessor.slice(ByteBufferAccessor.java:39)
>     at org.apache.cassandra.db.marshal.ValueAccessor.sliceWithShortLength(ValueAccessor.java:225)
>     at org.apache.cassandra.db.marshal.CompositeType.splitName(CompositeType.java:222)
>     at org.apache.cassandra.service.pager.PagingState$RowMark.decodeClustering(PagingState.java:434)
>     at org.apache.cassandra.service.pager.PagingState$RowMark.clustering(PagingState.java:388)
>     at org.apache.cassandra.service.pager.SinglePartitionPager.nextPageReadQuery(SinglePartitionPager.java:88)
>     at org.apache.cassandra.service.pager.SinglePartitionPager.nextPageReadQuery(SinglePartitionPager.java:32)
[jira] [Comment Edited] (CASSANDRA-17507) IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-17507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17656119#comment-17656119 ] Andres de la Peña edited comment on CASSANDRA-17507 at 1/9/23 3:15 PM: --- [This JVM dtest|https://github.com/apache/cassandra/compare/trunk...adelapena:cassandra:17507-4.0?expand=1] reproduces the bug. It tests a 3.x -> 4.0 rolling upgrade scenario with a table with {{COMPACT STORAGE}} and a query over it that uses paging. The bug only seems to manifest itself when the driver uses native protocol v3 instead of the default (v5 for 4.0 and v4 for 3.11). The test results can be found [here|https://app.circleci.com/pipelines/github/adelapena/cassandra/2536/workflows/5791569d-8ea1-42b5-bacd-bd8716afaee8/jobs/25163]. The artifacts stored for each test contain an identical stack trace, for example [this one|https://output.circle-artifacts.com/output/job/f4cbecbc-92dd-49c8-a75d-a5a7b53bcd21/artifacts/0/stdout/fails/1/org.apache.cassandra.distributed.upgrade.CompactStoragePagingTest%23testPagingWithCompactStorageAndProtocolVersion.txt]. If this is actually caused by the combination of {{COMPACT STORAGE}}, paging and an old protocol version, probably the easiest workaround until we get a fix is setting the driver to use a more recent version of the native transport protocol. was (Author: adelapena): [This JVM dtest|https://github.com/apache/cassandra/compare/trunk...adelapena:cassandra:17507-4.0?expand=1] reproduces the bug. It tests a 3.x -> 4.0 rolling upgrade scenario with a table with {{COMPACT STORAGE}} and a query over it that uses paging. The bug only seems to manifest itself when the driver uses native protocol v3 instead of the default (v5 for 4.0 and v4 for 3.11). The test results can be found [here|https://app.circleci.com/pipelines/github/adelapena/cassandra/2536/workflows/5791569d-8ea1-42b5-bacd-bd8716afaee8/jobs/25163].
The artifacts stored for each test contain an identical stack trace, for example [this one|https://output.circle-artifacts.com/output/job/f4cbecbc-92dd-49c8-a75d-a5a7b53bcd21/artifacts/0/stdout/fails/1/org.apache.cassandra.distributed.upgrade.CompactStoragePagingTest%23testPagingWithCompactStorageAndProtocolVersion.txt]. If this actually is caused by the combination of {{COMPACT STORAGE}}, paging and an old protocol version, probably the easiest workaround until we get a fix is setting the driver to use a most recent version of the native transport protocol. > IllegalArgumentException in query code path during 3.11.12 => 4.0.3 rolling > upgrade > --- > > Key: CASSANDRA-17507 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17507 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination >Reporter: Thomas Steinmaurer >Priority: Normal > Fix For: 4.0.x > > > In a 6 node 3.11.12 test cluster - freshly set up, thus no legacy SSTables > etc. - with ~ 1TB SSTables on disk per node, I have been running a rolling > upgrade to 4.0.3. On upgraded 4.0.3 nodes I then have seen the following > exception regularly, which disappeared once all 6 nodes have been on 4.0.3. > Is this known? Can this be ignored? As said, just a test drive, but not sure > if we want to have that in production, especially with a larger number of > nodes, where it could take some time, until all are upgraded. Thanks!
> {code}
> ERROR [Native-Transport-Requests-8] 2022-03-30 11:30:24,057 ErrorMessage.java:457 - Unexpected exception during request
> java.lang.IllegalArgumentException: newLimit > capacity: (290 > 15)
>     at java.base/java.nio.Buffer.createLimitException(Buffer.java:372)
>     at java.base/java.nio.Buffer.limit(Buffer.java:346)
>     at java.base/java.nio.ByteBuffer.limit(ByteBuffer.java:1107)
>     at java.base/java.nio.ByteBuffer.limit(ByteBuffer.java:262)
>     at org.apache.cassandra.db.marshal.ByteBufferAccessor.slice(ByteBufferAccessor.java:107)
>     at org.apache.cassandra.db.marshal.ByteBufferAccessor.slice(ByteBufferAccessor.java:39)
>     at org.apache.cassandra.db.marshal.ValueAccessor.sliceWithShortLength(ValueAccessor.java:225)
>     at org.apache.cassandra.db.marshal.CompositeType.splitName(CompositeType.java:222)
>     at org.apache.cassandra.service.pager.PagingState$RowMark.decodeClustering(PagingState.java:434)
>     at org.apache.cassandra.service.pager.PagingState$RowMark.clustering(PagingState.java:388)
>     at org.apache.cassandra.service.pager.SinglePartitionPager.nextPageReadQuery(SinglePartitionPager.java:88)
>     at org.apache.cassandra.service.pager.SinglePartitionPager.nextPageReadQuery(SinglePartitionPager.java:32)