[ https://issues.apache.org/jira/browse/CASSANDRA-19662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17861278#comment-17861278 ]

BHARATH KUMAR commented on CASSANDRA-19662:
-------------------------------------------

[~absurdfarce], apologies for the delayed response. To answer your questions:
1. Prepared statements are being used.
2. The issue persists even with protocol version V5, Java driver 4.17.0, Apache Cassandra 4.0.12 and DataStax Spark Cassandra Connector 3.4.1.

> Data Corruption and OOM Issues During Schema Alterations
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-19662
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19662
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Schema
>            Reporter: BHARATH KUMAR
>            Priority: Urgent
>         Attachments: BufferUnderflow_plus_error
>
>
> h2. Description
>
> *Overview:* The primary issue is data corruption occurring during schema
> alterations (ADD/DROP column) on large tables (300+ columns, 6 TB) in the
> production cluster. It is accompanied by out-of-memory (OOM) errors and
> other exceptions, specifically during batch reads. The problem has been
> replicated on multiple clusters running Apache Cassandra 4.0.12 with
> DataStax Java Driver 4.17.
> *Details:*
> *Main Issue:*
> * *Data Corruption:* When a column is added to a table dynamically, the data
> intended for the new column is shifted, misaligning the values in the row.
> * *Symptoms:* The object implementing
> {{com.datastax.oss.driver.api.core.cql.Row}} returns values shifted against
> the column names returned by {{row.getColumnDefinitions()}}. The driver
> returns a corrupted row, leading to incorrect data insertion.
> *Additional Issues:*
> *Exceptions:*
> * {{java.nio.BufferUnderflowException}} during batch reads when ALTER TABLE
> ADD/DROP column statements are issued.
> * {{java.lang.ArrayIndexOutOfBoundsException}} in some cases.
> * Buffer underflow exceptions with messages like "Invalid 32-bits integer
> value, expecting 4 bytes but got 292".
> * OOM errors mostly occur during ADD column operations, while the other
> exceptions occur during DROP column operations.
> * *Method Specific:* Errors occur specifically with
> {{row.getList(columnName, Float.class)}}, which returns incorrect values;
> the code also registers a driver exception when calling this method. We pass
> the exact column name obtained from {{row.getColumnDefinitions()}}, yet the
> value returned for that name is wrong. This suggests that the driver is
> returning an object with incorrect properties, rather than a problem with
> the CQL query itself.
> *Reproducibility:*
> * The issue is reproducible on larger tables (300 columns, 6 TB) but not on
> smaller tables.
> * SELECT * statements are used during reads.
> *Debugging Efforts:*
> * *Metadata Refresh:* Enabling metadata refresh did not resolve the issue.
> * *Schema Agreement:* {{session.getCqlSession().checkSchemaAgreement()}} did
> not detect inconsistencies during test execution.
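
For readers following the thread, the column-shift symptom reported in the issue can be pictured with a small stand-alone Java toy model. It is only a sketch of one way such a misalignment could arise (values decoded by position against a column list cached before an ADD column), and nothing in this ticket confirms that as the cause. The class ShiftedRowDemo, its helpers, and the column names (id, name, note, score) are hypothetical; the code does not use or reproduce the DataStax driver's API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: every name here is hypothetical, not part of the Java driver.
final class ShiftedRowDemo {

    // Column list cached before the (assumed) "ALTER TABLE ... ADD note text".
    static final List<String> STALE = Arrays.asList("id", "name", "score");
    // Column list after the new column has been added.
    static final List<String> FRESH = Arrays.asList("id", "name", "note", "score");

    // Encode an int as a 4-byte big-endian buffer ready for reading.
    static ByteBuffer intValue(int v) {
        ByteBuffer b = ByteBuffer.allocate(4);
        b.putInt(v);
        b.flip();
        return b;
    }

    // Raw values as they would arrive for the FRESH column order.
    static List<ByteBuffer> sampleValues() {
        return Arrays.asList(
            intValue(1),
            ByteBuffer.wrap("alice".getBytes(StandardCharsets.UTF_8)),
            ByteBuffer.wrap(new byte[292]), // payload of the new "note" column
            intValue(42));
    }

    // Pair each raw value with the column name at the same position.
    static Map<String, ByteBuffer> decode(List<String> columns, List<ByteBuffer> values) {
        Map<String, ByteBuffer> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.size() && i < values.size(); i++) {
            row.put(columns.get(i), values.get(i).duplicate());
        }
        return row;
    }

    // Read a 4-byte int, rejecting buffers of any other size.
    static int readInt(ByteBuffer raw) {
        if (raw.remaining() != 4) {
            throw new IllegalArgumentException(
                "Invalid 32-bits integer value, expecting 4 bytes but got " + raw.remaining());
        }
        return raw.duplicate().getInt();
    }

    public static void main(String[] args) {
        // With the up-to-date column list, "score" maps to its own 4 bytes.
        System.out.println("fresh score = " + readInt(decode(FRESH, sampleValues()).get("score")));
        try {
            // With the stale list, "score" is paired with the bytes of "note".
            readInt(decode(STALE, sampleValues()).get("score"));
        } catch (IllegalArgumentException e) {
            System.out.println("stale score failed: " + e.getMessage());
        }
    }
}
```

In this model the name "score" is still looked up correctly, but the bytes paired with it belong to a different column, which matches the shape of the report: the column name is right and the value behind it is not.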