[ https://issues.apache.org/jira/browse/CASSANDRA-19662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
BHARATH KUMAR updated CASSANDRA-19662:
--------------------------------------
    Description:
h2. Description

*Overview:* The primary issue is data corruption occurring during schema alterations (ADD/DROP column) on large tables (300+ columns, 6 TB in size) in the production cluster. This problem has been replicated on multiple clusters running Apache Cassandra version 4.0.12 and DataStax Java Driver version 4.17.

*Details:*

*Main Issue:*
 * *Data Corruption:* When dynamically adding a column to a table, the data intended for the new column is shifted, causing misalignment in the data.
 * *Symptoms:* The object implementing {{com.datastax.oss.driver.api.core.cql.Row}} returns values shifted against the column names returned by {{row.getColumnDefinitions()}}. The driver returns a corrupted row, leading to incorrect data insertion.

*Additional Issues:*

*Exceptions:*
 * {{java.nio.BufferUnderflowException}} during batch reads when ALTER TABLE ADD/DROP column statements are issued.
 * {{java.lang.ArrayIndexOutOfBoundsException}} in some cases.
 * -Buffer underflow exceptions with messages like "Invalid 32-bits integer value, expecting 4 bytes but got 292".- [This is no longer an issue after upgrading to V5]
 * -OOM errors mostly occur during ADD column operations, while other exceptions occur during DELETE column operations.- [This is no longer an issue after upgrading to V5]
 * *Method Specific:* Errors occur specifically with {{row.getList(columnName, Float.class)}}, returning incorrect values.

*Reproducibility:*
 * The issue is reproducible on larger tables (300 columns, 6 TB size) but not on smaller tables.
 * SELECT * statements are used during reads.
 * *Method Specific:* Errors occur specifically with {{row.getList(columnName, Float.class)}}, returning incorrect values. However, the code registers a driver exception when calling the method {{row.getList(columnName, Float.class)}}.
We pass the exact column name obtained from {{row.getColumnDefinitions()}}, but it returns the wrong value for a column with this name. This suggests that the issue lies with the driver returning an object with incorrect properties, rather than with the CQL query itself.

*Debugging Efforts:*
 * *Metadata Refresh:* Enabling metadata refresh did not resolve the issue.
 * *Schema Agreement:* {{session.getCqlSession().checkSchemaAgreement()}} did not detect inconsistencies during test execution.

Additional information about the misalignment:

1) *Result Set returned by the driver has mixed up values for the column type “text”.*
This is observed in the *signals library* itself, before any processing on our end.

Example:

In Cassandra
Column 1 text – Value 1
Column 2 text – Value 2
Column 3 text – Value 3

In Result set retrieved by the *signals library*
Column 1 text – Value 2
Column 2 text – Value 1
Column 3 text – Value 3

> Data Corruption during Schema Alterations
> ------------------------------------------
>
>                 Key: CASSANDRA-19662
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19662
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Schema
>            Reporter: BHARATH KUMAR
>            Priority: Urgent
>         Attachments: BufferUnderflow_plus_error
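
For readers trying to picture the misalignment in the Column 1/2/3 example above, here is a minimal toy model in plain Java. It assumes one possible mechanism: the client pairs values with column names purely by position while holding a column order that differs from the order the server used. The report does not establish that this is the root cause. The class {{MisalignmentSketch}} and its methods are hypothetical names made up for illustration; none of them come from the DataStax driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model only: these names are illustrative and do not exist in the driver.
final class MisalignmentSketch {

    // Emits the stored values in the column order a (hypothetical) server uses.
    public static List<String> valuesInOrder(Map<String, String> stored, List<String> serverOrder) {
        List<String> values = new ArrayList<>();
        for (String column : serverOrder) {
            values.add(stored.get(column));
        }
        return values;
    }

    // Pairs each value with the column name at the same index, ignoring
    // whether the client's column order matches the order of the values.
    public static Map<String, String> decodeByPosition(List<String> clientOrder, List<String> values) {
        Map<String, String> row = new LinkedHashMap<>();
        int n = Math.min(clientOrder.size(), values.size());
        for (int i = 0; i < n; i++) {
            row.put(clientOrder.get(i), values.get(i));
        }
        return row;
    }

    // Lists the columns whose decoded value differs from the stored value.
    public static List<String> mismatchedColumns(Map<String, String> expected, Map<String, String> decoded) {
        List<String> mismatched = new ArrayList<>();
        for (Map.Entry<String, String> entry : expected.entrySet()) {
            if (!Objects.equals(entry.getValue(), decoded.get(entry.getKey()))) {
                mismatched.add(entry.getKey());
            }
        }
        return mismatched;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("column1", "value1");
        stored.put("column2", "value2");
        stored.put("column3", "value3");

        // Assumption: server and client disagree on the column order.
        List<String> serverOrder = Arrays.asList("column2", "column1", "column3");
        List<String> clientOrder = Arrays.asList("column1", "column2", "column3");

        Map<String, String> decoded = decodeByPosition(clientOrder, valuesInOrder(stored, serverOrder));
        System.out.println("decoded:    " + decoded);
        System.out.println("mismatched: " + mismatchedColumns(stored, decoded));
    }
}
```

Running {{main}} prints {{decoded:    {column1=value2, column2=value1, column3=value3}}} and {{mismatched: [column1, column2]}}, the same swap as in the example. A comparison like {{mismatchedColumns}} against known test data could also serve as a client-side check while reproducing the issue.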