[ 
https://issues.apache.org/jira/browse/KUDU-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin resolved KUDU-3483.
---------------------------------
    Fix Version/s: 1.18.0
       Resolution: Fixed

Thank you for reporting and fixing the issue, [~wangxixu]!

> Flushing data in AUTO_FLUSH_BACKGROUND mode fails when the table's schema is 
> changing
> -------------------------------------------------------------------------------------
>
>                 Key: KUDU-3483
>                 URL: https://issues.apache.org/jira/browse/KUDU-3483
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Xixu Wang
>            Priority: Major
>             Fix For: 1.18.0
>
>         Attachments: image-2023-05-30-16-12-20-361.png
>
>
>  
> *1. The problem*
> Flushing multiple rows in AUTO_FLUSH_BACKGROUND mode may fail when the table 
> schema has changed. The following is the error message:
> !image-2023-05-30-16-12-20-361.png!
>  
> *2. How to reproduce*
> 1. Create a table with 2 columns.
> 2. Insert a row into this table in AUTO_FLUSH_BACKGROUND mode.
> 3. Add 3 new columns to this table.
> 4. Reopen the table.
> 5. Insert another row into this table in AUTO_FLUSH_BACKGROUND mode.
> 6. Flush the buffer.
> {code:java}
> KuduTable table = createTable(ImmutableList.of());
> // Insert a row using the original two-column schema.
> final KuduSession session = client.newSession();    
> session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);   
>  
> Insert insert = table.newInsert();    
> PartialRow row = insert.getRow();    
> row.addInt("c0", 101);    
> row.addInt("c1", 101);    
> session.apply(insert);
> // Add some new columns.    
> client.alterTable(tableName, new AlterTableOptions()    
>   .addColumn("addNonNull", Type.INT32, 100)    
>   .addNullableColumn("addNullable", Type.INT32)    
>   .addNullableColumn("addNullableDef", Type.INT32, 200));
>     
> // Reopen table for the new schema.    
> table = client.openTable(tableName);    
> assertEquals(5, table.getSchema().getColumnCount());    
> // Insert a row using the new schema, with addNullableDef set to null.
> Insert newinsert = table.newInsert();
> PartialRow newrow = newinsert.getRow();    
> newrow.addInt("c0", 101);    
> newrow.addInt("c1", 101);    
> newrow.addInt("addNonNull", 101);    
> newrow.addInt("addNullable", 101);    
> newrow.setNull("addNullableDef");    
> session.apply(newinsert);    
> session.flush(); {code}
>  
> *3. Why this problem happens*
> In AUTO_FLUSH_BACKGROUND mode, an applied operation is first inserted into 
> the buffer. When the buffer is full or flush() is called, the client tries 
> to flush the buffered operations to the Kudu server. First, it groups the 
> operations by tablet ID into batches, so a batch may contain multiple rows 
> that belong to the same tablet. Then each batch is encoded into bytes. At 
> this point, the client reads the table schema of the first row and uses it 
> to decide the format of the data. If two rows belong to the same table but 
> have different schemas, because the table was altered between the two 
> inserts, encoding fails with an array index out of bounds exception.
>  
> By the way, it is hard to trace the whole process, especially in the Kudu 
> tablet server. It would be better to log the downstream IP and client ID.
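
The failure described in section 3 of the quoted report can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions and is not the Kudu client's actual encoder: the `Row` class, `encodeBatch`, and `hasUniformSchema` are hypothetical names invented for illustration, and the "encoding" is reduced to copying int values into a flat array sized from the first row's schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these names come from the Kudu client.
public class SchemaMismatchSketch {

  /** A row that remembers the column names of the schema it was built against. */
  static final class Row {
    final String[] columns;
    final int[] values;

    Row(String[] columns, int[] values) {
      this.columns = columns;
      this.values = values;
    }
  }

  /** Sizes the output from the FIRST row's schema, then writes every row into it. */
  static int[] encodeBatch(List<Row> batch) {
    int width = batch.get(0).columns.length;
    int[] out = new int[width * batch.size()];
    for (int r = 0; r < batch.size(); r++) {
      Row row = batch.get(r);
      for (int c = 0; c < row.values.length; c++) {
        // A wider row eventually writes past the end of the buffer.
        out[r * width + c] = row.values[c];
      }
    }
    return out;
  }

  /** Hypothetical guard: true only if all rows were built against the same columns. */
  static boolean hasUniformSchema(List<Row> batch) {
    String[] first = batch.get(0).columns;
    for (Row row : batch) {
      if (!Arrays.equals(first, row.columns)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Row> batch = new ArrayList<>();
    // Row built before the ALTER TABLE: two columns.
    batch.add(new Row(new String[] {"c0", "c1"}, new int[] {101, 101}));
    // Row built after the ALTER TABLE: five columns (0 stands in for the null value).
    batch.add(new Row(
        new String[] {"c0", "c1", "addNonNull", "addNullable", "addNullableDef"},
        new int[] {101, 101, 101, 101, 0}));

    System.out.println("uniform schema: " + hasUniformSchema(batch));
    try {
      encodeBatch(batch);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("encode failed: " + e.getClass().getSimpleName());
    }
  }
}
```

In this model the second row carries five values while the buffer was sized for two columns per row, so the write runs off the end of the array. A check such as `hasUniformSchema` is one conceivable way to detect the condition before encoding; whether the actual fix in 1.18.0 takes that approach is not stated in this message.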



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
