fateh288 commented on PR #569: URL: https://github.com/apache/ranger/pull/569#issuecomment-3120882420
@vikaskr22

`New binaries not yet applied, means, read/write will continue to happen with old column. How you are handling the case, where a column is being updated by the application after copying from col1-> col2 ?`

Great point. No, this case is not being handled: any update to the old column after it has been copied, but before the trigger is created, will not be reflected in the new column. For KMS, it was observed that copying one column to another for 1M rows took 35 seconds. Realistically we won't have 1M rows, just a few hundred rows at most, so the window between the column copy and trigger creation would be a few seconds at most.

`If we are using a trigger on each data modification that will copy to new column. Then, we have multiple triggers for the same column ( if multiple modification is happening), and these triggers may also overlap with the job/cursor that is copying old format data to new column.`

The trigger is created only after the copy job has completed, so I am not clear how the data copy job and the trigger would overlap.

```
This scenario is posing two risks,
Triggers processing ( due to multiple updates on same column ) may be processed in different order and older data may be written to new column.
```

From my understanding, if there are multiple modifications to the same column, the triggers are processed sequentially; this behavior comes out of the box from the database. We can confirm this in case there is any gap in my understanding.

`what if some trigger processing/execution fails, We may log this, report this. But do we have any systematic way ( out of box from the framework itself) to detect this and retry ?`

My understanding is that if the trigger fails, the original transaction also fails, i.e. the original update will be reverted if the trigger fails to copy the data to the new column. The trigger effectively becomes part of the original transaction.

```
Step1: Apply the schema changes, means, create a new column.
Step2: Using dynamic configuration (needs to be implemented), let running instances know that ZDU process has started and in such cases they will write to old column in older format and additionally they will also write an event in one table (say ZDU upgrade table, a new table). After writing at both places , then only, transaction will be successful.
Step3: As part of your framework, that contains logic to migrate data from col1 to col2, you code/cursor should read event in the insertion order to process this and once it is processed, then only it should be deleted from table. If any Runtime error occurs, since it has not been deleted, it will be retried.
```

I am not clear on this. The old bits / old KMS instances will not have logic to write to a new table: the old application is not aware of the new schema changes or of the new table that needs to be written to. Writing to both the new and old columns can be done by the new KMS instance so that new data is available to the old instances too, but I don't fully understand how old instances can write to the new column.
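To make the copy-then-trigger sequence (and the race window discussed above) concrete, here is a minimal sketch using SQLite via Python's stdlib `sqlite3`. The table and column names (`kms_keys`, `col_old`, `col_new`) are hypothetical placeholders, not the actual Ranger KMS schema, and real deployments would use MySQL/PostgreSQL trigger syntax instead:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kms_keys (id INTEGER PRIMARY KEY, col_old TEXT, col_new TEXT)")
conn.executemany("INSERT INTO kms_keys (col_old) VALUES (?)", [("k1",), ("k2",)])

# Step A: bulk-copy existing rows from the old column to the new one.
conn.execute("UPDATE kms_keys SET col_new = col_old")

# Race window: a write that lands here, after the copy but before the
# trigger exists, is NOT mirrored into col_new (the gap acknowledged above).
conn.execute("UPDATE kms_keys SET col_old = 'k1-late' WHERE id = 1")

# Step B: create the sync trigger. From now on, every write to col_old
# is mirrored into col_new as part of the same statement.
conn.execute("""
CREATE TRIGGER sync_col AFTER UPDATE OF col_old ON kms_keys
BEGIN
    UPDATE kms_keys SET col_new = NEW.col_old WHERE id = NEW.id;
END
""")
conn.execute("UPDATE kms_keys SET col_old = 'k2-new' WHERE id = 2")

rows = dict(conn.execute("SELECT id, col_new FROM kms_keys"))
print(rows)  # row 1 kept the stale copy; row 2 was synced by the trigger
```

Running this shows exactly the window under discussion: row 1's late update is lost in `col_new`, while row 2, modified after trigger creation, stays in sync.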
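The claim that a failing trigger also fails the original write can be checked with a small sketch, again using SQLite via `sqlite3` (the `kms_keys` schema is a hypothetical stand-in): `RAISE(ABORT, ...)` in a trigger body backs out the triggering statement, so the old-column update never lands if the copy to the new column fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kms_keys (id INTEGER PRIMARY KEY, col_old TEXT, col_new TEXT)")
conn.execute("INSERT INTO kms_keys (id, col_old, col_new) VALUES (1, 'k1', 'k1')")

# Simulate a sync trigger whose body fails unconditionally.
conn.execute("""
CREATE TRIGGER sync_fail AFTER UPDATE OF col_old ON kms_keys
BEGIN
    SELECT RAISE(ABORT, 'copy to new column failed');
END
""")

failed = False
try:
    conn.execute("UPDATE kms_keys SET col_old = 'k1-new' WHERE id = 1")
except sqlite3.DatabaseError:
    failed = True  # the UPDATE itself errored out

# The original update was backed out together with the trigger.
old_val = conn.execute("SELECT col_old FROM kms_keys WHERE id = 1").fetchone()[0]
print(failed, old_val)  # True k1
```

The same semantics hold in MySQL and PostgreSQL, where an unhandled error inside a trigger aborts the triggering statement.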