Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
FranMorilloAWS commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1983754269 Or could using a schema registry help? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
FranMorilloAWS commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1983749818 With the MySQLSyncDatabaseAction they claim the following: "Currently supported schema changes include: Adding columns. Altering column types. More specifically, altering from a string type (char, varchar, text) to another string type with longer length, altering from a binary type (binary, varbinary, blob) to another binary type with longer length, altering from an integer type (tinyint, smallint, int, bigint) to another integer type with wider range, and altering from a floating-point type (float, double) to another floating-point type with wider range are supported." So I am wondering how we could do the same with Flink Hudi, or at least what steps to follow when the schema changes in the database. Is there any documentation or example you can provide?
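The widening rules quoted in this comment can be written down as a small check. The following is only a minimal sketch in Python, not code from Hudi, Flink, or Paimon: the function names `parse_type` and `is_widening` are hypothetical, types are plain MySQL-style strings, and a string or binary type without an explicit length (for example `text` or `blob`) is conservatively treated as not comparable, which is an assumption of this sketch.

```python
import re

# Type families taken from the rules quoted in the comment above.
# The two ordered lists go from narrowest to widest.
INTEGER_ORDER = ["tinyint", "smallint", "int", "bigint"]
FLOAT_ORDER = ["float", "double"]
STRING_TYPES = {"char", "varchar", "text"}
BINARY_TYPES = {"binary", "varbinary", "blob"}

_TYPE_RE = re.compile(r"^\s*([a-z]+)\s*(?:\(\s*(\d+)\s*\))?\s*$")


def parse_type(type_str):
    """Split 'varchar(20)' into ('varchar', 20); length is None when absent."""
    match = _TYPE_RE.match(type_str.lower())
    if match is None:
        raise ValueError(f"unrecognized type: {type_str!r}")
    name, length = match.groups()
    return name, int(length) if length is not None else None


def is_widening(old, new):
    """Return True if changing a column from `old` to `new` only widens it."""
    old_name, old_len = parse_type(old)
    new_name, new_len = parse_type(new)

    # Integer and floating-point types: the new type must rank strictly wider.
    for order in (INTEGER_ORDER, FLOAT_ORDER):
        if old_name in order and new_name in order:
            return order.index(new_name) > order.index(old_name)

    # String and binary types: same family and a strictly longer length.
    for family in (STRING_TYPES, BINARY_TYPES):
        if old_name in family and new_name in family:
            # Assumption: without explicit lengths we cannot compare, so refuse.
            if old_len is None or new_len is None:
                return False
            return new_len > old_len

    # Anything else (cross-family changes, unknown types) is not a widening.
    return False
```

A pipeline could use such a check to decide whether a detected type change falls under the "backward compatible" case discussed in this thread before applying it to the table.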
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
danny0405 commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1972255060 Paimon does not do it; it just detects the schema the first time it is started.
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
FranMorilloAWS commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1970749563 Then how is Apache Paimon able to do it now?
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
danny0405 commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1970306800 There is no automatic schema evolution for the streaming writer for now. The limitation comes from the Flink engine: the Flink Table API already assumes a constant schema for all records, so the user needs to restart the pipeline manually when a schema change is detected.
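Since the restart is manual, something outside the pipeline has to notice that the source schema has changed. The following is a minimal sketch in Python of that detection step, assuming schemas are available as plain `{column_name: type_string}` dictionaries (for example, read from the source database or a schema registry). The names `diff_schemas` and `needs_restart` are hypothetical and are not part of Hudi or Flink.

```python
def diff_schemas(current, incoming):
    """Compare two {column: type} mappings and report what changed.

    `current` is the schema the running job was started with;
    `incoming` is the schema observed at the source now.
    """
    added = {name: col_type for name, col_type in incoming.items()
             if name not in current}
    removed = sorted(name for name in current if name not in incoming)
    retyped = {name: (current[name], incoming[name])
               for name in current
               if name in incoming and current[name] != incoming[name]}
    return {"added": added, "removed": removed, "retyped": retyped}


def needs_restart(diff):
    """Any difference means the fixed-schema job must be stopped and restarted."""
    return any(diff.values())
```

For example, comparing `{"id": "int"}` against `{"id": "bigint", "email": "varchar(50)"}` reports one added column and one retyped column, so `needs_restart` returns `True`.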
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
FranMorilloAWS commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1969367187 Any updates on this?
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
ad1happy2go commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1919260878 @danny0405
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
FranMorilloAWS commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1862304670 Could we also modify the schema in the RowData, or capture the schema from a schema registry and then use it in the sink? I am wondering whether there is any other writer that could help here, besides the Hoodie pipeline sink.
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
danny0405 commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1862067864 Currently, you should stop the streaming job, execute the ALTER TABLE command with Spark, and then restart the job.
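The middle step of that procedure can be sketched as a helper that turns a detected schema change into ALTER TABLE statements to run in Spark SQL while the Flink job is stopped. This is a minimal sketch in Python under stated assumptions: `build_alter_statements` and the table name `db.orders` are hypothetical, the `ADD COLUMNS` and `ALTER COLUMN ... TYPE` forms follow Spark SQL DDL as commonly used with Hudi tables, and whether a given type change is accepted (and which Hudi settings it requires) should be verified against the documentation for the Hudi and Spark versions in use.

```python
def build_alter_statements(table, added, retyped):
    """Build ALTER TABLE statements for a stopped pipeline.

    table   -- fully qualified table name (hypothetical, e.g. 'db.orders')
    added   -- {column_name: spark_type} for new columns
    retyped -- {column_name: new_spark_type} for columns whose type changed

    Intended order of operations, as described in the comment above:
      1. stop the Flink streaming job
      2. run the returned statements with Spark SQL
      3. restart the Flink job with the updated schema
    """
    statements = []
    if added:
        # All new columns go into a single ADD COLUMNS statement.
        cols = ", ".join(f"{name} {col_type}" for name, col_type in added.items())
        statements.append(f"ALTER TABLE {table} ADD COLUMNS ({cols})")
    for name, new_type in retyped.items():
        # One statement per type change.
        statements.append(f"ALTER TABLE {table} ALTER COLUMN {name} TYPE {new_type}")
    return statements
```

Calling `build_alter_statements("db.orders", {"email": "string"}, {"id": "bigint"})` yields one `ADD COLUMNS` statement followed by one `ALTER COLUMN` statement. The helper only builds strings, so the statements can be reviewed before anything is executed.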
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
ad1happy2go commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1860925122 Yes, it should ideally work if the data type is backward compatible.
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
FranMorilloAWS commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1859865158 Do you have an example? Would this only work for adding a new field, or could it also work for modifying a column type?
Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]
ad1happy2go commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1859818819 What kind of schema evolution use cases do you have? Common use cases such as adding a new field should be supported automatically with a new write.