Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-03-07 Thread via GitHub


FranMorilloAWS commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1983754269

   Or could using a schema registry help?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-03-07 Thread via GitHub


FranMorilloAWS commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1983749818

   With the MySQLSyncDatabaseAction, they claim the following. Currently
   supported schema changes include:

   Adding columns.

   Altering column types. More specifically, the following are supported:
      altering from a string type (char, varchar, text) to another string type with a longer length,
      altering from a binary type (binary, varbinary, blob) to another binary type with a longer length,
      altering from an integer type (tinyint, smallint, int, bigint) to another integer type with a wider range,
      altering from a floating-point type (float, double) to another floating-point type with a wider range.

   So I am wondering how we would be able to do the same with Flink Hudi? Or
   at least, what are the steps to manage a schema change coming from the
   database?

   Any documentation or example you can provide?
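
As a rough illustration of the widening rules quoted above, the check could be sketched in plain Python as follows. This is not code from Paimon or Hudi: the function names, the type orderings, and the rule that string and binary types must both declare a length are assumptions made for the example.

```python
import re

# Hypothetical orderings, narrowest type first. Illustrative only;
# not taken from the Paimon or Hudi code base.
INTEGER_ORDER = ["tinyint", "smallint", "int", "bigint"]
FLOAT_ORDER = ["float", "double"]
STRING_TYPES = {"char", "varchar", "text"}
BINARY_TYPES = {"binary", "varbinary", "blob"}


def parse_type(type_str):
    """Split 'varchar(20)' into ('varchar', 20); the length is None if absent."""
    match = re.fullmatch(r"\s*(\w+)\s*(?:\(\s*(\d+)\s*\))?\s*", type_str.lower())
    if match is None:
        raise ValueError(f"unrecognised type: {type_str!r}")
    name, length = match.group(1), match.group(2)
    return name, (int(length) if length is not None else None)


def is_supported_widening(old_type, new_type):
    """Return True if old_type -> new_type matches one of the quoted rules."""
    old_name, old_len = parse_type(old_type)
    new_name, new_len = parse_type(new_type)

    # Integer and floating-point families: the new type must be strictly wider.
    for order in (INTEGER_ORDER, FLOAT_ORDER):
        if old_name in order and new_name in order:
            return order.index(new_name) > order.index(old_name)

    # String and binary families: the new declared length must be longer.
    # Simplification (assumption): both types must declare a length.
    for family in (STRING_TYPES, BINARY_TYPES):
        if old_name in family and new_name in family:
            return old_len is not None and new_len is not None and new_len > old_len

    # Changes across families (for example int -> double) are not in the list.
    return False
```

For example, `is_supported_widening("int", "bigint")` is accepted, while `is_supported_widening("bigint", "int")` and `is_supported_widening("int", "double")` are rejected.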



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-02-29 Thread via GitHub


danny0405 commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1972255060

   Paimon does not do it; it just detects the schema the first time it is
   started.



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-02-29 Thread via GitHub


FranMorilloAWS commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1970749563

   Then how is Apache Paimon able to do it now?



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-02-28 Thread via GitHub


danny0405 commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1970306800

   There is no automatic schema evolution for the streaming writer at the
   moment. The limitation comes from the Flink engine: the Flink Table API
   already assumes a constant schema for all records, so the user needs to
   restart the pipeline manually when a schema change is detected.
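
To make the "detect a schema change, then restart" step concrete, here is a minimal sketch in plain Python. It does not use any Flink or Hudi API; it assumes that the schema the job was started with and the latest upstream schema are both available as simple `{column_name: type_name}` dictionaries, and the function names are hypothetical.

```python
def diff_schema(old, new):
    """Compare two {column_name: type_name} mappings (hypothetical format)."""
    added = {col: typ for col, typ in new.items() if col not in old}
    removed = {col: typ for col, typ in old.items() if col not in new}
    changed = {
        col: (old[col], new[col])
        for col in old
        if col in new and old[col] != new[col]
    }
    return {"added": added, "removed": removed, "changed": changed}


def needs_restart(old, new):
    """True if anything differs between the running and the latest schema."""
    return any(diff_schema(old, new).values())


if __name__ == "__main__":
    running = {"id": "bigint", "name": "varchar(20)"}
    latest = {"id": "bigint", "name": "varchar(50)", "discount": "double"}
    print(diff_schema(running, latest))
    print(needs_restart(running, latest))
```

How the latest schema is obtained (from the CDC source, a schema registry, or elsewhere) is outside the scope of this sketch.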



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-02-28 Thread via GitHub


FranMorilloAWS commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1969367187

   Any updates on this?



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-01-31 Thread via GitHub


ad1happy2go commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1919260878

   @danny0405 



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2023-12-19 Thread via GitHub


FranMorilloAWS commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1862304670

   And would I also need to modify the schema in RowData, or capture the
   schema from a schema registry and then use it in the sink? I am wondering
   if there is any other writer that could be more helpful here than the
   hoodie pipeline sink.



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2023-12-18 Thread via GitHub


danny0405 commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1862067864

   Currently, you should stop the streaming job, execute the ALTER TABLE
   command with Spark, and then restart the job.
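
As a sketch of the middle step, the snippet below only builds the statement text for adding columns; it does not talk to Spark. The helper name `build_add_columns_sql` and the table and column names are hypothetical, and the `ALTER TABLE ... ADD COLUMNS (...)` form is assumed to be the one accepted by your Hudi and Spark versions, so check it against the schema evolution documentation for your release.

```python
import re

# Simple identifier check so that arbitrary text cannot be spliced
# into the generated statement.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def build_add_columns_sql(table, columns):
    """Build an ALTER TABLE ... ADD COLUMNS statement.

    table   -- table name, optionally qualified as db.table
    columns -- mapping of new column name to Spark SQL type name
    """
    if not columns:
        raise ValueError("at least one column is required")
    if not _IDENT.fullmatch(table):
        raise ValueError(f"invalid table name: {table!r}")
    parts = []
    for name, type_name in columns.items():
        if not _IDENT.fullmatch(name):
            raise ValueError(f"invalid column name: {name!r}")
        parts.append(f"{name} {type_name}")
    return f"ALTER TABLE {table} ADD COLUMNS ({', '.join(parts)})"


if __name__ == "__main__":
    # Hypothetical table and column, for illustration only.
    print(build_add_columns_sql("hudi_db.orders", {"discount": "double"}))
```

With the streaming job stopped, the resulting string could then be passed to `spark.sql(...)` in a Spark session configured for Hudi, after which the Flink job is restarted with the new schema.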



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2023-12-18 Thread via GitHub


ad1happy2go commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1860925122

   Yes, it should ideally work if the data type is backward compatible.



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2023-12-18 Thread via GitHub


FranMorilloAWS commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1859865158

   Do you have an example? Would this only work for adding a new field, or
   could it also work when modifying a column type?
   



Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2023-12-18 Thread via GitHub


ad1happy2go commented on issue #10349:
URL: https://github.com/apache/hudi/issues/10349#issuecomment-1859818819

   What kind of schema evolution use cases do you have? Common use cases
   such as adding a new field should be supported automatically with a new
   write.

