[GitHub] [hudi] xiarixiaoyao commented on issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-09-07 Thread GitBox
xiarixiaoyao commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-1240122505 @santoshsb you need to use schema evolution together with hoodie.datasource.write.reconcile.schema; see the following code: ``` def perf(spark: SparkSession) = { import

[GitHub] [hudi] xiarixiaoyao commented on issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-06-20 Thread GitBox
xiarixiaoyao commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-1161139148 @santoshsb could you please share the code for the nested-columns case? Thanks. I think we may be able to solve this problem together with https://issues.apache.org/jira/browse/HUDI-4276

[GitHub] [hudi] xiarixiaoyao commented on issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-05-04 Thread GitBox
xiarixiaoyao commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-1118100800 @santoshsb createNewDF does not support rewriting a DataFrame with nested schema changes. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [hudi] xiarixiaoyao commented on issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-04-29 Thread GitBox
xiarixiaoyao commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-1113004662 @santoshsb that is strange; let me try it.

[GitHub] [hudi] xiarixiaoyao commented on issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-04-28 Thread GitBox
xiarixiaoyao commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-963890 @santoshsb please use the following code to solve your problem: ``` def createNewDF(df: DataFrame, oldTableSchema: StructType): DataFrame = { val writeSchema = df.schema

[GitHub] [hudi] xiarixiaoyao commented on issue #5452: Schema Evolution: Missing column for previous records when new entry does not have the same while upsert.

2022-04-27 Thread GitBox
xiarixiaoyao commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-713903 @santoshsb multipleBirthBoolean is a new column to be added, but how should its position be determined? Is it added as the last column or somewhere else? If the above