Actually, I guess it is a bug in Hudi: the reader and writer schema arguments
are passed in the wrong order (the reader schema is sent as the writer schema
and vice versa). Will file a bug. Then, as you'd expect, INT should be
evolvable to LONG, whereas the reverse is incompatible.
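
For reference, here is a minimal sketch (class name made up) of what Avro's
check returns once the reader and writer arguments are passed in the correct
order:

  import org.apache.avro.Schema;
  import org.apache.avro.SchemaCompatibility;
  import org.apache.avro.SchemaCompatibility.SchemaPairCompatibility;

  public class IntLongPromotionCheck {
    public static void main(String[] args) {
      Schema intSchema = Schema.create(Schema.Type.INT);
      Schema longSchema = Schema.create(Schema.Type.LONG);

      // Reader LONG, writer INT: COMPATIBLE, since Avro promotes int -> long.
      SchemaPairCompatibility widen =
          SchemaCompatibility.checkReaderWriterCompatibility(longSchema, intSchema);
      System.out.println(widen.getType());

      // Reader INT, writer LONG: INCOMPATIBLE, since long does not narrow to int.
      // Swapping the two arguments flips both results, which would explain the
      // behavior reported below.
      SchemaPairCompatibility narrow =
          SchemaCompatibility.checkReaderWriterCompatibility(intSchema, longSchema);
      System.out.println(narrow.getType());
    }
  }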


On Mon, Sep 7, 2020 at 12:17 PM Sivabalan <[email protected]> wrote:

> Hudi relies on Avro's schema compatibility check. Looks like, as per Avro's
> SchemaCompatibility, INT can't be evolved to a LONG, but LONG to INT is
> allowed.
>
> Check line no. 339 here
> <https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaCompatibility.java>.
> Also, check their test case here
> <https://github.com/apache/avro/blob/master/lang/java/avro/src/test/java/org/apache/avro/TestSchemaCompatibilityTypeMismatch.java>
> at line 44.
>
>
>
> On Mon, Sep 7, 2020 at 12:02 PM Prashant Wason <[email protected]> wrote:
>
>> Yes, the schema change looks fine. That would mean it's an issue with the
>> schema compatibility checker. There are explicit checks for such cases, so
>> I can't say where the issue lies.
>>
>> I am out on a vacation this week. I will look into this as soon as I am
>> back.
>>
>> Thanks
>> Prashant
>>
>> On Sun, Sep 6, 2020, 11:18 AM Vinoth Chandar <[email protected]> wrote:
>>
>> > That does sound like a backwards compatible change.
>> > @prashant, any ideas here? (since you have the best context on the
>> > schema validation checks)
>> >
>> > On Thu, Sep 3, 2020 at 8:12 PM cadl <[email protected]> wrote:
>> >
>> > > Hi All,
>> > >
>> > > I want to change the type of one column in my COW table, from int to
>> > > long. When I set “hoodie.avro.schema.validate = true” and upsert new
>> > > data with long type, I got a “Failed upsert schema compatibility check”
>> > > error. Does it break backwards compatibility? If I disable
>> > > hoodie.avro.schema.validate, I can upsert and read normally.
>> > >
>> > >
>> > > code demo: https://gist.github.com/cadl/be433079747aeea88c9c1f45321cc2eb
>> > >
>> > > stacktrace:
>> > >
>> > >
>> > > org.apache.hudi.exception.HoodieUpsertException: Failed upsert schema compatibility check.
>> > >   at org.apache.hudi.table.HoodieTable.validateUpsertSchema(HoodieTable.java:572)
>> > >   at org.apache.hudi.client.HoodieWriteClient.upsert(HoodieWriteClient.java:190)
>> > >   at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:260)
>> > >   at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:169)
>> > >   at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:125)
>> > >   at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
>> > >   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>> > >   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>> > >   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
>> > >   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
>> > >   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
>> > >   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
>> > >   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>> > >   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
>> > >   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
>> > >   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
>> > >   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
>> > >   at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
>> > >   at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
>> > >   at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>> > >   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>> > >   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>> > >   at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
>> > >   at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
>> > >   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
>> > >   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
>> > >   ... 69 elided
>> > > Caused by: org.apache.hudi.exception.HoodieException: Failed schema compatibility check for writerSchema :{"type":"record","name":"foo_record","namespace":"hoodie.foo","fields":[{"name":"_hoodie_commit_time","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_commit_seqno","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_record_key","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_partition_path","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_file_name","type":["null","string"],"doc":"","default":null},{"name":"a","type":"long"},{"name":"b","type":"string"},{"name":"__row_key","type":"int"},{"name":"__row_version","type":"int"}]},
>> > > table schema :{"type":"record","name":"foo_record","namespace":"hoodie.foo","fields":[{"name":"_hoodie_commit_time","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_commit_seqno","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_record_key","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_partition_path","type":["null","string"],"doc":"","default":null},{"name":"_hoodie_file_name","type":["null","string"],"doc":"","default":null},{"name":"a","type":"int"},{"name":"b","type":"string"},{"name":"__row_key","type":"int"},{"name":"__row_version","type":"int"}]},
>> > > base path :file:///jfs/cadl/hudi_data/schema/foo
>> > >   at org.apache.hudi.table.HoodieTable.validateSchema(HoodieTable.java:564)
>> > >   at org.apache.hudi.table.HoodieTable.validateUpsertSchema(HoodieTable.java:570)
>> > >   ... 94 more.
>> >
>>
>
>
> --
> Regards,
> -Sivabalan
>


-- 
Regards,
-Sivabalan
