[jira] [Updated] (HUDI-1453) Throw Exception when input data schema is not equal to the hoodie table schema
[ https://issues.apache.org/jira/browse/HUDI-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1453:
--------------------------------------
    Status: Open  (was: New)

> Throw Exception when input data schema is not equal to the hoodie table schema
> ------------------------------------------------------------------------------
>
>                 Key: HUDI-1453
>                 URL: https://issues.apache.org/jira/browse/HUDI-1453
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Writer Core
>    Affects Versions: 0.9.0
>            Reporter: pengzhiwei
>            Assignee: pengzhiwei
>            Priority: Major
>              Labels: pull-request-available, sev:high, user-support-issues
>             Fix For: 0.9.0
>
> The hoodie table *h0*'s schema is:
> {code:java}
> (id long, price double){code}
> When I write a *dataframe* to *h0* with the following schema:
> {code:java}
> (id long, price int){code}
> an exception is thrown:
> {code:java}
> at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
> at org.apache.hudi.common.util.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:49)
> at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:45)
> at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:102)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ... 4 more
> Caused by: java.lang.UnsupportedOperationException: org.apache.parquet.avro.AvroConverters$FieldIntegerConverter
> at org.apache.parquet.io.api.PrimitiveConverter.addDouble(PrimitiveConverter.java:84)
> at org.apache.parquet.column.impl.ColumnReaderImpl$2$2.writeValue(ColumnReaderImpl.java:228)
> at org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:367)
> at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
> at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:226)
> ... 11 more
> {code}
> I have enabled *AVRO_SCHEMA_VALIDATE*, and the write passes the schema validation in *HoodieTable#validateUpsertSchema*, so it should be legal to write "int" data to the "double" field in hoodie.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
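The compatibility argument in the description can be illustrated with Avro's schema-resolution rules: `int` is promotable to `double`, so a writer schema of `(id long, price int)` is read-compatible with a table schema of `(id long, price double)`, even though the Parquet `FieldIntegerConverter` in the trace above rejects the mismatch at read time. The following is a minimal plain-Java sketch of that promotion check, not Hudi's actual validation code; the class and method names are hypothetical, and the promotion table simply mirrors Avro's numeric promotion rules:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the check HUDI-1453 argues for: rather than
// rejecting any writer schema that differs from the table schema, accept
// writes whose field types are promotable under Avro schema resolution
// (int -> long/float/double, long -> float/double, float -> double).
public class SchemaPromotionCheck {

    // Writer type -> set of reader types it can safely be read as.
    private static final Map<String, Set<String>> PROMOTIONS = Map.of(
            "int",    Set.of("int", "long", "float", "double"),
            "long",   Set.of("long", "float", "double"),
            "float",  Set.of("float", "double"),
            "double", Set.of("double"));

    /** True if a value written as writerType can be read as readerType. */
    static boolean promotable(String writerType, String readerType) {
        return PROMOTIONS.getOrDefault(writerType, Set.of(writerType))
                         .contains(readerType);
    }

    public static void main(String[] args) {
        // The HUDI-1453 scenario: table column is double, input column is int.
        System.out.println(promotable("int", "double"));  // promotable
        System.out.println(promotable("double", "int"));  // lossy, not promotable
    }
}
```

Under these rules the write in the report is valid, which is why passing `HoodieTable#validateUpsertSchema` and then failing inside the Parquet-Avro converter looks inconsistent to the reporter.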
[jira] [Updated] (HUDI-1453) Throw Exception when input data schema is not equal to the hoodie table schema
[ https://issues.apache.org/jira/browse/HUDI-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Li updated HUDI-1453:
--------------------------
    Fix Version/s:     (was: 0.8.0)
                   0.9.0
[jira] [Updated] (HUDI-1453) Throw Exception when input data schema is not equal to the hoodie table schema
[ https://issues.apache.org/jira/browse/HUDI-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Li updated HUDI-1453:
--------------------------
    Affects Version/s: 0.9.0
[jira] [Updated] (HUDI-1453) Throw Exception when input data schema is not equal to the hoodie table schema
[ https://issues.apache.org/jira/browse/HUDI-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-1453:
--------------------------------------
    Labels: pull-request-available sev:high user-support-issues  (was: pull-request-available user-support-issues)
[jira] [Updated] (HUDI-1453) Throw Exception when input data schema is not equal to the hoodie table schema
[ https://issues.apache.org/jira/browse/HUDI-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengzhiwei updated HUDI-1453:
-----------------------------
    Description: (minor wording and formatting edits to the description quoted above)
[jira] [Updated] (HUDI-1453) Throw Exception when input data schema is not equal to the hoodie table schema
[ https://issues.apache.org/jira/browse/HUDI-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-1453:
---------------------------------
    Labels: pull-request-available  (was: )
[jira] [Updated] (HUDI-1453) Throw Exception when input data schema is not equal to the hoodie table schema
[ https://issues.apache.org/jira/browse/HUDI-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pengzhiwei updated HUDI-1453:
-----------------------------
    Summary: Throw Exception when input data schema is not equal to the hoodie table schema  (was: Throw Exception when incoming data schema is not equal to the hoodie table schema)