[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2021-03-23 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-837:
Status: Closed  (was: Patch Available)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: pull-request-available, sev:critical, user-support-issues
> Fix For: 0.8.0
>
>
> Currently, AvroKafkaSource sets value.deserializer to KafkaAvroDeserializer. 
> As a result, each published record is read with the schema it was written 
> with, even if the schema has evolved since then, so the messages in an 
> incoming batch can have different schemas. This has to be handled when the 
> records are actually written to Parquet. 
> This Jira aims to provide an option to read all messages with the same 
> schema by implementing a new custom deserializer class. 
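
The idea of reading every message with one schema can be illustrated with a small, self-contained sketch. This is not Hudi or Confluent code: the toy schema format, `field`, `read_with_schema`, and the sample records are all hypothetical. The sketch only mimics the Avro schema-resolution rules (a field missing from the written record takes the reader schema's default, and a field unknown to the reader schema is dropped), assuming the latest schema is used as the reader schema.

```python
from typing import Any, Dict, List

# Sentinel meaning "this field has no default"; distinct from a default of None.
_NO_DEFAULT = object()


def field(name: str, default: Any = _NO_DEFAULT) -> Dict[str, Any]:
    """Describe one field of a toy (hypothetical) schema."""
    return {"name": name, "default": default}


def read_with_schema(record: Dict[str, Any],
                     reader_schema: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Project a record onto the reader schema.

    Fields absent from the record take the reader schema's default;
    fields the reader schema does not know are dropped.
    """
    out: Dict[str, Any] = {}
    for f in reader_schema:
        name = f["name"]
        if name in record:
            out[name] = record[name]
        elif f["default"] is not _NO_DEFAULT:
            out[name] = f["default"]
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return out


# Assumed "latest" schema: 'email' was added later, with a default of None.
latest_schema = [field("id"), field("name"), field("email", default=None)]

# A batch mixing records written before and after the schema evolved.
batch = [
    {"id": 1, "name": "a"},                            # written with the old schema
    {"id": 2, "name": "b", "email": "b@example.com"},  # written with the new schema
    {"id": 3, "name": "c", "legacy": True},            # carries a field since removed
]

# Every record now has exactly the fields of the latest schema.
uniform_batch = [read_with_schema(r, latest_schema) for r in batch]
```

With this projection applied at read time, every record in the batch has the same shape before it reaches the writer, which is the effect the description asks for.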



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2021-03-23 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-837:
Fix Version/s: 0.8.0  (was: 0.9.0)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: pull-request-available, sev:critical, user-support-issues
> Fix For: 0.8.0


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2021-03-23 Thread Gary Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Li updated HUDI-837:
Fix Version/s: 0.9.0  (was: 0.8.0)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: pull-request-available, sev:critical, user-support-issues
> Fix For: 0.9.0


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2021-02-06 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-837:
Labels: pull-request-available sev:critical user-support-issues  (was: 
pull-request-available user-support-issues)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: pull-request-available, sev:critical, user-support-issues
> Fix For: 0.8.0


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2021-01-26 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-837:
Labels: pull-request-available user-support-issues  (was: bug-bash-0.6.0 
pull-request-available)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: pull-request-available, user-support-issues
> Fix For: 0.8.0


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2021-01-20 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated HUDI-837:

Fix Version/s: 0.8.0  (was: 0.7.0)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: bug-bash-0.6.0, pull-request-available
> Fix For: 0.8.0


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2020-08-14 Thread Bhavani Sudha (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhavani Sudha updated HUDI-837:
Fix Version/s: 0.6.1  (was: 0.6.0)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: bug-bash-0.6.0, pull-request-available
> Fix For: 0.6.1


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2020-05-23 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-837:
Status: Patch Available  (was: In Progress)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: bug-bash-0.6.0, pull-request-available
> Fix For: 0.6.0


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2020-05-08 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-837:
Labels: bug-bash-0.6.0 pull-request-available  (was: pull-request-available)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: bug-bash-0.6.0, pull-request-available
> Fix For: 0.6.0


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2020-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-837:

Labels: pull-request-available  (was: )

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.6.0


[jira] [Updated] (HUDI-837) Fix AvroKafkaSource to use the latest schema for reading

2020-04-24 Thread Pratyaksh Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pratyaksh Sharma updated HUDI-837:
Summary: Fix AvroKafkaSource to use the latest schema for reading  (was: 
Fix KafkaAvroSource to use the latest schema for reading)

> Fix AvroKafkaSource to use the latest schema for reading
>
> Key: HUDI-837
> URL: https://issues.apache.org/jira/browse/HUDI-837
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: DeltaStreamer
> Reporter: Pratyaksh Sharma
> Assignee: Pratyaksh Sharma
> Priority: Major
> Fix For: 0.6.0