[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-05-11 Thread Balaji Balasubramaniam (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342814#comment-17342814
 ] 

Balaji Balasubramaniam commented on HUDI-874:

[~uditme] [~wenningd] - I'll try again with EMR 6.2.0 and see how it goes. The 
issue is not with adding an additional column; Hudi handles that beautifully. 
The failure happens when the table is partitioned on a column and a new value 
arrives, so that a new partition needs to be created. 

I'll attach a sample schema and data file, hopefully by the end of today or 
tomorrow. 
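
Until the sample files are attached, the scenario described above can be 
sketched roughly as follows. This is a minimal sketch and not the reporter's 
actual job: the table, database, and column names are hypothetical, and the 
option keys are the Hudi datasource and Hive sync settings as commonly 
documented for releases of this period. It builds the write options for a 
partitioned table and shows which batch introduces a new partition value.

```python
# Hypothetical names throughout: the table, database, and columns are
# placeholders and are not taken from the reporter's job.

def hudi_write_options(table, database, record_key, partition_field, precombine_field):
    """Build datasource options for a partitioned Hudi table with Hive sync enabled."""
    return {
        "hoodie.table.name": table,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.hive_sync.enable": "true",
        "hoodie.datasource.hive_sync.database": database,
        "hoodie.datasource.hive_sync.table": table,
        "hoodie.datasource.hive_sync.partition_fields": partition_field,
        "hoodie.datasource.hive_sync.partition_extractor_class":
            "org.apache.hudi.hive.MultiPartKeysValueExtractor",
    }


def new_partition_values(existing_partitions, batch, partition_field):
    """Return the partition values in `batch` that the table has not seen yet."""
    seen = set(existing_partitions)
    incoming = {row[partition_field] for row in batch}
    return sorted(incoming - seen)


options = hudi_write_options("orders", "analytics", "order_id", "region", "updated_at")

first_batch = [
    {"order_id": 1, "region": "us", "updated_at": 100},
    {"order_id": 2, "region": "us", "updated_at": 101},
]
second_batch = [
    {"order_id": 3, "region": "us", "updated_at": 102},
    {"order_id": 4, "region": "eu", "updated_at": 103},  # new partition value
]

partitions_after_first = new_partition_values([], first_batch, "region")
partitions_added_by_second = new_partition_values(
    partitions_after_first, second_batch, "region"
)
print(partitions_added_by_second)  # the second write is the one expected to fail
```

In a PySpark job, a dict like this would typically be passed as 
df.write.format("hudi").options(**options).mode("append").save(path), with the 
second write being the one that triggers the new-partition sync.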

> Schema evolution does not work with AWS Glue catalog
>
>                 Key: HUDI-874
>                 URL: https://issues.apache.org/jira/browse/HUDI-874
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Hive Integration
>            Reporter: Udit Mehrotra
>            Assignee: Udit Mehrotra
>            Priority: Major
>              Labels: aws-emr, sev:critical, user-support-issues
>
> This issue has been discussed here 
> [https://github.com/apache/incubator-hudi/issues/1581] and at other places as 
> well. Glue catalog currently does not support *cascade* for *ALTER TABLE* 
> statements. As a result, features like adding new columns to an existing table 
> do not work with Glue catalog.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-05-11 Thread Wenning Ding (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342805#comment-17342805
 ] 

Wenning Ding commented on HUDI-874:

Can you share some reproduction steps?

Here is what I tried on EMR 6.1.0:
 # Created a Hudi table with 4 columns.
 # Appended a new column at the end (5 columns in total) and upserted into the 
Hudi table.
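
The schema change in the steps above is an append at the end of the column 
list. As a rough illustration only, the following sketch checks that a new 
schema differs from the old one purely by appended columns. The column names 
and types are hypothetical, and this is not how Hudi itself validates schemas.

```python
def appended_columns(old_schema, new_schema):
    """Return the columns appended to the end of old_schema.

    Schemas are lists of (name, type) tuples. Raises ValueError if the change
    is anything other than an append at the end.
    """
    if new_schema[:len(old_schema)] != old_schema:
        raise ValueError("existing columns were changed, removed, or reordered")
    return new_schema[len(old_schema):]


# Hypothetical 4-column table, then the same table with a 5th column appended.
old_schema = [("id", "int"), ("name", "string"), ("ts", "bigint"), ("dt", "string")]
new_schema = old_schema + [("email", "string")]

added = appended_columns(old_schema, new_schema)
print(added)
```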



[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-05-06 Thread Udit Mehrotra (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340409#comment-17340409
 ] 

Udit Mehrotra commented on HUDI-874:


[~balajiit] Can you share some quick/easy reproduction steps?



[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-04-22 Thread Balaji Balasubramaniam (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17329312#comment-17329312
 ] 

Balaji Balasubramaniam commented on HUDI-874:

I don't know why this is marked as resolved; I was clearly able to reproduce 
the issue on EMR 6.1.0.



[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-04-21 Thread Udit Mehrotra (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17327008#comment-17327008
 ] 

Udit Mehrotra commented on HUDI-874:


This has been fixed since the EMR 6.1.0 and EMR 5.32.0 releases.
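
To check a cluster against the releases named above, a small helper along 
these lines could be used. It is a sketch under stated assumptions: the 
minimum versions come only from this comment, the handling of major lines not 
mentioned in the thread is a guess, and other comments in this thread report a 
partition-related failure that still reproduces on EMR 6.1.0.

```python
import re

# Minimum releases named in this thread as carrying the fix, keyed by major line.
FIXED_SINCE = {5: (5, 32, 0), 6: (6, 1, 0)}


def parse_release_label(label):
    """Parse an EMR release label such as 'emr-6.1.0' into (6, 1, 0)."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", label.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized EMR release label: {label!r}")
    return tuple(int(part) for part in match.groups())


def has_glue_cascade_fix(label):
    """True if the release is at or after the fixed release for its major line."""
    version = parse_release_label(label)
    minimum = FIXED_SINCE.get(version[0])
    if minimum is None:
        # Assumption: major lines newer than those discussed include the fix,
        # and older ones do not. The thread does not state this.
        return version[0] > max(FIXED_SINCE)
    return version >= minimum


for label in ["emr-5.31.0", "emr-5.32.0", "emr-6.0.0", "emr-6.1.0"]:
    print(label, has_glue_cascade_fix(label))
```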



[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-04-02 Thread sivabalan narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17313840#comment-17313840
 ] 

sivabalan narayanan commented on HUDI-874:

[~uditme]: Is someone from AWS looking into this? 



[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-01-26 Thread sivabalan narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272368#comment-17272368
 ] 

sivabalan narayanan commented on HUDI-874:

[~uditme]: Can you please look into this ticket when you can? 



[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-11-20 Thread Balaji Balasubramaniam (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17236528#comment-17236528
 ] 

Balaji Balasubramaniam commented on HUDI-874:

[~uditme] [~vbalaji]

We are using AWS EMR 6.1.0 and I am able to reproduce the same issue as well. 
Any time a new partition is created, the sync fails with the following error.

 

org.apache.hudi.hive.HoodieHiveSyncException: Failed in executing SQL ALTER 
TABLE ``.`` REPLACE COLUMNS(`_hoodie_commit_time` string, 
`_hoodie_commit_seqno` string, `_hoodie_record_key` string, 
`_hoodie_partition_path` string, `_hoodie_file_name` string, `xx` string, 
`` int, `` int, `` string, `` bigint ) cascade
    at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:482)
    at org.apache.hudi.hive.HoodieHiveClient.updateTableDefinition(HoodieHiveClient.java:261)
    at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:164)
    at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:114)
    at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:87)
    at org.apache.hudi.HoodieSparkSqlWriter$.syncHive(HoodieSparkSqlWriter.scala:229)
    at org.apache.hudi.HoodieSparkSqlWriter$.checkWriteStatus(HoodieSparkSqlWriter.scala:279)
    at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:184)
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:108)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:124)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:123)
    at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:944)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:106)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:207)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:88)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:944)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:396)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:380)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:269)
    at $line39.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(:37)
    at $line39.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(:41)
    at $line39.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(:43)
    at $line39.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(:45)
    at $line39.$read$$iw$$iw$$iw$$iw$$iw$$iw.(:47)
    at $line39.$read$$iw$$iw$$iw$$iw$$iw.(:49)
    at $line39.$read$$iw$$iw$$iw$$iw.(:51)
    at $line39.$read$$iw$$iw$$iw.(:53)
    at $line39.$read$$iw$$iw.(:55)
    at $line39.$read$$iw.(:57)
    at $line39.$read.(:59)
    at $line39.$read$.(:63)
    at $line39.$read$.()
    at $line39.$eval$.$print$lzycompute(:7)
    at $line39.$eval$.$print(:6)
    at $line39.$eval.$print()
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:745)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1021)
    at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:574)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:41)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:37)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:600)
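
When triaging logs like the one above, it can help to pull the failing 
statement out of the exception message and check whether it is one of the 
ALTER TABLE ... cascade statements this issue is about. The following is a 
minimal sketch for that purpose. The "Failed in executing SQL" prefix is taken 
from the trace above; the helper names and the `db`.`tbl` identifiers in the 
sample message are hypothetical placeholders (the real names are redacted in 
the report).

```python
import re

# Prefix as it appears in the HoodieHiveSyncException message in this thread.
_SYNC_FAILURE = re.compile(r"Failed in executing SQL\s+(?P<sql>.+)", re.DOTALL)


def failed_sync_sql(message):
    """Extract the SQL statement from a Hive sync failure message, if present."""
    match = _SYNC_FAILURE.search(message)
    if match is None:
        return None
    # Collapse the line wrapping that mail clients and log viewers introduce.
    return " ".join(match.group("sql").split())


def uses_cascade(sql):
    """True if the statement is an ALTER TABLE that ends with the cascade keyword."""
    normalized = sql.strip().rstrip(";").rstrip().lower()
    return normalized.startswith("alter table") and normalized.endswith("cascade")


# Sample message with placeholder identifiers; wrapped the way the mail was.
message = (
    "org.apache.hudi.hive.HoodieHiveSyncException: Failed in executing SQL ALTER\n"
    "TABLE `db`.`tbl` REPLACE COLUMNS(`_hoodie_commit_time` string, `id` int ) cascade"
)
sql = failed_sync_sql(message)
print(sql)
print(uses_cascade(sql))
```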

[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-10-09 Thread Udit Mehrotra (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211437#comment-17211437
 ] 

Udit Mehrotra commented on HUDI-874:


This fix is already in the emr-6.1.0 release. However, it's not yet in the 
emr 5.x releases. You can expect it in the next emr 5.x release as well.



[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-07-22 Thread Udit Mehrotra (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163073#comment-17163073
 ] 

Udit Mehrotra commented on HUDI-874:


This has been fixed by the EMR folks, but the fix will only ship in upcoming 
EMR releases. It is not a change in Hudi but rather a change in EMR's 
integration with the Glue metastore, which is why it will be part of a future 
EMR release. Will update this Jira when we land this in a new release.



[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-07-22 Thread Balaji Varadarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162918#comment-17162918
 ] 

Balaji Varadarajan commented on HUDI-874:

This issue keeps coming up. New ticket: 
[https://github.com/apache/hudi/issues/1856]
