[GitHub] [hudi] codope commented on pull request #5674: [HUDI-4011] Add hudi-aws-bundle

2022-06-01 Thread GitBox


codope commented on PR #5674:
URL: https://github.com/apache/hudi/pull/5674#issuecomment-1143421652

   > > ## CI report:
   > >
   > > * [89136ed](https://github.com/apache/hudi/commit/89136edeeb79df92fdf60c5863e1eac92356f5f5) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9013)
   
   > Do we know why the Azure CI failed? When I go to the link, it seems like all the tests have passed, but CI still failed.
   
   This could be a CI reporting error. Anyway, I have rebased and triggered CI again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] codope commented on pull request #5674: [HUDI-4011] Add hudi-aws-bundle

2022-06-01 Thread GitBox


codope commented on PR #5674:
URL: https://github.com/apache/hudi/pull/5674#issuecomment-1143412485

   > LGTM. Better to have an e2e test on this bundle jar alone to validate the functionality.
   
   Running this bundle jar with run_sync_tool threw the error below:
   ```
   Exception in thread "main" java.lang.NoSuchMethodError: org.apache.parquet.avro.AvroSchemaConverter.convert(Lorg/apache/parquet/schema/MessageType;)Lorg/apache/avro/Schema;
       at org.apache.hudi.common.table.TableSchemaResolver.convertParquetSchemaToAvro(TableSchemaResolver.java:351)
       at org.apache.hudi.common.table.TableSchemaResolver.getTableAvroSchemaFromDataFile(TableSchemaResolver.java:158)
       at org.apache.hudi.common.table.TableSchemaResolver.hasOperationField(TableSchemaResolver.java:575)
       at org.apache.hudi.common.table.TableSchemaResolver.<init>(TableSchemaResolver.java:83)
       at org.apache.hudi.sync.common.AbstractSyncHoodieClient.getDataSchema(AbstractSyncHoodieClient.java:164)
       at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:196)
       at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:142)
       at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:130)
       at org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool.main(AwsGlueCatalogSyncTool.java:68)
   ```
   Please check the last commit, where I explicitly added the parquet-avro dependencies and shaded them from all other modules. After that, I was able to run run_sync_tool with this bundle.
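
A `NoSuchMethodError` like the one above generally means the class found at runtime does not expose the exact method signature the caller was compiled against. As a rough diagnostic sketch (the `ClasspathProbe` class and its method names are hypothetical helpers, not part of Hudi or Parquet), one could check which jar a class is loaded from and whether it has the expected method and return type:

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of Hudi or Parquet. */
public class ClasspathProbe {

    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "<bootstrap or unknown>";
        }
        return src.getLocation().toString();
    }

    /**
     * Checks whether className exposes a public method with the given name,
     * return type, and parameter types. All types are fully qualified class
     * names; primitives and arrays are not handled in this sketch.
     */
    public static boolean hasMethod(String className, String methodName,
                                    String returnTypeName, String... paramTypeNames) {
        try {
            Class<?> cls = Class.forName(className);
            Class<?>[] params = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < paramTypeNames.length; i++) {
                params[i] = Class.forName(paramTypeNames[i]);
            }
            Method m = cls.getMethod(methodName, params);
            // The JVM links on the full descriptor, so the return type must match too.
            return m.getReturnType().getName().equals(returnTypeName);
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and signature taken from the stack trace above.
        String className = "org.apache.parquet.avro.AvroSchemaConverter";
        try {
            System.out.println(className + " loaded from: " + locationOf(Class.forName(className)));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
        boolean ok = hasMethod(className, "convert",
                "org.apache.avro.Schema", "org.apache.parquet.schema.MessageType");
        System.out.println("convert(MessageType) returning org.apache.avro.Schema present: " + ok);
    }
}
```

Running this with the bundle jar on the classpath would show where `AvroSchemaConverter` comes from and whether its signature matches the one in the error. It is only a starting point for investigation and does not by itself explain why the mismatch occurred.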
   
   https://user-images.githubusercontent.com/16440354/171382549-f876b8b6-75d3-41a0-bae3-29269869c902.png
   
   Note: These dependencies are not really required if we run it together with 
hudi-utilities-slim-bundle and hudi-spark-bundle. 





[GitHub] [hudi] codope commented on pull request #5674: [HUDI-4011] Add hudi-aws-bundle

2022-05-31 Thread GitBox


codope commented on PR #5674:
URL: https://github.com/apache/hudi/pull/5674#issuecomment-1141761159

   Tested `AwsGlueCatalogSyncTool` with the new hudi-aws-bundle. It's working.
   
   https://user-images.githubusercontent.com/16440354/171115905-ec8f3b11-d973-4268-8228-7fac8ecb2c0f.png
   
   https://user-images.githubusercontent.com/16440354/171115945-10c2cdef-ab70-4b2c-ab1f-cad196a9c4c7.png
   

