codope opened a new issue, #7583: URL: https://github.com/apache/hudi/issues/7583
**Describe the problem you faced**

Original issue: https://github.com/trinodb/trino/issues/15368

> Our team is testing the same on COPY_ON_WRITE Hudi (0.10.1) tables with metadata enabled, using Trino 400, and we are facing the following error while reading from partitioned tables:
> `Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hudi.common.bootstrap.index.HFileBootstrapIndex`

The issue was resolved by placing some dependencies on the classpath. Interestingly, those dependencies are [already included in the trino-hudi-bundle](https://github.com/apache/hudi/blob/release-0.12.1/packaging/hudi-trino-bundle/pom.xml#L69-L98). This particular issue tracks any gaps in packaging.

**To Reproduce**

Steps to reproduce the behavior:

1. Write a Hudi COW table with the properties below and metadata enabled.
2. Query the same table using the trino-hudi connector (properties below) with `hudi.metadata-enabled=true`.

**Trino Hudi Connector Properties:**

```
connector.name=hudi
hive.metastore.uri={METASTORE_URI}
hive.s3.iam-role={S3_IAM_ROLE}
hive.metastore-refresh-interval=2m
hive.metastore-timeout=3m
hudi.max-outstanding-splits=1800
hive.s3.max-error-retries=50
hive.s3.connect-timeout=1m
hive.s3.socket-timeout=2m
hudi.parquet.use-column-names=true
hudi.metadata-enabled=true
```

**Hudi Properties set while writing:**

```
hoodie.datasource.write.partitionpath.field = "insert_ds_ist",
hoodie.datasource.write.recordkey.field = "id",
hoodie.datasource.write.precombine.field = "_hoodie_incremental_key",  (self-generated column)
hoodie.datasource.write.hive_style_partitioning = "true",
hoodie.datasource.hive_sync.auto_create_database = "true",
hoodie.parquet.compression.codec = "gzip",
hoodie.table.name = "<table_name>",
hoodie.datasource.write.keygenerator.class = "org.apache.hudi.keygen.SimpleKeyGenerator",
hoodie.datasource.write.table.type = "COPY_ON_WRITE",
hoodie.metadata.enable = "true",
hoodie.datasource.hive_sync.enable = "true",
hoodie.datasource.hive_sync.partition_fields = "insert_ds_ist",
hoodie.datasource.hive_sync.partition_extractor_class = "org.apache.hudi.hive.MultiPartKeysValueExtractor"
```

**General information of table:**

* Total rows = 1,213,959,199
* Total partitions = 2400+
* Total file objects = 120,000
* Total size on S3 = 12~13 GB
* The table was upgraded from 0.9.0 to 0.10.1

**Expected behavior**

The query should work out of the box, without having to place jars on the classpath.

**Environment Description**

* Hudi version : 0.10.1
* Spark version : 2.4
* Trino version : [400](https://github.com/trinodb/trino/tree/400)
* Hadoop version :
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no

**Stacktrace**

Full stacktrace in [Partitioned_COW_Hudi_Coordinator_logs.log](https://github.com/apache/hudi/files/10323254/Partitioned_COW_Hudi_Coordinator_logs.log)
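Since the report suspects a gap between what the trino-hudi-bundle pom declares and what actually ends up in the shaded jar, one way to narrow it down is to list the bundle jar's entries and check for the class files that `HFileBootstrapIndex` needs at initialization. The sketch below is illustrative only: the helper name and the required-class list are assumptions (actual shaded/relocated package paths would need to be confirmed against the bundle's relocation rules), not taken from the report.

```python
import zipfile

# Illustrative assumption: class files we expect the bundle to contain.
# The exact (possibly relocated) paths must be verified against the
# hudi-trino-bundle shading configuration.
REQUIRED_CLASSES = [
    "org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.class",
]

def missing_entries(jar_entries, required=REQUIRED_CLASSES):
    """Return the required class files absent from a jar's entry list."""
    present = set(jar_entries)
    return [cls for cls in required if cls not in present]

def check_bundle(jar_path):
    """Open a bundle jar and report any missing required classes."""
    with zipfile.ZipFile(jar_path) as jar:
        return missing_entries(jar.namelist())
```

If `check_bundle("hudi-trino-bundle-<version>.jar")` returns a non-empty list, the packaging gap is in the jar itself; if it returns an empty list yet the `NoClassDefFoundError` persists, the class is present but failing static initialization (e.g. a missing transitive dependency such as a relocated HBase class), which points at the shading rules rather than the dependency list.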