lucasmo opened a new issue, #466:
URL: https://github.com/apache/incubator-xtable/issues/466

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/incubator-xtable/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Please describe the bug 🐞
   
   I’m trying to use XTable to convert a Hudi source table to a Delta target and I am 
receiving the exception below. The table is active and frequently updated, and it is 
being actively queried as a Hudi table.
   
   Is there any other debug information I can provide to make this more useful?
   
   Git head: 4a96627a
   OS: Linux/Ubuntu
   Java: 11
   Logging: modified log4j2.xml to set level=trace for org.apache.hudi and 
org.apache.xtable (see the sketch below)
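
   For reference, the logger overrides were roughly as follows. This is an assumed, 
minimal sketch of the modified log4j2.xml, not the exact file from this run; the 
appender setup in the bundled config may differ.

   ```
   <Configuration status="WARN">
     <Appenders>
       <Console name="Console" target="SYSTEM_OUT">
         <!-- Matches the timestamp/level/logger:line format visible in the run output -->
         <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} %-5p %c:%L - %m%n"/>
       </Console>
     </Appenders>
     <Loggers>
       <!-- Trace-level logging for the Hudi reader and XTable conversion paths -->
       <Logger name="org.apache.hudi" level="trace"/>
       <Logger name="org.apache.xtable" level="trace"/>
       <Root level="info">
         <AppenderRef ref="Console"/>
       </Root>
     </Loggers>
   </Configuration>
   ```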
   
   ## Run with stack trace:
   
   ```
   $ java -jar 
./xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar 
--datasetConfig config.yaml
   WARNING: Runtime environment or build system does not support multi-release 
JARs. This will impact location-based features.
   2024-06-05 23:22:05 INFO  org.apache.xtable.utilities.RunSync:148 - Running 
sync for basePath s3://hidden-s3-bucket/hidden-prefix/ for following table 
formats [DELTA]
   2024-06-05 23:22:05 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading 
HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix
   2024-06-05 23:22:05 WARN  org.apache.hadoop.util.NativeCodeLoader:60 - 
Unable to load native-hadoop library for your platform... using builtin-java 
classes where applicable
   2024-06-05 23:22:05 WARN  org.apache.hadoop.metrics2.impl.MetricsConfig:136 
- Cannot locate configuration: tried 
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
   2024-06-05 23:22:06 WARN  org.apache.hadoop.fs.s3a.SDKV2Upgrade:39 - 
Directly referencing AWS SDK V1 credential provider 
com.amazonaws.auth.DefaultAWSCredentialsProviderChain. AWS SDK V1 credential 
providers will be removed once S3A is upgraded to SDK V2
   2024-06-05 23:22:07 INFO  org.apache.hudi.common.table.HoodieTableConfig:276 
- Loading table properties from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/hoodie.properties
   2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table 
of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from 
s3://hidden-s3-bucket/hidden-prefix
   2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:155 - Loading Active commit 
timeline for s3://hidden-s3-bucket/hidden-prefix
   2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded 
instants upto : 
Option{val=[20240605231910580__clean__COMPLETED__20240605231918000]}
   2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading 
HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix
   2024-06-05 23:22:07 INFO  org.apache.hudi.common.table.HoodieTableConfig:276 
- Loading table properties from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/hoodie.properties
   2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table 
of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from 
s3://hidden-s3-bucket/hidden-prefix
   2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:133 - Loading 
HoodieTableMetaClient from s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata
   2024-06-05 23:22:07 INFO  org.apache.hudi.common.table.HoodieTableConfig:276 
- Loading table properties from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata/.hoodie/hoodie.properties
   2024-06-05 23:22:07 INFO  
org.apache.hudi.common.table.HoodieTableMetaClient:152 - Finished Loading Table 
of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata
   2024-06-05 23:22:08 INFO  
org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded 
instants upto : 
Option{val=[20240605231910580__deltacommit__COMPLETED__20240605231917000]}
   2024-06-05 23:22:08 INFO  
org.apache.hudi.common.table.view.AbstractTableFileSystemView:259 - Took 7 ms 
to read  0 instants, 0 replaced file groups
   WARNING: An illegal reflective access operation has occurred
   WARNING: Illegal reflective access by 
org.apache.hadoop.hbase.util.UnsafeAvailChecker 
(file:/incubator-xtable/xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar)
 to method java.nio.Bits.unaligned()
   WARNING: Please consider reporting this to the maintainers of 
org.apache.hadoop.hbase.util.UnsafeAvailChecker
   WARNING: Use --illegal-access=warn to enable warnings of further illegal 
reflective access operations
   WARNING: All illegal access operations will be denied in a future release
   2024-06-05 23:22:08 INFO  org.apache.hudi.common.util.ClusteringUtils:147 - 
Found 0 files in pending clustering operations
   2024-06-05 23:22:08 INFO  
org.apache.hudi.common.table.view.FileSystemViewManager:243 - Creating View 
Manager with storage type :MEMORY
   2024-06-05 23:22:08 INFO  
org.apache.hudi.common.table.view.FileSystemViewManager:255 - Creating 
in-memory based Table View
   2024-06-05 23:22:11 INFO  
org.apache.spark.sql.delta.storage.DelegatingLogStore:60 - LogStore 
`LogStoreAdapter(io.delta.storage.S3SingleDriverLogStore)` is used for scheme 
`s3`
   2024-06-05 23:22:11 INFO  org.apache.spark.sql.delta.DeltaLog:60 - Creating 
initial snapshot without metadata, because the directory is empty
   2024-06-05 23:22:13 INFO  org.apache.spark.sql.delta.InitialSnapshot:60 - 
[tableId=8eda3e8f-9dae-4d19-ac72-f625b8ccb0c5] Created snapshot 
InitialSnapshot(path=s3://hidden-s3-bucket/hidden-prefix/_delta_log, 
version=-1, 
metadata=Metadata(167f7b26-f82d-4765-97b9-b6e47d9147ec,null,null,Format(parquet,Map()),null,List(),Map(),Some(1717629733296)),
 
logSegment=LogSegment(s3://hidden-s3-bucket/hidden-prefix/_delta_log,-1,List(),None,-1),
 checksumOpt=None)
   2024-06-05 23:22:13 INFO  
org.apache.xtable.conversion.ConversionController:240 - No previous 
InternalTable sync for target. Falling back to snapshot sync.
   2024-06-05 23:22:13 INFO  
org.apache.hudi.common.table.TableSchemaResolver:317 - Reading schema from 
s3://hidden-s3-bucket/hidden-prefix/op_date=2024-06-05/3b5d27af-ef39-4862-bbd9-d4a010f6056e-0_0-71-375_20240605231837826.parquet
   2024-06-05 23:22:14 INFO  
org.apache.hudi.metadata.HoodieTableMetadataUtil:927 - Loading latest merged 
file slices for metadata table partition files
   2024-06-05 23:22:14 INFO  
org.apache.hudi.common.table.view.AbstractTableFileSystemView:259 - Took 1 ms 
to read  0 instants, 0 replaced file groups
   2024-06-05 23:22:14 INFO  org.apache.hudi.common.util.ClusteringUtils:147 - 
Found 0 files in pending clustering operations
   2024-06-05 23:22:14 INFO  
org.apache.hudi.common.table.view.AbstractTableFileSystemView:429 - Building 
file system view for partition (files)
   2024-06-05 23:22:14 DEBUG 
org.apache.hudi.common.table.view.AbstractTableFileSystemView:435 - #files 
found in partition (files) =30, Time taken =40
   2024-06-05 23:22:14 DEBUG 
org.apache.hudi.common.table.view.HoodieTableFileSystemView:386 - Adding 
file-groups for partition :files, #FileGroups=1
   2024-06-05 23:22:14 DEBUG 
org.apache.hudi.common.table.view.AbstractTableFileSystemView:165 - 
addFilesToView: NumFiles=30, NumFileGroups=1, FileGroupsCreationTime=15, 
StoreTimeTaken=1
   2024-06-05 23:22:14 DEBUG 
org.apache.hudi.common.table.view.AbstractTableFileSystemView:449 - Time to 
load partition (files) =57
   2024-06-05 23:22:14 INFO  
org.apache.hudi.metadata.HoodieBackedTableMetadata:451 - Opened metadata base 
file from 
s3://hidden-s3-bucket/hidden-prefix/.hoodie/metadata/files/files-0000-0_0-67-1304_20240605210834482001.hfile
 at instant 20240605210834482001 in 9 ms
   2024-06-05 23:22:14 INFO  
org.apache.hudi.common.table.timeline.HoodieActiveTimeline:171 - Loaded 
instants upto : 
Option{val=[20240605231910580__clean__COMPLETED__20240605231918000]}
   2024-06-05 23:22:14 ERROR org.apache.xtable.utilities.RunSync:171 - Error 
running sync for s3://hidden-s3-bucket/hidden-prefix/
   org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list 
of partition from metadata
       at 
org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:127)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.xtable.hudi.HudiDataFileExtractor.getFilesCurrentState(HudiDataFileExtractor.java:116)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.xtable.hudi.HudiConversionSource.getCurrentSnapshot(HudiConversionSource.java:97)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.xtable.spi.extractor.ExtractFromSource.extractSnapshot(ExtractFromSource.java:38)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.xtable.conversion.ConversionController.syncSnapshot(ConversionController.java:183)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.xtable.conversion.ConversionController.sync(ConversionController.java:121)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at org.apache.xtable.utilities.RunSync.main(RunSync.java:169) 
[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
   Caused by: java.lang.IllegalStateException: Recursive update
       at 
java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1739)
 ~[?:?]
       at org.apache.avro.util.MapUtil.computeIfAbsent(MapUtil.java:42) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at org.apache.avro.specific.SpecificData.getClass(SpecificData.java:257) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:508) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:355) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:186)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:248)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:180)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:161) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:154) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:263) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at org.apache.avro.file.DataFileStream.next(DataFileStream.java:248) 
~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeAvroMetadata(TimelineMetadataUtils.java:209)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.common.table.timeline.TimelineMetadataUtils.deserializeHoodieRollbackMetadata(TimelineMetadataUtils.java:177)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.HoodieTableMetadataUtil.getRollbackedCommits(HoodieTableMetadataUtil.java:1355)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.HoodieTableMetadataUtil.lambda$getValidInstantTimestamps$37(HoodieTableMetadataUtil.java:1284)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
       at 
java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
       at 
java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) 
~[?:?]
       at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) 
~[?:?]
       at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) 
~[?:?]
       at 
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) 
~[?:?]
       at 
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
 ~[?:?]
       at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) 
~[?:?]
       at 
java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
       at 
org.apache.hudi.metadata.HoodieTableMetadataUtil.getValidInstantTimestamps(HoodieTableMetadataUtil.java:1283)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:473)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getOrCreateReaders$10(HoodieBackedTableMetadata.java:412)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1705)
 ~[?:?]
       at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:412)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.lookupKeysFromFileSlice(HoodieBackedTableMetadata.java:291)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:255)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:145)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:316)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       at 
org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:125)
 ~[xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:0.1.0-SNAPSHOT]
       ... 6 more
   ```
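
   For context on the root-cause line: `java.lang.IllegalStateException: Recursive 
update` is the JDK's guard in `ConcurrentHashMap.computeIfAbsent` (and `compute`) 
against a mapping function that re-enters the same map while its bin is still reserved. 
Here it surfaces from Avro's `SpecificData` class cache (`MapUtil.computeIfAbsent`) 
while deserializing Hudi rollback metadata; the exact re-entrant path through Avro/Hudi 
would need confirmation. A minimal, Hudi-unrelated sketch of the JDK mechanism:

   ```
   import java.util.concurrent.ConcurrentHashMap;

   public class RecursiveUpdateDemo {
       public static void main(String[] args) {
           ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
           // The mapping function re-enters computeIfAbsent on the same map and key
           // while the outer call still holds the bin's ReservationNode.
           // On Java 9+ (including the Java 11 used above) this throws:
           //   java.lang.IllegalStateException: Recursive update
           cache.computeIfAbsent("schema", k ->
                   cache.computeIfAbsent("schema", k2 -> 2));
       }
   }
   ```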
   
   ## config.yaml:
   
   ```
   sourceFormat: HUDI
   targetFormats:
     - DELTA
   datasets:
     -
       tableBasePath: s3://hidden-s3-bucket/hidden-prefix
       tableName: hidden_table
       partitionSpec: op_date:VALUE
   ```
   
   ## hoodie.properties from the table:
   
   ```
   hoodie.table.timeline.timezone=LOCAL
   hoodie.table.keygenerator.class=org.apache.hudi.keygen.SimpleKeyGenerator
   hoodie.table.precombine.field=ts_millis
   hoodie.table.version=6
   hoodie.database.name=
   hoodie.datasource.write.hive_style_partitioning=true
   hoodie.table.metadata.partitions.inflight=
   hoodie.table.checksum=2622850774
   hoodie.partition.metafile.use.base.format=false
   hoodie.table.cdc.enabled=false
   hoodie.archivelog.folder=archived
   hoodie.table.name=hidden_table
   hoodie.populate.meta.fields=true
   hoodie.table.type=COPY_ON_WRITE
   hoodie.datasource.write.partitionpath.urlencode=false
   hoodie.table.base.file.format=PARQUET
   hoodie.datasource.write.drop.partition.columns=false
   hoodie.table.metadata.partitions=files
   hoodie.timeline.layout.version=1
   hoodie.table.recordkey.fields=record_id
   hoodie.table.partition.fields=op_date
   ```
   
   I submitted this to the dev@ mailing list and received no response, so I am filing 
it as an issue.
   
   ### Are you willing to submit PR?
   
   - [ ] I am willing to submit a PR!
   - [X] I am willing to submit a PR but need help getting started!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   

