[ https://issues.apache.org/jira/browse/HUDI-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483610#comment-17483610 ]
Harsha Teja Kanna commented on HUDI-3335:
-----------------------------------------

Log

22/01/28 01:29:34 INFO Executor: Adding file:/private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-823349b0-aeeb-494d-bdc6-c276419a0fe1/userFiles-644e376c-59bb-4837-a421-590697992dc6/hudi-utilities-bundle_2.12-0.10.1.jar to class loader
22/01/28 01:29:34 INFO Executor: Fetching spark://192.168.86.5:49947/jars/org.spark-project.spark_unused-1.0.0.jar with timestamp 1643354959702
22/01/28 01:29:34 INFO Utils: Fetching spark://192.168.86.5:49947/jars/org.spark-project.spark_unused-1.0.0.jar to /private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-823349b0-aeeb-494d-bdc6-c276419a0fe1/userFiles-644e376c-59bb-4837-a421-590697992dc6/fetchFileTemp5819832321479921719.tmp
22/01/28 01:29:34 INFO Executor: Adding file:/private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-823349b0-aeeb-494d-bdc6-c276419a0fe1/userFiles-644e376c-59bb-4837-a421-590697992dc6/org.spark-project.spark_unused-1.0.0.jar to class loader
22/01/28 01:29:34 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 49956.
22/01/28 01:29:34 INFO NettyBlockTransferService: Server created on 192.168.86.5:49956
22/01/28 01:29:34 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/01/28 01:29:34 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.86.5, 49956, None)
22/01/28 01:29:34 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.86.5:49956 with 2004.6 MiB RAM, BlockManagerId(driver, 192.168.86.5, 49956, None)
22/01/28 01:29:34 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.86.5, 49956, None)
22/01/28 01:29:34 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.86.5, 49956, None)
22/01/28 01:29:35 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/Users/harshakanna/spark-warehouse/').
22/01/28 01:29:35 INFO SharedState: Warehouse path is 'file:/Users/harshakanna/spark-warehouse/'.
22/01/28 01:29:36 INFO DataSourceUtils: Getting table path..
22/01/28 01:29:36 INFO TablePathUtils: Getting table path from path : s3a://datalake-hudi/sessions
22/01/28 01:29:36 INFO DefaultSource: Obtained hudi table path: s3a://datalake-hudi/sessions
22/01/28 01:29:36 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://datalake-hudi/sessions
22/01/28 01:29:36 INFO HoodieTableConfig: Loading table properties from s3a://datalake-hudi/sessions/.hoodie/hoodie.properties
22/01/28 01:29:36 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3a://datalake-hudi/sessions
22/01/28 01:29:36 INFO DefaultSource: Is bootstrapped table => false, tableType is: COPY_ON_WRITE, queryType is: snapshot
22/01/28 01:29:36 INFO DefaultSource: Loading Base File Only View with options :Map(hoodie.datasource.query.type -> snapshot, hoodie.metadata.enable -> true, path -> s3a://datalake-hudi/sessions/)
22/01/28 01:29:37 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://datalake-hudi/sessions
22/01/28 01:29:37 INFO HoodieTableConfig: Loading table properties from s3a://datalake-hudi/sessions/.hoodie/hoodie.properties
22/01/28 01:29:37 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3a://datalake-hudi/sessions
22/01/28 01:29:37 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:37 INFO HoodieTableConfig: Loading table properties from s3a://datalake-hudi/sessions/.hoodie/metadata/.hoodie/hoodie.properties
22/01/28 01:29:37 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:37 INFO HoodieTableMetadataUtil: Loading latest merged file slices for metadata table partition files
22/01/28 01:29:38 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20220126024720121__deltacommit__COMPLETED]}
22/01/28 01:29:38 INFO AbstractTableFileSystemView: Took 2 ms to read 0 instants, 0 replaced file groups
22/01/28 01:29:38 INFO ClusteringUtils: Found 0 files in pending clustering operations
22/01/28 01:29:38 INFO AbstractTableFileSystemView: Building file system view for partition (files)
22/01/28 01:29:38 INFO AbstractTableFileSystemView: addFilesToView: NumFiles=9, NumFileGroups=1, FileGroupsCreationTime=11, StoreTimeTaken=0
22/01/28 01:29:38 INFO CacheConfig: Allocating LruBlockCache size=1.42 GB, blockSize=64 KB
22/01/28 01:29:38 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:39 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:39 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:39 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:39 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:39 INFO HoodieBackedTableMetadata: Opened metadata base file from s3a://datalake-hudi/sessions/.hoodie/metadata/files/files-0000_0-17-118_20220126022154561001.hfile at instant 20220126022154561001 in 1427 ms
22/01/28 01:29:39 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20220126024720121__commit__COMPLETED]}
22/01/28 01:29:39 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:40 INFO HoodieTableConfig: Loading table properties from s3a://datalake-hudi/sessions/.hoodie/metadata/.hoodie/hoodie.properties
22/01/28 01:29:40 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:40 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20220126024720121__deltacommit__COMPLETED]}
22/01/28 01:29:40 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.1_0-27-187', fileLen=0}
22/01/28 01:29:40 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:40 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.1_0-27-187 at instant 20220126022523011
22/01/28 01:29:41 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180', fileLen=0}
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180', fileLen=0}
22/01/28 01:29:41 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180 at instant 20220126023054200
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:41 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO ExternalSpillableMap: Estimated Payload size => 600
22/01/28 01:29:41 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184', fileLen=0}
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184', fileLen=0}
22/01/28 01:29:41 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184 at instant 20220126023530250
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:41 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381', fileLen=0}
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381', fileLen=0}
22/01/28 01:29:41 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381 at instant 20220126023637109
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:41 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168', fileLen=0}
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168', fileLen=0}
22/01/28 01:29:41 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168 at instant 20220126024028688
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:41 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247', fileLen=0}
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247', fileLen=0}
22/01/28 01:29:42 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247 at instant 20220126024137627
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:42 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204', fileLen=0}
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204', fileLen=0}
22/01/28 01:29:42 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204 at instant 20220126024720121
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:42 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Merging the final data blocks
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:42 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Number of log files scanned => 7
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: MaxMemoryInBytes allowed for compaction => 1073741824
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Number of entries in MemoryBasedMap in ExternalSpillableMap => 3
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Total size in bytes of MemoryBasedMap in ExternalSpillableMap => 1800
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Number of entries in BitCaskDiskMap in ExternalSpillableMap => 0
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Size of file spilled to disk => 0
22/01/28 01:29:42 INFO HoodieBackedTableMetadata: Opened 7 metadata log files (dataset instant=20220126024720121, metadata instant=20220126024720121) in 3273 ms
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO HoodieBackedTableMetadata: Metadata read for 1 keys took [baseFileRead, logMerge] [0, 129] ms
22/01/28 01:29:42 INFO BaseTableMetadata: Listed partitions from metadata: #partitions=389
22/01/28 01:29:42 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20220126024720121__commit__COMPLETED]}
22/01/28 01:29:43 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://datalake-hudi/sessions
22/01/28 01:29:43 INFO HoodieTableConfig: Loading table properties from s3a://datalake-hudi/sessions/.hoodie/hoodie.properties
22/01/28 01:29:43 INFO HoodieTableMetaClient: Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3a://datalake-hudi/sessions
22/01/28 01:29:43 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:43 INFO HoodieTableConfig: Loading table properties from s3a://datalake-hudi/sessions/.hoodie/metadata/.hoodie/hoodie.properties
22/01/28 01:29:43 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:43 INFO HoodieTableMetadataUtil: Loading latest merged file slices for metadata table partition files
22/01/28 01:29:44 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20220126024720121__deltacommit__COMPLETED]}
22/01/28 01:29:44 INFO AbstractTableFileSystemView: Took 0 ms to read 0 instants, 0 replaced file groups
22/01/28 01:29:44 INFO ClusteringUtils: Found 0 files in pending clustering operations
22/01/28 01:29:44 INFO AbstractTableFileSystemView: Building file system view for partition (files)
22/01/28 01:29:44 INFO AbstractTableFileSystemView: addFilesToView: NumFiles=9, NumFileGroups=1, FileGroupsCreationTime=1, StoreTimeTaken=0
22/01/28 01:29:44 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:44 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:44 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:44 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:44 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:44 INFO HoodieBackedTableMetadata: Opened metadata base file from s3a://datalake-hudi/sessions/.hoodie/metadata/files/files-0000_0-17-118_20220126022154561001.hfile at instant 20220126022154561001 in 1041 ms
22/01/28 01:29:45 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20220126024720121__commit__COMPLETED]}
22/01/28 01:29:45 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:45 INFO HoodieTableConfig: Loading table properties from s3a://datalake-hudi/sessions/.hoodie/metadata/.hoodie/hoodie.properties
22/01/28 01:29:45 INFO HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:45 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[20220126024720121__deltacommit__COMPLETED]}
22/01/28 01:29:45 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.1_0-27-187', fileLen=0}
22/01/28 01:29:45 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:45 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.1_0-27-187 at instant 20220126022523011
22/01/28 01:29:46 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180', fileLen=0}
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180', fileLen=0}
22/01/28 01:29:46 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180 at instant 20220126023054200
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:46 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO ExternalSpillableMap: Estimated Payload size => 600
22/01/28 01:29:46 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184', fileLen=0}
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184', fileLen=0}
22/01/28 01:29:46 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184 at instant 20220126023530250
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:46 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381', fileLen=0}
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381', fileLen=0}
22/01/28 01:29:46 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381 at instant 20220126023637109
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:46 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168', fileLen=0}
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168', fileLen=0}
22/01/28 01:29:46 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168 at instant 20220126024028688
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:47 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247', fileLen=0}
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247', fileLen=0}
22/01/28 01:29:47 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247 at instant 20220126024137627
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:47 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO HoodieLogFormatReader: Moving to the next reader for logfile HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204', fileLen=0}
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Scanning log file HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204', fileLen=0}
22/01/28 01:29:47 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Reading a data block from file s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204 at instant 20220126024720121
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:47 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Merging the final data blocks
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Number of remaining logblocks to merge 1
22/01/28 01:29:47 INFO CacheConfig: Created cacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=1567280, freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Number of log files scanned => 7
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: MaxMemoryInBytes allowed for compaction => 1073741824
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Number of entries in MemoryBasedMap in ExternalSpillableMap => 3
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Total size in bytes of MemoryBasedMap in ExternalSpillableMap => 1800
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Number of entries in BitCaskDiskMap in ExternalSpillableMap => 0
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Size of file spilled to disk => 0
22/01/28 01:29:47 INFO HoodieBackedTableMetadata: Opened 7 metadata log files (dataset instant=20220126024720121, metadata instant=20220126024720121) in 2814 ms
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO HoodieBackedTableMetadata: Metadata read for 389 keys took [baseFileRead, logMerge] [1, 122] ms
22/01/28 01:29:47 INFO BaseTableMetadata: Listed files in partitions from metadata: partition list =[s3a://datalake-hudi/sessions/date=2021/07/20, s3a://datalake-hudi/sessions/date=2021/04/29, s3a://datalake-hudi/sessions/date=2021/02/02, s3a://datalake-hudi/sessions/date=2021/09/30, s3a://datalake-hudi/sessions/date=2021/12/30, s3a://datalake-hudi/sessions/date=2021/03/10, s3a://datalake-hudi/sessions/date=2021/08/31, s3a://datalake-hudi/sessions/date=2021/01/08, s3a://datalake-hudi/sessions/date=2021/12/19, s3a://datalake-hudi/sessions/date=2021/04/26, s3a://datalake-hudi/sessions/date=2021/08/06, s3a://datalake-hudi/sessions/date=2021/05/23, s3a://datalake-hudi/sessions/date=2021/04/09, s3a://datalake-hudi/sessions/date=2021/10/12, s3a://datalake-hudi/sessions/date=2021/06/12, s3a://datalake-hudi/sessions/date=2021/05/20, s3a://datalake-hudi/sessions/date=2021/03/04, s3a://datalake-hudi/sessions/date=2021/08/20, s3a://datalake-hudi/sessions/date=2021/02/13, s3a://datalake-hudi/sessions/date=2021/08/02, s3a://datalake-hudi/sessions/date=2021/05/05, s3a://datalake-hudi/sessions/date=2021/03/03, s3a://datalake-hudi/sessions/date=2021/06/05, s3a://datalake-hudi/sessions/date=2021/10/23, s3a://datalake-hudi/sessions/date=2021/09/29, s3a://datalake-hudi/sessions/date=2021/04/14, s3a://datalake-hudi/sessions/date=2021/02/18, s3a://datalake-hudi/sessions/date=2021/03/21, s3a://datalake-hudi/sessions/date=2021/07/24, s3a://datalake-hudi/sessions/date=2021/04/30, s3a://datalake-hudi/sessions/date=2021/01/21, s3a://datalake-hudi/sessions/date=2021/10/29, s3a://datalake-hudi/sessions/date=2021/10/01, s3a://datalake-hudi/sessions/date=2021/06/23, s3a://datalake-hudi/sessions/date=2021/12/27, s3a://datalake-hudi/sessions/date=2021/04/08, s3a://datalake-hudi/sessions/date=2021/03/14, s3a://datalake-hudi/sessions/date=2021/03/26, s3a://datalake-hudi/sessions/date=2021/12/04, s3a://datalake-hudi/sessions/date=2021/05/27, s3a://datalake-hudi/sessions/date=2021/11/27, s3a://datalake-hudi/sessions/date=2021/05/12, s3a://datalake-hudi/sessions/date=2021/03/19, s3a://datalake-hudi/sessions/date=2021/11/07, s3a://datalake-hudi/sessions/date=2021/03/08, s3a://datalake-hudi/sessions/date=2021/12/16, s3a://datalake-hudi/sessions/date=2021/11/18, s3a://datalake-hudi/sessions/date=2021/10/18, s3a://datalake-hudi/sessions/date=2021/05/31, s3a://datalake-hudi/sessions/date=2021/10/07, s3a://datalake-hudi/sessions/date=2022/01/04, s3a://datalake-hudi/sessions/date=2021/09/10, s3a://datalake-hudi/sessions/date=2021/06/01, s3a://datalake-hudi/sessions/date=2021/06/27, s3a://datalake-hudi/sessions/date=2021/07/01, s3a://datalake-hudi/sessions/date=2021/06/30, s3a://datalake-hudi/sessions/date=2021/08/17,
s3a://datalake-hudi/sessions/date=2021/04/17, s3a://datalake-hudi/sessions/date=2022/01/07, s3a://datalake-hudi/sessions/date=2021/05/26, s3a://datalake-hudi/sessions/date=2021/03/18, s3a://datalake-hudi/sessions/date=2021/09/20, s3a://datalake-hudi/sessions/date=2021/11/26, s3a://datalake-hudi/sessions/date=2021/12/11, s3a://datalake-hudi/sessions/date=2021/04/01, s3a://datalake-hudi/sessions/date=2021/03/22, s3a://datalake-hudi/sessions/date=2021/05/15, s3a://datalake-hudi/sessions/date=2021/08/28, s3a://datalake-hudi/sessions/date=2021/01/11, s3a://datalake-hudi/sessions/date=2021/11/03, s3a://datalake-hudi/sessions/date=2022/01/18, s3a://datalake-hudi/sessions/date=2021/11/14, s3a://datalake-hudi/sessions/date=2021/02/10, s3a://datalake-hudi/sessions/date=2021/12/09, s3a://datalake-hudi/sessions/date=2021/03/29, s3a://datalake-hudi/sessions/date=2021/06/16, s3a://datalake-hudi/sessions/date=2021/02/06, s3a://datalake-hudi/sessions/date=2021/11/22, s3a://datalake-hudi/sessions/date=2021/09/07, s3a://datalake-hudi/sessions/date=2021/09/24, s3a://datalake-hudi/sessions/date=2021/03/31, s3a://datalake-hudi/sessions/date=2021/01/19, s3a://datalake-hudi/sessions/date=2021/08/13, s3a://datalake-hudi/sessions/date=2022/01/12, s3a://datalake-hudi/sessions/date=2021/12/22, s3a://datalake-hudi/sessions/date=2021/08/24, s3a://datalake-hudi/sessions/date=2021/07/12, s3a://datalake-hudi/sessions/date=2021/11/30, s3a://datalake-hudi/sessions/date=2021/02/21, s3a://datalake-hudi/sessions/date=2021/09/18, s3a://datalake-hudi/sessions/date=2021/03/11, s3a://datalake-hudi/sessions/date=2022/01/01, s3a://datalake-hudi/sessions/date=2021/02/17, s3a://datalake-hudi/sessions/date=2021/07/31, s3a://datalake-hudi/sessions/date=2021/01/29, s3a://datalake-hudi/sessions/date=2021/02/28, s3a://datalake-hudi/sessions/date=2021/03/07, s3a://datalake-hudi/sessions/date=2021/06/09, s3a://datalake-hudi/sessions/date=2021/11/10, s3a://datalake-hudi/sessions/date=2021/02/07, 
s3a://datalake-hudi/sessions/date=2021/09/28, s3a://datalake-hudi/sessions/date=2021/03/02, s3a://datalake-hudi/sessions/date=2021/01/25, s3a://datalake-hudi/sessions/date=2021/02/19, s3a://datalake-hudi/sessions/date=2021/11/05, s3a://datalake-hudi/sessions/date=2021/06/13, s3a://datalake-hudi/sessions/date=2021/02/01, s3a://datalake-hudi/sessions/date=2021/07/25, s3a://datalake-hudi/sessions/date=2021/01/27, s3a://datalake-hudi/sessions/date=2021/02/12, s3a://datalake-hudi/sessions/date=2021/06/24, s3a://datalake-hudi/sessions/date=2021/01/14, s3a://datalake-hudi/sessions/date=2021/03/30, s3a://datalake-hudi/sessions/date=2021/01/03, s3a://datalake-hudi/sessions/date=2021/02/08, s3a://datalake-hudi/sessions/date=2021/04/23, s3a://datalake-hudi/sessions/date=2021/09/11, s3a://datalake-hudi/sessions/date=2021/09/03, s3a://datalake-hudi/sessions/date=2021/12/13, s3a://datalake-hudi/sessions/date=2021/09/14, s3a://datalake-hudi/sessions/date=2021/10/02, s3a://datalake-hudi/sessions/date=2021/12/24, s3a://datalake-hudi/sessions/date=2021/01/09, s3a://datalake-hudi/sessions/date=2021/12/18, s3a://datalake-hudi/sessions/date=2021/10/17, s3a://datalake-hudi/sessions/date=2021/11/09, s3a://datalake-hudi/sessions/date=2021/06/04, s3a://datalake-hudi/sessions/date=2021/04/13, s3a://datalake-hudi/sessions/date=2021/03/09, s3a://datalake-hudi/sessions/date=2022/01/23, s3a://datalake-hudi/sessions/date=2021/11/08, s3a://datalake-hudi/sessions/date=2021/11/16, s3a://datalake-hudi/sessions/date=2022/01/20, s3a://datalake-hudi/sessions/date=2021/01/20, s3a://datalake-hudi/sessions/date=2021/08/21, s3a://datalake-hudi/sessions/date=2021/10/28, s3a://datalake-hudi/sessions/date=2021/08/03, s3a://datalake-hudi/sessions/date=2021/03/13, s3a://datalake-hudi/sessions/date=2021/07/03, s3a://datalake-hudi/sessions/date=2021/10/06, s3a://datalake-hudi/sessions/date=2021/07/09, s3a://datalake-hudi/sessions/date=2021/06/02, s3a://datalake-hudi/sessions/date=2021/11/28, 
s3a://datalake-hudi/sessions/date=2021/05/01, s3a://datalake-hudi/sessions/date=2021/12/29, s3a://datalake-hudi/sessions/date=2021/01/31, s3a://datalake-hudi/sessions/date=2021/01/05, s3a://datalake-hudi/sessions/date=2021/04/10, s3a://datalake-hudi/sessions/date=2021/11/19, s3a://datalake-hudi/sessions/date=2021/04/07, s3a://datalake-hudi/sessions/date=2021/02/23, s3a://datalake-hudi/sessions/date=2021/07/14, s3a://datalake-hudi/sessions/date=2021/05/17, s3a://datalake-hudi/sessions/date=2021/03/25, s3a://datalake-hudi/sessions/date=2021/08/14, s3a://datalake-hudi/sessions/date=2021/04/04, s3a://datalake-hudi/sessions/date=2021/01/16, s3a://datalake-hudi/sessions/date=2021/06/08, s3a://datalake-hudi/sessions/date=2022/01/17, s3a://datalake-hudi/sessions/date=2021/07/28, s3a://datalake-hudi/sessions/date=2021/10/24, s3a://datalake-hudi/sessions/date=2021/10/13, s3a://datalake-hudi/sessions/date=2021/07/13, s3a://datalake-hudi/sessions/date=2021/05/16, s3a://datalake-hudi/sessions/date=2022/01/06, s3a://datalake-hudi/sessions/date=2021/05/02, s3a://datalake-hudi/sessions/date=2021/03/17, s3a://datalake-hudi/sessions/date=2021/02/27, s3a://datalake-hudi/sessions/date=2021/12/12, s3a://datalake-hudi/sessions/date=2021/05/28, s3a://datalake-hudi/sessions/date=2021/05/30, s3a://datalake-hudi/sessions/date=2021/06/15, s3a://datalake-hudi/sessions/date=2021/08/25, s3a://datalake-hudi/sessions/date=2021/02/24, s3a://datalake-hudi/sessions/date=2021/08/10, s3a://datalake-hudi/sessions/date=2021/12/06, s3a://datalake-hudi/sessions/date=2021/09/21, s3a://datalake-hudi/sessions/date=2021/07/06, s3a://datalake-hudi/sessions/date=2021/11/25, s3a://datalake-hudi/sessions/date=2021/09/06, s3a://datalake-hudi/sessions/date=2021/12/03, s3a://datalake-hudi/sessions/date=2021/10/20, s3a://datalake-hudi/sessions/date=2022/01/13, s3a://datalake-hudi/sessions/date=2021/11/13, s3a://datalake-hudi/sessions/date=2021/03/06, s3a://datalake-hudi/sessions/date=2021/06/26, 
s3a://datalake-hudi/sessions/date=2021/07/17, s3a://datalake-hudi/sessions/date=2021/07/21, s3a://datalake-hudi/sessions/date=2021/05/06, s3a://datalake-hudi/sessions/date=2021/06/20, s3a://datalake-hudi/sessions/date=2021/01/10, s3a://datalake-hudi/sessions/date=2021/02/05, s3a://datalake-hudi/sessions/date=2021/02/16, s3a://datalake-hudi/sessions/date=2021/04/22, s3a://datalake-hudi/sessions/date=2022/01/02, s3a://datalake-hudi/sessions/date=2021/08/09, s3a://datalake-hudi/sessions/date=2021/04/27, s3a://datalake-hudi/sessions/date=2021/06/19, s3a://datalake-hudi/sessions/date=2021/09/17, s3a://datalake-hudi/sessions/date=2021/03/28, s3a://datalake-hudi/sessions/date=2021/07/02, s3a://datalake-hudi/sessions/date=2021/12/23, s3a://datalake-hudi/sessions/date=2021/11/21, s3a://datalake-hudi/sessions/date=2021/09/25, s3a://datalake-hudi/sessions/date=2021/11/02, s3a://datalake-hudi/sessions/date=2021/02/20, s3a://datalake-hudi/sessions/date=2021/11/17, s3a://datalake-hudi/sessions/date=2021/12/14, s3a://datalake-hudi/sessions/date=2022/01/21, s3a://datalake-hudi/sessions/date=2021/10/31, s3a://datalake-hudi/sessions/date=2021/08/27, s3a://datalake-hudi/sessions/date=2021/10/05, s3a://datalake-hudi/sessions/date=2021/01/02, s3a://datalake-hudi/sessions/date=2022/01/10, s3a://datalake-hudi/sessions/date=2021/01/24, s3a://datalake-hudi/sessions/date=2021/09/01, s3a://datalake-hudi/sessions/date=2021/11/12, s3a://datalake-hudi/sessions/date=2021/04/16, s3a://datalake-hudi/sessions/date=2021/06/18, s3a://datalake-hudi/sessions/date=2021/06/21, s3a://datalake-hudi/sessions/date=2021/07/22, s3a://datalake-hudi/sessions/date=2021/11/24, s3a://datalake-hudi/sessions/date=2021/04/11, s3a://datalake-hudi/sessions/date=2021/07/11, s3a://datalake-hudi/sessions/date=2021/12/25, s3a://datalake-hudi/sessions/date=2021/08/08, s3a://datalake-hudi/sessions/date=2021/12/01, s3a://datalake-hudi/sessions/date=2021/12/31, s3a://datalake-hudi/sessions/date=2021/07/10, 
s3a://datalake-hudi/sessions/date=2021/11/06, s3a://datalake-hudi/sessions/date=2021/06/10, s3a://datalake-hudi/sessions/date=2021/09/04, s3a://datalake-hudi/sessions/date=2021/01/04, s3a://datalake-hudi/sessions/date=2021/01/26, s3a://datalake-hudi/sessions/date=2021/08/04, s3a://datalake-hudi/sessions/date=2021/01/13, s3a://datalake-hudi/sessions/date=2021/05/22, s3a://datalake-hudi/sessions/date=2021/05/03, s3a://datalake-hudi/sessions/date=2021/10/16, s3a://datalake-hudi/sessions/date=2021/08/16, s3a://datalake-hudi/sessions/date=2022/01/24, s3a://datalake-hudi/sessions/date=2021/01/06, s3a://datalake-hudi/sessions/date=2021/09/09, s3a://datalake-hudi/sessions/date=2021/10/27, s3a://datalake-hudi/sessions/date=2021/05/11, s3a://datalake-hudi/sessions/date=2021/09/15, s3a://datalake-hudi/sessions/date=2022/01/09, s3a://datalake-hudi/sessions/date=2021/12/17, s3a://datalake-hudi/sessions/date=2021/12/28, s3a://datalake-hudi/sessions/date=2021/12/07, s3a://datalake-hudi/sessions/date=2021/05/07, s3a://datalake-hudi/sessions/date=2021/08/22, s3a://datalake-hudi/sessions/date=2021/02/26, s3a://datalake-hudi/sessions/date=2021/04/21, s3a://datalake-hudi/sessions/date=2021/07/16, s3a://datalake-hudi/sessions/date=2021/02/09, s3a://datalake-hudi/sessions/date=2021/01/15, s3a://datalake-hudi/sessions/date=2021/10/25, s3a://datalake-hudi/sessions/date=2022/01/16, s3a://datalake-hudi/sessions/date=2021/08/11, s3a://datalake-hudi/sessions/date=2021/07/19, s3a://datalake-hudi/sessions/date=2021/08/05, s3a://datalake-hudi/sessions/date=2021/04/19, s3a://datalake-hudi/sessions/date=2022/01/05, s3a://datalake-hudi/sessions/date=2021/09/26, s3a://datalake-hudi/sessions/date=2021/03/24, s3a://datalake-hudi/sessions/date=2021/05/18, s3a://datalake-hudi/sessions/date=2021/07/08, s3a://datalake-hudi/sessions/date=2021/08/15, s3a://datalake-hudi/sessions/date=2021/04/03, s3a://datalake-hudi/sessions/date=2021/05/29, s3a://datalake-hudi/sessions/date=2021/06/29, 
s3a://datalake-hudi/sessions/date=2021/04/24, s3a://datalake-hudi/sessions/date=2021/10/14, s3a://datalake-hudi/sessions/date=2021/07/05, s3a://datalake-hudi/sessions/date=2021/02/15, s3a://datalake-hudi/sessions/date=2021/11/01, s3a://datalake-hudi/sessions/date=2021/05/13, s3a://datalake-hudi/sessions/date=2021/10/09, s3a://datalake-hudi/sessions/date=2021/10/21, s3a://datalake-hudi/sessions/date=2021/06/07, s3a://datalake-hudi/sessions/date=2021/04/12, s3a://datalake-hudi/sessions/date=2021/02/04, s3a://datalake-hudi/sessions/date=2021/12/20, s3a://datalake-hudi/sessions/date=2021/05/24, s3a://datalake-hudi/sessions/date=2021/08/19, s3a://datalake-hudi/sessions/date=2021/04/06, s3a://datalake-hudi/sessions/date=2021/09/22, s3a://datalake-hudi/sessions/date=2021/12/02, s3a://datalake-hudi/sessions/date=2021/07/27, s3a://datalake-hudi/sessions/date=2021/08/26, s3a://datalake-hudi/sessions/date=2021/10/10, s3a://datalake-hudi/sessions/date=2021/08/07, s3a://datalake-hudi/sessions/date=2021/03/05, s3a://datalake-hudi/sessions/date=2021/11/23, s3a://datalake-hudi/sessions/date=2021/09/16, s3a://datalake-hudi/sessions/date=2021/06/06, s3a://datalake-hudi/sessions/date=2021/10/11, s3a://datalake-hudi/sessions/date=2021/07/23, s3a://datalake-hudi/sessions/date=2021/05/21, s3a://datalake-hudi/sessions/date=2022/01/25, s3a://datalake-hudi/sessions/date=2021/10/03, s3a://datalake-hudi/sessions/date=2021/01/22, s3a://datalake-hudi/sessions/date=2021/06/22, s3a://datalake-hudi/sessions/date=2022/01/22, s3a://datalake-hudi/sessions/date=2021/10/26, s3a://datalake-hudi/sessions/date=2021/10/30, s3a://datalake-hudi/sessions/date=2021/05/04, s3a://datalake-hudi/sessions/date=2021/05/10, s3a://datalake-hudi/sessions/date=2022/01/11, s3a://datalake-hudi/sessions/date=2021/08/01, s3a://datalake-hudi/sessions/date=2021/07/07, s3a://datalake-hudi/sessions/date=2022/01/03, s3a://datalake-hudi/sessions/date=2021/06/28, s3a://datalake-hudi/sessions/date=2021/09/05, 
s3a://datalake-hudi/sessions/date=2021/12/15, s3a://datalake-hudi/sessions/date=2021/01/07, s3a://datalake-hudi/sessions/date=2021/10/15, s3a://datalake-hudi/sessions/date=2021/03/27, s3a://datalake-hudi/sessions/date=2021/12/05, s3a://datalake-hudi/sessions/date=2021/03/16, s3a://datalake-hudi/sessions/date=2021/03/20, s3a://datalake-hudi/sessions/date=2021/04/28, s3a://datalake-hudi/sessions/date=2022/01/14, s3a://datalake-hudi/sessions/date=2021/06/11, s3a://datalake-hudi/sessions/date=2021/01/17, s3a://datalake-hudi/sessions/date=2021/11/29, s3a://datalake-hudi/sessions/date=2021/08/23, s3a://datalake-hudi/sessions/date=2021/10/08, s3a://datalake-hudi/sessions/date=2021/07/18, s3a://datalake-hudi/sessions/date=2021/02/25, s3a://datalake-hudi/sessions/date=2021/02/11, s3a://datalake-hudi/sessions/date=2021/08/12, s3a://datalake-hudi/sessions/date=2021/12/08, s3a://datalake-hudi/sessions/date=2021/09/19, s3a://datalake-hudi/sessions/date=2021/07/04, s3a://datalake-hudi/sessions/date=2021/06/03, s3a://datalake-hudi/sessions/date=2021/09/08, s3a://datalake-hudi/sessions/date=2021/11/04, s3a://datalake-hudi/sessions/date=2021/12/26, s3a://datalake-hudi/sessions/date=2021/01/30, s3a://datalake-hudi/sessions/date=2021/09/27, s3a://datalake-hudi/sessions/date=2021/08/30, s3a://datalake-hudi/sessions/date=2021/01/18, s3a://datalake-hudi/sessions/date=2021/11/15, s3a://datalake-hudi/sessions/date=2022/01/15, s3a://datalake-hudi/sessions/date=2021/04/02, s3a://datalake-hudi/sessions/date=2021/10/19, s3a://datalake-hudi/sessions/date=2021/03/23, s3a://datalake-hudi/sessions/date=2021/07/29, s3a://datalake-hudi/sessions/date=2021/02/03, s3a://datalake-hudi/sessions/date=2021/03/12, s3a://datalake-hudi/sessions/date=2021/01/12, s3a://datalake-hudi/sessions/date=2022/01/08, s3a://datalake-hudi/sessions/date=2021/03/15, s3a://datalake-hudi/sessions/date=2021/12/10, s3a://datalake-hudi/sessions/date=2021/10/04, s3a://datalake-hudi/sessions/date=2021/04/25, 
s3a://datalake-hudi/sessions/date=2021/09/13, s3a://datalake-hudi/sessions/date=2021/02/22, s3a://datalake-hudi/sessions/date=2021/06/17, s3a://datalake-hudi/sessions/date=2021/03/01, s3a://datalake-hudi/sessions/date=2021/12/21, s3a://datalake-hudi/sessions/date=2021/04/05, s3a://datalake-hudi/sessions/date=2021/09/12, s3a://datalake-hudi/sessions/date=2021/09/23, s3a://datalake-hudi/sessions/date=2021/05/19, s3a://datalake-hudi/sessions/date=2021/07/30, s3a://datalake-hudi/sessions/date=2021/07/15, s3a://datalake-hudi/sessions/date=2021/01/28, s3a://datalake-hudi/sessions/date=2021/09/02, s3a://datalake-hudi/sessions/date=2022/01/19, s3a://datalake-hudi/sessions/date=2021/10/22, s3a://datalake-hudi/sessions/date=2021/11/11, s3a://datalake-hudi/sessions/date=2021/05/09, s3a://datalake-hudi/sessions/date=2021/01/23, s3a://datalake-hudi/sessions/date=2021/07/26, s3a://datalake-hudi/sessions/date=2021/06/25, s3a://datalake-hudi/sessions/date=2021/04/15, s3a://datalake-hudi/sessions/date=2021/05/14, s3a://datalake-hudi/sessions/date=2021/08/29, s3a://datalake-hudi/sessions/date=2021/05/08, s3a://datalake-hudi/sessions/date=2021/11/20, s3a://datalake-hudi/sessions/date=2021/08/18, s3a://datalake-hudi/sessions/date=2021/05/25, s3a://datalake-hudi/sessions/date=2021/02/14, s3a://datalake-hudi/sessions/date=2021/06/14, s3a://datalake-hudi/sessions/date=2021/04/18, s3a://datalake-hudi/sessions/date=2021/04/20]
22/01/28 01:29:47 INFO SparkUI: Stopped Spark web UI at http://192.168.86.5:4040
22/01/28 01:29:47 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/01/28 01:29:47 INFO MemoryStore: MemoryStore cleared
22/01/28 01:29:47 INFO BlockManager: BlockManager stopped
22/01/28 01:29:48 INFO BlockManagerMaster: BlockManagerMaster stopped
22/01/28 01:29:48 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/01/28 01:29:48 INFO SparkContext: Successfully stopped SparkContext Exception in thread "main" java.lang.NullPointerException at org.sparkproject.guava.base.Preconditions.checkNotNull(Preconditions.java:191) at org.sparkproject.guava.cache.LocalCache.put(LocalCache.java:4210) at org.sparkproject.guava.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804) at org.apache.spark.sql.execution.datasources.SharedInMemoryCache$$anon$3.putLeafFiles(FileStatusCache.scala:161) at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4(HoodieFileIndex.scala:631) at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4$adapted(HoodieFileIndex.scala:629) at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468) at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468) at org.apache.hudi.HoodieFileIndex.loadPartitionPathFiles(HoodieFileIndex.scala:629) at org.apache.hudi.HoodieFileIndex.refresh0(HoodieFileIndex.scala:387) at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:184) at org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:199) at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:119) at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355) at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325) at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:307) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:307) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:239) at com.h7kanna.data.reports.MetadataTestETL$.main(MetadataTestETL.scala:30) at com.h7kanna.data.reports.MetadataTestETL.main(MetadataTestETL.scala) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 22/01/28 01:29:48 INFO ShutdownHookManager: Shutdown hook called 22/01/28 01:29:48 INFO ShutdownHookManager: Deleting directory /private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-7f2e050f-240e-40dd-b433-a6ebe08232ae 22/01/28 01:29:48 INFO ShutdownHookManager: Deleting directory /private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-823349b0-aeeb-494d-bdc6-c276419a0fe1 22/01/28 01:29:48 INFO MetricsSystemImpl: Stopping s3a-file-system metrics system... 22/01/28 01:29:48 INFO MetricsSystemImpl: s3a-file-system metrics system stopped. 22/01/28 01:29:48 INFO MetricsSystemImpl: s3a-file-system metrics system shutdown complete. > Loading Hudi table fails with NullPointerException > -------------------------------------------------- > > Key: HUDI-3335 > URL: https://issues.apache.org/jira/browse/HUDI-3335 > Project: Apache Hudi > Issue Type: Bug > Affects Versions: 0.10.1 > Reporter: Harsha Teja Kanna > Priority: Blocker > Fix For: 0.11.0 > > > Have a COW table with metadata enabled. 
Loading from Spark query fails with > java.lang.NullPointerException > *Environment* > Spark 3.1.2 > Hudi 0.10.1 > *Query* > import org.apache.hudi.DataSourceReadOptions > import org.apache.hudi.common.config.HoodieMetadataConfig > val basePath = "s3a://datalake-hudi/v1" > val df = spark. > read. > format("org.apache.hudi"). > option(HoodieMetadataConfig.ENABLE.key(), "true"). > option(DataSourceReadOptions.QUERY_TYPE.key(), > DataSourceReadOptions.QUERY_TYPE_SNAPSHOT_OPT_VAL). > load(s"${basePath}/sessions/") > df.createOrReplaceTempView(table) > *Passing an individual partition works though* > val df = spark. > read. > format("org.apache.hudi"). > option(HoodieMetadataConfig.ENABLE.key(), "true"). > option(DataSourceReadOptions.QUERY_TYPE.key(), > DataSourceReadOptions.QUERY_TYPE_SNAPSHOT_OPT_VAL). > load(s"${basePath}/sessions/date=2022/01/25") > df.createOrReplaceTempView(table) > *Also, disabling metadata works, but the query taking very long time* > val df = spark. > read. > format("org.apache.hudi"). > option(DataSourceReadOptions.QUERY_TYPE.key(), > DataSourceReadOptions.QUERY_TYPE_SNAPSHOT_OPT_VAL). 
> load(s"${basePath}/sessions/") > df.createOrReplaceTempView(table) > *Loading files with stacktrace:* > at > org.sparkproject.guava.base.Preconditions.checkNotNull(Preconditions.java:191) > at org.sparkproject.guava.cache.LocalCache.put(LocalCache.java:4210) > at > org.sparkproject.guava.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804) > at > org.apache.spark.sql.execution.datasources.SharedInMemoryCache$$anon$3.putLeafFiles(FileStatusCache.scala:161) > at > org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4(HoodieFileIndex.scala:631) > at > org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4$adapted(HoodieFileIndex.scala:629) > at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234) > at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468) > at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468) > at > org.apache.hudi.HoodieFileIndex.loadPartitionPathFiles(HoodieFileIndex.scala:629) > at org.apache.hudi.HoodieFileIndex.refresh0(HoodieFileIndex.scala:387) > at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:184) > at > org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:199) > at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:119) > at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69) > at > org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355) > at > org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325) > at > org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:307) > at scala.Option.getOrElse(Option.scala:189) > at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:307) > at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:239) > at $anonfun$res3$1(<console>:46) > at $anonfun$res3$1$adapted(<console>:40) > at scala.collection.Iterator.foreach(Iterator.scala:941) > at 
scala.collection.Iterator.foreach$(Iterator.scala:941) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at scala.collection.AbstractIterable.foreach(Iterable.scala:56) > *Writer config* > ** > spark-submit \ > --master yarn \ > --deploy-mode cluster \ > --driver-cores 4 \ > --driver-memory 4g \ > --executor-cores 4 \ > --executor-memory 6g \ > --num-executors 8 \ > --jars > s3://datalake/jars/unused-1.0.0.jar,s3://datalake/jars/spark-avro_2.12-3.1.2.jar > \ > --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \ > --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \ > --conf spark.sql.sources.parallelPartitionDiscovery.parallelism=25000 \ > s3://datalake/jars/hudi-0.10.1/hudi-utilities-bundle_2.12-0.10.1.jar \ > --table-type COPY_ON_WRITE \ > --source-ordering-field timestamp \ > --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \ > --target-base-path s3a://datalake-hudi/sessions \ > --target-table sessions \ > --transformer-class > org.apache.hudi.utilities.transform.SqlQueryBasedTransformer \ > --op INSERT \ > --hoodie-conf hoodie.clean.automatic=true \ > --hoodie-conf hoodie.cleaner.commits.retained=10 \ > --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_COMMITS \ > --hoodie-conf hoodie.clustering.inline=true \ > --hoodie-conf hoodie.clustering.inline.max.commits=5 \ > --hoodie-conf > hoodie.clustering.plan.strategy.class=org.apache.hudi.client.clustering.plan.strategy.SparkRecentDaysClusteringPlanStrategy > \ > --hoodie-conf hoodie.clustering.plan.strategy.max.num.groups=1000 \ > --hoodie-conf hoodie.clustering.plan.strategy.small.file.limit=268435456 \ > --hoodie-conf > hoodie.clustering.plan.strategy.sort.columns=survey_dbid,session_dbid \ > --hoodie-conf hoodie.clustering.plan.strategy.target.file.max.bytes=536870912 > \ > --hoodie-conf 
hoodie.clustering.preserve.commit.metadata=true \ > --hoodie-conf hoodie.datasource.hive_sync.database=datalake-hudi \ > --hoodie-conf hoodie.datasource.hive_sync.enable=false \ > --hoodie-conf hoodie.datasource.hive_sync.ignore_exceptions=true \ > --hoodie-conf hoodie.datasource.hive_sync.mode=hms \ > --hoodie-conf > hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.HiveStylePartitionValueExtractor > \ > --hoodie-conf hoodie.datasource.hive_sync.partition_fields=date \ > --hoodie-conf hoodie.datasource.hive_sync.table=sessions \ > --hoodie-conf hoodie.datasource.hive_sync.use_jdbc=false \ > --hoodie-conf hoodie.datasource.write.hive_style_partitioning=true \ > --hoodie-conf > hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator > \ > --hoodie-conf hoodie.datasource.write.operation=insert \ > --hoodie-conf hoodie.datasource.write.partitionpath.field=date:TIMESTAMP \ > --hoodie-conf hoodie.datasource.write.precombine.field=timestamp \ > --hoodie-conf > hoodie.datasource.write.recordkey.field=session_dbid,question_id,answer \ > --hoodie-conf > hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy/MM/dd \ > --hoodie-conf hoodie.deltastreamer.keygen.timebased.input.timezone=GMT \ > --hoodie-conf > hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd \ > --hoodie-conf hoodie.deltastreamer.keygen.timebased.output.timezone=GMT \ > --hoodie-conf > hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING \ > --hoodie-conf > hoodie.deltastreamer.source.dfs.root=s3://datalake-hudi/raw/parquet/data/sessions/year=2022/month=01/day=26/hour=02 > \ > --hoodie-conf > hoodie.deltastreamer.source.input.selector=org.apache.hudi.utilities.sources.helpers.DFSPathSelector > \ > --hoodie-conf "\"hoodie.deltastreamer.transformer.sql=SELECT question_id, > answer, to_timestamp(timestamp) as timestamp, session_dbid, survey_dbid, > date_format(to_timestamp(timestamp), 'yyyy/MM/dd') AS date FROM <SRC> a \"" \ > 
--hoodie-conf hoodie.file.listing.parallelism=256 \ > --hoodie-conf hoodie.finalize.write.parallelism=256 \ > --hoodie-conf > hoodie.generate.consistent.timestamp.logical.for.key.generator=true \ > --hoodie-conf hoodie.insert.shuffle.parallelism=1000 \ > --hoodie-conf hoodie.metadata.enable=true \ > --hoodie-conf hoodie.metadata.metrics.enable=true \ > --hoodie-conf > hoodie.metrics.cloudwatch.metric.prefix=emr.datalake.prd.insert.sessions \ > --hoodie-conf hoodie.metrics.on=false \ > --hoodie-conf hoodie.metrics.reporter.type=CLOUDWATCH \ > --hoodie-conf hoodie.parquet.block.size=536870912 \ > --hoodie-conf hoodie.parquet.compression.codec=snappy \ > --hoodie-conf hoodie.parquet.max.file.size=536870912 \ > --hoodie-conf hoodie.parquet.small.file.limit=268435456 > > *Metadata Commits (.hoodie/metadata/.hoodie)* > ** > 20220125154001455002.clean > 20220125154001455002.clean.inflight > 20220125154001455002.clean.requested > 20220125160751769002.clean > 20220125160751769002.clean.inflight > 20220125160751769002.clean.requested > 20220125163020781002.clean > 20220125163020781002.clean.inflight > 20220125163020781002.clean.requested > 20220125165722170002.clean > 20220125165722170002.clean.inflight > 20220125165722170002.clean.requested > 20220125172016239002.clean > 20220125172016239002.clean.inflight > 20220125172016239002.clean.requested > 20220125174427654002.clean > 20220125174427654002.clean.inflight > 20220125174427654002.clean.requested > 20220125181218237002.clean > 20220125181218237002.clean.inflight > 20220125181218237002.clean.requested > 20220125184343588002.clean > 20220125184343588002.clean.inflight > 20220125184343588002.clean.requested > 20220125191038318002.clean > 20220125191038318002.clean.inflight > 20220125191038318002.clean.requested > 20220125193445223002.clean > 20220125193445223002.clean.inflight > 20220125193445223002.clean.requested > 20220125200741168002.clean > 20220125200741168002.clean.inflight > 20220125200741168002.clean.requested > 
20220125203814934002.clean
20220125203814934002.clean.inflight
20220125203814934002.clean.requested
20220125211447323002.clean
20220125211447323002.clean.inflight
20220125211447323002.clean.requested
20220125214421740002.clean
20220125214421740002.clean.inflight
20220125214421740002.clean.requested
20220125221009798002.clean
20220125221009798002.clean.inflight
20220125221009798002.clean.requested
20220125224319264002.clean
20220125224319264002.clean.inflight
20220125224319264002.clean.requested
20220125231128580002.clean
20220125231128580002.clean.inflight
20220125231128580002.clean.requested
20220125234345790002.clean
20220125234345790002.clean.inflight
20220125234345790002.clean.requested
20220126001130415002.clean
20220126001130415002.clean.inflight
20220126001130415002.clean.requested
20220126004341130002.clean
20220126004341130002.clean.inflight
20220126004341130002.clean.requested
20220126011114529002.clean
20220126011114529002.clean.inflight
20220126011114529002.clean.requested
20220126013648751002.clean
20220126013648751002.clean.inflight
20220126013648751002.clean.requested
20220126013859643.deltacommit
20220126013859643.deltacommit.inflight
20220126013859643.deltacommit.requested
20220126014254294.deltacommit
20220126014254294.deltacommit.inflight
20220126014254294.deltacommit.requested
20220126014516195.deltacommit
20220126014516195.deltacommit.inflight
20220126014516195.deltacommit.requested
20220126014711043.deltacommit
20220126014711043.deltacommit.inflight
20220126014711043.deltacommit.requested
20220126014808898.deltacommit
20220126014808898.deltacommit.inflight
20220126014808898.deltacommit.requested
20220126015008443.deltacommit
20220126015008443.deltacommit.inflight
20220126015008443.deltacommit.requested
20220126015119193.deltacommit
20220126015119193.deltacommit.inflight
20220126015119193.deltacommit.requested
20220126015119193001.commit
20220126015119193001.compaction.inflight
20220126015119193001.compaction.requested
20220126015653770.deltacommit
20220126015653770.deltacommit.inflight
20220126015653770.deltacommit.requested
20220126020011172.deltacommit
20220126020011172.deltacommit.inflight
20220126020011172.deltacommit.requested
20220126020405299.deltacommit
20220126020405299.deltacommit.inflight
20220126020405299.deltacommit.requested
20220126020405299002.clean
20220126020405299002.clean.inflight
20220126020405299002.clean.requested
20220126020813841.deltacommit
20220126020813841.deltacommit.inflight
20220126020813841.deltacommit.requested
20220126021002748.deltacommit
20220126021002748.deltacommit.inflight
20220126021002748.deltacommit.requested
20220126021231085.deltacommit
20220126021231085.deltacommit.inflight
20220126021231085.deltacommit.requested
20220126021429124.deltacommit
20220126021429124.deltacommit.inflight
20220126021429124.deltacommit.requested
20220126021445188.deltacommit
20220126021445188.deltacommit.inflight
20220126021445188.deltacommit.requested
20220126021949824.deltacommit
20220126021949824.deltacommit.inflight
20220126021949824.deltacommit.requested
20220126022154561.deltacommit
20220126022154561.deltacommit.inflight
20220126022154561.deltacommit.requested
20220126022154561001.commit
20220126022154561001.compaction.inflight
20220126022154561001.compaction.requested
20220126022523011.deltacommit
20220126022523011.deltacommit.inflight
20220126022523011.deltacommit.requested
20220126023054200.deltacommit
20220126023054200.deltacommit.inflight
20220126023054200.deltacommit.requested
20220126023530250.deltacommit
20220126023530250.deltacommit.inflight
20220126023530250.deltacommit.requested
20220126023530250002.clean
20220126023530250002.clean.inflight
20220126023530250002.clean.requested
20220126023637109.deltacommit
20220126023637109.deltacommit.inflight
20220126023637109.deltacommit.requested
20220126024028688.deltacommit
20220126024028688.deltacommit.inflight
20220126024028688.deltacommit.requested
20220126024137627.deltacommit
20220126024137627.deltacommit.inflight
20220126024137627.deltacommit.requested
20220126024720121.deltacommit
20220126024720121.deltacommit.inflight
20220126024720121.deltacommit.requested

*Commits (.hoodie)*

20220125224502471.clean
20220125224502471.clean.inflight
20220125224502471.clean.requested
20220125225810828.clean
20220125225810828.clean.inflight
20220125225810828.clean.requested
20220125230125674.clean
20220125230125674.clean.inflight
20220125230125674.clean.requested
20220125230854957.clean
20220125230854957.clean.inflight
20220125230854957.clean.requested
20220125232236767.clean
20220125232236767.clean.inflight
20220125232236767.clean.requested
20220125232638588.clean
20220125232638588.clean.inflight
20220125232638588.clean.requested
20220125233355290.clean
20220125233355290.clean.inflight
20220125233355290.clean.requested
20220125234539672.clean
20220125234539672.clean.inflight
20220125234539672.clean.requested
20220125234944271.clean
20220125234944271.clean.inflight
20220125234944271.clean.requested
20220125235718218.clean
20220125235718218.clean.inflight
20220125235718218.clean.requested
20220126000225375.clean
20220126000225375.clean.inflight
20220126000225375.clean.requested
20220126000937875.clean
20220126000937875.clean.inflight
20220126000937875.clean.requested
20220126003307449.clean
20220126003307449.clean.inflight
20220126003307449.clean.requested
20220126003617137.clean
20220126003617137.clean.inflight
20220126003617137.clean.requested
20220126004518227.clean
20220126004518227.clean.inflight
20220126004518227.clean.requested
20220126005806798.clean
20220126005806798.clean.inflight
20220126005806798.clean.requested
20220126010011407.commit
20220126010011407.commit.requested
20220126010011407.inflight
20220126010227320.clean
20220126010227320.clean.inflight
20220126010227320.clean.requested
20220126010242754.replacecommit
20220126010242754.replacecommit.inflight
20220126010242754.replacecommit.requested
20220126010800207.commit
20220126010800207.commit.requested
20220126010800207.inflight
20220126010920192.clean
20220126010920192.clean.inflight
20220126010920192.clean.requested
20220126011114529.commit
20220126011114529.commit.requested
20220126011114529.inflight
20220126011230532.clean
20220126011230532.clean.inflight
20220126011230532.clean.requested
20220126011426028.commit
20220126011426028.commit.requested
20220126011426028.inflight
20220126011818299.commit
20220126011818299.commit.requested
20220126011818299.inflight
20220126012003045.clean
20220126012003045.clean.inflight
20220126012003045.clean.requested
20220126012240288.commit
20220126012240288.commit.requested
20220126012240288.inflight
20220126012443455.clean
20220126012443455.clean.inflight
20220126012443455.clean.requested
20220126012508460.replacecommit
20220126012508460.replacecommit.inflight
20220126012508460.replacecommit.requested
20220126013218816.commit
20220126013218816.commit.requested
20220126013218816.inflight
20220126013428875.clean
20220126013428875.clean.inflight
20220126013428875.clean.requested
20220126013648751.commit
20220126013648751.commit.requested
20220126013648751.inflight
20220126013859643.clean
20220126013859643.clean.inflight
20220126013859643.clean.requested
20220126014254294.commit
20220126014254294.commit.requested
20220126014254294.inflight
20220126014516195.clean
20220126014516195.clean.inflight
20220126014516195.clean.requested
20220126014711043.commit
20220126014711043.commit.requested
20220126014711043.inflight
20220126014808898.clean
20220126014808898.clean.inflight
20220126014808898.clean.requested
20220126015008443.commit
20220126015008443.commit.requested
20220126015008443.inflight
20220126015119193.replacecommit
20220126015119193.replacecommit.inflight
20220126015119193.replacecommit.requested
20220126015653770.commit
20220126015653770.commit.requested
20220126015653770.inflight
20220126020011172.commit
20220126020011172.commit.requested
20220126020011172.inflight
20220126020405299.commit
20220126020405299.commit.requested
20220126020405299.inflight
20220126020813841.commit
20220126020813841.commit.requested
20220126020813841.inflight
20220126021002748.clean
20220126021002748.clean.inflight
20220126021002748.clean.requested
20220126021231085.commit
20220126021231085.commit.requested
20220126021231085.inflight
20220126021429124.clean
20220126021429124.clean.inflight
20220126021429124.clean.requested
20220126021445188.replacecommit
20220126021445188.replacecommit.inflight
20220126021445188.replacecommit.requested
20220126021949824.commit
20220126021949824.commit.requested
20220126021949824.inflight
20220126022154561.clean
20220126022154561.clean.inflight
20220126022154561.clean.requested
20220126022523011.commit
20220126022523011.commit.requested
20220126022523011.inflight
20220126023054200.commit
20220126023054200.commit.requested
20220126023054200.inflight
20220126023530250.commit
20220126023530250.commit.requested
20220126023530250.inflight
20220126023637109.clean
20220126023637109.clean.inflight
20220126023637109.clean.requested
20220126024028688.commit
20220126024028688.commit.requested
20220126024028688.inflight
20220126024137627.replacecommit
20220126024137627.replacecommit.inflight
20220126024137627.replacecommit.requested
20220126024720121.commit
20220126024720121.commit.requested
20220126024720121.inflight

-- This message was sent by Atlassian Jira (v8.20.1#820001)
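For anyone triaging listings like the ones above: the instant filenames follow the pattern visible in the listing itself, `<timestamp>[.<action>][.requested|.inflight]`, where a file with no state suffix marks the completed instant (and a bare `<timestamp>.inflight` belongs to a `commit`). The sketch below is my own illustrative helper, not Hudi code; `parse_instant` and `pending` are hypothetical names, and the sample filenames in the usage are taken from the listing.

```python
# Illustrative sketch for reading a Hudi timeline listing, assuming the
# naming pattern shown above: "<timestamp>[.<action>][.requested|.inflight]".
# parse_instant / pending are hypothetical helpers, not part of Hudi.
STATES = {"requested", "inflight"}

def parse_instant(filename):
    """Split an instant filename into (timestamp, action, state)."""
    parts = filename.split(".")
    timestamp, rest = parts[0], parts[1:]
    state = "completed"
    if rest and rest[-1] in STATES:
        state = rest.pop()
    # A bare "<timestamp>.inflight" (no action part) is an inflight commit.
    action = rest[0] if rest else "commit"
    return timestamp, action, state

def pending(filenames):
    """Timestamps that have requested/inflight files but no completed file."""
    done, seen = set(), set()
    for name in filenames:
        ts, _action, state = parse_instant(name)
        seen.add(ts)
        if state == "completed":
            done.add(ts)
    return sorted(seen - done)

# Example using filenames from the .hoodie listing above: the second
# timestamp never reached a completed state, so it is flagged as pending.
print(pending([
    "20220126024720121.commit",
    "20220126024720121.commit.requested",
    "20220126024720121.inflight",
    "20220126024137627.replacecommit.requested",
]))
```

Applying this to the two listings makes it easy to compare the data-table timeline against the metadata-table timeline and spot instants that were requested on one side but never completed.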