[ https://issues.apache.org/jira/browse/HUDI-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483610#comment-17483610 ]

Harsha Teja Kanna commented on HUDI-3335:
-----------------------------------------

Log



22/01/28 01:29:34 INFO Executor: Adding 
file:/private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-823349b0-aeeb-494d-bdc6-c276419a0fe1/userFiles-644e376c-59bb-4837-a421-590697992dc6/hudi-utilities-bundle_2.12-0.10.1.jar
 to class loader
22/01/28 01:29:34 INFO Executor: Fetching 
spark://192.168.86.5:49947/jars/org.spark-project.spark_unused-1.0.0.jar with 
timestamp 1643354959702
22/01/28 01:29:34 INFO Utils: Fetching 
spark://192.168.86.5:49947/jars/org.spark-project.spark_unused-1.0.0.jar to 
/private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-823349b0-aeeb-494d-bdc6-c276419a0fe1/userFiles-644e376c-59bb-4837-a421-590697992dc6/fetchFileTemp5819832321479921719.tmp
22/01/28 01:29:34 INFO Executor: Adding 
file:/private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-823349b0-aeeb-494d-bdc6-c276419a0fe1/userFiles-644e376c-59bb-4837-a421-590697992dc6/org.spark-project.spark_unused-1.0.0.jar
 to class loader
22/01/28 01:29:34 INFO Utils: Successfully started service 
'org.apache.spark.network.netty.NettyBlockTransferService' on port 49956.
22/01/28 01:29:34 INFO NettyBlockTransferService: Server created on 
192.168.86.5:49956
22/01/28 01:29:34 INFO BlockManager: Using 
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication 
policy
22/01/28 01:29:34 INFO BlockManagerMaster: Registering BlockManager 
BlockManagerId(driver, 192.168.86.5, 49956, None)
22/01/28 01:29:34 INFO BlockManagerMasterEndpoint: Registering block manager 
192.168.86.5:49956 with 2004.6 MiB RAM, BlockManagerId(driver, 192.168.86.5, 
49956, None)
22/01/28 01:29:34 INFO BlockManagerMaster: Registered BlockManager 
BlockManagerId(driver, 192.168.86.5, 49956, None)
22/01/28 01:29:34 INFO BlockManager: Initialized BlockManager: 
BlockManagerId(driver, 192.168.86.5, 49956, None)
22/01/28 01:29:35 INFO SharedState: Setting hive.metastore.warehouse.dir 
('null') to the value of spark.sql.warehouse.dir 
('file:/Users/harshakanna/spark-warehouse/').
22/01/28 01:29:35 INFO SharedState: Warehouse path is 
'file:/Users/harshakanna/spark-warehouse/'.
22/01/28 01:29:36 INFO DataSourceUtils: Getting table path..
22/01/28 01:29:36 INFO TablePathUtils: Getting table path from path : 
s3a://datalake-hudi/sessions
22/01/28 01:29:36 INFO DefaultSource: Obtained hudi table path: 
s3a://datalake-hudi/sessions
22/01/28 01:29:36 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient 
from s3a://datalake-hudi/sessions
22/01/28 01:29:36 INFO HoodieTableConfig: Loading table properties from 
s3a://datalake-hudi/sessions/.hoodie/hoodie.properties
22/01/28 01:29:36 INFO HoodieTableMetaClient: Finished Loading Table of type 
COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from 
s3a://datalake-hudi/sessions
22/01/28 01:29:36 INFO DefaultSource: Is bootstrapped table => false, tableType 
is: COPY_ON_WRITE, queryType is: snapshot
22/01/28 01:29:36 INFO DefaultSource: Loading Base File Only View with options 
:Map(hoodie.datasource.query.type -> snapshot, hoodie.metadata.enable -> true, 
path -> s3a://datalake-hudi/sessions/)
22/01/28 01:29:37 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient 
from s3a://datalake-hudi/sessions
22/01/28 01:29:37 INFO HoodieTableConfig: Loading table properties from 
s3a://datalake-hudi/sessions/.hoodie/hoodie.properties
22/01/28 01:29:37 INFO HoodieTableMetaClient: Finished Loading Table of type 
COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from 
s3a://datalake-hudi/sessions
22/01/28 01:29:37 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient 
from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:37 INFO HoodieTableConfig: Loading table properties from 
s3a://datalake-hudi/sessions/.hoodie/metadata/.hoodie/hoodie.properties
22/01/28 01:29:37 INFO HoodieTableMetaClient: Finished Loading Table of type 
MERGE_ON_READ(version=1, baseFileFormat=HFILE) from 
s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:37 INFO HoodieTableMetadataUtil: Loading latest merged file 
slices for metadata table partition files
22/01/28 01:29:38 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20220126024720121__deltacommit__COMPLETED]}
22/01/28 01:29:38 INFO AbstractTableFileSystemView: Took 2 ms to read 0 
instants, 0 replaced file groups
22/01/28 01:29:38 INFO ClusteringUtils: Found 0 files in pending clustering 
operations
22/01/28 01:29:38 INFO AbstractTableFileSystemView: Building file system view 
for partition (files)
22/01/28 01:29:38 INFO AbstractTableFileSystemView: addFilesToView: NumFiles=9, 
NumFileGroups=1, FileGroupsCreationTime=11, StoreTimeTaken=0
22/01/28 01:29:38 INFO CacheConfig: Allocating LruBlockCache size=1.42 GB, 
blockSize=64 KB
22/01/28 01:29:38 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:39 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:39 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:39 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:39 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:39 INFO HoodieBackedTableMetadata: Opened metadata base file 
from 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/files-0000_0-17-118_20220126022154561001.hfile
 at instant 20220126022154561001 in 1427 ms
22/01/28 01:29:39 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20220126024720121__commit__COMPLETED]}
22/01/28 01:29:39 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient 
from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:40 INFO HoodieTableConfig: Loading table properties from 
s3a://datalake-hudi/sessions/.hoodie/metadata/.hoodie/hoodie.properties
22/01/28 01:29:40 INFO HoodieTableMetaClient: Finished Loading Table of type 
MERGE_ON_READ(version=1, baseFileFormat=HFILE) from 
s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:40 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20220126024720121__deltacommit__COMPLETED]}
22/01/28 01:29:40 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.1_0-27-187',
 fileLen=0}
22/01/28 01:29:40 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:40 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.1_0-27-187
 at instant 20220126022523011
22/01/28 01:29:41 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180',
 fileLen=0}
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180',
 fileLen=0}
22/01/28 01:29:41 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180
 at instant 20220126023054200
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:41 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO ExternalSpillableMap: Estimated Payload size => 600
22/01/28 01:29:41 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184',
 fileLen=0}
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184',
 fileLen=0}
22/01/28 01:29:41 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184
 at instant 20220126023530250
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:41 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381',
 fileLen=0}
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381',
 fileLen=0}
22/01/28 01:29:41 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381
 at instant 20220126023637109
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:41 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168',
 fileLen=0}
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168',
 fileLen=0}
22/01/28 01:29:41 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168
 at instant 20220126024028688
22/01/28 01:29:41 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:41 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:41 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247',
 fileLen=0}
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247',
 fileLen=0}
22/01/28 01:29:42 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247
 at instant 20220126024137627
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:42 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204',
 fileLen=0}
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204',
 fileLen=0}
22/01/28 01:29:42 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204
 at instant 20220126024720121
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:42 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Merging the final data 
blocks
22/01/28 01:29:42 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:42 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Number of log files 
scanned => 7
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: MaxMemoryInBytes allowed 
for compaction => 1073741824
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Number of entries in 
MemoryBasedMap in ExternalSpillableMap => 3
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Total size in bytes of 
MemoryBasedMap in ExternalSpillableMap => 1800
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Number of entries in 
BitCaskDiskMap in ExternalSpillableMap => 0
22/01/28 01:29:42 INFO HoodieMergedLogRecordScanner: Size of file spilled to 
disk => 0
22/01/28 01:29:42 INFO HoodieBackedTableMetadata: Opened 7 metadata log files 
(dataset instant=20220126024720121, metadata instant=20220126024720121) in 3273 
ms
22/01/28 01:29:42 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:42 INFO HoodieBackedTableMetadata: Metadata read for 1 keys took 
[baseFileRead, logMerge] [0, 129] ms
22/01/28 01:29:42 INFO BaseTableMetadata: Listed partitions from metadata: 
#partitions=389
22/01/28 01:29:42 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20220126024720121__commit__COMPLETED]}
22/01/28 01:29:43 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient 
from s3a://datalake-hudi/sessions
22/01/28 01:29:43 INFO HoodieTableConfig: Loading table properties from 
s3a://datalake-hudi/sessions/.hoodie/hoodie.properties
22/01/28 01:29:43 INFO HoodieTableMetaClient: Finished Loading Table of type 
COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from 
s3a://datalake-hudi/sessions
22/01/28 01:29:43 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient 
from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:43 INFO HoodieTableConfig: Loading table properties from 
s3a://datalake-hudi/sessions/.hoodie/metadata/.hoodie/hoodie.properties
22/01/28 01:29:43 INFO HoodieTableMetaClient: Finished Loading Table of type 
MERGE_ON_READ(version=1, baseFileFormat=HFILE) from 
s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:43 INFO HoodieTableMetadataUtil: Loading latest merged file 
slices for metadata table partition files
22/01/28 01:29:44 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20220126024720121__deltacommit__COMPLETED]}
22/01/28 01:29:44 INFO AbstractTableFileSystemView: Took 0 ms to read 0 
instants, 0 replaced file groups
22/01/28 01:29:44 INFO ClusteringUtils: Found 0 files in pending clustering 
operations
22/01/28 01:29:44 INFO AbstractTableFileSystemView: Building file system view 
for partition (files)
22/01/28 01:29:44 INFO AbstractTableFileSystemView: addFilesToView: NumFiles=9, 
NumFileGroups=1, FileGroupsCreationTime=1, StoreTimeTaken=0
22/01/28 01:29:44 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:44 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:44 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:44 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:44 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:44 INFO HoodieBackedTableMetadata: Opened metadata base file 
from 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/files-0000_0-17-118_20220126022154561001.hfile
 at instant 20220126022154561001 in 1041 ms
22/01/28 01:29:45 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20220126024720121__commit__COMPLETED]}
22/01/28 01:29:45 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient 
from s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:45 INFO HoodieTableConfig: Loading table properties from 
s3a://datalake-hudi/sessions/.hoodie/metadata/.hoodie/hoodie.properties
22/01/28 01:29:45 INFO HoodieTableMetaClient: Finished Loading Table of type 
MERGE_ON_READ(version=1, baseFileFormat=HFILE) from 
s3a://datalake-hudi/sessions/.hoodie/metadata
22/01/28 01:29:45 INFO HoodieActiveTimeline: Loaded instants upto : 
Option{val=[20220126024720121__deltacommit__COMPLETED]}
22/01/28 01:29:45 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.1_0-27-187',
 fileLen=0}
22/01/28 01:29:45 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:45 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.1_0-27-187
 at instant 20220126022523011
22/01/28 01:29:46 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180',
 fileLen=0}
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180',
 fileLen=0}
22/01/28 01:29:46 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.2_0-20-180
 at instant 20220126023054200
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:46 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO ExternalSpillableMap: Estimated Payload size => 600
22/01/28 01:29:46 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184',
 fileLen=0}
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184',
 fileLen=0}
22/01/28 01:29:46 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.3_0-20-184
 at instant 20220126023530250
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:46 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381',
 fileLen=0}
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381',
 fileLen=0}
22/01/28 01:29:46 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.4_0-38-381
 at instant 20220126023637109
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:46 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:46 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168',
 fileLen=0}
22/01/28 01:29:46 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168',
 fileLen=0}
22/01/28 01:29:46 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.5_0-20-168
 at instant 20220126024028688
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:47 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247',
 fileLen=0}
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247',
 fileLen=0}
22/01/28 01:29:47 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.6_0-38-247
 at instant 20220126024137627
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:47 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO HoodieLogFormatReader: Moving to the next reader for 
logfile 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204',
 fileLen=0}
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Scanning log file 
HoodieLogFile{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204',
 fileLen=0}
22/01/28 01:29:47 INFO S3AInputStream: Switching to Random IO seek policy
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Reading a data block from 
file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20220126022154561001.log.7_0-20-204
 at instant 20220126024720121
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:47 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Merging the final data 
blocks
22/01/28 01:29:47 INFO AbstractHoodieLogRecordReader: Number of remaining 
logblocks to merge 1
22/01/28 01:29:47 INFO CacheConfig: Created cacheConfig: 
blockCache=LruBlockCache{blockCount=0, currentSize=1567280, 
freeSize=1525578832, maxSize=1527146112, heapSize=1567280, minSize=1450788736, 
minFactor=0.95, multiSize=725394368, multiFactor=0.5, singleSize=362697184, 
singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, 
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, 
cacheDataCompressed=false, prefetchOnOpen=false
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Number of log files 
scanned => 7
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: MaxMemoryInBytes allowed 
for compaction => 1073741824
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Number of entries in 
MemoryBasedMap in ExternalSpillableMap => 3
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Total size in bytes of 
MemoryBasedMap in ExternalSpillableMap => 1800
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Number of entries in 
BitCaskDiskMap in ExternalSpillableMap => 0
22/01/28 01:29:47 INFO HoodieMergedLogRecordScanner: Size of file spilled to 
disk => 0
22/01/28 01:29:47 INFO HoodieBackedTableMetadata: Opened 7 metadata log files 
(dataset instant=20220126024720121, metadata instant=20220126024720121) in 2814 
ms
22/01/28 01:29:47 INFO CodecPool: Got brand-new decompressor [.gz]
22/01/28 01:29:47 INFO HoodieBackedTableMetadata: Metadata read for 389 keys 
took [baseFileRead, logMerge] [1, 122] ms
22/01/28 01:29:47 INFO BaseTableMetadata: Listed files in partitions from 
metadata: partition list =[s3a://datalake-hudi/sessions/date=2021/07/20, 
s3a://datalake-hudi/sessions/date=2021/04/29, 
s3a://datalake-hudi/sessions/date=2021/02/02, 
s3a://datalake-hudi/sessions/date=2021/09/30, 
s3a://datalake-hudi/sessions/date=2021/12/30, 
s3a://datalake-hudi/sessions/date=2021/03/10, 
s3a://datalake-hudi/sessions/date=2021/08/31, 
s3a://datalake-hudi/sessions/date=2021/01/08, 
s3a://datalake-hudi/sessions/date=2021/12/19, 
s3a://datalake-hudi/sessions/date=2021/04/26, 
s3a://datalake-hudi/sessions/date=2021/08/06, 
s3a://datalake-hudi/sessions/date=2021/05/23, 
s3a://datalake-hudi/sessions/date=2021/04/09, 
s3a://datalake-hudi/sessions/date=2021/10/12, 
s3a://datalake-hudi/sessions/date=2021/06/12, 
s3a://datalake-hudi/sessions/date=2021/05/20, 
s3a://datalake-hudi/sessions/date=2021/03/04, 
s3a://datalake-hudi/sessions/date=2021/08/20, 
s3a://datalake-hudi/sessions/date=2021/02/13, 
s3a://datalake-hudi/sessions/date=2021/08/02, 
s3a://datalake-hudi/sessions/date=2021/05/05, 
s3a://datalake-hudi/sessions/date=2021/03/03, 
s3a://datalake-hudi/sessions/date=2021/06/05, 
s3a://datalake-hudi/sessions/date=2021/10/23, 
s3a://datalake-hudi/sessions/date=2021/09/29, 
s3a://datalake-hudi/sessions/date=2021/04/14, 
s3a://datalake-hudi/sessions/date=2021/02/18, 
s3a://datalake-hudi/sessions/date=2021/03/21, 
s3a://datalake-hudi/sessions/date=2021/07/24, 
s3a://datalake-hudi/sessions/date=2021/04/30, 
s3a://datalake-hudi/sessions/date=2021/01/21, 
s3a://datalake-hudi/sessions/date=2021/10/29, 
s3a://datalake-hudi/sessions/date=2021/10/01, 
s3a://datalake-hudi/sessions/date=2021/06/23, 
s3a://datalake-hudi/sessions/date=2021/12/27, 
s3a://datalake-hudi/sessions/date=2021/04/08, 
s3a://datalake-hudi/sessions/date=2021/03/14, 
s3a://datalake-hudi/sessions/date=2021/03/26, 
s3a://datalake-hudi/sessions/date=2021/12/04, 
s3a://datalake-hudi/sessions/date=2021/05/27, 
s3a://datalake-hudi/sessions/date=2021/11/27, 
s3a://datalake-hudi/sessions/date=2021/05/12, 
s3a://datalake-hudi/sessions/date=2021/03/19, 
s3a://datalake-hudi/sessions/date=2021/11/07, 
s3a://datalake-hudi/sessions/date=2021/03/08, 
s3a://datalake-hudi/sessions/date=2021/12/16, 
s3a://datalake-hudi/sessions/date=2021/11/18, 
s3a://datalake-hudi/sessions/date=2021/10/18, 
s3a://datalake-hudi/sessions/date=2021/05/31, 
s3a://datalake-hudi/sessions/date=2021/10/07, 
s3a://datalake-hudi/sessions/date=2022/01/04, 
s3a://datalake-hudi/sessions/date=2021/09/10, 
s3a://datalake-hudi/sessions/date=2021/06/01, 
s3a://datalake-hudi/sessions/date=2021/06/27, 
s3a://datalake-hudi/sessions/date=2021/07/01, 
s3a://datalake-hudi/sessions/date=2021/06/30, 
s3a://datalake-hudi/sessions/date=2021/08/17, 
s3a://datalake-hudi/sessions/date=2021/04/17, 
s3a://datalake-hudi/sessions/date=2022/01/07, 
s3a://datalake-hudi/sessions/date=2021/05/26, 
s3a://datalake-hudi/sessions/date=2021/03/18, 
s3a://datalake-hudi/sessions/date=2021/09/20, 
s3a://datalake-hudi/sessions/date=2021/11/26, 
s3a://datalake-hudi/sessions/date=2021/12/11, 
s3a://datalake-hudi/sessions/date=2021/04/01, 
s3a://datalake-hudi/sessions/date=2021/03/22, 
s3a://datalake-hudi/sessions/date=2021/05/15, 
s3a://datalake-hudi/sessions/date=2021/08/28, 
s3a://datalake-hudi/sessions/date=2021/01/11, 
s3a://datalake-hudi/sessions/date=2021/11/03, 
s3a://datalake-hudi/sessions/date=2022/01/18, 
s3a://datalake-hudi/sessions/date=2021/11/14, 
s3a://datalake-hudi/sessions/date=2021/02/10, 
s3a://datalake-hudi/sessions/date=2021/12/09, 
s3a://datalake-hudi/sessions/date=2021/03/29, 
s3a://datalake-hudi/sessions/date=2021/06/16, 
s3a://datalake-hudi/sessions/date=2021/02/06, 
s3a://datalake-hudi/sessions/date=2021/11/22, 
s3a://datalake-hudi/sessions/date=2021/09/07, 
s3a://datalake-hudi/sessions/date=2021/09/24, 
s3a://datalake-hudi/sessions/date=2021/03/31, 
s3a://datalake-hudi/sessions/date=2021/01/19, 
s3a://datalake-hudi/sessions/date=2021/08/13, 
s3a://datalake-hudi/sessions/date=2022/01/12, 
s3a://datalake-hudi/sessions/date=2021/12/22, 
s3a://datalake-hudi/sessions/date=2021/08/24, 
s3a://datalake-hudi/sessions/date=2021/07/12, 
s3a://datalake-hudi/sessions/date=2021/11/30, 
s3a://datalake-hudi/sessions/date=2021/02/21, 
s3a://datalake-hudi/sessions/date=2021/09/18, 
s3a://datalake-hudi/sessions/date=2021/03/11, 
s3a://datalake-hudi/sessions/date=2022/01/01, 
s3a://datalake-hudi/sessions/date=2021/02/17, 
s3a://datalake-hudi/sessions/date=2021/07/31, 
s3a://datalake-hudi/sessions/date=2021/01/29, 
s3a://datalake-hudi/sessions/date=2021/02/28, 
s3a://datalake-hudi/sessions/date=2021/03/07, 
s3a://datalake-hudi/sessions/date=2021/06/09, 
s3a://datalake-hudi/sessions/date=2021/11/10, 
s3a://datalake-hudi/sessions/date=2021/02/07, 
s3a://datalake-hudi/sessions/date=2021/09/28, 
s3a://datalake-hudi/sessions/date=2021/03/02, 
s3a://datalake-hudi/sessions/date=2021/01/25, 
s3a://datalake-hudi/sessions/date=2021/02/19, 
s3a://datalake-hudi/sessions/date=2021/11/05, 
s3a://datalake-hudi/sessions/date=2021/06/13, 
s3a://datalake-hudi/sessions/date=2021/02/01, 
s3a://datalake-hudi/sessions/date=2021/07/25, 
s3a://datalake-hudi/sessions/date=2021/01/27, 
s3a://datalake-hudi/sessions/date=2021/02/12, 
s3a://datalake-hudi/sessions/date=2021/06/24, 
s3a://datalake-hudi/sessions/date=2021/01/14, 
s3a://datalake-hudi/sessions/date=2021/03/30, 
s3a://datalake-hudi/sessions/date=2021/01/03, 
s3a://datalake-hudi/sessions/date=2021/02/08, 
s3a://datalake-hudi/sessions/date=2021/04/23, 
s3a://datalake-hudi/sessions/date=2021/09/11, 
s3a://datalake-hudi/sessions/date=2021/09/03, 
s3a://datalake-hudi/sessions/date=2021/12/13, 
s3a://datalake-hudi/sessions/date=2021/09/14, 
s3a://datalake-hudi/sessions/date=2021/10/02, 
s3a://datalake-hudi/sessions/date=2021/12/24, 
s3a://datalake-hudi/sessions/date=2021/01/09, 
s3a://datalake-hudi/sessions/date=2021/12/18, 
s3a://datalake-hudi/sessions/date=2021/10/17, 
s3a://datalake-hudi/sessions/date=2021/11/09, 
s3a://datalake-hudi/sessions/date=2021/06/04, 
s3a://datalake-hudi/sessions/date=2021/04/13, 
s3a://datalake-hudi/sessions/date=2021/03/09, 
s3a://datalake-hudi/sessions/date=2022/01/23, 
s3a://datalake-hudi/sessions/date=2021/11/08, 
s3a://datalake-hudi/sessions/date=2021/11/16, 
s3a://datalake-hudi/sessions/date=2022/01/20, 
s3a://datalake-hudi/sessions/date=2021/01/20, 
s3a://datalake-hudi/sessions/date=2021/08/21, 
s3a://datalake-hudi/sessions/date=2021/10/28, 
s3a://datalake-hudi/sessions/date=2021/08/03, 
s3a://datalake-hudi/sessions/date=2021/03/13, 
s3a://datalake-hudi/sessions/date=2021/07/03, 
s3a://datalake-hudi/sessions/date=2021/10/06, 
s3a://datalake-hudi/sessions/date=2021/07/09, 
s3a://datalake-hudi/sessions/date=2021/06/02, 
s3a://datalake-hudi/sessions/date=2021/11/28, 
s3a://datalake-hudi/sessions/date=2021/05/01, 
s3a://datalake-hudi/sessions/date=2021/12/29, 
s3a://datalake-hudi/sessions/date=2021/01/31, 
s3a://datalake-hudi/sessions/date=2021/01/05, 
s3a://datalake-hudi/sessions/date=2021/04/10, 
s3a://datalake-hudi/sessions/date=2021/11/19, 
s3a://datalake-hudi/sessions/date=2021/04/07, 
s3a://datalake-hudi/sessions/date=2021/02/23, 
s3a://datalake-hudi/sessions/date=2021/07/14, 
s3a://datalake-hudi/sessions/date=2021/05/17, 
s3a://datalake-hudi/sessions/date=2021/03/25, 
s3a://datalake-hudi/sessions/date=2021/08/14, 
s3a://datalake-hudi/sessions/date=2021/04/04, 
s3a://datalake-hudi/sessions/date=2021/01/16, 
s3a://datalake-hudi/sessions/date=2021/06/08, 
s3a://datalake-hudi/sessions/date=2022/01/17, 
s3a://datalake-hudi/sessions/date=2021/07/28, 
s3a://datalake-hudi/sessions/date=2021/10/24, 
s3a://datalake-hudi/sessions/date=2021/10/13, 
s3a://datalake-hudi/sessions/date=2021/07/13, 
s3a://datalake-hudi/sessions/date=2021/05/16, 
s3a://datalake-hudi/sessions/date=2022/01/06, 
s3a://datalake-hudi/sessions/date=2021/05/02, 
s3a://datalake-hudi/sessions/date=2021/03/17, 
s3a://datalake-hudi/sessions/date=2021/02/27, 
s3a://datalake-hudi/sessions/date=2021/12/12, 
s3a://datalake-hudi/sessions/date=2021/05/28, 
s3a://datalake-hudi/sessions/date=2021/05/30, 
s3a://datalake-hudi/sessions/date=2021/06/15, 
s3a://datalake-hudi/sessions/date=2021/08/25, 
s3a://datalake-hudi/sessions/date=2021/02/24, 
s3a://datalake-hudi/sessions/date=2021/08/10, 
s3a://datalake-hudi/sessions/date=2021/12/06, 
s3a://datalake-hudi/sessions/date=2021/09/21, 
s3a://datalake-hudi/sessions/date=2021/07/06, 
s3a://datalake-hudi/sessions/date=2021/11/25, 
s3a://datalake-hudi/sessions/date=2021/09/06, 
s3a://datalake-hudi/sessions/date=2021/12/03, 
s3a://datalake-hudi/sessions/date=2021/10/20, 
s3a://datalake-hudi/sessions/date=2022/01/13, 
s3a://datalake-hudi/sessions/date=2021/11/13, 
s3a://datalake-hudi/sessions/date=2021/03/06, 
s3a://datalake-hudi/sessions/date=2021/06/26, 
s3a://datalake-hudi/sessions/date=2021/07/17, 
s3a://datalake-hudi/sessions/date=2021/07/21, 
s3a://datalake-hudi/sessions/date=2021/05/06, 
s3a://datalake-hudi/sessions/date=2021/06/20, 
s3a://datalake-hudi/sessions/date=2021/01/10, 
s3a://datalake-hudi/sessions/date=2021/02/05, 
s3a://datalake-hudi/sessions/date=2021/02/16, 
s3a://datalake-hudi/sessions/date=2021/04/22, 
s3a://datalake-hudi/sessions/date=2022/01/02, 
s3a://datalake-hudi/sessions/date=2021/08/09, 
s3a://datalake-hudi/sessions/date=2021/04/27, 
s3a://datalake-hudi/sessions/date=2021/06/19, 
s3a://datalake-hudi/sessions/date=2021/09/17, 
s3a://datalake-hudi/sessions/date=2021/03/28, 
s3a://datalake-hudi/sessions/date=2021/07/02, 
s3a://datalake-hudi/sessions/date=2021/12/23, 
s3a://datalake-hudi/sessions/date=2021/11/21, 
s3a://datalake-hudi/sessions/date=2021/09/25, 
s3a://datalake-hudi/sessions/date=2021/11/02, 
s3a://datalake-hudi/sessions/date=2021/02/20, 
s3a://datalake-hudi/sessions/date=2021/11/17, 
s3a://datalake-hudi/sessions/date=2021/12/14, 
s3a://datalake-hudi/sessions/date=2022/01/21, 
s3a://datalake-hudi/sessions/date=2021/10/31, 
s3a://datalake-hudi/sessions/date=2021/08/27, 
s3a://datalake-hudi/sessions/date=2021/10/05, 
s3a://datalake-hudi/sessions/date=2021/01/02, 
s3a://datalake-hudi/sessions/date=2022/01/10, 
s3a://datalake-hudi/sessions/date=2021/01/24, 
s3a://datalake-hudi/sessions/date=2021/09/01, 
s3a://datalake-hudi/sessions/date=2021/11/12, 
s3a://datalake-hudi/sessions/date=2021/04/16, 
s3a://datalake-hudi/sessions/date=2021/06/18, 
s3a://datalake-hudi/sessions/date=2021/06/21, 
s3a://datalake-hudi/sessions/date=2021/07/22, 
s3a://datalake-hudi/sessions/date=2021/11/24, 
s3a://datalake-hudi/sessions/date=2021/04/11, 
s3a://datalake-hudi/sessions/date=2021/07/11, 
s3a://datalake-hudi/sessions/date=2021/12/25, 
s3a://datalake-hudi/sessions/date=2021/08/08, 
s3a://datalake-hudi/sessions/date=2021/12/01, 
s3a://datalake-hudi/sessions/date=2021/12/31, 
s3a://datalake-hudi/sessions/date=2021/07/10, 
s3a://datalake-hudi/sessions/date=2021/11/06, 
s3a://datalake-hudi/sessions/date=2021/06/10, 
s3a://datalake-hudi/sessions/date=2021/09/04, 
s3a://datalake-hudi/sessions/date=2021/01/04, 
s3a://datalake-hudi/sessions/date=2021/01/26, 
s3a://datalake-hudi/sessions/date=2021/08/04, 
s3a://datalake-hudi/sessions/date=2021/01/13, 
s3a://datalake-hudi/sessions/date=2021/05/22, 
s3a://datalake-hudi/sessions/date=2021/05/03, 
s3a://datalake-hudi/sessions/date=2021/10/16, 
s3a://datalake-hudi/sessions/date=2021/08/16, 
s3a://datalake-hudi/sessions/date=2022/01/24, 
s3a://datalake-hudi/sessions/date=2021/01/06, 
s3a://datalake-hudi/sessions/date=2021/09/09, 
s3a://datalake-hudi/sessions/date=2021/10/27, 
s3a://datalake-hudi/sessions/date=2021/05/11, 
s3a://datalake-hudi/sessions/date=2021/09/15, 
s3a://datalake-hudi/sessions/date=2022/01/09, 
s3a://datalake-hudi/sessions/date=2021/12/17, 
s3a://datalake-hudi/sessions/date=2021/12/28, 
s3a://datalake-hudi/sessions/date=2021/12/07, 
s3a://datalake-hudi/sessions/date=2021/05/07, 
s3a://datalake-hudi/sessions/date=2021/08/22, 
s3a://datalake-hudi/sessions/date=2021/02/26, 
s3a://datalake-hudi/sessions/date=2021/04/21, 
s3a://datalake-hudi/sessions/date=2021/07/16, 
s3a://datalake-hudi/sessions/date=2021/02/09, 
s3a://datalake-hudi/sessions/date=2021/01/15, 
s3a://datalake-hudi/sessions/date=2021/10/25, 
s3a://datalake-hudi/sessions/date=2022/01/16, 
s3a://datalake-hudi/sessions/date=2021/08/11, 
s3a://datalake-hudi/sessions/date=2021/07/19, 
s3a://datalake-hudi/sessions/date=2021/08/05, 
s3a://datalake-hudi/sessions/date=2021/04/19, 
s3a://datalake-hudi/sessions/date=2022/01/05, 
s3a://datalake-hudi/sessions/date=2021/09/26, 
s3a://datalake-hudi/sessions/date=2021/03/24, 
s3a://datalake-hudi/sessions/date=2021/05/18, 
s3a://datalake-hudi/sessions/date=2021/07/08, 
s3a://datalake-hudi/sessions/date=2021/08/15, 
s3a://datalake-hudi/sessions/date=2021/04/03, 
s3a://datalake-hudi/sessions/date=2021/05/29, 
s3a://datalake-hudi/sessions/date=2021/06/29, 
s3a://datalake-hudi/sessions/date=2021/04/24, 
s3a://datalake-hudi/sessions/date=2021/10/14, 
s3a://datalake-hudi/sessions/date=2021/07/05, 
s3a://datalake-hudi/sessions/date=2021/02/15, 
s3a://datalake-hudi/sessions/date=2021/11/01, 
s3a://datalake-hudi/sessions/date=2021/05/13, 
s3a://datalake-hudi/sessions/date=2021/10/09, 
s3a://datalake-hudi/sessions/date=2021/10/21, 
s3a://datalake-hudi/sessions/date=2021/06/07, 
s3a://datalake-hudi/sessions/date=2021/04/12, 
s3a://datalake-hudi/sessions/date=2021/02/04, 
s3a://datalake-hudi/sessions/date=2021/12/20, 
s3a://datalake-hudi/sessions/date=2021/05/24, 
s3a://datalake-hudi/sessions/date=2021/08/19, 
s3a://datalake-hudi/sessions/date=2021/04/06, 
s3a://datalake-hudi/sessions/date=2021/09/22, 
s3a://datalake-hudi/sessions/date=2021/12/02, 
s3a://datalake-hudi/sessions/date=2021/07/27, 
s3a://datalake-hudi/sessions/date=2021/08/26, 
s3a://datalake-hudi/sessions/date=2021/10/10, 
s3a://datalake-hudi/sessions/date=2021/08/07, 
s3a://datalake-hudi/sessions/date=2021/03/05, 
s3a://datalake-hudi/sessions/date=2021/11/23, 
s3a://datalake-hudi/sessions/date=2021/09/16, 
s3a://datalake-hudi/sessions/date=2021/06/06, 
s3a://datalake-hudi/sessions/date=2021/10/11, 
s3a://datalake-hudi/sessions/date=2021/07/23, 
s3a://datalake-hudi/sessions/date=2021/05/21, 
s3a://datalake-hudi/sessions/date=2022/01/25, 
s3a://datalake-hudi/sessions/date=2021/10/03, 
s3a://datalake-hudi/sessions/date=2021/01/22, 
s3a://datalake-hudi/sessions/date=2021/06/22, 
s3a://datalake-hudi/sessions/date=2022/01/22, 
s3a://datalake-hudi/sessions/date=2021/10/26, 
s3a://datalake-hudi/sessions/date=2021/10/30, 
s3a://datalake-hudi/sessions/date=2021/05/04, 
s3a://datalake-hudi/sessions/date=2021/05/10, 
s3a://datalake-hudi/sessions/date=2022/01/11, 
s3a://datalake-hudi/sessions/date=2021/08/01, 
s3a://datalake-hudi/sessions/date=2021/07/07, 
s3a://datalake-hudi/sessions/date=2022/01/03, 
s3a://datalake-hudi/sessions/date=2021/06/28, 
s3a://datalake-hudi/sessions/date=2021/09/05, 
s3a://datalake-hudi/sessions/date=2021/12/15, 
s3a://datalake-hudi/sessions/date=2021/01/07, 
s3a://datalake-hudi/sessions/date=2021/10/15, 
s3a://datalake-hudi/sessions/date=2021/03/27, 
s3a://datalake-hudi/sessions/date=2021/12/05, 
s3a://datalake-hudi/sessions/date=2021/03/16, 
s3a://datalake-hudi/sessions/date=2021/03/20, 
s3a://datalake-hudi/sessions/date=2021/04/28, 
s3a://datalake-hudi/sessions/date=2022/01/14, 
s3a://datalake-hudi/sessions/date=2021/06/11, 
s3a://datalake-hudi/sessions/date=2021/01/17, 
s3a://datalake-hudi/sessions/date=2021/11/29, 
s3a://datalake-hudi/sessions/date=2021/08/23, 
s3a://datalake-hudi/sessions/date=2021/10/08, 
s3a://datalake-hudi/sessions/date=2021/07/18, 
s3a://datalake-hudi/sessions/date=2021/02/25, 
s3a://datalake-hudi/sessions/date=2021/02/11, 
s3a://datalake-hudi/sessions/date=2021/08/12, 
s3a://datalake-hudi/sessions/date=2021/12/08, 
s3a://datalake-hudi/sessions/date=2021/09/19, 
s3a://datalake-hudi/sessions/date=2021/07/04, 
s3a://datalake-hudi/sessions/date=2021/06/03, 
s3a://datalake-hudi/sessions/date=2021/09/08, 
s3a://datalake-hudi/sessions/date=2021/11/04, 
s3a://datalake-hudi/sessions/date=2021/12/26, 
s3a://datalake-hudi/sessions/date=2021/01/30, 
s3a://datalake-hudi/sessions/date=2021/09/27, 
s3a://datalake-hudi/sessions/date=2021/08/30, 
s3a://datalake-hudi/sessions/date=2021/01/18, 
s3a://datalake-hudi/sessions/date=2021/11/15, 
s3a://datalake-hudi/sessions/date=2022/01/15, 
s3a://datalake-hudi/sessions/date=2021/04/02, 
s3a://datalake-hudi/sessions/date=2021/10/19, 
s3a://datalake-hudi/sessions/date=2021/03/23, 
s3a://datalake-hudi/sessions/date=2021/07/29, 
s3a://datalake-hudi/sessions/date=2021/02/03, 
s3a://datalake-hudi/sessions/date=2021/03/12, 
s3a://datalake-hudi/sessions/date=2021/01/12, 
s3a://datalake-hudi/sessions/date=2022/01/08, 
s3a://datalake-hudi/sessions/date=2021/03/15, 
s3a://datalake-hudi/sessions/date=2021/12/10, 
s3a://datalake-hudi/sessions/date=2021/10/04, 
s3a://datalake-hudi/sessions/date=2021/04/25, 
s3a://datalake-hudi/sessions/date=2021/09/13, 
s3a://datalake-hudi/sessions/date=2021/02/22, 
s3a://datalake-hudi/sessions/date=2021/06/17, 
s3a://datalake-hudi/sessions/date=2021/03/01, 
s3a://datalake-hudi/sessions/date=2021/12/21, 
s3a://datalake-hudi/sessions/date=2021/04/05, 
s3a://datalake-hudi/sessions/date=2021/09/12, 
s3a://datalake-hudi/sessions/date=2021/09/23, 
s3a://datalake-hudi/sessions/date=2021/05/19, 
s3a://datalake-hudi/sessions/date=2021/07/30, 
s3a://datalake-hudi/sessions/date=2021/07/15, 
s3a://datalake-hudi/sessions/date=2021/01/28, 
s3a://datalake-hudi/sessions/date=2021/09/02, 
s3a://datalake-hudi/sessions/date=2022/01/19, 
s3a://datalake-hudi/sessions/date=2021/10/22, 
s3a://datalake-hudi/sessions/date=2021/11/11, 
s3a://datalake-hudi/sessions/date=2021/05/09, 
s3a://datalake-hudi/sessions/date=2021/01/23, 
s3a://datalake-hudi/sessions/date=2021/07/26, 
s3a://datalake-hudi/sessions/date=2021/06/25, 
s3a://datalake-hudi/sessions/date=2021/04/15, 
s3a://datalake-hudi/sessions/date=2021/05/14, 
s3a://datalake-hudi/sessions/date=2021/08/29, 
s3a://datalake-hudi/sessions/date=2021/05/08, 
s3a://datalake-hudi/sessions/date=2021/11/20, 
s3a://datalake-hudi/sessions/date=2021/08/18, 
s3a://datalake-hudi/sessions/date=2021/05/25, 
s3a://datalake-hudi/sessions/date=2021/02/14, 
s3a://datalake-hudi/sessions/date=2021/06/14, 
s3a://datalake-hudi/sessions/date=2021/04/18, 
s3a://datalake-hudi/sessions/date=2021/04/20]
22/01/28 01:29:47 INFO SparkUI: Stopped Spark web UI at http://192.168.86.5:4040
22/01/28 01:29:47 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
22/01/28 01:29:47 INFO MemoryStore: MemoryStore cleared
22/01/28 01:29:47 INFO BlockManager: BlockManager stopped
22/01/28 01:29:48 INFO BlockManagerMaster: BlockManagerMaster stopped
22/01/28 01:29:48 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
22/01/28 01:29:48 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.NullPointerException
at org.sparkproject.guava.base.Preconditions.checkNotNull(Preconditions.java:191)
at org.sparkproject.guava.cache.LocalCache.put(LocalCache.java:4210)
at org.sparkproject.guava.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804)
at org.apache.spark.sql.execution.datasources.SharedInMemoryCache$$anon$3.putLeafFiles(FileStatusCache.scala:161)
at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4(HoodieFileIndex.scala:631)
at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4$adapted(HoodieFileIndex.scala:629)
at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
at org.apache.hudi.HoodieFileIndex.loadPartitionPathFiles(HoodieFileIndex.scala:629)
at org.apache.hudi.HoodieFileIndex.refresh0(HoodieFileIndex.scala:387)
at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:184)
at org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:199)
at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:119)
at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:307)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:307)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:239)
at com.h7kanna.data.reports.MetadataTestETL$.main(MetadataTestETL.scala:30)
at com.h7kanna.data.reports.MetadataTestETL.main(MetadataTestETL.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/01/28 01:29:48 INFO ShutdownHookManager: Shutdown hook called
22/01/28 01:29:48 INFO ShutdownHookManager: Deleting directory 
/private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-7f2e050f-240e-40dd-b433-a6ebe08232ae
22/01/28 01:29:48 INFO ShutdownHookManager: Deleting directory 
/private/var/folders/61/3vd56bjx3cj0hpdq_139d5hm0000gp/T/spark-823349b0-aeeb-494d-bdc6-c276419a0fe1
22/01/28 01:29:48 INFO MetricsSystemImpl: Stopping s3a-file-system metrics 
system...
22/01/28 01:29:48 INFO MetricsSystemImpl: s3a-file-system metrics system 
stopped.
22/01/28 01:29:48 INFO MetricsSystemImpl: s3a-file-system metrics system 
shutdown complete.
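
For what it's worth, the NullPointerException at the top of the stack is thrown by Guava's cache itself: LocalCache.put runs Preconditions.checkNotNull on both the key and the value before storing them, so SharedInMemoryCache.putLeafFiles (FileStatusCache.scala:161) is evidently being handed a null partition path or a null leaf-file array for at least one of the partitions that HoodieFileIndex.loadPartitionPathFiles iterates over. A minimal sketch of that failure mode, using plain Guava instead of the shaded org.sparkproject relocation (the cache setup and the partition key below are illustrative only, not Spark's actual configuration):

import com.google.common.cache.{Cache, CacheBuilder}

object GuavaNullPutDemo {
  def main(args: Array[String]): Unit = {
    // Guava caches never accept null keys or values.
    val cache: Cache[String, String] = CacheBuilder.newBuilder().build[String, String]()

    // A non-null key/value pair is stored without complaint.
    cache.put("date=2022/01/25", "leaf files")

    // A null value trips Preconditions.checkNotNull inside LocalCache.put and
    // surfaces as the same NullPointerException frame seen in the trace above.
    cache.put("date=2021/07/20", null)
  }
}

That would also be consistent with the observation in the issue below that loading a single partition works while the full-table load fails: the per-partition listing fetched from the metadata table apparently yields a null entry for at least one of the 389 partitions.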

> Loading Hudi table fails with NullPointerException
> --------------------------------------------------
>
>                 Key: HUDI-3335
>                 URL: https://issues.apache.org/jira/browse/HUDI-3335
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.10.1
>            Reporter: Harsha Teja Kanna
>            Priority: Blocker
>             Fix For: 0.11.0
>
>
> Have a COW table with metadata enabled. Loading it via a Spark query fails 
> with java.lang.NullPointerException.
> *Environment*
> Spark 3.1.2
> Hudi 0.10.1
> *Query*
> import org.apache.hudi.DataSourceReadOptions
> import org.apache.hudi.common.config.HoodieMetadataConfig
> val basePath = "s3a://datalake-hudi/v1"
>  val df = spark.
>     read.
>     format("org.apache.hudi").
>     option(HoodieMetadataConfig.ENABLE.key(), "true").
>     option(DataSourceReadOptions.QUERY_TYPE.key(), DataSourceReadOptions.QUERY_TYPE_SNAPSHOT_OPT_VAL).
>     load(s"${basePath}/sessions/")
>  df.createOrReplaceTempView(table)
> *Passing an individual partition works though*
> val df = spark.
>     read.
>     format("org.apache.hudi").
>     option(HoodieMetadataConfig.ENABLE.key(), "true").
>     option(DataSourceReadOptions.QUERY_TYPE.key(), DataSourceReadOptions.QUERY_TYPE_SNAPSHOT_OPT_VAL).
>     load(s"${basePath}/sessions/date=2022/01/25")
>  df.createOrReplaceTempView(table)
> *Also, disabling metadata works, but the query takes a very long time*
> val df = spark.
>     read.
>     format("org.apache.hudi").
>     option(DataSourceReadOptions.QUERY_TYPE.key(), DataSourceReadOptions.QUERY_TYPE_SNAPSHOT_OPT_VAL).
>     load(s"${basePath}/sessions/")
>  df.createOrReplaceTempView(table)
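> With the metadata table off, the time goes into listing the table's partitions on S3 directly. One mitigation worth trying (untested here; the value below is illustrative) is raising Spark's parallel partition discovery parallelism before the read, the same setting the writer config below passes via --conf; if the session rejects setting it at runtime, pass it on spark-submit instead:
> spark.conf.set("spark.sql.sources.parallelPartitionDiscovery.parallelism", "1000")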
> *Stacktrace while loading files:*
>   at org.sparkproject.guava.base.Preconditions.checkNotNull(Preconditions.java:191)
>   at org.sparkproject.guava.cache.LocalCache.put(LocalCache.java:4210)
>   at org.sparkproject.guava.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804)
>   at org.apache.spark.sql.execution.datasources.SharedInMemoryCache$$anon$3.putLeafFiles(FileStatusCache.scala:161)
>   at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4(HoodieFileIndex.scala:631)
>   at org.apache.hudi.HoodieFileIndex.$anonfun$loadPartitionPathFiles$4$adapted(HoodieFileIndex.scala:629)
>   at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234)
>   at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
>   at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
>   at org.apache.hudi.HoodieFileIndex.loadPartitionPathFiles(HoodieFileIndex.scala:629)
>   at org.apache.hudi.HoodieFileIndex.refresh0(HoodieFileIndex.scala:387)
>   at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:184)
>   at org.apache.hudi.DefaultSource.getBaseFileOnlyView(DefaultSource.scala:199)
>   at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:119)
>   at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355)
>   at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325)
>   at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:307)
>   at scala.Option.getOrElse(Option.scala:189)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:307)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:239)
>   at $anonfun$res3$1(<console>:46)
>   at $anonfun$res3$1$adapted(<console>:40)
>   at scala.collection.Iterator.foreach(Iterator.scala:941)
>   at scala.collection.Iterator.foreach$(Iterator.scala:941)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
> *Writer config*
> spark-submit \
> --master yarn \
> --deploy-mode cluster \
> --driver-cores 4 \
> --driver-memory 4g \
> --executor-cores 4 \
> --executor-memory 6g \
> --num-executors 8 \
> --jars s3://datalake/jars/unused-1.0.0.jar,s3://datalake/jars/spark-avro_2.12-3.1.2.jar \
> --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
> --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
> --conf spark.sql.sources.parallelPartitionDiscovery.parallelism=25000 \
> s3://datalake/jars/hudi-0.10.1/hudi-utilities-bundle_2.12-0.10.1.jar \
> --table-type COPY_ON_WRITE \
> --source-ordering-field timestamp \
> --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
> --target-base-path s3a://datalake-hudi/sessions \
> --target-table sessions \
> --transformer-class org.apache.hudi.utilities.transform.SqlQueryBasedTransformer \
> --op INSERT \
> --hoodie-conf hoodie.clean.automatic=true \
> --hoodie-conf hoodie.cleaner.commits.retained=10 \
> --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_COMMITS \
> --hoodie-conf hoodie.clustering.inline=true \
> --hoodie-conf hoodie.clustering.inline.max.commits=5 \
> --hoodie-conf hoodie.clustering.plan.strategy.class=org.apache.hudi.client.clustering.plan.strategy.SparkRecentDaysClusteringPlanStrategy \
> --hoodie-conf hoodie.clustering.plan.strategy.max.num.groups=1000 \
> --hoodie-conf hoodie.clustering.plan.strategy.small.file.limit=268435456 \
> --hoodie-conf hoodie.clustering.plan.strategy.sort.columns=survey_dbid,session_dbid \
> --hoodie-conf hoodie.clustering.plan.strategy.target.file.max.bytes=536870912 \
> --hoodie-conf hoodie.clustering.preserve.commit.metadata=true \
> --hoodie-conf hoodie.datasource.hive_sync.database=datalake-hudi \
> --hoodie-conf hoodie.datasource.hive_sync.enable=false \
> --hoodie-conf hoodie.datasource.hive_sync.ignore_exceptions=true \
> --hoodie-conf hoodie.datasource.hive_sync.mode=hms \
> --hoodie-conf hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.HiveStylePartitionValueExtractor \
> --hoodie-conf hoodie.datasource.hive_sync.partition_fields=date \
> --hoodie-conf hoodie.datasource.hive_sync.table=sessions \
> --hoodie-conf hoodie.datasource.hive_sync.use_jdbc=false \
> --hoodie-conf hoodie.datasource.write.hive_style_partitioning=true \
> --hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator \
> --hoodie-conf hoodie.datasource.write.operation=insert \
> --hoodie-conf hoodie.datasource.write.partitionpath.field=date:TIMESTAMP \
> --hoodie-conf hoodie.datasource.write.precombine.field=timestamp \
> --hoodie-conf hoodie.datasource.write.recordkey.field=session_dbid,question_id,answer \
> --hoodie-conf hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy/MM/dd \
> --hoodie-conf hoodie.deltastreamer.keygen.timebased.input.timezone=GMT \
> --hoodie-conf hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd \
> --hoodie-conf hoodie.deltastreamer.keygen.timebased.output.timezone=GMT \
> --hoodie-conf hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING \
> --hoodie-conf hoodie.deltastreamer.source.dfs.root=s3://datalake-hudi/raw/parquet/data/sessions/year=2022/month=01/day=26/hour=02 \
> --hoodie-conf hoodie.deltastreamer.source.input.selector=org.apache.hudi.utilities.sources.helpers.DFSPathSelector \
> --hoodie-conf "\"hoodie.deltastreamer.transformer.sql=SELECT question_id, answer, to_timestamp(timestamp) as timestamp, session_dbid, survey_dbid, date_format(to_timestamp(timestamp), 'yyyy/MM/dd') AS date FROM <SRC> a \"" \
> --hoodie-conf hoodie.file.listing.parallelism=256 \
> --hoodie-conf hoodie.finalize.write.parallelism=256 \
> --hoodie-conf hoodie.generate.consistent.timestamp.logical.for.key.generator=true \
> --hoodie-conf hoodie.insert.shuffle.parallelism=1000 \
> --hoodie-conf hoodie.metadata.enable=true \
> --hoodie-conf hoodie.metadata.metrics.enable=true \
> --hoodie-conf hoodie.metrics.cloudwatch.metric.prefix=emr.datalake.prd.insert.sessions \
> --hoodie-conf hoodie.metrics.on=false \
> --hoodie-conf hoodie.metrics.reporter.type=CLOUDWATCH \
> --hoodie-conf hoodie.parquet.block.size=536870912 \
> --hoodie-conf hoodie.parquet.compression.codec=snappy \
> --hoodie-conf hoodie.parquet.max.file.size=536870912 \
> --hoodie-conf hoodie.parquet.small.file.limit=268435456
>  
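> Given hive_style_partitioning=true, partition field date, and keygen output.dateformat=yyyy/MM/dd in GMT, the writer materializes partition paths like date=2022/01/25, which is exactly the path the working per-partition read above uses. A sketch of just the date-to-path rendering with plain java.time (illustrative; not Hudi's key generator code):
> import java.time.format.DateTimeFormatter
> import java.time.{Instant, ZoneId}
> val fmt = DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneId.of("GMT"))
> val partitionPath = s"date=${fmt.format(Instant.parse("2022-01-25T12:00:00Z"))}"
> // partitionPath == "date=2022/01/25"; the embedded slashes mean one logical
> // partition spans three directory levels, which full-table partition discovery
> // must reconcile and the single-partition load sidesteps.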
> *Metadata Commits (.hoodie/metadata/.hoodie)*
> 20220125154001455002.clean
> 20220125154001455002.clean.inflight
> 20220125154001455002.clean.requested
> 20220125160751769002.clean
> 20220125160751769002.clean.inflight
> 20220125160751769002.clean.requested
> 20220125163020781002.clean
> 20220125163020781002.clean.inflight
> 20220125163020781002.clean.requested
> 20220125165722170002.clean
> 20220125165722170002.clean.inflight
> 20220125165722170002.clean.requested
> 20220125172016239002.clean
> 20220125172016239002.clean.inflight
> 20220125172016239002.clean.requested
> 20220125174427654002.clean
> 20220125174427654002.clean.inflight
> 20220125174427654002.clean.requested
> 20220125181218237002.clean
> 20220125181218237002.clean.inflight
> 20220125181218237002.clean.requested
> 20220125184343588002.clean
> 20220125184343588002.clean.inflight
> 20220125184343588002.clean.requested
> 20220125191038318002.clean
> 20220125191038318002.clean.inflight
> 20220125191038318002.clean.requested
> 20220125193445223002.clean
> 20220125193445223002.clean.inflight
> 20220125193445223002.clean.requested
> 20220125200741168002.clean
> 20220125200741168002.clean.inflight
> 20220125200741168002.clean.requested
> 20220125203814934002.clean
> 20220125203814934002.clean.inflight
> 20220125203814934002.clean.requested
> 20220125211447323002.clean
> 20220125211447323002.clean.inflight
> 20220125211447323002.clean.requested
> 20220125214421740002.clean
> 20220125214421740002.clean.inflight
> 20220125214421740002.clean.requested
> 20220125221009798002.clean
> 20220125221009798002.clean.inflight
> 20220125221009798002.clean.requested
> 20220125224319264002.clean
> 20220125224319264002.clean.inflight
> 20220125224319264002.clean.requested
> 20220125231128580002.clean
> 20220125231128580002.clean.inflight
> 20220125231128580002.clean.requested
> 20220125234345790002.clean
> 20220125234345790002.clean.inflight
> 20220125234345790002.clean.requested
> 20220126001130415002.clean
> 20220126001130415002.clean.inflight
> 20220126001130415002.clean.requested
> 20220126004341130002.clean
> 20220126004341130002.clean.inflight
> 20220126004341130002.clean.requested
> 20220126011114529002.clean
> 20220126011114529002.clean.inflight
> 20220126011114529002.clean.requested
> 20220126013648751002.clean
> 20220126013648751002.clean.inflight
> 20220126013648751002.clean.requested
> 20220126013859643.deltacommit
> 20220126013859643.deltacommit.inflight
> 20220126013859643.deltacommit.requested
> 20220126014254294.deltacommit
> 20220126014254294.deltacommit.inflight
> 20220126014254294.deltacommit.requested
> 20220126014516195.deltacommit
> 20220126014516195.deltacommit.inflight
> 20220126014516195.deltacommit.requested
> 20220126014711043.deltacommit
> 20220126014711043.deltacommit.inflight
> 20220126014711043.deltacommit.requested
> 20220126014808898.deltacommit
> 20220126014808898.deltacommit.inflight
> 20220126014808898.deltacommit.requested
> 20220126015008443.deltacommit
> 20220126015008443.deltacommit.inflight
> 20220126015008443.deltacommit.requested
> 20220126015119193.deltacommit
> 20220126015119193.deltacommit.inflight
> 20220126015119193.deltacommit.requested
> 20220126015119193001.commit
> 20220126015119193001.compaction.inflight
> 20220126015119193001.compaction.requested
> 20220126015653770.deltacommit
> 20220126015653770.deltacommit.inflight
> 20220126015653770.deltacommit.requested
> 20220126020011172.deltacommit
> 20220126020011172.deltacommit.inflight
> 20220126020011172.deltacommit.requested
> 20220126020405299.deltacommit
> 20220126020405299.deltacommit.inflight
> 20220126020405299.deltacommit.requested
> 20220126020405299002.clean
> 20220126020405299002.clean.inflight
> 20220126020405299002.clean.requested
> 20220126020813841.deltacommit
> 20220126020813841.deltacommit.inflight
> 20220126020813841.deltacommit.requested
> 20220126021002748.deltacommit
> 20220126021002748.deltacommit.inflight
> 20220126021002748.deltacommit.requested
> 20220126021231085.deltacommit
> 20220126021231085.deltacommit.inflight
> 20220126021231085.deltacommit.requested
> 20220126021429124.deltacommit
> 20220126021429124.deltacommit.inflight
> 20220126021429124.deltacommit.requested
> 20220126021445188.deltacommit
> 20220126021445188.deltacommit.inflight
> 20220126021445188.deltacommit.requested
> 20220126021949824.deltacommit
> 20220126021949824.deltacommit.inflight
> 20220126021949824.deltacommit.requested
> 20220126022154561.deltacommit
> 20220126022154561.deltacommit.inflight
> 20220126022154561.deltacommit.requested
> 20220126022154561001.commit
> 20220126022154561001.compaction.inflight
> 20220126022154561001.compaction.requested
> 20220126022523011.deltacommit
> 20220126022523011.deltacommit.inflight
> 20220126022523011.deltacommit.requested
> 20220126023054200.deltacommit
> 20220126023054200.deltacommit.inflight
> 20220126023054200.deltacommit.requested
> 20220126023530250.deltacommit
> 20220126023530250.deltacommit.inflight
> 20220126023530250.deltacommit.requested
> 20220126023530250002.clean
> 20220126023530250002.clean.inflight
> 20220126023530250002.clean.requested
> 20220126023637109.deltacommit
> 20220126023637109.deltacommit.inflight
> 20220126023637109.deltacommit.requested
> 20220126024028688.deltacommit
> 20220126024028688.deltacommit.inflight
> 20220126024028688.deltacommit.requested
> 20220126024137627.deltacommit
> 20220126024137627.deltacommit.inflight
> 20220126024137627.deltacommit.requested
> 20220126024720121.deltacommit
> 20220126024720121.deltacommit.inflight
> 20220126024720121.deltacommit.requested
> *Commits (.hoodie)*
> 20220125224502471.clean
> 20220125224502471.clean.inflight
> 20220125224502471.clean.requested
> 20220125225810828.clean
> 20220125225810828.clean.inflight
> 20220125225810828.clean.requested
> 20220125230125674.clean
> 20220125230125674.clean.inflight
> 20220125230125674.clean.requested
> 20220125230854957.clean
> 20220125230854957.clean.inflight
> 20220125230854957.clean.requested
> 20220125232236767.clean
> 20220125232236767.clean.inflight
> 20220125232236767.clean.requested
> 20220125232638588.clean
> 20220125232638588.clean.inflight
> 20220125232638588.clean.requested
> 20220125233355290.clean
> 20220125233355290.clean.inflight
> 20220125233355290.clean.requested
> 20220125234539672.clean
> 20220125234539672.clean.inflight
> 20220125234539672.clean.requested
> 20220125234944271.clean
> 20220125234944271.clean.inflight
> 20220125234944271.clean.requested
> 20220125235718218.clean
> 20220125235718218.clean.inflight
> 20220125235718218.clean.requested
> 20220126000225375.clean
> 20220126000225375.clean.inflight
> 20220126000225375.clean.requested
> 20220126000937875.clean
> 20220126000937875.clean.inflight
> 20220126000937875.clean.requested
> 20220126003307449.clean
> 20220126003307449.clean.inflight
> 20220126003307449.clean.requested
> 20220126003617137.clean
> 20220126003617137.clean.inflight
> 20220126003617137.clean.requested
> 20220126004518227.clean
> 20220126004518227.clean.inflight
> 20220126004518227.clean.requested
> 20220126005806798.clean
> 20220126005806798.clean.inflight
> 20220126005806798.clean.requested
> 20220126010011407.commit
> 20220126010011407.commit.requested
> 20220126010011407.inflight
> 20220126010227320.clean
> 20220126010227320.clean.inflight
> 20220126010227320.clean.requested
> 20220126010242754.replacecommit
> 20220126010242754.replacecommit.inflight
> 20220126010242754.replacecommit.requested
> 20220126010800207.commit
> 20220126010800207.commit.requested
> 20220126010800207.inflight
> 20220126010920192.clean
> 20220126010920192.clean.inflight
> 20220126010920192.clean.requested
> 20220126011114529.commit
> 20220126011114529.commit.requested
> 20220126011114529.inflight
> 20220126011230532.clean
> 20220126011230532.clean.inflight
> 20220126011230532.clean.requested
> 20220126011426028.commit
> 20220126011426028.commit.requested
> 20220126011426028.inflight
> 20220126011818299.commit
> 20220126011818299.commit.requested
> 20220126011818299.inflight
> 20220126012003045.clean
> 20220126012003045.clean.inflight
> 20220126012003045.clean.requested
> 20220126012240288.commit
> 20220126012240288.commit.requested
> 20220126012240288.inflight
> 20220126012443455.clean
> 20220126012443455.clean.inflight
> 20220126012443455.clean.requested
> 20220126012508460.replacecommit
> 20220126012508460.replacecommit.inflight
> 20220126012508460.replacecommit.requested
> 20220126013218816.commit
> 20220126013218816.commit.requested
> 20220126013218816.inflight
> 20220126013428875.clean
> 20220126013428875.clean.inflight
> 20220126013428875.clean.requested
> 20220126013648751.commit
> 20220126013648751.commit.requested
> 20220126013648751.inflight
> 20220126013859643.clean
> 20220126013859643.clean.inflight
> 20220126013859643.clean.requested
> 20220126014254294.commit
> 20220126014254294.commit.requested
> 20220126014254294.inflight
> 20220126014516195.clean
> 20220126014516195.clean.inflight
> 20220126014516195.clean.requested
> 20220126014711043.commit
> 20220126014711043.commit.requested
> 20220126014711043.inflight
> 20220126014808898.clean
> 20220126014808898.clean.inflight
> 20220126014808898.clean.requested
> 20220126015008443.commit
> 20220126015008443.commit.requested
> 20220126015008443.inflight
> 20220126015119193.replacecommit
> 20220126015119193.replacecommit.inflight
> 20220126015119193.replacecommit.requested
> 20220126015653770.commit
> 20220126015653770.commit.requested
> 20220126015653770.inflight
> 20220126020011172.commit
> 20220126020011172.commit.requested
> 20220126020011172.inflight
> 20220126020405299.commit
> 20220126020405299.commit.requested
> 20220126020405299.inflight
> 20220126020813841.commit
> 20220126020813841.commit.requested
> 20220126020813841.inflight
> 20220126021002748.clean
> 20220126021002748.clean.inflight
> 20220126021002748.clean.requested
> 20220126021231085.commit
> 20220126021231085.commit.requested
> 20220126021231085.inflight
> 20220126021429124.clean
> 20220126021429124.clean.inflight
> 20220126021429124.clean.requested
> 20220126021445188.replacecommit
> 20220126021445188.replacecommit.inflight
> 20220126021445188.replacecommit.requested
> 20220126021949824.commit
> 20220126021949824.commit.requested
> 20220126021949824.inflight
> 20220126022154561.clean
> 20220126022154561.clean.inflight
> 20220126022154561.clean.requested
> 20220126022523011.commit
> 20220126022523011.commit.requested
> 20220126022523011.inflight
> 20220126023054200.commit
> 20220126023054200.commit.requested
> 20220126023054200.inflight
> 20220126023530250.commit
> 20220126023530250.commit.requested
> 20220126023530250.inflight
> 20220126023637109.clean
> 20220126023637109.clean.inflight
> 20220126023637109.clean.requested
> 20220126024028688.commit
> 20220126024028688.commit.requested
> 20220126024028688.inflight
> 20220126024137627.replacecommit
> 20220126024137627.replacecommit.inflight
> 20220126024137627.replacecommit.requested
> 20220126024720121.commit
> 20220126024720121.commit.requested
> 20220126024720121.inflight
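> For anyone triaging, the two timelines above can also be pulled programmatically rather than from raw S3 listings; a sketch against the Hudi 0.10.x API (method names as in that release, worth re-checking against the exact version in use):
> import org.apache.hadoop.conf.Configuration
> import org.apache.hudi.common.table.HoodieTableMetaClient
> import scala.collection.JavaConverters._
> val conf = new Configuration()
> // data table timeline (.hoodie)
> val dataMeta = HoodieTableMetaClient.builder().setConf(conf).setBasePath("s3a://datalake-hudi/sessions").build()
> dataMeta.getActiveTimeline().filterCompletedInstants().getInstants().iterator().asScala.foreach(println)
> // metadata table timeline (.hoodie/metadata/.hoodie)
> val mdtMeta = HoodieTableMetaClient.builder().setConf(conf).setBasePath("s3a://datalake-hudi/sessions/.hoodie/metadata").build()
> mdtMeta.getActiveTimeline().filterCompletedInstants().getInstants().iterator().asScala.foreach(println)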
>  


