[ 
https://issues.apache.org/jira/browse/HUDI-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsha Teja Kanna updated HUDI-3066:
------------------------------------
    Description: 
After 'metadata table' is enabled, File listing takes long time.

If metadata is enabled on Reader side, it is taking even more time per file 
listing task.

Existing tables (COW) have inline clustering on and have many replace commits.

Logs seem to suggest the delay is in view.AbstractTableFileSystemView 

resetFileGroupsReplaced function or metadata.HoodieBackedTableMetadata

Also many log messages in AbstractHoodieLogRecordReader

 

2021-12-18 23:17:54,056 INFO view.AbstractTableFileSystemView: Took 4118 ms to 
read  136 instants, 9731 replaced file groups

2021-12-18 23:37:46,086 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,090 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.76_0-20-515
 at instant 20211217035105329
2021-12-18 23:37:46,090 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,094 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663',
 fileLen=0}
2021-12-18 23:37:46,095 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.20_0-35-613',
 fileLen=0}
2021-12-18 23:37:46,095 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,096 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.62_0-34-377
 at instant 20211217022049877
2021-12-18 23:37:46,096 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,105 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362',
 fileLen=0}
2021-12-18 23:37:46,109 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663',
 fileLen=0}
2021-12-18 23:37:46,109 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,110 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590',
 fileLen=0}
2021-12-18 23:37:46,112 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.20_0-35-613
 at instant 20211216183448389
2021-12-18 23:37:46,112 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,118 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519',
 fileLen=0}
2021-12-18 23:37:46,122 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362',
 fileLen=0}
2021-12-18 23:37:46,122 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,123 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663
 at instant 20211217090337935
2021-12-18 23:37:46,123 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,127 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590',
 fileLen=0}
2021-12-18 23:37:46,127 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,141 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795',
 fileLen=0}
2021-12-18 23:37:46,143 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519',
 fileLen=0}
2021-12-18 23:37:46,143 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,144 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362
 at instant 20211217045206861
2021-12-18 23:37:46,144 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,145 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590
 at instant 20211217035058289
2021-12-18 23:37:46,145 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795',
 fileLen=0}
2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519
 at instant 20211217022306173
2021-12-18 23:37:46,162 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,170 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.87_0-35-433',
 fileLen=0}
2021-12-18 23:37:46,171 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.78_0-56-705',
 fileLen=0}
2021-12-18 23:37:46,178 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795
 at instant 20211216184106961
2021-12-18 23:37:46,178 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge

2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of log 
files scanned => 437
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: MaxMemoryInBytes 
allowed for compaction => 1073741824
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of 
entries in MemoryBasedMap in ExternalSpillableMap => 165
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Total size in 
bytes of MemoryBasedMap in ExternalSpillableMap => 259380
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of 
entries in BitCaskDiskMap in ExternalSpillableMap => 0
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Size of file 
spilled to disk => 0
2021-12-18 23:38:35,380 INFO metadata.HoodieBackedTableMetadata: Opened 437 
metadata log files (dataset instant=20211218233649435, metadata 
instant=20211218233649435) in 22935 ms

2021-12-18 23:38:37,193 INFO metadata.HoodieBackedTableMetadata: Opened 437 
metadata log files (dataset instant=20211218233649435, metadata 
instant=20211218233649435) in 22802 ms

Sample file listing tasks

!Screen Shot 2021-12-18 at 6.16.29 PM.png!

 

  was:
After 'metadata table' is enabled, File listing takes long time.

If metadata is enabled on Reader side, it is taking even more time per file 
listing task.

Existing tables (COW) have inline clustering on and have many replace commits.

Logs seem to suggest the delay is in view.AbstractTableFileSystemView 

resetFileGroupsReplaced function.

Also many log messages in AbstractHoodieLogRecordReader

 

2021-12-18 23:17:54,056 INFO view.AbstractTableFileSystemView: Took 4118 ms to 
read  136 instants, 9731 replaced file groups

2021-12-18 23:37:46,086 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,090 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.76_0-20-515
 at instant 20211217035105329
2021-12-18 23:37:46,090 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,094 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663',
 fileLen=0}
2021-12-18 23:37:46,095 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.20_0-35-613',
 fileLen=0}
2021-12-18 23:37:46,095 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,096 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.62_0-34-377
 at instant 20211217022049877
2021-12-18 23:37:46,096 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,105 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362',
 fileLen=0}
2021-12-18 23:37:46,109 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663',
 fileLen=0}
2021-12-18 23:37:46,109 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,110 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590',
 fileLen=0}
2021-12-18 23:37:46,112 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.20_0-35-613
 at instant 20211216183448389
2021-12-18 23:37:46,112 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,118 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519',
 fileLen=0}
2021-12-18 23:37:46,122 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362',
 fileLen=0}
2021-12-18 23:37:46,122 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,123 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663
 at instant 20211217090337935
2021-12-18 23:37:46,123 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,127 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590',
 fileLen=0}
2021-12-18 23:37:46,127 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,141 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795',
 fileLen=0}
2021-12-18 23:37:46,143 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519',
 fileLen=0}
2021-12-18 23:37:46,143 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,144 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362
 at instant 20211217045206861
2021-12-18 23:37:46,144 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,145 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590
 at instant 20211217035058289
2021-12-18 23:37:46,145 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Scanning log 
file 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795',
 fileLen=0}
2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519
 at instant 20211217022306173
2021-12-18 23:37:46,162 INFO s3a.S3AInputStream: Switching to Random IO seek 
policy
2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge 1
2021-12-18 23:37:46,170 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.87_0-35-433',
 fileLen=0}
2021-12-18 23:37:46,171 INFO log.HoodieLogFormatReader: Moving to the next 
reader for logfile 
HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.78_0-56-705',
 fileLen=0}
2021-12-18 23:37:46,178 INFO log.AbstractHoodieLogRecordReader: Reading a data 
block from file 
s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795
 at instant 20211216184106961
2021-12-18 23:37:46,178 INFO log.AbstractHoodieLogRecordReader: Number of 
remaining logblocks to merge

2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of log 
files scanned => 437
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: MaxMemoryInBytes 
allowed for compaction => 1073741824
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of 
entries in MemoryBasedMap in ExternalSpillableMap => 165
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Total size in 
bytes of MemoryBasedMap in ExternalSpillableMap => 259380
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of 
entries in BitCaskDiskMap in ExternalSpillableMap => 0
2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Size of file 
spilled to disk => 0
2021-12-18 23:38:35,380 INFO metadata.HoodieBackedTableMetadata: Opened 437 
metadata log files (dataset instant=20211218233649435, metadata 
instant=20211218233649435) in 22935 ms

2021-12-18 23:38:37,193 INFO metadata.HoodieBackedTableMetadata: Opened 437 
metadata log files (dataset instant=20211218233649435, metadata 
instant=20211218233649435) in 22802 ms

Sample file listing tasks

!Screen Shot 2021-12-18 at 6.16.29 PM.png!

 


> Very slow file listing after enabling metadata for existing tables in 0.10.0 
> release
> ------------------------------------------------------------------------------------
>
>                 Key: HUDI-3066
>                 URL: https://issues.apache.org/jira/browse/HUDI-3066
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>         Environment: EMR 6.4.0
> Hudi version : 0.10.0
>            Reporter: Harsha Teja Kanna
>            Priority: Critical
>              Labels: performance
>         Attachments: Screen Shot 2021-12-18 at 6.16.29 PM.png
>
>
> After 'metadata table' is enabled, File listing takes long time.
> If metadata is enabled on Reader side, it is taking even more time per file 
> listing task.
> Existing tables (COW) have inline clustering on and have many replace commits.
> Logs seem to suggest the delay is in view.AbstractTableFileSystemView 
> resetFileGroupsReplaced function or metadata.HoodieBackedTableMetadata
> Also many log messages in AbstractHoodieLogRecordReader
>  
> 2021-12-18 23:17:54,056 INFO view.AbstractTableFileSystemView: Took 4118 ms 
> to read  136 instants, 9731 replaced file groups
> 2021-12-18 23:37:46,086 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge 1
> 2021-12-18 23:37:46,090 INFO log.AbstractHoodieLogRecordReader: Reading a 
> data block from file 
> s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.76_0-20-515
>  at instant 20211217035105329
> 2021-12-18 23:37:46,090 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge 1
> 2021-12-18 23:37:46,094 INFO log.HoodieLogFormatReader: Moving to the next 
> reader for logfile 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663',
>  fileLen=0}
> 2021-12-18 23:37:46,095 INFO log.AbstractHoodieLogRecordReader: Scanning log 
> file 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.20_0-35-613',
>  fileLen=0}
> 2021-12-18 23:37:46,095 INFO s3a.S3AInputStream: Switching to Random IO seek 
> policy
> 2021-12-18 23:37:46,096 INFO log.AbstractHoodieLogRecordReader: Reading a 
> data block from file 
> s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.62_0-34-377
>  at instant 20211217022049877
> 2021-12-18 23:37:46,096 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge 1
> 2021-12-18 23:37:46,105 INFO log.HoodieLogFormatReader: Moving to the next 
> reader for logfile 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362',
>  fileLen=0}
> 2021-12-18 23:37:46,109 INFO log.AbstractHoodieLogRecordReader: Scanning log 
> file 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663',
>  fileLen=0}
> 2021-12-18 23:37:46,109 INFO s3a.S3AInputStream: Switching to Random IO seek 
> policy
> 2021-12-18 23:37:46,110 INFO log.HoodieLogFormatReader: Moving to the next 
> reader for logfile 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590',
>  fileLen=0}
> 2021-12-18 23:37:46,112 INFO log.AbstractHoodieLogRecordReader: Reading a 
> data block from file 
> s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.20_0-35-613
>  at instant 20211216183448389
> 2021-12-18 23:37:46,112 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge 1
> 2021-12-18 23:37:46,118 INFO log.HoodieLogFormatReader: Moving to the next 
> reader for logfile 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519',
>  fileLen=0}
> 2021-12-18 23:37:46,122 INFO log.AbstractHoodieLogRecordReader: Scanning log 
> file 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362',
>  fileLen=0}
> 2021-12-18 23:37:46,122 INFO s3a.S3AInputStream: Switching to Random IO seek 
> policy
> 2021-12-18 23:37:46,123 INFO log.AbstractHoodieLogRecordReader: Reading a 
> data block from file 
> s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.121_0-57-663
>  at instant 20211217090337935
> 2021-12-18 23:37:46,123 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge 1
> 2021-12-18 23:37:46,127 INFO log.AbstractHoodieLogRecordReader: Scanning log 
> file 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590',
>  fileLen=0}
> 2021-12-18 23:37:46,127 INFO s3a.S3AInputStream: Switching to Random IO seek 
> policy
> 2021-12-18 23:37:46,141 INFO log.HoodieLogFormatReader: Moving to the next 
> reader for logfile 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795',
>  fileLen=0}
> 2021-12-18 23:37:46,143 INFO log.AbstractHoodieLogRecordReader: Scanning log 
> file 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519',
>  fileLen=0}
> 2021-12-18 23:37:46,143 INFO s3a.S3AInputStream: Switching to Random IO seek 
> policy
> 2021-12-18 23:37:46,144 INFO log.AbstractHoodieLogRecordReader: Reading a 
> data block from file 
> s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.86_0-20-362
>  at instant 20211217045206861
> 2021-12-18 23:37:46,144 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge 1
> 2021-12-18 23:37:46,145 INFO log.AbstractHoodieLogRecordReader: Reading a 
> data block from file 
> s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.77_0-35-590
>  at instant 20211217035058289
> 2021-12-18 23:37:46,145 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge 1
> 2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Scanning log 
> file 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795',
>  fileLen=0}
> 2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Reading a 
> data block from file 
> s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.63_0-56-519
>  at instant 20211217022306173
> 2021-12-18 23:37:46,162 INFO s3a.S3AInputStream: Switching to Random IO seek 
> policy
> 2021-12-18 23:37:46,162 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge 1
> 2021-12-18 23:37:46,170 INFO log.HoodieLogFormatReader: Moving to the next 
> reader for logfile 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.87_0-35-433',
>  fileLen=0}
> 2021-12-18 23:37:46,171 INFO log.HoodieLogFormatReader: Moving to the next 
> reader for logfile 
> HoodieLogFile\{pathStr='s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.78_0-56-705',
>  fileLen=0}
> 2021-12-18 23:37:46,178 INFO log.AbstractHoodieLogRecordReader: Reading a 
> data block from file 
> s3a://datalake-hudi/sessions/.hoodie/metadata/files/.files-0000_20211216144130775001.log.21_0-56-795
>  at instant 20211216184106961
> 2021-12-18 23:37:46,178 INFO log.AbstractHoodieLogRecordReader: Number of 
> remaining logblocks to merge
> 2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of log 
> files scanned => 437
> 2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: 
> MaxMemoryInBytes allowed for compaction => 1073741824
> 2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of 
> entries in MemoryBasedMap in ExternalSpillableMap => 165
> 2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Total size in 
> bytes of MemoryBasedMap in ExternalSpillableMap => 259380
> 2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Number of 
> entries in BitCaskDiskMap in ExternalSpillableMap => 0
> 2021-12-18 23:38:35,380 INFO log.HoodieMergedLogRecordScanner: Size of file 
> spilled to disk => 0
> 2021-12-18 23:38:35,380 INFO metadata.HoodieBackedTableMetadata: Opened 437 
> metadata log files (dataset instant=20211218233649435, metadata 
> instant=20211218233649435) in 22935 ms
> 2021-12-18 23:38:37,193 INFO metadata.HoodieBackedTableMetadata: Opened 437 
> metadata log files (dataset instant=20211218233649435, metadata 
> instant=20211218233649435) in 22802 ms
> Sample file listing tasks
> !Screen Shot 2021-12-18 at 6.16.29 PM.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to