Denys Kuzmenko created HIVE-28918:
-------------------------------------

             Summary: Log files processed by each split in Tez tableScan Map task
                 Key: HIVE-28918
                 URL: https://issues.apache.org/jira/browse/HIVE-28918
             Project: Hive
          Issue Type: Improvement
          Components: Iceberg integration
            Reporter: Denys Kuzmenko


==> Iceberg read <==

{code}

2025-04-18T05:03:35.740Z query-executor-1ad7f85b25ff-0 query-executor 1 9b6c97bb-db94-4f56-84e2-2aaa504dcc57 [mdc@38374 class="lib.MRReaderMapred" dagId="dag_1744952592909_0000_1" fragmentId="1744952592909_0000_1_00_000045_0" level="INFO" queryId="hive_20250418045502_7d0fdac2-fcb1-4374-b633-1ad7f85b25ff" thread="TezTR-592909_0_1_0_45_0"] Processing split: TezGroupedSplit{wrappedSplits=[org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0, org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0], wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', locations=[*], rack='null', length=255065964}

{code}

 

==> Non-Iceberg read <==

 

{code}

2025-04-16 15:04:33,956 [INFO] [TezChild] |lib.MRReaderMapred|: Processing split: TezGroupedSplit{wrappedSplits=[org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000344_0], start=0, length=33812732, isOriginal=true, fileLength=33812732, hasFooter=false, hasBase=true, deltas=0], org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000363_0], start=0, length=29508271, isOriginal=true, fileLength=29508271, hasFooter=false, hasBase=true, deltas=0]], wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', locations=[uwcimahdn001b.starwave.com], rack='null', length=63321003}

{code}
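
The difference between the two log lines can be checked mechanically. The following is a rough sketch, not part of Hive or Tez: files_in_split is a hypothetical helper, and its regular expression assumes file paths appear inside "[[...]]" after "OrcSplit", as in the non-Iceberg line above. The sample strings are excerpts of the two log lines. For the Iceberg split it finds no paths, because each wrapped split is printed only as "null:0+0".

```python
import re
from typing import List

# Assumption: paths are printed as "OrcSplit [[<path>], ..." as in the ORC log line.
ORC_PATH = re.compile(r"OrcSplit\s*\[\[([^\]]+)\]")

# Excerpts of the "Processing split" payloads quoted in this issue.
ORC_SAMPLE = (
    "TezGroupedSplit{wrappedSplits=[org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit\n"
    "[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/"
    "date_partition=2025-04-13/000344_0], start=0, length=33812732], "
    "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit "
    "[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/"
    "date_partition=2025-04-13/000363_0], start=0, length=29508271]]"
)
ICEBERG_SAMPLE = (
    "TezGroupedSplit{wrappedSplits=["
    "org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0, "
    "org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0]"
)


def files_in_split(log_line: str) -> List[str]:
    """Hypothetical helper: return the file paths named in a split log line."""
    # Collapse whitespace runs so records wrapped across lines match as one line.
    flat = " ".join(log_line.split())
    return ORC_PATH.findall(flat)


if __name__ == "__main__":
    print(files_in_split(ORC_SAMPLE))      # two HDFS paths
    print(files_in_split(ICEBERG_SAMPLE))  # empty list: no paths are logged
```

A check like this only reads existing logs; it does not change what the Iceberg split prints.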



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
