[ 
https://issues.apache.org/jira/browse/HIVE-28918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-28918:
----------------------------------
    Description: 
==> Iceberg read <== 
{code:java}
2025-04-18T05:03:35.740Z query-executor-1ad7f85b25ff-0 query-executor 1 
9b6c97bb-db94-4f56-84e2-2aaa504dcc57 [mdc@38374 class="lib.MRReaderMapred" 
dagId="dag_1744952592909_0000_1" fragmentId="1744952592909_0000_1_00_000045_0" 
level="INFO" queryId="hive_20250418045502_7d0fdac2-fcb1-4374-b633-1ad7f85b25ff" 
thread="TezTR-592909_0_1_0_45_0"] Processing split: 
TezGroupedSplit{wrappedSplits=[org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0,
 org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0], 
wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', 
locations=[*], rack='null', length=255065964}

{code}
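Expected: each wrapped split should be logged as `HiveIcebergInputFormat:HiveIcebergSplit` rather than `HiveIcebergInputFormat:null:0+0`, in the same way `OrcInputFormat:OrcSplit` appears in the non-Iceberg log below.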
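
To illustrate the kind of log output this issue asks for, here is a minimal, self-contained sketch. It is not Hive or Iceberg code: `FileRange`, `FileListingSplit`, `describe`, `ExampleInputFormat`, and the file paths are all hypothetical names chosen for the example. It only shows one way a split's `toString()` could name the files it covers, so that a "Processing split:" line carries paths and offsets instead of `null:0+0`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitLogSketch {

    // Hypothetical stand-in for one file range read by a split
    // (not a Hive or Iceberg class).
    static final class FileRange {
        final String path;
        final long start;
        final long length;

        FileRange(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return path + ", start=" + start + ", length=" + length;
        }
    }

    // Hypothetical split whose toString() lists every file it covers.
    static final class FileListingSplit {
        final List<FileRange> files;

        FileListingSplit(List<FileRange> files) {
            this.files = new ArrayList<>(files);
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder("FileListingSplit [");
            for (int i = 0; i < files.size(); i++) {
                if (i > 0) {
                    sb.append("; ");
                }
                sb.append(files.get(i));
            }
            return sb.append("]").toString();
        }
    }

    // Mirrors the "<input format>:<split>" shape seen in the log lines above.
    static String describe(String inputFormatName, Object split) {
        return inputFormatName + ":" + split;
    }

    public static void main(String[] args) {
        // Placeholder paths, invented for the example.
        FileListingSplit split = new FileListingSplit(Arrays.asList(
            new FileRange("/warehouse/example_tbl/data/file-a.orc", 0, 1024),
            new FileRange("/warehouse/example_tbl/data/file-b.orc", 0, 2048)));
        System.out.println("Processing split: " + describe("ExampleInputFormat", split));
    }
}
```

Putting the file list in `toString()` means any caller that logs the split picks up the extra detail without changes to the logging call itself. Whether the actual fix takes this route is not stated in the issue.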

==> Non-Iceberg read <==
{code:java}
2025-04-16 15:04:33,956 [INFO] [TezChild] |lib.MRReaderMapred|: Processing 
split: 
TezGroupedSplit{wrappedSplits=[org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit
 
[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000344_0],
 start=0, length=33812732, isOriginal=true, fileLength=33812732, 
hasFooter=false, hasBase=true, deltas=0], 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit 
[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000363_0],
 start=0, length=29508271, isOriginal=true, fileLength=29508271, 
hasFooter=false, hasBase=true, deltas=0]], 
wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', 
locations=[uwcimahdn001b.starwave.com], rack='null', length=63321003}

{code}

  was:
==> Iceberg read <== 
{code:java}
2025-04-18T05:03:35.740Z query-executor-1ad7f85b25ff-0 query-executor 1 
9b6c97bb-db94-4f56-84e2-2aaa504dcc57 [mdc@38374 class="lib.MRReaderMapred" 
dagId="dag_1744952592909_0000_1" fragmentId="1744952592909_0000_1_00_000045_0" 
level="INFO" queryId="hive_20250418045502_7d0fdac2-fcb1-4374-b633-1ad7f85b25ff" 
thread="TezTR-592909_0_1_0_45_0"] Processing split: 
TezGroupedSplit{wrappedSplits=[org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0,
 org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0], 
wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', 
locations=[*], rack='null', length=255065964}

{code}
should be `{color:#FF0000}_HiveIcebergInputFormat:HiveIcebergSplit_{color}`

==> non iceberg <==
{code:java}
2025-04-16 15:04:33,956 [INFO] [TezChild] |lib.MRReaderMapred|: Processing 
split: 
TezGroupedSplit{wrappedSplits=[org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit
 
[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000344_0],
 start=0, length=33812732, isOriginal=true, fileLength=33812732, 
hasFooter=false, hasBase=true, deltas=0], 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit 
[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000363_0],
 start=0, length=29508271, isOriginal=true, fileLength=29508271, 
hasFooter=false, hasBase=true, deltas=0]], 
wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', 
locations=[uwcimahdn001b.starwave.com], rack='null', length=63321003}

{code}


> Log files processed by each split in Tez tableScan Map task
> -----------------------------------------------------------
>
>                 Key: HIVE-28918
>                 URL: https://issues.apache.org/jira/browse/HIVE-28918
>             Project: Hive
>          Issue Type: Improvement
>          Components: Iceberg integration
>            Reporter: Denys Kuzmenko
>            Priority: Major
>
> ==> Iceberg read <== 
> {code:java}
> 2025-04-18T05:03:35.740Z query-executor-1ad7f85b25ff-0 query-executor 1 
> 9b6c97bb-db94-4f56-84e2-2aaa504dcc57 [mdc@38374 class="lib.MRReaderMapred" 
> dagId="dag_1744952592909_0000_1" 
> fragmentId="1744952592909_0000_1_00_000045_0" level="INFO" 
> queryId="hive_20250418045502_7d0fdac2-fcb1-4374-b633-1ad7f85b25ff" 
> thread="TezTR-592909_0_1_0_45_0"] Processing split: 
> TezGroupedSplit{wrappedSplits=[org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0,
>  org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0], 
> wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', 
> locations=[*], rack='null', length=255065964}
> {code}
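> Expected: each wrapped split should be logged as `HiveIcebergInputFormat:HiveIcebergSplit` rather than `HiveIcebergInputFormat:null:0+0`, in the same way `OrcInputFormat:OrcSplit` appears in the non-Iceberg log below.
>
> ==> Non-Iceberg read <==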
> {code:java}
> 2025-04-16 15:04:33,956 [INFO] [TezChild] |lib.MRReaderMapred|: Processing 
> split: 
> TezGroupedSplit{wrappedSplits=[org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit
>  
> [[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000344_0],
>  start=0, length=33812732, isOriginal=true, fileLength=33812732, 
> hasFooter=false, hasBase=true, deltas=0], 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit 
> [[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000363_0],
>  start=0, length=29508271, isOriginal=true, fileLength=29508271, 
> hasFooter=false, hasBase=true, deltas=0]], 
> wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat', 
> locations=[uwcimahdn001b.starwave.com], rack='null', length=63321003}
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)