[
https://issues.apache.org/jira/browse/HIVE-28918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Denys Kuzmenko updated HIVE-28918:
----------------------------------
Description:
==> Iceberg read <==
{code:java}
2025-04-18T05:03:35.740Z query-executor-1ad7f85b25ff-0 query-executor 1
9b6c97bb-db94-4f56-84e2-2aaa504dcc57 [mdc@38374 class="lib.MRReaderMapred"
dagId="dag_1744952592909_0000_1" fragmentId="1744952592909_0000_1_00_000045_0"
level="INFO" queryId="hive_20250418045502_7d0fdac2-fcb1-4374-b633-1ad7f85b25ff"
thread="TezTR-592909_0_1_0_45_0"] Processing split:
TezGroupedSplit{wrappedSplits=[org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0,
org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0],
wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat',
locations=[*], rack='null', length=255065964}
{code}
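Each wrapped split is logged as `HiveIcebergInputFormat:null:0+0`, which shows neither the split type nor the files it covers; it should be `{color:#172b4d}*_HiveIcebergInputFormat:HiveIcebergSplit_*{color}`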
==> non iceberg <==
{code:java}
2025-04-16 15:04:33,956 [INFO] [TezChild] |lib.MRReaderMapred|: Processing
split:
TezGroupedSplit{wrappedSplits=[org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit
[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000344_0],
start=0, length=33812732, isOriginal=true, fileLength=33812732,
hasFooter=false, hasBase=true, deltas=0],
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit
[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000363_0],
start=0, length=29508271, isOriginal=true, fileLength=29508271,
hasFooter=false, hasBase=true, deltas=0]],
wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat',
locations=[uwcimahdn001b.starwave.com], rack='null', length=63321003}
{code}
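The `null:0+0` suffix has the same `path:start+length` shape that Hadoop's `FileSplit.toString()` produces, which suggests the Iceberg split never sets the base path, start, and length. Below is a minimal sketch of that behaviour and of one possible direction for a fix. Every class name and file path in it is a hypothetical stand-in, not the real Hadoop, Hive, Tez, or Iceberg code, and the actual patch may look different.

```java
import java.util.List;

// Hypothetical stand-ins only: none of these are the real Hadoop, Hive, Tez, or Iceberg classes.
class SplitLoggingSketch {
    public static void main(String[] args) {
        String format = "org.apache.iceberg.mr.hive.HiveIcebergInputFormat";

        // Base fields never set, so the inherited toString() has nothing useful to print.
        SketchFileSplit bare = new SketchFileSplit(null, 0L, 0L);
        System.out.println(new SketchWrappedSplit(format, bare));

        // Made-up file names, purely for illustration.
        SketchFileSplit descriptive = new SketchIcebergSplit(
            List.of("/warehouse/tbl/data/a.parquet", "/warehouse/tbl/data/b.parquet"));
        System.out.println(new SketchWrappedSplit(format, descriptive));
    }
}

// Mimics a file-based split whose toString() is "path:start+length".
class SketchFileSplit {
    protected final String file;
    protected final long start;
    protected final long length;

    SketchFileSplit(String file, long start, long length) {
        this.file = file;
        this.start = start;
        this.length = length;
    }

    @Override
    public String toString() {
        return file + ":" + start + "+" + length;
    }
}

// Assumed fix: the split describes itself by type and by the files it covers.
class SketchIcebergSplit extends SketchFileSplit {
    private final List<String> dataFiles;

    SketchIcebergSplit(List<String> dataFiles) {
        super(null, 0L, 0L); // base fields stay unset, as in the logged output
        this.dataFiles = List.copyOf(dataFiles);
    }

    @Override
    public String toString() {
        return "HiveIcebergSplit " + dataFiles;
    }
}

// Mimics a wrapper that logs "inputFormatClassName:split".
class SketchWrappedSplit {
    private final String inputFormatName;
    private final SketchFileSplit split;

    SketchWrappedSplit(String inputFormatName, SketchFileSplit split) {
        this.inputFormatName = inputFormatName;
        this.split = split;
    }

    @Override
    public String toString() {
        return inputFormatName + ":" + split;
    }
}
```

With the override in place, the wrapped entry carries the split type and its files, in the same spirit as the `OrcInputFormat:OrcSplit [...]` entries in the non-Iceberg log.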
was:
==> Iceberg read <==
{code:java}
2025-04-18T05:03:35.740Z query-executor-1ad7f85b25ff-0 query-executor 1
9b6c97bb-db94-4f56-84e2-2aaa504dcc57 [mdc@38374 class="lib.MRReaderMapred"
dagId="dag_1744952592909_0000_1" fragmentId="1744952592909_0000_1_00_000045_0"
level="INFO" queryId="hive_20250418045502_7d0fdac2-fcb1-4374-b633-1ad7f85b25ff"
thread="TezTR-592909_0_1_0_45_0"] Processing split:
TezGroupedSplit{wrappedSplits=[org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0,
org.apache.iceberg.mr.hive.HiveIcebergInputFormat:null:0+0],
wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat',
locations=[*], rack='null', length=255065964}
{code}
should be `{color:#FF0000}_HiveIcebergInputFormat:HiveIcebergSplit_{color}`
==> non iceberg <==
{code:java}
2025-04-16 15:04:33,956 [INFO] [TezChild] |lib.MRReaderMapred|: Processing
split:
TezGroupedSplit{wrappedSplits=[org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit
[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000344_0],
start=0, length=33812732, isOriginal=true, fileLength=33812732,
hasFooter=false, hasBase=true, deltas=0],
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit
[[hdfs://cima-prod-b/data/hive/warehouse/dtoa_raw.db/wdpr_best_smry_gst/date_partition=2025-04-13/000363_0],
start=0, length=29508271, isOriginal=true, fileLength=29508271,
hasFooter=false, hasBase=true, deltas=0]],
wrappedInputFormatName='org.apache.hadoop.hive.ql.io.HiveInputFormat',
locations=[uwcimahdn001b.starwave.com], rack='null', length=63321003}
{code}
> Log files processed by each split in Tez tableScan Map task
> -----------------------------------------------------------
>
> Key: HIVE-28918
> URL: https://issues.apache.org/jira/browse/HIVE-28918
> Project: Hive
> Issue Type: Improvement
> Components: Iceberg integration
> Reporter: Denys Kuzmenko
> Priority: Major