[ 
https://issues.apache.org/jira/browse/HUDI-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kudinkin reassigned HUDI-5276:
-------------------------------------

    Assignee: Ethan Guo  (was: Alexey Kudinkin)

> Hudi getAllQueryPartitionPaths use regular match caused Invalid input path 
> add 
> -------------------------------------------------------------------------------
>
>                 Key: HUDI-5276
>                 URL: https://issues.apache.org/jira/browse/HUDI-5276
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: yuehanwang
>            Assignee: Ethan Guo
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>
>  
> When we run a query in Hive such as:
> select mainwaybillno,
> zonecode,
> accountantcode,
> baroprcode,
> opcode,
> row_number() over(PARTITION BY mainwaybillno, zonecode, opcode ORDER BY 
> barscantm) sn from dm_kafka_rdmp_dw.fvp_core_fact_route_op_hudi_op_new_rt 
> WHERE opcode IN ('50') and inc_day='20221120' limit 10;
> In the MapReduce job, the input directory is configured as 
> mapreduce.input.fileinputformat.inputdir=hdfs://dw/hive/warehouse/dm/dm_kafka_rdmp_dw/fvp_core_fact_route_op_hudi_op_new/inc_day=20221120/opcode=50
> However, a file split from 
> hdfs://dw/hive/warehouse/dm/dm_kafka_rdmp_dw/fvp_core_fact_route_op_hudi_op_new/inc_day=20221120/opcode=5000
>  was also added to the job.
> The job failed and threw the following exception:
> 2022-11-21 18:11:33,895 INFO [IPC Server handler 1 on 45077] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from 
> attempt_1668750926041_1011874_m_000110_0: Error: java.lang.RuntimeException: 
> java.lang.IllegalStateException: Invalid input path 
> hdfs://dw/hive/warehouse/dm/dm_kafka_rdmp_dw/fvp_core_fact_route_op_hudi_op_new/inc_day=20221120/opcode=501/.00000006-2d6e-4d26-93ea-1026632abb67_20221119235956333.log.1_44-150-2
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.IllegalStateException: Invalid input path 
> hdfs://dw/hive/warehouse/dm/dm_kafka_rdmp_dw/fvp_core_fact_route_op_hudi_op_new/inc_day=20221120/opcode=501/.00000006-2d6e-4d26-93ea-1026632abb67_20221119235956333.log.1_44-150-2
> at 
> org.apache.hadoop.hive.ql.exec.AbstractMapOperator.getNominalPath(AbstractMapOperator.java:119)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.cleanUpInputFileChangedOp(MapOperator.java:452)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1106)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:482)
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
> ... 8 more
> 2022-11-21 18:11:33,897 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1668750926041_1011874_m_000110_0: Error: 
> java.lang.RuntimeException: java.lang.IllegalStateException: Invalid input 
> path 
> hdfs://dw/hive/warehouse/dm/dm_kafka_rdmp_dw/fvp_core_fact_route_op_hudi_op_new/inc_day=20221120/opcode=501/.00000006-2d6e-4d26-93ea-1026632abb67_20221119235956333.log.1_44-150-2
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.IllegalStateException: Invalid input path 
> hdfs://dw/hive/warehouse/dm/dm_kafka_rdmp_dw/fvp_core_fact_route_op_hudi_op_new/inc_day=20221120/opcode=501/.00000006-2d6e-4d26-93ea-1026632abb67_20221119235956333.log.1_44-150-2
> at 
> org.apache.hadoop.hive.ql.exec.AbstractMapOperator.getNominalPath(AbstractMapOperator.java:119)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.cleanUpInputFileChangedOp(MapOperator.java:452)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1106)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:482)
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
> ... 8 more
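
The report above describes partitions opcode=5000 and opcode=501 being picked up for a query on opcode=50. The following is a minimal sketch of how that can happen, assuming the partition selection behaves like a plain string-prefix test; the actual Hudi code is not shown in this message, and the class and method names below are hypothetical. It contrasts the prefix test with a comparison that respects the path separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the Hudi implementation.
final class PartitionPathMatch {

    // Loose selection: any partition whose path starts with the query path.
    // "opcode=50" is a string prefix of "opcode=5000" and "opcode=501".
    static List<String> selectByPrefix(List<String> partitions, String queryPath) {
        List<String> selected = new ArrayList<>();
        for (String partition : partitions) {
            if (partition.startsWith(queryPath)) {
                selected.add(partition);
            }
        }
        return selected;
    }

    // Stricter selection: the partition must equal the query path or be a
    // descendant of it, i.e. the match must end on a '/' boundary.
    static List<String> selectByPathBoundary(List<String> partitions, String queryPath) {
        String base = queryPath.endsWith("/")
                ? queryPath.substring(0, queryPath.length() - 1)
                : queryPath;
        List<String> selected = new ArrayList<>();
        for (String partition : partitions) {
            if (partition.equals(base) || partition.startsWith(base + "/")) {
                selected.add(partition);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Relative partition paths taken from the report above.
        List<String> partitions = Arrays.asList(
                "inc_day=20221120/opcode=50",
                "inc_day=20221120/opcode=5000",
                "inc_day=20221120/opcode=501");
        String queryPath = "inc_day=20221120/opcode=50";

        System.out.println("prefix match:   " + selectByPrefix(partitions, queryPath));
        System.out.println("boundary match: " + selectByPathBoundary(partitions, queryPath));
    }
}
```

With the prefix test all three partitions are selected, while the boundary-aware test keeps only inc_day=20221120/opcode=50, which is the input directory the job was configured with.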



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
