[ 
https://issues.apache.org/jira/browse/FLINK-21523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-21523:
----------------------------
    Fix Version/s: 1.12.3

> ArrayIndexOutOfBoundsException occurs while run a hive streaming job with 
> partitioned table source 
> ---------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-21523
>                 URL: https://issues.apache.org/jira/browse/FLINK-21523
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive
>    Affects Versions: 1.12.1
>            Reporter: zouyunhe
>            Assignee: zouyunhe
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0, 1.12.3
>
>
> We have two Hive tables; the DDL is as follows:
> {code:java}
> //test_tbl5
> create table test.test_5
>  (dpi int,     
>   uid bigint) 
> partitioned by( day string, hour string) stored as parquet;
> //test_tbl3
> create table test.test_3(
>   dpi int,
>  uid bigint,    
>  itime timestamp) stored as parquet;{code}
> Then we add a partition to test_tbl5:
> {code:java}
> alter table test_tbl5 add partition(day='2021-02-27',hour='12');
> {code}
> We start a Flink streaming job that reads from Hive table test_tbl5 and writes 
> the data into test_tbl3. The job's SQL is:
> {code:java}
> set test_tbl5.streaming-source.enable = true;
> insert into hive.test.test_tbl3 select dpi, uid, 
> cast(to_timestamp('2020-08-09 00:00:00') as timestamp(9)) from 
> hive.test.test_tbl5 where `day` = '2021-02-27';
> {code}
> Then the following exception is thrown:
> {code:java}
> 2021-02-28 22:33:16,553 ERROR 
> org.apache.flink.runtime.source.coordinator.SourceCoordinatorContext - 
> Exception while handling result from async call in SourceCoordinator-Source: 
> HiveSource-test.test_tbl5. Triggering job 
> failover.org.apache.flink.connectors.hive.FlinkHiveException: Failed to 
> enumerate files    at 
> org.apache.flink.connectors.hive.ContinuousHiveSplitEnumerator.handleNewSplits(ContinuousHiveSplitEnumerator.java:152)
>  ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]    at 
> org.apache.flink.runtime.source.coordinator.ExecutorNotifier.lambda$null$4(ExecutorNotifier.java:136)
>  ~[flink-dist_2.12-1.12.1.jar:1.12.1]    at 
> org.apache.flink.util.ThrowableCatchingRunnable.run(ThrowableCatchingRunnable.java:40)
>  [flink-dist_2.12-1.12.1.jar:1.12.1]    at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [?:1.8.0_60]    at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [?:1.8.0_60]    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]Caused 
> by: java.lang.ArrayIndexOutOfBoundsException: -1    at 
> org.apache.flink.connectors.hive.util.HivePartitionUtils.toHiveTablePartition(HivePartitionUtils.java:184)
>  ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]    at 
> org.apache.flink.connectors.hive.HiveTableSource$HiveContinuousPartitionFetcherContext.toHiveTablePartition(HiveTableSource.java:417)
>  ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]    at 
> org.apache.flink.connectors.hive.ContinuousHiveSplitEnumerator$PartitionMonitor.call(ContinuousHiveSplitEnumerator.java:237)
>  ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]    at 
> org.apache.flink.connectors.hive.ContinuousHiveSplitEnumerator$PartitionMonitor.call(ContinuousHiveSplitEnumerator.java:177)
>  ~[flink-connector-hive_2.12-1.12.1.jar:1.12.1]    at 
> org.apache.flink.runtime.source.coordinator.ExecutorNotifier.lambda$notifyReadyAsync$5(ExecutorNotifier.java:133)
>  ~[flink-dist_2.12-1.12.1.jar:1.12.1]    at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_60]    at 
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> ~[?:1.8.0_60]    at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  ~[?:1.8.0_60]    at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  ~[?:1.8.0_60]    ... 3 more{code}
> It seems the partition field is not found in the source table's field list.
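For context, the stack trace is consistent with a schema lookup returning -1 and that value then being used as an array index. A minimal, hypothetical Java sketch of this failure mode (the class, method, and field names below are illustrative, not the actual `HivePartitionUtils` internals):

```java
import java.util.Arrays;
import java.util.List;

public class PartitionLookupSketch {
    // Mimics resolving a partition column's position in the source schema.
    // When the partition column is absent from the field list, indexOf
    // returns -1, and indexing the row with it throws
    // ArrayIndexOutOfBoundsException: -1 -- matching the reported trace.
    static Object partitionValue(List<String> fieldNames, Object[] row, String partitionKey) {
        int idx = fieldNames.indexOf(partitionKey); // -1 when the key is missing
        return row[idx];                            // throws for idx == -1
    }

    public static void main(String[] args) {
        // Field list without the partition columns, as in the report.
        List<String> fields = Arrays.asList("dpi", "uid");
        Object[] row = {1, 100L};
        try {
            partitionValue(fields, row, "day");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("ArrayIndexOutOfBoundsException for index -1");
        }
    }
}
```

This would explain why the job fails only for the partitioned source table: the non-partitioned test_tbl3 never triggers the lookup.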



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
