[ 
https://issues.apache.org/jira/browse/FLINK-19635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233629#comment-17233629
 ] 

Leonard Xu commented on FLINK-19635:
------------------------------------

 
{code:java}
04:35:14,975 [hbase-upsert-sink-flusher-thread-1] ERROR org.apache.hadoop.hbase.client.AsyncProcess [] - Failed to get region location
org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x932344b closed
    at org.apache.hadoop.hbase.client.ConnectionImplementation.checkClosed(ConnectionImplementation.java:591) ~[hbase-client-2.2.3.jar:2.2.3]
    at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:776) ~[hbase-client-2.2.3.jar:2.2.3]
    at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:273) ~[hbase-client-2.2.3.jar:2.2.3]
    at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:228) ~[hbase-client-2.2.3.jar:2.2.3]
    at org.apache.hadoop.hbase.client.BufferedMutatorImpl.doFlush(BufferedMutatorImpl.java:303) ~[hbase-client-2.2.3.jar:2.2.3]
    at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:280) ~[hbase-client-2.2.3.jar:2.2.3]
    at org.apache.flink.connector.hbase.sink.HBaseSinkFunction.flush(HBaseSinkFunction.java:189) ~[flink-connector-hbase-base_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
    at org.apache.flink.connector.hbase.sink.HBaseSinkFunction.lambda$open$0(HBaseSinkFunction.java:134) ~[flink-connector-hbase-base_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_242]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_242]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
04:35:14,975 [flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FINISHED to JobManager for task Source: TableSourceScan(table=[[default_catalog, default_database, testTable1]], fields=[rowkey, family1, family2, family3, family4]) -> Sink: Sink(table=[default_catalog.default_database.testTable3], fields=[rowkey, family1, family2, family3, family4]) (1/32) 65f7b61a1994ee9b341a2a7f5c6cca3a_cbc357ccb763df2852fee8c4fc7d55f2_0_0.
04:35:14,976 [flink-akka.actor.default-dispatcher-4] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: TableSourceScan(table=[[default_catalog, default_database, testTable1]], fields=[rowkey, family1, family2, family3, family4]) -> Sink: Sink(table=[default_catalog.default_database.testTable3], fields=[rowkey, family1, family2, family3, family4]) (1/32) (65f7b61a1994ee9b341a2a7f5c6cca3a_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from RUNNING to FINISHED.
04:35:14,986 [RS-EventLoopGroup-3-54] INFO SecurityLogger.org.apache.hadoop.hbase.Server [] - Connection from 192.168.192.2:49496, version=2.2.3, sasl=false, ugi=agent01_azpcontainer (auth:SIMPLE), service=ClientService
04:35:15,016 [htable-pool146-t1] WARN org.apache.hadoop.hbase.client.AsyncRequestFutureImpl [] - id=175, table=testTable3, attempt=1/16, failureCount=1ops, last exception=org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x932344b closed on 295934f72d17,36709,1602650080067, tracking started Wed Oct 14 04:35:14 UTC 2020; *NOT retrying, failed=1 -- final attempt!*
{code}

 

From the failure log, I think the result mismatch occurs because one source
task failed to scan the data. We set *hbase.client.retries.number* to "*1*",
which means the client does not retry, whereas the default value is "*35*" in
hbase-1.4 and "*15*" in hbase-2.2. We should at least allow the HBase client
to retry.
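
As a rough illustration of the proposed change, here is a minimal sketch of a test-side override that refuses a value of 1 for *hbase.client.retries.number*. The class and the helper clientOverrides are hypothetical names used only for this example, and the value 15 is just the hbase-2.2 default quoted above, not a setting taken from the actual fix. In the real test the value would be applied to the Hadoop Configuration instead of a plain map, for example via Configuration.setInt.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HBaseClientRetryOverrides {
    // Real HBase client property key, as named in the comment above.
    static final String RETRIES_KEY = "hbase.client.retries.number";

    // Hypothetical helper: builds the client overrides for the test cluster.
    // It rejects values of 1 or less, which (per the comment above) disable retries.
    static Map<String, String> clientOverrides(int retries) {
        if (retries <= 1) {
            throw new IllegalArgumentException(
                RETRIES_KEY + " must be greater than 1 to allow retries, got " + retries);
        }
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put(RETRIES_KEY, Integer.toString(retries));
        return overrides;
    }

    public static void main(String[] args) {
        // 15 is the hbase-2.2 default mentioned in the comment; an assumption here.
        clientOverrides(15).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Running main prints hbase.client.retries.number=15. The guard only encodes the point of the comment, that the test should not pin the client to a no-retry setting.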

 

> HBaseConnectorITCase.testTableSourceSinkWithDDL is unstable with a result 
> mismatch
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-19635
>                 URL: https://issues.apache.org/jira/browse/FLINK-19635
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / HBase
>    Affects Versions: 1.12.0
>            Reporter: Robert Metzger
>            Assignee: Leonard Xu
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.12.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=7562&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20
> {code}
> 2020-10-14T04:35:36.9268975Z testTableSourceSinkWithDDL[planner = 
> BLINK_PLANNER, legacy = 
> false](org.apache.flink.connector.hbase2.HBaseConnectorITCase)  Time elapsed: 
> 3.131 sec  <<< FAILURE!
> 2020-10-14T04:35:36.9276246Z java.lang.AssertionError: 
> expected:<[1,10,Hello-1,100,1.01,false,Welt-1,2019-08-18T19:00,2019-08-18,19:00,12345678.0001,
>  
> 2,20,Hello-2,200,2.02,true,Welt-2,2019-08-18T19:01,2019-08-18,19:01,12345678.0002,
>  
> 3,30,Hello-3,300,3.03,false,Welt-3,2019-08-18T19:02,2019-08-18,19:02,12345678.0003,
>  
> 4,40,null,400,4.04,true,Welt-4,2019-08-18T19:03,2019-08-18,19:03,12345678.0004,
>  
> 5,50,Hello-5,500,5.05,false,Welt-5,2019-08-19T19:10,2019-08-19,19:10,12345678.0005,
>  
> 6,60,Hello-6,600,6.06,true,Welt-6,2019-08-19T19:20,2019-08-19,19:20,12345678.0006,
>  
> 7,70,Hello-7,700,7.07,false,Welt-7,2019-08-19T19:30,2019-08-19,19:30,12345678.0007,
>  
> 8,80,null,800,8.08,true,Welt-8,2019-08-19T19:40,2019-08-19,19:40,12345678.0008]>
>  but 
> was:<[1,10,Hello-1,100,1.01,false,Welt-1,2019-08-18T19:00,2019-08-18,19:00,12345678.0001,
>  
> 2,20,Hello-2,200,2.02,true,Welt-2,2019-08-18T19:01,2019-08-18,19:01,12345678.0002,
>  
> 3,30,Hello-3,300,3.03,false,Welt-3,2019-08-18T19:02,2019-08-18,19:02,12345678.0003]>
> 2020-10-14T04:35:36.9281340Z  at org.junit.Assert.fail(Assert.java:88)
> 2020-10-14T04:35:36.9282023Z  at 
> org.junit.Assert.failNotEquals(Assert.java:834)
> 2020-10-14T04:35:36.9328385Z  at 
> org.junit.Assert.assertEquals(Assert.java:118)
> 2020-10-14T04:35:36.9338939Z  at 
> org.junit.Assert.assertEquals(Assert.java:144)
> 2020-10-14T04:35:36.9339880Z  at 
> org.apache.flink.connector.hbase2.HBaseConnectorITCase.testTableSourceSinkWithDDL(HBaseConnectorITCase.java:449)
> 2020-10-14T04:35:36.9341003Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
