[ 
https://issues.apache.org/jira/browse/HDFS-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-6533:
----------------------------------
    Attachment: HDFS-6533.001.patch

Rev01: proof of concept.

There are two issues here:
(1) Upon exiting, the test interrupts the BPServiceActor threads but does not 
wait for them to finish. This is error-prone, and it lets the output of one 
test leak into the output of the tests that run after it.

(2) The direct cause of the test failure is that the test does not properly 
wait for the actors to register with both name nodes. The method 
BPOfferService.isInitialized() returns true as soon as any one of the actors 
has registered with its name node. I modified the wait-for condition to 
explicitly wait for the registrations with both name nodes.
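
To illustrate both points outside of Hadoop, here is a minimal, self-contained 
sketch. It is not the patch itself: FakeActor, waitFor, anyRegistered, 
allRegistered and stopAndJoin are hypothetical names made up for this example, 
and the actors are plain threads that only simulate a registration delay. The 
waitFor helper is a hand-rolled stand-in for the kind of polling helper that 
GenericTestUtils.waitFor provides in the Hadoop test utilities.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class WaitForAllActors {

  /** Hypothetical stand-in for a BPServiceActor thread. */
  static class FakeActor extends Thread {
    final AtomicBoolean registered = new AtomicBoolean(false);
    private final long delayMs;

    FakeActor(String name, long delayMs) {
      super(name);
      this.delayMs = delayMs;
    }

    @Override
    public void run() {
      try {
        Thread.sleep(delayMs);   // simulated handshake with a name node
        registered.set(true);
        while (true) {
          Thread.sleep(10);      // simulated heartbeat loop
        }
      } catch (InterruptedException e) {
        // Interrupted by the test: leave run() so the thread terminates.
      }
    }
  }

  /** Polls the condition until it holds or the timeout expires. */
  static void waitFor(BooleanSupplier check, long pollMs, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline > 0) {
        throw new TimeoutException("condition not met in " + timeoutMs + " ms");
      }
      Thread.sleep(pollMs);
    }
  }

  /** Too weak for the test: true once a single actor has registered. */
  static boolean anyRegistered(List<FakeActor> actors) {
    return actors.stream().anyMatch(a -> a.registered.get());
  }

  /** The condition the test actually needs: every actor has registered. */
  static boolean allRegistered(List<FakeActor> actors) {
    return actors.stream().allMatch(a -> a.registered.get());
  }

  /** Issue (1): interrupt the actors, then wait for each one to exit. */
  static void stopAndJoin(List<FakeActor> actors, long joinMs)
      throws InterruptedException {
    for (FakeActor a : actors) {
      a.interrupt();
    }
    for (FakeActor a : actors) {
      a.join(joinMs);
    }
  }

  public static void main(String[] args) throws Exception {
    List<FakeActor> actors = Arrays.asList(
        new FakeActor("actor-nn1", 20), new FakeActor("actor-nn2", 200));
    actors.forEach(Thread::start);

    // Issue (2): wait on allRegistered, not anyRegistered.
    waitFor(() -> allRegistered(actors), 10, 5000);
    System.out.println("all registered: " + allRegistered(actors));

    stopAndJoin(actors, 5000);
    System.out.println("any alive: " + actors.stream().anyMatch(Thread::isAlive));
  }
}
```

With the two different delays, a wait on anyRegistered could return while the 
second actor is still in its handshake, which matches the "zero interactions 
with this mock" symptom in the report. Joining after the interrupt keeps the 
actors' log output from spilling into whichever test runs next.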

> intermittent 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBasicFunctionality
>  test failure 
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6533
>                 URL: https://issues.apache.org/jira/browse/HDFS-6533
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs-client
>    Affects Versions: 2.4.0
>            Reporter: Yongjun Zhang
>            Assignee: Wei-Chiu Chuang
>         Attachments: HDFS-6533.001.patch
>
>
> Per https://builds.apache.org/job/Hadoop-Hdfs-trunk/1774/testReport, the 
> following test failed. However, a local rerun succeeds.
> {code}
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBasicFunctionality
> Error Message
> Wanted but not invoked:
> datanodeProtocolClientSideTranslatorPB.registerDatanode(
>     <any>
> );
> -> at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBasicFunctionality(TestBPOfferService.java:175)
> Actually, there were zero interactions with this mock.
> Stacktrace
> org.mockito.exceptions.verification.WantedButNotInvoked: 
> Wanted but not invoked:
> datanodeProtocolClientSideTranslatorPB.registerDatanode(
>     <any>
> );
> -> at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBasicFunctionality(TestBPOfferService.java:175)
> Actually, there were zero interactions with this mock.
>       at 
> org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBasicFunctionality(TestBPOfferService.java:175)
> Standard Output
> 2014-06-14 12:42:08,723 INFO  datanode.DataNode 
> (SimulatedFSDataset.java:registerMBean(968)) - Registered FSDatasetState MBean
> 2014-06-14 12:42:08,730 INFO  datanode.DataNode 
> (BPServiceActor.java:run(805)) - Block pool <registering> (Datanode Uuid 
> unassigned) service to 0.0.0.0/0.0.0.0:0 starting to offer service
> 2014-06-14 12:42:08,730 DEBUG datanode.DataNode 
> (BPServiceActor.java:retrieveNamespaceInfo(170)) - Block pool <registering> 
> (Datanode Uuid unassigned) service to 0.0.0.0/0.0.0.0:0 received 
> versionRequest response: lv=-57;cid=fake cluster;nsid=1;c=0;bpid=fake bpid
> 2014-06-14 12:42:08,731 INFO  datanode.DataNode 
> (BPServiceActor.java:register(765)) - Block pool fake bpid (Datanode Uuid 
> null) service to 0.0.0.0/0.0.0.0:0 beginning handshake with NN
> 2014-06-14 12:42:08,731 INFO  datanode.DataNode 
> (BPServiceActor.java:register(778)) - Block pool Block pool fake bpid 
> (Datanode Uuid null) service to 0.0.0.0/0.0.0.0:0 successfully registered 
> with NN
> 2014-06-14 12:42:08,732 INFO  datanode.DataNode 
> (BPServiceActor.java:offerService(637)) - For namenode 0.0.0.0/0.0.0.0:0 
> using DELETEREPORT_INTERVAL of 300000 msec  BLOCKREPORT_INTERVAL of 
> 21600000msec CACHEREPORT_INTERVAL of 10000msec Initial delay: 0msec; 
> heartBeatInterval=3000
> 2014-06-14 12:42:08,732 DEBUG datanode.DataNode 
> (BPServiceActor.java:sendHeartBeat(562)) - Sending heartbeat with 1 storage 
> reports from service actor: Block pool fake bpid (Datanode Uuid null) service 
> to 0.0.0.0/0.0.0.0:0
> 2014-06-14 12:42:08,734 INFO  datanode.DataNode 
> (BPServiceActor.java:blockReport(498)) - Sent 1 blockreports 0 blocks total. 
> Took 1 msec to generate and 0 msecs for RPC and NN processing.  Got back 
> commands none
> 2014-06-14 12:42:08,738 INFO  datanode.DataNode 
> (BPServiceActor.java:run(805)) - Block pool fake bpid (Datanode Uuid null) 
> service to 0.0.0.0/0.0.0.0:1 starting to offer service
> 2014-06-14 12:42:08,739 DEBUG datanode.DataNode 
> (BPServiceActor.java:retrieveNamespaceInfo(170)) - Block pool fake bpid 
> (Datanode Uuid null) service to 0.0.0.0/0.0.0.0:1 received versionRequest 
> response: lv=-57;cid=fake cluster;nsid=1;c=0;bpid=fake bpid
> 2014-06-14 12:42:08,739 INFO  datanode.DataNode 
> (BPServiceActor.java:register(765)) - Block pool fake bpid (Datanode Uuid 
> null) service to 0.0.0.0/0.0.0.0:1 beginning handshake with NN
> 2014-06-14 12:42:08,740 INFO  datanode.DataNode 
> (BPServiceActor.java:register(778)) - Block pool Block pool fake bpid 
> (Datanode Uuid null) service to 0.0.0.0/0.0.0.0:1 successfully registered 
> with NN
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
