[ 
https://issues.apache.org/jira/browse/HDDS-2347?focusedWorklogId=333359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-333359
 ]

ASF GitHub Bot logged work on HDDS-2347:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Oct/19 12:35
            Start Date: 24/Oct/19 12:35
    Worklog Time Spent: 10m 
      Work Description: fapifta commented on pull request #81: HDDS-2347 
XCeiverClientGrpc's parallel use leads to NPE
URL: https://github.com/apache/hadoop-ozone/pull/81
 
 
   ## What changes were proposed in this pull request?
   We found this issue during Hive TPC-DS tests. The root of the problem is 
that Hive starts an arbitrary number of threads to work on the same file and 
reads that file from multiple threads.
   In this case the same XceiverClientGrpc instance is used by all of these 
threads, and in certain scenarios the current client is not synchronized 
properly. This PR adds the necessary synchronization around the internal 
closed boolean state and around the channels and asyncStubs structures.
   A fundamental change in behaviour is that XceiverClientManager now connects 
an XceiverClientGrpc instance to the first DN in a synchronized fashion before 
serving it; afterwards, before a request is sent, the code checks whether the 
DN is still connected properly and, if not, reconnects in a synchronized block.
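   To illustrate the kind of synchronization this patch aims for, here is a 
rough, self-contained sketch (the type names are placeholders, not the real 
Ozone/gRPC classes; only the locking pattern is the point):

```java
// Rough sketch of the guarded state described above; placeholder types,
// not the actual XceiverClientGrpc code.
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedClientSketch {

  static class DatanodeId { /* stands in for the real datanode identity */ }

  static class GrpcChannel {
    private boolean terminated;
    boolean isTerminated() { return terminated; }
  }

  static class AsyncStub { /* stands in for the generated gRPC async stub */ }

  private final Map<DatanodeId, GrpcChannel> channels = new HashMap<>();
  private final Map<DatanodeId, AsyncStub> asyncStubs = new HashMap<>();
  private boolean closed = false;

  // Every request goes through here first: the closed check and the
  // reconnect-if-needed both run while holding the same monitor, so no
  // thread can observe a half-initialized channel/stub pair.
  synchronized void ensureConnected(DatanodeId dn) throws IOException {
    if (closed) {
      throw new IOException("client is already closed");
    }
    GrpcChannel channel = channels.get(dn);
    if (channel == null || channel.isTerminated()) {
      connect(dn);
    }
  }

  // Populates both maps together so readers never see one without the other.
  private void connect(DatanodeId dn) {
    channels.put(dn, new GrpcChannel());
    asyncStubs.put(dn, new AsyncStub());
  }

  synchronized void close() {
    closed = true;
    channels.clear();
    asyncStubs.clear();
  }
}
```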
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-2347
   
   ## How was this patch tested?
   As this issue comes up intermittently and reproduction depends on how the 
JVM schedules the different threads, I have not been able to write a reliable 
test so far.
   The patch was tested manually on a 42-node cluster, running the TPC-DS 
queries against scale 2 and scale 3 data sets generated by the tools here: 
https://github.com/fapifta/hive-testbench
   These tools come from https://github.com/hortonworks/hive-testbench with 
some modifications that allow using Ozone and HDFS as filesystems in parallel.
   
   After applying the patch on top of current trunk on the cluster, I have not 
seen the NPE in 3 runs of the 99 TPC-DS queries; before the patch, 2-5 queries 
failed with the given NPE in every run.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 333359)
    Remaining Estimate: 0h
            Time Spent: 10m

> XCeiverClientGrpc's parallel use leads to NPE
> ---------------------------------------------
>
>                 Key: HDDS-2347
>                 URL: https://issues.apache.org/jira/browse/HDDS-2347
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Client
>            Reporter: Istvan Fajth
>            Assignee: Istvan Fajth
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: changes.diff, logs.txt
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue came up when testing Hive with ORC tables on an Ozone storage 
> backend; so far I could not reproduce it locally within a JUnit test, but the 
> issue does come up on a cluster.
> I am attaching a diff file that shows the logging I have added in 
> XceiverClientGrpc and in KeyInputStream to get the results that led me to the 
> following understanding of the scenario:
> - Hive starts a couple of threads to work on the table data during query 
> execution
> - There is one RPCClient that is being used by these threads
> - The threads are opening different streams to read from the same key in Ozone
> - The InputStreams internally are using the same XceiverClientGrpc
> - XceiverClientGrpc intermittently throws the following NPE (a sketch of the 
> suspected race follows the stack trace):
> {code}
> Caused by: java.lang.NullPointerException
>         at 
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandAsync(XceiverClientGrpc.java:398)
>         at 
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:295)
>         at 
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithTraceIDAndRetry(XceiverClientGrpc.java:259)
>         at 
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:242)
>         at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:118)
>         at 
> org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:169)
>         at 
> org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:118)
>         at 
> org.apache.hadoop.hdds.scm.storage.BlockInputStream.read(BlockInputStream.java:224)
>         at 
> org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:173)
>         at 
> org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:52)
>         at org.apache.hadoop.fs.FSInputStream.read(FSInputStream.java:75)
>         at 
> org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
>         at 
> org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>         at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:555)
>         at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:370)
>         at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:61)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:105)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:1708)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.callInternal(OrcInputFormat.java:1596)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.access$2900(OrcInputFormat.java:1383)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1568)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1565)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1565)
>         at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1383)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}
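> As a rough illustration of the check-then-act race that can produce such an 
> NPE when the client is shared across threads (simplified, made-up code, not 
> the actual XceiverClientGrpc implementation):
> {code}
> // Made-up sketch of the suspected race, not the actual Ozone code:
> // several threads share one client, and the stub map is filled lazily
> // without synchronization.
> import java.util.HashMap;
> import java.util.Map;
> 
> class RacySketch {
>   private final Map<String, Object> asyncStubs = new HashMap<>();
> 
>   Object sendCommandAsync(String dn) {
>     if (!asyncStubs.containsKey(dn)) { // thread A and thread B can both see
>       connectToDatanode(dn);           // "not connected" and connect at once;
>     }                                  // HashMap is not thread-safe, so an
>     return asyncStubs.get(dn);         // entry can be lost here and the
>   }                                    // caller gets null -> NPE on use
> 
>   private void connectToDatanode(String dn) {
>     asyncStubs.put(dn, new Object());  // stands in for creating the gRPC stub
>   }
> }
> {code}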
> I have two proposals to fix this issue: one is the easy answer of adding 
> synchronization to the XceiverClientGrpc code, the other one is a bit more 
> involved; let me explain below.
> Naively I would assume that when I get a client SPI instance from 
> XceiverClientManager, that instance is ready to use. In fact it is not: the 
> client only becomes ready at the point when its user sends the first request. 
> Adding synchronization to this code is the easy solution, but my pragmatic 
> half screams for a better one that ensures the manager actually manages the 
> clients it gives to its users, so that the clients do not become ready by 
> accident.
> I am working on a proposal that moves things around a bit, and I am looking 
> for other possible solutions that do not feel as hacky as the easy one.
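> A very rough, hypothetical sketch of that second direction (all names here 
> are made up; it only shows the idea of the manager handing out clients that 
> are already connected):
> {code}
> // Hypothetical sketch (made-up names): the manager connects the client
> // before anyone else can see it, so users never trigger the first
> // connect "by accident".
> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
> 
> class ManagerSketch {
>   interface ClientSpi { void connect() throws IOException; }
> 
>   private final Map<String, ClientSpi> clients = new HashMap<>();
> 
>   ClientSpi acquireClient(String pipelineId) throws IOException {
>     synchronized (clients) {
>       ClientSpi client = clients.get(pipelineId);
>       if (client == null) {
>         client = createClient(pipelineId);
>         client.connect();              // fully ready before it is shared
>         clients.put(pipelineId, client);
>       }
>       return client;
>     }
>   }
> 
>   private ClientSpi createClient(String pipelineId) {
>     return () -> { /* the real client would open its gRPC connection here */ };
>   }
> }
> {code}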
> I am attaching the following:
> - a diff that shows the extended logging added in XceiverClientGrpc and 
> KeyInputStream,
> - a job log snippet from a Hive query that shows the relevant output of that 
> logging in a cluster,
> - and, later, a proposal for the fix that I still need to work on a bit.


