[ 
https://issues.apache.org/jira/browse/HADOOP-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804148#comment-17804148
 ] 

ASF GitHub Bot commented on HADOOP-18883:
-----------------------------------------

saxenapranav commented on code in PR #6022:
URL: https://github.com/apache/hadoop/pull/6022#discussion_r1444240890


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##########
@@ -340,8 +344,11 @@ public void sendRequest(byte[] buffer, int offset, int length) throws IOExceptio
            If expect header is not enabled, we throw back the exception.
          */
         String expectHeader = getConnProperty(EXPECT);
-        if (expectHeader != null && expectHeader.equals(HUNDRED_CONTINUE)) {
+        if (expectHeader != null && expectHeader.equals(HUNDRED_CONTINUE)
+            && e instanceof ProtocolException
+            && EXPECT_100_JDK_ERROR.equals(e.getMessage())) {

Review Comment:
   At `httpUrlConnection.getOutputStream`, the error can be either a general 
IOException (including ConnectionTimeout and ReadTimeout) or an expect-100 error 
(this raises ProtocolException, which is a child of IOException). Server errors, 
if any, would be caught in `processResponse`, and the treatment would be the same 
as for all other APIs (analyse whether a retry is needed, and then 
RestOperation would retry it).
   
   In the JDK's implementation of `getOutputStream`, the connection is killed 
when an IOException occurs. So, if further APIs are allowed to go ahead, they 
would fire a new server call altogether. Other APIs, like getHeaderField(), 
would then return data from that new server call, which is undesirable.
   
   Also, the implementation of `httpUrlConnection` is such that the other APIs 
(like getHeaderField()) internally call getInputStream(), which would first 
call getOutputStream() (if the sendData flag is true and the connection doesn't 
hold a strOutputStream object). Here, two things can happen:
   1. Expect100 failure: no data is captured, and again any next API call on 
the httpUrlConnection would fire a new call.
   2. Status-100: the code is no longer in the block where data can be put in 
the outputStream, so the stream will be closed, which will raise an IOException, 
and from here it will go back to the retry loop. Ref: 
https://github.com/openjdk/jdk8/blob/master/jdk/src/share/classes/sun/net/www/protocol/http/HttpURLConnection.java#L1463-L1471
   
   Hence, any further API call is prevented on an HttpUrlConnection object 
that has got an IOException in getOutputStream.
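   
   The guard described above could be sketched roughly as follows. This is 
only an illustration, not the code in the PR: `FakeConnection`, 
`GuardedOperation` and the `failedInGetOutputStream` flag are hypothetical 
names, the fake connection merely models the "a later call fires a new server 
request" behaviour, and the value given to `EXPECT_100_JDK_ERROR` is assumed 
to match the message the JDK uses.

```java
import java.io.IOException;
import java.net.ProtocolException;

public class Expect100GuardSketch {
    // Assumption: mirrors the message of the JDK's ProtocolException on expect-100 rejection.
    static final String EXPECT_100_JDK_ERROR = "Server rejected operation";

    /** Hypothetical stand-in for HttpURLConnection that counts simulated server calls. */
    static class FakeConnection {
        int serverCalls = 0;
        private boolean requestSent = false;
        private final IOException failure;

        FakeConnection(IOException failure) {
            this.failure = failure;
        }

        void getOutputStream() throws IOException {
            serverCalls++;
            if (failure != null) {
                throw failure; // the request never completes
            }
            requestSent = true;
        }

        String getHeaderField(String name) {
            if (!requestSent) {
                serverCalls++; // models the implicit new server call after a failure
            }
            return "value-for-" + name;
        }
    }

    /** Hypothetical wrapper that blocks further connection APIs after a failure. */
    static class GuardedOperation {
        private final FakeConnection conn;
        private final boolean expectHeaderEnabled;
        private boolean failedInGetOutputStream = false;

        GuardedOperation(FakeConnection conn, boolean expectHeaderEnabled) {
            this.conn = conn;
            this.expectHeaderEnabled = expectHeaderEnabled;
        }

        void sendRequest() throws IOException {
            try {
                conn.getOutputStream();
            } catch (IOException e) {
                // Any IOException here marks the connection as unusable.
                failedInGetOutputStream = true;
                boolean expect100Rejected = expectHeaderEnabled
                    && e instanceof ProtocolException
                    && EXPECT_100_JDK_ERROR.equals(e.getMessage());
                if (!expect100Rejected) {
                    throw e; // timeouts and other IOExceptions go back to the caller
                }
                // The expect-100 rejection is swallowed; the caller decides on a retry.
            }
        }

        String getHeaderField(String name) {
            if (failedInGetOutputStream) {
                return null; // never touch the connection again
            }
            return conn.getHeaderField(name);
        }
    }

    public static void main(String[] args) throws IOException {
        FakeConnection conn =
            new FakeConnection(new ProtocolException(EXPECT_100_JDK_ERROR));
        GuardedOperation op = new GuardedOperation(conn, true);
        op.sendRequest();
        System.out.println("header = " + op.getHeaderField("x-ms-request-id"));
        System.out.println("server calls = " + conn.serverCalls);
    }
}
```

   In this sketch, the single flag covers both cases listed above, since 
either one leaves the connection in a state where the next API call would 
reach the server again.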





> Expect-100 JDK bug resolution: prevent multiple server calls
> ------------------------------------------------------------
>
>                 Key: HADOOP-18883
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18883
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>
> This is in line with JDK bug: [https://bugs.openjdk.org/browse/JDK-8314978].
>  
> With the current implementation of HttpURLConnection, if the server rejects 
> the “Expect 100-continue”, a ‘java.net.ProtocolException’ is thrown from the 
> 'expect100Continue()' method.
> After the exception is thrown, if we call any other method on the same 
> instance (e.g. getHeaderField() or getHeaderFields()), it will internally call 
> getOutputStream(), which invokes writeRequests(), which makes the actual 
> server call. 
> In AbfsHttpOperation, after sendRequest() we call the processResponse() 
> method from AbfsRestOperation. Even if conn.getOutputStream() fails due 
> to the expect-100 error, we consume the exception and let the code go ahead. 
> So getHeaderField() / getHeaderFields() / getHeaderFieldLong() can be 
> triggered after getOutputStream() has failed. These invocations will lead 
> to server calls.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
