Marcus Truscello created ZEPPELIN-5758:
------------------------------------------
Summary: BigQuery hits socket timeout before reaching "wait_time"
setting
Key: ZEPPELIN-5758
URL: https://issues.apache.org/jira/browse/ZEPPELIN-5758
Project: Zeppelin
Issue Type: Bug
Components: interpreter-setting, Interpreters, zeppelin-interpreter
Affects Versions: 0.10.1
Reporter: Marcus Truscello
Attachments: bigquery-timeout.patch, stacktrace.log
The {{zeppelin.bigquery.wait_time}} BigQuery interpreter parameter is only
effective up to a value of 30 seconds. Any larger value exceeds the underlying
HTTP client's default read timeout, so a
{{java.net.SocketTimeoutException: Read timed out}} is thrown before the
configured wait time is ever reached. (A full stack trace is attached.)
Google's Java API guide suggests overriding the {{HttpRequestInitializer}} to
set the desired connect and read timeouts:
[https://developers.google.com/api-client-library/java/google-api-java-client/errors#timeouts]
That exact approach isn't feasible here because the BigQuery interpreter's
{{createAuthorizedClient}} method is static. Instead, we can adapt the
approach from this StackOverflow answer, which uses the builder's
{{setHttpRequestInitializer}}:
[https://stackoverflow.com/a/32894630]
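The shape of that fix is a small decorator: wrap the existing
{{HttpRequestInitializer}} (the credential) in one that also sets explicit
connect/read timeouts, and hand the wrapper to the builder. The sketch below
uses minimal stand-in types so it runs without the google-api-client jar; the
real classes are {{com.google.api.client.http.HttpRequest}} and
{{com.google.api.client.http.HttpRequestInitializer}}, and {{withTimeouts}} is
a hypothetical helper name, not part of the attached patch:

```java
import java.io.IOException;

public class TimeoutSketch {
    // Minimal stand-in for com.google.api.client.http.HttpRequest
    // (hypothetical; only the two setters the fix needs).
    static class HttpRequest {
        int connectTimeout;
        int readTimeout;
        void setConnectTimeout(int ms) { connectTimeout = ms; }
        void setReadTimeout(int ms) { readTimeout = ms; }
    }

    // Stand-in for com.google.api.client.http.HttpRequestInitializer.
    interface HttpRequestInitializer {
        void initialize(HttpRequest request) throws IOException;
    }

    // Wrap an existing initializer (the OAuth credential in the real
    // interpreter code) so every request also gets explicit timeouts.
    // In the actual fix this wrapper would be passed to
    // Bigquery.Builder#setHttpRequestInitializer.
    static HttpRequestInitializer withTimeouts(
            final HttpRequestInitializer wrapped, final int timeoutMs) {
        return request -> {
            if (wrapped != null) {
                wrapped.initialize(request); // keep auth headers etc.
            }
            request.setConnectTimeout(timeoutMs);
            request.setReadTimeout(timeoutMs);
        };
    }

    public static void main(String[] args) throws IOException {
        HttpRequest req = new HttpRequest();
        // A wait_time of 120 s needs a read timeout of at least 120 000 ms.
        withTimeouts(null, 120_000).initialize(req);
        System.out.println(req.readTimeout); // prints 120000
    }
}
```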
It should be noted that setting the read timeout too high likely provides no
additional value. Regardless of the {{timeoutMs}} value, BigQuery always
returns a response within ~200 seconds, whether or not the job has actually
completed:
[https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/getQueryResults#query-parameters]
Given that the BigQuery interpreter doesn't handle {{jobComplete}} being false,
there's no reason to set the read timeout much larger than 200 seconds.
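Given that cap, one option (a hypothetical helper, not part of the attached
patch) would be to clamp whatever {{wait_time}} the user configures before
using it as the read timeout; the 210 s ceiling below is an assumed margin
over the ~200 s server-side limit, not a documented constant:

```java
public class TimeoutClamp {
    // ~200 s server-side cap on getQueryResults plus a small margin
    // (the exact margin is an assumption, not from the BigQuery docs).
    static final int MAX_USEFUL_TIMEOUT_MS = 210_000;

    // Clamp the configured wait_time: anything above the cap just waits
    // longer for a response that will already have arrived.
    static int effectiveReadTimeoutMs(int waitTimeMs) {
        return Math.min(waitTimeMs, MAX_USEFUL_TIMEOUT_MS);
    }

    public static void main(String[] args) {
        System.out.println(effectiveReadTimeoutMs(300_000)); // prints 210000
        System.out.println(effectiveReadTimeoutMs(120_000)); // prints 120000
    }
}
```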
I've attached a diff of the changes I applied to fix this issue. It should be
noted that I am not a Java developer, so I apologize if the solution is a bit
crude. :D
--
This message was sent by Atlassian Jira
(v8.20.10#820010)