[ https://issues.apache.org/jira/browse/SPARK-24476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bharath kumar avusherla updated SPARK-24476:
--------------------------------------------
    Description: 
We are working on a Spark streaming application using Spark Structured Streaming 
with checkpointing in S3. When we start the application, it runs fine for some 
time and then crashes with the error below. How long it runs successfully varies: 
sometimes it runs for 2 days without any issues before crashing, sometimes it 
crashes after 4 or 24 hours.

Our streaming application joins (left and inner) multiple sources: Kafka topics, 
data in S3, and an Aurora database.
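
For context, here is a purely illustrative sketch of that pipeline shape. The topic name, S3 paths, JDBC URL, table name, and join key are placeholders, not our real configuration:

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("multi-source-join").getOrCreate()

// Streaming source from Kafka (placeholder broker and topic)
val events = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "events")
  .load()
  .selectExpr("CAST(key AS STRING) AS id", "CAST(value AS STRING) AS payload")

// Static reference data from S3 and Aurora (via JDBC); paths and table are placeholders
val s3Ref = spark.read.parquet("s3a://bucket/reference/")
val aurora = spark.read.format("jdbc")
  .option("url", "jdbc:mysql://aurora-endpoint:3306/db")
  .option("dbtable", "lookup")
  .load()

// Inner join against the S3 data, left join against Aurora, both on a shared "id" column
val enriched = events
  .join(s3Ref, Seq("id"), "inner")
  .join(aurora, Seq("id"), "left")

// The checkpoint lives in S3, which is the call path that times out in the trace below
val query = enriched.writeStream
  .format("parquet")
  .option("path", "s3a://bucket/output/")
  .option("checkpointLocation", "s3a://bucket/checkpoints/app/")
  .start()
{code}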

Can you please let us know how to solve this problem? Is it possible to 
increase the timeout period?
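
For reference, here is a sketch (an untested assumption, not a known fix) of how the S3 client timeouts could be raised. The stack trace below goes through NativeS3FileSystem (s3n, backed by jets3t); the s3a connector exposes its timeouts directly as Hadoop settings, so one option would be to checkpoint to an s3a:// path and raise those values. The numbers are illustrative only:

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("streaming-with-tuned-s3-timeouts")
  // s3a socket/connection timeout, in milliseconds (illustrative value)
  .config("spark.hadoop.fs.s3a.connection.timeout", "300000")
  // how many times s3a retries a failed request before the error surfaces
  .config("spark.hadoop.fs.s3a.attempts.maximum", "20")
  .getOrCreate()

// If we stay on s3n, jets3t reads its timeouts from a jets3t.properties file on the
// classpath, e.g. httpclient.socket-timeout-ms=120000 (also an assumption to verify).
{code}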

Here, I'm pasting a few lines of the exception below; the complete stack trace is 
attached to the issue.

*_Exception:_*

*_Caused by: java.net.SocketTimeoutException: Read timed out_*
        _at java.net.SocketInputStream.socketRead0(Native Method)_
        _at java.net.SocketInputStream.read(SocketInputStream.java:150)_
        _at java.net.SocketInputStream.read(SocketInputStream.java:121)_
        _at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)_
        _at sun.security.ssl.InputRecord.read(InputRecord.java:503)_
        _at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:954)_
        _at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1343)_
        _at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)_
        _at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)_
        _at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:553)_
        _at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:412)_

 

  was:
We are working on a Spark streaming application using Spark Structured Streaming 
with checkpointing in S3. When we start the application, it runs fine for some 
time and then crashes with the error below. How long it runs successfully varies: 
sometimes it runs for 2 days without any issues before crashing, sometimes it 
crashes after 4 or 24 hours.

Our streaming application joins (left and inner) multiple sources: Kafka topics, 
data in S3, and an Aurora database.

Can you please let us know how to solve this problem? Is it possible to 
increase the timeout period?

Here, I'm pasting the complete exception log below.

*_Exception:_*

*_Caused by: java.net.SocketTimeoutException: Read timed out_*
        _at java.net.SocketInputStream.socketRead0(Native Method)_
        _at java.net.SocketInputStream.read(SocketInputStream.java:150)_
        _at java.net.SocketInputStream.read(SocketInputStream.java:121)_
        _at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)_
        _at sun.security.ssl.InputRecord.read(InputRecord.java:503)_
        _at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:954)_
        _at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1343)_
        _at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)_
        _at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)_
        _at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:553)_
        _at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:412)_
        _at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:179)_
        _at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:144)_
        _at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:134)_
        _at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:612)_
        _at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:447)_
        _at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:884)_
        _at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)_
        _at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)_
        _at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:334)_
        _at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:281)_
        _at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:942)_
        _at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2148)_
        _at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2075)_
        _at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:1093)_
        _at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:548)_
        _at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:174)_
        _at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)_
        _at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)_
        _at java.lang.reflect.Method.invoke(Method.java:483)_
        _at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)_
        _at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)_
        _at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)_
        _at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)_
        _at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)_
        _at org.apache.hadoop.fs.s3native.$Proxy18.retrieveMetadata(Unknown Source)_
        _at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:493)_
        _at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1437)_
        _at org.apache.spark.sql.execution.streaming.HDFSMetadataLog$FileSystemManager.exists(HDFSMetadataLog.scala:446)_
        _at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.get(HDFSMetadataLog.scala:195)_
        _at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.add(HDFSMetadataLog.scala:110)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply$mcV$sp(MicroBatchExecution.scala:339)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:338)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:338)_
        _at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271)_
        _at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch(MicroBatchExecution.scala:338)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:128)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121)_
        _at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271)_
        _at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1.apply$mcZ$sp(MicroBatchExecution.scala:121)_
        _at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)_
        _at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:117)_
        _at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:279)_


> java.net.SocketTimeoutException: Read timed out Exception while running the Spark Structured Streaming in 2.3.0
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24476
>                 URL: https://issues.apache.org/jira/browse/SPARK-24476
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: bharath kumar avusherla
>            Priority: Major
>         Attachments: socket-timeout-exception
>
>
> We are working on a Spark streaming application using Spark Structured Streaming with checkpointing in S3. When we start the application, it runs fine for some time and then crashes with the error below. How long it runs successfully varies: sometimes it runs for 2 days without any issues before crashing, sometimes it crashes after 4 or 24 hours.
> Our streaming application joins (left and inner) multiple sources: Kafka topics, data in S3, and an Aurora database.
> Can you please let us know how to solve this problem? Is it possible to increase the timeout period?
> Here, I'm pasting a few lines of the exception below; the complete stack trace is attached to the issue.
> *_Exception:_*
> *_Caused by: java.net.SocketTimeoutException: Read timed out_*
>         _at java.net.SocketInputStream.socketRead0(Native Method)_
>         _at java.net.SocketInputStream.read(SocketInputStream.java:150)_
>         _at java.net.SocketInputStream.read(SocketInputStream.java:121)_
>         _at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)_
>         _at sun.security.ssl.InputRecord.read(InputRecord.java:503)_
>         _at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:954)_
>         _at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1343)_
>         _at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)_
>         _at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)_
>         _at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:553)_
>         _at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:412)_
>  



