Are you sitting behind a proxy or something? Can you look further into the
executor logs? I have a strange feeling that you are running out of memory
on the executors (and possibly hitting long GC pauses).
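
If you want to check the GC theory, a minimal sketch of how GC logging could
be turned on for the executors (assuming you build the SparkConf yourself;
memory size and app name below are only placeholders):

    // sketch only: GC details end up in the executor stderr
    val conf = new org.apache.spark.SparkConf()
      .setAppName("s3a-job")                      // master is expected from spark-submit / mesos
      .set("spark.executor.memory", "4g")         // placeholder, size to your workload
      .set("spark.executor.extraJavaOptions",
           "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
    val sc = new org.apache.spark.SparkContext(conf)

The per-task "GC Time" column on the stage page of the Spark UI is another
quick way to see whether the executors are spending their time in GC.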

Thanks
Best Regards

On Thu, Sep 10, 2015 at 10:05 PM, Mario Pastorelli <
mario.pastore...@teralytics.ch> wrote:

> Dear community,
> I am facing a problem accessing data on S3 via Spark. My current
> configuration is the following:
>
> - Spark 1.4.1
> - Hadoop 2.7.1
> - hadoop-aws-2.7.1
> - mesos 0.22.1
>
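> For reference, the S3A access is wired up on the Hadoop configuration of
> the SparkContext, roughly like this (a sketch, not the exact code of the
> job; property names are the hadoop-aws 2.7 ones, bucket and keys are
> placeholders):
>
>     // illustrative setup; fs.s3a.impl is usually already declared in core-default.xml
>     sc.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
>     sc.hadoopConfiguration.set("fs.s3a.access.key", "<AWS_ACCESS_KEY>")
>     sc.hadoopConfiguration.set("fs.s3a.secret.key", "<AWS_SECRET_KEY>")
>     val lines = sc.textFile("s3a://my-bucket/input/*.bz2")   // bzip2-compressed input
>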
> I am accessing the data using the s3a protocol, but the job hangs. It runs
> through the whole data set, but systematically there is one task that never
> finishes. In the stderr I see quite a few timeout errors, although the
> application seems to recover from them. The job just keeps running
> indefinitely without ever proceeding to the next stage.
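>
> If it is relevant, the kind of tuning I could try is making the S3A client
> more patient (a sketch; the property names are the hadoop-aws 2.7 ones, the
> values are only guesses):
>
>     // sketch only: raise the S3A socket timeout (ms), retries and pool size
>     sc.hadoopConfiguration.set("fs.s3a.connection.timeout", "200000")
>     sc.hadoopConfiguration.set("fs.s3a.attempts.maximum", "20")
>     sc.hadoopConfiguration.set("fs.s3a.connection.maximum", "50")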
>
> This is the stacktrace I am reading from the errors that the job is
> recovering from:
>
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:152)
>         at java.net.SocketInputStream.read(SocketInputStream.java:122)
>         at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
>         at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:554)
>         at sun.security.ssl.InputRecord.read(InputRecord.java:509)
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:934)
>         at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:891)
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
>         at org.apache.http.impl.io.AbstractSessionInputBuffer.read(AbstractSessionInputBuffer.java:198)
>         at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:178)
>         at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:137)
>         at java.io.FilterInputStream.read(FilterInputStream.java:133)
>         at java.io.FilterInputStream.read(FilterInputStream.java:133)
>         at java.io.FilterInputStream.read(FilterInputStream.java:133)
>         at com.amazonaws.util.ContentLengthValidationInputStream.read(ContentLengthValidationInputStream.java:77)
>         at java.io.FilterInputStream.read(FilterInputStream.java:133)
>         at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:164)
>         at java.io.DataInputStream.read(DataInputStream.java:149)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
>         at org.apache.hadoop.io.compress.bzip2.CBZip2InputStream.readAByte(CBZip2InputStream.java:195)
>         at org.apache.hadoop.io.compress.bzip2.CBZip2InputStream.getAndMoveToFrontDecode(CBZip2InputStream.java:949)
>         at org.apache.hadoop.io.compress.bzip2.CBZip2InputStream.initBlock(CBZip2InputStream.java:506)
>         at org.apache.hadoop.io.compress.bzip2.CBZip2InputStream.changeStateToProcessABlock(CBZip2InputStream.java:335)
>         at org.apache.hadoop.io.compress.bzip2.CBZip2InputStream.read(CBZip2InputStream.java:425)
>         at org.apache.hadoop.io.compress.BZip2Codec$BZip2CompressionInputStream.read(BZip2Codec.java:485)
>         at java.io.InputStream.read(InputStream.java:101)
>         at org.apache.hadoop.mapreduce.lib.input.CompressedSplitLineReader.fillBuffer(CompressedSplitLineReader.java:130)
>         at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
>
> My gut feeling is that the job is "failing at failing": some tasks that
> ought to fail apparently never do, so the job just hangs forever. Moreover,
> debugging this problem is really hard because there is no concrete error in
> the logs.
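>
> One thing I might try is letting Spark re-launch or give up on the straggler
> itself (a sketch; these are standard Spark settings, the values are guesses):
>
>     // sketch only: speculatively re-run slow tasks and cap per-task retries
>     val conf = new org.apache.spark.SparkConf()
>       .set("spark.speculation", "true")
>       .set("spark.speculation.multiplier", "3")   // a task counts as "slow" at 3x the median runtime
>       .set("spark.task.maxFailures", "8")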
>
> Could you help me figuring out what is happening and trying to find a
> solution to this issue?
> Thank you!
>
