Are you running EMR?

On Sun, May 14, 2017 at 4:59 AM, Miguel Morales <therevolti...@gmail.com> wrote:
> Some things just didn't work as I had first expected. For example,
> writing from a Spark collection to an Alluxio destination didn't
> persist the files to S3 automatically.
>
> I remember having to use the Alluxio library directly to force the
> files to persist to S3 after Spark finished writing to Alluxio.
>
> On Fri, May 12, 2017 at 6:52 AM, Gene Pang <gene.p...@gmail.com> wrote:
> > Hi,
> >
> > Yes, you can use Alluxio with Spark to read/write to S3. Here is a blog post
> > on Spark + Alluxio + S3, and here is some documentation for configuring
> > Alluxio + S3 and configuring Spark + Alluxio.
> >
> > You mentioned that it required a lot of effort to get working. May I ask
> > what you ran into, and how you got it to work?
> >
> > Thanks,
> > Gene
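For readers hitting the same thing: below is a minimal pyspark sketch of the write-through setup being discussed. It assumes Alluxio 1.x, where the client-side property alluxio.user.file.writetype.default controls whether writes propagate to the under store (the default, MUST_CACHE, writes only to Alluxio memory, which matches the behavior Miguel describes). It also assumes the Alluxio client jar is on the Spark classpath; the master hostname, port, and paths are placeholders.

    from pyspark.sql import SparkSession

    # CACHE_THROUGH writes synchronously to Alluxio *and* the mounted
    # under store (S3 here); the default MUST_CACHE writes only to
    # Alluxio memory, leaving nothing in S3 without a separate persist.
    write_type = "-Dalluxio.user.file.writetype.default=CACHE_THROUGH"

    spark = (
        SparkSession.builder
        .appName("alluxio-write-through")
        .config("spark.driver.extraJavaOptions", write_type)
        .config("spark.executor.extraJavaOptions", write_type)
        .getOrCreate()
    )

    df = spark.range(1000)
    # With CACHE_THROUGH this should land in the S3 bucket mounted at
    # /mnt/s3 without calling the Alluxio API directly to persist it.
    df.write.parquet("alluxio://alluxio-master:19998/mnt/s3/output")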
> > On Thu, May 11, 2017 at 11:55 AM, Miguel Morales <therevolti...@gmail.com> wrote:
> >> Might want to try gzip as opposed to parquet. The only way I ever
> >> reliably got parquet to work on S3 is by using Alluxio as a buffer,
> >> but it's a decent amount of work.
> >>
> >> On Thu, May 11, 2017 at 11:50 AM, lucas.g...@gmail.com
> >> <lucas.g...@gmail.com> wrote:
> >> > Also, and this is unrelated to the actual question... why don't these
> >> > messages show up in the archive?
> >> >
> >> > http://apache-spark-user-list.1001560.n3.nabble.com/
> >> >
> >> > Ideally I'd want to post a link to our internal wiki for these
> >> > questions, but I can't find them in the archive.
> >> >
> >> > On 11 May 2017 at 07:16, lucas.g...@gmail.com <lucas.g...@gmail.com> wrote:
> >> >> Looks like this isn't viable in Spark 2.0.0 (and greater, I presume).
> >> >> I'm pretty sure I came across this blog and ignored it for that reason.
> >> >>
> >> >> Any other thoughts? The linked tickets in
> >> >> https://issues.apache.org/jira/browse/SPARK-10063,
> >> >> https://issues.apache.org/jira/browse/HADOOP-13786, and
> >> >> https://issues.apache.org/jira/browse/HADOOP-9565 look relevant too.
> >> >>
> >> >> On 10 May 2017 at 22:24, Miguel Morales <therevolti...@gmail.com> wrote:
> >> >>> Try using the DirectParquetOutputCommitter:
> >> >>> http://dev.sortable.com/spark-directparquetoutputcommitter/
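For anyone finding this thread later: on Spark 1.x the direct committer was enabled with a SQL conf, roughly as sketched below. This is only the pre-2.0 approach (the committer was removed in 2.0.0 per SPARK-10063, as noted above), and the fully qualified class name moved between 1.x releases, so treat both the class name and the exact behavior as assumptions to verify against your version.

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="direct-parquet-committer")
    sqlContext = SQLContext(sc)

    # The direct committer writes task output straight to the final
    # destination, skipping the copy/rename step that is slow and
    # non-atomic on S3. Class name as of Spark 1.4/1.5; it moved to
    # o.a.s.sql.execution.datasources.parquet in later 1.x releases.
    sqlContext.setConf(
        "spark.sql.parquet.output.committer.class",
        "org.apache.spark.sql.parquet.DirectParquetOutputCommitter")

    # Known caveat: unsafe with speculative execution or task retries,
    # because failed attempts can leave partial files in place.
    df = sqlContext.range(1000)
    df.write.parquet("s3a://some-bucket/output")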
> >> >>> On Wed, May 10, 2017 at 10:07 PM, lucas.g...@gmail.com
> >> >>> <lucas.g...@gmail.com> wrote:
> >> >>> > Hi users, we have a bunch of pyspark jobs that are using S3 for
> >> >>> > loading / intermediate steps and final output of parquet files.
> >> >>> >
> >> >>> > We're running into the following issues on a semi-regular basis:
> >> >>> > these are intermittent errors, i.e. we have about 300 jobs that run
> >> >>> > nightly, and a fairly random but small-ish percentage of them fail
> >> >>> > with the following classes of errors.
> >> >>> >
> >> >>> > S3 write errors:
> >> >>> >
> >> >>> >> "ERROR Utils: Aborting task
> >> >>> >> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 404,
> >> >>> >> AWS Service: Amazon S3, AWS Request ID: 2D3RP, AWS Error Code: null,
> >> >>> >> AWS Error Message: Not Found, S3 Extended Request ID: BlaBlahEtc="
> >> >>> >
> >> >>> >> "Py4JJavaError: An error occurred while calling o43.parquet.
> >> >>> >> : com.amazonaws.services.s3.model.MultiObjectDeleteException:
> >> >>> >> Status Code: 0, AWS Service: null, AWS Request ID: null,
> >> >>> >> AWS Error Code: null, AWS Error Message: One or more objects
> >> >>> >> could not be deleted, S3 Extended Request ID: null"
> >> >>> >
> >> >>> > S3 read errors:
> >> >>> >
> >> >>> >> [Stage 1:=================================================> (27 + 4) / 31]
> >> >>> >> 17/05/10 16:25:23 ERROR Executor: Exception in task 10.0 in stage 1.0 (TID 11)
> >> >>> >> java.net.SocketException: Connection reset
> >> >>> >>   at java.net.SocketInputStream.read(SocketInputStream.java:196)
> >> >>> >>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
> >> >>> >>   at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
> >> >>> >>   at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:554)
> >> >>> >>   at sun.security.ssl.InputRecord.read(InputRecord.java:509)
> >> >>> >>   at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:927)
> >> >>> >>   at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:884)
> >> >>> >>   at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
> >> >>> >>   at org.apache.http.impl.io.AbstractSessionInputBuffer.read(AbstractSessionInputBuffer.java:198)
> >> >>> >>   at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:178)
> >> >>> >>   at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:200)
> >> >>> >>   at org.apache.http.impl.io.ContentLengthInputStream.close(ContentLengthInputStream.java:103)
> >> >>> >>   at org.apache.http.conn.BasicManagedEntity.streamClosed(BasicManagedEntity.java:168)
> >> >>> >>   at org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:228)
> >> >>> >>   at org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:174)
> >> >>> >>   at java.io.FilterInputStream.close(FilterInputStream.java:181)
> >> >>> >>   at java.io.FilterInputStream.close(FilterInputStream.java:181)
> >> >>> >>   at java.io.FilterInputStream.close(FilterInputStream.java:181)
> >> >>> >>   at java.io.FilterInputStream.close(FilterInputStream.java:181)
> >> >>> >>   at com.amazonaws.services.s3.model.S3Object.close(S3Object.java:203)
> >> >>> >>   at org.apache.hadoop.fs.s3a.S3AInputStream.close(S3AInputStream.java:187)
> >> >>> >
> >> >>> > We have literally tons of logs we can add, but they would make the
> >> >>> > email unwieldy. If it would be helpful I'll drop them in a pastebin
> >> >>> > or something.
> >> >>> >
> >> >>> > Our config is along the lines of:
> >> >>> >
> >> >>> > spark-2.1.0-bin-hadoop2.7
> >> >>> > '--packages com.amazonaws:aws-java-sdk:1.10.34,org.apache.hadoop:hadoop-aws:2.6.0 pyspark-shell'
> >> >>> >
> >> >>> > Given the Stack Overflow threads and googling I've been doing, I
> >> >>> > know we're not the only org with these issues, but I haven't found
> >> >>> > a good set of solutions in those spaces yet.
> >> >>> >
> >> >>> > Thanks!
> >> >>> >
> >> >>> > Gary Lucas
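For concreteness, the quoted config string is the usual PYSPARK_SUBMIT_ARGS form. A sketch of how a string like that is typically wired into a standalone pyspark script follows; the script logic and bucket names are placeholders, and the package versions are copied verbatim from Gary's message.

    import os

    # Must be set before the JVM gateway starts, i.e. before the first
    # SparkContext/SparkSession is created in this process.
    os.environ["PYSPARK_SUBMIT_ARGS"] = (
        "--packages com.amazonaws:aws-java-sdk:1.10.34,"
        "org.apache.hadoop:hadoop-aws:2.6.0 pyspark-shell"
    )

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("s3a-parquet-job").getOrCreate()

    # The failing pattern from the report: parquet reads and writes
    # against s3a:// paths for inputs, intermediates, and final output.
    df = spark.read.parquet("s3a://some-bucket/input")
    df.write.parquet("s3a://some-bucket/output")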