Re: Unable to write snapshots to S3 on EMR

Andy M. Mon, 02 Oct 2017 12:51:26 -0700

Hi Fabian,

Sorry, I just realized I forgot to include that part.  The error returned
is:


java.lang.NoSuchMethodError:
org.apache.hadoop.conf.Configuration.addResource(Lorg/apache/hadoop/conf/Configuration;)V
    at 
com.amazon.ws.emr.hadoop.fs.EmrFileSystem.initialize(EmrFileSystem.java:93)
    at 
org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.initialize(HadoopFileSystem.java:328)
    at 
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:350)
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:389)
    at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
    at 
org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.<init>(FsCheckpointStreamFactory.java:99)
    at 
org.apache.flink.runtime.state.filesystem.FsStateBackend.createStreamFactory(FsStateBackend.java:282)
    at 
org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createStreamFactory(RocksDBStateBackend.java:273

I believe it has something to do with the classpath, but I am unsure why or
how to fix it.  The classpath being used during the execution is:
/home/hadoop/flink-1.3.2/lib/flink-python_2.11-1.3.2.jar:/ho‌me/hadoop/flink-1.3.‌2/lib/flink-shaded-h‌adoop2-uber-1.3.2.ja‌r:/home/hadoop/flink‌-1.3.2/lib/log4j-1.2‌.17.jar:/home/hadoop‌/flink-1.3.2/lib/slf‌4j-log4j12-1.7.7.jar‌:/home/hadoop/flink-‌1.3.2/lib/flink-dist‌_2.11-1.3.2.jar::/et‌c/hadoop/conf:

I decompiled flink-shaded-h‌adoop2-uber-1.3.2.ja‌r and it seems the
addResource function does seem to be there.

Thank you

On Mon, Oct 2, 2017 at 3:43 PM, Fabian Hueske <[email protected]> wrote:

> Hi Andy,
>
> can you describe in more detail what exactly isn't working?
> Do you see error messages in the log files or on the console?
>
> Thanks, Fabian
>
> 2017-10-02 15:52 GMT+02:00 Andy M. <[email protected]>:
>
> > Hello,
> >
> > I am about to deploy my first Flink projects  to production, but I am
> > running into a very big hurdle.  I am unable to launch my project so it
> can
> > write to an S3 bucket.  My project is running on an EMR cluster, where I
> > have installed Flink 1.3.2.  I am using Yarn to launch the application,
> and
> > it seems to run fine unless I am trying to enable check pointing(with a
> S3
> > target).  I am looking to use RocksDB as my check-pointing backend.  I
> have
> > asked a few places, and I am still unable to find a solution to this
> > problem.  Here are my steps for creating a cluster, and launching my
> > application, perhaps I am missing a step.  I'd be happy to provide any
> > additional information if needed.
> >
> > AWS Portal:
> >
> >     1) EMR -> Create Cluster
> >     2) Advanced Options
> >     3) Release = emr-5.8.0
> >     4) Only select Hadoop 2.7.3
> >     5) Next -> Next -> Next -> Create Cluster ( I do fill out
> > names/keys/etc)
> >
> > Once the cluster is up I ssh into the Master and do the following:
> >
> >     1  wget
> > http://apache.claz.org/flink/flink-1.3.2/flink-1.3.2-bin-
> > hadoop27-scala_2.11.tgz
> >     2  tar -xzf flink-1.3.2-bin-hadoop27-scala_2.11.tgz
> >     3  cd flink-1.3.2
> >     4  ./bin/yarn-session.sh -n 2 -tm 5120 -s 4 -d
> >     5  Change conf/flink-conf.yaml
> >     6  ./bin/flink run -m yarn-cluster -yn 1 ~/flink-consumer.jar
> >
> > My conf/flink-conf.yaml I add the following fields:
> >
> >     state.backend: rocksdb
> >     state.backend.fs.checkpointdir: s3:/bucket/location
> >     state.checkpoints.dir: s3:/bucket/location
> >
> > My program's checkpointing setup:
> >
> >
> > env.enableCheckpointing(getCheckpointRate,CheckpointingMode.EXACTLY_
> ONCE)
> >
> > env.getCheckpointConfig.enableExternalizedCheckpoints(
> > ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION)
> >
> > env.getCheckpointConfig.setMinPauseBetweenCheckpoints(
> > getCheckpointMinPause)
> >     env.getCheckpointConfig.setCheckpointTimeout(getCheckpointTimeout)
> >     env.getCheckpointConfig.setMaxConcurrentCheckpoints(1)
> >     env.setStateBackend(new RocksDBStateBackend("s3://bucket/location",
> > true))
> >
>

Re: Unable to write snapshots to S3 on EMR

Reply via email to