Hi Gordon/Fabian, Thanks for helping with this! Downgrading the Maven version I was using to build Flink appears to have fixed that problem - I was using Maven 3.3.3 before and have downgraded to 3.2.5.
Just for reference, I printed the loaded class at runtime and found that when I was using Flink built with Maven 3.3.3, it was pulling in: jar:file:/opt/flink/flink-1.1-SNAPSHOT/lib/flink-dist_2.11-1.1-SNAPSHOT.jar!/org/apache/http/params/HttpConnectionParams.class But after building with the older Maven version, it pulled in the class from my jar: jar:file:/tmp/my-assembly-1.0.jar!/org/apache/http/params/HttpConnectionParams.class Unfortunately now that problem is fixed I've now got a different classpath issue. It started with: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227) at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getHadoopWrapperClassNameForFileSystem(HadoopFileSystem.java:460) at org.apache.flink.core.fs.FileSystem.getHadoopWrapperClassNameForFileSystem(FileSystem.java:352) at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:280) at org.apache.flink.runtime.state.filesystem.FsStateBackend.validateAndNormalizeUri(FsStateBackend.java:383) at org.apache.flink.runtime.state.filesystem.FsStateBackend.<init>(FsStateBackend.java:175) at org.apache.flink.runtime.state.filesystem.FsStateBackend.<init>(FsStateBackend.java:144) at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.<init>(RocksDBStateBackend.java:205) This is strange because I used an s3:// checkpoint directory when running Flink 1.0.3 on EMR and it worked fine. (according to https://ci.apache.org/projects/flink/flink-docs-master/setup/aws.html#provide-s3-filesystem-dependency no configuration should be needed to use S3 when running on EMR). Anyway I tried executing /etc/hadoop/conf/hadoop-env.sh before running my job, as this sets up the HADOOP_CLASSPATH env var. The exception then changed to: java.lang.NoClassDefFoundError: org/apache/hadoop/fs/common/Abortable I found that this class is related to a jar called s3-dist-cp, so then I tried copying that jar to Flink's lib directory from /usr/share/aws/emr/s3-dist-cp/lib/* And now I'm back to another Kinesis connector classpath error: java.lang.NoClassDefFoundError: org/apache/http/conn/ssl/SSLSocketFactory at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:136) at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:120) at com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:157) at com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:137) at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.<init>(KinesisProxy.java:76) at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:166) at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:140) at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:123) I guess this is related to me adding a bunch of extra stuff to the classpath in an attempt to solve the EmrFileSystem error. Any ideas what caused that error in the first place? By the way, I built Flink with: mvn clean install -Pinclude-kinesis,vendor-repos -DskipTests -Dhadoop.version=2.7.1 Josh On Fri, Jun 17, 2016 at 9:56 AM, Fabian Hueske <fhue...@gmail.com> wrote: > Hi Josh, > > I assume that you build the SNAPSHOT version yourself. I had similar > version conflicts for Apache HttpCore with Flink SNAPSHOT versions on EMR. > The problem is cause by a changed behavior in Maven 3.3 and following > versions. > Due to these changes, the dependency shading is not working correctly. > That's why we use Maven 3.2 to build the Flink release artifacts. > > Can you check whether you used Maven 3.3 and try to downgrade to 3.2 if > that was the case? > > Cheers, Fabian > > 2016-06-17 8:12 GMT+02:00 Tai Gordon <tzuli...@gmail.com>: > >> Hi Josh, >> >> I’m looking into the problem. Seems like the connector is somehow using >> older versions of httpclient. >> Can you print the loaded class path at runtime, and check the path & >> version of the loaded httpclient / httpcore dependency? >> i.e. `classOf[HttpConnectionParams].getResource(" >> HttpConnectionParams.class").toString` >> >> Also, on which commit was your kinesis connector built? >> >> Regards, >> Gordon >> >> >> On June 17, 2016 at 1:08:37 AM, Josh (jof...@gmail.com) wrote: >> >> Hey, >> >> I've been running the Kinesis connector successfully now for a couple of >> weeks, on a Flink cluster running Flink 1.0.3 on EMR 2.7.1/YARN. >> >> Today I've been trying to get it working on a cluster running the current >> Flink master (1.1-SNAPSHOT) but am running into a classpath issue when >> starting the job. This only happens when running on EMR/YARN (it's fine >> when running 1.1-SNAPSHOT locally, and when running 1.0.3 on EMR) >> >> ---- >> The program finished with the following exception: >> >> java.lang.NoSuchMethodError: >> org.apache.http.params.HttpConnectionParams.setSoKeepalive(Lorg/apache/http/params/HttpParams;Z)V >> at >> com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:96) >> at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:187) >> at >> com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:136) >> at >> com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:120) >> at >> com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:157) >> at >> com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:137) >> at >> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.<init>(KinesisProxy.java:76) >> at >> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:166) >> at >> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:140) >> at >> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:123) >> --- >> >> Any ideas what's going on? >> >> The job I'm deploying has httpclient 4.3.6 and httpcore 4.3.3 which I >> believe are the libraries with the HttpConnectionParams class. >> >> Thanks, >> Josh >> >> >