Hi Gordon/Fabian,

Thanks for helping with this! Downgrading the Maven version I was using to
build Flink appears to have fixed that problem - I was using Maven 3.3.3
before and have downgraded to 3.2.5.

Just for reference, I printed the loaded class at runtime and found that
when I was using Flink built with Maven 3.3.3, it was pulling in:
jar:file:/opt/flink/flink-1.1-SNAPSHOT/lib/flink-dist_2.11-1.1-SNAPSHOT.jar!/org/apache/http/params/HttpConnectionParams.class
But after building with the older Maven version, it pulled in the class
from my jar:
jar:file:/tmp/my-assembly-1.0.jar!/org/apache/http/params/HttpConnectionParams.class


Unfortunately now that problem is fixed I've now got a different classpath
issue. It started with:

java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
at
org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.getHadoopWrapperClassNameForFileSystem(HadoopFileSystem.java:460)
at
org.apache.flink.core.fs.FileSystem.getHadoopWrapperClassNameForFileSystem(FileSystem.java:352)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:280)
at
org.apache.flink.runtime.state.filesystem.FsStateBackend.validateAndNormalizeUri(FsStateBackend.java:383)
at
org.apache.flink.runtime.state.filesystem.FsStateBackend.<init>(FsStateBackend.java:175)
at
org.apache.flink.runtime.state.filesystem.FsStateBackend.<init>(FsStateBackend.java:144)
at
org.apache.flink.contrib.streaming.state.RocksDBStateBackend.<init>(RocksDBStateBackend.java:205)

This is strange because I used an s3:// checkpoint directory when running
Flink 1.0.3 on EMR and it worked fine. (according to
https://ci.apache.org/projects/flink/flink-docs-master/setup/aws.html#provide-s3-filesystem-dependency
no configuration should be needed to use S3 when running on EMR).

Anyway I tried executing /etc/hadoop/conf/hadoop-env.sh before running my
job, as this sets up the HADOOP_CLASSPATH env var. The exception then
changed to:
java.lang.NoClassDefFoundError: org/apache/hadoop/fs/common/Abortable

I found that this class is related to a jar called s3-dist-cp, so then I
tried copying that jar to Flink's lib directory from
/usr/share/aws/emr/s3-dist-cp/lib/*

And now I'm back to another Kinesis connector classpath error:

java.lang.NoClassDefFoundError: org/apache/http/conn/ssl/SSLSocketFactory
at
com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:136)
at
com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:120)
at
com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:157)
at
com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:137)
at
org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.<init>(KinesisProxy.java:76)
at
org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:166)
at
org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:140)
at
org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:123)

I guess this is related to me adding a bunch of extra stuff to the
classpath in an attempt to solve the EmrFileSystem error. Any ideas what
caused that error in the first place?

By the way, I built Flink with:
mvn clean install -Pinclude-kinesis,vendor-repos -DskipTests
-Dhadoop.version=2.7.1

Josh

On Fri, Jun 17, 2016 at 9:56 AM, Fabian Hueske <fhue...@gmail.com> wrote:

> Hi Josh,
>
> I assume that you build the SNAPSHOT version yourself. I had similar
> version conflicts for Apache HttpCore with Flink SNAPSHOT versions on EMR.
> The problem is cause by a changed behavior in Maven 3.3 and following
> versions.
> Due to these changes, the dependency shading is not working correctly.
> That's why we use Maven 3.2 to build the Flink release artifacts.
>
> Can you check whether you used Maven 3.3 and try to downgrade to 3.2 if
> that was the case?
>
> Cheers, Fabian
>
> 2016-06-17 8:12 GMT+02:00 Tai Gordon <tzuli...@gmail.com>:
>
>> Hi Josh,
>>
>> I’m looking into the problem. Seems like the connector is somehow using
>> older versions of httpclient.
>> Can you print the loaded class path at runtime, and check the path &
>> version of the loaded httpclient / httpcore dependency?
>> i.e. `classOf[HttpConnectionParams].getResource("
>> HttpConnectionParams.class").toString`
>>
>> Also, on which commit was your kinesis connector built?
>>
>> Regards,
>> Gordon
>>
>>
>> On June 17, 2016 at 1:08:37 AM, Josh (jof...@gmail.com) wrote:
>>
>> Hey,
>>
>> I've been running the Kinesis connector successfully now for a couple of
>> weeks, on a Flink cluster running Flink 1.0.3 on EMR 2.7.1/YARN.
>>
>> Today I've been trying to get it working on a cluster running the current
>> Flink master (1.1-SNAPSHOT) but am running into a classpath issue when
>> starting the job. This only happens when running on EMR/YARN (it's fine
>> when running 1.1-SNAPSHOT locally, and when running 1.0.3 on EMR)
>>
>> ----
>>  The program finished with the following exception:
>>
>> java.lang.NoSuchMethodError:
>> org.apache.http.params.HttpConnectionParams.setSoKeepalive(Lorg/apache/http/params/HttpParams;Z)V
>> at
>> com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:96)
>> at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:187)
>> at
>> com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:136)
>> at
>> com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:120)
>> at
>> com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:157)
>> at
>> com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:137)
>> at
>> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.<init>(KinesisProxy.java:76)
>> at
>> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:166)
>> at
>> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:140)
>> at
>> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.<init>(FlinkKinesisConsumer.java:123)
>> ---
>>
>> Any ideas what's going on?
>>
>> The job I'm deploying has httpclient 4.3.6 and httpcore 4.3.3 which I
>> believe are the libraries with the HttpConnectionParams class.
>>
>> Thanks,
>> Josh
>>
>>
>

Reply via email to