Hi there, We're using EMR 4.4.0 -> I suppose this is a bit old, and I can migrate forward if you think that would be best.
I've appended the classpath that the Flink cluster was started with at the end of this email (with a slight improvement to the formatting to make it readable). Willing to poke around or fiddle with this as necessary - thanks very much for the help! Justin Task Manager's classpath from logs: lib/flink-dist_2.11-1.1.3.jar lib/flink-python_2.11-1.1.3.jar lib/log4j-1.2.17.jar lib/slf4j-log4j12-1.7.7.jar logback.xml log4j.properties flink.jar flink-conf.yaml /etc/hadoop/conf /usr/lib/hadoop/hadoop-annotations-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-extras.jar /usr/lib/hadoop/hadoop-archives-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-aws-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-sls-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-auth-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-sls.jar /usr/lib/hadoop/hadoop-gridmix.jar /usr/lib/hadoop/hadoop-auth.jar /usr/lib/hadoop/hadoop-gridmix-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-rumen.jar /usr/lib/hadoop/hadoop-azure-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-common-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-azure.jar /usr/lib/hadoop/hadoop-datajoin-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-nfs.jar /usr/lib/hadoop/hadoop-aws.jar /usr/lib/hadoop/hadoop-streaming-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-archives.jar /usr/lib/hadoop/hadoop-openstack.jar /usr/lib/hadoop/hadoop-distcp.jar /usr/lib/hadoop/hadoop-annotations.jar /usr/lib/hadoop/hadoop-distcp-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-streaming.jar /usr/lib/hadoop/hadoop-rumen-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-common.jar /usr/lib/hadoop/hadoop-nfs-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-common-2.7.1-amzn-1-tests.jar /usr/lib/hadoop/hadoop-ant-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-datajoin.jar /usr/lib/hadoop/hadoop-ant.jar /usr/lib/hadoop/hadoop-extras-2.7.1-amzn-1.jar /usr/lib/hadoop/hadoop-openstack-2.7.1-amzn-1.jar /usr/lib/hadoop/lib/jackson-xc-1.9.13.jar /usr/lib/hadoop/lib/api-asn1-api-1.0.0-M20.jar /usr/lib/hadoop/lib/curator-client-2.7.1.jar /usr/lib/hadoop/lib/jackson-mapper-asl-1.9.13.jar /usr/lib/hadoop/lib/commons-io-2.4.jar /usr/lib/hadoop/lib/jackson-jaxrs-1.9.13.jar /usr/lib/hadoop/lib/log4j-1.2.17.jar /usr/lib/hadoop/lib/junit-4.11.jar /usr/lib/hadoop/lib/apacheds-i18n-2.0.0-M15.jar /usr/lib/hadoop/lib/commons-cli-1.2.jar /usr/lib/hadoop/lib/curator-recipes-2.7.1.jar /usr/lib/hadoop/lib/xmlenc-0.52.jar /usr/lib/hadoop/lib/zookeeper-3.4.6.jar /usr/lib/hadoop/lib/jsr305-3.0.0.jar /usr/lib/hadoop/lib/htrace-core-3.1.0-incubating.jar /usr/lib/hadoop/lib/httpclient-4.3.4.jar /usr/lib/hadoop/lib/jettison-1.1.jar /usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar /usr/lib/hadoop/lib/commons-math3-3.1.1.jar /usr/lib/hadoop/lib/jersey-core-1.9.jar /usr/lib/hadoop/lib/httpcore-4.3.2.jar /usr/lib/hadoop/lib/commons-compress-1.4.1.jar /usr/lib/hadoop/lib/asm-3.2.jar /usr/lib/hadoop/lib/slf4j-api-1.7.10.jar /usr/lib/hadoop/lib/xz-1.0.jar /usr/lib/hadoop/lib/commons-collections-3.2.1.jar /usr/lib/hadoop/lib/commons-net-3.1.jar /usr/lib/hadoop/lib/commons-configuration-1.6.jar /usr/lib/hadoop/lib/jetty-util-6.1.26-emr.jar /usr/lib/hadoop/lib/commons-codec-1.4.jar /usr/lib/hadoop/lib/protobuf-java-2.5.0.jar /usr/lib/hadoop/lib/jetty-6.1.26-emr.jar /usr/lib/hadoop/lib/java-xmlbuilder-0.4.jar /usr/lib/hadoop/lib/apacheds-kerberos-codec-2.0.0-M15.jar /usr/lib/hadoop/lib/commons-logging-1.1.3.jar /usr/lib/hadoop/lib/jersey-json-1.9.jar /usr/lib/hadoop/lib/jackson-core-asl-1.9.13.jar /usr/lib/hadoop/lib/gson-2.2.4.jar /usr/lib/hadoop/lib/stax-api-1.0-2.jar /usr/lib/hadoop/lib/commons-digester-1.8.jar /usr/lib/hadoop/lib/servlet-api-2.5.jar /usr/lib/hadoop/lib/curator-framework-2.7.1.jar /usr/lib/hadoop/lib/commons-httpclient-3.1.jar /usr/lib/hadoop/lib/jets3t-0.9.0.jar /usr/lib/hadoop/lib/jaxb-api-2.2.2.jar /usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar /usr/lib/hadoop/lib/mockito-all-1.8.5.jar /usr/lib/hadoop/lib/snappy-java-1.0.4.1.jar /usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar /usr/lib/hadoop/lib/paranamer-2.3.jar /usr/lib/hadoop/lib/avro-1.7.4.jar /usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar /usr/lib/hadoop/lib/jsp-api-2.1.jar /usr/lib/hadoop/lib/api-util-1.0.0-M20.jar /usr/lib/hadoop/lib/activation-1.1.jar /usr/lib/hadoop/lib/emr-metrics-client-2.1.0.jar /usr/lib/hadoop/lib/commons-lang-2.6.jar /usr/lib/hadoop/lib/jersey-server-1.9.jar /usr/lib/hadoop/lib/guava-11.0.2.jar /usr/lib/hadoop/lib/jsch-0.1.42.jar /usr/lib/hadoop/lib/netty-3.6.2.Final.jar /usr/lib/hadoop/lib/hamcrest-core-1.3.jar /usr/lib/hadoop-hdfs/hadoop-hdfs.jar /usr/lib/hadoop-hdfs/hadoop-hdfs-nfs-2.7.1-amzn-1.jar /usr/lib/hadoop-hdfs/hadoop-hdfs-2.7.1-amzn-1.jar /usr/lib/hadoop-hdfs/hadoop-hdfs-2.7.1-amzn-1-tests.jar /usr/lib/hadoop-hdfs/hadoop-hdfs-nfs.jar /usr/lib/hadoop-hdfs/lib/jackson-mapper-asl-1.9.13.jar /usr/lib/hadoop-hdfs/lib/commons-io-2.4.jar /usr/lib/hadoop-hdfs/lib/log4j-1.2.17.jar /usr/lib/hadoop-hdfs/lib/commons-daemon-1.0.13.jar /usr/lib/hadoop-hdfs/lib/commons-cli-1.2.jar /usr/lib/hadoop-hdfs/lib/xmlenc-0.52.jar /usr/lib/hadoop-hdfs/lib/jsr305-3.0.0.jar /usr/lib/hadoop-hdfs/lib/htrace-core-3.1.0-incubating.jar /usr/lib/hadoop-hdfs/lib/jersey-core-1.9.jar /usr/lib/hadoop-hdfs/lib/httpcore-4.3.2.jar /usr/lib/hadoop-hdfs/lib/asm-3.2.jar /usr/lib/hadoop-hdfs/lib/netty-all-4.0.23.Final.jar /usr/lib/hadoop-hdfs/lib/leveldbjni-all-1.8.jar /usr/lib/hadoop-hdfs/lib/xml-apis-1.3.04.jar /usr/lib/hadoop-hdfs/lib/jetty-util-6.1.26-emr.jar /usr/lib/hadoop-hdfs/lib/commons-codec-1.4.jar /usr/lib/hadoop-hdfs/lib/protobuf-java-2.5.0.jar /usr/lib/hadoop-hdfs/lib/jetty-6.1.26-emr.jar /usr/lib/hadoop-hdfs/lib/commons-logging-1.1.3.jar /usr/lib/hadoop-hdfs/lib/jackson-core-asl-1.9.13.jar /usr/lib/hadoop-hdfs/lib/gson-2.2.4.jar /usr/lib/hadoop-hdfs/lib/servlet-api-2.5.jar /usr/lib/hadoop-hdfs/lib/xercesImpl-2.9.1.jar /usr/lib/hadoop-hdfs/lib/emr-metrics-client-2.1.0.jar /usr/lib/hadoop-hdfs/lib/commons-lang-2.6.jar /usr/lib/hadoop-hdfs/lib/jersey-server-1.9.jar /usr/lib/hadoop-hdfs/lib/guava-11.0.2.jar /usr/lib/hadoop-hdfs/lib/netty-3.6.2.Final.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/jackson-xc-1.9.13.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-rds-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-extras.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar /usr/lib/hadoop-mapreduce/api-asn1-api-1.0.0-M20.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-elasticbeanstalk-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-hs-plugins.jar /usr/lib/hadoop-mapreduce/curator-client-2.7.1.jar /usr/lib/hadoop-mapreduce/jackson-mapper-asl-1.9.13.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-marketplacecommerceanalytics-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-datapipeline-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-shuffle-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/commons-io-2.4.jar /usr/lib/hadoop-mapreduce/hadoop-archives-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/jackson-jaxrs-1.9.13.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cloudtrail-1.10.48.jar /usr/lib/hadoop-mapreduce/log4j-1.2.17.jar /usr/lib/hadoop-mapreduce/junit-4.11.jar /usr/lib/hadoop-mapreduce/hadoop-aws-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cloudfront-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-machinelearning-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-iam-1.10.48.jar /usr/lib/hadoop-mapreduce/jackson-databind-2.4.4.jar /usr/lib/hadoop-mapreduce/hadoop-sls-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/apacheds-i18n-2.0.0-M15.jar /usr/lib/hadoop-mapreduce/commons-cli-1.2.jar /usr/lib/hadoop-mapreduce/curator-recipes-2.7.1.jar /usr/lib/hadoop-mapreduce/xmlenc-0.52.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-efs-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-devicefarm-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-auth-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/commons-lang3-3.3.2.jar /usr/lib/hadoop-mapreduce/zookeeper-3.4.6.jar /usr/lib/hadoop-mapreduce/jsr305-3.0.0.jar /usr/lib/hadoop-mapreduce/htrace-core-3.1.0-incubating.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-core-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cognitoidentity-1.10.48.jar /usr/lib/hadoop-mapreduce/httpclient-4.3.4.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-common-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/jettison-1.1.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-hs.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-autoscaling-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-simpledb-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-kms-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-api-gateway-1.10.48.jar /usr/lib/hadoop-mapreduce/commons-beanutils-1.7.0.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-dynamodb-1.10.48.jar /usr/lib/hadoop-mapreduce/commons-math3-3.1.1.jar /usr/lib/hadoop-mapreduce/jersey-core-1.9.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-config-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-hs-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-ssm-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-sls.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cloudwatchmetrics-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-gridmix.jar /usr/lib/hadoop-mapreduce/httpcore-4.3.2.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-app-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-ses-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-auth.jar /usr/lib/hadoop-mapreduce/commons-compress-1.4.1.jar /usr/lib/hadoop-mapreduce/hadoop-gridmix-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/asm-3.2.jar /usr/lib/hadoop-mapreduce/xz-1.0.jar /usr/lib/hadoop-mapreduce/commons-collections-3.2.1.jar /usr/lib/hadoop-mapreduce/commons-net-3.1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cloudformation-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-rumen.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-shuffle.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core.jar /usr/lib/hadoop-mapreduce/hadoop-azure-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-emr-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.7.1-amzn-1-tests.jar /usr/lib/hadoop-mapreduce/commons-configuration-1.6.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-ecr-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-ec2-1.10.48.jar /usr/lib/hadoop-mapreduce/jetty-util-6.1.26-emr.jar /usr/lib/hadoop-mapreduce/hadoop-azure.jar /usr/lib/hadoop-mapreduce/commons-codec-1.4.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-importexport-1.10.48.jar /usr/lib/hadoop-mapreduce/protobuf-java-2.5.0.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-iot-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-datajoin-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/jetty-6.1.26-emr.jar /usr/lib/hadoop-mapreduce/java-xmlbuilder-0.4.jar /usr/lib/hadoop-mapreduce/apacheds-kerberos-codec-2.0.0-M15.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-glacier-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-waf-1.10.48.jar /usr/lib/hadoop-mapreduce/jackson-core-2.4.4.jar /usr/lib/hadoop-mapreduce/commons-logging-1.1.3.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-elastictranscoder-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-events-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-codepipeline-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-elasticache-1.10.48.jar /usr/lib/hadoop-mapreduce/jersey-json-1.9.jar /usr/lib/hadoop-mapreduce/jackson-core-asl-1.9.13.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-common.jar /usr/lib/hadoop-mapreduce/hadoop-aws.jar /usr/lib/hadoop-mapreduce/gson-2.2.4.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-redshift-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cognitosync-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-route53-1.10.48.jar /usr/lib/hadoop-mapreduce/stax-api-1.0-2.jar /usr/lib/hadoop-mapreduce/commons-digester-1.8.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cloudhsm-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/servlet-api-2.5.jar /usr/lib/hadoop-mapreduce/curator-framework-2.7.1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-storagegateway-1.10.48.jar /usr/lib/hadoop-mapreduce/commons-httpclient-3.1.jar /usr/lib/hadoop-mapreduce/hadoop-archives.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-openstack.jar /usr/lib/hadoop-mapreduce/jets3t-0.9.0.jar /usr/lib/hadoop-mapreduce/jaxb-api-2.2.2.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-opsworks-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-distcp.jar /usr/lib/hadoop-mapreduce/mockito-all-1.8.5.jar /usr/lib/hadoop-mapreduce/snappy-java-1.0.4.1.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-app.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-ecs-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-sts-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-codedeploy-1.10.48.jar /usr/lib/hadoop-mapreduce/jackson-annotations-2.4.4.jar /usr/lib/hadoop-mapreduce/hadoop-distcp-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/jaxb-impl-2.2.3-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-directory-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cloudsearch-1.10.48.jar /usr/lib/hadoop-mapreduce/paranamer-2.3.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-swf-libraries-1.10.48.jar /usr/lib/hadoop-mapreduce/avro-1.7.4.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-support-1.10.48.jar /usr/lib/hadoop-mapreduce/commons-beanutils-core-1.8.0.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-elasticloadbalancing-1.10.48.jar /usr/lib/hadoop-mapreduce/jsp-api-2.1.jar /usr/lib/hadoop-mapreduce/azure-storage-2.0.0.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-logs-1.10.48.jar /usr/lib/hadoop-mapreduce/metrics-core-3.0.1.jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-sqs-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-kinesis-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-rumen-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/api-util-1.0.0-M20.jar /usr/lib/hadoop-mapreduce/activation-1.1.jar /usr/lib/hadoop-mapreduce/emr-metrics-client-2.1.0.jar /usr/lib/hadoop-mapreduce/commons-lang-2.6.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-directconnect-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-sns-1.10.48.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-workspaces-1.10.48.jar /usr/lib/hadoop-mapreduce/jersey-server-1.9.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-s3-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-cloudwatch-1.10.48.jar /usr/lib/hadoop-mapreduce/guava-11.0.2.jar /usr/lib/hadoop-mapreduce/hadoop-ant-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/hadoop-datajoin.jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-core-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-elasticsearch-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-ant.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-codecommit-1.10.48.jar /usr/lib/hadoop-mapreduce/jsch-0.1.42.jar /usr/lib/hadoop-mapreduce/netty-3.6.2.Final.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-lambda-1.10.48.jar /usr/lib/hadoop-mapreduce/joda-time-2.8.1.jar /usr/lib/hadoop-mapreduce/hamcrest-core-1.3.jar /usr/lib/hadoop-mapreduce/hadoop-extras-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-simpleworkflow-1.10.48.jar /usr/lib/hadoop-mapreduce/hadoop-openstack-2.7.1-amzn-1.jar /usr/lib/hadoop-mapreduce/aws-java-sdk-inspector-1.10.48.jar /usr/lib/hadoop-mapreduce/lib/jackson-mapper-asl-1.9.13.jar /usr/lib/hadoop-mapreduce/lib/commons-io-2.4.jar /usr/lib/hadoop-mapreduce/lib/log4j-1.2.17.jar /usr/lib/hadoop-mapreduce/lib/junit-4.11.jar /usr/lib/hadoop-mapreduce/lib/javax.inject-1.jar /usr/lib/hadoop-mapreduce/lib/jersey-guice-1.9.jar /usr/lib/hadoop-mapreduce/lib/guice-3.0.jar /usr/lib/hadoop-mapreduce/lib/jersey-core-1.9.jar /usr/lib/hadoop-mapreduce/lib/commons-compress-1.4.1.jar /usr/lib/hadoop-mapreduce/lib/asm-3.2.jar /usr/lib/hadoop-mapreduce/lib/xz-1.0.jar /usr/lib/hadoop-mapreduce/lib/leveldbjni-all-1.8.jar /usr/lib/hadoop-mapreduce/lib/protobuf-java-2.5.0.jar /usr/lib/hadoop-mapreduce/lib/aopalliance-1.0.jar /usr/lib/hadoop-mapreduce/lib/jackson-core-asl-1.9.13.jar /usr/lib/hadoop-mapreduce/lib/snappy-java-1.0.4.1.jar /usr/lib/hadoop-mapreduce/lib/paranamer-2.3.jar /usr/lib/hadoop-mapreduce/lib/avro-1.7.4.jar /usr/lib/hadoop-mapreduce/lib/guice-servlet-3.0.jar /usr/lib/hadoop-mapreduce/lib/jersey-server-1.9.jar /usr/lib/hadoop-mapreduce/lib/netty-3.6.2.Final.jar /usr/lib/hadoop-mapreduce/lib/hamcrest-core-1.3.jar /usr/lib/hadoop-yarn/hadoop-yarn-registry-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-client.jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar /usr/lib/hadoop-yarn/hadoop-yarn-api-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-client-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-resourcemanager.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-sharedcachemanager.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-common.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-web-proxy-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-sharedcachemanager-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-tests-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-nodemanager-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-registry.jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-web-proxy.jar /usr/lib/hadoop-yarn/hadoop-yarn-api.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-tests.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-resourcemanager-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-applicationhistoryservice-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-nodemanager.jar /usr/lib/hadoop-yarn/hadoop-yarn-common-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-common.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-applicationhistoryservice.jar /usr/lib/hadoop-yarn/hadoop-yarn-server-common-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.1-amzn-1.jar /usr/lib/hadoop-yarn/lib/jackson-xc-1.9.13.jar /usr/lib/hadoop-yarn/lib/jackson-mapper-asl-1.9.13.jar /usr/lib/hadoop-yarn/lib/commons-io-2.4.jar /usr/lib/hadoop-yarn/lib/jackson-jaxrs-1.9.13.jar /usr/lib/hadoop-yarn/lib/log4j-1.2.17.jar /usr/lib/hadoop-yarn/lib/commons-cli-1.2.jar /usr/lib/hadoop-yarn/lib/javax.inject-1.jar /usr/lib/hadoop-yarn/lib/jersey-guice-1.9.jar /usr/lib/hadoop-yarn/lib/zookeeper-3.4.6.jar /usr/lib/hadoop-yarn/lib/jsr305-3.0.0.jar /usr/lib/hadoop-yarn/lib/jettison-1.1.jar /usr/lib/hadoop-yarn/lib/guice-3.0.jar /usr/lib/hadoop-yarn/lib/jersey-core-1.9.jar /usr/lib/hadoop-yarn/lib/commons-compress-1.4.1.jar /usr/lib/hadoop-yarn/lib/asm-3.2.jar /usr/lib/hadoop-yarn/lib/xz-1.0.jar /usr/lib/hadoop-yarn/lib/commons-collections-3.2.1.jar /usr/lib/hadoop-yarn/lib/leveldbjni-all-1.8.jar /usr/lib/hadoop-yarn/lib/jetty-util-6.1.26-emr.jar /usr/lib/hadoop-yarn/lib/commons-codec-1.4.jar /usr/lib/hadoop-yarn/lib/protobuf-java-2.5.0.jar /usr/lib/hadoop-yarn/lib/jetty-6.1.26-emr.jar /usr/lib/hadoop-yarn/lib/aopalliance-1.0.jar /usr/lib/hadoop-yarn/lib/commons-logging-1.1.3.jar /usr/lib/hadoop-yarn/lib/jersey-json-1.9.jar /usr/lib/hadoop-yarn/lib/jackson-core-asl-1.9.13.jar /usr/lib/hadoop-yarn/lib/stax-api-1.0-2.jar /usr/lib/hadoop-yarn/lib/servlet-api-2.5.jar /usr/lib/hadoop-yarn/lib/zookeeper-3.4.6-tests.jar /usr/lib/hadoop-yarn/lib/jaxb-api-2.2.2.jar /usr/lib/hadoop-yarn/lib/jaxb-impl-2.2.3-1.jar /usr/lib/hadoop-yarn/lib/jersey-client-1.9.jar /usr/lib/hadoop-yarn/lib/guice-servlet-3.0.jar /usr/lib/hadoop-yarn/lib/activation-1.1.jar /usr/lib/hadoop-yarn/lib/commons-lang-2.6.jar /usr/lib/hadoop-yarn/lib/jersey-server-1.9.jar /usr/lib/hadoop-yarn/lib/guava-11.0.2.jar /usr/lib/hadoop-lzo/lib/hadoop-lzo-0.4.19.jar /usr/lib/hadoop-lzo/lib/hadoop-lzo.jar /usr/share/aws/emr/emrfs/conf /usr/share/aws/emr/emrfs/lib/jsr-275-0.9.1.jar /usr/share/aws/emr/emrfs/lib/junit-4.11.jar /usr/share/aws/emr/emrfs/lib/commons-cli-1.2.jar /usr/share/aws/emr/emrfs/lib/javax.inject-1.jar /usr/share/aws/emr/emrfs/lib/commons-codec-1.9.jar /usr/share/aws/emr/emrfs/lib/httpclient-4.3.4.jar /usr/share/aws/emr/emrfs/lib/commons-httpclient-3.0.jar /usr/share/aws/emr/emrfs/lib/guice-3.0.jar /usr/share/aws/emr/emrfs/lib/httpcore-4.3.2.jar /usr/share/aws/emr/emrfs/lib/joda-time-2.3.jar /usr/share/aws/emr/emrfs/lib/bcprov-jdk15on-1.51.jar /usr/share/aws/emr/emrfs/lib/emrfs-hadoop-2.4.0.jar /usr/share/aws/emr/emrfs/lib/protobuf-java-2.5.0.jar /usr/share/aws/emr/emrfs/lib/slf4j-api-1.7.16.jar /usr/share/aws/emr/emrfs/lib/aopalliance-1.0.jar /usr/share/aws/emr/emrfs/lib/commons-logging-1.1.3.jar /usr/share/aws/emr/emrfs/lib/commons-lang3-3.3.jar /usr/share/aws/emr/emrfs/lib/commons-math-2.1.jar /usr/share/aws/emr/emrfs/lib/gson-2.2.4.jar /usr/share/aws/emr/emrfs/lib/jsr305-2.0.1.jar /usr/share/aws/emr/emrfs/lib/emr-core-2.5.0.jar /usr/share/aws/emr/emrfs/lib/emr-metrics-client-2.1.0.jar /usr/share/aws/emr/emrfs/lib/commons-exec-1.2.jar /usr/share/aws/emr/emrfs/lib/guava-15.0.jar /usr/share/aws/emr/emrfs/lib/bcpkix-jdk15on-1.51.jar /usr/share/aws/emr/emrfs/lib/hamcrest-core-1.3.jar /usr/share/aws/emr/emrfs/auxlib/* /usr/share/aws/emr/lib/jsr-275-0.9.1.jar /usr/share/aws/emr/lib/commons-httpclient-3.0.jar /usr/share/aws/emr/lib/joda-time-2.3.jar /usr/share/aws/emr/lib/slf4j-api-1.7.16.jar /usr/share/aws/emr/lib/commons-codec-1.2.jar /usr/share/aws/emr/lib/gson-2.2.4.jar /usr/share/aws/emr/lib/commons-logging-1.0.3.jar /usr/share/aws/emr/lib/jsr305-2.0.1.jar /usr/share/aws/emr/lib/emr-core-2.5.0.jar /usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar /usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar /usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar /usr/share/aws/emr/cloudwatch-sink/lib/cloudwatch-sink-1.0.0.jar /usr/share/aws/emr/cloudwatch-sink/lib/cloudwatch-sink.jar On Tue, Nov 1, 2016 at 3:57 AM, Till Rohrmann <trohrm...@apache.org> wrote: > Hi Justin, > > I think this might be a problem in Flink's Kinesis consumer. The Flink > Kinesis consumer uses the aws-java-sdk version 1.10.71 which indeed > contains the afore mentioned methods. However, already version 1.10.46 no > longer contains this method. Thus, I suspect, that Yarn puts some older > version of this jar into the classpath. For these cases, I think we have to > shade our aws-java-sdk dependency so that it also works with older versions > of EMR. > > In order to verify this, could you tell us which EMR version you're > running? Additionally, it would be helpful if you sent us the classpath > with which the Flink cluster was started on Yarn. You can find this > information at the beginning of your TaskManager log file. Thanks a lot. > > Cheers, > Till > > On Mon, Oct 31, 2016 at 8:22 PM, Justin Yan <justin....@remitly.com> > wrote: > >> Hi all - first time on the mailing list, so my apologies if I break >> protocol on anything. Really excited to be using Flink, and hoping to be >> active here in the future! Also, apologies for the length of this email - >> I tried to include details but may have gone overboard. >> >> The gist of my problem is an issue with packaging the Flink Kinesis >> Connector into my user code for execution on a YARN cluster in EMR - >> there's some dependency trouble happening, but after about 48 hours of >> attempts, I'm not sure how to make progress, and I'd really appreciate any >> ideas or assistance. Thank you in advance! >> >> ### First, Some Context. >> >> We're hoping to write our Flink jobs in scala 2.11. The Flink JM/TMs >> currently run on an EMR cluster with Hadoop 2.7 as YARN containers. We run >> our jobs via an Azkaban server, which has the Hadoop and Flink clients >> installed, and the configurations are set to point at the YARN master on >> our EMR cluster (with $HADOOP_HOME set so Flink can discover the hadoop >> configs). We're using Java OpenJDK7 everywhere, and Maven 3.3.9 when >> building Flink from source. >> >> We use SBT and the assembly plugin to create an Uberjar of our code and >> its dependencies. This gets uploaded to Azkaban, whereupon the following >> command is run on the azkaban server to execute a Flink job: >> >> flink run -c <className> usercodeuberjar-assembly-1.0.jar >> >> I've successfully run a few flink jobs that execute on our EMR cluster in >> this fashion (the WordCount example, etc.). >> >> ### The Problem >> >> We use AWS Kinesis, and are hoping to integrate Flink with it. >> Naturally, we were hoping to use the Kinesis connector: < >> https://ci.apache.org/projects/flink/flink-docs-release-1. >> 1/apis/streaming/connectors/kinesis.html>. >> >> After following the instructions with some experimentation, I was able to >> run a Flink Kinesis application on my laptop in Local Cluster mode. >> (Ubuntu 16.04, local cluster initiated with the `./start-local.sh` >> command, job submitted via `flink run -c <className> >> usercodeuberjar-assembly-1.0.jar`) >> >> I uploaded the same JAR to Azkaban and tried to run the same command to >> submit to our EMR cluster, and got a `java.lang.NoSuchMethodError: >> com.amazonaws.SDKGlobalConfiguration.isInRegionOptimizedModeEnabled()` >> (I've included the full stack trace at the bottom of this email). I went >> to inspect the uploaded JAR with a `unzip usercodeuberjar-assembly-1.0.jar`, >> looked in `com/amazonaws` and found the SDKGlobalConfiguration.class file. >> I decompiled and inspected it, and the isInRegionOptimizedModeEnabled >> method that was purportedly missing was indeed present. >> >> I've included the steps I took to manifest this problem below, along with >> a variety of things that I tried to do to resolve the problem - any help or >> insight is greatly appreciated! >> >> ### Repro >> >> I'm not sure how to provide a clear repro, but I'll try to include as >> much detail as I can about the sequence of actions and commands I ran since >> there may be some obvious mistakes: >> >> Downloading the flink release to my laptop: >> >> wget http://www-us.apache.org/dist/flink/flink-1.1.3/flink-1.1.3- >> bin-hadoop27-scala_2.11.tgz >> tar xfzv flink-1.1.3-bin-hadoop27-scala_2.11.tgz >> >> I then SSH'd into Azkaban, and ran the same two commands, while adding >> the bin/ directory to my PATH and tweaking the config for >> fs.hdfs.hadoopconf. Next, after getting the flink binaries, I went to >> fetch the source code in order to follow the instructions here: < >> https://ci.apache.org/projects/flink/flink-docs-release-1. >> 1/apis/streaming/connectors/kinesis.html> >> >> wget https://github.com/apache/flink/archive/release-1.1.3.tar.gz >> tar xfzv release-1.1.3.tar.gz >> >> Here, I wanted to leverage our EMR instance profile Role instead of >> passing in credentials, hence I wanted the AUTO value for the >> "aws.credentials.provider" config, which seems to have been added after >> 1.1.3 - I made a couple of small tweaks to AWSConfigConstants.java and >> AWSUtil.java to allow for that AUTO value. >> >> Next, we're using Scala 2.11, so per the instructions here, I changed the >> scala version: <https://ci.apache.org/project >> s/flink/flink-docs-release-1.1/setup/building.html#scala-versions> >> >> tools/change-scala-version.sh 2.11 >> >> Back to the Kinesis Connector documentation... >> >> mvn clean install -Pinclude-kinesis -DskipTests >> cd flink-dist >> mvn clean install -Pinclude-kinesis -DskipTests >> >> When running that second mvn clean install, I get some warnings about the >> maven shade plugin having conflicting versions. I also get a "[WARNING] >> The requested profile "include-kinesis" could not be activated because it >> does not exist." >> >> At this point, the instructions are not too clear on what to do. I >> proceed to this section to try and figure it out: < >> https://ci.apache.org/projects/flink/flink-docs-release-1. >> 1/apis/cluster_execution.html#linking-with-modules-not- >> contained-in-the-binary-distribution> >> >> My goal is to package everything in my usercode JAR, and I'll try to do >> that with SBT. My first try is to install the Flink Kinesis Connector JAR >> generated by mvn clean install to my local Maven Repo: >> >> mvn install:install-file -Dfile=flink-connector-kinesis_2.11-1.1.3.jar >> >> I then build the jar with a build.sbt that looks like this (extraneous >> details removed): >> >> scalaVersion in ThisBuild := "2.11.8" >> >> val flinkVersion = "1.1.3" >> >> val flinkDependencies = Seq( >> "org.apache.flink" %% "flink-scala" % flinkVersion, >> "org.apache.flink" %% "flink-streaming-scala" % flinkVersion, >> "org.apache.flink" %% "flink-connector-kinesis" % flinkVersion >> ) >> >> lazy val proj = (project in file(".")). >> settings( >> libraryDependencies ++= flinkDependencies >> ) >> >> After this builds, I unzip the jar and use JD to decompile the >> com.amazonaws.SDKGlobalConfiguration class file to see if the method in >> question is present or not (it is). I then run the jar locally with a >> `flink run -c <className> usercodeuberjar-assembly-1.0.jar`, and I see >> it running just fine when navigating to localhost:8081. I then upload this >> same JAR to our Azkaban server, and run the same `flink run -c <className> >> usercodeuberjar-assembly-1.0.jar` command to submit as a YARN >> application - this time, I get the `NoSuchMethodError`. >> >> I've tried a variety of permutations of this, so I'll attempt to list >> them out along with their results: >> >> 1. A non-kinesis Flink job: I was able to successfully the example >> WordCount Flink job as a YARN application. >> 2. I mvn installed the newly built flink-scala and flink-streaming-scala >> JARs to my local maven repository in case these were different - after >> building and running on Azkaban... same error. >> 3. Using the newly-built flink-dist JAR (with the -Pinclude-kinesis >> flag): After replacing the flink-dist JAR in the /lib dir on Azkaban (that >> the `flink` command runs), I still had the same error. >> 4. Packaging the JAR in different ways: >> - I tried adding the flink-connector-kinesis JAR by adding it to a >> /lib directory in my SBT project for direct inclusion. This actually >> caused the NoSuchMethodError to occur during *local* execution as well. >> - I tried using mvn-assembly to package all of the >> flink-connector-kinesis dependencies into that JAR, and then added it to >> the /lib directory in my SBT project. Local execution no longer has error, >> but submission from Azkaban still has same error. >> 5. I thought it might be a classpath issue (since my laptop doesn't have >> a hadoop installation, so I figured there may be some kind of collision >> with the AWS SDK included by EMR), so I set, on Azkaban, the environment >> variable FLINK_CLASSPATH=usercodeuberjar-assembly-1.0.jar in order to >> get it prepended - same error. >> 6. I realized this wasn't necessarily doing anything to the resolution >> of classnames of the Flink job executing in YARN. So I dug into the client >> source, which eventually led me to >> flink-clients/.../program/PackagedProgram.java >> which has the following line of code setting the ClassLoader: >> >> this.userCodeClassLoader = >> JobWithJars.buildUserCodeClassLoader(getAllLibraries(), >> classpaths, getClass().getClassLoader()); >> >> getAllLibraries() does seem to set the jar that you pass into the `flink` >> command at the top of the class resolution hierarchy, which, as my previous >> foray into decompilation shows, does seem to include the method that is >> supposedly missing. >> >> At this point, I ran out of ideas to investigate, and so I'm hoping >> someone here is able to help me. Thanks in advance for reading this! >> >> Full Stack Trace: >> >> java.lang.NoSuchMethodError: com.amazonaws.SDKGlobalConfigu >> ration.isInRegionOptimizedModeEnabled()Z >> at com.amazonaws.ClientConfigurationFactory.getConfig(ClientCon >> figurationFactory.java:35) >> at org.apache.flink.streaming.connectors.kinesis.util.AWSUtil. >> createKinesisClient(AWSUtil.java:50) >> at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisP >> roxy.(KinesisProxy.java:118) >> at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisP >> roxy.create(KinesisProxy.java:176) >> at org.apache.flink.streaming.connectors.kinesis.internals.Kine >> sisDataFetcher.(KinesisDataFetcher.java:188) >> at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisCo >> nsumer.run(FlinkKinesisConsumer.java:198) >> at org.apache.flink.streaming.api.operators.StreamSource.run( >> StreamSource.java:80) >> at org.apache.flink.streaming.api.operators.StreamSource.run( >> StreamSource.java:53) >> at org.apache.flink.streaming.runtime.tasks.SourceStreamTask. >> run(SourceStreamTask.java:56) >> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke( >> StreamTask.java:266) >> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:585) >> at java.lang.Thread.run(Thread.java:745) >> > >