And it is a NoSuchMethodError, not a ClassNotFoundException.
And by default, I think Spark is only compiled against Hadoop 2.2? For this issue itself, I just checked the latest Spark (1.3.0), and that version works (because it is packaged with a newer version of httpclient -- I can see the method is there, although I still don't know the exact version). But this doesn't really solve the whole problem: it is very unclear which versions of third-party libraries Spark uses, and even if there is some way to figure that out, it is still a horrible situation to be in.

From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Monday, March 16, 2015 1:06 PM
To: Shuai Zheng
Cc: user
Subject: Re: [SPARK-3638] java.lang.NoSuchMethodError: org.apache.http.impl.conn.DefaultClientConnectionOperator.

From my local maven repo:

$ jar tvf ~/.m2/repository/org/apache/httpcomponents/httpclient/4.2.5/httpclient-4.2.5.jar | grep SchemeRegistry
  1373 Fri Apr 19 18:19:36 PDT 2013 org/apache/http/impl/conn/SchemeRegistryFactory.class
  2954 Fri Apr 19 18:19:36 PDT 2013 org/apache/http/conn/scheme/SchemeRegistry.class
  2936 Fri Apr 19 18:19:36 PDT 2013 org/apache/http/auth/AuthSchemeRegistry.class

If you run mvn dependency:tree, you would see something similar to the following:

[INFO] |  +- org.apache.hadoop:hadoop-client:jar:2.6.0:compile
[INFO] |  |  +- org.apache.hadoop:hadoop-common:jar:2.6.0:compile
[INFO] |  |  |  +- commons-cli:commons-cli:jar:1.2:compile
[INFO] |  |  |  +- xmlenc:xmlenc:jar:0.52:compile
[INFO] |  |  |  +- commons-io:commons-io:jar:2.4:compile
[INFO] |  |  |  +- commons-collections:commons-collections:jar:3.2.1:compile
[INFO] |  |  |  +- commons-lang:commons-lang:jar:2.6:compile
[INFO] |  |  |  +- commons-configuration:commons-configuration:jar:1.6:compile
[INFO] |  |  |  |  +- commons-digester:commons-digester:jar:1.8:compile
[INFO] |  |  |  |  |  \- commons-beanutils:commons-beanutils:jar:1.7.0:compile
[INFO] |  |  |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:compile
[INFO] |  |  |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile
[INFO] |  |  |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8:compile
[INFO] |  |  |  +- org.apache.avro:avro:jar:1.7.6:compile
[INFO] |  |  |  +- com.google.protobuf:protobuf-java:jar:2.5.0:compile
[INFO] |  |  |  +- com.google.code.gson:gson:jar:2.2.4:compile
[INFO] |  |  |  +- org.apache.hadoop:hadoop-auth:jar:2.6.0:compile
[INFO] |  |  |  |  +- org.apache.httpcomponents:httpclient:jar:4.2.5:compile

Cheers

On Mon, Mar 16, 2015 at 9:38 AM, Shuai Zheng <szheng.c...@gmail.com> wrote:

Hi All,

I am running Spark 1.2.1 and the AWS SDK. To make sure the AWS SDK is compatible with httpclient 4.2 (which I assume Spark uses?), I have already downgraded it to version 1.9.0.

But even so, I still get an error:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.http.impl.conn.DefaultClientConnectionOperator.<init>(Lorg/apache/http/conn/scheme/SchemeRegistry;Lorg/apache/http/conn/DnsResolver;)V
        at org.apache.http.impl.conn.PoolingClientConnectionManager.createConnectionOperator(PoolingClientConnectionManager.java:140)
        at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:114)
        at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:99)
        at com.amazonaws.http.ConnectionManagerFactory.createPoolingClientConnManager(ConnectionManagerFactory.java:29)
        at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:102)
        at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:190)
        at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:119)
        at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:410)
        at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:392)
        at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:376)

When I searched the mailing list, it looks like the same issue as:
https://github.com/apache/spark/pull/2535
http://stackoverflow.com/questions/24788949/nosuchmethoderror-while-running-aws-s3-client-on-spark-while-javap-shows-otherwi

But I don't understand the solution mentioned there. The issue is caused by a pre-packaged DefaultClientConnectionOperator in the Spark all-in-one jar file which doesn't have that method.

I have some questions here: how can we find out exactly which version of each library Spark pre-packages (this is really very painful), and how can we override it? I have tried:

val conf = new SparkConf()
  .set("spark.files.userClassPathFirst", "true")    // for non-YARN apps before Spark 1.3
  .set("spark.executor.userClassPathFirst", "true") // for Spark 1.3.0

But it doesn't work.

This really creates a lot of issues for me (especially since we don't know which versions Spark packages into its own jar, so we have to find out by trial and error). Even Maven doesn't give enough information, because httpclient does not appear among the Maven dependencies (not even as an indirect dependency, after I used tools to resolve the whole dependency tree).

Regards,

Shuai
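One way to answer the thread's open question ("which httpclient did my JVM actually load, and does it have the method?") is to ask from inside the application itself: the class loader can report where it found a class, and reflection can probe for a constructor signature without crashing with NoSuchMethodError. A minimal sketch -- java.util.ArrayList is only a stand-in so the snippet runs anywhere; in a Spark driver you would pass org.apache.http.impl.conn.DefaultClientConnectionOperator as the argument:

```java
// Sketch: report which jar a class was actually loaded from at runtime, and
// check for a constructor signature via reflection (which throws a checked
// NoSuchMethodException instead of killing the app with NoSuchMethodError).
public class ClasspathProbe {
    public static void main(String[] args) throws Exception {
        // Stand-in default so this runs anywhere; pass the real class name
        // (e.g. org.apache.http.impl.conn.DefaultClientConnectionOperator)
        // as args[0] inside a Spark job.
        Class<?> cls = Class.forName(args.length > 0 ? args[0] : "java.util.ArrayList");

        // Where did this class come from? Null code source means a JDK
        // bootstrap class; otherwise you get the jar path -- for example,
        // the Spark assembly jar rather than your own httpclient jar.
        java.security.CodeSource src = cls.getProtectionDomain().getCodeSource();
        System.out.println("loaded from: " + (src == null ? "bootstrap" : src.getLocation()));

        // Probe for a constructor. ArrayList(int) exists; for the real case
        // you would pass SchemeRegistry.class and DnsResolver.class instead.
        try {
            cls.getConstructor(int.class);
            System.out.println("constructor found");
        } catch (NoSuchMethodException e) {
            System.out.println("constructor missing");
        }
    }
}
```

Run with the AWS SDK's expected constructor signature, this tells you in one shot whether the httpclient that won the classpath race is new enough, and which jar to blame if it is not.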