Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-18 Thread Marcelo Vanzin
Hello,

On Tue, Feb 17, 2015 at 8:53 PM, dgoldenberg dgoldenberg...@gmail.com wrote:
> I've tried setting spark.files.userClassPathFirst to true in SparkConf in my
> program, also setting it to true in $SPARK_HOME/conf/spark-defaults.conf as

Is the code in question running on the driver or in some executor?
spark.files.userClassPathFirst only applies to executors. To override
classes in the driver's classpath, you need to modify
spark.driver.extraClassPath (or --driver-class-path in spark-submit's
command line).

In 1.3 there's an option similar to spark.files.userClassPathFirst
that works for the driver too.
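
For readers unfamiliar with what a "user classpath first" setting implies, here is a minimal sketch of child-first class loading in plain Java. It only illustrates the general technique and is not Spark's actual implementation; the class name ChildFirstClassLoader is made up for this example. To my knowledge the 1.3 setting for the driver is spark.driver.userClassPathFirst, but please check the configuration docs for your version.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Hypothetical child-first loader: searches its own URLs (the "user" jars)
 * before delegating to the parent. Illustration only, not Spark's code.
 */
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] userJars, ClassLoader parent) {
        super(userJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Core JDK classes must always come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: look in the user jars before the parent.
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // Not in the user jars: fall back to normal parent delegation.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        // With no user jars configured, every lookup falls through to the parent.
        try (ChildFirstClassLoader loader = new ChildFirstClassLoader(
                new URL[0], ChildFirstClassLoader.class.getClassLoader())) {
            System.out.println(loader.loadClass("java.lang.String") == String.class);
        }
    }
}
```

With the default parent-first delegation, a class that already exists on the parent's classpath (for example an older HttpClient) always wins over the copy in the user jar, which is the behaviour discussed in this thread.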

-- 
Marcelo

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-18 Thread Dmitry Goldenberg
I'm not sure what "on the driver" means, but I've tried setting
spark.files.userClassPathFirst to true, both in
$SPARK_HOME/conf/spark-defaults.conf and programmatically in the SparkConf;
it appears to be ignored. The solution was to follow Emre's recommendation
and downgrade the SolrJ dependency to 4.0.0. That did the trick, as it
appears to use the same HttpClient as the one used by Spark/Hadoop.

The Spark program I'm running is a jar I submit via a spark-submit
invocation.
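
One way to check which jar actually supplies a class at runtime is to ask the JVM for the class's code source. The following is a small sketch under that assumption; WhichJar and its methods are hypothetical names for this example, and the default class name is simply the one from the stack trace in the original post. Run from the main class submitted via spark-submit, it would report what the driver sees.

```java
import java.security.CodeSource;

/** Hypothetical helper that reports where a class was loaded from. */
class WhichJar {

    /** Returns the location of the given class, or a note if the JVM exposes none. */
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader typically have no code source.
        return source == null
                ? "(no code source; likely a JDK/bootstrap class)"
                : String.valueOf(source.getLocation());
    }

    /** Looks the class up by name without initializing it. */
    static String locationOf(String className) {
        try {
            return locationOf(Class.forName(className, false, WhichJar.class.getClassLoader()));
        } catch (ClassNotFoundException e) {
            return "(not on the classpath)";
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.http.impl.conn.SchemeRegistryFactory";
        System.out.println(name + " -> " + locationOf(name));
    }
}
```

If the reported location is a Spark or Hadoop jar rather than the application jar, the older HttpClient is the one being picked up.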






Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-18 Thread Dmitry Goldenberg
Are you proposing I downgrade Solrj's httpclient dependency to be on par with 
that of Spark/Hadoop? Or upgrade Spark/Hadoop's httpclient to the latest?

Solrj has to stay with its selected version. I could try and rebuild Spark with 
the latest httpclient but I've no idea what effects that may cause on Spark.



Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-18 Thread Emre Sevinc
Hello Dmitry,

I had almost the same problem and solved it by using version 4.0.0 of SolrJ:

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>4.0.0</version>
</dependency>

In my case, I was lucky that version 4.0.0 of SolrJ had all the
functionality I needed.

--
Emre Sevinç
http://www.bigindustries.be/





Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-18 Thread Dmitry Goldenberg
Thank you, Emre. It seems solrj still depends on HttpClient 4.1.3; would
that not collide with Spark/Hadoop's default dependency on HttpClient set
to 4.2.6? If that's the case that might just solve the problem.

Would Solrj 4.0.0 work with the latest Solr, 4.10.3?




Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-18 Thread Emre Sevinc
On Wed, Feb 18, 2015 at 4:54 PM, Dmitry Goldenberg dgoldenberg...@gmail.com wrote:

> Thank you, Emre. It seems solrj still depends on HttpClient 4.1.3; would
> that not collide with Spark/Hadoop's default dependency on HttpClient set
> to 4.2.6? If that's the case that might just solve the problem.
>
> Would Solrj 4.0.0 work with the latest Solr, 4.10.3?


In my case, it worked; I mean I was trying to send some documents to the
latest version of Solr server (v4.10.3), and using v4.0.0 of SolrJ worked
without any problems so far. I couldn't find any other way to deal with
this old httpclient dependency problem in Spark.

--
Emre Sevinç
http://www.bigindustries.be/


Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-18 Thread Dmitry Goldenberg
Thanks, Emre! Will definitely try this.






Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-18 Thread Dmitry Goldenberg
I think I'm going to have to rebuild Spark with commons.httpclient.version
set to 4.3.1 which looks to be the version chosen by Solrj, rather than the
4.2.6 that Spark's pom mentions. Might work.




Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-17 Thread dgoldenberg
I'm getting the error below when running spark-submit on my class. This class
has a transitive dependency on HttpClient v4.3.1 since I'm calling SolrJ
4.10.3 from within the class.

This conflicts with the older version, HttpClient 3.1, which is a
dependency of Hadoop 2.4 (I'm running Spark 1.2.1 built for Hadoop 2.4).

I've tried setting spark.files.userClassPathFirst to true in SparkConf in my
program, and also setting it to true in $SPARK_HOME/conf/spark-defaults.conf as

spark.files.userClassPathFirst true

No go, I'm still getting the error, as below. Is there anything else I can
try? Are there any plans in Spark to support multiple class loaders?

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.http.impl.conn.SchemeRegistryFactory.createSystemDefault()Lorg/apache/http/conn/scheme/SchemeRegistry;
        at org.apache.http.impl.client.SystemDefaultHttpClient.createClientConnectionManager(SystemDefaultHttpClient.java:121)
        at org.apache.http.impl.client.AbstractHttpClient.getConnectionManager(AbstractHttpClient.java:445)
        at org.apache.solr.client.solrj.impl.HttpClientUtil.setMaxConnections(HttpClientUtil.java:206)
        at org.apache.solr.client.solrj.impl.HttpClientConfigurer.configure(HttpClientConfigurer.java:35)
        at org.apache.solr.client.solrj.impl.HttpClientUtil.configureClient(HttpClientUtil.java:142)
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:118)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:168)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:141)
        ...
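
A NoSuchMethodError like the one above generally means the class was found but in a version that lacks the method. As a rough sketch, a reflection probe can confirm whether the SchemeRegistryFactory visible at runtime offers createSystemDefault. MethodProbe and hasPublicMethod are hypothetical names for this example; only standard java.lang.reflect calls are used.

```java
import java.lang.reflect.Method;

/** Hypothetical probe: does the class visible at runtime expose a given public method? */
class MethodProbe {

    /** True if the class can be loaded and has a public method with that name. */
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            // Load without initializing, using the same loader as this class.
            Class<?> cls = Class.forName(className, false, MethodProbe.class.getClassLoader());
            for (Method m : cls.getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class is absent or cannot be linked on this classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        String cls = "org.apache.http.impl.conn.SchemeRegistryFactory";
        System.out.println(cls + ".createSystemDefault present: "
                + hasPublicMethod(cls, "createSystemDefault"));
    }
}
```

A result of false when run inside the submitted application would suggest that an older HttpClient jar is shadowing the one SolrJ expects.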





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Class-loading-issue-spark-files-userClassPathFirst-doesn-t-seem-to-be-working-tp21693.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Class loading issue, spark.files.userClassPathFirst doesn't seem to be working

2015-02-17 Thread Arush Kharbanda
Hi,

Did you try making Maven pick the latest version?

http://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Management

That way SolrJ won't cause any issues. You can try this and check whether the
part of your code that accesses HDFS still works.







-- 


*Arush Kharbanda* || Technical Teamlead

ar...@sigmoidanalytics.com || www.sigmoidanalytics.com