[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Leonid Poliakov (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199675#comment-15199675 ]

Leonid Poliakov commented on SPARK-13975:
-

I am trying to avoid adding extra steps for users to perform when they deal 
with the bundle.
It would be great if the libs were loaded out of the box.
That is also what users would actually expect, since there is no good excuse for 
making them go into the config and add an absolute path to something that is 
already inside the bundle.

> Cannot specify extra libs for executor from /extra-lib
> --
>
> Key: SPARK-13975
> URL: https://issues.apache.org/jira/browse/SPARK-13975
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 1.6.1
>Reporter: Leonid Poliakov
>
> If you build a framework on top of Spark and want to bundle it with 
> Spark, there is no easy way to add your framework libs to the executor classpath.
> Let's say I want to add my custom libs to an {{/extra-lib}} folder, ship the new 
> bundle (with my libs in it) to the nodes, and run the bundle. I want executors on 
> each node to automatically load my libs from {{/extra-lib}}, because that's 
> how future developers would use the framework out of the box.
> The config doc says you can specify an extra classpath for the executor in 
> {{spark-defaults.conf}}, which is good because a custom config can be shipped in 
> the bundle with the framework, but the syntax of the property is unclear.
> You basically specify a value that is appended to {{-cp}} for the 
> executor Java process, so it follows the standard Java classpath rules, 
> which leaves two options:
> 1. specify absolute path
> bq. spark.executor.extraClassPath /home/user/Apps/spark-bundled/extra-lib/*
> 2. specify relative path
> bq. spark.executor.extraClassPath ../../../extra-lib/*
> Neither way looks good: an absolute path won't work at all since you 
> cannot know where users will put the bundle, and a relative path is fragile 
> because the executor has its working directory set to something like 
> {{/work/app-20160316070310-0002/0}} and can also break if a custom worker 
> folder is configured.
> So a proper way is needed to bundle custom libs and put them on the executor 
> classpath.
> *Expected*: you can specify {{spark.executor.extraClassPath}} relative to 
> {{$SPARK_HOME}} using placeholders, e.g. with the following syntax:
> bq. spark.executor.extraClassPath ${home}/extra-lib/*
> The code would resolve placeholders in properties to a proper path, so the 
> executor gets an absolute path in {{-cp}}.
> *Actual*: you cannot specify extra libs for the executor relative to 
> {{$SPARK_HOME}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Leonid Poliakov (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199705#comment-15199705 ]

Leonid Poliakov commented on SPARK-13975:
-

Thanks for your quick response, Sean.

The problem is that I'm bundling extra files with my framework, specifically a 
memory cache that Spark interacts with to store and read RDDs, etc.
The libs needed by the memcache are quite large, and I can't really reduce 
their size.
So user apps have to do one of the following:
a. include the libs and be built as a fat jar
This slows down app transfer, which I really want to avoid
b. put a wrapper around spark-submit to append the libs to {{--jars}}
This is how I do it right now, but executors fetch the jars every time, which 
causes the fetched jars to pile up in {{/work}} and require manual clean-up
c. load the jars in each executor
That's what I want, but I can't figure out how to do it

I know it's a rare use case for Spark, but I believe shipping custom libs with 
Spark to the nodes in one bundle should be a portable and easy approach.




[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Leonid Poliakov (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199723#comment-15199723 ]

Leonid Poliakov commented on SPARK-13975:
-

Do you know any other clean way to load bundled libs transparently to the user?
As for the implementation: I will implement it and create a pull request if you 
think it is a feature worth having. It might be useful for other properties as well.




[jira] [Created] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-20 Thread Leonid Poliakov (JIRA)
Leonid Poliakov created SPARK-13975:
---

 Summary: Cannot specify extra libs for executor from /extra-lib
 Key: SPARK-13975
 URL: https://issues.apache.org/jira/browse/SPARK-13975
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 1.6.1
Reporter: Leonid Poliakov


If you build a framework on top of Spark and want to bundle it with Spark, 
there is no easy way to add your framework libs to the executor classpath.

Let's say I want to add my custom libs to an {{/extra-lib}} folder, ship the new 
bundle (with my libs in it) to the nodes, and run the bundle. I want executors on 
each node to automatically load my libs from {{/extra-lib}}, because that's how 
future developers would use the framework out of the box.

The config doc says you can specify an extra classpath for the executor in 
{{spark-defaults.conf}}, which is good because a custom config can be shipped in 
the bundle with the framework, but the syntax of the property is unclear.
You basically specify a value that is appended to {{-cp}} for the 
executor Java process, so it follows the standard Java classpath rules, 
which leaves two options:
1. specify absolute path
bq. spark.executor.extraClassPath /home/user/Apps/spark-bundled/extra-lib/*
2. specify relative path
bq. spark.executor.extraClassPath ../../../extra-lib/*

Neither way looks good: an absolute path won't work at all since you 
cannot know where users will put the bundle, and a relative path is fragile 
because the executor has its working directory set to something like 
{{/work/app-20160316070310-0002/0}} and can also break if a custom worker 
folder is configured.

So a proper way is needed to bundle custom libs and put them on the executor 
classpath.

*Expected*: you can specify {{spark.executor.extraClassPath}} relative to 
{{$SPARK_HOME}} using placeholders, e.g. with the following syntax:
bq. spark.executor.extraClassPath ${home}/extra-lib/*
The code would resolve placeholders in properties to a proper path, so the 
executor gets an absolute path in {{-cp}}.

*Actual*: you cannot specify extra libs for the executor relative to {{$SPARK_HOME}}
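The proposed resolution step could look roughly like the sketch below. Note that the {{${home}}} placeholder and the substitution itself are part of this proposal, not an existing Spark feature; only {{SPARK_HOME}} is assumed to be set by the usual launcher scripts:

```shell
# Hypothetical resolution of the proposed ${home} placeholder before the
# executor JVM is launched. The raw value comes from spark-defaults.conf.
raw_classpath='${home}/extra-lib/*'

# Substitute the proposed ${home} placeholder with the absolute Spark home
resolved_classpath=$(printf '%s' "$raw_classpath" | sed "s|\${home}|${SPARK_HOME}|g")

# The executor would then receive an absolute path in -cp, e.g.:
#   java -cp "...:${resolved_classpath}" <executor main class> ...
```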


