[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199775#comment-15199775
 ] 

Sean Owen commented on SPARK-13975:
---

You can transmit the files separately with --files, and then specify a relative 
path, but that still means copying. 
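
For example, something along these lines should copy the jar into each 
executor's working directory, where a relative classpath entry can pick it up 
(the file name and local path below are only illustrative):
bq. spark-submit --files /local/path/cache-lib.jar --conf spark.executor.extraClassPath=cache-lib.jar ...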

> Cannot specify extra libs for executor from /extra-lib
> --
>
> Key: SPARK-13975
> URL: https://issues.apache.org/jira/browse/SPARK-13975
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Submit
>Affects Versions: 1.6.1
>Reporter: Leonid Poliakov
>Priority: Minor
>
> If you build a framework on top of Spark and want to bundle it with Spark, 
> there is no easy way to add your framework libs to the executor classpath.
> Let's say I want to add my custom libs to an {{/extra-lib}} folder, ship the 
> new bundle (with my libs in it) to the nodes, and run the bundle. I want 
> executors on each node to always load my libs from {{/extra-lib}} 
> automatically, because that's how future developers would use the framework 
> out of the box.
> The config doc says you can specify extraClassPath for the executor in 
> {{spark-defaults.conf}}, which is good because a custom config can be shipped 
> inside the framework bundle, but the syntax of the property is unclear.
> You basically specify the value that will be appended to {{-cp}} of the 
> executor Java process, so it follows the usual Java classpath rules, which 
> leaves two options:
> 1. specify an absolute path
> bq. spark.executor.extraClassPath /home/user/Apps/spark-bundled/extra-lib/*
> 2. specify a relative path
> bq. spark.executor.extraClassPath ../../../extra-lib/*
> Neither of these looks good: an absolute path won't work at all since you 
> cannot know where users will put the bundle, and a relative path is fragile 
> because the executor has its working directory set to something like 
> {{/work/app-20160316070310-0002/0}}, which can also break if a custom worker 
> folder is configured.
> So a proper way is needed to bundle custom libs and have the executor 
> classpath pick them up.
> *Expected*: you can specify {{spark.executor.extraClassPath}} relative to 
> {{$SPARK_HOME}} using placeholders, e.g. with the following syntax:
> bq. spark.executor.extraClassPath ${home}/extra-lib/*
> The code would resolve the placeholder in the property to a proper path, so 
> the executor gets an absolute path in {{-cp}}.
> *Actual*: you cannot specify extra libs for the executor relative to 
> {{$SPARK_HOME}}






[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Leonid Poliakov (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199723#comment-15199723
 ] 

Leonid Poliakov commented on SPARK-13975:
-

Do you know any other clean way to load bundled libs transparently to the user?
As for the implementation, I will implement it and create a pull request if 
you think it is a feature worth having. It might be useful for other 
properties too.
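
For illustration, here is a rough Scala sketch of the placeholder resolution I 
have in mind. The object and method names are made up, and nothing like this 
exists in Spark today; it only shows the idea of expanding the ${home} 
placeholder against the worker's SPARK_HOME before the value goes into {{-cp}}:
{code}
// Rough sketch only: this helper does not exist in Spark, and the names are invented.
// The idea: expand a ${home} placeholder in a property value into the worker's
// absolute SPARK_HOME before the value is appended to the executor's -cp.
object ClassPathPlaceholders {
  private val HomePlaceholder = "${home}"

  def resolve(rawValue: String, sparkHome: String): String =
    // "${home}/extra-lib/*" becomes e.g. "/opt/spark/extra-lib/*"
    rawValue.replace(HomePlaceholder, sparkHome)
}

// Hypothetical usage by the worker when it builds the executor command:
// val raw = conf.get("spark.executor.extraClassPath")
// val cp  = ClassPathPlaceholders.resolve(raw, sys.env("SPARK_HOME"))
// ...then pass cp on to the executor's -cp...
{code}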




[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Leonid Poliakov (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199705#comment-15199705
 ] 

Leonid Poliakov commented on SPARK-13975:
-

Thanks for your quick response, Sean.

The problem is, I'm bundling extra files with my framework, specifically a 
memory cache that Spark will interact with to store/read RDDs, etc.
The libs the memory cache needs are quite large, and I can't really reduce 
their size.
So user apps have to do one of the following:
a. include the libs and be built as a fat jar
This slows down app transfer, which I really want to avoid
b. put a wrapper around spark-submit to append the libs to {{--jars}} (see the 
one-liner below)
This is how I do it right now, but executors fetch the jars each time, so the 
fetched jars pile up in {{/work}} and manual clean-up is required
c. load the jars in each executor
That's what I want, but I can't figure out how to do it

I know it's a rare use case for Spark, but I believe shipping custom libs to 
the nodes in one bundle with Spark would be a very portable and easy approach.
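
For reference, the wrapper in option (b) boils down to something like this 
({{$FRAMEWORK_HOME}} and the jar layout are placeholders of my own, just to 
show the idea):
bq. spark-submit --jars "$(ls $FRAMEWORK_HOME/extra-lib/*.jar | paste -sd, -)" ...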




[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Leonid Poliakov (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199675#comment-15199675
 ] 

Leonid Poliakov commented on SPARK-13975:
-

I am trying to avoid adding extra steps for users to perform when they deal 
with the bundle.
It would be great if the libs were loaded out of the box.
That's also what a user would actually expect, since there is no good excuse 
for making the user go into the config and add an absolute path to something 
that is already inside the bundle.




[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-19 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199662#comment-15199662
 ] 

Sean Owen commented on SPARK-13975:
---

You would generally control the environment Spark runs in, and would use this 
to put files in a known uniform place on the classpath. Dependent libraries 
should be bundled with your app though if you need to control their 
distribution.
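
For example, if the framework were always installed at a fixed location on 
every worker (the path below is just an assumption), the per-node 
{{spark-defaults.conf}} could simply point at it:
bq. spark.executor.extraClassPath /opt/my-framework/extra-lib/*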




[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib

2016-03-18 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199685#comment-15199685
 ] 

Sean Owen commented on SPARK-13975:
---

I'm not suggesting the user should configure something, but that jars that 
logically belong with your app should be bundled in your app JAR. At least, 
that's the intended usage for your use case, as far as I can tell. No paths 
required. (Or: your users just bundle your framework with the app they build 
on it.)
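
Concretely, if the user's application is built with sbt (an assumption here, 
not something Spark requires), that would mean producing an assembled "fat" 
jar, e.g. with the sbt-assembly plugin, and submitting it as a single artifact; 
the class name and artifact path below are illustrative only:
bq. sbt assembly
bq. spark-submit --class com.example.Main target/scala-2.10/my-app-assembly-1.0.jar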
