[jira] [Commented] (SPARK-13975) Cannot specify extra libs for executor from /extra-lib
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199685#comment-15199685 ] Sean Owen commented on SPARK-13975:
---
I'm not suggesting the user should configure something, but that jars that logically belong with your app should be bundled in your app JAR. At least, that's the intended usage for your use case, as far as I can tell. No paths required. (Or: your users simply bundle your framework with the app they build on it.)

> Cannot specify extra libs for executor from /extra-lib
> ------------------------------------------------------
>
>                 Key: SPARK-13975
>                 URL: https://issues.apache.org/jira/browse/SPARK-13975
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 1.6.1
>            Reporter: Leonid Poliakov
>
> If you build a framework on top of Spark and want to bundle it with Spark, there is no easy way to add your framework libs to the executor classpath.
> Let's say I want to add my custom libs to the {{/extra-lib}} folder, ship the new bundle (with my libs in it) to the nodes, and run the bundle. I want executors on a node to always load my libs from {{/extra-lib}} automatically, because that is how future developers would use the framework out of the box.
> The config doc says you can specify an extra classpath for the executor in {{spark-defaults.conf}}, which is good because a custom config may be put in the bundle for the framework, but the syntax of the property is unclear.
> You specify the value that will be appended to {{-cp}} for an executor Java process, so it follows the usual Java classpath rules, which leaves two options:
> 1. specify an absolute path
> bq. spark.executor.extraClassPath /home/user/Apps/spark-bundled/extra-lib/*
> 2. specify a relative path
> bq. spark.executor.extraClassPath ../../../extra-lib/*
> Neither option works well: an absolute path won't work at all, since you cannot know where users will put the bundle, and a relative path is fragile, because the executor's working directory is set to something like {{/work/app-20160316070310-0002/0}} and can also break if a custom worker folder is configured.
> So a proper way is needed to bundle custom libs and set the executor classpath to load them.
> *Expected*: you can specify {{spark.executor.extraClassPath}} relative to {{$SPARK_HOME}} using placeholders, e.g. with the following syntax:
> bq. spark.executor.extraClassPath ${home}/extra-lib/*
> The code would resolve placeholders in properties to a proper path, and the executor would get an absolute path in {{-cp}} this way.
> *Actual*: you cannot specify extra libs for the executor relative to {{$SPARK_HOME}}.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
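Until Spark resolves placeholders like the proposed {{${home}}}, the absolute-vs-relative dilemma above can be worked around with an install-time step that bakes the bundle's actual location into the config. A minimal sketch, assuming the bundle unpacks to a {{spark-bundled}} directory (the name is illustrative):

```shell
# Install-time expansion: write an absolute extraClassPath entry into
# spark-defaults.conf, so the executor -cp is correct wherever the user
# unpacks the bundle. Spark itself does not expand placeholders in
# property values, hence this step.
SPARK_HOME="${SPARK_HOME:-$PWD/spark-bundled}"   # assumed install location

mkdir -p "$SPARK_HOME/conf" "$SPARK_HOME/extra-lib"
printf 'spark.executor.extraClassPath %s/extra-lib/*\n' "$SPARK_HOME" \
  > "$SPARK_HOME/conf/spark-defaults.conf"

cat "$SPARK_HOME/conf/spark-defaults.conf"
```

This keeps the description's goal (libs load out of the box) at the cost of one extra step in the bundle's install script rather than in Spark itself.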
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199662#comment-15199662 ] Sean Owen commented on SPARK-13975:
---
You would generally control the environment Spark runs in, and would use this to put files in a known uniform place on the classpath. Dependent libraries should be bundled with your app though, if you need to control their distribution.
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199675#comment-15199675 ] Leonid Poliakov commented on SPARK-13975:
---
I am trying to avoid adding extra steps for users to perform when they deal with the bundle. It would be great if the libs were loaded out of the box, and that is what users would actually expect: there is no good reason to make them go into a config and add an absolute path to something that is already inside the bundle.
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199705#comment-15199705 ] Leonid Poliakov commented on SPARK-13975:
---
Thanks for your quick response, Sean.
The problem is that I'm bundling extra files with my framework, specifically a memory cache that Spark will interact with to store/read RDDs, etc. The libs the memcache needs are quite large, and I can't really reduce their size. So user apps would have to do one of the following:
a. include the libs and be built as a fat jar - this slows down app transfer, which I really want to avoid
b. put a wrapper around spark-submit to append the libs to {{--jars}} - this is how I do it right now, but executors fetch the jars each time, which causes fetched jars to pile up in {{/work}} and requires manual clean-up
c. load the jars in each executor - that's what I want, but I can't figure out how to do it
I know it's a rare use case for Spark, but I believe shipping custom libs with Spark to the nodes in one bundle should be a very portable and easy approach.
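Option (b) above can be sketched as a small wrapper function; the directory layout and function name here are illustrative, not part of Spark, and the sketch prints the command instead of exec'ing it so it runs without a Spark installation:

```shell
# Sketch of option (b): wrap spark-submit and append every jar found under
# an extra-lib directory to --jars, which expects a comma-separated list.
submit_with_extra_libs() {
  extra_lib="$1"; shift
  # Join all jar paths with commas; empty if the directory has no jars.
  jars=$(ls "$extra_lib"/*.jar 2>/dev/null | paste -sd, -)
  # Print rather than exec so the sketch is runnable anywhere; a real
  # wrapper would call: exec spark-submit --jars "$jars" "$@"
  echo spark-submit --jars "$jars" "$@"
}
```

Usage would look like `submit_with_extra_libs /opt/bundle/extra-lib --class com.example.Main app.jar`. This reproduces the drawback Leonid describes: the jars are re-fetched into each application's {{/work}} directory on every run.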
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199723#comment-15199723 ] Leonid Poliakov commented on SPARK-13975:
---
Do you know any other clean way to load bundled libs transparently to the user?
As for the implementation: if you think this is a worthwhile feature, I will implement it and create a pull request. It might be useful for other properties too.
[ https://issues.apache.org/jira/browse/SPARK-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199775#comment-15199775 ] Sean Owen commented on SPARK-13975:
---
You can transmit the files separately with --files, and then specify a relative path, but that still means copying.
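The --files approach Sean mentions can be sketched as a CLI fragment (not runnable without a Spark install; the jar and class names are illustrative). Files shipped this way land in each executor's working directory, so a bare file name works as a relative classpath entry:

```shell
# Ship framework jars with the submission; reference them by bare name
# in the executor classpath, relative to the executor working directory.
spark-submit \
  --files extra-lib/memcache-client.jar,extra-lib/memcache-core.jar \
  --conf spark.executor.extraClassPath=memcache-client.jar:memcache-core.jar \
  --class com.example.Main app.jar
```

As noted, this still copies the jars per application, so it shares the clean-up drawback of the --jars wrapper.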