[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2015-02-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/3490


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2015-01-07 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-69116580
  
I would recommend that we close this PR for now until we file a 
corresponding JIRA that describes the issue.





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-12-10 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-66516901
  
I see. I think the complexity in merging multiple properties files arises 
when the same config is declared in two of them. What are the expected 
semantics there: do we merge the values somehow? If not, which one is 
overridden? Or do we throw an exception? The worst that could happen is that 
the user thinks a particular value is used, but it is actually being silently 
overridden because another properties file also defines it.

That said, I think I understand your use case a little more. Could you file 
a JIRA and add it to the title? Maybe we can have more people look at this and 
decide whether we actually want to support this.
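
The three options raised above (merge, override, or throw) can be made concrete with a small sketch. This is only an illustration in Python, not code from the PR: `find_conflicts`, `merge_sources`, and the `on_conflict` parameter are hypothetical names, and each properties file is assumed to be already parsed into a dict.

```python
def find_conflicts(sources):
    """Return keys that are defined with different values in more than one source.

    `sources` is a list of (file_name, properties_dict) pairs, in load order.
    """
    seen = {}
    for name, props in sources:
        for key, value in props.items():
            seen.setdefault(key, []).append((name, value))
    # A key is a conflict only if its definitions disagree.
    return {
        key: defs
        for key, defs in seen.items()
        if len({value for _, value in defs}) > 1
    }


def merge_sources(sources, on_conflict="last-wins"):
    """Merge sources in order; either let later files win or refuse to merge.

    Returns (merged_dict, conflicts) so the caller can report overrides
    instead of applying them silently.
    """
    conflicts = find_conflicts(sources)
    if conflicts and on_conflict == "error":
        raise ValueError("conflicting keys: " + ", ".join(sorted(conflicts)))
    merged = {}
    for _, props in sources:
        merged.update(props)  # later sources override earlier ones
    return merged, conflicts
```

Returning the conflicts alongside the merged result is one way to avoid the silent-override case: the caller can log every key whose value was replaced, together with the file that supplied the winning value.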





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-12-09 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-66398025
  
I agree. Would you mind closing this?





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-12-09 Thread lvsoft
Github user lvsoft commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-66404194
  
Sorry for the late reply. I'll explain the use cases for multiple properties 
files. 

Currently I'm working on a benchmark utility for Spark. It is natural to 
adjust properties for different workloads.
I'd like to set up the configuration in two parts: global confs for common 
properties, and private confs for each workload. Without support for 
multiple properties files, I have to merge the properties into a tmp conf file 
and remove it after spark-submit finishes. What's more, when submitting 
multiple workloads multiple times concurrently, the tmp conf file names need 
to be unique. And if the benchmark run is interrupted, the tmp conf files 
are hard to clean up.

So I think a more elegant approach is to add support for multiple 
properties files to Spark.

Another reason for this PR: currently Spark uses `spark-defaults.conf` 
if no properties-file is specified, or uses the specified properties-file and 
*discards* `spark-defaults.conf`. This behavior is also counter-intuitive for 
beginners. In most systems, it is a natural assumption that the values in 
`xxx-defaults.conf` take effect if the properties are not overridden in 
the user's config.
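
For reference, the tmp-conf-file workaround described above might look roughly like the following Python sketch. It is an assumption-laden illustration, not part of the benchmark utility: `merged_conf_file` is a hypothetical helper, and it assumes the properties loader keeps the last definition of a repeated key, so that concatenating the files in order lets later files override earlier ones.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def merged_conf_file(conf_paths):
    """Concatenate properties files into a uniquely named temp file.

    The temp file is removed when the block exits, including on exceptions.
    It is NOT removed if the process is killed outright, which is the
    cleanup problem described in the comment above.
    """
    # mkstemp picks a unique name, so concurrent runs do not collide.
    fd, tmp_path = tempfile.mkstemp(prefix="spark-conf-", suffix=".conf")
    try:
        with os.fdopen(fd, "w") as out:
            for path in conf_paths:
                with open(path) as src:
                    out.write(src.read())
                out.write("\n")  # guard against a missing trailing newline
        yield tmp_path
    finally:
        os.remove(tmp_path)
```

Inside a `with merged_conf_file(["global.conf", "workload.conf"]) as conf:` block (file names are placeholders), the benchmark would pass `conf` to `spark-submit --properties-file`. The sketch shows how much bookkeeping the workaround needs, and that it still leaves files behind after a hard kill.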





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-12-09 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-66404521
  
In your case, why not just add the common properties to each private config 
and set a separate properties file for each workload?
Why would the tmp conf file need to be deleted after the job finishes?
I don't think it is reasonable to make this change.





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-12-09 Thread lvsoft
Github user lvsoft commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-66405387
  
Well, those are separate property files, not *common* properties. 
It would be hard to adjust the common properties and easy to make mistakes.

Deleting tmp files is a common requirement in system design. Of course you 
can ignore tmp files. As I said, I think this is a more elegant approach.





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-12-09 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-66408350
  
As Patrick said, this will make configuration more complex rather than more 
elegant.





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-12-09 Thread lvsoft
Github user lvsoft commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-66414770
  
Well, I can't understand what the complexity of this PR is. I've reviewed 
SPARK-3779, which is marked as related, and didn't find anything related to 
this patch.
Also, this patch is backward compatible with the current `spark-submit` 
behavior.

From my point of view, let's discuss it point by point:
1. Necessity: I've given two reasons, one for the benchmark case and 
one for common intuition in most systems.
2. Complexity: This patch maintains backward compatibility. I described 
its details at the beginning, and I don't see the relationship 
with SPARK-3779.
3. Elegance: I don't think this is the most elegant solution. 
However, to maintain compatibility and minimize the impact on the current 
system, it is a relatively elegant one.





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-11-28 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-64916925
  
Hi @lvsoft - it was not in the design of this component to support multiple 
files, and I'd prefer not to do it. It makes it very hard to reason about the 
effective configuration.





[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-11-26 Thread lvsoft
GitHub user lvsoft opened a pull request:

https://github.com/apache/spark/pull/3490

spark-submit with accept multiple properties-files and merge the values

Currently ```spark-submit``` accepts only one properties-file, and uses 
the default ```spark-defaults.conf``` if none is specified.
A more natural approach is to apply the properties-files sequentially 
on top of ```spark-defaults.conf```.

This PR affects:
1. spark-submit script: joins multiple ```--properties-file``` values with 
commas and stores them in the ```SPARK_SUBMIT_PROPERTIES_FILES``` environment 
variable. Peeks into each properties-file to set the 
```SPARK_SUBMIT_BOOTSTRAP_DRIVER``` flag.
2. SparkSubmitArguments.scala: similar to 1.
3. SparkSubmitDriverBootstrapper.scala: accepts the 
```SPARK_SUBMIT_PROPERTIES_FILES``` variable and calls 
```getPropertiesFromFiles``` for parsing.
4. Utils.scala: adds ```getPropertiesFromFiles``` for parsing 
multiple properties-files.
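
The sequential patching described above can be sketched as follows. This is a minimal Python illustration of the idea, not the Scala code in this PR: `parse_properties` and `merge_properties_files` are hypothetical names, and the parser is a simplified stand-in that handles only `key value`, `key=value`, and `key: value` lines (no escapes or line continuations).

```python
import re
from pathlib import Path


def parse_properties(text):
    """Parse a simplified properties format into a dict of strings."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first separator: '=', ':' or whitespace.
        parts = re.split(r"\s*[=:\s]\s*", line, maxsplit=1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        props[key] = value
    return props


def merge_properties_files(paths):
    """Load files in order; a key in a later file replaces the earlier value."""
    merged = {}
    for path in paths:
        merged.update(parse_properties(Path(path).read_text()))
    return merged
```

Under these assumptions, passing the defaults file first and the user-specified files after it yields the behavior the description asks for: defaults apply unless a later file redefines the key.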

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvsoft/spark spark_submit_with_multi_properties

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3490.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3490


commit c18a266a1fa0c20331faed1193c168c1021edcf1
Author: Lv, Qi qi...@intel.com
Date:   2014-11-25T08:48:03Z

Spark submit accept multiple properties files

commit 752a0581fde0692ee05213b51d0fc0368d8fd205
Author: Lv, Qi qi...@intel.com
Date:   2014-11-26T08:56:29Z

test pass







[GitHub] spark pull request: spark-submit with accept multiple properties-f...

2014-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3490#issuecomment-64739092
  
Can one of the admins verify this patch?

