Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
I am closing the PR since there is no consensus regarding new methods.
---
-
To unsubscribe, e-mail:
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
> Can we add the methods as experimental and if we will observe some
problems in the upcoming releases, we will just remove them?
For clarification, I think we could but if there was
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> Unless there is some other compelling reason for introducing this which I
have missed; I am -1 on introducing this change.
I would like to describe one class of use cases which you don't
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/21589
Thank you, @HyukjinKwon
There are a significant number of Spark users who use the Job Scheduler
model with a SparkContext shared across many users and many Jobs. Promoting
tools and
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
I wouldn't argue over who takes better care of or better represents users,
though. That's easily biased. If there's a technical concern from a committer
or PMC member, I wouldn't go for it.
---
Github user ssimeonov commented on the issue:
https://github.com/apache/spark/pull/21589
> Repartitioning based upon a snapshot of the number of cores available
cluster-wide is clearly not the correct thing to do in many instances and use
cases.
I wholeheartedly agree and I
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/21589
I don't accept your assertions of what constitutes the majority and minority
of Spark users or use cases or their relative importance. As a long-time
maintainer of the Spark scheduler, it is
Github user ssimeonov commented on the issue:
https://github.com/apache/spark/pull/21589
@markhamstra I am confused about your API evaluation criteria.
You are not arguing about the specific benefits these changes can provide
immediately to an increasing majority of Spark
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/21589
It is precisely because the audience that I am concerned with is not
limited to just data scientists or notebook users and their particular needs
that I am far from convinced that exposing
Github user ssimeonov commented on the issue:
https://github.com/apache/spark/pull/21589
@markhamstra even the words you are using indicate that you are missing the
intended audience.
> high-level, declarative abstraction that can be used to specify requested
Job
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/21589
@ssimeonov the purpose of a public API is not to offer hack solutions to a
subset of problems. What is needed is a high-level, declarative abstraction
that can be used to specify requested Job
Github user ssimeonov commented on the issue:
https://github.com/apache/spark/pull/21589
@markhamstra the purpose of this PR is not to address the topic of dynamic
resource management in arbitrarily complex Spark environments. Most Spark users
do not operate in such environments. It
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/21589
No, defaultParallelism isn't more useful in that case, but that just starts
getting to my overall assessment of this JIRA and PR: It smells of defining the
problem to align with a preconception
Github user ssimeonov commented on the issue:
https://github.com/apache/spark/pull/21589
@mridulm your comments make an implicit assumption, which is quite
incorrect: that Spark users read the Spark codebase and/or are aware of Spark
internals. Please, consider this PR in the context
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> it's not terribly useful to know, e.g., that there are 5 million cores in
the cluster if your Job is running in a scheduler pool that is restricted to
using far fewer CPUs via the pool's
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/21589
@MaxGekk We are going in circles.
I don't think this is a good API to expose currently: the data is available
through multiple other means, as I detailed, and while not a succinct one-liner,
it is
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> ... unless explicitly overridden by user.
This is the problem this PR addresses, actually.
> If you need fine grained information about executors, use spark listener
(it is
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/21589
@mridulm scheduler pools could also make the cluster-wide resource numbers
not very meaningful. I don't think the maxShare work has been merged yet (kind
of a stalled TODO on an open PR, IIRC),
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/21589
@MaxGekk The example you cite is literally one of a handful of usages
that are not easily overridden, and it is prefixed with a 'HACK ALERT'! A few
others are in mllib, typically for reading
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93223/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93223 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93223/testReport)**
for PR 21589 at commit
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> User's are not expected to override it unless they want fine grained
control over the value
This is actually one of the use cases in which a user needs to take control or
tune a query. The
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/21589
+CC @markhamstra since you were looking at API stability.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/21589
I am not convinced by the rationale given in the JIRA for adding the new
APIs.
The examples given there can be easily modeled using `defaultParallelism`
(to get current state) and
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> I am not seeing the utility of these two methods.
@mridulm I describe the utility of the methods in the ticket:
https://issues.apache.org/jira/browse/SPARK-24591
>
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/21589
I am not seeing the utility of these two methods.
`defaultParallelism` already captures the current number of cores.
For monitoring use cases, existing events fired via listener can be
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93223 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93223/testReport)**
for PR 21589 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93174/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93174 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93174/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93174 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93174/testReport)**
for PR 21589 at commit
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
jenkins, retest this, please
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93135/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93135 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93135/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93135 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93135/testReport)**
for PR 21589 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93119/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93119 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93119/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93119 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93119/testReport)**
for PR 21589 at commit
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
jenkins, retest this, please
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93092/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93092 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93092/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93092 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93092/testReport)**
for PR 21589 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
retest this please
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93085/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93085 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93085/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93085 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93085/testReport)**
for PR 21589 at commit
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
jenkins, retest this, please
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93041/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93041 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93041/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93041 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93041/testReport)**
for PR 21589 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
retest this please
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93031/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93031 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93031/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #93031 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93031/testReport)**
for PR 21589 at commit
Github user jiangxb1987 commented on the issue:
https://github.com/apache/spark/pull/21589
> @felixcheung I am not sure that our users are so interested in getting a
list of cores per executor and calculating the total number of cores by summing
the list. It will just complicate the API and
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
YARN can use dynamic allocation as well. That's why I said "in general". To
address @felixcheung's concern, I guess it would be good to mention something
like "see the configuration section", and details can be
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> AFAIK, we always have num of executor ...
Not in all cases. Databricks clients can create auto-scaling clusters:
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/21589
AFAIK, we always have num of executor and then num of core per executor
right?
https://spark.apache.org/docs/latest/configuration.html#execution-behavior
maybe we should have the
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
sgtm
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92989/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92989 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92989/testReport)**
for PR 21589 at commit
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> in this cluster do we really mean cores allocated to the "application" or
"job"?
@felixcheung What about `number of CPUs/Executors potentially available to
a job submitted via the
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92989 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92989/testReport)**
for PR 21589 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92907/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92907 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92907/testReport)**
for PR 21589 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
LGTM otherwise
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92907 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92907/testReport)**
for PR 21589 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
retest this please
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21589
cc @jiangxb1987
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
@felixcheung @HyukjinKwon Could you please tell me what prevents the
PR from getting merged?
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> Are you maybe able to manually test this in other cluster like standalone
or yarn too?
I have tested standalone mode but didn't check yarn though
`YarnClientSchedulerBackend` and
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92423/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92423 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92423/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92423 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92423/testReport)**
for PR 21589 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
Are you maybe able to manually test this in other cluster like standalone
or yarn too?
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92402/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92402 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92402/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92402 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92402/testReport)**
for PR 21589 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92393/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92393 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92393/testReport)**
for PR 21589 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21589
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92389/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92389 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92389/testReport)**
for PR 21589 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92393 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92393/testReport)**
for PR 21589 at commit
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> what's the convention here, I thought SparkContext has get* methods
instead
`SparkContext` has a few methods without such a prefix, for example:
`defaultParallelism`,
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21589
**[Test build #92389 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92389/testReport)**
for PR 21589 at commit
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/21589
what's the convention here, I thought SparkContext has get* methods instead
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21589
Seems fine.