Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21451
**[Test build #93250 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93250/testReport)**
for PR 21451 at commit
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/21202
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21451
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21451
Merged build finished. Test PASSed.
---
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/21202
Merged to master
---
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/21582
Thank you so much, @gatorsmile . I will proceed.
Also, thank you, @viirya and @maropu .
---
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/21638
Except for `binaryFiles`, everything else that needs to change is private
to Spark. I know it's public in the bytecode, but only Java callers could
accidentally exploit that. Still I don't
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/21807
@NiharS yeah that makes sense. @mauropalsgraaf we missed this today (sorry
about that). Can you add the null check (bonus points if you call `eval()` only
once), and add a test for this case?
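The pattern being requested can be sketched generically: evaluate the limit expression exactly once, then null-check the result before casting. A minimal standalone Scala sketch under stated assumptions; `Expr` and `resolveLimit` are simplified hypothetical names, not Spark's actual classes:

```scala
// Hypothetical, simplified stand-in for Spark's expression types; not Spark's API.
trait Expr { def eval(): Any }

// Evaluate the limit expression exactly once, then null-check before casting.
def resolveLimit(limitExpr: Expr): Int = {
  val v = limitExpr.eval()  // called only once
  if (v == null) {
    throw new IllegalArgumentException("The limit expression must evaluate to a non-null value")
  }
  v.asInstanceOf[Number].intValue()
}

println(resolveLimit(new Expr { def eval(): Any = 5 }))  // prints 5
```

Binding the `eval()` result to a local value both avoids double evaluation and gives a single place to raise a clear error on null.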
---
Github user bomeng commented on the issue:
https://github.com/apache/spark/pull/21638
Either way works for me, but since this is not a private method, people may
use it in their own way. The minimal change would be the best.
---
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
> I'm personally leaning towards doing that for the user.
Especially if the user is a data scientist behind his notebook launching a
paragraph which is supposed to instantiate a Spark
Github user liyinan926 commented on the issue:
https://github.com/apache/spark/pull/21748
> That is why I suggested also to remove the driver's knowledge of the
driver pod name and to remove the owner reference concept entirely.
While, not worrying about the driver pod name
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/21794
`spark-sql` suggests that -0 and 0 are considered the same though. `SELECT
-0.0 == 0.0;` returns `true`. It's probably essential not to change behavior
here, but if performance is the issue, I think
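The -0.0 versus 0.0 distinction under discussion can be seen outside Spark as well: IEEE 754 equality treats the two values as equal, while their bit patterns (which byte- or hash-based comparisons operate on) differ. A minimal standalone Scala sketch, not taken from the PR itself:

```scala
object NegativeZeroDemo extends App {
  // IEEE 754 `==` treats -0.0 and 0.0 as equal, matching `SELECT -0.0 == 0.0` returning true.
  assert(-0.0f == 0.0f)

  // Their bit patterns differ, so any comparison done on raw bytes or on
  // bit-pattern-based hashes (as in hash aggregation) can tell them apart.
  val bitsNeg = java.lang.Float.floatToRawIntBits(-0.0f)
  val bitsPos = java.lang.Float.floatToRawIntBits(0.0f)
  assert(bitsNeg != bitsPos)

  println(s"equal: ${-0.0f == 0.0f}, sameBits: ${bitsNeg == bitsPos}")
}
```

This is exactly the tension in the thread: value-level equality and byte-level grouping disagree unless one representation is normalized to the other.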
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/21806
@viirya this current change is only useful when you compare canonicalized
plans created on different JVMs. This has come up when we tried to detect
changes in plans over spark versions (plan
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
Taking a step back, I think it is unwise to be making any
assumptions about the location in which a driver is running in client mode.
Client mode is simply saying that the
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21533
**[Test build #4220 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4220/testReport)**
for PR 21533 at commit
Github user liyinan926 commented on the issue:
https://github.com/apache/spark/pull/21748
> In that case, the client process could create its own
spark-client-app-id...
Yes, and that's what my point above is about. Regardless of how the driver
pod is created and managed,
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
> Label spark-app-id is only set if spark-submit goes through the steps to
create the driver pod so doesn't apply in this case.
In that case, the client process could create its own
Github user bavardage commented on the issue:
https://github.com/apache/spark/pull/21794
it does seem that spark currently does distinguish -0 and 0, at least as
far as groupbys go
```
scala> case class Thing(x : Float)
defined class Thing
scala> val df =
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21720
@gatorsmile @maryannxue Can we move forward with this PR:
https://github.com/apache/spark/pull/21699 ?
---
Github user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/21710#discussion_r203526021
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/r/PrefixSpanWrapper.scala ---
@@ -0,0 +1,34 @@
+/*
+ * Licensed to the Apache Software
Github user liyinan926 commented on the issue:
https://github.com/apache/spark/pull/21748
> Got your points. About labels, right, we could take the road of the code
that creates labels on its own pod. To ensure uniqueness, we could use the
spark-app-id as key (if it maps the
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
Got your points. About labels, right, we could take the road of the code
that creates labels on its own pod. To ensure uniqueness, we could use the
`spark-app-id` as key (if it maps the requirement
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/21589
> ... unless explicitly overridden by user.
This is the problem this PR addresses, actually.
> If you need fine grained information about executors, use spark listener
(it is
Github user liyinan926 commented on the issue:
https://github.com/apache/spark/pull/21748
> The problem is that the driver's labels might not be unique to that
driver, which therefore would require the user to assign their own unique
labels or for us to patch the driver pod in-place
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/21806
The change looks fine. However, I'm wondering: do we have a chance to
compare hash codes between expr IDs from different JVMs?
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21729
**[Test build #93249 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93249/testReport)**
for PR 21729 at commit
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
> Yes, the service gets its endpoints by matching its label selector
against labels on the pods so it's critical to have matching labels. Another
tenable solution is for the driver backend code to
Github user liyinan926 commented on the issue:
https://github.com/apache/spark/pull/21748
> I don't think you can back a service with a selector that's a pod's name,
but someone with more knowledge of the Service API might be able to correct me
here. I was under the impression one
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21729
**[Test build #93248 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93248/testReport)**
for PR 21729 at commit
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/21720
---
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
> About selecting the pod with labels, another approach I have taken is
simply using the name of the driver pod, a bit like I have done with the
following deployment (so no need to ensure labels -
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/21589
@mridulm scheduler pools could also make the cluster-wide resource numbers
not very meaningful. I don't think the maxShare work has been merged yet (kind
of a stalled TODO on an open PR, IIRC),
Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/21221#discussion_r203520320
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
@@ -160,11 +160,29 @@ case class
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21720
LGTM
Thanks! Merged to master.
---
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
@mccheah If I compare with yarn-client with all nodes on the same LAN, we
introduce complexity here as the user has to ensure not only configuration, but
also deployment of a particular resource.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21806
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21806
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93240/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21806
**[Test build #93240 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93240/testReport)**
for PR 21806 at commit
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
Though I suppose you could have the driver patch its own metadata fields to
assign itself a unique label. I could see that being confusing to users when
they observe that their driver pod metadata
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
@echarles I don't think we should be special-casing Kubernetes here as
being any different from the other cluster managers. The main point of client
mode is that the driver is running locally and
Github user gvr commented on the issue:
https://github.com/apache/spark/pull/21806
Updated description, thanks @hvanhovell
---
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
PS: Actually, there would even be no issue with the port assignment, as
Spark knows which ports it will be using, so it can create the headless service
with the correct ports for the user.
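A headless Service of the kind discussed above can be sketched as plain Kubernetes config. This is an illustrative sketch only; the names, labels, and port numbers below are hypothetical and not taken from this PR:

```yaml
# Illustrative headless Service fronting a Spark driver pod in client mode.
# All names, labels, and ports here are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: spark-driver-svc
spec:
  clusterIP: None          # headless: DNS resolves directly to the pod IP
  selector:
    spark-app-id: my-app   # must match labels set on the driver pod
  ports:
    - name: driver-rpc
      port: 7078
    - name: blockmanager
      port: 7079
```

The `selector` must match labels on the driver pod itself, which is why the thread keeps returning to who assigns those labels and how their uniqueness is guaranteed.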
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21809
**[Test build #93247 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93247/testReport)**
for PR 21809 at commit
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/21589
@MaxGekk The example you cite is literally one of a handful of usages
that is not easily overridden - and it is prefixed with a 'HACK ALERT'! A few
others are in mllib, typically for reading
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/21809
ok to test
---
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
> Note that we only invoke any of the feature steps and the entry point of
KubernetesClientApplication if we run in cluster mode. If we run in client
mode, we enter directly into the user's main
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21795
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93239/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21795
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21710
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21710
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93245/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21795
**[Test build #93239 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93239/testReport)**
for PR 21795 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21710
**[Test build #93245 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93245/testReport)**
for PR 21710 at commit
Github user edwinalu commented on a diff in the pull request:
https://github.com/apache/spark/pull/21221#discussion_r203503691
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
@@ -160,11 +160,29 @@ case class
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/20146
Yeah, looks like the AppVeyor tests are triggered.
---
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/21468
---
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21742#discussion_r203496489
--- Diff:
external/avro/src/main/scala/org/apache/spark/sql/avro/package.scala ---
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/21468
merged thanks @pgandhi999
---
Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/21635#discussion_r203495430
--- Diff:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMasterSource.scala
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/20146
**[Test build #93246 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93246/testReport)**
for PR 20146 at commit
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/21656
merged thanks @cxzl25
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21447
@maropu Do you want to take this over and add such a project in
`ColumnPruning`?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20146
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20146
Merged build finished. Test PASSed.
---
Github user NiharS commented on the issue:
https://github.com/apache/spark/pull/21807
New to SQL, but it seems like the query
`SELECT 1 LIMIT CAST('1' AS INT)`
should work, right? I tried it both on Spark without your change and in the
W3Schools SQL tester, and it's
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21533
**[Test build #4219 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4219/testReport)**
for PR 21533 at commit
Github user liyinan926 commented on the issue:
https://github.com/apache/spark/pull/21748
> Sounds fine. How does the documentation look now in that regard?
I think we should add the following: 1) be explicit about the
`OwnerReference` when there's a driver pod, and 2)
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/20146
Thanks. I rebased it.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21754
cc @cloud-fan
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21777
@maropu Except the TPC-DS queries, are we able to find some workloads that
could perform faster using the bytecode generated by the JDK compiler? Or, does
that mean Janino compiler is always
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21777
Also cc @rednaxelafx
---
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
Sounds fine. How does the documentation look now in that regard?
---
Github user liyinan926 commented on the issue:
https://github.com/apache/spark/pull/21748
> I wonder if we want to have the pod name owner reference still be a
thing, if you will, in client mode. For example what if the pod name that is
given is accidentally one that is assigned to a
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/21221#discussion_r203489913
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
@@ -160,11 +160,29 @@ case class
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
> there was a change in Spark recently in how the driver self-discovered
its hostname by default, if I am not mistaken. Can't recall the exact patch. I
remember that change specifically prompting
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
@echarles there was a change in Spark recently in how the driver
self-discovered its hostname by default, if I am not mistaken. Can't recall the
exact patch. I remember that change specifically
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21710
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21710
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21748
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21748
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21748
Kubernetes integration test status success
URL:
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1102/
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21803
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21803
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93235/
Test PASSed.
---
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
@mccheah @liyinan926 the code base has largely changed from the fork, but
at that time it was working fine without having to manually create any headless
service. Not sure why... but sure it was
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
@liyinan926 I wonder if we want to have the pod name owner reference still
be a thing, if you will, in client mode. For example what if the pod name that
is given is accidentally one that is
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21803
**[Test build #93235 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93235/testReport)**
for PR 21803 at commit
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/21806
otherwise, LGTM pending jenkins
---
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/21806
@gvr can you clean up the description somewhat? It currently also has part
of the template in it.
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21710#discussion_r203486740
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/r/PrefixSpanWrapper.scala ---
@@ -0,0 +1,34 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21748
Kubernetes integration test starting
URL:
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1102/
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21809
Can one of the admins verify this patch?
---
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
@mccheah @liyinan926 it is now working in my env out-of-cluster. It was
failing because I forgot to remove the `spark.kubernetes.driver.pod.name`
property. In general, configuration is tedious and we
Github user hvanhovell commented on the issue:
https://github.com/apache/spark/pull/21807
Ok to test
---
Github user echarles commented on the issue:
https://github.com/apache/spark/pull/21748
@mccheah thx for the information. As a reader, I didn't understand that if I
didn't deploy a headless service, I had to set up something else.
---
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/21710#discussion_r203485926
--- Diff: R/pkg/R/generics.R ---
@@ -1415,6 +1415,13 @@ setGeneric("spark.freqItemsets", function(object) {
standardGeneric("spark.freqI
#' @rdname
GitHub user pgandhi999 opened a pull request:
https://github.com/apache/spark/pull/21809
[SPARK-24851] : Map a Stage ID to its Associated Job ID in UI
It would be nice to have a field in the Stage Page UI which would show the
mapping of the current stage ID to the job IDs to which that
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/21772
Let me clarify. So this means that when `LongToUnsafeRowMap` is
broadcast to executors and it is too big to hold in memory, it will be
stored on disk. At that time, because `write` uses
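The read/write symmetry concern here can be illustrated generically: if `write` serializes data in a form that `read` does not consume exactly, a disk-backed broadcast round-trip corrupts the structure. A hypothetical Scala sketch, not Spark's actual `LongToUnsafeRowMap`:

```scala
import java.io._

// Hypothetical spillable structure; not Spark's LongToUnsafeRowMap. The
// invariant: read must consume exactly the bytes write produced, whether
// they come from memory or were spilled to disk.
case class TinyMap(keys: Array[Long]) {
  def write(out: DataOutputStream): Unit = {
    out.writeInt(keys.length)    // length prefix first...
    keys.foreach(out.writeLong)  // ...then exactly that many entries
  }
}
object TinyMap {
  def read(in: DataInputStream): TinyMap = {
    val n = in.readInt()
    TinyMap(Array.fill(n)(in.readLong()))
  }
}

val buf = new ByteArrayOutputStream()
TinyMap(Array(1L, 2L, 3L)).write(new DataOutputStream(buf))
val back = TinyMap.read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray)))
println(back.keys.mkString(","))  // prints 1,2,3
```

Any mismatch between the two sides (e.g. writing a buffer's capacity but reading its used size) only surfaces once the bytes actually round-trip through serialization, which is why the disk-spill path exposes it.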
Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
> Can you point in the fork where the submission client is create the
headless service? (just to help me understand the internals)
> Btw If we stick to this manual approach, the need for
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21748
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional