Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17276#discussion_r108052849
--- Diff:
core/src/main/scala/org/apache/spark/executor/ShuffleReadMetrics.scala ---
@@ -80,13 +86,15 @@ class ShuffleReadMetrics private[spark] () extends
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17276#discussion_r108052758
--- Diff:
core/src/main/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriter.java ---
@@ -228,6 +230,36 @@ void closeAndWriteOutput() throws IOException
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17276#discussion_r108052804
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala ---
@@ -72,6 +72,27 @@ private[spark] class SortShuffleWriter[K, V, C
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17276
@jinxing64
If the intent behind these metrics is to help with SPARK-19659, it would be
good to either add it as part of SPARK-19659 or subsequently (once the feature
is merged).
This
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17276#discussion_r108103956
--- Diff:
core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java
---
@@ -169,6 +173,36 @@ public void write(Iterator> reco
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17522
I don't think we should be pointing to third-party projects in Spark
documentation - for example, it might be possible that some other effort gets
merged in instead of the above.
If/when it
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17533
Tasks are scheduled by locality (which includes shuffle tasks too, to some
extent).
This is making a lot of state mutable within TSM - were any tests done
that show improvements due to
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17531#discussion_r109865317
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -432,7 +432,7 @@ private[spark] class Executor
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17531#discussion_r110028203
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -432,7 +432,7 @@ private[spark] class Executor
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17531
I will leave it around a bit in case @JoshRosen has any further comments.
Feel free to merge btw if you don't!
---
If your project is set up for it, you can reply to this email and have your
reply
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17088
A fetch failure does not imply a lost executor - it could be a transient issue.
Similarly, executor loss does not imply host loss.
This is quite drastic for a fetch failure: Spark already
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17012
What are the proposed semantics from this PR now?
- If the file exists, ignore.
- If the file does not exist, try to rename - if that fails, throw an exception.
Is this right? If yes, the PR title
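The semantics being summarized can be sketched as follows. This is a minimal illustration of the described behavior, not the PR's actual code; the `CommitFile` class and `commit` helper are hypothetical names:

```java
import java.io.File;
import java.io.IOException;

public class CommitFile {
    /**
     * Sketch of the semantics summarized above (hypothetical helper):
     * if the destination already exists, silently ignore; otherwise
     * attempt a rename, and throw if the rename fails.
     */
    public static void commit(File tmp, File dest) throws IOException {
        if (dest.exists()) {
            return; // another attempt already committed this output
        }
        if (!tmp.renameTo(dest)) {
            throw new IOException("Failed to rename " + tmp + " to " + dest);
        }
    }
}
```

Note that `File.renameTo` is platform-dependent and can fail across filesystems, which is one reason the failure case needs explicit handling.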
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r103516283
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -911,14 +916,14 @@ private[spark] class TaskSetManager
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17088
+CC @tgravescs You might be interested in this, given your comments on
the blacklisting PR.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16639
@squito It looks good to me, thanks for the changes!
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17140
LGTM
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15505
@kayousterhout I am surprised it is not more, but I agree that the added
complexity for such low returns is not worth it.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17166
What is the rationale for this change? Is it to propagate the task kill
reason to the UI?
The one line in
https://github.com/apache/spark/pull/17166/files#diff
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17166
If I did not miss it, there is no way for the user to provide this information
currently, right?
Or is that coming in a subsequent PR?
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17166
In that case, let us make it a more complete PR - with the proposed API
changes included - so that we can evaluate the merits of the change in total.
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104342097
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/MedianHeapSuite.scala ---
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104341937
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/MedianHeapSuite.scala ---
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104341778
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104341854
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -754,7 +743,6 @@ private[spark] class TaskSetManager
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104352237
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -754,7 +743,6 @@ private[spark] class TaskSetManager
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104351990
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/MedianHeapSuite.scala ---
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104352100
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16867
The cost for the median heap could be higher than TreeMap imo - for example,
the additional dequeue + enqueue when a rebalance is required?
If the cost is high enough, we might want to relook at the
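The rebalance cost mentioned above comes from the classic two-heap running-median structure. The sketch below is an illustration of that structure and its extra dequeue + enqueue on insert, not Spark's actual `MedianHeap` (the `RunningMedian` class name is made up):

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Illustrative two-heap running-median structure (not Spark's MedianHeap).
class RunningMedian {
    // Max-heap holds the lower half; min-heap holds the upper half.
    private final PriorityQueue<Double> lower =
        new PriorityQueue<>(Comparator.reverseOrder());
    private final PriorityQueue<Double> upper = new PriorityQueue<>();

    void insert(double x) {
        if (lower.isEmpty() || x <= lower.peek()) {
            lower.add(x);
        } else {
            upper.add(x);
        }
        // Rebalance: this is the extra dequeue + enqueue discussed above.
        if (lower.size() > upper.size() + 1) {
            upper.add(lower.poll());
        } else if (upper.size() > lower.size()) {
            lower.add(upper.poll());
        }
    }

    double median() {
        if (lower.size() == upper.size()) {
            // Even count: strict definition averages the two middle values.
            return (lower.peek() + upper.peek()) / 2.0;
        }
        return lower.peek();
    }
}
```

Each insert is O(log n) either way, but roughly half the inserts pay for two heap operations instead of one, which is the constant-factor overhead versus a single TreeMap being debated here.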
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17113
@markhamstra given the impact interruption has on lower-layer libraries
which don't handle it well (iirc HDFS?), we probably will not set it to true
even if Spark code is robust,
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104513453
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -740,6 +743,7 @@ private[spark] class TaskSetManager
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16867
You are right, the strict definition requires us to average - it just makes
it difficult to reason based on logs at times when the duration mentioned
does not exist when there are
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104515039
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104515191
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -740,6 +743,7 @@ private[spark] class TaskSetManager
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104516899
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -740,6 +743,7 @@ private[spark] class TaskSetManager
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104517384
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104520626
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104520859
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -740,6 +743,7 @@ private[spark] class TaskSetManager
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17113
@markhamstra Completely agree, I would love to see this enabled by default.
For example, I really hate to see speculative tasks continuing to run when the
taskset has completed - used
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16867#discussion_r104524384
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17166#discussion_r104772015
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -467,7 +474,7 @@ private[spark] class TaskSchedulerImpl
private
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17166#discussion_r104769358
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -2250,6 +2250,25 @@ class SparkContext(config: SparkConf) extends
Logging
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17166#discussion_r104773405
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -732,6 +732,13 @@ class DAGScheduler
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17166#discussion_r104770757
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -168,7 +168,8 @@ private[spark] class Executor(
case Some
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17024
It makes it possible to identify what the data within the file is
(compressed or not) - for the user's perusal (it does not change anything for
the application, that is true).
But before you c
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17198
This is one of the scary bugs I see in Scala code ... the benefits of an
explicit delimiter!
Wondering if it is possible to generalize this as a Scala style check ...
LGTM
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17220
This was an exposed parameter; we cannot remove it - irrespective of the
duplication.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17238
This is something to be configured appropriately at the lower level (namely
topology config), and should be handled there. Special-casing this in Spark is
not right.
Two scenarios
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17244#discussion_r105489927
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -57,57 +68,75 @@ private[spark] class TaskContextImpl(
// Whether the
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17244#discussion_r105489232
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -57,57 +68,75 @@ private[spark] class TaskContextImpl(
// Whether the
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17244#discussion_r105489812
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -57,57 +68,75 @@ private[spark] class TaskContextImpl(
// Whether the
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17244#discussion_r105490080
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -57,57 +68,75 @@ private[spark] class TaskContextImpl(
// Whether the
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17244#discussion_r105510888
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -57,57 +68,75 @@ private[spark] class TaskContextImpl(
// Whether the
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16788#discussion_r105319198
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -151,9 +152,13 @@ object SparkSubmit extends CommandLineUtils {
val
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16788#discussion_r105526225
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala
---
@@ -60,9 +60,10 @@ private[spark] class ResultTask[T, U
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16788#discussion_r105319759
--- Diff: core/src/main/scala/org/apache/spark/scheduler/KerberosUtil.scala
---
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16788#discussion_r105318927
--- Diff:
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -257,10 +257,6 @@ private[deploy] class SparkSubmitArguments(args
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16788#discussion_r105319503
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -1028,7 +1028,7 @@ class DAGScheduler(
val locs
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16788#discussion_r105526180
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -1006,7 +1006,7 @@ class DAGScheduler(
runningStages
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/16788#discussion_r105526086
--- Diff: core/src/main/scala/org/apache/spark/scheduler/KerberosUser.scala
---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16788
@tgravescs, @vanzin - this PR for Mesos fundamentally changes how Spark handles
Kerberos tokens; it would be good to have your views.
+CC @jerryshao to also look at the PR, since you have
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17244
LGTM. Would be great if other reviewers can also take a look.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17238
@squito I looked at the YARN bits about 3-4 years back, so I am sure you and
@tgravescs know better :-)
But yes, you did summarize what I was attempting to get to.
I am not sure what the
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17109#discussion_r106052638
--- Diff:
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackendUtil.scala
---
@@ -99,6 +99,26 @@ private
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17295
I have not looked at the implementation in detail, but can you comment on
why the change w.r.t. plain-text block data to the remote executor? Isn't it
simpler to transmit block conten
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17295
> First, keep in mind that there's no metadata that tells the receiver
whether a block is encrypted or not. This means that methods like
BlockManager.get, which can read block data fro
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17295
Just to be clear, I would prefer if we consistently did things - either
encrypt all blocks while transferring (irrespective of SASL being enabled or
not), or depend only on SASL for channel
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/17295
> Not really sure what you mean here. But transferring encrypted data
without RPC encryption is not really secure, since the encryption key is
transferred to executors using an RPC. There'
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/14789
Hi @tgravescs,
This is specifically for the case where new input source/output
destinations are getting added after the driver has already started. Notebooks,
long running streaming
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/14789#discussion_r87078341
--- Diff:
yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala
---
@@ -189,6 +190,39 @@ private[spark] abstract class
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/14789#discussion_r87075067
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
---
@@ -138,6 +138,14 @@ class SparkHadoopUtil extends Logging
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/14789#discussion_r87072908
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
---
@@ -64,7 +64,7 @@ class
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/14789
@tgravescs Interesting, so it means that there won't be a
--principal/--keytab, and what is being done in AMCredentialRenewer will be
done in the launcher itself? So the driver behaves just like
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/14789#discussion_r87116199
--- Diff:
yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala
---
@@ -189,6 +190,39 @@ private[spark] abstract class
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/14789
@vanzin You are right, this does look orthogonal to what you and Tom
described.
'Where' the renewal is happening is the difference - from within Spark or
from outside of Spark (
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15823
LGTM. Not at my laptop, would be great if you can merge @rxin, thanks.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15563
Jenkins, test this please
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/14789
Interesting, thanks for clarifying.
The way I was looking at the change was: given we have the ability for custom
credential providers now, we would need to support out-of-band (to the
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15563
Looks good to me
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15563
Merging into master
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15861
@rxin I did see this PR; unfortunately it is a bit big and I am tied up
with other things - can't get to it for the next few days.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/16038
Without understanding the specifics of the ML part here - won't the actual
impact of a large dense vector on task 'bytes' be minimal at best?
We do compress the task binary; and 1B zer
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90129155
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -1089,66 +1064,10 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90116879
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -1016,11 +1013,6 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r88075283
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapRedCommitProtocol.scala
---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90122251
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90121527
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90120144
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90127987
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90129259
--- Diff:
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
@@ -561,7 +561,7 @@ class PairRDDFunctionsSuite extends SparkFunSuite
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r88077635
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90121670
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r87708046
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90119536
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15861#discussion_r90124359
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopWriter.scala ---
@@ -0,0 +1,408 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15861
@jiangxb1987 I did a single-pass review - particularly given the
similarities in both the codepaths and the classnames, I will need to go over
it again to ensure we don't miss anything.
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15553#discussion_r84374076
--- Diff: sql/hive-thriftserver/pom.xml ---
@@ -41,11 +41,8 @@
${project.version}
- org.apache.spark
- spark
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15579
Btw, curious if you have actually tested this in YARN - I have a feeling it
won't work.
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
Changes done, please do let me know of any other comments, thanks!
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/15553#discussion_r84533967
--- Diff: sql/hive-thriftserver/pom.xml ---
@@ -41,11 +41,8 @@
${project.version}
- org.apache.spark
- spark
Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15553
@srowen -Dmaven.test.skip=true is the documented way in Maven to skip test
compilation
(http://maven.apache.org/surefire/maven-surefire-plugin/examples/skipping-test.html),
which we are not
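For reference, the distinction behind the flag discussed above (standard, documented Maven/Surefire behavior; the build commands are illustrative):

```shell
# maven.test.skip skips both compiling and running tests;
# skipTests compiles tests but does not run them.
mvn -Dmaven.test.skip=true package   # tests neither compiled nor run
mvn -DskipTests package              # tests compiled, but not run
```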