Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15579
@srowen, another use case would be tracing tools like `strace`, which
trace the system calls of a process. One way of using `strace` is to
prepend `strace` to the executed command.
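As a hedged sketch of that pattern (the command tokens below are
illustrative, not Spark's actual executor launch command), tracing works by
prepending the wrapper to the command sequence:
```
// Illustrative only: prepend strace so the child process runs under the
// tracer. "-f" follows forks, "-o" writes the trace to a file.
val original = Seq("java", "-cp", "app.jar", "com.example.Main")
val wrapper  = Seq("strace", "-f", "-o", "/tmp/executor.strace")
val traced   = wrapper ++ original
println(traced.mkString(" "))
// strace -f -o /tmp/executor.strace java -cp app.jar com.example.Main
```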
Github user jerryshao closed the pull request at:
https://github.com/apache/spark/pull/15210
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15210
Sure, thanks.
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15598#discussion_r84643826
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1059,9 +1059,11 @@ private[spark] class Client(
} catch
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15588
I think this fix cannot really handle the imbalanced receiver allocation
problem, and it also blindly wastes CPU time.
What @lw-lin mentioned is a feasible solution to wait for executors
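A rough sketch of what such waiting could look like (the helper name,
polling interval, and signature are assumptions, not the actual proposal):
```
// Poll until enough executors have registered, or the timeout expires,
// before deciding where to place receivers; polling avoids busy-spinning.
def waitForExecutors(registered: () => Int, target: Int, timeoutMs: Long): Boolean = {
  val deadline = System.currentTimeMillis() + timeoutMs
  while (registered() < target && System.currentTimeMillis() < deadline) {
    Thread.sleep(100)
  }
  registered() >= target
}
```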
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15563#discussion_r84229497
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -92,8 +92,16 @@ private[spark] abstract class Task[T](
kill
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15563#discussion_r84228545
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -465,6 +465,8 @@ object SparkSubmit {
OptionAssigner
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15377#discussion_r84212380
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2432,6 +2432,26 @@ private[spark] object Utils extends Logging
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15545
@zsxwing , would you mind taking a look at this PR? Thanks a lot.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15377
LGTM, just some minor things.
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/15545
[SPARK-17999][Kafka][SQL] Add getPreferredLocations for KafkaSourceRDD
## What changes were proposed in this pull request?
The newly implemented Structured Streaming `KafkaSource` did
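A minimal sketch of the mechanism (not the actual `KafkaSourceRDD` code): an
RDD advertises locality by overriding `getPreferredLocations`, and pinning
each partition to a stable executor lets cached consumers be reused:
```
import org.apache.spark.{Partition, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// Illustrative RDD that prefers a fixed executor per partition.
class LocalityAwareRDD(sc: SparkContext, executors: Seq[String])
    extends RDD[Int](sc, Nil) {

  override protected def getPartitions: Array[Partition] =
    (0 until 4).map { i =>
      new Partition { override def index: Int = i }
    }.toArray

  override def compute(split: Partition, context: TaskContext): Iterator[Int] =
    Iterator(split.index)

  // The scheduler treats this as a hint: partition i prefers executors(i % n).
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    if (executors.isEmpty) Nil
    else Seq(executors(split.index % executors.size))
}
```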
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15481#discussion_r83991614
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
---
@@ -393,7 +393,7 @@ class
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15481
Seems it could be changed to `send` instead.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15377
Yup, that's what I mean.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15377
@srowen I'm not against this change personally, but the usage of a flag
seems weird to me, and frankly speaking I haven't seen such a pattern in the
Spark code. Since we want to avoid re
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15481
LGTM, sorry for bringing in the deadlock issue.
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15377#discussion_r83340292
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2474,25 +2478,42 @@ private[spark] class CallerContext(
val context
Github user jerryshao closed the pull request at:
https://github.com/apache/spark/pull/15253
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15253
Sure, thanks.
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15377#discussion_r83336101
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2474,25 +2478,42 @@ private[spark] class CallerContext(
val context
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15377
Another thing: did you verify it locally? There's no unit test to cover it.
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15377#discussion_r83181601
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2479,20 +2483,35 @@ private[spark] class CallerContext
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15456
Looks like an unrelated test failure; it passes in my local test.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15456
Thanks @rxin and @andrewor14 .
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15456#discussion_r83146972
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -104,6 +104,8 @@ object SparkSubmit
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15253
@zsxwing , would you mind taking a look at this fix for the 1.6 branch?
Thanks a lot.
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/15456
[SPARK-17686][Core] Support printing out scala and java version with
spark-submit --version command
## What changes were proposed in this pull request?
In our universal gateway service
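A hedged sketch of where such information can come from (this is not the
actual SparkSubmit patch; the runtimes already expose their versions):
```
import scala.util.Properties

object ShowVersions {
  def main(args: Array[String]): Unit = {
    println(s"Scala ${Properties.versionString}")           // e.g. "version 2.11.8"
    println(s"Java  ${System.getProperty("java.version")}") // e.g. "1.8.0_112"
    println(s"VM    ${System.getProperty("java.vm.name")}")
  }
}
```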
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15377#discussion_r82943152
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2474,25 +2478,42 @@ private[spark] class CallerContext(
val context
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15377#discussion_r82942829
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2474,25 +2478,42 @@ private[spark] class CallerContext(
val context
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15377#discussion_r82942438
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2432,6 +2432,10 @@ private[spark] object Utils extends Logging
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/15253
[SPARK-17678][REPL][Branch-1.6] Honor spark.replClassServer.port in
scala-2.11 repl
## What changes were proposed in this pull request?
Spark 1.6 Scala-2.11 repl doesn't honor
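A hedged sketch of the shape of such a fix (not the exact repl code): read
the configured port instead of always binding an ephemeral one.
```
// Illustrative: 0 means "pick any free port" and is only used when the
// user hasn't set spark.replClassServer.port explicitly.
val conf = new org.apache.spark.SparkConf()
val classServerPort = conf.getInt("spark.replClassServer.port", 0)
```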
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15195#discussion_r80396970
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
---
@@ -290,8 +284,8 @@ final class DataStreamWriter[T] private
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/15195#discussion_r80178129
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
---
@@ -290,8 +284,8 @@ final class DataStreamWriter[T] private
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/15210
[SPARK-17604][SQL][Streaming] Support purging aged file entries in
FileStreamSourceLog
## What changes were proposed in this pull request?
Currently with
[SPARK-15698](https
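A toy sketch of the purge idea with illustrative types only (the real
`FileStreamSourceLog` stores richer metadata): entries below a retention
threshold are dropped so the log stays bounded.
```
case class LogEntry(batchId: Long, path: String)

// Keep only entries whose batch id is at or above the retention threshold.
def purgeAged(entries: Seq[LogEntry], minBatchToKeep: Long): Seq[LogEntry] =
  entries.filter(_.batchId >= minBatchToKeep)
```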
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15206
LGTM, thanks for the fix.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15137
Trying to think about it another way, from `PythonRunner`'s point of view
this comment looks correct. For yarn and mesos cluster mode, it is because we
leverage the distributed cache or other means to download python files
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/15173
[SPARK-15698][SQL][Streaming][Follow-up] Fix FileStream source and sink log
get configuration issue
## What changes were proposed in this pull request?
This issue was introduced
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/13513#discussion_r79747197
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSinkLog.scala
---
@@ -79,213 +76,46 @@ object SinkFileStatus
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/13513#discussion_r79530152
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSourceLog.scala
---
@@ -0,0 +1,132 @@
+/*
+ * Licensed
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15134
My concern is that previously Spark would throw an exception if the app name
was not set, while in 2.0 we brought in SparkSession, which breaks that
convention. So do we need to let SparkSession
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15134
@phalodi we don't require users to set an app name for either SparkContext
or SparkSession. You could refer to this code in SparkSubmit:
```
// Set name from main class
```
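For context, a hedged sketch of that fallback as I read it (not the verbatim
SparkSubmit code): when no name is set, the main class serves as the default.
```
// Illustrative only: derive a default app name from the entry-point class.
val conf = new org.apache.spark.SparkConf()
conf.setIfMissing("spark.app.name", "com.example.Main") // mainClass in reality
```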
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15134
Well, I understand your meaning. I'm guessing most users are using
SparkSubmit or SparkLauncher to start applications, and in that case the app
name should be figured out even if not set
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/15134
From my understanding of the current code, it looks like there's no chance
the app name will be null if we're using spark-submit to submit applications.
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/15137
[SPARK-17512][Core] Avoid formatting to python path for yarn and mesos
cluster mode
## What changes were proposed in this pull request?
Yarn and mesos cluster mode support remote python
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/13513#discussion_r79099764
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSourceLog.scala
---
@@ -0,0 +1,133 @@
+/*
+ * Licensed
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/13513#discussion_r79093102
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSourceLog.scala
---
@@ -0,0 +1,133 @@
+/*
+ * Licensed
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/13513
Thanks a lot @zsxwing and @frreiss for your comments.
For the slow scan problem of the compact batch: originally I planned to not
merge the latest batch as I did before, also suggested
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/13513
@zsxwing @frreiss thanks a lot for your comments.
I think the semantics of `FileStreamSource.getBatch(start: Option[Offset],
end: Offset)` are still the same, since I overrode
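For reference, a simplified sketch of the contract under discussion (the
real trait is `org.apache.spark.sql.execution.streaming.Source` and has more
members): `getBatch` must return exactly the data in `(start, end]`, no
matter how the underlying metadata log is compacted.
```
import org.apache.spark.sql.DataFrame

// Simplified: Long offsets stand in for the real streaming Offset type.
trait SimpleSource {
  def getOffset: Option[Long]
  // Data for offsets in (start, end]; None means "from the very beginning".
  def getBatch(start: Option[Long], end: Long): DataFrame
}
```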
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/13513
@zsxwing , thanks a lot for your comments, I did several refactorings:
1. Abstracted and consolidated `FileStreamSinkLog` and `FileStreamSourceLog`;
now they share the same code path to do
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/13513
Sure, I will change the code.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14887
@zhaoyunjiong , the fix you made may introduce a situation where recovery
data exists in multiple directories. I'm not sure if this will introduce
recovery issues or others, since now
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14961
Also, many other downstream and upstream applications may use a different
version of the Netty jar; it would be better to keep these fundamental
dependencies stable.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14961
Upgrading the Netty version in branch 1.6 may cause an API incompatibility
issue for the yarn shuffle service, please see
[SPARK-16018](https://issues.apache.org/jira/browse/SPARK-16018) and
[SPARK
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14887#discussion_r77282332
--- Diff:
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
---
@@ -25,6 +25,8 @@
import
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14916
Agree with @tgravescs .
Actually this issue only exists when the local `yarn#client` process is gone
and the application is killed by a yarn command. In this case the staging dir
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14873
From my understanding it is more a personal preference than a code style
issue. We may change the code for now, but how can we guarantee other people
won't use pattern matching in future
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14887#discussion_r76911079
--- Diff:
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
---
@@ -270,9 +272,17 @@ protected Path getRecoveryPath
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14617
Jenkins, retest this please.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14804
I think
[here](http://ux.stackexchange.com/questions/13815/files-size-units-kib-vs-kb-vs-kb)
has a precise definition. AFAIK in Spark the conversion is 1024-based,
whether KB, K, kb, or KiB
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14804
Because the log shows memory in MB as 1024-based, while in the web UI it is
1000-based, so the two are slightly different.
You could check `Utils#bytesToString`. I think we unify
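A small worked example of the discrepancy (values are illustrative):
```
val bytes = 1048576L                   // exactly 1 MiB
val base1024 = bytes / 1024.0 / 1024.0 // 1.00 -> 1024-based, as in the logs
val base1000 = bytes / 1000.0 / 1000.0 // 1.05 -> 1000-based, as the UI showed
println(f"$base1024%.2f (base 1024) vs $base1000%.2f (base 1000)")
```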
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/14804
[MINOR][Web UI] Correctly convert bytes in web UI
## What changes were proposed in this pull request?
should be 1024-based, not 1000-based.
## How was this patch tested?
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14802
Looks like this is a little similar to #13513.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14617
@mallman I changed the UI based on your comment; here is the new one
(separating the on-heap and off-heap memory usage into two columns):
![screen shot 2016-08-25 at 3 28 31 pm](https
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14789
@tgravescs , with
[SPARK-14743](https://issues.apache.org/jira/browse/SPARK-14743),
credentials/tokens can be managed outside of Spark with their own credential
provider. In that case users could
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14789
Users themselves will call this API. Another option in
[SPARK-14743](https://issues.apache.org/jira/browse/SPARK-14743) is to add this
API to `SparkContext`. But from my understanding
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/14789
[SPARK-17209][YARN] Add the ability to manually update credentials for
Spark running on YARN
## What changes were proposed in this pull request?
This PR proposes to add a new API
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14728
For the definition of `maxAge`: currently in the code it is the max age
relative to the latest file, but people may misunderstand it as the max age
relative to the current time, so it would be better to document the meaning
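A toy illustration of the two readings (timestamps are made up): measuring
age against the latest file versus against the current time keeps different
sets of files.
```
val latestTs = 100L; val nowTs = 120L; val maxAge = 15L
val files = Seq(100L, 90L)                                  // modification times
val vsLatest = files.filter(ts => latestTs - ts <= maxAge)  // keeps both
val vsNow    = files.filter(ts => nowTs - ts <= maxAge)     // drops 90
```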
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14728#discussion_r76015528
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
---
@@ -41,36 +40,59 @@ class FileStreamSource
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14728#discussion_r76011569
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
---
@@ -41,36 +40,59 @@ class FileStreamSource
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14728#discussion_r76011328
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
---
@@ -41,36 +40,59 @@ class FileStreamSource
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14728#discussion_r76011102
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamOptions.scala
---
@@ -0,0 +1,59 @@
+/*
+ * Licensed
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14728
Sure, let me take a look at this.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14617
@mallman thanks a lot for your comments, I will change the UI to split it
into separate columns.
Yes, as you mentioned, the current executor memory usage tracked in the
Standalone Master only shows
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14768#discussion_r75805277
--- Diff:
core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java
---
@@ -522,7 +522,8 @@ public long spill() throws
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14744#discussion_r75633858
--- Diff: docs/configuration.md ---
@@ -1752,6 +1752,15 @@ showDF(properties, numRows = 200, truncate = FALSE)
Executable for executing R scripts
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/14617
[SPARK-17019][Core] Expose on-heap and off-heap memory usage various places
## What changes were proposed in this pull request?
With [SPARK-13992](https://issues.apache.org/jira/browse
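Spark keeps its own on-heap/off-heap accounting in the memory manager; as a
generic illustration of the distinction (not Spark's implementation), the JVM
itself reports heap versus direct-buffer usage:
```
import java.lang.management.{BufferPoolMXBean, ManagementFactory}
import scala.collection.JavaConverters._

val heapUsed = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage.getUsed
// Direct byte buffers are the typical home of off-heap data.
val directUsed = ManagementFactory
  .getPlatformMXBeans(classOf[BufferPoolMXBean]).asScala
  .find(_.getName == "direct").map(_.getMemoryUsed).getOrElse(0L)
println(s"on-heap: $heapUsed B, direct (off-heap) buffers: $directUsed B")
```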
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14581
LGTM.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14561
Do you have any specific reason or use case that requires refactoring this
part?
IMHO, unless we have a concrete reason to change it, it is better not to do
the refactoring
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14557#discussion_r74017373
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1564,6 +1564,14 @@ class SparkContext(config: SparkConf) extends
Logging
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14556
Would you please add a unit test to verify the changes?
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14542#discussion_r73987834
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -404,7 +410,8 @@ private[spark] class ApplicationMaster
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14540
Looks good to me; we should also change the unit tests accordingly.
Currently several related tests are excluded from the python3 tests.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14065
@vanzin, looks like I missed that comment, I will address it today.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14065
@vanzin , I did some refactoring on the interfaces, especially the
`obtainCredentials` method and the implementations of `HDFSCredentialProvider`
and `HiveCredential`; would you please help
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14065#discussion_r72752972
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HDFSCredentialProvider.scala
---
@@ -0,0 +1,118 @@
+/*
+ * Licensed
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14065#discussion_r72750534
--- Diff:
yarn/src/test/scala/org/apache/spark/deploy/yarn/security/HDFSCredentialProviderSuite.scala
---
@@ -0,0 +1,106 @@
+/*
+ * Licensed
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14065#discussion_r72750018
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/security/CredentialUpdater.scala
---
@@ -107,8 +110,16 @@ private[spark] class
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14065#discussion_r72749483
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/security/CredentialUpdater.scala
---
@@ -41,16 +43,18 @@ private[spark] class
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14065#discussion_r72747890
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/security/ConfigurableCredentialManager.scala
---
@@ -0,0 +1,96 @@
+/*
+ * Licensed
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14340
Thanks a lot @rxin for your comments, let me close it.
Github user jerryshao closed the pull request at:
https://github.com/apache/spark/pull/14340
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14312
> Spark caller context written into the Yarn log will be "{spark.app.name}
running on Spark".
This may not be so useful; I think we could get the app name from yarn through
many d
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14312#discussion_r72198340
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala ---
@@ -78,6 +79,12 @@ private[spark] abstract class Task[T](
metrics
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14312#discussion_r72198063
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -66,6 +66,9 @@ private[spark] class Client(
import Client
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14312#discussion_r72197100
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -197,6 +197,9 @@ private[spark] class ApplicationMaster
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14312#discussion_r72196752
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2418,6 +2419,25 @@ private[spark] object Utils extends Logging
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14312#discussion_r72196631
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2418,6 +2419,25 @@ private[spark] object Utils extends Logging
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/14340
Thanks a lot @koeninger for your review. I think it is not so flexible for
the Python API to achieve the same functionality as the Java/Scala APIs,
especially for things like extended classes like
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14340#discussion_r72192793
--- Diff: python/pyspark/streaming/kafka010.py ---
@@ -0,0 +1,370 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/14340#discussion_r72177558
--- Diff:
external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaUtils.scala
---
@@ -177,3 +182,172 @@ object KafkaUtils extends