spark git commit: [SPARK-14642][SQL] import org.apache.spark.sql.expressions._ breaks udf under functions

2016-05-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 82f69594f -> 5a4a188fe [SPARK-14642][SQL] import org.apache.spark.sql.expressions._ breaks udf under functions ## What changes were proposed in this pull request? PR fixes the import issue which breaks udf functions. The following co

spark git commit: [SPARK-6005][TESTS] Fix flaky test: o.a.s.streaming.kafka.DirectKafkaStreamSuite.offset recovery

2016-05-10 Thread zsxwing
ves the logic of `offsetRangesBeforeStop` (also renamed to `offsetRangesAfterStop`) after `ssc.stop()` to fix the flaky test. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu Closes #12903 from zsxwing/SPARK-6005. Project: http://git-wip-us.apache.org/repos/asf/spark/r

spark git commit: [SPARK-6005][TESTS] Fix flaky test: o.a.s.streaming.kafka.DirectKafkaStreamSuite.offset recovery

2016-05-10 Thread zsxwing
ust moves the logic of `offsetRangesBeforeStop` (also renamed to `offsetRangesAfterStop`) after `ssc.stop()` to fix the flaky test. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu Closes #12903 from zsxwing/SPARK-6005. (cherry picked from com

spark git commit: [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow

2016-05-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master da02d006b -> 86475520f [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow https://issues.apache.org/jira/browse/SPARK-14936 ## What changes were proposed in this pull request? FlumePollingStreamSuite contains two tests which run

spark git commit: [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow

2016-05-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 1db027d11 -> f021f3460 [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow https://issues.apache.org/jira/browse/SPARK-14936 ## What changes were proposed in this pull request? FlumePollingStreamSuite contains two tests which

spark git commit: [SPARK-15262] Synchronize block manager / scheduler executor state

2016-05-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 6e08eb469 -> 2454f6abf [SPARK-15262] Synchronize block manager / scheduler executor state ## What changes were proposed in this pull request? If an executor is still alive even after the scheduler has removed its metadata, we may rece

spark git commit: [SPARK-15262] Synchronize block manager / scheduler executor state

2016-05-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 ced71d353 -> e2a43d007 [SPARK-15262] Synchronize block manager / scheduler executor state ## What changes were proposed in this pull request? If an executor is still alive even after the scheduler has removed its metadata, we may rece

spark git commit: [SPARK-15262] Synchronize block manager / scheduler executor state

2016-05-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 7ecd49688 -> 40a949aae [SPARK-15262] Synchronize block manager / scheduler executor state ## What changes were proposed in this pull request? If an executor is still alive even after the scheduler has removed its metadata, we may receive

spark git commit: [SPARK-12652][PYSPARK] Upgrade Py4J to 0.9.1

2016-01-12 Thread zsxwing
his is a manual change and worth to take a look carefully. https://github.com/zsxwing/spark/commit/bfd4b5c040eb29394c3132af3c670b1a7272457c - [x] Verify no leak any more after reverting our workarounds Author: Shixiong Zhu Closes #10692 from zsxwing/py4j-0.9.1. Project: http://git-

spark git commit: [SPARK-12784][UI] Fix Spark UI IndexOutOfBoundsException with dynamic allocation

2016-01-14 Thread zsxwing
rom zsxwing/SPARK-12784. (cherry picked from commit 501e99ef0fbd2f2165095548fe67a3447ccbfc91) Signed-off-by: Shixiong Zhu Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d1855adb Tree: http://git-wip-us.apache.org/repos/asf/spark/t

spark git commit: [SPARK-12784][UI] Fix Spark UI IndexOutOfBoundsException with dynamic allocation

2016-01-14 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 56cdbd654 -> 501e99ef0 [SPARK-12784][UI] Fix Spark UI IndexOutOfBoundsException with dynamic allocation Add `listener.synchronized` to get `storageStatusList` and `execInfo` atomically. Author: Shixiong Zhu Closes #10728 from zsxw
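
The idea in the SPARK-12784 entry above, reading two related values inside one critical section so they cannot disagree, can be sketched roughly as follows. This is a language-neutral illustration in Python, not Spark's Scala code; `Listener`, `storage_status`, `exec_info`, and `snapshot` are hypothetical names chosen for the example.

```python
import threading


class Listener:
    """Hypothetical stand-in for a UI listener that tracks executors."""

    def __init__(self):
        self._lock = threading.Lock()
        self.storage_status = []  # one entry per known executor
        self.exec_info = {}       # executor id -> info

    def add_executor(self, exec_id, info):
        # Writers update both structures under the same lock.
        with self._lock:
            self.storage_status.append(exec_id)
            self.exec_info[exec_id] = info

    def remove_executor(self, exec_id):
        with self._lock:
            self.storage_status.remove(exec_id)
            del self.exec_info[exec_id]

    def snapshot(self):
        # Readers copy both structures in one critical section, so the
        # two copies always describe the same set of executors.
        with self._lock:
            return list(self.storage_status), dict(self.exec_info)
```

Without the shared lock, a reader could copy one structure, lose the race to a concurrent removal, and then index the second structure with a stale position or key.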

spark git commit: [SPARK-12784][UI] Fix Spark UI IndexOutOfBoundsException with dynamic allocation

2016-01-14 Thread zsxwing
rom zsxwing/SPARK-12784. (cherry picked from commit 501e99ef0fbd2f2165095548fe67a3447ccbfc91) Signed-off-by: Shixiong Zhu Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/132718ad Tree: http://git-wip-us.apache.org/repos/asf/spark/t

spark git commit: Revert "[SPARK-12829] Turn Java style checker on"

2016-01-18 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master a973f483f -> 4bcea1b85 Revert "[SPARK-12829] Turn Java style checker on" This reverts commit 591c88c9e2a6c2e2ca84f1b66c635f198a16d112. `lint-java` doesn't work on a machine with a clean Maven cache. Project: http://git-wip-us.apache.org/

spark git commit: [HOTFIX][BUILD][TEST-MAVEN] Remove duplicate dependency

2016-01-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 8a88e1212 -> d8fefab4d [HOTFIX][BUILD][TEST-MAVEN] Remove duplicate dependency Author: Shixiong Zhu Closes #10868 from zsxwing/hotfix-akka-pom. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-

spark git commit: [HOTFIX]Remove rpcEnv.awaitTermination to avoid dead-lock in some test

2016-01-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master bc1babd63 -> ea5c38fe7 [HOTFIX]Remove rpcEnv.awaitTermination to avoid dead-lock in some test It looks like rpcEnv.awaitTermination may block some tests forever. Just remove it and investigate the tests. Project: http://git-wip-us.apache.org/rep

spark git commit: [SPARK-12614][CORE] Don't throw non fatal exception from ask

2016-01-26 Thread zsxwing
ing exception from RpcEndpointRef.ask. We can send the exception to the future for `ask`. Author: Shixiong Zhu Closes #10568 from zsxwing/send-ask-fail. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/22662b24 Tree: h
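
The approach named in the SPARK-12614 entry above, delivering a non-fatal failure through the future instead of throwing it at the caller, might look like this minimal sketch. It uses Python's `concurrent.futures.Future` rather than Spark's `RpcEndpointRef`; the `ask` function and its `handler` parameter are hypothetical.

```python
from concurrent.futures import Future


def ask(handler, message):
    """Hypothetical `ask`: always returns a Future and never raises
    ordinary (non-fatal) exceptions to the caller."""
    future = Future()
    try:
        future.set_result(handler(message))
    except Exception as exc:
        # Only Exception subclasses are captured; KeyboardInterrupt and
        # SystemExit derive from BaseException and still propagate.
        future.set_exception(exc)
    return future
```

Callers then have a single place to handle failures: `future.exception()` returns the captured error, and `future.result()` re-raises it.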

spark git commit: [SPARK-12967][NETTY] Avoid NettyRpc error message during sparkContext shutdown

2016-01-26 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 58f5d8c1d -> bae3c9a4e [SPARK-12967][NETTY] Avoid NettyRpc error message during sparkContext shutdown If there's an RPC issue while sparkContext is alive but stopped (which would happen only when executing SparkContext.stop), log a warning

spark git commit: [SPARK-13055] SQLHistoryListener throws ClassCastException

2016-01-29 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 2b027e9a3 -> e38b0baa3 [SPARK-13055] SQLHistoryListener throws ClassCastException This is an existing issue uncovered recently by #10835. The exception occurred because the `SQLHistoryListener` gets all sorts of accumulators, no

spark git commit: [SPARK-13082][PYSPARK] Backport the fix of 'read.json(rdd)' in #10559 to branch-1.6

2016-01-29 Thread zsxwing
kported the fix of 'read.json(rdd)' to branch-1.6. Author: Shixiong Zhu Closes #10988 from zsxwing/json-rdd. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/84dab726 Tree: http://git-wip-us.apache.org/repos/asf/sp

spark git commit: [SPARK-13121][STREAMING] java mapWithState mishandles scala Option

2016-02-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 53f518a6e -> 4c28b4c8f [SPARK-13121][STREAMING] java mapWithState mishandles scala Option java mapwithstate with Function3 has wrong conversion of java `Optional` to scala `Option`, fixed code uses same conversion used in the mapwithst

spark git commit: [SPARK-13121][STREAMING] java mapWithState mishandles scala Option

2016-02-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master be5dd881f -> d0df2ca40 [SPARK-13121][STREAMING] java mapWithState mishandles scala Option Already merged into 1.6 branch, this PR is to commit to master the same change Author: Gabriele Nizzoli Closes #11028 from gabrielenizzoli/patch-1.

spark git commit: [SPARK-12739][STREAMING] Details of batch in Streaming tab uses two Duration columns

2016-02-03 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 138c300f9 -> e9eb248ed [SPARK-12739][STREAMING] Details of batch in Streaming tab uses two Duration columns I have clearly prefixed the two 'Duration' columns in 'Details of Batch' Streaming tab as 'Output Op Duration' and 'Job Duration' A

spark git commit: [SPARK-12739][STREAMING] Details of batch in Streaming tab uses two Duration columns

2016-02-03 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 2f8abb4af -> 5fe8796c2 [SPARK-12739][STREAMING] Details of batch in Streaming tab uses two Duration columns I have clearly prefixed the two 'Duration' columns in 'Details of Batch' Streaming tab as 'Output Op Duration' and 'Job Duration

spark git commit: [SPARK-3611][WEB UI] Show number of cores for each executor in application web UI

2016-02-03 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 9dd2741eb -> 3221eddb8 [SPARK-3611][WEB UI] Show number of cores for each executor in application web UI Added a Cores column in the Executors UI Author: Alex Bozarth Closes #11039 from ajbozarth/spark3611. Project: http://git-wip-us.

spark git commit: [SPARK-13195][STREAMING] Fix NoSuchElementException when a state is not set but timeoutThreshold is defined

2016-02-04 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 2f390d306 -> a907c7c64 [SPARK-13195][STREAMING] Fix NoSuchElementException when a state is not set but timeoutThreshold is defined Check that the state exists before calling get. Author: Shixiong Zhu Closes #11081 from zsxwing/SP

spark git commit: [SPARK-13195][STREAMING] Fix NoSuchElementException when a state is not set but timeoutThreshold is defined

2016-02-04 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master bd38dd6f7 -> 8e2f29630 [SPARK-13195][STREAMING] Fix NoSuchElementException when a state is not set but timeoutThreshold is defined Check that the state exists before calling get. Author: Shixiong Zhu Closes #11081 from zsxwing/SP
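
The guard described in the SPARK-13195 entries above, confirming that a state holds a value before reading it, can be illustrated with a small sketch. The `State` class and `value_or_none` helper below are hypothetical Python stand-ins, not Spark's streaming `State` API.

```python
class State:
    """Hypothetical optional-value holder; get() fails when unset."""

    _UNSET = object()

    def __init__(self):
        self._value = State._UNSET

    def exists(self):
        return self._value is not State._UNSET

    def update(self, value):
        self._value = value

    def get(self):
        if not self.exists():
            raise LookupError("state is not set")
        return self._value


def value_or_none(state):
    # Guard the read: only call get() once exists() confirms a value.
    return state.get() if state.exists() else None
```

A caller that may see an unset state, for instance one handling a timeout for a key that was never updated, goes through the guarded helper instead of calling `get()` directly.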

spark git commit: [SPARK-13166][SQL] Rename DataStreamReaderWriterSuite to DataFrameReaderWriterSuite

2016-02-05 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 82d84ff2d -> 7b73f1719 [SPARK-13166][SQL] Rename DataStreamReaderWriterSuite to DataFrameReaderWriterSuite A follow up PR for #11062 because it didn't rename the test suite. Author: Shixiong Zhu Closes #11096 from zsxwin

spark git commit: [SPARK-13245][CORE] Call shuffleMetrics methods only in one thread for ShuffleBlockFetcherIterator

2016-02-09 Thread zsxwing
s` so as to always use shuffleMetrics in one thread. Also fix a race condition that could cause memory leak. Author: Shixiong Zhu Closes #11138 from zsxwing/SPARK-13245. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fae83

[1/2] spark git commit: [SPARK-13146][SQL] Management API for continuous queries

2016-02-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 29c547303 -> 0902e2028 http://git-wip-us.apache.org/repos/asf/spark/blob/0902e202/sql/core/src/test/scala/org/apache/spark/sql/util/ContinuousQueryListenerSuite.scala -- dif

[2/2] spark git commit: [SPARK-13146][SQL] Management API for continuous queries

2016-02-10 Thread zsxwing
[SPARK-13146][SQL] Management API for continuous queries ### Management API for Continuous Queries **API for getting status of each query** - Whether active or not - Unique name of each query - Status of the sources and sinks - Exceptions **API for managing each query** - Immediately stop an act

spark git commit: [STREAMING][TEST] Fix flaky streaming.FailureSuite

2016-02-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 13c17cbb0 -> 219a74a7c [STREAMING][TEST] Fix flaky streaming.FailureSuite In some corner cases, the test suite failed to shut down the SparkContext, causing cascading failures. This fix does two things - Makes sure no SparkContext is activ

spark git commit: [SPARK-6166] Limit number of in flight outbound requests

2016-02-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master a2c7dcf61 -> 894921d81 [SPARK-6166] Limit number of in flight outbound requests This JIRA is related to https://github.com/apache/spark/pull/5852 Had to do some minor rework and test to make sure it works with current version of spark. Aut

spark git commit: [SPARK-13308] ManagedBuffers passed to OneToOneStreamManager need to be freed in non-error cases

2016-02-16 Thread zsxwing
the relevant network code so that the ManagedBuffers are freed as soon as the messages containing them are processed by the lower-level Netty message sending code. /cc zsxwing for review. Author: Josh Rosen Closes #11193 from JoshRosen/add-missing-release-calls-in-network-layer. Project: h

spark git commit: [SPARK-11627] Add initial input rate limit for spark streaming backpressure mechanism.

2016-02-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5f37aad48 -> 7218c0eba [SPARK-11627] Add initial input rate limit for spark streaming backpressure mechanism. https://issues.apache.org/jira/browse/SPARK-11627 Spark Streaming backpressure mechanism has no initial input rate limit, it mi

spark git commit: Revert "[SPARK-13117][WEB UI] WebUI should use the local ip not 0.0.0.0"

2016-02-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5fcf4c2bf -> 46f6e7931 Revert "[SPARK-13117][WEB UI] WebUI should use the local ip not 0.0.0.0" This reverts commit 2e44031fafdb8cf486573b98e4faa6b31ffb90a4. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wi

spark git commit: [SPARK-13464][STREAMING][PYSPARK] Fix failed streaming in pyspark in branch 1.3

2016-02-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.3 387d81891 -> 6ddde8eda [SPARK-13464][STREAMING][PYSPARK] Fix failed streaming in pyspark in branch 1.3 JIRA: https://issues.apache.org/jira/browse/SPARK-13464 ## What changes were proposed in this pull request? During backport a mllib

spark git commit: [SPARK-13069][STREAMING] Add "ask" style store() to ActorReciever

2016-02-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 751724b13 -> fb8bb0476 [SPARK-13069][STREAMING] Add "ask" style store() to ActorReciever Introduces an "ask" style ```store``` in ```ActorReceiver``` as a way to allow the actor receiver to be blocked by back pressure or maxRate. Author: Lin Zhao

spark git commit: [SPARK-13468][WEB UI] Fix a corner case where the Stage UI page should show DAG but it doesn't show

2016-02-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 35316cb0b -> dc6c5ea4c [SPARK-13468][WEB UI] Fix a corner case where the Stage UI page should show DAG but it doesn't show When a user clicks more than once on any stage in the DAG graph on the *Job* web UI page, many new *Stage* web UI

spark git commit: [SPARK-13584][SQL][TESTS] Make ContinuousQueryManagerSuite not output logs to the console

2016-03-03 Thread zsxwing
The logs will still output to `unit-tests.log`. I also updated `SQLListenerMemoryLeakSuite` to use `quietly` to avoid changing the log level which won't output logs to `unit-tests.log`. ## How was this patch tested? Just check Jenkins output. Author: Shixiong Zhu Closes #11439 from zsxwing

spark git commit: [SPARK-13652][CORE] Copy ByteBuffer in sendRpcSync as it will be recycled

2016-03-03 Thread zsxwing
be recycled and reused. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu Closes #11499 from zsxwing/SPARK-13652. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/465c665d Tree: http://git-wip-us.apache.

spark git commit: [SPARK-13652][CORE] Copy ByteBuffer in sendRpcSync as it will be recycled

2016-03-03 Thread zsxwing
be recycled and reused. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu Closes #11499 from zsxwing/SPARK-13652. (cherry picked from commit 465c665db1dc65e3b02c584cf7f8d06b24909b0c) Signed-off-by: Shixiong Zhu Project: http://git-wip-us.apache.org/repos/asf/spark/r

spark git commit: [SPARK-12073][STREAMING] backpressure rate controller consumes events preferentially from lagg…

2016-03-04 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master a6e2bd31f -> f19228eed [SPARK-12073][STREAMING] backpressure rate controller consumes events preferentially from lagg… …ing partitions I'm pretty sure this is the reason we couldn't easily recover from an unbalanced Kafka partition u

spark git commit: [SPARK-13693][STREAMING][TESTS] Stop StreamingContext before deleting checkpoint dir

2016-03-05 Thread zsxwing
his patch tested? unit tests Author: Shixiong Zhu Closes #11531 from zsxwing/SPARK-13693. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8290004d Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8290004d Diff: http://

spark git commit: [MINOR][DOC] improve the doc for "spark.memory.offHeap.size"

2016-03-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master e72914f37 -> a3ec50a4b [MINOR][DOC] improve the doc for "spark.memory.offHeap.size" The description of "spark.memory.offHeap.size" in the current document does not clearly state that memory is counted in bytes. This PR contains a sma

spark git commit: [MINOR][DOC] improve the doc for "spark.memory.offHeap.size"

2016-03-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 2434f16cc -> cf4e62ec2 [MINOR][DOC] improve the doc for "spark.memory.offHeap.size" The description of "spark.memory.offHeap.size" in the current document does not clearly state that memory is counted in bytes. This PR contains a

spark git commit: [SPARK-13655] Improve isolation between tests in KinesisBackedBlockRDDSuite

2016-03-07 Thread zsxwing
to hang. See #11558 for more details. /cc zsxwing srowen Author: Josh Rosen Closes #11564 from JoshRosen/SPARK-13655. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e9e67b39 Tree: http://git-wip-us.apache.org/repos/asf/sp

spark git commit: [SPARK-13711][CORE] Don't call SparkUncaughtExceptionHandler in AppClient as it's in driver

2016-03-07 Thread zsxwing
ils.tryOrExit` as it will send exception to SparkUncaughtExceptionHandler and call `System.exit`. This PR just removed `Utils.tryOrExit`. ## How was this patch tested? manual tests. Author: Shixiong Zhu Closes #11566 from zsxwing/SPARK-13711. Project: http://git-wip-us.apache.org/repos/a

spark git commit: [SPARK-13711][CORE] Don't call SparkUncaughtExceptionHandler in AppClient as it's in driver

2016-03-07 Thread zsxwing
ould not call `Utils.tryOrExit` as it will send exception to SparkUncaughtExceptionHandler and call `System.exit`. This PR just removed `Utils.tryOrExit`. ## How was this patch tested? manual tests. Author: Shixiong Zhu Closes #11566 from zsxwing/SPARK-13711. Project: http://git-wip-us.apache.org/re

spark git commit: [SPARK-14942][SQL][STREAMING] Reduce delay between batch construction and execution

2016-05-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 f937ce766 -> 0dd1f8720 [SPARK-14942][SQL][STREAMING] Reduce delay between batch construction and execution ## Problem Currently in `StreamExecution`, [we first run the batch, then construct the next](https://github.com/apache/spark/b

spark git commit: [SPARK-14942][SQL][STREAMING] Reduce delay between batch construction and execution

2016-05-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master fabc8e5b1 -> 95f4fbae5 [SPARK-14942][SQL][STREAMING] Reduce delay between batch construction and execution ## Problem Currently in `StreamExecution`, [we first run the batch, then construct the next](https://github.com/apache/spark/blob/

spark git commit: [SPARK-15395][CORE] Use getHostString to create RpcAddress

2016-05-18 Thread zsxwing
nd this behavior will make the check incorrect. This PR uses `getHostString` to resolve the issue. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu Closes #13185 from zsxwing/host-string. (cherry picked from commit 5c9117a3ed373461529f9f9306668ed4149c63fb) Signed-off-by:

spark git commit: [SPARK-15395][CORE] Use getHostString to create RpcAddress

2016-05-18 Thread zsxwing
nd this behavior will make the check incorrect. This PR uses `getHostString` to resolve the issue. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu Closes #13185 from zsxwing/host-string. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.ap

spark git commit: Fix the compiler error introduced by #13153 for Scala 2.10

2016-05-19 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5ccecc078 -> 305263954 Fix the compiler error introduced by #13153 for Scala 2.10 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/30526395 Tree: http://git-wip-us.apach

spark git commit: Fix the compiler error introduced by #13153 for Scala 2.10

2016-05-19 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 d1b5df83d -> 4257ba372 Fix the compiler error introduced by #13153 for Scala 2.10 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4257ba37 Tree: http://git-wip-us.a

spark git commit: [SPARK-15395][CORE] Use getHostString to create RpcAddress (backport for 1.6)

2016-05-20 Thread zsxwing
sts. Author: Shixiong Zhu Closes #13196 from zsxwing/host-string-1.6. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7ad82b66 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7ad82b66 Diff: http://git-wip-us.apache.

spark git commit: [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream

2016-05-24 Thread zsxwing
://spark-tests.appspot.com/tests/org.apache.spark.streaming.kafka.JavaKafkaStreamSuite/testKafkaStream ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu Closes #13281 from zsxwing/flaky-kafka-test. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http:/

spark git commit: [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream

2016-05-24 Thread zsxwing
s: http://spark-tests.appspot.com/tests/org.apache.spark.streaming.kafka.JavaKafkaStreamSuite/testKafkaStream ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu Closes #13281 from zsxwing/flaky-kafka-test. (cherry picked from commit c9c1c0e54d34773ac2cf5457fe5925559ece36c7) Si

spark git commit: [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change.

2016-05-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 4f27b8dd5 -> b120fba6a [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change. ## What changes were proposed in this pull request? 1. Making 'name' field of RDDInfo mutable. 2. In StorageListener: catching the fact that R

spark git commit: [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change.

2016-05-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 f63ba2210 -> 69327667d [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change. ## What changes were proposed in this pull request? 1. Making 'name' field of RDDInfo mutable. 2. In StorageListener: catching the fact th

spark git commit: Revert "[SPARK-11753][SQL][TEST-HADOOP2.2] Make allowNonNumericNumbers option work

2016-05-31 Thread zsxwing
a PR to run Jenkins tests due to the revert conflicts of `dev/deps/spark-deps-hadoop*`. ## How was this patch tested? Jenkins unit tests, integration tests, manual tests) Author: Shixiong Zhu Closes #13417 from zsxwing/revert-SPARK-11753. (cherry picked fro

spark git commit: Revert "[SPARK-11753][SQL][TEST-HADOOP2.2] Make allowNonNumericNumbers option work

2016-05-31 Thread zsxwing
R to run Jenkins tests due to the revert conflicts of `dev/deps/spark-deps-hadoop*`. ## How was this patch tested? Jenkins unit tests, integration tests, manual tests) Author: Shixiong Zhu Closes #13417 from zsxwing/revert-SPARK-11753. Project: http://git-wip-us.apache.org/repos/asf/spark/rep

spark git commit: [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 63b7f127c -> 7c07d176f [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with only 2

spark git commit: [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 fe639adea -> 18d613a4d [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with on

spark git commit: [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 cd7bf4b8e -> 32b025e94 [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files What changes were proposed in this pull request? This PR is to address the following issues: - **ISSUE 1:** For ORC source format, we are re

spark git commit: [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 8900c8d8f -> 9aff6f3b1 [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files What changes were proposed in this pull request? This PR is to address the following issues: - **ISSUE 1:** For ORC source format, we are report

spark git commit: [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests.

2016-06-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master aa0364510 -> 83070cd1d [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. Description from JIRA. In ReplSuite, a test that can be tested well in local mode should not really have to start a local-cluster. And s

spark git commit: [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests.

2016-06-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 b2d076c35 -> 3119d8eef [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. Description from JIRA. In ReplSuite, a test that can be tested well in local mode should not really have to start a local-cluster. A

spark git commit: [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter`

2016-06-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c1390ccbb -> f15d641e2 [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter` ## What changes were proposed in this pull request? It doesn't make sense to specify partitioning parameters, when we write data out from

spark git commit: [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter`

2016-06-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5c16ad0d5 -> fb219029d [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter` ## What changes were proposed in this pull request? It doesn't make sense to specify partitioning parameters, when we write data out from Data

spark git commit: [SPARK-15697][REPL] Unblock some of the useful repl commands.

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 938434dc7 -> 4134653e5 [SPARK-15697][REPL] Unblock some of the useful repl commands. ## What changes were proposed in this pull request? Unblock some of the useful repl commands, such as "implicits", "javap", "power", "type", and "kind". As the

spark git commit: [SPARK-15697][REPL] Unblock some of the useful repl commands.

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c01dc815d -> 413826d40 [SPARK-15697][REPL] Unblock some of the useful repl commands. ## What changes were proposed in this pull request? Unblock some of the useful repl commands, such as "implicits", "javap", "power", "type", and "kind". As

spark git commit: [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 d9db8a9c8 -> 97fe1d8ee [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery ## What changes were proposed in this pull request? ContinuousQueries have names that are unique across all the active ones. However, when queries

spark git commit: [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5ad4e32d4 -> c654ae214 [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery ## What changes were proposed in this pull request? ContinuousQueries have names that are unique across all the active ones. However, when queries are

spark git commit: [SPARK-15935][PYSPARK] Fix a wrong format tag in the error message

2016-06-14 Thread zsxwing
ins unit tests. Author: Shixiong Zhu Closes #13665 from zsxwing/fix. (cherry picked from commit 0ee9fd9e528206a5edfb2cc4a56538250b428aaf) Signed-off-by: Shixiong Zhu Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/df9a1

spark git commit: [SPARK-15935][PYSPARK] Fix a wrong format tag in the error message

2016-06-14 Thread zsxwing
ins unit tests. Author: Shixiong Zhu Closes #13665 from zsxwing/fix. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0ee9fd9e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0ee9fd9e Diff: http://git-wip-us.apache.org/re

[1/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d30b7e669 -> 9a5071996 http://git-wip-us.apache.org/repos/asf/spark/blob/9a507199/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala -

[2/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
http://git-wip-us.apache.org/repos/asf/spark/blob/9a507199/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryLi

[3/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
[SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery Renamed for simplicity, so that it's obvious that it's related to streaming. Existing unit tests. Author: Tathagata Das Closes #13673 from tdas/SPARK-15953. (cherry picked from commit 9a5071996b968148f6b9aba12e0d3fe888d9acd

[2/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
http://git-wip-us.apache.org/repos/asf/spark/blob/885e74a3/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryLi

[1/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 4c950a757 -> 885e74a38 http://git-wip-us.apache.org/repos/asf/spark/blob/885e74a3/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala -

[3/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
[SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery Renamed for simplicity, so that it's obvious that it's related to streaming. Existing unit tests. Author: Tathagata Das Closes #13673 from tdas/SPARK-15953. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit

spark git commit: [SPARK-15826][CORE] PipedRDD to allow configurable char encoding

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 de56ea9bf -> 8ef31fbd7 [SPARK-15826][CORE] PipedRDD to allow configurable char encoding ## What changes were proposed in this pull request? Link to jira which describes the problem: https://issues.apache.org/jira/browse/SPARK-15826 T

spark git commit: [SPARK-15826][CORE] PipedRDD to allow configurable char encoding

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 9b234b55d -> 279bd4aa5 [SPARK-15826][CORE] PipedRDD to allow configurable char encoding ## What changes were proposed in this pull request? Link to jira which describes the problem: https://issues.apache.org/jira/browse/SPARK-15826 The f

spark git commit: [SPARK-12492][SQL] Add missing SQLExecution.withNewExecutionId for hiveResultString

2016-06-15 Thread zsxwing
hat queries running in `spark-sql` will be shown in Web UI. Closes #13115 ## How was this patch tested? Existing unit tests. Author: KaiXinXiaoLei Closes #13689 from zsxwing/pr13115. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/sp

spark git commit: [SPARK-12492][SQL] Add missing SQLExecution.withNewExecutionId for hiveResultString

2016-06-15 Thread zsxwing
hat queries running in `spark-sql` will be shown in Web UI. Closes #13115 ## How was this patch tested? Existing unit tests. Author: KaiXinXiaoLei Closes #13689 from zsxwing/pr13115. (cherry picked from commit 3e6d567a4688f064f2a2259c8e436b7c628a431c) Signed-off-by: Shixiong Zhu Project: h

spark git commit: [SPARK-15981][SQL][STREAMING] Fixed bug and added tests in DataStreamReader Python API

2016-06-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master a865f6e05 -> 084dca770 [SPARK-15981][SQL][STREAMING] Fixed bug and added tests in DataStreamReader Python API ## What changes were proposed in this pull request? - Fixed bug in Python API of DataStreamReader. Because a single path was be

spark git commit: [SPARK-15981][SQL][STREAMING] Fixed bug and added tests in DataStreamReader Python API

2016-06-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 0a2291cd1 -> e11c27918 [SPARK-15981][SQL][STREAMING] Fixed bug and added tests in DataStreamReader Python API ## What changes were proposed in this pull request? - Fixed bug in Python API of DataStreamReader. Because a single path wa

spark git commit: [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState

2016-06-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 62d2fa5e9 -> d9c6628c4 [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState ## What changes were proposed in this pull request? Before this patch, after a SparkSession has been cre

spark git commit: [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState

2016-06-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 8f7138859 -> b3678eb7e [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState ## What changes were proposed in this pull request? Before this patch, after a SparkSession has been

spark git commit: Revert "[SPARK-15395][CORE] Use getHostString to create RpcAddress (backport for 1.6)"

2016-06-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 4621fe94b -> e530823dd Revert "[SPARK-15395][CORE] Use getHostString to create RpcAddress (backport for 1.6)" This reverts commit 7ad82b663092615b02bef3991fb1a21af77d2358. See SPARK-16017. Project: http://git-wip-us.apache.org/repos/

spark git commit: [SPARK-16017][CORE] Send hostname from CoarseGrainedExecutorBackend to driver

2016-06-17 Thread zsxwing
Zhu Closes #13741 from zsxwing/SPARK-16017. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/62d8fe20 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/62d8fe20 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/62d8f

spark git commit: [SPARK-16017][CORE] Send hostname from CoarseGrainedExecutorBackend to driver

2016-06-17 Thread zsxwing
ong Zhu Closes #13741 from zsxwing/SPARK-16017. (cherry picked from commit 62d8fe2089659e8212753a622708517e0f4a77bc) Signed-off-by: Shixiong Zhu Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0701b8d9 Tree: http://git-

spark git commit: [SPARK-16020][SQL] Fix complete mode aggregation with console sink

2016-06-17 Thread zsxwing
his PR just collects `DataFrame` and calls `show` on a batch DataFrame based on the result. This is fine since ConsoleSink is only for debugging. ## How was this patch tested? Manually confirmed ConsoleSink now works with complete mode aggregation. Author: Shixiong Zhu Closes #13740 from zsxw

spark git commit: [SPARK-16020][SQL] Fix complete mode aggregation with console sink

2016-06-17 Thread zsxwing
PR just collects `DataFrame` and calls `show` on a batch DataFrame based on the result. This is fine since ConsoleSink is only for debugging. ## How was this patch tested? Manually confirmed ConsoleSink now works with complete mode aggregation. Author: Shixiong Zhu Closes #13740 from zsxw

spark git commit: [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc

2016-06-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 6df8e3886 -> b99129cc4 [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc ## What changes were proposed in this pull request? Issues with current reader behavior. - `text()` wi

spark git commit: [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc

2016-06-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 8159da20e -> 54001cb12 [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc ## What changes were proposed in this pull request? Issues with current reader behavior. - `text()

spark git commit: [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks

2016-06-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 abe36c53d -> d98fb19c1 [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with on

spark git commit: [SPARK-16120][STREAMING] getCurrentLogFiles in ReceiverSuite WAL generating and cleaning case uses external variable instead of the passed parameter

2016-06-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 0a9c02759 -> c2cebdb7d [SPARK-16120][STREAMING] getCurrentLogFiles in ReceiverSuite WAL generating and cleaning case uses external variable instead of the passed parameter ## What changes were proposed in this pull request? In `ReceiverSu

spark git commit: [SPARK-16120][STREAMING] getCurrentLogFiles in ReceiverSuite WAL generating and cleaning case uses external variable instead of the passed parameter

2016-06-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 76d0ef34e -> 520828c90 [SPARK-16120][STREAMING] getCurrentLogFiles in ReceiverSuite WAL generating and cleaning case uses external variable instead of the passed parameter ## What changes were proposed in this pull request? In `Receiv

spark git commit: [SPARK-16131] initialize internal logger lazily in Scala preferred way

2016-06-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 857ecff1d -> 044971eca [SPARK-16131] initialize internal logger lazily in Scala preferred way ## What changes were proposed in this pull request? Initialize logger instance lazily in Scala preferred way ## How was this patch tested? By r
