spark git commit: [SPARK-9801] [STREAMING] Check if file exists before deleting temporary files.

2015-08-10 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 4b5bbc589 -> 6dde38026 [SPARK-9801] [STREAMING] Check if file exists before deleting temporary files. Spark streaming deletes the temp file and backup files without checking if they exist or not Author: Hao Zhu Closes #8082 from via

spark git commit: [SPARK-9801] [STREAMING] Check if file exists before deleting temporary files.

2015-08-10 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 8f4014fda -> 94692bb14 [SPARK-9801] [STREAMING] Check if file exists before deleting temporary files. Spark streaming deletes the temp file and backup files without checking if they exist or not Author: Hao Zhu Closes #8082 from via

spark git commit: [SPARK-9801] [STREAMING] Check if file exists before deleting temporary files.

2015-08-10 Thread tdas
Repository: spark Updated Branches: refs/heads/master 853809e94 -> 3c9802d94 [SPARK-9801] [STREAMING] Check if file exists before deleting temporary files. Spark streaming deletes the temp file and backup files without checking if they exist or not Author: Hao Zhu Closes #8082 from viadea/
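The fix described above amounts to guarding each deletion with an existence check. A minimal sketch of that pattern using `java.nio` (the helper name is hypothetical; the actual patch touches Spark's checkpoint cleanup code, not this API):

```scala
import java.nio.file.{Files, Path}

// Guarded-delete pattern from SPARK-9801: only delete temp/backup files
// that actually exist, so a missing file no longer raises an error.
object SafeCleanup {
  def deleteIfExists(p: Path): Boolean =
    if (Files.exists(p)) { Files.delete(p); true }
    else false // nothing to do; previously this path threw an exception
}
```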

spark git commit: [SPARK-5155] [PYSPARK] [STREAMING] Mqtt streaming support in Python

2015-08-10 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 51406becc -> 8f4014fda [SPARK-5155] [PYSPARK] [STREAMING] Mqtt streaming support in Python This PR is based on #4229, thanks prabeesh. Closes #4229 Author: Prabeesh K Author: zsxwing Author: prabs Author: Prabeesh K Closes #7833

spark git commit: [SPARK-5155] [PYSPARK] [STREAMING] Mqtt streaming support in Python

2015-08-10 Thread tdas
Repository: spark Updated Branches: refs/heads/master c4fd2a242 -> 853809e94 [SPARK-5155] [PYSPARK] [STREAMING] Mqtt streaming support in Python This PR is based on #4229, thanks prabeesh. Closes #4229 Author: Prabeesh K Author: zsxwing Author: prabs Author: Prabeesh K Closes #7833 from

spark git commit: [SPARK-9639] [STREAMING] Fix a potential NPE in Streaming JobScheduler

2015-08-06 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 8ecfb05e3 -> 980687206 [SPARK-9639] [STREAMING] Fix a potential NPE in Streaming JobScheduler Because `JobScheduler.stop(false)` may set `eventLoop` to null when `JobHandler` is running, then it's possible that when `post` is called,

spark git commit: [SPARK-9639] [STREAMING] Fix a potential NPE in Streaming JobScheduler

2015-08-06 Thread tdas
Repository: spark Updated Branches: refs/heads/master 1723e3489 -> 346209097 [SPARK-9639] [STREAMING] Fix a potential NPE in Streaming JobScheduler Because `JobScheduler.stop(false)` may set `eventLoop` to null when `JobHandler` is running, then it's possible that when `post` is called, `eve
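The race described above can be sketched as follows (names simplified; this is not the real `JobScheduler` API): a volatile field read twice can be nulled between the check and the use, so the fix reads it once into a local value:

```scala
// Sketch of the SPARK-9639 race: stop() may null out `eventLoop` while a
// handler thread is about to post. Capturing the volatile field in a local
// val and null-checking that avoids the NPE.
class Scheduler {
  @volatile private var eventLoop: StringBuilder = new StringBuilder
  def stop(): Unit = { eventLoop = null }
  def post(msg: String): Boolean = {
    val loop = eventLoop          // read the volatile field exactly once
    if (loop != null) { loop.append(msg); true }
    else false                    // scheduler already stopped; drop the event
  }
}
```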

spark git commit: [DOCS] [STREAMING] make the existing parameter docs for OffsetRange ac…

2015-08-06 Thread tdas
Repository: spark Updated Branches: refs/heads/master 0a078303d -> 1723e3489 [DOCS] [STREAMING] make the existing parameter docs for OffsetRange ac… …tually visible Author: cody koeninger Closes #7995 from koeninger/doc-fixes and squashes the following commits: 87af9ea [cody koeninger]

spark git commit: [DOCS] [STREAMING] make the existing parameter docs for OffsetRange ac…

2015-08-06 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 3997dd3fd -> 8ecfb05e3 [DOCS] [STREAMING] make the existing parameter docs for OffsetRange ac… …tually visible Author: cody koeninger Closes #7995 from koeninger/doc-fixes and squashes the following commits: 87af9ea [cody koenin

spark git commit: [SPARK-9556] [SPARK-9619] [SPARK-9624] [STREAMING] Make BlockGenerator more robust and make all BlockGenerators subscribe to rate limit updates

2015-08-06 Thread tdas
few internal API to return the current rate of block generators as Long instead of Option\[Long\] (was inconsistent at places). - Updated existing `ReceiverTrackerSuite` to test that custom block generators get rate updates as well. Author: Tathagata Das Closes #7913 from tdas/SPARK-9556 and squas

spark git commit: [SPARK-9556] [SPARK-9619] [SPARK-9624] [STREAMING] Make BlockGenerator more robust and make all BlockGenerators subscribe to rate limit updates

2015-08-06 Thread tdas
nal API to return the current rate of block generators as Long instead of Option\[Long\] (was inconsistent at places). - Updated existing `ReceiverTrackerSuite` to test that custom block generators get rate updates as well. Author: Tathagata Das Closes #7913 from tdas/SPARK-9556 and squashes

spark git commit: [SPARK-8978] [STREAMING] Implements the DirectKafkaRateController

2015-08-06 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 8a7956283 -> 8b00c0690 [SPARK-8978] [STREAMING] Implements the DirectKafkaRateController Author: Dean Wampler Author: Nilanjan Raychaudhuri Author: François Garillot Closes #7796 from dragos/topic/streaming-bp/kafka-direct and squ

spark git commit: [SPARK-8978] [STREAMING] Implements the DirectKafkaRateController

2015-08-06 Thread tdas
Repository: spark Updated Branches: refs/heads/master 0d7aac99d -> a1bbf1bc5 [SPARK-8978] [STREAMING] Implements the DirectKafkaRateController Author: Dean Wampler Author: Nilanjan Raychaudhuri Author: François Garillot Closes #7796 from dragos/topic/streaming-bp/kafka-direct and squashe

spark git commit: [SPARK-9601] [DOCS] Fix JavaPairDStream signature for stream-stream and windowed join in streaming guide doc

2015-08-05 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 7fa419535 -> 6306019ff [SPARK-9601] [DOCS] Fix JavaPairDStream signature for stream-stream and windowed join in streaming guide doc Author: Namit Katariya Closes #7935 from namitk/SPARK-9601 and squashes the following commits: 03b57

spark git commit: [SPARK-9601] [DOCS] Fix JavaPairDStream signature for stream-stream and windowed join in streaming guide doc

2015-08-05 Thread tdas
Repository: spark Updated Branches: refs/heads/master 6d8a6e416 -> 1bf608b5e [SPARK-9601] [DOCS] Fix JavaPairDStream signature for stream-stream and windowed join in streaming guide doc Author: Namit Katariya Closes #7935 from namitk/SPARK-9601 and squashes the following commits: 03b5784 [

spark git commit: [SPARK-9217] [STREAMING] Make the kinesis receiver reliable by recording sequence numbers

2015-08-05 Thread tdas
825 from tdas/kinesis-receiver and squashes the following commits: 2159be9 [Tathagata Das] Fixed bug 569be83 [Tathagata Das] Fix scala style issue bf31e22 [Tathagata Das] Added more documentation to make the kinesis test endpoint more configurable 3ad8361 [Tathagata Das] Merge remote-tracking bra

spark git commit: [SPARK-9217] [STREAMING] Make the kinesis receiver reliable by recording sequence numbers

2015-08-05 Thread tdas
ses #7825 from tdas/kinesis-receiver and squashes the following commits: 2159be9 [Tathagata Das] Fixed bug 569be83 [Tathagata Das] Fix scala style issue bf31e22 [Tathagata Das] Added more documentation to make the kinesis test endpoint more configurable 3ad8361 [Tathagata Das] Merge remote-track

spark git commit: [SPARK-9504] [STREAMING] [TESTS] Fix o.a.s.streaming.StreamingContextSuite.stop gracefully again

2015-08-04 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.5 d196d3607 -> 6e72d24e2 [SPARK-9504] [STREAMING] [TESTS] Fix o.a.s.streaming.StreamingContextSuite.stop gracefully again The test failure is here: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/3150/AMPLAB_JENKINS_BUILD_PR

spark git commit: [SPARK-9504] [STREAMING] [TESTS] Fix o.a.s.streaming.StreamingContextSuite.stop gracefully again

2015-08-04 Thread tdas
Repository: spark Updated Branches: refs/heads/master 2b67fdb60 -> d34bac0e1 [SPARK-9504] [STREAMING] [TESTS] Fix o.a.s.streaming.StreamingContextSuite.stop gracefully again The test failure is here: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/3150/AMPLAB_JENKINS_BUILD_PROFIL

spark git commit: [SPARK-1855] Local checkpointing

2015-08-03 Thread tdas
Repository: spark Updated Branches: refs/heads/master 69f5a7c93 -> b41a32718 [SPARK-1855] Local checkpointing Certain use cases of Spark involve RDDs with long lineages that must be truncated periodically (e.g. GraphX). The existing way of doing it is through `rdd.checkpoint()`, which is exp

spark git commit: [SPARK-9056] [STREAMING] Rename configuration `spark.streaming.minRememberDuration` to `spark.streaming.fileStream.minRememberDuration`

2015-07-31 Thread tdas
Repository: spark Updated Branches: refs/heads/master 3c0d2e552 -> 060c79aab [SPARK-9056] [STREAMING] Rename configuration `spark.streaming.minRememberDuration` to `spark.streaming.fileStream.minRememberDuration` Rename configuration `spark.streaming.minRememberDuration` to `spark.streaming

spark git commit: [SPARK-9504] [STREAMING] [TESTS] Use eventually to fix the flaky test

2015-07-31 Thread tdas
Repository: spark Updated Branches: refs/heads/master 3afc1de89 -> d04634701 [SPARK-9504] [STREAMING] [TESTS] Use eventually to fix the flaky test The previous code uses `ssc.awaitTerminationOrTimeout(500)`. Since nobody will stop it during `awaitTerminationOrTimeout`, it's just like `sleep(5

spark git commit: [SPARK-8564] [STREAMING] Add the Python API for Kinesis

2015-07-31 Thread tdas
Repository: spark Updated Branches: refs/heads/master 39ab199a3 -> 3afc1de89 [SPARK-8564] [STREAMING] Add the Python API for Kinesis This PR adds the Python API for Kinesis, including a Python example and a simple unit test. Author: zsxwing Closes #6955 from zsxwing/kinesis-python and squa

spark git commit: [SPARK-8979] Add a PID based rate estimator

2015-07-31 Thread tdas
Repository: spark Updated Branches: refs/heads/master e8bdcdeab -> 0a1d2ca42 [SPARK-8979] Add a PID based rate estimator Based on #7600 /cc tdas Author: Iulian Dragos Author: François Garillot Closes #7648 from dragos/topic/streaming-bp/pid and squashes the following commits: aa5b

spark git commit: [SPARK-9472] [STREAMING] consistent hadoop configuration, streaming only

2015-07-30 Thread tdas
Repository: spark Updated Branches: refs/heads/master 3c66ff727 -> 9307f5653 [SPARK-9472] [STREAMING] consistent hadoop configuration, streaming only Author: cody koeninger Closes #7772 from koeninger/streaming-hadoop-config and squashes the following commits: 5267284 [cody koeninger] [SPA

spark git commit: [STREAMING] [TEST] [HOTFIX] Fixed Kinesis test to not throw weird errors when Kinesis tests are enabled without AWS keys

2015-07-30 Thread tdas
at org.apache.spark.streaming.kinesis.KinesisStreamSuite$$anonfun$3.apply(KinesisStreamSuite.scala:86) ``` This is because attempting to delete a non-existent Kinesis stream throws uncaught exception. This PR fixes it. Author: Tathagata Das Closes #7809 from tdas/kinesis-test-hotfix and squashes

spark git commit: [SPARK-9479] [STREAMING] [TESTS] Fix ReceiverTrackerSuite failure for maven build and other potential test failures in Streaming

2015-07-30 Thread tdas
Repository: spark Updated Branches: refs/heads/master 89cda69ec -> 0dbd6963d [SPARK-9479] [STREAMING] [TESTS] Fix ReceiverTrackerSuite failure for maven build and other potential test failures in Streaming See https://issues.apache.org/jira/browse/SPARK-9479 for the failure cause. The PR inc

spark git commit: [SPARK-9335] [TESTS] Enable Kinesis tests only when files in extras/kinesis-asl are changed

2015-07-30 Thread tdas
Repository: spark Updated Branches: refs/heads/master 1221849f9 -> 76f2e393a [SPARK-9335] [TESTS] Enable Kinesis tests only when files in extras/kinesis-asl are changed Author: zsxwing Closes #7711 from zsxwing/SPARK-9335-test and squashes the following commits: c13ec2f [zsxwing] environs

spark git commit: [SPARK-8977] [STREAMING] Defines the RateEstimator interface, and implements the RateController

2015-07-29 Thread tdas
Repository: spark Updated Branches: refs/heads/master 069a4c414 -> 819be46e5 [SPARK-8977] [STREAMING] Defines the RateEstimator interface, and implements the RateController Based on #7471. - [x] add a test that exercises the publish path from driver to receiver - [ ] remove Serializable from

spark git commit: [STREAMING] [HOTFIX] Ignore ReceiverTrackerSuite flaky test

2015-07-28 Thread tdas
Repository: spark Updated Branches: refs/heads/master 59b92add7 -> c5ed36953 [STREAMING] [HOTFIX] Ignore ReceiverTrackerSuite flaky test Author: Tathagata Das Closes #7738 from tdas/ReceiverTrackerSuite-hotfix and squashes the following commits: 00f0ee1 [Tathagata Das] ignore flaky t

spark git commit: [SPARK-9335] [STREAMING] [TESTS] Make sure the test stream is deleted in KinesisBackedBlockRDDSuite

2015-07-27 Thread tdas
Repository: spark Updated Branches: refs/heads/master 9c5612f4e -> d93ab93d6 [SPARK-9335] [STREAMING] [TESTS] Make sure the test stream is deleted in KinesisBackedBlockRDDSuite KinesisBackedBlockRDDSuite should make sure delete the stream. Author: zsxwing Closes #7663 from zsxwing/fix-SPAR

spark git commit: [SPARK-8882] [STREAMING] Add a new Receiver scheduling mechanism

2015-07-27 Thread tdas
Repository: spark Updated Branches: refs/heads/master ce89ff477 -> daa1964b6 [SPARK-8882] [STREAMING] Add a new Receiver scheduling mechanism The design doc: https://docs.google.com/document/d/1ZsoRvHjpISPrDmSjsGzuSu8UjwgbtmoCTzmhgTurHJw/edit?usp=sharing Author: zsxwing Closes #7276 from z

spark git commit: [SPARK-9216] [STREAMING] Define KinesisBackedBlockRDDs

2015-07-23 Thread tdas
t/d/1k0dl270EnK7uExrsCE7jYw7PYx0YC935uBcxn3p0f58/edit Author: Tathagata Das Closes #7578 from tdas/kinesis-rdd and squashes the following commits: 543d208 [Tathagata Das] Fixed scala style 5082a30 [Tathagata Das] Fixed scala style 3f40c2d [Tathagata Das] Addressed comments c4f25d2 [Tathagata Das] Addressed comment d3d6

spark git commit: [SPARK-8975] [STREAMING] Adds a mechanism to send a new rate from the driver to the block generator

2015-07-22 Thread tdas
Repository: spark Updated Branches: refs/heads/master fe26584a1 -> 798dff7b4 [SPARK-8975] [STREAMING] Adds a mechanism to send a new rate from the driver to the block generator First step for [SPARK-7398](https://issues.apache.org/jira/browse/SPARK-7398). tdas huitseeker Author: Iul

spark git commit: Disable flaky test: ReceiverSuite "block generator throttling".

2015-07-20 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.3 016332535 -> 596a4cb8c Disable flaky test: ReceiverSuite "block generator throttling". Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/596a4cb8 Tree: http://git-wip

spark git commit: [SPARK-9030] [STREAMING] [HOTFIX] Make sure that no attempt to create Kinesis streams is made when not enabled

2015-07-19 Thread tdas
the KinesisStreamSuite attempted to find AWS credentials to create Kinesis stream, and failed. Solution: Made sure all accesses to KinesisTestUtils, that created streams, is under `testOrIgnore` Author: Tathagata Das Closes #7519 from tdas/kinesis-tests and squashes the following commits: 64d6

spark git commit: [SPARK-9030] [STREAMING] Add Kinesis.createStream unit tests that actual sends data

2015-07-17 Thread tdas
ill not run by default. It will only run when the relevant environment variables are set. Author: Tathagata Das Closes #7413 from tdas/kinesis-tests and squashes the following commits: 0e16db5 [Tathagata Das] Added more comments regarding testOrIgnore 1ea5ce0 [Tathagata Das] Added more comme

spark git commit: [SPARK-5681] [STREAMING] Move 'stopReceivers' to the event loop to resolve the race condition

2015-07-17 Thread tdas
Repository: spark Updated Branches: refs/heads/master 074085d67 -> ad0954f6d [SPARK-5681] [STREAMING] Move 'stopReceivers' to the event loop to resolve the race condition This is an alternative way to fix `SPARK-5681`. It minimizes the changes. Closes #4467 Author: zsxwing Author: Liang-Ch

spark git commit: [SPARK-6304] [STREAMING] Fix checkpointing doesn't retain driver port issue.

2015-07-16 Thread tdas
Repository: spark Updated Branches: refs/heads/master fec10f0c6 -> 031d7d414 [SPARK-6304] [STREAMING] Fix checkpointing doesn't retain driver port issue. Author: jerryshao Author: Saisai Shao Closes #5060 from jerryshao/SPARK-6304 and squashes the following commits: 89b01f5 [jerryshao] Upd

spark git commit: [SPARK-5523] [CORE] [STREAMING] Add a cache for hostname in TaskMetrics to decrease the memory usage and GC overhead

2015-07-14 Thread tdas
Repository: spark Updated Branches: refs/heads/master f957796c4 -> bb870e72f [SPARK-5523] [CORE] [STREAMING] Add a cache for hostname in TaskMetrics to decrease the memory usage and GC overhead Hostname in TaskMetrics will be created through deserialization, mostly the number of hostname is

spark git commit: [SPARK-8820] [STREAMING] Add a configuration to set checkpoint dir.

2015-07-14 Thread tdas
Repository: spark Updated Branches: refs/heads/master cc57d705e -> f957796c4 [SPARK-8820] [STREAMING] Add a configuration to set checkpoint dir. Add a configuration to set checkpoint directory for convenience to user. [Jira Address](https://issues.apache.org/jira/browse/SPARK-8820) Author: h

spark git commit: [SPARK-4072] [CORE] Display Streaming blocks in Streaming UI

2015-07-14 Thread tdas
Repository: spark Updated Branches: refs/heads/master 0a4071eab -> fb1d06fc2 [SPARK-4072] [CORE] Display Streaming blocks in Streaming UI Replace #6634 This PR adds `SparkListenerBlockUpdated` to SparkListener so that it can monitor all block update infos that are sent to `BlockManagerMaster`

spark git commit: [SPARK-8743] [STREAMING] Deregister Codahale metrics for streaming when StreamingContext is closed

2015-07-13 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 898e5f76f -> 50607eca5 [SPARK-8743] [STREAMING] Deregister Codahale metrics for streaming when StreamingContext is closed The issue link: https://issues.apache.org/jira/browse/SPARK-8743 Deregister Codahale metrics for streaming when S

spark git commit: [SPARK-8743] [STREAMING] Deregister Codahale metrics for streaming when StreamingContext is closed

2015-07-13 Thread tdas
Repository: spark Updated Branches: refs/heads/master 0aed38e44 -> b7bcbe25f [SPARK-8743] [STREAMING] Deregister Codahale metrics for streaming when StreamingContext is closed The issue link: https://issues.apache.org/jira/browse/SPARK-8743 Deregister Codahale metrics for streaming when Strea

spark git commit: [SPARK-8533] [STREAMING] Upgrade Flume to 1.6.0

2015-07-13 Thread tdas
Repository: spark Updated Branches: refs/heads/master 4c797f2b0 -> 0aed38e44 [SPARK-8533] [STREAMING] Upgrade Flume to 1.6.0 Author: Hari Shreedharan Closes #6939 from harishreedharan/upgrade-flume-1.6.0 and squashes the following commits: 94b80ae [Hari Shreedharan] [SPARK-8533][Streaming]

spark git commit: [DOCS] Added important updateStateByKey details

2015-07-09 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 2f2f9da4b -> bef059150 [DOCS] Added important updateStateByKey details Runs for *all* existing keys and returning "None" will remove the key-value pair. Author: Michael Vogiatzis Closes #7229 from mvogiatzis/patch-1 and squashes the

spark git commit: [DOCS] Added important updateStateByKey details

2015-07-09 Thread tdas
Repository: spark Updated Branches: refs/heads/master 1903641e6 -> d538919cc [DOCS] Added important updateStateByKey details Runs for *all* existing keys and returning "None" will remove the key-value pair. Author: Michael Vogiatzis Closes #7229 from mvogiatzis/patch-1 and squashes the fol
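The semantics being documented can be modeled in a few lines (an illustrative model only, not Spark's implementation): the update function runs for every existing key each batch, and returning `None` removes that key's state.

```scala
// Toy model of updateStateByKey semantics: apply `update` to the union of
// existing keys and keys with new values; a None result drops the key.
def updateState[K, V, S](
    state: Map[K, S],
    batch: Map[K, Seq[V]],
    update: (Seq[V], Option[S]) => Option[S]): Map[K, S] = {
  val keys = state.keySet ++ batch.keySet
  keys.flatMap { k =>
    update(batch.getOrElse(k, Seq.empty), state.get(k)).map(k -> _)
  }.toMap
}
```

With a counting update function that returns `None` when a key receives no new values, keys without fresh data are removed from the state.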

spark git commit: [SPARK-8852] [FLUME] Trim dependencies in flume assembly.

2015-07-09 Thread tdas
Repository: spark Updated Branches: refs/heads/master 2d45571fc -> 0e78e40c0 [SPARK-8852] [FLUME] Trim dependencies in flume assembly. Also, add support for the *-provided profiles. This avoids repackaging things that are already in the Spark assembly, or, in the case of the *-provided profile

spark git commit: [SPARK-8865] [STREAMING] FIX BUG: check key in kafka params

2015-07-09 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.3 960aec976 -> 016332535 [SPARK-8865] [STREAMING] FIX BUG: check key in kafka params Author: guowei2 Closes #7254 from guowei2/spark-8865 and squashes the following commits: 48ca17a [guowei2] fix contains key (cherry picked from commi

spark git commit: [SPARK-8865] [STREAMING] FIX BUG: check key in kafka params

2015-07-09 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 990f434e6 -> 2f2f9da4b [SPARK-8865] [STREAMING] FIX BUG: check key in kafka params Author: guowei2 Closes #7254 from guowei2/spark-8865 and squashes the following commits: 48ca17a [guowei2] fix contains key (cherry picked from commi

spark git commit: [SPARK-8865] [STREAMING] FIX BUG: check key in kafka params

2015-07-09 Thread tdas
Repository: spark Updated Branches: refs/heads/master c9e2ef52b -> 897700369 [SPARK-8865] [STREAMING] FIX BUG: check key in kafka params Author: guowei2 Closes #7254 from guowei2/spark-8865 and squashes the following commits: 48ca17a [guowei2] fix contains key Project: http://git-wip-us.a

spark git commit: [SPARK-8389] [STREAMING] [PYSPARK] Expose KafkaRDDs offsetRange in Python

2015-07-09 Thread tdas
Repository: spark Updated Branches: refs/heads/master 1f6b0b123 -> 3ccebf36c [SPARK-8389] [STREAMING] [PYSPARK] Expose KafkaRDDs offsetRange in Python This PR propose a simple way to expose OffsetRange in Python code, also the usage of offsetRanges is similar to Scala/Java way, here in Python

spark git commit: [SPARK-8701] [STREAMING] [WEBUI] Add input metadata in the batch page

2015-07-09 Thread tdas
Repository: spark Updated Branches: refs/heads/master c4830598b -> 1f6b0b123 [SPARK-8701] [STREAMING] [WEBUI] Add input metadata in the batch page This PR adds `metadata` to `InputInfo`. `InputDStream` can report its metadata for a batch and it will be shown in the batch page. For example,

spark git commit: [SPARK-8378] [STREAMING] Add the Python API for Flume

2015-07-01 Thread tdas
Repository: spark Updated Branches: refs/heads/master b8faa3287 -> 75b9fe4c5 [SPARK-8378] [STREAMING] Add the Python API for Flume Author: zsxwing Closes #6830 from zsxwing/flume-python and squashes the following commits: 78dfdac [zsxwing] Fix the compile error in the test code f1bf3c0 [zsx

spark git commit: [SPARK-8619] [STREAMING] Don't recover keytab and principal configuration within Streaming checkpoint

2015-06-30 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 c83ec10cb -> f9cd5cc1b [SPARK-8619] [STREAMING] Don't recover keytab and principal configuration within Streaming checkpoint [Client.scala](https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Cl

spark git commit: [SPARK-8619] [STREAMING] Don't recover keytab and principal configuration within Streaming checkpoint

2015-06-30 Thread tdas
Repository: spark Updated Branches: refs/heads/master 57264400a -> d16a94437 [SPARK-8619] [STREAMING] Don't recover keytab and principal configuration within Streaming checkpoint [Client.scala](https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client

spark git commit: [SPARK-8630] [STREAMING] Prevent from checkpointing QueueInputDStream

2015-06-30 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 894404cb2 -> c83ec10cb [SPARK-8630] [STREAMING] Prevent from checkpointing QueueInputDStream This PR throws an exception in `QueueInputDStream.writeObject` so that it can fail the application when calling `StreamingContext.start` rathe

spark git commit: [SPARK-8630] [STREAMING] Prevent from checkpointing QueueInputDStream

2015-06-30 Thread tdas
Repository: spark Updated Branches: refs/heads/master ca7e460f7 -> 57264400a [SPARK-8630] [STREAMING] Prevent from checkpointing QueueInputDStream This PR throws an exception in `QueueInputDStream.writeObject` so that it can fail the application when calling `StreamingContext.start` rather th
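The technique above — failing fast at serialization time — can be sketched like this (simplified; not the real `QueueInputDStream` code): a class vetoes checkpointing by throwing from its custom `writeObject`, so serialization fails immediately instead of producing a broken checkpoint.

```scala
import java.io.{NotSerializableException, ObjectOutputStream}

// Java serialization calls a private writeObject(ObjectOutputStream) if one
// is defined; throwing from it aborts the write with a clear error.
class NotCheckpointable extends Serializable {
  @throws(classOf[java.io.IOException])
  private def writeObject(out: ObjectOutputStream): Unit =
    throw new NotSerializableException(
      "queueStream does not support checkpointing")
}
```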

spark git commit: [SPARK-7988] [STREAMING] Round-robin scheduling of receivers by default

2015-06-30 Thread tdas
ter isn't really needed. Tested this on a cluster of 6 nodes and noticed 20-25% gain in throughput compared to random scheduling. tdas pwendell Author: nishkamravi2 Author: Nishkam Ravi Closes #6607 from nishkamravi2/master_nravi and squashes the following commits: 1918819 [Nishkam Ravi

spark git commit: [SPARK-8399] [STREAMING] [WEB UI] Overlap between histograms and axis' name in Spark Streaming UI

2015-06-24 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 f6682dd6e -> 93793237e [SPARK-8399] [STREAMING] [WEB UI] Overlap between histograms and axis' name in Spark Streaming UI Moved where the X axis' name (#batches) is written in histograms in the spark streaming web ui so the histograms

spark git commit: [SPARK-8399] [STREAMING] [WEB UI] Overlap between histograms and axis' name in Spark Streaming UI

2015-06-24 Thread tdas
Repository: spark Updated Branches: refs/heads/master 31f48e5af -> 1173483f3 [SPARK-8399] [STREAMING] [WEB UI] Overlap between histograms and axis' name in Spark Streaming UI Moved where the X axis' name (#batches) is written in histograms in the spark streaming web ui so the histograms and

spark git commit: [SPARK-8483] [STREAMING] Remove commons-lang3 dependency from Flume Sink

2015-06-22 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 d0943afbc -> 929479675 [SPARK-8483] [STREAMING] Remove commons-lang3 dependency from Flume Sink Author: Hari Shreedharan Closes #6910 from harishreedharan/remove-commons-lang3 and squashes the following commits: 9875f7d [Hari Shreed

spark git commit: [SPARK-8483] [STREAMING] Remove commons-lang3 dependency from Flume Si…

2015-06-22 Thread tdas
Repository: spark Updated Branches: refs/heads/master 31bd30687 -> 9b618fb0d [SPARK-8483] [STREAMING] Remove commons-lang3 dependency from Flume Si… …nk. Also bump Flume version to 1.6.0 Author: Hari Shreedharan Closes #6910 from harishreedharan/remove-commons-lang3 and squashes the fo

spark git commit: [SPARK-8127] [STREAMING] [KAFKA] KafkaRDD optimize count() take() isEmpty()

2015-06-19 Thread tdas
Repository: spark Updated Branches: refs/heads/master bec40e52b -> 1b6fe9b1a [SPARK-8127] [STREAMING] [KAFKA] KafkaRDD optimize count() take() isEmpty() …ed KafkaRDD methods. Possible fix for [SPARK-7122], but probably a worthwhile optimization regardless. Author: cody koeninger Closes

[2/2] spark git commit: [SPARK-8390] [STREAMING] [KAFKA] fix docs related to HasOffsetRanges

2015-06-19 Thread tdas
[SPARK-8390] [STREAMING] [KAFKA] fix docs related to HasOffsetRanges Author: cody koeninger Closes #6863 from koeninger/SPARK-8390 and squashes the following commits: 26a06bd [cody koeninger] Merge branch 'master' into SPARK-8390 3744492 [cody koeninger] [Streaming][Kafka][SPARK-8390] doc chang

[1/2] spark git commit: [SPARK-8389] [STREAMING] [KAFKA] Example of getting offset ranges out o…

2015-06-19 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 2248ad8b7 -> a7b773a8b [SPARK-8389] [STREAMING] [KAFKA] Example of getting offset ranges out o… …f the existing java direct stream api Author: cody koeninger Closes #6846 from koeninger/SPARK-8389 and squashes the following commi

spark git commit: [SPARK-8390] [STREAMING] [KAFKA] fix docs related to HasOffsetRanges

2015-06-19 Thread tdas
Repository: spark Updated Branches: refs/heads/master a333a72e0 -> b305e377f [SPARK-8390] [STREAMING] [KAFKA] fix docs related to HasOffsetRanges Author: cody koeninger Closes #6863 from koeninger/SPARK-8390 and squashes the following commits: 26a06bd [cody koeninger] Merge branch 'master'

spark git commit: [SPARK-8080] [STREAMING] Receiver.store with Iterator does not give correct count at Spark UI

2015-06-18 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 bd9bbd611 -> b55e4b9a5 [SPARK-8080] [STREAMING] Receiver.store with Iterator does not give correct count at Spark UI tdas zsxwing this is the new PR for Spark-8080 I have merged https://github.com/apache/spark/pull/6659 Also

spark git commit: [SPARK-8080] [STREAMING] Receiver.store with Iterator does not give correct count at Spark UI

2015-06-18 Thread tdas
Repository: spark Updated Branches: refs/heads/master 4ce3bab89 -> 3eaed8769 [SPARK-8080] [STREAMING] Receiver.store with Iterator does not give correct count at Spark UI tdas zsxwing this is the new PR for Spark-8080 I have merged https://github.com/apache/spark/pull/6659 Also to ment

spark git commit: [SPARK-8376] [DOCS] Add common lang3 to the Spark Flume Sink doc

2015-06-18 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 c1da5cf02 -> 9f293a9eb [SPARK-8376] [DOCS] Add common lang3 to the Spark Flume Sink doc Commons Lang 3 has been added as one of the dependencies of Spark Flume Sink since #5703. This PR updates the doc for it. Author: zsxwing Closes

spark git commit: [SPARK-8376] [DOCS] Add common lang3 to the Spark Flume Sink doc

2015-06-18 Thread tdas
Repository: spark Updated Branches: refs/heads/master 44c931f00 -> 24e53793b [SPARK-8376] [DOCS] Add common lang3 to the Spark Flume Sink doc Commons Lang 3 has been added as one of the dependencies of Spark Flume Sink since #5703. This PR updates the doc for it. Author: zsxwing Closes #68

spark git commit: [SPARK-8404] [STREAMING] [TESTS] Use thread-safe collections to make the tests more reliable

2015-06-17 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 5e7973df0 -> 5aedfa2ce [SPARK-8404] [STREAMING] [TESTS] Use thread-safe collections to make the tests more reliable KafkaStreamSuite, DirectKafkaStreamSuite, JavaKafkaStreamSuite and JavaDirectKafkaStreamSuite use non-thread-safe coll

spark git commit: [SPARK-8404] [STREAMING] [TESTS] Use thread-safe collections to make the tests more reliable

2015-06-17 Thread tdas
Repository: spark Updated Branches: refs/heads/master 302556ff9 -> a06d9c8e7 [SPARK-8404] [STREAMING] [TESTS] Use thread-safe collections to make the tests more reliable KafkaStreamSuite, DirectKafkaStreamSuite, JavaKafkaStreamSuite and JavaDirectKafkaStreamSuite use non-thread-safe collecti

spark git commit: [SPARK-7284] [STREAMING] Updated streaming documentation

2015-06-12 Thread tdas
get partitionId in foreachRDD Author: Tathagata Das Closes #6781 from tdas/SPARK-7284 and squashes the following commits: aac7be0 [Tathagata Das] Added information on how to get partition id a66ec22 [Tathagata Das] Complete the line incomplete line, a92ca39 [Tathagata Das] Updated stream

spark git commit: [SPARK-7284] [STREAMING] Updated streaming documentation

2015-06-12 Thread tdas
get partitionId in foreachRDD Author: Tathagata Das Closes #6781 from tdas/SPARK-7284 and squashes the following commits: aac7be0 [Tathagata Das] Added information on how to get partition id a66ec22 [Tathagata Das] Complete the line incomplete line, a92ca39 [Tathagata Das] Updated streaming documentat

spark git commit: [SPARK-8112] [STREAMING] Fix the negative event count issue

2015-06-05 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 429c65851 -> 200c980a1 [SPARK-8112] [STREAMING] Fix the negative event count issue Author: zsxwing Closes #6659 from zsxwing/SPARK-8112 and squashes the following commits: a5d7da6 [zsxwing] Address comments d255b6e [zsxwing] Fix the

spark git commit: [SPARK-8112] [STREAMING] Fix the negative event count issue

2015-06-05 Thread tdas
Repository: spark Updated Branches: refs/heads/master 3f80bc841 -> 4f16d3fe2 [SPARK-8112] [STREAMING] Fix the negative event count issue Author: zsxwing Closes #6659 from zsxwing/SPARK-8112 and squashes the following commits: a5d7da6 [zsxwing] Address comments d255b6e [zsxwing] Fix the nega

spark git commit: [SPARK-8098] [WEBUI] Show correct length of bytes on log page

2015-06-04 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 0b71b851d -> 3ba6fc515 [SPARK-8098] [WEBUI] Show correct length of bytes on log page The log page should only show desired length of bytes. Currently it shows bytes from the startIndex to the end of the file. The "Next" button on the p
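The behavior described above can be sketched as a byte-range slice. The function name and signature here are hypothetical, not the actual web UI code: the point is to return exactly `length` bytes from `startIndex`, clamped to the file, rather than everything from `startIndex` to the end.

```scala
// Sketch only: clamp the requested window to the content bounds.
object LogPageSketch {
  def slice(content: Array[Byte], startIndex: Long, length: Int): Array[Byte] = {
    val start = math.min(math.max(startIndex, 0L), content.length.toLong).toInt
    val end   = math.min(content.length, start + math.max(length, 0))
    content.slice(start, end)
  }

  def main(args: Array[String]): Unit =
    // prints "234": exactly 3 bytes starting at index 2, not the rest of the file
    println(new String(slice("0123456789".getBytes("UTF-8"), 2, 3), "UTF-8"))
}
```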

spark git commit: [SPARK-8098] [WEBUI] Show correct length of bytes on log page

2015-06-04 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.3 3e8b0406a -> 5e77d69c7 [SPARK-8098] [WEBUI] Show correct length of bytes on log page The log page should only show desired length of bytes. Currently it shows bytes from the startIndex to the end of the file. The "Next" button on the p

spark git commit: [SPARK-8098] [WEBUI] Show correct length of bytes on log page

2015-06-04 Thread tdas
Repository: spark Updated Branches: refs/heads/master 2bcdf8c23 -> 63bc0c443 [SPARK-8098] [WEBUI] Show correct length of bytes on log page The log page should only show desired length of bytes. Currently it shows bytes from the startIndex to the end of the file. The "Next" button on the page

spark git commit: [SPARK-8015] [FLUME] Remove Guava dependency from flume-sink.

2015-06-02 Thread tdas
Repository: spark Updated Branches: refs/heads/master 1bb5d716c -> 0071bd8d3 [SPARK-8015] [FLUME] Remove Guava dependency from flume-sink. The minimal change would be to disable shading of Guava in the module, and rely on the transitive dependency from other libraries instead. But since Guava'

spark git commit: [SPARK-8015] [FLUME] Remove Guava dependency from flume-sink.

2015-06-02 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 f71a09de6 -> fa292dc3d [SPARK-8015] [FLUME] Remove Guava dependency from flume-sink. The minimal change would be to disable shading of Guava in the module, and rely on the transitive dependency from other libraries instead. But since Gu

spark git commit: [SPARK-7958] [STREAMING] Handled exception in StreamingContext.start() to prevent leaking of actors

2015-06-01 Thread tdas
IVE. The solution in this PR is to stop the internal scheduler if start() throws an exception, and mark the context as STOPPED. Author: Tathagata Das Closes #6559 from tdas/SPARK-7958 and squashes the following commits: 20b2ec1 [Tathagata Das] Added synchronized 790b617 [Tathagata Das] Handled exception
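The pattern described above can be sketched with hypothetical class names (these are simplified stand-ins, not Spark's internals): if starting the internal scheduler throws, stop it and mark the context `Stopped` so no background threads leak.

```scala
// Sketch only: on a failed start, clean up the scheduler and record STOPPED.
object StartSketch {
  sealed trait State
  case object Initialized extends State
  case object Active extends State
  case object Stopped extends State

  class Scheduler {
    @volatile var running = false
    def start(): Unit = { running = true; throw new RuntimeException("start failed") }
    def stop(): Unit = running = false
  }

  class Context {
    @volatile var state: State = Initialized
    val scheduler = new Scheduler
    def start(): Unit = synchronized {
      try { scheduler.start(); state = Active }
      catch { case e: Throwable => scheduler.stop(); state = Stopped; throw e }
    }
  }

  def main(args: Array[String]): Unit = {
    val ctx = new Context
    try ctx.start() catch { case _: RuntimeException => () }
    // The scheduler is stopped and the context is not leaked in a half-started state.
    println(ctx.state) // prints Stopped
  }
}
```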

spark git commit: [SPARK-7958] [STREAMING] Handled exception in StreamingContext.start() to prevent leaking of actors

2015-06-01 Thread tdas
IVE. The solution in this PR is to stop the internal scheduler if start() throws an exception, and mark the context as STOPPED. Author: Tathagata Das Closes #6559 from tdas/SPARK-7958 and squashes the following commits: 20b2ec1 [Tathagata Das] Added synchronized 790b617 [Tathagata Das] Handled except

spark git commit: [SPARK-7497] [PYSPARK] [STREAMING] fix streaming flaky tests

2015-06-01 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 2f41cf3e2 -> d023300f4 [SPARK-7497] [PYSPARK] [STREAMING] fix streaming flaky tests Increase the duration and timeout in streaming python tests. Author: Davies Liu Closes #6239 from davies/flaky_tests and squashes the following commi

spark git commit: [SPARK-7497] [PYSPARK] [STREAMING] fix streaming flaky tests

2015-06-01 Thread tdas
Repository: spark Updated Branches: refs/heads/master e7c7e51f2 -> b7ab0299b [SPARK-7497] [PYSPARK] [STREAMING] fix streaming flaky tests Increase the duration and timeout in streaming python tests. Author: Davies Liu Closes #6239 from davies/flaky_tests and squashes the following commits:

spark git commit: [SPARK-7777][Streaming] Handle the case when there is no block in a batch

2015-05-23 Thread tdas
Repository: spark Updated Branches: refs/heads/master a40bca011 -> ad0badba1 [SPARK-7777][Streaming] Handle the case when there is no block in a batch In the old implementation, if a batch has no block, `areWALRecordHandlesPresent` will be `true` and it will return `WriteAheadLogBackedBlockR
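The bug described above comes from a vacuous truth: `forall` over an empty collection is always `true`, so a batch with no blocks appeared to have WAL record handles. The names below (`BlockInfo`, `walRecordHandle`) are simplified stand-ins for illustration, not the actual Spark types.

```scala
// Sketch only: the buggy predicate vs. a guarded one.
object WalSketch {
  case class BlockInfo(walRecordHandle: Option[String])

  // Buggy: vacuously true for an empty batch.
  def buggyCheck(blocks: Seq[BlockInfo]): Boolean =
    blocks.forall(_.walRecordHandle.isDefined)

  // Guarded: an empty batch has no WAL handles to recover from.
  def guardedCheck(blocks: Seq[BlockInfo]): Boolean =
    blocks.nonEmpty && blocks.forall(_.walRecordHandle.isDefined)

  def main(args: Array[String]): Unit = {
    val empty = Seq.empty[BlockInfo]
    println(buggyCheck(empty))   // prints true  (the bug)
    println(guardedCheck(empty)) // prints false (the intended behavior)
  }
}
```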

spark git commit: [SPARK-7777][Streaming] Handle the case when there is no block in a batch

2015-05-23 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 c8eb76ba6 -> ea9db50bc [SPARK-7777][Streaming] Handle the case when there is no block in a batch In the old implementation, if a batch has no block, `areWALRecordHandlesPresent` will be `true` and it will return `WriteAheadLogBackedBl

spark git commit: [SPARK-7788] Made KinesisReceiver.onStart() non-blocking

2015-05-22 Thread tdas
the receiver (its receiverInfo field is an empty map), causing it to be stuck in an infinite loop while waiting for the running flag to be set to false. Author: Tathagata Das Closes #6348 from tdas/SPARK-7788 and squashes the following commits: 2584683 [Tathagata Das] Added receiver id in thread name

spark git commit: [SPARK-7788] Made KinesisReceiver.onStart() non-blocking

2015-05-22 Thread tdas
ver (its receiverInfo field is an empty map), causing it to be stuck in an infinite loop while waiting for the running flag to be set to false. Author: Tathagata Das Closes #6348 from tdas/SPARK-7788 and squashes the following commits: 2584683 [Tathagata Das] Added receiver id in thread name

spark git commit: [SPARK-7776] [STREAMING] Added shutdown hook to StreamingContext

2015-05-21 Thread tdas
hooks priority. Author: Tathagata Das Closes #6307 from tdas/SPARK-7776 and squashes the following commits: e3d5475 [Tathagata Das] Added conf to specify graceful shutdown 4c18652 [Tathagata Das] Added shutdown hook to StreamingContext. (cherry picked from commit d68ea24d60ce1aa55b06a8c107f42544
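The shutdown-hook idea above can be sketched with the plain JVM API (a simplification: Spark actually uses its own ShutdownHookManager so hook priorities can be controlled, which `Runtime.addShutdownHook` does not offer). The `Ctx` class here is hypothetical.

```scala
// Sketch only: register a JVM shutdown hook that stops the context gracefully.
object ShutdownHookSketch {
  class Ctx {
    @volatile var stopped = false
    def stop(graceful: Boolean): Unit = stopped = true
  }

  def main(args: Array[String]): Unit = {
    val ctx = new Ctx
    val hook = new Thread(new Runnable {
      def run(): Unit = ctx.stop(graceful = true)
    })
    Runtime.getRuntime.addShutdownHook(hook)

    // For demonstration, deregister and run the hook body by hand
    // (at real JVM exit the registered hook would run automatically).
    Runtime.getRuntime.removeShutdownHook(hook)
    hook.run()
    println(ctx.stopped) // prints true
  }
}
```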

spark git commit: [SPARK-7776] [STREAMING] Added shutdown hook to StreamingContext

2015-05-21 Thread tdas
hooks priority. Author: Tathagata Das Closes #6307 from tdas/SPARK-7776 and squashes the following commits: e3d5475 [Tathagata Das] Added conf to specify graceful shutdown 4c18652 [Tathagata Das] Added shutdown hook to StreamingContext. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-7478] [SQL] Added SQLContext.getOrCreate

2015-05-21 Thread tdas
ark/examples/streaming/SqlNetworkWordCount.scala This can be solved by {{SQLContext.getOrCreate}} which gets or creates a new singleton instance of SQLContext using either a given SparkContext or a given SparkConf. rxin marmbrus Author: Tathagata Das Closes #6006 from tdas/SPARK-7478 and squashes
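The get-or-create pattern described above can be sketched as follows. The class names are simplified stand-ins, not the real Spark API surface: the point is that repeated calls return the same lazily created singleton, guarded for thread safety.

```scala
// Sketch only: lazily create one shared instance; return it on later calls.
object GetOrCreateSketch {
  class SparkCtx
  class SQLCtx(val sc: SparkCtx)

  private var instance: Option[SQLCtx] = None

  def getOrCreate(sc: SparkCtx): SQLCtx = synchronized {
    instance match {
      case Some(ctx) => ctx
      case None =>
        val ctx = new SQLCtx(sc)
        instance = Some(ctx)
        ctx
    }
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkCtx
    val a = getOrCreate(sc)
    val b = getOrCreate(sc)
    println(a eq b) // prints true: same singleton on every call
  }
}
```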

spark git commit: [SPARK-7478] [SQL] Added SQLContext.getOrCreate

2015-05-21 Thread tdas
les/streaming/SqlNetworkWordCount.scala This can be solved by {{SQLContext.getOrCreate}} which gets or creates a new singleton instance of SQLContext using either a given SparkContext or a given SparkConf. rxin marmbrus Author: Tathagata Das Closes #6006 from tdas/SPARK-7478 and squashes

spark git commit: [SPARK-7722] [STREAMING] Added Kinesis to style checker

2015-05-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master cdc7c055c -> 311fab6f1 [SPARK-7722] [STREAMING] Added Kinesis to style checker Author: Tathagata Das Closes #6325 from tdas/SPARK-7722 and squashes the following commits: 9ab35b2 [Tathagata Das] Fixed styles in Kinesis Project: h

spark git commit: [SPARK-7722] [STREAMING] Added Kinesis to style checker

2015-05-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 7e0912b1d -> 33e0e [SPARK-7722] [STREAMING] Added Kinesis to style checker Author: Tathagata Das Closes #6325 from tdas/SPARK-7722 and squashes the following commits: 9ab35b2 [Tathagata Das] Fixed styles in Kinesis (che

spark git commit: [SPARK-7787] [STREAMING] Fix serialization issue of SerializableAWSCredentials

2015-05-21 Thread tdas
ugh KinesisUtils. Author: Tathagata Das Closes #6316 from tdas/SPARK-7787 and squashes the following commits: 248ca5c [Tathagata Das] Fixed serializability Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4b7ff309 Tree: h

spark git commit: [SPARK-7787] [STREAMING] Fix serialization issue of SerializableAWSCredentials

2015-05-21 Thread tdas
ied through KinesisUtils. Author: Tathagata Das Closes #6316 from tdas/SPARK-7787 and squashes the following commits: 248ca5c [Tathagata Das] Fixed serializability (cherry picked from commit 4b7ff3092c53827817079e0810563cbb0b9d0747) Signed-off-by: Tathagata Das Project: http://git-wip-us.apache.

spark git commit: [SPARK-7745] Change asserts to requires for user input checks in Spark Streaming

2015-05-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.4 64762444e -> f08c6f319 [SPARK-7745] Change asserts to requires for user input checks in Spark Streaming Assertions can be turned off. `require` throws an `IllegalArgumentException` which makes more sense when it's a user-set variable.
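The distinction above in a nutshell: Scala's `assert` can be elided at compile time (e.g. with `-Xelide-below`) and throws `AssertionError`, while `require` always runs and throws `IllegalArgumentException`, the conventional signal for bad caller input. The setter below is a hypothetical example, not actual Spark code.

```scala
// Sketch only: validate user input with require, not assert.
object RequireSketch {
  def setBatchDuration(millis: Long): Long = {
    require(millis > 0, s"Batch duration must be positive, got $millis")
    millis
  }

  def main(args: Array[String]): Unit = {
    val caught =
      try { setBatchDuration(-1); false }
      catch { case _: IllegalArgumentException => true }
    println(caught) // prints true: bad input rejected even if assertions are disabled
  }
}
```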

spark git commit: [SPARK-7745] Change asserts to requires for user input checks in Spark Streaming

2015-05-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master 947ea1cf5 -> 1ee8eb431 [SPARK-7745] Change asserts to requires for user input checks in Spark Streaming Assertions can be turned off. `require` throws an `IllegalArgumentException` which makes more sense when it's a user-set variable. Aut
