[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-02-04 Thread harishreedharan
Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-72959021
  
Agreed. I will file a JIRA so we can discuss the issue there. This PR was 
more of a break-fix; that one would be a more elaborate fix.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-02-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/3655





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-02-04 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-72949130
  
I can merge this for now and we can focus on that issue later. 





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-02-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-72789875
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26710/
Test PASSed.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-72789869
  
[Test build #26710 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26710/consoleFull) for PR 3655 at commit [`5e2e7ad`](https://github.com/apache/spark/commit/5e2e7ad479d2739c4f1bd62fd1d48b216b2bdce0).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-02-03 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-72784728
  
@harishreedharan This raises a higher-level question: should the write 
ahead log (which is probably the component that fails) have its own retries, 
independent of the receiver retrying?







[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-02-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-72784691
  
[Test build #26710 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26710/consoleFull) for PR 3655 at commit [`5e2e7ad`](https://github.com/apache/spark/commit/5e2e7ad479d2739c4f1bd62fd1d48b216b2bdce0).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-02-03 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-72784543
  
ok to test.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-01-14 Thread harishreedharan
Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-70015327
  
No, this does prevent data loss. Basically, if the store fails multiple 
times, we shut down the receiver completely. The new receiver that gets 
started then resumes from the last commit, so we are safe.
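
The argument above can be sketched in a few lines. This is a minimal in-memory illustration, not the receiver's actual code: `OffsetLog` stands in for the offset kept in ZooKeeper, and `runReceiver`, `source`, and `store` are hypothetical names introduced only for this example. It assumes an offset is committed only after its record has been stored.

```scala
import scala.util.control.NonFatal

object CommitAfterStore {
  /** Stand-in for the offset kept in ZooKeeper (hypothetical, in-memory). */
  final class OffsetLog { var committed: Int = 0 }

  /**
   * Reads `source` starting at the last committed offset. An offset is
   * committed only after its record has been stored; the first store
   * failure stops this receiver without committing.
   */
  def runReceiver(source: IndexedSeq[String], log: OffsetLog, store: String => Unit): Unit = {
    var offset = log.committed
    var running = true
    while (running && offset < source.length) {
      try {
        store(source(offset))
        offset += 1
        log.committed = offset // commit strictly after a successful store
      } catch {
        case NonFatal(_) => running = false // stop; nothing is committed for this record
      }
    }
  }
}
```

Calling `runReceiver` a second time with the same `OffsetLog` plays the role of the restarted receiver: it picks up at the first record whose store never succeeded, so that record is re-read instead of skipped.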





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-01-12 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-69684946
  
So I think the aim of this patch is to fix recoverable data store problems 
with retries, not to prevent data loss. That's my understanding; sorry for the 
misunderstanding.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-01-12 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-69684611
  
Hi @harishreedharan, after looking carefully at the code, I think data 
will not be lost even in such a failure situation. For example, if we hit an 
exception in `onPushBlock`, which is called from 
`BlockGenerator#keepPushingBlocks`, that function catches the exception and 
calls `reportError` to report it. After that, the `blockPushingThread` thread 
exits, so there is no chance to call `onPushBlock` again. In other words, 
`storeBlockAndCommitOffset` will not be called again, so no offsets will be 
committed to ZK after the point of failure.

Meanwhile, the receiving thread keeps getting data and pushing it into 
`blockForPushing` until it is full; after that, the whole receiver is 
blocked.
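
The stall described above can be illustrated with a bounded queue that nobody drains. This is only a sketch: it assumes the queue behaves like a bounded `java.util.concurrent.ArrayBlockingQueue`, which this thread does not state, and `StalledPusherDemo` and `acceptedBeforeFull` are hypothetical names.

```scala
import java.util.concurrent.ArrayBlockingQueue

object StalledPusherDemo {
  /**
   * Offers `items` blocks to a bounded queue that nobody drains and returns
   * how many were accepted. The missing consumer stands in for a pushing
   * thread that has exited after an error.
   */
  def acceptedBeforeFull(capacity: Int, items: Int): Int = {
    val queue = new ArrayBlockingQueue[Int](capacity)
    // offer() returns false instead of blocking once the queue is full;
    // a producer calling put() would block indefinitely at that point.
    (1 to items).count(i => queue.offer(i))
  }
}
```

With a capacity of 10 and 25 offered blocks, only the first 10 are accepted. A producer using the blocking `put` would hang on the eleventh, which matches the blocked receiver described in the comment.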





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2015-01-09 Thread harishreedharan
Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-69384366
  
@tdas Any comments on this one, or is this one ready to go in?





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2014-12-12 Thread harishreedharan
Github user harishreedharan commented on a diff in the pull request:

https://github.com/apache/spark/pull/3655#discussion_r21766800
  
--- Diff: external/kafka/src/main/scala/org/apache/spark/streaming/kafka/ReliableKafkaReceiver.scala ---
@@ -201,12 +201,31 @@ class ReliableKafkaReceiver[
     topicPartitionOffsetMap.clear()
   }
 
-  /** Store the ready-to-be-stored block and commit the related offsets to zookeeper. */
+  /**
+   * Store the ready-to-be-stored block and commit the related offsets to zookeeper. This method
+   * will try a fixed number of times to push the block. If the push fails, the receiver is stopped.
+   */
   private def storeBlockAndCommitOffset(
       blockId: StreamBlockId, arrayBuffer: mutable.ArrayBuffer[_]): Unit = {
-    store(arrayBuffer.asInstanceOf[mutable.ArrayBuffer[(K, V)]])
-    Option(blockOffsetMap.get(blockId)).foreach(commitOffset)
-    blockOffsetMap.remove(blockId)
+    var count = 0
+    var pushed = false
+    var exception: Exception = null
+    while (!pushed && count <= 3) {
--- End diff --

In general, the failures are probably transient. For example, an HDFS hflush 
fails because of GC, or replication to the second BM fails because of timeouts 
or something similar. Such operations are likely to succeed on retry, but any 
major failures won't, so a limited number of retries followed by failure is 
probably a reasonable approach that doesn't make things too complex.
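
A minimal sketch of the "limited retry followed by failure" pattern is shown here for illustration. It is not the code from this PR (the quoted diff is truncated before the loop body), and `BoundedRetry.retry` and `maxAttempts` are hypothetical names. Note that if `count` in the diff is incremented once per attempt, which the truncated diff does not show, the condition `count <= 3` starting from `count = 0` would allow up to four attempts.

```scala
import scala.util.control.NonFatal

object BoundedRetry {
  /**
   * Runs `op` up to `maxAttempts` times (hypothetical helper, not Spark API).
   * Returns Right(value) on the first success, or Left(lastError) once every
   * attempt has failed.
   */
  def retry[T](maxAttempts: Int)(op: => T): Either[Throwable, T] = {
    require(maxAttempts >= 1, "maxAttempts must be at least 1")
    var attempt = 0
    var done = false
    var outcome: Either[Throwable, T] = Left(new IllegalStateException("never attempted"))
    while (!done && attempt < maxAttempts) {
      attempt += 1
      try {
        outcome = Right(op)
        done = true
      } catch {
        // Only non-fatal errors are retried; fatal ones (e.g. OutOfMemoryError) propagate.
        case NonFatal(e) => outcome = Left(e)
      }
    }
    outcome
  }
}
```

A caller would wrap the store call in `BoundedRetry.retry(maxAttempts) { ... }` and, on `Left`, stop the receiver, as the Scaladoc in the diff describes. Taking the attempt limit as a parameter also leaves room for making it configurable.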





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2014-12-09 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3655#discussion_r21586197
  
--- Diff: external/kafka/src/main/scala/org/apache/spark/streaming/kafka/ReliableKafkaReceiver.scala ---
@@ -201,12 +201,31 @@ class ReliableKafkaReceiver[
     topicPartitionOffsetMap.clear()
   }
 
-  /** Store the ready-to-be-stored block and commit the related offsets to zookeeper. */
+  /**
+   * Store the ready-to-be-stored block and commit the related offsets to zookeeper. This method
+   * will try a fixed number of times to push the block. If the push fails, the receiver is stopped.
+   */
   private def storeBlockAndCommitOffset(
       blockId: StreamBlockId, arrayBuffer: mutable.ArrayBuffer[_]): Unit = {
-    store(arrayBuffer.asInstanceOf[mutable.ArrayBuffer[(K, V)]])
-    Option(blockOffsetMap.get(blockId)).foreach(commitOffset)
-    blockOffsetMap.remove(blockId)
+    var count = 0
+    var pushed = false
+    var exception: Exception = null
+    while (!pushed && count <= 3) {
--- End diff --

General question: is it likely that a store fails but then immediately 
succeeds? Just wondering how likely it is that this does anything.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2014-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-66399109
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24282/
Test PASSed.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2014-12-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-66399105
  
[Test build #24282 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24282/consoleFull) for PR 3655 at commit [`5e2e7ad`](https://github.com/apache/spark/commit/5e2e7ad479d2739c4f1bd62fd1d48b216b2bdce0).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2014-12-09 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-66398284
  
Thanks Hari, this seems like a simple solution. BTW, should we make `count = 
3` a configurable parameter? Otherwise LGTM.

The original idea of introducing a pending queue would probably make the 
design much more complex because of synchronization.





[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...

2014-12-09 Thread harishreedharan
Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/3655#issuecomment-66393421
  
I messed up the jira number in the commit. Please fix it when merging.

