[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-06-20 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/5375#issuecomment-113822008
  
Sure, take a look at the new KinesisUtils API. 


https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.kinesis.KinesisUtils$


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-06-20 Thread kopiczko
Github user kopiczko commented on the pull request:

https://github.com/apache/spark/pull/5375#issuecomment-113730010
  
Sorry guys, I currently have no time to work on it. @tdas: Would you mind 
giving a reference to how this was solved?





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-06-20 Thread kopiczko
Github user kopiczko closed the pull request at:

https://github.com/apache/spark/pull/5375





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-06-20 Thread kopiczko
Github user kopiczko commented on the pull request:

https://github.com/apache/spark/pull/5375#issuecomment-113729967
  
Sorry guys, I currently have no time to work on it. @tdas: Would you mind 
giving a reference to how this was solved?





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-06-19 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/5375#issuecomment-113648887
  
This is not needed any more as Spark 1.4.0 has fixed this issue. Mind 
closing this PR?





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-05-05 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/5375#issuecomment-99006356
  
Any updates on this patch? If you are not able to work on it, mind closing 
it?






[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-14 Thread cfregly
Github user cfregly commented on a diff in the pull request:

https://github.com/apache/spark/pull/5375#discussion_r28391705
  
--- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
@@ -39,6 +39,7 @@ object KinesisUtils {
    * @param ssc          StreamingContext object
    * @param streamName   Kinesis stream name
    * @param endpointUrl  Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
+   * @param regionName   Region name to indicate the location of the Amazon Kinesis service
--- End diff --

we may want to consider making a default for this to maintain backward 
compatibility.

the problem is that the Scala and Java createStream() methods in this 
helper class will conflict if you use defaults for this.  i had the same issue 
with initialPositionInStream as well as storageLevel which is why they're not 
defaults.  

not sure we can do much about it without changing the names of the methods 
to createScalaStream() and createJavaStream() or equivalent.
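
A minimal sketch of the conflict described above, assuming stand-in types so it compiles without Spark on the classpath. `StreamingContext` and `JavaStreamingContext` here are hypothetical placeholders, `DefaultRegion` is an illustrative value, and the methods return a description string instead of a DStream. As far as I know, the Scala 2 compiler rejects default arguments on more than one overload of the same method name, so one way to stay backward compatible is to keep the old signatures as explicit overloads that delegate to the new ones:

```scala
// Hypothetical stand-ins for the Spark classes, for illustration only.
final case class StreamingContext(name: String)
final case class JavaStreamingContext(ssc: StreamingContext)

object KinesisUtilsSketch {
  // Illustrative fallback region; not taken from the patch under review.
  val DefaultRegion = "us-east-1"

  // New Scala-facing signature with an explicit regionName.
  def createStream(
      ssc: StreamingContext,
      streamName: String,
      endpointUrl: String,
      regionName: String): String =
    s"$streamName@$regionName via $endpointUrl"

  // Old Scala-facing signature, kept so existing callers still compile.
  def createStream(
      ssc: StreamingContext,
      streamName: String,
      endpointUrl: String): String =
    createStream(ssc, streamName, endpointUrl, DefaultRegion)

  // New Java-facing signature.
  def createStream(
      jssc: JavaStreamingContext,
      streamName: String,
      endpointUrl: String,
      regionName: String): String =
    createStream(jssc.ssc, streamName, endpointUrl, regionName)

  // Old Java-facing signature, again delegating with the fallback region.
  def createStream(
      jssc: JavaStreamingContext,
      streamName: String,
      endpointUrl: String): String =
    createStream(jssc.ssc, streamName, endpointUrl, DefaultRegion)

  // Writing `regionName: String = DefaultRegion` on both the Scala-facing and
  // the Java-facing variant instead is expected to fail in Scala 2 with:
  // "multiple overloaded alternatives of method createStream define default arguments".
}
```

The cost of this approach is one extra overload per entry point for every parameter that would otherwise have had a default, which is the trade-off against renaming the methods.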





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-14 Thread cfregly
Github user cfregly commented on a diff in the pull request:

https://github.com/apache/spark/pull/5375#discussion_r28391631
  
--- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
@@ -70,6 +72,7 @@ object KinesisUtils {
    * @param jssc         Java StreamingContext object
    * @param streamName   Kinesis stream name
    * @param endpointUrl  Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
+   * @param regionName   Region name to indicate the location of the Amazon Kinesis service
--- End diff --

and here?  :)

"The Amazon DynamoDB table and Amazon CloudWatch metrics associated with 
your application will also use this region setting."





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-14 Thread cfregly
Github user cfregly commented on a diff in the pull request:

https://github.com/apache/spark/pull/5375#discussion_r28391619
  
--- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
@@ -39,6 +39,7 @@ object KinesisUtils {
    * @param ssc          StreamingContext object
    * @param streamName   Kinesis stream name
    * @param endpointUrl  Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
+   * @param regionName   Region name to indicate the location of the Amazon Kinesis service
--- End diff --

same here...

"The Amazon DynamoDB table and Amazon CloudWatch metrics associated with 
your application will also use this region setting."





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-14 Thread cfregly
Github user cfregly commented on a diff in the pull request:

https://github.com/apache/spark/pull/5375#discussion_r28391597
  
--- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
@@ -36,18 +36,19 @@ import com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker
  * Custom AWS Kinesis-specific implementation of Spark Streaming's Receiver.
  * This implementation relies on the Kinesis Client Library (KCL) Worker as described here:
  * https://github.com/awslabs/amazon-kinesis-client
- * This is a custom receiver used with StreamingContext.receiverStream(Receiver) 
+ * This is a custom receiver used with StreamingContext.receiverStream(Receiver)
  *   as described here:
  * http://spark.apache.org/docs/latest/streaming-custom-receivers.html
- * Instances of this class will get shipped to the Spark Streaming Workers 
+ * Instances of this class will get shipped to the Spark Streaming Workers
  *   to run within a Spark Executor.
  *
  * @param appName  Kinesis application name. Kinesis Apps are mapped to Kinesis Streams
  * by the Kinesis Client Library.  If you change the App name or Stream name,
- * the KCL will throw errors.  This usually requires deleting the backing  
+ * the KCL will throw errors.  This usually requires deleting the backing
  * DynamoDB table with the same name this Kinesis application.
  * @param streamName   Kinesis stream name
  * @param endpointUrl  Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
+ * @param regionName   Region name to indicate the location of the Amazon Kinesis service
--- End diff --

might want to add a note similar to the KCL README.md when describing this 
param:

"The Amazon DynamoDB table and Amazon CloudWatch metrics associated with 
your application will also use this region setting."





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-07 Thread kopiczko
Github user kopiczko commented on the pull request:

https://github.com/apache/spark/pull/5375#issuecomment-90428686
  
Thanks guys for your response. @cfregly I've already answered your comment 
on Jira. I guess we should move our discussion there. I can improve this 
implementation to meet the requirements in SPARK-6514 and rename the PR 
according to the guidelines in the wiki.





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-06 Thread cfregly
Github user cfregly commented on the pull request:

https://github.com/apache/spark/pull/5375#issuecomment-90308088
  
@srowen @kopiczko 

This is part of a larger effort to overhaul Kinesis-based streaming slated 
for 1.4.  Lots of API changes including region, AWS credentials, and 
application name - as well as upgrading both the AWS Java SDK and the KCL.

Here's the parent jira:  https://issues.apache.org/jira/browse/SPARK-6599.

Here's the related jira that covers the region portion:  
https://issues.apache.org/jira/browse/SPARK-6514.

We should definitely try to be backward-compatible even though the API is 
Experimental.





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-06 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5375#discussion_r27845691
  
--- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
@@ -81,17 +84,19 @@ object KinesisUtils {
    * the tip of the stream (InitialPositionInStream.LATEST).
    * @param storageLevel Storage level to use for storing the received objects
    *
-   * @return JavaReceiverInputDStream[Array[Byte]]
+   * @return JavaReceiverInputDStream[ Array[Byte] ]
    */
   @Experimental
   def createStream(
-  jssc: JavaStreamingContext, 
-  streamName: String, 
-  endpointUrl: String, 
+  jssc: JavaStreamingContext,
--- End diff --

Isn't this changing the API?
I believe you should make a JIRA along with this PR. It's not trivial. 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5375#issuecomment-90189045
  
Can one of the admins verify this patch?





[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...

2015-04-06 Thread kopiczko
GitHub user kopiczko opened a pull request:

https://github.com/apache/spark/pull/5375

Expose regionName setting in Kinesis receiver configuration

Hi,

I'd like to have this setting exposed so that the DynamoDB table (used by 
the KCL) can be placed in the same region as the Kinesis stream.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kopiczko/spark expose-kinesis-receiver-regionname-setting

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5375.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5375


commit 0a4a399c4e2c7742466d0f4bea61d8c8ba1d0918
Author: Pawel Kopiczko 
Date:   2015-04-06T18:26:06Z

Expose regionName setting in Kinesis receiver configuration



