[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/5375#issuecomment-113822008 Sure, take a look at the new KinesisUtils API. https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.kinesis.KinesisUtils$ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user kopiczko commented on the pull request: https://github.com/apache/spark/pull/5375#issuecomment-113730010 Sorry guys, I currently have no time to work on it. @tdas: Would you mind giving a reference how this is solved?
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user kopiczko closed the pull request at: https://github.com/apache/spark/pull/5375
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user kopiczko commented on the pull request: https://github.com/apache/spark/pull/5375#issuecomment-113729967 Sorry guys, I currently have no time to work on it. @tdas: Would you mind giving a reference how this is solved?
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/5375#issuecomment-113648887 This is not needed any more as Spark 1.4.0 has fixed this issue. Mind closing this PR?
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/5375#issuecomment-99006356 Any updates on this patch? If you are not able to work on it, mind closing it?
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5375#discussion_r28391705

    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -39,6 +39,7 @@ object KinesisUtils {
        * @param ssc StreamingContext object
        * @param streamName Kinesis stream name
        * @param endpointUrl Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
    +   * @param regionName Region name to indicate the location of the Amazon Kinesis service
    --- End diff --

    we may want to consider making a default for this to maintain backward compatibility. the problem is that the Scala and Java createStream() methods in this helper class will conflict if you use defaults for this. i had the same issue with initialPositionInStream as well as storageLevel which is why they're not defaults. not sure we can do much about it without changing the names of the methods to createScalaStream() and createJavaStream() or equivalent.
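The overload conflict described in the comment above can be reproduced outside Spark. What follows is only a minimal sketch, assuming Scala 2: `ScalaCtx`, `JavaCtx`, `StreamFactory` and the `"us-east-1"` default are hypothetical stand-ins for illustration, not Spark's API and not what the PR implements. It shows one possible way to keep shorter call forms on both sides without renaming the methods: put the default on a single overload and add an explicit forwarding overload for the other.

```scala
// Hypothetical stand-ins for StreamingContext and JavaStreamingContext (not Spark classes).
final case class ScalaCtx(name: String)
final case class JavaCtx(name: String)

object StreamFactory {
  // Scala 2 lets at most one overloaded alternative of a method declare
  // default arguments. Giving regionName a default on BOTH three-argument
  // overloads below would be rejected by the compiler.
  def createStream(ctx: ScalaCtx, streamName: String, regionName: String = "us-east-1"): String =
    s"scala:${ctx.name}:$streamName:$regionName"

  // Java-facing overload without defaults (Java callers cannot use Scala defaults anyway).
  def createStream(ctx: JavaCtx, streamName: String, regionName: String): String =
    s"java:${ctx.name}:$streamName:$regionName"

  // Explicit two-argument overload keeps the old Java call form working
  // by forwarding the same (assumed) default region.
  def createStream(ctx: JavaCtx, streamName: String): String =
    createStream(ctx, streamName, "us-east-1")
}
```

With this layout, `StreamFactory.createStream(ScalaCtx("ssc"), "events")` and `StreamFactory.createStream(JavaCtx("jssc"), "events")` both resolve, at the cost of one extra hand-written overload per shortened Java signature.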
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5375#discussion_r28391631

    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -70,6 +72,7 @@ object KinesisUtils {
        * @param jssc Java StreamingContext object
        * @param streamName Kinesis stream name
        * @param endpointUrl Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
    +   * @param regionName Region name to indicate the location of the Amazon Kinesis service
    --- End diff --

    and here? :) "The Amazon DynamoDB table and Amazon CloudWatch metrics associated with your application will also use this region setting."
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5375#discussion_r28391619

    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -39,6 +39,7 @@ object KinesisUtils {
        * @param ssc StreamingContext object
        * @param streamName Kinesis stream name
        * @param endpointUrl Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
    +   * @param regionName Region name to indicate the location of the Amazon Kinesis service
    --- End diff --

    same here... "The Amazon DynamoDB table and Amazon CloudWatch metrics associated with your application will also use this region setting."
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user cfregly commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5375#discussion_r28391597

    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisReceiver.scala ---
    @@ -36,18 +36,19 @@ import com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker
      * Custom AWS Kinesis-specific implementation of Spark Streaming's Receiver.
      * This implementation relies on the Kinesis Client Library (KCL) Worker as described here:
      * https://github.com/awslabs/amazon-kinesis-client
    - * This is a custom receiver used with StreamingContext.receiverStream(Receiver)
    + * This is a custom receiver used with StreamingContext.receiverStream(Receiver)
      * as described here:
      * http://spark.apache.org/docs/latest/streaming-custom-receivers.html
    - * Instances of this class will get shipped to the Spark Streaming Workers
    + * Instances of this class will get shipped to the Spark Streaming Workers
      * to run within a Spark Executor.
      *
      * @param appName Kinesis application name. Kinesis Apps are mapped to Kinesis Streams
      * by the Kinesis Client Library. If you change the App name or Stream name,
    - * the KCL will throw errors. This usually requires deleting the backing
    + * the KCL will throw errors. This usually requires deleting the backing
      * DynamoDB table with the same name this Kinesis application.
      * @param streamName Kinesis stream name
      * @param endpointUrl Url of Kinesis service (e.g., https://kinesis.us-east-1.amazonaws.com)
    + * @param regionName Region name to indicate the location of the Amazon Kinesis service
    --- End diff --

    might want to add a note similar to the KCL README.md when describing this param: "The Amazon DynamoDB table and Amazon CloudWatch metrics associated with your application will also use this region setting."
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user kopiczko commented on the pull request: https://github.com/apache/spark/pull/5375#issuecomment-90428686 Thanks guys for your response. @cfregly I've already answered your comment on Jira. I guess we should move our discussion there. I can improve this implementation to meet requirements in SPARK-6514 and rename PR according to guidelines in wiki.
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user cfregly commented on the pull request: https://github.com/apache/spark/pull/5375#issuecomment-90308088 @srowen @kopiczko This is part of a larger effort to overhaul Kinesis-based streaming slated for 1.4. Lots of API changes including region, AWS credentials, and application name - as well as upgrading both the AWS Java SDK and the KCL. Here's the parent jira: https://issues.apache.org/jira/browse/SPARK-6599. Here's the related jira that covers the region portion: https://issues.apache.org/jira/browse/SPARK-6514. We should definitely try to be backward-compatible even though the API is Experimental.
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5375#discussion_r27845691

    --- Diff: extras/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala ---
    @@ -81,17 +84,19 @@ object KinesisUtils {
        * the tip of the stream (InitialPositionInStream.LATEST).
        * @param storageLevel Storage level to use for storing the received objects
        *
    -   * @return JavaReceiverInputDStream[Array[Byte]]
    +   * @return JavaReceiverInputDStream[ Array[Byte] ]
        */
       @Experimental
       def createStream(
    -      jssc: JavaStreamingContext,
    -      streamName: String,
    -      endpointUrl: String,
    +      jssc: JavaStreamingContext,
    --- End diff --

    Isn't this changing the API? I believe you should make a JIRA along with this PR. It's not trivial. https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5375#issuecomment-90189045 Can one of the admins verify this patch?
[GitHub] spark pull request: Expose regionName setting in Kinesis receiver ...
GitHub user kopiczko opened a pull request:

    https://github.com/apache/spark/pull/5375

    Expose regionName setting in Kinesis receiver configuration

    Hi, I'd like to have that setting exposed so that I can put the DynamoDB table (for KCL) in the same region as the Kinesis stream.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kopiczko/spark expose-kinesis-receiver-regionname-setting

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5375.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #5375

commit 0a4a399c4e2c7742466d0f4bea61d8c8ba1d0918
Author: Pawel Kopiczko
Date: 2015-04-06T18:26:06Z

    Expose regionName setting in Kinesis receiver configuration