[jira] (SPARK-19405) Add support to KinesisUtils for cross-account Kinesis reads via STS

2017-01-30 Thread Adam Budde (JIRA)
Adam Budde commented on SPARK-19405

We'd like to try to get this included in the Spark 2.1.1 release if there are no blockers.

This message was sent by Atlassian JIRA (v6.3.15#6346-sha1:dbc023d)

[jira] (SPARK-19405) Add support to KinesisUtils for cross-account Kinesis reads via STS

2017-01-30 Thread Apache Spark (JIRA)
Apache Spark assigned an issue to Apache Spark

Change By: Apache Spark
Assignee: Apache Spark

[jira] (SPARK-19405) Add support to KinesisUtils for cross-account Kinesis reads via STS

2017-01-30 Thread Apache Spark (JIRA)
Apache Spark assigned an issue to Unassigned

Change By: Apache Spark
Assignee: Apache Spark

[jira] (SPARK-19405) Add support to KinesisUtils for cross-account Kinesis reads via STS

2017-01-30 Thread Apache Spark (JIRA)
Apache Spark commented on SPARK-19405

User 'budde' has created a pull request for this issue: https://github.com/apache/spark/pull/16744

[jira] (SPARK-19405) Add support to KinesisUtils for cross-account Kinesis reads via STS

2017-01-30 Thread Adam Budde (JIRA)
Adam Budde updated an issue

Change By: Adam Budde

h1. Summary

Enable KinesisReceiver to utilize STSAssumeRoleSessionCredentialsProvider when setting up the Kinesis Client Library in order to enable secure cross-account Kinesis stream reads managed by AWS Security Token Service (STS)

h1. Details

Spark's KinesisReceiver implementation utilizes the Kinesis Client Library (KCL) in order to allow users to write Spark Streaming jobs that operate on Kinesis data. The KCL uses a few AWS services under the hood in order to provide checkpointed, load-balanced processing of the underlying data in a Kinesis stream. Running the KCL requires permissions to be set up for the following AWS resources:

* AWS Kinesis for reading stream data
* AWS DynamoDB for storing KCL shared state in tables
* AWS CloudWatch for logging KCL metrics

The KinesisUtils.createStream() API allows users to authenticate to these services either by specifying an explicit AWS access key/secret key credential pair or by using the default credential provider chain. This supports authorizing to the three AWS services using either an AWS keypair (either provided explicitly or parsed from environment variables, etc.):

!https://raw.githubusercontent.com/budde/budde_asf_jira_images/master/spark/kinesis_sts_support/KeypairOnly.png!

Or the IAM instance profile (when running on EC2):

!https://raw.githubusercontent.com/budde/budde_asf_jira_images/master/spark/kinesis_sts_support/InstanceProfileOnly.png!

AWS users often need to access resources across separate accounts. This could be done in order to consume data produced by another organization or from a service running in another account for resource isolation purposes. AWS Security Token Service (STS) provides a secure way to authorize cross-account resource access by using temporary sessions to assume an IAM role in the AWS account with the resources being accessed.

The [IAM documentation|http://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_cross-account-with-roles.html] covers the specifics of how cross-account IAM role assumption works in much greater detail, but if an actor in account A wanted to read from a Kinesis stream in account B, the general steps required would look something like this:

* An IAM role is added to account B with read permissions for the Kinesis stream
** Trust policy is configured to allow account A to assume the role
* Actor in account A uses its own long-lived credentials to tell STS to assume the role in account B
* STS returns temporary credentials with permission to read from the stream in account B

Applied to KinesisReceiver and the KCL, we could use a keypair as our long-lived credentials to authenticate to STS and assume an external role with the necessary KCL permissions:

!https://raw.githubusercontent.com/budde/budde_asf_jira_images/master/spark/kinesis_sts_support/STSKeypair.png!

Or the instance profile as long-lived credentials:

!https://raw.githubusercontent.com/budde/budde_asf_jira_images/master/spark/kinesis_sts_support/STSInstanceProfile.png!

The STSAssumeRoleSessionCredentialsProvider implementation of the AWSCredentialsProvider interface from the AWS SDK abstracts all of the management of the temporary session credentia
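
The cross-account steps described in this issue (a role in account B whose trust policy names account A, followed by an STS AssumeRole call from account A) can be sketched as plain data. This is only an illustration, not code from the pull request: the account IDs, the role name `KinesisCrossAccountRead`, the session name, and both helper functions are hypothetical. The policy layout and the `RoleArn` / `RoleSessionName` parameter names follow the standard IAM and STS AssumeRole conventions.

```python
import json


def build_trust_policy(trusted_account_id: str) -> dict:
    """Trust policy attached to the role in account B.

    It allows principals in the trusted account (account A) to assume the role.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_assume_role_request(role_account_id: str, role_name: str,
                              session_name: str) -> dict:
    """Parameters an actor in account A would send to STS AssumeRole."""
    return {
        "RoleArn": f"arn:aws:iam::{role_account_id}:role/{role_name}",
        "RoleSessionName": session_name,
    }


if __name__ == "__main__":
    # Placeholder values: 111111111111 stands in for account A,
    # 222222222222 for account B. Neither is a real account.
    print(json.dumps(build_trust_policy("111111111111"), indent=2))
    print(json.dumps(
        build_assume_role_request(
            "222222222222", "KinesisCrossAccountRead", "spark-kinesis"),
        indent=2))
```

STS would answer such a request with a temporary access key, secret key, and session token, which is the part that a session credentials provider refreshes on the caller's behalf.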

[jira] (SPARK-19405) Add support to KinesisUtils for cross-account Kinesis reads via STS

2017-01-30 Thread Adam Budde (JIRA)
Adam Budde created an issue

Issue Type: Improvement
Assignee: Unassigned
Components: DStreams
Created: 30/Jan/17 18:44
Priority: Minor
Reporter: Adam Budde

Summary

Enable KinesisReceiver to utilize STSAssumeRoleSessionCredentialsProvider when setting up the Kinesis Client Library in order to enable secure cross-account Kinesis stream reads managed by AWS Security Token Service (STS)

Details

Spark's KinesisReceiver implementation utilizes the Kinesis Client Library (KCL) in order to allow users to write Spark Streaming jobs that operate on Kinesis data. The KCL uses a few AWS services under the hood in order to provide checkpointed, load-balanced processing of the underlying data in a Kinesis stream. Running the KCL requires permissions to be set up for the following AWS resources:

* AWS Kinesis for reading stream data
* AWS DynamoDB for storing KCL shared state in tables
* AWS CloudWatch for logging KCL metrics
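
The three permission groups listed above could be expressed as a single IAM policy document. The sketch below builds one in Python. The helper `build_kcl_policy`, the region, account ID, stream name, and application name are all hypothetical, and the action lists are an assumption based on what KCL 1.x is commonly documented to need, so check them against the current AWS documentation before relying on them. It also assumes the KCL's DynamoDB table is named after the KCL application name.

```python
import json


def build_kcl_policy(region: str, account_id: str,
                     stream_name: str, app_name: str) -> dict:
    """Build an IAM policy covering the three services the KCL touches."""
    stream_arn = f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}"
    # Assumption: the KCL state table shares its name with the application.
    table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{app_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Kinesis: read stream data
                "Effect": "Allow",
                "Action": [
                    "kinesis:DescribeStream",
                    "kinesis:GetShardIterator",
                    "kinesis:GetRecords",
                ],
                "Resource": stream_arn,
            },
            {   # DynamoDB: store KCL shared state (leases, checkpoints)
                "Effect": "Allow",
                "Action": [
                    "dynamodb:CreateTable",
                    "dynamodb:DescribeTable",
                    "dynamodb:GetItem",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                    "dynamodb:DeleteItem",
                    "dynamodb:Scan",
                ],
                "Resource": table_arn,
            },
            {   # CloudWatch: publish KCL metrics (no resource-level scoping)
                "Effect": "Allow",
                "Action": ["cloudwatch:PutMetricData"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values chosen only for illustration.
    policy = build_kcl_policy(
        "us-east-1", "222222222222", "example-stream", "example-kcl-app")
    print(json.dumps(policy, indent=2))
```

In the cross-account case this policy would be attached to the role in the account that owns the stream, so that the temporary credentials returned by STS carry all three sets of permissions.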