[jira] [Commented] (SPARK-27742) Security Support in Sources and Sinks for SS and Batch

2019-05-30 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851979#comment-16851979
 ] 

Gabor Somogyi commented on SPARK-27742:
---

{quote}For the user this means better flexibility per job to set up an upper 
limit.{quote}

* A new token will be obtained at "expiryTimestamp * 0.75".
* The old token will become invalid at expiryTimestamp because it is not renewed.
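
To make the timing above concrete, here is a minimal Python sketch. It assumes the 0.75 factor is applied to the time remaining until expiryTimestamp (not to the absolute timestamp), and that a fresh token is obtained each time instead of the old one being renewed. The function names and the 24-hour lifetime are illustrative assumptions, not Spark identifiers.

```python
from typing import List

HOUR_MS = 3_600_000
RENEWAL_RATIO = 0.75  # fraction of the remaining lifetime to wait (assumed)


def next_obtain_delay_ms(now_ms: int, expiry_ms: int, ratio: float = RENEWAL_RATIO) -> int:
    """Delay in milliseconds until a new token should be obtained."""
    remaining_ms = max(expiry_ms - now_ms, 0)
    return int(ratio * remaining_ms)


def obtain_times_ms(token_lifetime_ms: int, run_for_ms: int) -> List[int]:
    """Times at which a job running for run_for_ms obtains a new token.

    Every token is assumed to expire token_lifetime_ms after it is obtained.
    """
    times = []
    now_ms = 0
    while now_ms < run_for_ms:
        times.append(now_ms)
        expiry_ms = now_ms + token_lifetime_ms
        delay_ms = next_obtain_delay_ms(now_ms, expiry_ms)
        if delay_ms <= 0:
            # A zero delay would never advance the clock.
            raise ValueError("token lifetime too short to schedule a new token")
        now_ms += delay_ms
    return times


if __name__ == "__main__":
    # Hypothetical 24-hour token, job running for 3 days.
    for t in obtain_times_ms(24 * HOUR_MS, 72 * HOUR_MS):
        print(f"obtain new token at t = {t // HOUR_MS}h")
```

Under these assumptions a 24-hour token is replaced every 18 hours, so each old token still has 6 hours of validity left when its replacement arrives.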

I'm not sure the app developer needs to control the MaxLifeTime.


> Security Support in Sources and Sinks for SS and Batch
> --
>
> Key: SPARK-27742
> URL: https://issues.apache.org/jira/browse/SPARK-27742
> Project: Spark
>  Issue Type: Brainstorming
>  Components: SQL, Structured Streaming
>Affects Versions: 3.0.0
>Reporter: Stavros Kontopoulos
>Priority: Major
>
> As discussed with [~erikerlandson] on the [Big Data on K8s 
> UG|https://docs.google.com/document/d/1pnF38NF6N5eM8DlK088XUW85Vms4V2uTsGZvSp8MNIA]
>  it would be good to capture current status and identify work that needs to 
> be done for securing Spark when accessing sources and sinks. For example what 
> is the status of SSL, Kerberos support in different scenarios. The big 
> concern nowadays is how to secure data pipelines end-to-end. 
> Note: Not sure if this overlaps with some other ticket. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27742) Security Support in Sources and Sinks for SS and Batch

2019-05-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851924#comment-16851924
 ] 

Stavros Kontopoulos commented on SPARK-27742:
-

{quote}From client side the max lifetime can be only decreased for security 
reasons + see my previous point.
{quote}
In general, since Kafka allows me to set this value, I should be able to do so. 

 
{quote}There is no possibility to obtain token for anybody else (pls see the 
comment in the code).
{quote}
Once proxy users are supported, I guess there will be.




[jira] [Commented] (SPARK-27742) Security Support in Sources and Sinks for SS and Batch

2019-05-30 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851883#comment-16851883
 ] 

Gabor Somogyi commented on SPARK-27742:
---

{quote}what happens with Kafka delegation tokens after max life time.
{quote}
If you take a look at the code, a new token will be obtained at "expiryTimestamp * 
0.75". I don't think maxLifeTimeMs and renewal have to be implemented: the code 
would become significantly more complicated, but the end result would be the same 
from the user's perspective.
{quote}no option for setting the max life time at least, it defaults to 
maxLifeTimeMs = -1L
{quote}
From the client side, the max lifetime can only be decreased, for security 
reasons. See also my previous point.
{quote}CreateDelegationTokenOptions also allows you to pass a principal.
{quote}
That list specifies who can renew the tokens. There is no possibility to obtain a 
token for anybody else (please see the comment in the code).
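
The max lifetime point can be sketched in Python as follows. This assumes the broker treats a non-positive request (such as the -1 default) as "use the broker limit" and otherwise takes the smaller of the requested value and its own limit. The function and parameter names are hypothetical, and the snippet is an illustration of the rule, not Kafka or Spark code.

```python
DAY_MS = 24 * 3_600_000


def effective_max_lifetime_ms(requested_ms: int, broker_max_ms: int) -> int:
    """Max lifetime a token ends up with, under the assumed clamping rule."""
    if requested_ms <= 0:
        # No explicit request (e.g. -1): fall back to the broker limit.
        return broker_max_ms
    # A client can shorten the lifetime but never extend it past the broker limit.
    return min(requested_ms, broker_max_ms)


if __name__ == "__main__":
    broker_max = 7 * DAY_MS  # hypothetical broker-side limit
    for requested in (-1, 1 * DAY_MS, 30 * DAY_MS):
        print(requested, "->", effective_max_lifetime_ms(requested, broker_max))
```

Under this rule, exposing the option to the client can only make tokens shorter-lived than the broker default, never longer-lived.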





[jira] [Commented] (SPARK-27742) Security Support in Sources and Sinks for SS and Batch

2019-05-30 Thread Stavros Kontopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851828#comment-16851828
 ] 

Stavros Kontopoulos commented on SPARK-27742:
-

[~gsomogyi] What happens with Kafka delegation tokens after the max life time? 
Streaming jobs in production may run for many days without interruption. From 
what I see in the code, `expiryTimestamp` is used to calculate the next renewal 
time, but there is no option to set even the max life time; it defaults to 
maxLifeTimeMs = -1L at 
[https://github.com/apache/spark/blob/2f558094257c38d26650049f2ac93be6d65d6d85/external/kafka-0-10-token-provider/src/main/scala/org/apache/spark/kafka010/KafkaTokenUtil.scala#L65]
 

[https://cwiki.apache.org/confluence/display/KAFKA/KIP-48+Delegation+token+support+for+Kafka]

CreateDelegationTokenOptions also allows you to pass a principal.




[jira] [Commented] (SPARK-27742) Security Support in Sources and Sinks for SS and Batch

2019-05-16 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841332#comment-16841332
 ] 

Gabor Somogyi commented on SPARK-27742:
---

Kafka delegation token support was just added to 3.0 on both the source and sink 
side. Kerberos + SSL are also supported there.
Since I'm involved in streaming, I'm happy to be part of this effort (though I'm 
not sure how much there is to be done).

