[jira] [Created] (SPARK-40504) Make yarn appmaster load config from client

2022-09-20 Thread zhengchenyu (Jira)
zhengchenyu created SPARK-40504:
---

 Summary: Make yarn appmaster load config from client
 Key: SPARK-40504
 URL: https://issues.apache.org/jira/browse/SPARK-40504
 Project: Spark
  Issue Type: Improvement
  Components: YARN
Affects Versions: 3.0.1
Reporter: zhengchenyu


In YARN federation mode, the configuration on the client side and on the NodeManager (NM) side may differ.
The AppMaster should override its configuration with the client-side values.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-40504) Make yarn appmaster load config from client

2022-09-20 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated SPARK-40504:

Description: 
In YARN federation mode, the configuration on the client side and on the NodeManager (NM) side may differ.
The AppMaster should override its configuration with the client-side values.

For example:

On the client side, yarn.resourcemanager.ha.rm-ids refers to the YARN routers.

On the NM side, yarn.resourcemanager.ha.rm-ids refers to the RMs of the subcluster.

  was:In YARN federation mode, the configuration on the client side and on the
NodeManager (NM) side may differ. The AppMaster should override its configuration with the client-side values.


> Make yarn appmaster load config from client
> ---
>
> Key: SPARK-40504
> URL: https://issues.apache.org/jira/browse/SPARK-40504
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
>
> In YARN federation mode, the configuration on the client side and on the NodeManager (NM) side may differ.
> The AppMaster should override its configuration with the client-side values.
> For example:
> On the client side, yarn.resourcemanager.ha.rm-ids refers to the YARN routers.
> On the NM side, yarn.resourcemanager.ha.rm-ids refers to the RMs of the subcluster.
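
The override described in this issue can be sketched as a simple overlay of two key-value maps. This is a minimal illustration in Python, not Spark's or YARN's actual code: the function name overlay_config and all host and path values are hypothetical. Only the key yarn.resourcemanager.ha.rm-ids comes from the issue; yarn.nodemanager.local-dirs is included just to show that keys the client does not set keep their NM-side value.

```python
def overlay_config(nm_side, client_side):
    """Return a new config where client-side values take precedence.

    Hypothetical helper for illustration only; neither input is modified.
    """
    merged = dict(nm_side)       # start from what the NM side provides
    merged.update(client_side)   # client-side keys win on conflict
    return merged


# Hypothetical values: the NM side points at the subcluster RMs,
# while the client side points at the YARN routers.
nm_conf = {
    "yarn.resourcemanager.ha.rm-ids": "subcluster-rm1,subcluster-rm2",
    "yarn.nodemanager.local-dirs": "/data/nm-local",
}
client_conf = {
    "yarn.resourcemanager.ha.rm-ids": "router1,router2",
}

am_conf = overlay_config(nm_conf, client_conf)
print(am_conf["yarn.resourcemanager.ha.rm-ids"])  # router1,router2
```

With this ordering the AppMaster would talk to the routers the client was configured with, while any key the client leaves unset still comes from the NM side.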






[jira] [Created] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-08 Thread zhengchenyu (Jira)
zhengchenyu created SPARK-41073:
---

 Summary: Spark ThriftServer generate huge amounts of DelegationToken
 Key: SPARK-41073
 URL: https://issues.apache.org/jira/browse/SPARK-41073
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.0.1
Reporter: zhengchenyu









[jira] [Updated] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated SPARK-41073:

Description: 
In our cluster, ZooKeeper nearly crashed. I found that the znodes under
/zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
After some research, I found that some SQL queries running on the Spark
ThriftServer obtain huge numbers of delegation tokens.
The reason is that in these queries, every Hive partition acquires a different
delegation token.
HadoopRDDs can't share a delegation token. The ThriftServer should share the
delegation token from HadoopDelegationTokenManager.

> Spark ThriftServer generate huge amounts of DelegationToken
> ---
>
> Key: SPARK-41073
> URL: https://issues.apache.org/jira/browse/SPARK-41073
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
>
> In our cluster, ZooKeeper nearly crashed. I found that the znodes under
> /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
> After some research, I found that some SQL queries running on the Spark
> ThriftServer obtain huge numbers of delegation tokens.
> The reason is that in these queries, every Hive partition acquires a
> different delegation token.
> HadoopRDDs can't share a delegation token. The ThriftServer should share the
> delegation token from HadoopDelegationTokenManager.






[jira] [Updated] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated SPARK-41073:

Description: 
In our cluster, ZooKeeper nearly crashed. I found that the znodes under
/zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
After some research, I found that some SQL queries running on the Spark
ThriftServer obtain huge numbers of delegation tokens.
The reason is that in these queries, every Hive partition acquires a different
delegation token.
The reason is that HadoopRDDs in the ThriftServer can't share credentials from .

  was:
In our cluster, ZooKeeper nearly crashed. I found that the znodes under
/zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
After some research, I found that some SQL queries running on the Spark
ThriftServer obtain huge numbers of delegation tokens.
The reason is that in these queries, every Hive partition acquires a different
delegation token.
HadoopRDDs can't share a delegation token. The ThriftServer should share the
delegation token from HadoopDelegationTokenManager.


> Spark ThriftServer generate huge amounts of DelegationToken
> ---
>
> Key: SPARK-41073
> URL: https://issues.apache.org/jira/browse/SPARK-41073
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
>
> In our cluster, ZooKeeper nearly crashed. I found that the znodes under
> /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
> After some research, I found that some SQL queries running on the Spark
> ThriftServer obtain huge numbers of delegation tokens.
> The reason is that in these queries, every Hive partition acquires a
> different delegation token.
> The reason is that HadoopRDDs in the ThriftServer can't share credentials from .






[jira] [Updated] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated SPARK-41073:

Description: 
In our cluster, ZooKeeper nearly crashed. I found that the znodes under
/zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
After some research, I found that some SQL queries running on the Spark
ThriftServer obtain huge numbers of delegation tokens.
The reason is that in these queries, every Hive partition acquires a different
delegation token.
And HadoopRDDs in the ThriftServer can't share credentials from
CoarseGrainedSchedulerBackend::delegationTokens; we must share them.

  was:
In our cluster, ZooKeeper nearly crashed. I found that the znodes under
/zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
After some research, I found that some SQL queries running on the Spark
ThriftServer obtain huge numbers of delegation tokens.
The reason is that in these queries, every Hive partition acquires a different
delegation token.
The reason is that HadoopRDDs in the ThriftServer can't share credentials from
CoarseGrainedSchedulerBackend::delegationTokens; we must share them.


> Spark ThriftServer generate huge amounts of DelegationToken
> ---
>
> Key: SPARK-41073
> URL: https://issues.apache.org/jira/browse/SPARK-41073
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
>
> In our cluster, ZooKeeper nearly crashed. I found that the znodes under
> /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
> After some research, I found that some SQL queries running on the Spark
> ThriftServer obtain huge numbers of delegation tokens.
> The reason is that in these queries, every Hive partition acquires a
> different delegation token.
> And HadoopRDDs in the ThriftServer can't share credentials from
> CoarseGrainedSchedulerBackend::delegationTokens; we must share them.






[jira] [Updated] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated SPARK-41073:

Description: 
In our cluster, ZooKeeper nearly crashed. I found that the znodes under
/zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
After some research, I found that some SQL queries running on the Spark
ThriftServer obtain huge numbers of delegation tokens.
The reason is that in these queries, every Hive partition acquires a different
delegation token.
The reason is that HadoopRDDs in the ThriftServer can't share credentials from
CoarseGrainedSchedulerBackend::delegationTokens; we must share them.

  was:
In our cluster, ZooKeeper nearly crashed. I found that the znodes under
/zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
After some research, I found that some SQL queries running on the Spark
ThriftServer obtain huge numbers of delegation tokens.
The reason is that in these queries, every Hive partition acquires a different
delegation token.
The reason is that HadoopRDDs in the ThriftServer can't share credentials from .


> Spark ThriftServer generate huge amounts of DelegationToken
> ---
>
> Key: SPARK-41073
> URL: https://issues.apache.org/jira/browse/SPARK-41073
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
>
> In our cluster, ZooKeeper nearly crashed. I found that the znodes under
> /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
> After some research, I found that some SQL queries running on the Spark
> ThriftServer obtain huge numbers of delegation tokens.
> The reason is that in these queries, every Hive partition acquires a
> different delegation token.
> The reason is that HadoopRDDs in the ThriftServer can't share credentials from
> CoarseGrainedSchedulerBackend::delegationTokens; we must share them.






[jira] [Updated] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated SPARK-41073:

Attachment: SPARK-41073.proposal.A.draft.001.patch

> Spark ThriftServer generate huge amounts of DelegationToken
> ---
>
> Key: SPARK-41073
> URL: https://issues.apache.org/jira/browse/SPARK-41073
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
> Attachments: SPARK-41073.proposal.A.draft.001.patch
>
>
> In our cluster, ZooKeeper nearly crashed. I found that the znodes under
> /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
> After some research, I found that some SQL queries running on the Spark
> ThriftServer obtain huge numbers of delegation tokens.
> The reason is that in these queries, every Hive partition acquires a
> different delegation token.
> And HadoopRDDs in the ThriftServer can't share credentials from
> CoarseGrainedSchedulerBackend::delegationTokens; we must share them.






[jira] [Commented] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-11 Thread zhengchenyu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632209#comment-17632209
 ] 

zhengchenyu commented on SPARK-41073:
-

We must add valid credentials to the jobConf.

For now, I think SQL queries in the ThriftServer or in local mode can't get
valid credentials.

I have two proposals:

Proposal A: set global credentials for Hadoop.

Proposal B: extract HadoopDelegationTokenManager from
CoarseGrainedSchedulerBackend. (Note: I think local Spark also wants global
credentials.)

I prefer B.

But A is simpler. I have submitted SPARK-41073.proposal.A.draft.001.patch, which
solves the problem, though not gracefully.

[~vanzin] [~xkrogen] Can you give me some suggestions?
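
The idea behind sharing credentials can be sketched as follows. This is a minimal simulation in Python, not Spark or Hadoop code: FakeTokenIssuer and SharedCredentials are hypothetical names, the service string is made up, and token renewal and expiry are ignored. It only shows why a process-wide credentials cache caps the number of issued tokens at one per service, while per-partition acquisition issues one token per partition.

```python
import threading


class FakeTokenIssuer:
    """Hypothetical stand-in for a token service; counts issued tokens."""

    def __init__(self):
        self.issued = 0
        self._lock = threading.Lock()

    def issue(self, service):
        with self._lock:
            self.issued += 1
            return f"token-{service}-{self.issued}"


class SharedCredentials:
    """Hypothetical process-wide cache holding at most one token per service."""

    def __init__(self, issuer):
        self._issuer = issuer
        self._tokens = {}
        self._lock = threading.Lock()

    def token_for(self, service):
        with self._lock:
            # Only ask the issuer when no token is cached for this service.
            if service not in self._tokens:
                self._tokens[service] = self._issuer.issue(service)
            return self._tokens[service]


def scan_without_sharing(issuer, service, partitions):
    # Every partition requests its own token from the issuer.
    return [issuer.issue(service) for _ in range(partitions)]


def scan_with_sharing(credentials, service, partitions):
    # Every partition reuses the token held in the shared cache.
    return [credentials.token_for(service) for _ in range(partitions)]


unshared = FakeTokenIssuer()
scan_without_sharing(unshared, "hdfs://ns1", 1000)

shared = FakeTokenIssuer()
scan_with_sharing(SharedCredentials(shared), "hdfs://ns1", 1000)

print(unshared.issued, shared.issued)  # 1000 1
```

Both proposals aim at the second behaviour. In this sketch the difference between them would only be where the shared cache lives.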

> Spark ThriftServer generate huge amounts of DelegationToken
> ---
>
> Key: SPARK-41073
> URL: https://issues.apache.org/jira/browse/SPARK-41073
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
> Attachments: SPARK-41073.proposal.A.draft.001.patch
>
>
> In our cluster, ZooKeeper nearly crashed. I found that the znodes under
> /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
> After some research, I found that some SQL queries running on the Spark
> ThriftServer obtain huge numbers of delegation tokens.
> The reason is that in these queries, every Hive partition acquires a
> different delegation token.
> And HadoopRDDs in the ThriftServer can't share credentials from
> CoarseGrainedSchedulerBackend::delegationTokens; we must share them.






[jira] [Resolved] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-14 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu resolved SPARK-41073.
-
Resolution: Duplicate

> Spark ThriftServer generate huge amounts of DelegationToken
> ---
>
> Key: SPARK-41073
> URL: https://issues.apache.org/jira/browse/SPARK-41073
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
> Attachments: SPARK-41073.proposal.A.draft.001.patch
>
>
> In our cluster, ZooKeeper nearly crashed. I found that the znodes under
> /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
> After some research, I found that some SQL queries running on the Spark
> ThriftServer obtain huge numbers of delegation tokens.
> The reason is that in these queries, every Hive partition acquires a
> different delegation token.
> And HadoopRDDs in the ThriftServer can't share credentials from
> CoarseGrainedSchedulerBackend::delegationTokens; we must share them.






[jira] [Commented] (SPARK-41073) Spark ThriftServer generate huge amounts of DelegationToken

2022-11-14 Thread zhengchenyu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17634119#comment-17634119
 ] 

zhengchenyu commented on SPARK-41073:
-

[~xkrogen] Thank you very much for your reply. This issue is indeed a duplicate of
SPARK-36328. I will close it.

> Spark ThriftServer generate huge amounts of DelegationToken
> ---
>
> Key: SPARK-41073
> URL: https://issues.apache.org/jira/browse/SPARK-41073
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
> Attachments: SPARK-41073.proposal.A.draft.001.patch
>
>
> In our cluster, ZooKeeper nearly crashed. I found that the znodes under
> /zkdtsm/ZKDTSMRoot/ZKDTSMTokensRoot were increasing quickly.
> After some research, I found that some SQL queries running on the Spark
> ThriftServer obtain huge numbers of delegation tokens.
> The reason is that in these queries, every Hive partition acquires a
> different delegation token.
> And HadoopRDDs in the ThriftServer can't share credentials from
> CoarseGrainedSchedulerBackend::delegationTokens; we must share them.






[jira] [Resolved] (SPARK-40504) Make yarn appmaster load config from client

2022-11-14 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu resolved SPARK-40504.
-
Resolution: Implemented

> Make yarn appmaster load config from client
> ---
>
> Key: SPARK-40504
> URL: https://issues.apache.org/jira/browse/SPARK-40504
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 3.0.1
>Reporter: zhengchenyu
>Priority: Major
>
> In YARN federation mode, the configuration on the client side and on the NodeManager (NM) side may differ.
> The AppMaster should override its configuration with the client-side values.
> For example:
> On the client side, yarn.resourcemanager.ha.rm-ids refers to the YARN routers.
> On the NM side, yarn.resourcemanager.ha.rm-ids refers to the RMs of the subcluster.


