[jira] [Resolved] (SPARK-30689) Allow custom resource scheduling to work with YARN versions that don't support custom resource scheduling

2020-01-31 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-30689.
---
Fix Version/s: 3.0.0
 Assignee: Thomas Graves
   Resolution: Fixed

> Allow custom resource scheduling to work with YARN versions that don't 
> support custom resource scheduling
> -
>
> Key: SPARK-30689
> URL: https://issues.apache.org/jira/browse/SPARK-30689
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> Many people/companies will not be moving to Hadoop 3.1 or greater, which supports 
> custom resource scheduling for things like GPUs, any time soon, and they have 
> requested support for it in older Hadoop 2.x versions. This also means they may 
> not have isolation enabled, which is what the default behavior relies on.
> Right now the only option is to write a custom discovery script and handle it on 
> their own. This is OK but has some limitations because the script runs as a 
> separate process, and it is only a shell script.
> I think we can make this a lot more flexible by making the entire resource 
> discovery class pluggable. The default one would stay as is and call the 
> discovery script, but an advanced user who wanted to replace the whole thing 
> could implement a pluggable class containing custom code for how to discover 
> resource addresses.
> This will also help users who are running Hadoop 3.1.x or greater but don't have 
> the resources configured or aren't running in an isolated environment.
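For illustration, here is a minimal sketch of what such a pluggable discovery class 
could look like. The trait name, its signature, and the nvidia-smi-based example 
plugin are assumptions made for this sketch, not a committed Spark API; only 
ResourceInformation is an existing Spark class.

{code:scala}
// Illustrative sketch only: the trait and its signature are assumptions, not Spark's final API.
import org.apache.spark.SparkConf
import org.apache.spark.resource.ResourceInformation

trait PluggableResourceDiscovery {
  // Return the discovered addresses for the named resource (e.g. "gpu"),
  // or None to fall back to the default discovery-script behavior.
  def discoverResource(resourceName: String, conf: SparkConf): Option[ResourceInformation]
}

// Example plugin that asks nvidia-smi for GPU indices instead of running a user script.
class NvidiaSmiDiscovery extends PluggableResourceDiscovery {
  override def discoverResource(resourceName: String, conf: SparkConf): Option[ResourceInformation] = {
    if (resourceName != "gpu") {
      None
    } else {
      val output = scala.sys.process.Process(
        Seq("nvidia-smi", "--query-gpu=index", "--format=csv,noheader")).!!
      val addresses = output.split("\n").map(_.trim).filter(_.nonEmpty)
      Some(new ResourceInformation("gpu", addresses))
    }
  }
}
{code}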



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-30511) Spark marks intentionally killed speculative tasks as pending leads to holding idle executors

2020-01-31 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-30511:
-

Assignee: Zebing Lin

> Spark marks intentionally killed speculative tasks as pending leads to 
> holding idle executors
> -
>
> Key: SPARK-30511
> URL: https://issues.apache.org/jira/browse/SPARK-30511
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.3.0
>Reporter: Zebing Lin
>Assignee: Zebing Lin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: Screen Shot 2020-01-15 at 11.13.17.png
>
>
> *TL;DR*
>  When speculative tasks fail or get killed, they are still considered pending 
> and count toward the calculation of the number of needed executors.
> h3. Symptom
> In one of our production jobs (running 4 tasks per executor), we found that it 
> was holding 6 executors at the end with only 2 tasks running (1 speculative). 
> With more logging enabled, we found the job printed:
> {code:java}
> pendingTasks is 0 pendingSpeculativeTasks is 17 totalRunningTasks is 2
> {code}
>  while the job only had 1 speculative task running and 16 speculative tasks 
> intentionally killed because their corresponding original tasks had finished.
> An easy repro of the issue (`--conf spark.speculation=true --conf 
> spark.executor.cores=4 --conf spark.dynamicAllocation.maxExecutors=1000` in 
> cluster mode):
> {code:java}
> val n = 4000
> val someRDD = sc.parallelize(1 to n, n)
> someRDD.mapPartitionsWithIndex( (index: Int, it: Iterator[Int]) => {
> if (index < 300 && index >= 150) {
> Thread.sleep(index * 1000) // Fake running tasks
> } else if (index == 300) {
> Thread.sleep(1000 * 1000) // Fake long running tasks
> }
> it.toList.map(x => index + ", " + x).iterator
> }).collect
> {code}
> You will see that while running the last task, we would be holding 38 executors 
> (see attachment), which is exactly (152 + 3) / 4 = 38.
> h3. The Bug
> Upon examining the code of _pendingSpeculativeTasks_: 
> {code:java}
> stageAttemptToNumSpeculativeTasks.map { case (stageAttempt, numTasks) =>
>   numTasks - 
> stageAttemptToSpeculativeTaskIndices.get(stageAttempt).map(_.size).getOrElse(0)
> }.sum
> {code}
> where _stageAttemptToNumSpeculativeTasks(stageAttempt)_ is incremented on 
> _onSpeculativeTaskSubmitted_, but never decremented.  
> _stageAttemptToNumSpeculativeTasks -= stageAttempt_ is only performed on stage 
> completion. *This means Spark is marking ended speculative tasks as pending, 
> which leads Spark to hold more executors than it actually needs!*
> I will have a PR ready to fix this issue, along with SPARK-28403.
>  
>  
>  
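For illustration, a self-contained sketch of the bookkeeping fix implied above 
(class and method names are made up for this sketch, not Spark's actual 
ExecutorAllocationManager code): decrement the speculative-task count when a 
speculative task ends, instead of only dropping the whole entry on stage completion.

{code:scala}
// Illustrative sketch only -- names are assumptions, not the actual Spark implementation.
import scala.collection.mutable

class SpeculativeTaskBookkeeping {
  // (stageId, stageAttemptId) -> number of speculative tasks submitted and still live
  private val stageAttemptToNumSpeculativeTasks = mutable.HashMap[(Int, Int), Int]()
  // (stageId, stageAttemptId) -> indices of speculative tasks currently running
  private val stageAttemptToSpeculativeTaskIndices = mutable.HashMap[(Int, Int), mutable.HashSet[Int]]()

  def onSpeculativeTaskSubmitted(stageAttempt: (Int, Int)): Unit = {
    stageAttemptToNumSpeculativeTasks(stageAttempt) =
      stageAttemptToNumSpeculativeTasks.getOrElse(stageAttempt, 0) + 1
  }

  def onSpeculativeTaskStarted(stageAttempt: (Int, Int), taskIndex: Int): Unit = {
    stageAttemptToSpeculativeTaskIndices.getOrElseUpdate(stageAttempt, mutable.HashSet[Int]()) += taskIndex
  }

  // The fix: when a speculative task finishes or is intentionally killed, stop counting it.
  def onSpeculativeTaskEnded(stageAttempt: (Int, Int), taskIndex: Int): Unit = {
    stageAttemptToNumSpeculativeTasks(stageAttempt) =
      math.max(0, stageAttemptToNumSpeculativeTasks.getOrElse(stageAttempt, 0) - 1)
    stageAttemptToSpeculativeTaskIndices.get(stageAttempt).foreach(_.remove(taskIndex))
  }

  // Same formula as in the description, but ended speculative tasks no longer count as pending.
  def pendingSpeculativeTasks: Int = stageAttemptToNumSpeculativeTasks.map { case (stageAttempt, numTasks) =>
    numTasks - stageAttemptToSpeculativeTaskIndices.get(stageAttempt).map(_.size).getOrElse(0)
  }.sum
}
{code}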



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30511) Spark marks intentionally killed speculative tasks as pending leads to holding idle executors

2020-01-31 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-30511.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> Spark marks intentionally killed speculative tasks as pending leads to 
> holding idle executors
> -
>
> Key: SPARK-30511
> URL: https://issues.apache.org/jira/browse/SPARK-30511
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.3.0
>Reporter: Zebing Lin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: Screen Shot 2020-01-15 at 11.13.17.png
>
>
> *TL;DR*
>  When speculative tasks fail or get killed, they are still considered pending 
> and count toward the calculation of the number of needed executors.
> h3. Symptom
> In one of our production jobs (running 4 tasks per executor), we found that it 
> was holding 6 executors at the end with only 2 tasks running (1 speculative). 
> With more logging enabled, we found the job printed:
> {code:java}
> pendingTasks is 0 pendingSpeculativeTasks is 17 totalRunningTasks is 2
> {code}
>  while the job only had 1 speculative task running and 16 speculative tasks 
> intentionally killed because their corresponding original tasks had finished.
> An easy repro of the issue (`--conf spark.speculation=true --conf 
> spark.executor.cores=4 --conf spark.dynamicAllocation.maxExecutors=1000` in 
> cluster mode):
> {code:java}
> val n = 4000
> val someRDD = sc.parallelize(1 to n, n)
> someRDD.mapPartitionsWithIndex( (index: Int, it: Iterator[Int]) => {
> if (index < 300 && index >= 150) {
> Thread.sleep(index * 1000) // Fake running tasks
> } else if (index == 300) {
> Thread.sleep(1000 * 1000) // Fake long running tasks
> }
> it.toList.map(x => index + ", " + x).iterator
> }).collect
> {code}
> You will see that while running the last task, we would be holding 38 executors 
> (see attachment), which is exactly (152 + 3) / 4 = 38.
> h3. The Bug
> Upon examining the code of _pendingSpeculativeTasks_: 
> {code:java}
> stageAttemptToNumSpeculativeTasks.map { case (stageAttempt, numTasks) =>
>   numTasks - 
> stageAttemptToSpeculativeTaskIndices.get(stageAttempt).map(_.size).getOrElse(0)
> }.sum
> {code}
> where _stageAttemptToNumSpeculativeTasks(stageAttempt)_ is incremented on 
> _onSpeculativeTaskSubmitted_, but never decremented.  
> _stageAttemptToNumSpeculativeTasks -= stageAttempt_ is only performed on stage 
> completion. *This means Spark is marking ended speculative tasks as pending, 
> which leads Spark to hold more executors than it actually needs!*
> I will have a PR ready to fix this issue, along with SPARK-28403.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30638) add resources as parameter to the PluginContext

2020-01-31 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-30638.
---
Fix Version/s: 3.0.0
 Assignee: Thomas Graves
   Resolution: Fixed

> add resources as parameter to the PluginContext
> ---
>
> Key: SPARK-30638
> URL: https://issues.apache.org/jira/browse/SPARK-30638
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> Add the allocated resources as a parameter to the PluginContext so that any 
> plugins in the driver or executors can use this information to initialize 
> devices or otherwise make use of it.
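For illustration, a sketch of how an executor plugin might consume this. It assumes 
the allocated resources are exposed through a resources() accessor on PluginContext; 
that accessor name and the GpuInitPlugin class are assumptions for this example, 
while SparkPlugin, ExecutorPlugin, and ResourceInformation are existing Spark 3.0 
classes.

{code:scala}
// Illustrative sketch: assumes PluginContext exposes the allocated resources as a map;
// the resources() accessor and the GpuInitPlugin class are assumptions for this example.
import java.util.{Map => JMap}
import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}
import org.apache.spark.resource.ResourceInformation

class GpuInitPlugin extends SparkPlugin {
  override def driverPlugin(): DriverPlugin = null

  override def executorPlugin(): ExecutorPlugin = new ExecutorPlugin {
    override def init(ctx: PluginContext, extraConf: JMap[String, String]): Unit = {
      val resources: JMap[String, ResourceInformation] = ctx.resources()
      // Initialize each GPU address this executor was allocated, if any.
      Option(resources.get("gpu")).foreach { gpu =>
        gpu.addresses.foreach(addr => println(s"initializing GPU device $addr"))
      }
    }
  }
}
{code}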



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30689) Allow custom resource scheduling to work with YARN versions that don't support custom resource scheduling

2020-01-30 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-30689:
--
Description: 
Many people/companies will not be moving to Hadoop 3.1 or greater, which supports 
custom resource scheduling for things like GPUs, any time soon, and they have 
requested support for it in older Hadoop 2.x versions. This also means they may not 
have isolation enabled, which is what the default behavior relies on.

Right now the only option is to write a custom discovery script and handle it on 
their own. This is OK but has some limitations because the script runs as a separate 
process, and it is only a shell script.

I think we can make this a lot more flexible by making the entire resource discovery 
class pluggable. The default one would stay as is and call the discovery script, but 
an advanced user who wanted to replace the whole thing could implement a pluggable 
class containing custom code for how to discover resource addresses.

This will also help users who are running Hadoop 3.1.x or greater but don't have the 
resources configured or aren't running in an isolated environment.

  was:
Many people/companies will not be moving to Hadoop 3.1 or greater, which supports 
custom resource scheduling for things like GPUs, any time soon, and they have 
requested support for it in older Hadoop 2.x versions. This also means they may not 
have isolation enabled, which is what the default behavior relies on.

Right now the only option is to write a custom discovery script and handle it on 
their own. This is OK but has some limitations because the script runs as a separate 
process, and it is only a shell script.

I think we can make this a lot more flexible by making the entire resource discovery 
class pluggable. The default one would stay as is and call the discovery script, but 
an advanced user who wanted to replace the whole thing could implement a pluggable 
class containing custom code for how to discover resource addresses.


> Allow custom resource scheduling to work with YARN versions that don't 
> support custom resource scheduling
> -
>
> Key: SPARK-30689
> URL: https://issues.apache.org/jira/browse/SPARK-30689
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Many people/companies will not be moving to Hadoop 3.1 or greater, which supports 
> custom resource scheduling for things like GPUs, any time soon, and they have 
> requested support for it in older Hadoop 2.x versions. This also means they may 
> not have isolation enabled, which is what the default behavior relies on.
> Right now the only option is to write a custom discovery script and handle it on 
> their own. This is OK but has some limitations because the script runs as a 
> separate process, and it is only a shell script.
> I think we can make this a lot more flexible by making the entire resource 
> discovery class pluggable. The default one would stay as is and call the 
> discovery script, but an advanced user who wanted to replace the whole thing 
> could implement a pluggable class containing custom code for how to discover 
> resource addresses.
> This will also help users who are running Hadoop 3.1.x or greater but don't have 
> the resources configured or aren't running in an isolated environment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30689) Allow custom resource scheduling to work with YARN versions that don't support custom resource scheduling

2020-01-30 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-30689:
--
Description: 
Many people/companies will not be moving to Hadoop 3.1 or greater, which supports 
custom resource scheduling for things like GPUs, any time soon, and they have 
requested support for it in older Hadoop 2.x versions. This also means they may not 
have isolation enabled, which is what the default behavior relies on.

Right now the only option is to write a custom discovery script and handle it on 
their own. This is OK but has some limitations because the script runs as a separate 
process, and it is only a shell script.

I think we can make this a lot more flexible by making the entire resource discovery 
class pluggable. The default one would stay as is and call the discovery script, but 
an advanced user who wanted to replace the whole thing could implement a pluggable 
class containing custom code for how to discover resource addresses.

  was:
Many people/companies will not be moving to Hadoop 3.0, which supports custom 
resource scheduling for things like GPUs, any time soon, and they have requested 
support for it in older Hadoop 2.x versions. This also means they may not have 
isolation enabled, which is what the default behavior relies on.

Right now the only option is to write a custom discovery script and handle it on 
their own. This is OK but has some limitations because the script runs as a separate 
process, and it is only a shell script.

I think we can make this a lot more flexible by making the entire resource discovery 
class pluggable. The default one would stay as is and call the discovery script, but 
an advanced user who wanted to replace the whole thing could implement a pluggable 
class containing custom code for how to discover resource addresses.


> Allow custom resource scheduling to work with YARN versions that don't 
> support custom resource scheduling
> -
>
> Key: SPARK-30689
> URL: https://issues.apache.org/jira/browse/SPARK-30689
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Many people/companies will not be moving to Hadoop 3.1 or greater, which supports 
> custom resource scheduling for things like GPUs, any time soon, and they have 
> requested support for it in older Hadoop 2.x versions. This also means they may 
> not have isolation enabled, which is what the default behavior relies on.
> Right now the only option is to write a custom discovery script and handle it on 
> their own. This is OK but has some limitations because the script runs as a 
> separate process, and it is only a shell script.
> I think we can make this a lot more flexible by making the entire resource 
> discovery class pluggable. The default one would stay as is and call the 
> discovery script, but an advanced user who wanted to replace the whole thing 
> could implement a pluggable class containing custom code for how to discover 
> resource addresses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30689) Allow custom resource scheduling to work with YARN versions that don't support custom resource scheduling

2020-01-30 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-30689:
--
Issue Type: Improvement  (was: Bug)

> Allow custom resource scheduling to work with YARN versions that don't 
> support custom resource scheduling
> -
>
> Key: SPARK-30689
> URL: https://issues.apache.org/jira/browse/SPARK-30689
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Many people/companies will not be moving to Hadoop 3.1 or greater, which supports 
> custom resource scheduling for things like GPUs, any time soon, and they have 
> requested support for it in older Hadoop 2.x versions. This also means they may 
> not have isolation enabled, which is what the default behavior relies on.
> Right now the only option is to write a custom discovery script and handle it on 
> their own. This is OK but has some limitations because the script runs as a 
> separate process, and it is only a shell script.
> I think we can make this a lot more flexible by making the entire resource 
> discovery class pluggable. The default one would stay as is and call the 
> discovery script, but an advanced user who wanted to replace the whole thing 
> could implement a pluggable class containing custom code for how to discover 
> resource addresses.
> This will also help users who are running Hadoop 3.1.x or greater but don't have 
> the resources configured or aren't running in an isolated environment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30689) Allow custom resource scheduling to work with Hadoop versions that don't support it

2020-01-30 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30689:
-

 Summary: Allow custom resource scheduling to work with Hadoop 
versions that don't support it
 Key: SPARK-30689
 URL: https://issues.apache.org/jira/browse/SPARK-30689
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 3.0.0
Reporter: Thomas Graves


Many people/companies will not be moving to Hadoop 3.0, which supports custom 
resource scheduling for things like GPUs, any time soon, and they have requested 
support for it in older Hadoop 2.x versions. This also means they may not have 
isolation enabled, which is what the default behavior relies on.

Right now the only option is to write a custom discovery script and handle it on 
their own. This is OK but has some limitations because the script runs as a separate 
process, and it is only a shell script.

I think we can make this a lot more flexible by making the entire resource discovery 
class pluggable. The default one would stay as is and call the discovery script, but 
an advanced user who wanted to replace the whole thing could implement a pluggable 
class containing custom code for how to discover resource addresses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30689) Allow custom resource scheduling to work with YARN versions that don't support custom resource scheduling

2020-01-30 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-30689:
--
Summary: Allow custom resource scheduling to work with YARN versions that 
don't support custom resource scheduling  (was: Allow custom resource 
scheduling to work with Hadoop versions that don't support it)

> Allow custom resource scheduling to work with YARN versions that don't 
> support custom resource scheduling
> -
>
> Key: SPARK-30689
> URL: https://issues.apache.org/jira/browse/SPARK-30689
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Many people/companies will not be moving to Hadoop 3.0, which supports custom 
> resource scheduling for things like GPUs, any time soon, and they have requested 
> support for it in older Hadoop 2.x versions. This also means they may not have 
> isolation enabled, which is what the default behavior relies on.
> Right now the only option is to write a custom discovery script and handle it on 
> their own. This is OK but has some limitations because the script runs as a 
> separate process, and it is only a shell script.
> I think we can make this a lot more flexible by making the entire resource 
> discovery class pluggable. The default one would stay as is and call the 
> discovery script, but an advanced user who wanted to replace the whole thing 
> could implement a pluggable class containing custom code for how to discover 
> resource addresses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30638) add resources as parameter to the PluginContext

2020-01-30 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-30638:
--
Description: Add the allocated resources as a parameter to the PluginContext so 
that any plugins in the driver or executors can use this information to initialize 
devices or otherwise make use of it.  (was: Add the allocated resources and 
ResourceProfile as parameters to the PluginContext so that any plugins in the 
driver or executors can use this information to initialize devices or otherwise 
make use of it.)

> add resources as parameter to the PluginContext
> ---
>
> Key: SPARK-30638
> URL: https://issues.apache.org/jira/browse/SPARK-30638
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Add the allocated resources as a parameter to the PluginContext so that any 
> plugins in the driver or executors can use this information to initialize 
> devices or otherwise make use of it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30512) Use a dedicated boss event group loop in the netty pipeline for external shuffle service

2020-01-29 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-30512:
--
Fix Version/s: 2.4.5

> Use a dedicated boss event group loop in the netty pipeline for external 
> shuffle service
> 
>
> Key: SPARK-30512
> URL: https://issues.apache.org/jira/browse/SPARK-30512
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle
>Affects Versions: 3.0.0
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 2.4.5, 3.0.0
>
>
> We have been seeing a large number of SASL authentication RPC requests timing 
> out with the external shuffle service.
>  The issue and all the analysis we did are described here:
>  [https://github.com/netty/netty/issues/9890]
> I added a {{LoggingHandler}} to the netty pipeline and realized that even the 
> channel registration is delayed by 30 seconds. 
>  In the Spark external shuffle service, the boss event group and the worker 
> event group are the same, which is causing this delay.
> {code:java}
> EventLoopGroup bossGroup =
>   NettyUtils.createEventLoop(ioMode, conf.serverThreads(), 
> conf.getModuleName() + "-server");
> EventLoopGroup workerGroup = bossGroup;
> bootstrap = new ServerBootstrap()
>   .group(bossGroup, workerGroup)
>   .channel(NettyUtils.getServerChannelClass(ioMode))
>   .option(ChannelOption.ALLOCATOR, allocator)
>   .childOption(ChannelOption.ALLOCATOR, allocator);
> {code}
> When the load at the shuffle service increases, since the worker threads are 
> busy with existing channels, registering new channels gets delayed.
> The fix is simple. I created a dedicated boss thread event loop group with 1 
> thread.
> {code:java}
> EventLoopGroup bossGroup = NettyUtils.createEventLoop(ioMode, 1,
>   conf.getModuleName() + "-boss");
> EventLoopGroup workerGroup =  NettyUtils.createEventLoop(ioMode, 
> conf.serverThreads(),
> conf.getModuleName() + "-server");
> bootstrap = new ServerBootstrap()
>   .group(bossGroup, workerGroup)
>   .channel(NettyUtils.getServerChannelClass(ioMode))
>   .option(ChannelOption.ALLOCATOR, allocator)
> {code}
> This fixed the issue.
>  We just need 1 thread in the boss group because there is only a single 
> server bootstrap.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-30512) Use a dedicated boss event group loop in the netty pipeline for external shuffle service

2020-01-29 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-30512:
-

Assignee: Chandni Singh

> Use a dedicated boss event group loop in the netty pipeline for external 
> shuffle service
> 
>
> Key: SPARK-30512
> URL: https://issues.apache.org/jira/browse/SPARK-30512
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle
>Affects Versions: 3.0.0
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.0.0
>
>
> We have been seeing a large number of SASL authentication RPC requests timing 
> out with the external shuffle service.
>  The issue and all the analysis we did are described here:
>  [https://github.com/netty/netty/issues/9890]
> I added a {{LoggingHandler}} to the netty pipeline and realized that even the 
> channel registration is delayed by 30 seconds. 
>  In the Spark external shuffle service, the boss event group and the worker 
> event group are the same, which is causing this delay.
> {code:java}
> EventLoopGroup bossGroup =
>   NettyUtils.createEventLoop(ioMode, conf.serverThreads(), 
> conf.getModuleName() + "-server");
> EventLoopGroup workerGroup = bossGroup;
> bootstrap = new ServerBootstrap()
>   .group(bossGroup, workerGroup)
>   .channel(NettyUtils.getServerChannelClass(ioMode))
>   .option(ChannelOption.ALLOCATOR, allocator)
>   .childOption(ChannelOption.ALLOCATOR, allocator);
> {code}
> When the load at the shuffle service increases, since the worker threads are 
> busy with existing channels, registering new channels gets delayed.
> The fix is simple. I created a dedicated boss thread event loop group with 1 
> thread.
> {code:java}
> EventLoopGroup bossGroup = NettyUtils.createEventLoop(ioMode, 1,
>   conf.getModuleName() + "-boss");
> EventLoopGroup workerGroup =  NettyUtils.createEventLoop(ioMode, 
> conf.serverThreads(),
> conf.getModuleName() + "-server");
> bootstrap = new ServerBootstrap()
>   .group(bossGroup, workerGroup)
>   .channel(NettyUtils.getServerChannelClass(ioMode))
>   .option(ChannelOption.ALLOCATOR, allocator)
> {code}
> This fixed the issue.
>  We just need 1 thread in the boss group because there is only a single 
> server bootstrap.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30512) Use a dedicated boss event group loop in the netty pipeline for external shuffle service

2020-01-29 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-30512.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

This could be pulled back into branch-2.x as well.

> Use a dedicated boss event group loop in the netty pipeline for external 
> shuffle service
> 
>
> Key: SPARK-30512
> URL: https://issues.apache.org/jira/browse/SPARK-30512
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle
>Affects Versions: 3.0.0
>Reporter: Chandni Singh
>Priority: Major
> Fix For: 3.0.0
>
>
> We have been seeing a large number of SASL authentication RPC requests timing 
> out with the external shuffle service.
>  The issue and all the analysis we did are described here:
>  [https://github.com/netty/netty/issues/9890]
> I added a {{LoggingHandler}} to the netty pipeline and realized that even the 
> channel registration is delayed by 30 seconds. 
>  In the Spark external shuffle service, the boss event group and the worker 
> event group are the same, which is causing this delay.
> {code:java}
> EventLoopGroup bossGroup =
>   NettyUtils.createEventLoop(ioMode, conf.serverThreads(), 
> conf.getModuleName() + "-server");
> EventLoopGroup workerGroup = bossGroup;
> bootstrap = new ServerBootstrap()
>   .group(bossGroup, workerGroup)
>   .channel(NettyUtils.getServerChannelClass(ioMode))
>   .option(ChannelOption.ALLOCATOR, allocator)
>   .childOption(ChannelOption.ALLOCATOR, allocator);
> {code}
> When the load at the shuffle service increases, since the worker threads are 
> busy with existing channels, registering new channels gets delayed.
> The fix is simple. I created a dedicated boss thread event loop group with 1 
> thread.
> {code:java}
> EventLoopGroup bossGroup = NettyUtils.createEventLoop(ioMode, 1,
>   conf.getModuleName() + "-boss");
> EventLoopGroup workerGroup =  NettyUtils.createEventLoop(ioMode, 
> conf.serverThreads(),
> conf.getModuleName() + "-server");
> bootstrap = new ServerBootstrap()
>   .group(bossGroup, workerGroup)
>   .channel(NettyUtils.getServerChannelClass(ioMode))
>   .option(ChannelOption.ALLOCATOR, allocator)
> {code}
> This fixed the issue.
>  We just need 1 thread in the boss group because there is only a single 
> server bootstrap.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (YARN-8200) Backport resource types/GPU features to branch-3.0/branch-2

2020-01-28 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025285#comment-17025285
 ] 

Thomas Graves commented on YARN-8200:
-

After messing with this a bit more, I removed the maximum allocation configurations 
after seeing that the documentation didn't include them for the 2.10 release, so I 
removed this setting:


<property>
  <name>yarn.resource-types.yarn.io/gpu.maximum-allocation</name>
  <value>4</value>
</property>

And it now appears YARN doesn't allocate me a container unless it has fulfilled all 
of the GPUs I requested. In this case my NodeManager has 4 GPUs, so if I request 5 
it just hangs waiting to fulfill the request. This behavior is much better than 
giving me a container with fewer GPUs than I requested.

 

> Backport resource types/GPU features to branch-3.0/branch-2
> ---
>
> Key: YARN-8200
> URL: https://issues.apache.org/jira/browse/YARN-8200
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
>  Labels: release-blocker
> Fix For: 2.10.0
>
> Attachments: YARN-8200-branch-2.001.patch, 
> YARN-8200-branch-2.002.patch, YARN-8200-branch-2.003.patch, 
> YARN-8200-branch-3.0.001.patch, 
> counter.scheduler.operation.allocate.csv.defaultResources, 
> counter.scheduler.operation.allocate.csv.gpuResources, synth_sls.json
>
>
> Currently we have a need for GPU scheduling on our YARN clusters to support 
> deep learning workloads. However, our main production clusters are running 
> older versions of branch-2 (2.7 in our case). To prevent supporting too many 
> very different hadoop versions across multiple clusters, we would like to 
> backport the resource types/resource profiles feature to branch-2, as well as 
> the GPU specific support.
>  
> We have done a trial backport of YARN-3926 and some miscellaneous patches in 
> YARN-7069 based on issues we uncovered, and the backport was fairly smooth. 
> We also did a trial backport of most of YARN-6223 (sans docker support).
>  
> Regarding the backports, perhaps we can do the development in a feature 
> branch and then merge to branch-2 when ready.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8200) Backport resource types/GPU features to branch-3.0/branch-2

2020-01-28 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025156#comment-17025156
 ] 

Thomas Graves commented on YARN-8200:
-

Hey [~jhung] ,

I am trying out GPU scheduling in Hadoop 2.10, and the first thing I noticed is 
that it doesn't error properly if you ask for too many GPUs. It seems to happily 
say it gave them to me, although I think it's really giving me the configured 
maximum. Is this a known issue already, or did the configuration change?

I have the GPU maximum configured at 4 and I try to allocate 8; on Hadoop 3 I get:

 

Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException):
 Invalid resource request, requested resource type=[yarn.io/gpu] < 0 or greater 
than maximum allowed allocation. Requested resource=, maximum allowed allocation=, please note that maximum allowed allocation is calculated by 
scheduler based on maximum resource of registered NodeManagers, which might be 
less than configured maximum allocation=

 

On Hadoop 2.10 I get a container allocated, but the logs and UI say it only has 
4 GPUs.

> Backport resource types/GPU features to branch-3.0/branch-2
> ---
>
> Key: YARN-8200
> URL: https://issues.apache.org/jira/browse/YARN-8200
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
>  Labels: release-blocker
> Fix For: 2.10.0
>
> Attachments: YARN-8200-branch-2.001.patch, 
> YARN-8200-branch-2.002.patch, YARN-8200-branch-2.003.patch, 
> YARN-8200-branch-3.0.001.patch, 
> counter.scheduler.operation.allocate.csv.defaultResources, 
> counter.scheduler.operation.allocate.csv.gpuResources, synth_sls.json
>
>
> Currently we have a need for GPU scheduling on our YARN clusters to support 
> deep learning workloads. However, our main production clusters are running 
> older versions of branch-2 (2.7 in our case). To prevent supporting too many 
> very different hadoop versions across multiple clusters, we would like to 
> backport the resource types/resource profiles feature to branch-2, as well as 
> the GPU specific support.
>  
> We have done a trial backport of YARN-3926 and some miscellaneous patches in 
> YARN-7069 based on issues we uncovered, and the backport was fairly smooth. 
> We also did a trial backport of most of YARN-6223 (sans docker support).
>  
> Regarding the backports, perhaps we can do the development in a feature 
> branch and then merge to branch-2 when ready.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (SPARK-30638) add resources as parameter to the PluginContext

2020-01-24 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30638:
-

 Summary: add resources as parameter to the PluginContext
 Key: SPARK-30638
 URL: https://issues.apache.org/jira/browse/SPARK-30638
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


Add the allocated resources and ResourceProfile as parameters to the PluginContext 
so that any plugins in the driver or executors can use this information to 
initialize devices or otherwise make use of it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28403) Executor Allocation Manager can add an extra executor when speculative tasks

2020-01-23 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022527#comment-17022527
 ] 

Thomas Graves commented on SPARK-28403:
---

So after looking at the PR for this, this logic may have been an attempt to get 
executors on different hosts. The speculation logic in the scheduler is such that 
it will only run a speculative task on a different host than the currently running 
task.

> Executor Allocation Manager can add an extra executor when speculative tasks
> 
>
> Key: SPARK-28403
> URL: https://issues.apache.org/jira/browse/SPARK-28403
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Thomas Graves
>Priority: Major
>
> It looks like SPARK-19326 added a bug in the executor allocation manager where it 
> adds an extra executor when it shouldn't: when we have pending speculative tasks 
> but the target number didn't change. 
> [https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L377]
> It doesn't look like this is necessary since the pending speculative tasks are 
> already added in.
> See the questioning of this on the PR at:
> https://github.com/apache/spark/pull/18492/files#diff-b096353602813e47074ace09a3890d56R379



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30623) Spark external shuffle allow disable of separate event loop group

2020-01-23 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30623:
-

 Summary: Spark external shuffle allow disable of separate event 
loop group
 Key: SPARK-30623
 URL: https://issues.apache.org/jira/browse/SPARK-30623
 Project: Spark
  Issue Type: Bug
  Components: Shuffle
Affects Versions: 2.4.4, 3.0.0
Reporter: Thomas Graves


In SPARK-24355, changes were made to add a separate event loop group for processing 
ChunkFetchRequests; this allows the other threads to handle regular connection 
requests when the configuration value is set. However, this seems to have added 
some latency (see the comments at the end of PR 22173).

To help with this, we could make sure the secondary event loop group isn't used 
when spark.shuffle.server.chunkFetchHandlerThreadsPercent isn't explicitly set. 
This should result in the same behavior as before.
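For illustration, the proposal essentially keys the behavior off whether the setting 
was explicitly provided. A simplified sketch (not the actual shuffle-server code) of 
that guard:

{code:scala}
// Simplified sketch of the proposed guard; not the actual shuffle-server code.
import org.apache.spark.SparkConf

val conf = new SparkConf()
// Users who want the dedicated ChunkFetchRequest event loop opt in explicitly, e.g.:
// conf.set("spark.shuffle.server.chunkFetchHandlerThreadsPercent", "100")

val useSeparateChunkFetchLoop =
  conf.contains("spark.shuffle.server.chunkFetchHandlerThreadsPercent")

// When the key is unset, fall back to the pre-SPARK-24355 behavior of handling
// chunk fetches on the regular worker event loop group.
println(s"dedicated chunk-fetch event loop enabled: $useSeparateChunkFetchLoop")
{code}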



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27213) Unexpected results when filter is used after distinct

2020-01-22 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021532#comment-17021532
 ] 

Thomas Graves commented on SPARK-27213:
---

Did anyone try this on the latest 2.4.x release?

> Unexpected results when filter is used after distinct
> -
>
> Key: SPARK-27213
> URL: https://issues.apache.org/jira/browse/SPARK-27213
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.3.2, 2.4.0
>Reporter: Rinaz Belhaj
>Priority: Blocker
>  Labels: correctness, distinct, filter
>
> The following code gives unexpected output due to the filter not getting 
> pushed down by the Catalyst optimizer.
> {code:java}
> df = 
> spark.createDataFrame([['a','123','12.3','n'],['a','123','12.3','y'],['a','123','12.4','y']],['x','y','z','y_n'])
> df.show(5)
> df.filter("y_n='y'").select('x','y','z').distinct().show()
> df.select('x','y','z').distinct().filter("y_n='y'").show()
> {code}
> {panel:title=Output}
> |x|y|z|y_n|
> |a|123|12.3|n|
> |a|123|12.3|y|
> |a|123|12.4|y|
>  
> |x|y|z|
> |a|123|12.3|
> |a|123|12.4|
>  
> |x|y|z|
> |a|123|12.4|
> {panel}
> Ideally, the second statement should result in an error since the column used 
> in the filter is not present in the preceding select statement. But the 
> Catalyst optimizer is using first() on column y_n and then applying the 
> filter.
> Had the filter been pushed down, the result would have been accurate.
> {code:java}
> df = 
> spark.createDataFrame([['a','123','12.3','n'],['a','123','12.3','y'],['a','123','12.4','y']],['x','y','z','y_n'])
> df.filter("y_n='y'").select('x','y','z').distinct().explain(True)
> df.select('x','y','z').distinct().filter("y_n='y'").explain(True) 
> {code}
> {panel:title=Output}
>  
>  == Parsed Logical Plan ==
>  Deduplicate [x#74, y#75, z#76|#74, y#75, z#76]
>  +- AnalysisBarrier
>  +- Project [x#74, y#75, z#76|#74, y#75, z#76]
>  +- Filter (y_n#77 = y)
>  +- LogicalRDD [x#74, y#75, z#76, y_n#77|#74, y#75, z#76, y_n#77], false
>   
>  == Analyzed Logical Plan ==
>  x: string, y: string, z: string
>  Deduplicate [x#74, y#75, z#76|#74, y#75, z#76]
>  +- Project [x#74, y#75, z#76|#74, y#75, z#76]
>  +- Filter (y_n#77 = y)
>  +- LogicalRDD [x#74, y#75, z#76, y_n#77|#74, y#75, z#76, y_n#77], false
>   
>  == Optimized Logical Plan ==
>  Aggregate [x#74, y#75, z#76|#74, y#75, z#76], [x#74, y#75, z#76|#74, y#75, 
> z#76]
>  +- Project [x#74, y#75, z#76|#74, y#75, z#76]
>  +- Filter (isnotnull(y_n#77) && (y_n#77 = y))
>  +- LogicalRDD [x#74, y#75, z#76, y_n#77|#74, y#75, z#76, y_n#77], false
>   
>  == Physical Plan ==
>  *(2) HashAggregate(keys=[x#74, y#75, z#76|#74, y#75, z#76], functions=[], 
> output=[x#74, y#75, z#76|#74, y#75, z#76])
>  +- Exchange hashpartitioning(x#74, y#75, z#76, 10)
>  +- *(1) HashAggregate(keys=[x#74, y#75, z#76|#74, y#75, z#76], functions=[], 
> output=[x#74, y#75, z#76|#74, y#75, z#76])
>  +- *(1) Project [x#74, y#75, z#76|#74, y#75, z#76]
>  +- *(1) Filter (isnotnull(y_n#77) && (y_n#77 = y))
>  +- Scan ExistingRDD[x#74,y#75,z#76,y_n#77|#74,y#75,z#76,y_n#77]
>   
>  
> ---
>  
>   
>  == Parsed Logical Plan ==
>  'Filter ('y_n = y)
>  +- AnalysisBarrier
>  +- Deduplicate [x#74, y#75, z#76|#74, y#75, z#76]
>  +- Project [x#74, y#75, z#76|#74, y#75, z#76]
>  +- LogicalRDD [x#74, y#75, z#76, y_n#77|#74, y#75, z#76, y_n#77], false
>   
>  == Analyzed Logical Plan ==
>  x: string, y: string, z: string
>  Project [x#74, y#75, z#76|#74, y#75, z#76]
>  +- Filter (y_n#77 = y)
>  +- Deduplicate [x#74, y#75, z#76|#74, y#75, z#76]
>  +- Project [x#74, y#75, z#76, y_n#77|#74, y#75, z#76, y_n#77]
>  +- LogicalRDD [x#74, y#75, z#76, y_n#77|#74, y#75, z#76, y_n#77], false
>   
>  == Optimized Logical Plan ==
>  Project [x#74, y#75, z#76|#74, y#75, z#76]
>  +- Filter (isnotnull(y_n#77) && (y_n#77 = y))
>  +- Aggregate [x#74, y#75, z#76|#74, y#75, z#76], [x#74, y#75, z#76, 
> first(y_n#77, false) AS y_n#77|#74, y#75, z#76, first(y_n#77, false) AS 
> y_n#77]
>  +- LogicalRDD [x#74, y#75, z#76, y_n#77|#74, y#75, z#76, y_n#77], false
>   
>  == Physical Plan ==
>  *(3) Project [x#74, y#75, z#76|#74, y#75, z#76]
>  +- *(3) Filter (isnotnull(y_n#77) && (y_n#77 = y))
>  +- SortAggregate(key=[x#74, y#75, z#76|#74, y#75, z#76], 
> functions=[first(y_n#77, false)|#77, false)], output=[x#74, y#75, z#76, 
> y_n#77|#74, y#75, z#76, y_n#77])
>  +- *(2) Sort [x#74 ASC NULLS FIRST, y#75 ASC NULLS FIRST, z#76 ASC NULLS 
> FIRST|#74 ASC NULLS FIRST, y#75 ASC NULLS FIRST, z#76 ASC NULLS FIRST], 
> false, 0
>  +- Exchange hashpartitioning(x#74, y#75, z#76, 10)
>  +- SortAggregate(key=[x#74, y#75, z#76|#74, y#75, z#76], 
> functions=[partial_first(y_n#77, false)|#77, 

[jira] [Commented] (SPARK-29699) Different answers in nested aggregates with window functions

2020-01-22 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021404#comment-17021404
 ] 

Thomas Graves commented on SPARK-29699:
---

This seems to be in the context of feature parity with PostgreSQL; we aren't doing 
that now, and you say we do the same as MySQL. If that is the case, I would argue 
this doesn't seem like a correctness issue but a compatibility issue. Thoughts?

> Different answers in nested aggregates with window functions
> 
>
> Key: SPARK-29699
> URL: https://issues.apache.org/jira/browse/SPARK-29699
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Takeshi Yamamuro
>Priority: Major
>  Labels: correctness
>
> A nested aggregate below with a window function seems to have different 
> answers in the `rsum` column  between PgSQL and Spark;
> {code:java}
> postgres=# create table gstest2 (a integer, b integer, c integer, d integer, 
> e integer, f integer, g integer, h integer);
> postgres=# insert into gstest2 values
> postgres-#   (1, 1, 1, 1, 1, 1, 1, 1),
> postgres-#   (1, 1, 1, 1, 1, 1, 1, 2),
> postgres-#   (1, 1, 1, 1, 1, 1, 2, 2),
> postgres-#   (1, 1, 1, 1, 1, 2, 2, 2),
> postgres-#   (1, 1, 1, 1, 2, 2, 2, 2),
> postgres-#   (1, 1, 1, 2, 2, 2, 2, 2),
> postgres-#   (1, 1, 2, 2, 2, 2, 2, 2),
> postgres-#   (1, 2, 2, 2, 2, 2, 2, 2),
> postgres-#   (2, 2, 2, 2, 2, 2, 2, 2);
> INSERT 0 9
> postgres=# 
> postgres=# select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
> postgres-#   from gstest2 group by rollup (a,b) order by rsum, a, b;
>  a | b | sum | rsum 
> ---+---+-+--
>  1 | 1 |   8 |8
>  1 | 2 |   2 |   10
>  1 |   |  10 |   20
>  2 | 2 |   2 |   22
>  2 |   |   2 |   24
>|   |  12 |   36
> (6 rows)
> {code}
> {code:java}
> scala> sql("""
>  | select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
>  |   from gstest2 group by rollup (a,b) order by rsum, a, b
>  | """).show()
> +++--++   
>   
> |   a|   b|sum(c)|rsum|
> +++--++
> |null|null|12|  12|
> |   1|null|10|  22|
> |   1|   1| 8|  30|
> |   1|   2| 2|  32|
> |   2|null| 2|  34|
> |   2|   2| 2|  36|
> +++--++
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27082) Dynamic Allocation: we should consider the scenario that speculative task being killed and never resubmit

2020-01-22 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021284#comment-17021284
 ] 

Thomas Graves commented on SPARK-27082:
---

I ran into this case while testing. I think when you say killed here you really 
just mean that one of the 2 tasks finishes (succeeds) and the other task that was 
running is killed, correct? The dynamic allocation manager never removes it from 
the speculativeTaskIndices, and thus we always keep more executors than we actually 
need.

Would you be interested in putting up a PR with a fix?

> Dynamic Allocation: we should consider the scenario that speculative task 
> being killed and never resubmit
> -
>
> Key: SPARK-27082
> URL: https://issues.apache.org/jira/browse/SPARK-27082
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Zhen Fan
>Priority: Major
>  Labels: patch
>
> Issue background:
> When we enable dynamic allocation, we expect that the executors can be 
> removed appropriately, especially in some stages with data skew. With 
> speculation enabled, the copying task  can be killed by the original task and 
> vice versa. In TaskSetManager, we set successful(index)=true, and never 
> resubmit the killed tasks. However, in ExecutorAllocationManager which is 
> very related to the dynamic allocation function, doesn't handle this scenario.
> See SPARK-8366. However, (SPARK-8366) ignores one scenario that copying task 
> is being killed. When this happens, the TaskSetManager will mark the task 
> index of the stage as success and never resubmit the killed task, so here we 
> shouldn't treat it as pending task.
> This can do harm to the computing of  maxNumExecutorsNeeded, as a result, we 
> always retain unnecessary  executors and waste the computing resources of 
> clusters.
> Solution:
> When the task index is marked as speculative and the mirror task is 
> successful, we won't treat it as pending task. 
> Code has been tested.
> {code:java}
> private val stageIdToSpeculativeTaskIndices = new mutable.HashMap[Int, 
> mutable.HashMap[Int, Boolean]]
> ... 
> val speculativeTaskIndices = stageIdToSpeculativeTaskIndices.get(stageId)
> if (taskEnd.reason == Success) {
>   if (speculativeTaskIndices.isDefined && 
> speculativeTaskIndices.get.contains(taskIndex)) {
> speculativeTaskIndices.get(taskIndex) = true
>   }
> } else {
>   var resubmitTask = true
>   if (taskEnd.taskInfo.killed) {
> resubmitTask = !(speculativeTaskIndices.isDefined &&
> speculativeTaskIndices.get.getOrElse(taskIndex, false))
>   }
>   if (resubmitTask) {
> if (totalPendingTasks() == 0) {
>   allocationManager.onSchedulerBacklogged()
> }
> if (taskEnd.taskInfo.speculative) {
>   stageIdToSpeculativeTaskIndices.get(stageId).foreach 
> {_.remove(taskIndex)}
> } else {
>   stageIdToTaskIndices.get(stageId).foreach {_.remove(taskIndex)}
> }
>   }
> }{code}
>  Please take a look, Thanks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27784) Alias ID reuse can break correctness when substituting foldable expressions

2020-01-21 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020563#comment-17020563
 ] 

Thomas Graves commented on SPARK-27784:
---

[~rdblue] can you confirm this doesn't exist in master?

> Alias ID reuse can break correctness when substituting foldable expressions
> ---
>
> Key: SPARK-27784
> URL: https://issues.apache.org/jira/browse/SPARK-27784
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.1, 2.3.2
>Reporter: Ryan Blue
>Priority: Major
>  Labels: correctness
>
> This is a correctness bug when reusing a set of project expressions in the 
> DataFrame API.
> Use case: a user was migrating a table to a new version with an additional 
> column ("data" in the repro case). To migrate the user unions the old table 
> ("t2") with the new table ("t1"), and applies a common set of projections to 
> ensure the union doesn't hit an issue with ordering (SPARK-22335). In some 
> cases, this produces an incorrect query plan:
> {code:java}
> Seq((4, "a"), (5, "b"), (6, "c")).toDF("id", "data").write.saveAsTable("t1")
> Seq(1, 2, 3).toDF("id").write.saveAsTable("t2")
> val dim = Seq(2, 3, 4).toDF("id")
> val outputCols = Seq($"id", coalesce($"data", lit("_")).as("data"))
> val t1 = spark.table("t1").select(outputCols:_*)
> val t2 = spark.table("t2").withColumn("data", lit(null)).select(outputCols:_*)
> t1.join(dim, t1("id") === dim("id")).select(t1("id"), 
> t1("data")).union(t2).explain(true){code}
> {code:java}
> == Physical Plan ==
> Union
> :- *Project [id#330, _ AS data#237] <-- THE CONSTANT IS 
> FROM THE OTHER SIDE OF THE UNION
> : +- *BroadcastHashJoin [id#330], [id#234], Inner, BuildRight
> : :- *Project [id#330]
> : :  +- *Filter isnotnull(id#330)
> : : +- *FileScan parquet t1[id#330] Batched: true, Format: Parquet, 
> Location: CatalogFileIndex[s3://.../t1], PartitionFilters: [], PushedFilters: 
> [IsNotNull(id)], ReadSchema: struct
> : +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, 
> false] as bigint)))
> :+- LocalTableScan [id#234]
> +- *Project [id#340, _ AS data#237]
>+- *FileScan parquet t2[id#340] Batched: true, Format: Parquet, Location: 
> CatalogFileIndex[s3://.../t2], PartitionFilters: [], PushedFilters: [], 
> ReadSchema: struct{code}
> The problem happens because "outputCols" has an alias. The ID for that alias 
> is created when the projection Seq is created, so it is reused in both sides 
> of the union.
> When FoldablePropagation runs, it identifies that "data" in the t2 side of 
> the union is a foldable expression and replaces all references to it, 
> including the references in the t1 side of the union.
> The join to a dimension table is necessary to reproduce the problem because 
> it requires a Projection on top of the join that uses an AttributeReference 
> for data#237. Otherwise, the projections are collapsed and the projection 
> includes an Alias that does not get rewritten by FoldablePropagation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27282) Spark incorrect results when using UNION with GROUP BY clause

2020-01-21 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020560#comment-17020560
 ] 

Thomas Graves commented on SPARK-27282:
---

[~sofia] can you confirm this isn't fixed in the latest version of Spark?

> Spark incorrect results when using UNION with GROUP BY clause
> -
>
> Key: SPARK-27282
> URL: https://issues.apache.org/jira/browse/SPARK-27282
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Shell, Spark Submit, SQL
>Affects Versions: 2.3.2
> Environment: I'm using :
> IntelliJ  IDEA ==> 2018.1.4
> spark-sql and spark-core ==> 2.3.2.3.1.0.0-78 (for HDP 3.1)
> scala ==> 2.11.8
>Reporter: Sofia
>Priority: Major
>  Labels: correctness
>
> When using a UNION clause after a GROUP BY clause in Spark, the results 
> obtained are wrong.
> The following example explicit this issue:
> {code:java}
> CREATE TABLE test_un (
> col1 varchar(255),
> col2 varchar(255),
> col3 varchar(255),
> col4 varchar(255)
> );
> INSERT INTO test_un (col1, col2, col3, col4)
> VALUES (1,1,2,4),
> (1,1,2,4),
> (1,1,3,5),
> (2,2,2,null);
> {code}
> I used the following code:
> {code:java}
> val x = Toolkit.HiveToolkit.getDataFromHive("test","test_un")
> val  y = x
>.filter(col("col4")isNotNull)
>   .groupBy("col1", "col2","col3")
>   .agg(count(col("col3")).alias("cnt"))
>   .withColumn("col_name", lit("col3"))
>   .select(col("col1"), col("col2"), 
> col("col_name"),col("col3").alias("col_value"), col("cnt"))
> val z = x
>   .filter(col("col4")isNotNull)
>   .groupBy("col1", "col2","col4")
>   .agg(count(col("col4")).alias("cnt"))
>   .withColumn("col_name", lit("col4"))
>   .select(col("col1"), col("col2"), 
> col("col_name"),col("col4").alias("col_value"), col("cnt"))
> y.union(z).show()
> {code}
>  And I obtained the following results:
> ||col1||col2||col_name||col_value||cnt||
> |1|1|col3|5|1|
> |1|1|col3|4|2|
> |1|1|col4|5|1|
> |1|1|col4|4|2|
> Expected results:
> ||col1||col2||col_name||col_value||cnt||
> |1|1|col3|3|1|
> |1|1|col3|2|2|
> |1|1|col4|4|2|
> |1|1|col4|5|1|
> But when I remove the last row of the table, I obtain the correct results.
> {code:java}
> (2,2,2,null){code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29306) Executors need to track what ResourceProfile they are created with

2020-01-17 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-29306.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> Executors need to track what ResourceProfile they are created with 
> ---
>
> Key: SPARK-29306
> URL: https://issues.apache.org/jira/browse/SPARK-29306
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> For stage level scheduling, the Executors need to report what ResourceProfile 
> they are created with so that the ExecutorMonitor can track them and the 
> ExecutorAllocationManager can use that information to know how many to 
> request, etc.
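
A minimal sketch of the kind of tracking described above; the class and method names here are illustrative, not Spark's actual internals:

{code:scala}
import scala.collection.mutable

// Illustrative only: remember which ResourceProfile id each executor registered
// with, and expose per-profile counts so an allocation manager could compare
// them against its per-profile targets.
class ExecutorProfileTracker {
  private val executorToProfile = mutable.Map[String, Int]()

  def executorAdded(executorId: String, resourceProfileId: Int): Unit =
    executorToProfile(executorId) = resourceProfileId

  def executorRemoved(executorId: String): Unit =
    executorToProfile.remove(executorId)

  // Number of live executors per ResourceProfile id.
  def countsPerProfile: Map[Int, Int] =
    executorToProfile.groupBy(_._2).map { case (rpId, execs) => rpId -> execs.size }
}
{code}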



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30529) Improve error messages when Executor dies before registering with driver

2020-01-16 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30529:
-

 Summary: Improve error messages when Executor dies before 
registering with driver
 Key: SPARK-30529
 URL: https://issues.apache.org/jira/browse/SPARK-30529
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


Currently when you give a bad configuration for accelerator aware scheduling to 
the executor, the executors can die, but it's hard for the user to know why.  The 
executor dies and logs what went wrong in its log files, but many times it's hard 
to find those logs because the executor hasn't registered yet.  Since it hasn't 
registered, the executor doesn't show up in the UI, so the user can't get to its log files.

One specific example is giving a discovery script that doesn't find all 
the GPUs:

20/01/16 08:59:24 INFO YarnCoarseGrainedExecutorBackend: Connecting to driver: 
spark://CoarseGrainedScheduler@10.28.9.112:44403
20/01/16 08:59:24 ERROR Inbox: Ignoring error
java.lang.IllegalArgumentException: requirement failed: Resource: gpu, with 
addresses: 0 is less than what the user requested: 2)
 at scala.Predef$.require(Predef.scala:281)
 at 
org.apache.spark.resource.ResourceUtils$.$anonfun$assertAllResourceAllocationsMatchResourceProfile$1(ResourceUtils.scala:251)
 at 
org.apache.spark.resource.ResourceUtils$.$anonfun$assertAllResourceAllocationsMatchResourceProfile$1$adapted(ResourceUtils.scala:248)

 

We should figure out a better way of logging, or of letting the user know what error 
occurred, when the executor dies before registering.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-30448) accelerator aware scheduling enforce cores as limiting resource

2020-01-10 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-30448:
-

Assignee: Thomas Graves

> accelerator aware scheduling enforce cores as limiting resource
> ---
>
> Key: SPARK-30448
> URL: https://issues.apache.org/jira/browse/SPARK-30448
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> For the first version of accelerator aware scheduling (SPARK-27495), the SPIP 
> had a condition that we can support dynamic allocation, because we were going 
> to have a strict requirement that we don't waste any resources. This means 
> that the number of slots each executor has could be calculated from 
> the number of cores and task cpus, just as is done today.
> Somewhere along the line of development we relaxed that and only warn when we 
> are wasting resources. This breaks the dynamic allocation logic if the 
> limiting resource is no longer the cores.  This means we will request fewer 
> executors than we really need to run everything.
> We have to enforce that cores is always the limiting resource, so we should 
> throw if it's not.
> I guess we could only make this a requirement with dynamic allocation on, but 
> to make the behavior consistent I would say we just require it across the 
> board.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30448) accelerator aware scheduling enforce cores as limiting resource

2020-01-10 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-30448.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> accelerator aware scheduling enforce cores as limiting resource
> ---
>
> Key: SPARK-30448
> URL: https://issues.apache.org/jira/browse/SPARK-30448
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> For the first version of accelerator aware scheduling (SPARK-27495), the SPIP 
> had a condition that we can support dynamic allocation, because we were going 
> to have a strict requirement that we don't waste any resources. This means 
> that the number of slots each executor has could be calculated from 
> the number of cores and task cpus, just as is done today.
> Somewhere along the line of development we relaxed that and only warn when we 
> are wasting resources. This breaks the dynamic allocation logic if the 
> limiting resource is no longer the cores.  This means we will request fewer 
> executors than we really need to run everything.
> We have to enforce that cores is always the limiting resource, so we should 
> throw if it's not.
> I guess we could only make this a requirement with dynamic allocation on, but 
> to make the behavior consistent I would say we just require it across the 
> board.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30446) Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases

2020-01-08 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-30446.
---
Resolution: Duplicate

This is superseded by SPARK-30448.

> Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases
> --
>
> Key: SPARK-30446
> URL: https://issues.apache.org/jira/browse/SPARK-30446
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> With accelerator aware scheduling, SparkContext.checkResourcesPerTask tries 
> to make sure that users have configured things properly and warns or 
> errors if not.
> It doesn't properly handle all cases, like warning if cpu resources are being 
> wasted.  We should test this better and handle those cases. 
> I fixed these in the stage level scheduling work, but I'm not sure of the timeline on 
> getting that in, so we may want to fix this separately as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30049) SQL fails to parse when comment contains an unmatched quote character

2020-01-08 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010698#comment-17010698
 ] 

Thomas Graves commented on SPARK-30049:
---

That PR doesn't seem to be in yet, and I'm not sure of the context on that.  If it 
doesn't look like it is going to go in, we should fix this another way.

Do you know what part of that PR may have fixed it?

[~oleg_bonar] are you still working on this?

> SQL fails to parse when comment contains an unmatched quote character
> -
>
> Key: SPARK-30049
> URL: https://issues.apache.org/jira/browse/SPARK-30049
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Jason Darrell Lowe
>Priority: Major
> Attachments: Screen Shot 2019-12-18 at 9.26.29 AM.png
>
>
> A SQL statement that contains a comment with an unmatched quote character can 
> lead to a parse error.  These queries parsed correctly in older versions of 
> Spark.  For example, here's an excerpt from an interactive spark-sql session 
> on a recent Spark-3.0.0-SNAPSHOT build (commit 
> e23c135e568d4401a5659bc1b5ae8fc8bf147693):
> {noformat}
> spark-sql> SELECT 1 -- someone's comment here
>  > ;
> Error in query: 
> extraneous input ';' expecting (line 2, pos 0)
> == SQL ==
> SELECT 1 -- someone's comment here
> ;
> ^^^
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30295) Remove Hive dependencies from SparkSQLCLI

2020-01-08 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010697#comment-17010697
 ] 

Thomas Graves commented on SPARK-30295:
---

So what do you mean by unnecessary dependencies?  I assume that if you had to 
replace them with a Scala implementation they were necessary. Is this purely to try to 
remove more Hive dependencies?  Is it linked to any of the other Hive work?

> Remove Hive dependencies from SparkSQLCLI
> -
>
> Key: SPARK-30295
> URL: https://issues.apache.org/jira/browse/SPARK-30295
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Javier Fuentes
>Priority: Major
>
> Removal of unnecessary Hive dependencies from the Spark SQL client, replacing 
> them with a native Scala implementation for the client driver, argument 
> parser, and SparkSqlCliDriver.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30417) SPARK-29976 calculation of slots wrong for Standalone Mode

2020-01-07 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010146#comment-17010146
 ] 

Thomas Graves commented on SPARK-30417:
---

The only way for standalone mode would be to look at what each executor 
registers with. Theoretically, different executors could have different numbers 
of cores.  There are actually other issues (SPARK-30299 for instance) with this 
in the code as well that I think we need a global solution for.  So perhaps for 
this Jira we do the easy thing like I suggested, and then we have a 
separate Jira to look at handling this better in the future.

> SPARK-29976 calculation of slots wrong for Standalone Mode
> --
>
> Key: SPARK-30417
> URL: https://issues.apache.org/jira/browse/SPARK-30417
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> In SPARK-29976 we added a config to determine if we should allow speculation 
> when the number of tasks is less than the number of slots on a single 
> executor.  The problem is that for standalone mode (and mesos coarse 
> grained) the EXECUTOR_CORES config is not set properly by default. In those 
> modes the number of executor cores is all the cores of the Worker.  The 
> default of EXECUTOR_CORES is 1.
> The calculation:
> val speculationTasksLessEqToSlots = numTasks <= (conf.get(EXECUTOR_CORES) / sched.CPUS_PER_TASK)
> If someone sets the cpus per task > 1, then this would end up being false even 
> with 1 task.  Note that the default case, where cpus per task is 1 and executor 
> cores is 1, works out ok, but the check is only applied comparing 1 task vs the 
> number of slots on the executor.
> Here we really don't know the number of executor cores for standalone mode or 
> mesos, so I think a decent solution is to just use 1 in those cases and 
> document the difference.
> Something like 
> max(conf.get(EXECUTOR_CORES) / sched.CPUS_PER_TASK, 1)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30451) Stage level Sched: ExecutorResourceRequests/TaskResourceRequests should have functions to remove requests

2020-01-07 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30451:
-

 Summary: Stage level Sched: 
ExecutorResourceRequests/TaskResourceRequests should have functions to remove 
requests
 Key: SPARK-30451
 URL: https://issues.apache.org/jira/browse/SPARK-30451
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


Stage level Sched: ExecutorResourceRequests/TaskResourceRequests should have 
functions to remove requests

Currently in the design, ExecutorResourceRequests and TaskResourceRequests are 
mutable and users can update them as they want. It would make sense to add APIs to 
remove certain resource requirements from them. This would allow a user to 
create one ExecutorResourceRequests object and then, if they want to just 
add/remove something from it, they easily could without having to recreate all 
the requests in it.
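
A sketch of what such a removal API might look like; the class and method names below are hypothetical, not the final API:

{code:scala}
import scala.collection.mutable

// Hypothetical sketch of a mutable request container with remove support;
// "cores" and "gpu" are just example resource names.
class ExecutorResourceRequestsSketch {
  private val requests = mutable.Map[String, Long]()

  def resource(name: String, amount: Long): this.type = {
    requests(name) = amount; this
  }

  // The new piece this issue asks for: drop a single requirement
  // without rebuilding the whole object.
  def removeResource(name: String): this.type = {
    requests.remove(name); this
  }

  def current: Map[String, Long] = requests.toMap
}

object RemoveRequestExample extends App {
  val reqs = new ExecutorResourceRequestsSketch()
    .resource("cores", 4)
    .resource("gpu", 2)
  reqs.removeResource("gpu")  // reuse the same object for a GPU-free profile
  println(reqs.current)       // Map(cores -> 4)
}
{code}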



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30448) accelerator aware scheduling enforce cores as limiting resource

2020-01-07 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009989#comment-17009989
 ] 

Thomas Graves commented on SPARK-30448:
---

Note this actually overlaps with 
https://issues.apache.org/jira/browse/SPARK-30446 since with this change some 
of those checks don't make sense.

> accelerator aware scheduling enforce cores as limiting resource
> ---
>
> Key: SPARK-30448
> URL: https://issues.apache.org/jira/browse/SPARK-30448
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> For the first version of accelerator aware scheduling (SPARK-27495), the SPIP 
> had a condition that we can support dynamic allocation, because we were going 
> to have a strict requirement that we don't waste any resources. This means 
> that the number of slots each executor has could be calculated from 
> the number of cores and task cpus, just as is done today.
> Somewhere along the line of development we relaxed that and only warn when we 
> are wasting resources. This breaks the dynamic allocation logic if the 
> limiting resource is no longer the cores.  This means we will request fewer 
> executors than we really need to run everything.
> We have to enforce that cores is always the limiting resource, so we should 
> throw if it's not.
> I guess we could only make this a requirement with dynamic allocation on, but 
> to make the behavior consistent I would say we just require it across the 
> board.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30448) accelerator aware scheduling enforce cores as limiting resource

2020-01-07 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009901#comment-17009901
 ] 

Thomas Graves commented on SPARK-30448:
---

Note there are other calculations throughout the Spark code that compute the 
number of slots, so I think it's best for now to just require cores to be the 
limiting resource.

> accelerator aware scheduling enforce cores as limiting resource
> ---
>
> Key: SPARK-30448
> URL: https://issues.apache.org/jira/browse/SPARK-30448
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> For the first version of accelerator aware scheduling (SPARK-27495), the SPIP 
> had a condition that we can support dynamic allocation, because we were going 
> to have a strict requirement that we don't waste any resources. This means 
> that the number of slots each executor has could be calculated from 
> the number of cores and task cpus, just as is done today.
> Somewhere along the line of development we relaxed that and only warn when we 
> are wasting resources. This breaks the dynamic allocation logic if the 
> limiting resource is no longer the cores.  This means we will request fewer 
> executors than we really need to run everything.
> We have to enforce that cores is always the limiting resource, so we should 
> throw if it's not.
> I guess we could only make this a requirement with dynamic allocation on, but 
> to make the behavior consistent I would say we just require it across the 
> board.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30446) Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases

2020-01-07 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009896#comment-17009896
 ] 

Thomas Graves commented on SPARK-30446:
---

working on this

> Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases
> --
>
> Key: SPARK-30446
> URL: https://issues.apache.org/jira/browse/SPARK-30446
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> With accelerator aware scheduling, SparkContext.checkResourcesPerTask tries 
> to make sure that users have configured things properly and warns or 
> errors if not.
> It doesn't properly handle all cases, like warning if cpu resources are being 
> wasted.  We should test this better and handle those cases. 
> I fixed these in the stage level scheduling work, but I'm not sure of the timeline on 
> getting that in, so we may want to fix this separately as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30448) accelerator aware scheduling enforce cores as limiting resource

2020-01-07 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30448:
-

 Summary: accelerator aware scheduling enforce cores as limiting 
resource
 Key: SPARK-30448
 URL: https://issues.apache.org/jira/browse/SPARK-30448
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


For the first version of accelerator aware scheduling (SPARK-27495), the SPIP 
had a condition that we can support dynamic allocation, because we were going to 
have a strict requirement that we don't waste any resources. This means that 
the number of slots each executor has could be calculated from the 
number of cores and task cpus, just as is done today.

Somewhere along the line of development we relaxed that and only warn when we 
are wasting resources. This breaks the dynamic allocation logic if the limiting 
resource is no longer the cores.  This means we will request fewer executors 
than we really need to run everything.

We have to enforce that cores is always the limiting resource, so we should 
throw if it's not.

I guess we could only make this a requirement with dynamic allocation on, but 
to make the behavior consistent I would say we just require it across the board.
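
A minimal sketch of the kind of check being proposed; the values stand in for what would be read from the conf and are purely illustrative:

{code:scala}
val executorCores = 4      // spark.executor.cores
val taskCpus = 1           // spark.task.cpus
val executorGpus = 4       // spark.executor.resource.gpu.amount
val taskGpus = 1.0         // spark.task.resource.gpu.amount

val slotsByCores = executorCores / taskCpus           // 4
val slotsByGpus = (executorGpus / taskGpus).toInt     // 4

// If another resource allowed fewer slots than cores do, cores would no longer be
// the limiting resource and dynamic allocation would request the wrong number of
// executors; the proposal is to fail fast in that case.
require(slotsByCores <= slotsByGpus,
  s"cores allow $slotsByCores task slots but gpu only allows $slotsByGpus; " +
    "cores must remain the limiting resource")
{code}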



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30446) Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases

2020-01-07 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009830#comment-17009830
 ] 

Thomas Graves commented on SPARK-30446:
---

Yeah, so running on standalone, if you set spark.task.cpus=2 (or anything > 1) 
and you don't set executor cores, it fails even though it shouldn't, because 
executor cores are all the cores of the worker by default:

 

20/01/07 09:34:02 ERROR Main: Failed to initialize Spark session.
org.apache.spark.SparkException: The number of cores per executor (=1) has to 
be >= the task config: spark.task.cpus = 2 when run on spark://tomg-x299:7077.

> Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases
> --
>
> Key: SPARK-30446
> URL: https://issues.apache.org/jira/browse/SPARK-30446
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> With accelerator aware scheduling, SparkContext.checkResourcesPerTask tries 
> to make sure that users have configured things properly and warns or 
> errors if not.
> It doesn't properly handle all cases, like warning if cpu resources are being 
> wasted.  We should test this better and handle those cases. 
> I fixed these in the stage level scheduling work, but I'm not sure of the timeline on 
> getting that in, so we may want to fix this separately as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30446) Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases

2020-01-07 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009742#comment-17009742
 ] 

Thomas Graves commented on SPARK-30446:
---

I think there may also be issues in it with standalone mode since Executor 
cores isn't necessarily right, but I would have to test again to verify that.

> Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases
> --
>
> Key: SPARK-30446
> URL: https://issues.apache.org/jira/browse/SPARK-30446
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> With accelerator aware scheduling, SparkContext.checkResourcesPerTask tries 
> to make sure that users have configured things properly and warns or 
> errors if not.
> It doesn't properly handle all cases, like warning if cpu resources are being 
> wasted.  We should test this better and handle those cases. 
> I fixed these in the stage level scheduling work, but I'm not sure of the timeline on 
> getting that in, so we may want to fix this separately as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30446) Accelerator aware scheduling checkResourcesPerTask doesn't cover all cases

2020-01-07 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30446:
-

 Summary: Accelerator aware scheduling checkResourcesPerTask 
doesn't cover all cases
 Key: SPARK-30446
 URL: https://issues.apache.org/jira/browse/SPARK-30446
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


With accelerator aware scheduling, SparkContext.checkResourcesPerTask tries to 
make sure that users have configured things properly and warns or errors if not.

It doesn't properly handle all cases, like warning if cpu resources are being 
wasted.  We should test this better and handle those cases. 

I fixed these in the stage level scheduling work, but I'm not sure of the timeline on 
getting that in, so we may want to fix this separately as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30445) Accelerator aware scheduling handle setting configs to 0 better

2020-01-07 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30445:
-

 Summary: Accelerator aware scheduling handle setting configs to 0 
better
 Key: SPARK-30445
 URL: https://issues.apache.org/jira/browse/SPARK-30445
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


If you set the resource configs to 0, it errors with a divide by zero. While I 
think ideally the user should just remove the configs, we should handle the 0 
better.

 

$ spark-submit --conf spark.driver.resource.gpu.amount=0 *--conf spark.executor.resource.gpu.amount=0* *--conf spark.task.resource.gpu.amount=0* --conf spark.driver.resource.gpu.discoveryScript=/shared/tools/get_gpu_resources.sh --conf spark.executor.resource.gpu.discoveryScript=/shared/tools/get_gpu_resources.sh test.py
20/01/07 05:36:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/01/07 05:36:43 INFO SparkContext: *Running Spark version 3.0.0-preview*
20/01/07 05:36:43 INFO ResourceUtils: ==
20/01/07 05:36:43 INFO ResourceUtils: Resources for spark.driver:
*gpu -> [name: gpu, addresses: 0]*
20/01/07 05:36:43 INFO ResourceUtils: ==
20/01/07 05:36:43 INFO SparkContext: Submitted application: test.py
..
20/01/07 05:36:43 ERROR SparkContext: Error initializing SparkContext.
*java.lang.ArithmeticException: / by zero*
 at org.apache.spark.SparkContext$.$anonfun$createTaskScheduler$3(SparkContext.scala:2793)
 at org.apache.spark.SparkContext$.$anonfun$createTaskScheduler$3$adapted(SparkContext.scala:2775)
 at scala.collection.Iterator.foreach(Iterator.scala:941)
 at scala.collection.Iterator.foreach$(Iterator.scala:941)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
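
A sketch of the kind of up-front validation that would avoid the raw ArithmeticException; the config names are the ones from the report, the validation itself is illustrative:

{code:scala}
// Illustrative: reject a 0 amount explicitly instead of letting the slots
// calculation divide by zero later on.
def validateResourceAmount(confKey: String, amount: Long): Unit = {
  require(amount > 0,
    s"$confKey was set to $amount; resource amounts must be > 0. " +
      "Remove the config entirely if the resource is not needed.")
}

validateResourceAmount("spark.executor.resource.gpu.amount", 1)  // ok
// validateResourceAmount("spark.task.resource.gpu.amount", 0)   // fails fast with a clear message
{code}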



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30417) SPARK-29976 calculation of slots wrong for Standalone Mode

2020-01-03 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17007694#comment-17007694
 ] 

Thomas Graves commented on SPARK-30417:
---

[~yuchen.huo] is this something you could work on?

> SPARK-29976 calculation of slots wrong for Standalone Mode
> --
>
> Key: SPARK-30417
> URL: https://issues.apache.org/jira/browse/SPARK-30417
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> In SPARK-29976 we added a config to determine if we should allow speculation 
> when the number of tasks is less than the number of slots on a single 
> executor.  The problem is that for standalone mode (and mesos coarse 
> grained) the EXECUTOR_CORES config is not set properly by default. In those 
> modes the number of executor cores is all the cores of the Worker.  The 
> default of EXECUTOR_CORES is 1.
> The calculation:
> val speculationTasksLessEqToSlots = numTasks <= (conf.get(EXECUTOR_CORES) / sched.CPUS_PER_TASK)
> If someone sets the cpus per task > 1, then this would end up being false even 
> with 1 task.  Note that the default case, where cpus per task is 1 and executor 
> cores is 1, works out ok, but the check is only applied comparing 1 task vs the 
> number of slots on the executor.
> Here we really don't know the number of executor cores for standalone mode or 
> mesos, so I think a decent solution is to just use 1 in those cases and 
> document the difference.
> Something like 
> max(conf.get(EXECUTOR_CORES) / sched.CPUS_PER_TASK, 1)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30417) SPARK-29976 calculation of slots wrong for Standalone Mode

2020-01-03 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30417:
-

 Summary: SPARK-29976 calculation of slots wrong for Standalone Mode
 Key: SPARK-30417
 URL: https://issues.apache.org/jira/browse/SPARK-30417
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


In SPARK-29976 we added a config to determine if we should allow speculation 
when the number of tasks is less than the number of slots on a single executor. 
The problem is that for standalone mode (and mesos coarse grained) the 
EXECUTOR_CORES config is not set properly by default. In those modes the number 
of executor cores is all the cores of the Worker.  The default of 
EXECUTOR_CORES is 1.

The calculation:

val speculationTasksLessEqToSlots = numTasks <= (conf.get(EXECUTOR_CORES) / sched.CPUS_PER_TASK)

If someone sets the cpus per task > 1, then this would end up being false even with 
1 task.  Note that the default case, where cpus per task is 1 and executor cores 
is 1, works out ok, but the check is only applied comparing 1 task vs the number 
of slots on the executor.

Here we really don't know the number of executor cores for standalone mode or 
mesos, so I think a decent solution is to just use 1 in those cases and document 
the difference.

Something like 
max(conf.get(EXECUTOR_CORES) / sched.CPUS_PER_TASK, 1)

 
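
A small sketch of the calculation with the proposed max applied; the values stand in for conf lookups and are illustrative:

{code:scala}
// On standalone/mesos the EXECUTOR_CORES default of 1 isn't meaningful, so clamp
// the slots-per-executor to at least 1 before the comparison.
val executorCores = 1   // conf.get(EXECUTOR_CORES), the default
val cpusPerTask = 2     // spark.task.cpus
val numTasks = 1

val slotsPerExecutor = math.max(executorCores / cpusPerTask, 1)
val speculationTasksLessEqToSlots = numTasks <= slotsPerExecutor
println(speculationTasksLessEqToSlots)  // true; without the max the divisor result would be 0
{code}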



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2020-01-02 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Target Version/s: 3.0.0

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>  Labels: SPIP
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job. 
> Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
> extra Resources (GPU/FPGA/etc). It has the potential to allow for other 
> things like limiting the number of tasks per stage, specifying other 
> parameters for things like shuffle, etc. Initially I would propose we only 
> support resources as they are now. So Task resources would be cpu and other 
> resources (GPU, FPGA), that way we aren't adding in extra scheduling things 
> at this point.  Executor resources would be cpu, memory, and extra 
> resources(GPU,FPGA, etc). Changing the executor resources will rely on 
> dynamic allocation being enabled.
> Main use cases:
>  # ML use case where user does ETL and feeds it into an ML algorithm where 
> it’s using the RDD API. This should work with barrier scheduling as well once 
> it supports dynamic allocation.
>  # This adds the framework/api for Spark's own internal use.  In the future 
> (not covered by this SPIP), Catalyst could control the stage level resources 
> as it finds the need to change it between stages for different optimizations. 
> For instance, with the new columnar plugin to the query planner we can insert 
> stages into the plan that would change running something on the CPU in row 
> format to running it on the GPU in columnar format. This API would allow the 
> planner to make sure the stages that run on the GPU get the corresponding GPU 
> resources it needs to run. Another possible use case for catalyst is that it 
> would allow catalyst to add in more optimizations to where the user doesn’t 
> need to configure container sizes at all. If the optimizer/planner can handle 
> that for the user, everyone wins.
> This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I 
> think the DataSet API will require more changes because it specifically hides 
> the RDD from the users via the plans and catalyst can optimize the plan and 
> insert things into the plan. The only way I’ve found to make this work with 
> the Dataset API would be modifying all the plans to be able to get the 
> resource requirements down into where it creates the RDDs, which I believe 
> would be a lot of change.  If other people know better options, it would be 
> great to hear them.
> *Q2.* What problem is this proposal NOT designed to solve?
> The initial implementation is not going to add Dataset APIs.
> We are starting with allowing users to specify a specific set of 
> task/executor resources and plan to design it to be extendable, but the first 
> implementation will not support changing generic SparkConf configs and only 
> specific limited resources.
> This initial version will have a programmatic API for specifying the resource 
> requirements per stage, we can add the ability to perhaps have profiles in 
> the configs later if its useful.
> *Q3.* How is it done today, and what are the limits of current practice?
> Currently this is either done by having multiple spark jobs or requesting 
> containers with the max resources needed for any part of the job.  To do this 
> today, you can break it into separate jobs where each job requests the 
> corresponding resources needed, but then you have to write the data out 
> 

[jira] [Created] (SPARK-30391) ExecutorAllocationManager requestTotalExecutors in removeExecutors may not be needed

2019-12-30 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30391:
-

 Summary: ExecutorAllocationManager requestTotalExecutors in 
removeExecutors may not be needed
 Key: SPARK-30391
 URL: https://issues.apache.org/jira/browse/SPARK-30391
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.4.4, 3.0.0
Reporter: Thomas Graves


in the ExecutorAllocationManager.removeExecutors

[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L459]

 

there is a call to requestTotalExecutors that I don't think is needed anymore 
because SPARK-23365 changed the killExecutors call to not adjust the target 
number. We should investigate if we can remove it.

// [SPARK-21834] killExecutors api reduces the target number of executors.
// So we need to update the target with desired value.
client.requestTotalExecutors(numLocalityAwareTasksPerResourceProfileId.toMap,
  rpIdToHostToLocalTaskCount,
  numExecutorsTargetPerResourceProfileId.toMap)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30322) Add stage level scheduling docs

2019-12-20 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30322:
-

 Summary: Add stage level scheduling docs
 Key: SPARK-30322
 URL: https://issues.apache.org/jira/browse/SPARK-30322
 Project: Spark
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 3.0.0
Reporter: Thomas Graves






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30300) Update correct string in UI for metrics when driver updates same metrics id as tasks.

2019-12-20 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-30300.
---
Fix Version/s: 3.0.0
 Assignee: Niranjan Artal
   Resolution: Fixed

> Update correct string in UI for metrics when driver updates same metrics id 
> as tasks.
> -
>
> Key: SPARK-30300
> URL: https://issues.apache.org/jira/browse/SPARK-30300
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Web UI
>Affects Versions: 3.0.0
>Reporter: Niranjan Artal
>Assignee: Niranjan Artal
>Priority: Major
> Fix For: 3.0.0
>
>
> There is a bug in the display of the additional max metrics (stageId 
> (attemptId): taskId).
> If the driver updates the same metric that was updated by tasks and the 
> driver's value exceeds the max, then it is not captured. We need to capture 
> this case and update the UI accordingly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30299) Dynamic allocation with Standalone mode calculates too many executors needed

2019-12-18 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-30299:
-

 Summary: Dynamic allocation with Standalone mode calculates too 
many executors needed
 Key: SPARK-30299
 URL: https://issues.apache.org/jira/browse/SPARK-30299
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.4.4, 3.0.0
Reporter: Thomas Graves


While I was doing some changes in the executor allocation manager, I realized 
there is a bug with dynamic allocation in standalone mode. 

The issue is that if you run standalone mode with the default settings where 
the executor gets all the cores of the worker, spark core (allocation manager) 
doesn't know the number of cores per executor to be able to calculate how many 
tasks can fit on an executor.

It therefore defaults to using the default EXECUTOR_CORES, which is 1, and thus 
could calculate that it needs way more containers than it actually does.

For instance, I have a worker with 12 cores. That means by default when I start 
an executor on it, it gets 12 cores and can fit 12 tasks.  The allocation 
manager would use the default of 1 core per executor and say it needs 12 
executors when it only needs 1.

The fix for this isn't trivial since it would need to know how many cores each 
one has, and I assume it would also need to handle heterogeneous nodes.  I could 
start workers on nodes with different numbers of cores - one with 24 cores and 
one with 16 cores.  How do we estimate the number of executors in that case?  
We could just choose the min of the existing ones, or something like that, as an 
estimate and it would be closer, unless of course the next executor you got 
didn't actually have that many cores. A rough sketch of that estimate is below.
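
The sketch below uses the minimum core count seen across registered executors; it is purely illustrative:

{code:scala}
// If we don't know a fixed executor size, use the smallest core count seen among
// registered executors rather than the EXECUTOR_CORES default of 1.
def estimateExecutorsNeeded(pendingTasks: Int,
                            cpusPerTask: Int,
                            registeredExecutorCores: Seq[Int]): Int = {
  val coresPerExecutor =
    if (registeredExecutorCores.nonEmpty) registeredExecutorCores.min else 1
  val slotsPerExecutor = math.max(coresPerExecutor / cpusPerTask, 1)
  math.ceil(pendingTasks.toDouble / slotsPerExecutor).toInt
}

// Worker with 12 cores: 12 pending tasks fit on 1 executor, not 12.
println(estimateExecutorsNeeded(12, 1, Seq(12)))      // 1
// Heterogeneous workers (24 and 16 cores): the min (16) gives a closer estimate.
println(estimateExecutorsNeeded(48, 1, Seq(24, 16)))  // 3
{code}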



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-30209) Display stageId, attemptId, taskId with SQL max metric in UI

2019-12-16 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-30209:
-

Assignee: Niranjan Artal

> Display stageId, attemptId, taskId with SQL max metric in UI
> 
>
> Key: SPARK-30209
> URL: https://issues.apache.org/jira/browse/SPARK-30209
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Web UI
>Affects Versions: 3.0.0
>Reporter: Niranjan Artal
>Assignee: Niranjan Artal
>Priority: Major
> Fix For: 3.0.0
>
>
> It would be helpful if we could add the stageId, stage attemptId, and taskId 
> in the SQL UI for each of the max metric values.  These additional metrics help 
> in debugging jobs quicker.  For a given operator, it will be easy to 
> identify from the Spark UI the task which is taking the maximum time to complete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-30209) Display stageId, attemptId, taskId with SQL max metric in UI

2019-12-16 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-30209.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> Display stageId, attemptId, taskId with SQL max metric in UI
> 
>
> Key: SPARK-30209
> URL: https://issues.apache.org/jira/browse/SPARK-30209
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Web UI
>Affects Versions: 3.0.0
>Reporter: Niranjan Artal
>Priority: Major
> Fix For: 3.0.0
>
>
> It would be helpful if we could add the stageId, stage attemptId, and taskId 
> in the SQL UI for each of the max metric values.  These additional metrics help 
> in debugging jobs quicker.  For a given operator, it will be easy to 
> identify from the Spark UI the task which is taking the maximum time to complete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29976) Allow speculation even if there is only one task

2019-12-10 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-29976.
---
Fix Version/s: 3.0.0
 Assignee: Yuchen Huo
   Resolution: Fixed

> Allow speculation even if there is only one task
> 
>
> Key: SPARK-29976
> URL: https://issues.apache.org/jira/browse/SPARK-29976
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Yuchen Huo
>Assignee: Yuchen Huo
>Priority: Major
> Fix For: 3.0.0
>
>
> In the current speculative execution implementation, if there is only one task 
> in the stage then no speculative run would be conducted. However, there might 
> be cases where an executor has some problem writing to its disk and just 
> hangs forever. In this case, if the single-task stage gets assigned to the 
> problematic executor then the whole job would hang forever. It would be 
> better if we could run the task on another executor if this happens. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18886) Delay scheduling should not delay some executors indefinitely if one task is scheduled before delay timeout

2019-12-03 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-18886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987037#comment-16987037
 ] 

Thomas Graves commented on SPARK-18886:
---

Note there is discussion on this subject on prs:

[https://github.com/apache/spark/pull/26633] (hack to work around it for a 
particular RDD)

PR with a proposed solution - though it is really more a discussion of the solution:

[https://github.com/apache/spark/pull/26696]

 

My proposal, I believe, is similar to Kay's, where we use slots and track the 
delay per slot.  I haven't looked at the code in specific detail, especially in 
the FairScheduler, where most of the issues in the conversations above were 
mentioned.  One way around this is to have different policies and allow users 
to configure them, or to have one for the FairScheduler and one for FIFO.

> Delay scheduling should not delay some executors indefinitely if one task is 
> scheduled before delay timeout
> ---
>
> Key: SPARK-18886
> URL: https://issues.apache.org/jira/browse/SPARK-18886
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.1.0
>Reporter: Imran Rashid
>Priority: Major
>
> Delay scheduling can introduce an unbounded delay and underutilization of 
> cluster resources under the following circumstances:
> 1. Tasks have locality preferences for a subset of available resources
> 2. Tasks finish in less time than the delay scheduling.
> Instead of having *one* delay to wait for resources with better locality, 
> spark waits indefinitely.
> As an example, consider a cluster with 100 executors, and a taskset with 500 
> tasks.  Say all tasks have a preference for one executor, which is by itself 
> on one host.  Given the default locality wait of 3s per level, we end up with 
> a 6s delay till we schedule on other hosts (process wait + host wait).
> If each task takes 5 seconds (under the 6 second delay), then _all 500_ tasks 
> get scheduled on _only one_ executor.  This means you're only using a 1% of 
> your cluster, and you get a ~100x slowdown.  You'd actually be better off if 
> tasks took 7 seconds.
> *WORKAROUNDS*: 
> (1) You can change the locality wait times so that it is shorter than the 
> task execution time.  You need to take into account the sum of all wait times 
> to use all the resources on your cluster.  For example, if you have resources 
> on different racks, this will include the sum of 
> "spark.locality.wait.process" + "spark.locality.wait.node" + 
> "spark.locality.wait.rack".  Those each default to "3s".  The simplest way to 
> be to set "spark.locality.wait.process" to your desired wait interval, and 
> set both "spark.locality.wait.node" and "spark.locality.wait.rack" to "0".  
> For example, if your tasks take ~3 seconds on average, you might set 
> "spark.locality.wait.process" to "1s".  *NOTE*: due to SPARK-18967, avoid 
> setting the {{spark.locality.wait=0}} -- instead, use 
> {{spark.locality.wait=1ms}}.
> Note that this workaround isn't perfect --with less delay scheduling, you may 
> not get as good resource locality.  After this issue is fixed, you'd most 
> likely want to undo these configuration changes.
> (2) The worst case here will only happen if your tasks have extreme skew in 
> their locality preferences.  Users may be able to modify their job to 
> control the distribution of the original input data.
> (2a) A shuffle may end up with very skewed locality preferences, especially 
> if you do a repartition starting from a small number of partitions.  (Shuffle 
> locality preference is assigned if any node has more than 20% of the shuffle 
> input data -- by chance, you may have one node just above that threshold, and 
> all other nodes just below it.)  In this case, you can turn off locality 
> preference for shuffle data by setting 
> {{spark.shuffle.reduceLocality.enabled=false}}
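
A small sketch of workaround (1) expressed as SparkConf settings; the values are just examples for tasks that run around 3 seconds:

{code:scala}
import org.apache.spark.SparkConf

// Shorten the per-level locality waits so a single preferred executor can't
// monopolize a short-task stage (see workaround (1) above).
val conf = new SparkConf()
  .set("spark.locality.wait.process", "1s")
  .set("spark.locality.wait.node", "0")
  .set("spark.locality.wait.rack", "0")
{code}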



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29415) Stage Level Sched: Add base ResourceProfile and Request classes

2019-11-25 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-29415.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> Stage Level Sched: Add base ResourceProfile and Request classes
> ---
>
> Key: SPARK-29415
> URL: https://issues.apache.org/jira/browse/SPARK-29415
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
> Fix For: 3.0.0
>
>
> This is just to add the initial ResourceProfile, ExecutorResourceRequest and 
> TaskResourceRequest classes that are used by the other parts of the code.
> Initially we will have them private until we have the other pieces in place.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-29149) Update YARN cluster manager For Stage Level Scheduling

2019-11-13 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-29149:
-

Assignee: Thomas Graves

> Update YARN cluster manager For Stage Level Scheduling
> --
>
> Key: SPARK-29149
> URL: https://issues.apache.org/jira/browse/SPARK-29149
> Project: Spark
>  Issue Type: Story
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> For Stage Level Scheduling, we need to update the YARN allocator to handle 
> requesting executors for multiple ResourceProfiles.
>  * The container requests have to be updated to be based on a number of 
> containers per ResourceProfile.  This is a larger change than you might 
> expect because on YARN you can’t ask for different container sizes within the 
> same YARN container priority.  So we will have to ask for containers at 
> different priorities to be able to get different container sizes. Other YARN 
> applications like Tez handle this now so it shouldn’t be a big deal just a 
> matter of mapping stages to different priorities.
>  * The allocation response from YARN has to match the containers to a 
> resource profile.
>  * We need to launch the container with additional parameters so the executor 
> knows its resource profile to report back to the ExecutorMonitor.  
>  
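
A tiny sketch of the priority-per-ResourceProfile mapping mentioned in the first bullet above; this is purely illustrative and not the YarnAllocator code:

{code:scala}
import scala.collection.mutable

// YARN can't mix container sizes within one priority, so give each ResourceProfile
// id its own priority and keep the mapping so allocated containers can be matched
// back to the profile that requested them.
class ProfilePriorityMap {
  private val rpIdToPriority = mutable.Map[Int, Int]()
  private var nextPriority = 1

  def priorityFor(rpId: Int): Int =
    rpIdToPriority.getOrElseUpdate(rpId, { val p = nextPriority; nextPriority += 1; p })

  def profileFor(priority: Int): Option[Int] =
    rpIdToPriority.collectFirst { case (rpId, p) if p == priority => rpId }
}
{code}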



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29762) GPU Scheduling - default task resource amount to 1

2019-11-12 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972755#comment-16972755
 ] 

Thomas Graves commented on SPARK-29762:
---

But like you mentioned, if we were to support something like that, then the 
default of 1 when the task requirement wasn't specified would be an issue.  

> GPU Scheduling - default task resource amount to 1
> --
>
> Key: SPARK-29762
> URL: https://issues.apache.org/jira/browse/SPARK-29762
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Default the task level resource configs (for gpu/fpga, etc) to 1.  So if the 
> user specifies the executor resource, then to make it more user friendly let's 
> have the task resource config default to 1.  This is ok right now since we 
> require resources to have an address.  It also matches what we do for the 
> spark.task.cpus configs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29762) GPU Scheduling - default task resource amount to 1

2019-11-12 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972750#comment-16972750
 ] 

Thomas Graves commented on SPARK-29762:
---

Ok, in the scenario above you would need 2 different ResourceProfiles, and they 
would not use the same containers.  One profile would need executors with only 
cpus; the other profile would need executors with cpus and gpus.   At least that 
is how it would be in this initial stage level scheduling proposal. Anything 
more complex could come later.  It does bring up a good point that my base PR 
doesn't have the checks to ensure that, though, but that is separate.

> GPU Scheduling - default task resource amount to 1
> --
>
> Key: SPARK-29762
> URL: https://issues.apache.org/jira/browse/SPARK-29762
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Default the task level resource configs (for gpu/fpga, etc) to 1.  So if the 
> user specifies the executor resource, then to make it more user friendly let's 
> have the task resource config default to 1.  This is ok right now since we 
> require resources to have an address.  It also matches what we do for the 
> spark.task.cpus configs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29762) GPU Scheduling - default task resource amount to 1

2019-11-12 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972661#comment-16972661
 ] 

Thomas Graves commented on SPARK-29762:
---

The code that gets the task requirements is generic and is reused by more than 
just tasks. Also, there is code that loops over the task requirements and assumes 
all task requirements are there, but you can't do that if you are relying on a 
default value, because they won't be present.  At some point you either have to 
look at all the executor requirements and then build up the task requirements that 
include the default ones, or you just have to change those loops to be over the 
executor requirements.

I'm not saying it's not possible; my comment was just saying it isn't as 
straightforward as I originally thought. 

I don't think you have the issue you are talking about, because you have to have 
the task requirements be based on the executor resource requests (which is part 
of the complication).  As long as you do that with the stage level scheduling I 
think it's ok.   For example: the ResourceProfile executor request = 1 cpu, 1 GPU; 
the user-defined task request = null; then translate that into a task request of 
1 cpu, 1 GPU. 

 

 

> GPU Scheduling - default task resource amount to 1
> --
>
> Key: SPARK-29762
> URL: https://issues.apache.org/jira/browse/SPARK-29762
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Default the task level resource configs (for gpu/fpga, etc) to 1.  So if the 
> user specifies the executor resource, then to make it more user friendly let's 
> have the task resource config default to 1.  This is ok right now since we 
> require resources to have an address.  It also matches what we do for the 
> spark.task.cpus configs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29762) GPU Scheduling - default task resource amount to 1

2019-11-11 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971798#comment-16971798
 ] 

Thomas Graves commented on SPARK-29762:
---

This is actually more complex than you might think because the resource configs
are just configs. For instance, you have spark.executor.resource.gpu.amount, and
the corresponding task config would be spark.task.resource.gpu.amount, where gpu
could be any resource. The way the code is written now, it just grabs all the
resources and iterates over them in various places, assuming you have specified
a task requirement for each executor resource.

If you remove that assumption you now have to be careful about what you are
iterating over; really you have to use the resources from the executor configs,
not the task configs. But you still have to read the task configs, and if a
resource isn't there then default it to 1.
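
For example, a rough sketch of deriving the task amounts from the executor
configs rather than the task configs (illustrative only, not the actual Spark code):

{code:scala}
// The resource names come from the executor configs; any missing task amount defaults to 1.
def defaultTaskResourceAmounts(confs: Map[String, String]): Map[String, Double] = {
  val ExecutorAmount = """spark\.executor\.resource\.(\w+)\.amount""".r
  val resourceNames = confs.keys.collect { case ExecutorAmount(name) => name }
  resourceNames.map { name =>
    name -> confs.get(s"spark.task.resource.$name.amount").map(_.toDouble).getOrElse(1.0)
  }.toMap
}

val confs = Map(
  "spark.executor.resource.gpu.amount"  -> "2",
  "spark.executor.resource.fpga.amount" -> "1",
  "spark.task.resource.gpu.amount"      -> "1"   // no task amount set for fpga
)
defaultTaskResourceAmounts(confs) // Map(gpu -> 1.0, fpga -> 1.0)
{code}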

> GPU Scheduling - default task resource amount to 1
> --
>
> Key: SPARK-29762
> URL: https://issues.apache.org/jira/browse/SPARK-29762
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> Default the task level resource configs (for gpu/fpga, etc) to 1.  So if the 
> user specifies the executor resource then to make it more user friendly lets 
> have the task resource config default to 1.  This is ok right now since we 
> require resources to have an address.  It also matches what we do for the 
> spark.task.cpus configs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-29763) Stage UI Page not showing all accumulators in Task Table

2019-11-05 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-29763:
-

Assignee: Thomas Graves

> Stage UI Page not showing all accumulators in Task Table
> 
>
> Key: SPARK-29763
> URL: https://issues.apache.org/jira/browse/SPARK-29763
> Project: Spark
>  Issue Type: Story
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> In the Stage specific ui page, the Task table doesn't properly show all 
> accumulators. Its only showing the last one.
> We need to fix the javascript to show all of them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29763) Stage UI Page not showing all accumulators in Task Table

2019-11-05 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29763:
-

 Summary: Stage UI Page not showing all accumulators in Task Table
 Key: SPARK-29763
 URL: https://issues.apache.org/jira/browse/SPARK-29763
 Project: Spark
  Issue Type: Story
  Components: Web UI
Affects Versions: 3.0.0
Reporter: Thomas Graves


In the stage-specific UI page, the Task table doesn't properly show all
accumulators; it's only showing the last one.

We need to fix the JavaScript to show all of them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-29151) Support fraction resources for task resource scheduling

2019-11-05 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-29151.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> Support fraction resources for task resource scheduling
> ---
>
> Key: SPARK-29151
> URL: https://issues.apache.org/jira/browse/SPARK-29151
> Project: Spark
>  Issue Type: Story
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Alessandro Bellina
>Priority: Major
> Fix For: 3.0.0
>
>
> The current resource scheduling code for GPU/FPGA, etc only supports amounts 
> as integers, so you can only schedule whole resources.  There are cases where 
> you may want to share the resources and schedule multiple tasks to run on the 
> same resources (GPU).  It would be nice to support fractional resources for 
> the task level settings.  Somehow say we want a task to have 1/4 of a GPU for 
> instance.  I think we only want to support fractional when the resources 
> amount is < 1.  Otherwise you run into issues where someone asks for 2 1/8 
> GPU, which doesn't really make sense to me and makes assigning addresses very 
> complicated.
> Need to think about implementation details, for instance using a float can be 
> troublesome here due to floating point math precision issues.
> Another thing to consider, depending on implementation is limiting the 
> precision - go down to tenths, hundreths, thousandths, etc.
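
As a minimal sketch of the "amount < 1 means several tasks share one resource"
idea quoted above (illustrative only; the actual implementation may differ), the
fractional amount can be converted to an integer slot count up front so later
bookkeeping avoids repeated floating point arithmetic:

{code:scala}
def taskSlotsPerResource(amount: Double): Int = {
  require(amount > 0, s"resource amount must be > 0, got $amount")
  if (amount >= 1.0) {
    require(amount == math.floor(amount), s"fractional amounts above 1 are not allowed: $amount")
    1 // each task needs `amount` whole resources, so only one task per resource set
  } else {
    // the small epsilon guards against divisions that land just below a whole number
    math.floor((1.0 / amount) + 1e-9).toInt
  }
}

taskSlotsPerResource(0.25) // 4 tasks can share one GPU address
taskSlotsPerResource(2.0)  // 1 task, needing 2 whole GPUs
{code}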



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-29151) Support fraction resources for task resource scheduling

2019-11-05 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-29151:
-

Assignee: Alessandro Bellina

> Support fraction resources for task resource scheduling
> ---
>
> Key: SPARK-29151
> URL: https://issues.apache.org/jira/browse/SPARK-29151
> Project: Spark
>  Issue Type: Story
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Alessandro Bellina
>Priority: Major
>
> The current resource scheduling code for GPU/FPGA, etc only supports amounts 
> as integers, so you can only schedule whole resources.  There are cases where 
> you may want to share the resources and schedule multiple tasks to run on the 
> same resources (GPU).  It would be nice to support fractional resources for 
> the task level settings.  Somehow say we want a task to have 1/4 of a GPU for 
> instance.  I think we only want to support fractional when the resources 
> amount is < 1.  Otherwise you run into issues where someone asks for 2 1/8 
> GPU, which doesn't really make sense to me and makes assigning addresses very 
> complicated.
> Need to think about implementation details, for instance using a float can be 
> troublesome here due to floating point math precision issues.
> Another thing to consider, depending on implementation is limiting the 
> precision - go down to tenths, hundreths, thousandths, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29762) GPU Scheduling - default task resource amount to 1

2019-11-05 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29762:
-

 Summary: GPU Scheduling - default task resource amount to 1
 Key: SPARK-29762
 URL: https://issues.apache.org/jira/browse/SPARK-29762
 Project: Spark
  Issue Type: Story
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


Default the task level resource configs (for gpu/fpga, etc.) to 1. So if the
user specifies the executor resource, then to make it more user friendly let's
have the task resource config default to 1. This is ok right now since we
require resources to have an address. It also matches what we do for the
spark.task.cpus config.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-29415) Stage Level Sched: Add base ResourceProfile and Request classes

2019-10-31 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964360#comment-16964360
 ] 

Thomas Graves edited comment on SPARK-29415 at 10/31/19 8:03 PM:
-

More details:

TaskResourceRequest - this supports taking a resourceName and an amount (Double,
for fractional resources). It only supports cpus (spark.task.cpus) and
accelerator resource types (spark.*.resource.[resourceName].*), so the user can
specify cpus and resources like GPUs and FPGAs. The accelerator type resources
match what we already have for configs in the accelerator aware scheduling.
[https://github.com/apache/spark/blob/master/docs/configuration.md#custom-resource-scheduling-and-configuration-overview]

ExecutorResourceRequest - this supports specifying the requirements for the
executors. It supports all the configs needed for accelerator aware scheduling -
spark.\{executor/driver}.resource.\{resourceName}.\{amount, discoveryScript,
vendor} - as well as heap memory, overhead memory, pyspark memory, and cores. In
order to support memory types we added a "units" parameter to
ExecutorResourceRequest. The other parameters (resourceName, vendor,
discoveryScript, amount) all match the accelerator aware scheduling parameters.

ResourceProfile - this class takes in the executor and task requirements and
holds them to be used by other components. For instance, we have to pass the
executor resources to the cluster managers so they can ask for the proper
containers. The requests also have to be passed to the executors when launched
so they use the correct discovery script. The task requirements are used by the
scheduler to assign tasks to proper containers. We also have a ResourceProfile
object that has an accessor to get the default ResourceProfile. This is the
profile generated from the configs the user passes in when the Spark application
is submitted, so it will have the --executor-cores, memory, overhead memory,
pyspark memory, and accelerator resources the user specified via --conf or a
properties file on submit. The default profile will be used in a lot of places
since the user may never specify another ResourceProfile and wants an easy way
to access it.


was (Author: tgraves):
More details:

TaskResourceRequest - this supports taking a resourceName and an amount (Double 
for fractional resources).  It only supports cpus (spark.task.cpus) and 
accelerator resource types (spark.*.resource.[resourceName].*.  So user can 
specify cpus and resources like GPU's and FPGAS. The accerator type resources 
match what we already have for configs in the acceralator aware scheduling. 
[https://github.com/apache/spark/blob/master/docs/configuration.md#custom-resource-scheduling-and-configuration-overview]

 

ExecutorResourceRequest - this supports specifying the requirements for the 
executors.  It supports all the configs needed for accelerator aware scheduling 
- {{spark.\{executor/driver}.resource.\{resourceName}.\{amount, 
discoveryScript, vendor} as well as heap memory, overhead memory, pyspark 
memory, and cores. In order to support memory types we added a "units" 
parameter into ExecutorResourceRequest.  the other parameters resourceName, 
vendor, discoveryScript, amount all match the accelerator aware scheduling 
parameters.}}

 

ResourceProfile - this class takes in the executor and task requirement and 
holds them to be used by other components.  For instance, we have to pass the 
executor resources into the cluster managers so it can ask for the proper 
containers.  The requests have to also be passed into the executors when 
launched so they use the correct discovery Script.  The task requirements are 
used by the scheduler to assign tasks to proper containers. {{  We also have a 
ResourceProfile object that has an accessor to get the default ResourceProfile. 
This is the profile generated from the configs the user passes in when the 
spark application is submitted.  So it will have --executor-cores, memory, 
overhead memory, pyspark memory, accelerator resources the user all specified 
via --confs or properties file on submit.}}

> Stage Level Sched: Add base ResourceProfile and Request classes
> ---
>
> Key: SPARK-29415
> URL: https://issues.apache.org/jira/browse/SPARK-29415
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> this is just to add initial ResourceProfile, ExecutorResourceRequest and 
> taskResourceRequest classes that are used by the other parts of the code.
> Initially we will have them private until we have other pieces in place.



--
This message was sent by Atlassian Jira

[jira] [Comment Edited] (SPARK-29415) Stage Level Sched: Add base ResourceProfile and Request classes

2019-10-31 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964360#comment-16964360
 ] 

Thomas Graves edited comment on SPARK-29415 at 10/31/19 7:59 PM:
-

More details:

TaskResourceRequest - this supports taking a resourceName and an amount (Double 
for fractional resources).  It only supports cpus (spark.task.cpus) and 
accelerator resource types (spark.*.resource.[resourceName].*.  So user can 
specify cpus and resources like GPU's and FPGAS. The accerator type resources 
match what we already have for configs in the acceralator aware scheduling. 
[https://github.com/apache/spark/blob/master/docs/configuration.md#custom-resource-scheduling-and-configuration-overview]

 

ExecutorResourceRequest - this supports specifying the requirements for the 
executors.  It supports all the configs needed for accelerator aware scheduling 
- {{spark.\{executor/driver}.resource.\{resourceName}.\{amount, 
discoveryScript, vendor} as well as heap memory, overhead memory, pyspark 
memory, and cores. In order to support memory types we added a "units" 
parameter into ExecutorResourceRequest.  the other parameters resourceName, 
vendor, discoveryScript, amount all match the accelerator aware scheduling 
parameters.}}

 

ResourceProfile - this class takes in the executor and task requirement and 
holds them to be used by other components.  For instance, we have to pass the 
executor resources into the cluster managers so it can ask for the proper 
containers.  The requests have to also be passed into the executors when 
launched so they use the correct discovery Script.  The task requirements are 
used by the scheduler to assign tasks to proper containers. {{  We also have a 
ResourceProfile object that has an accessor to get the default ResourceProfile. 
This is the profile generated from the configs the user passes in when the 
spark application is submitted.  So it will have --executor-cores, memory, 
overhead memory, pyspark memory, accelerator resources the user all specified 
via --confs or properties file on submit.}}


was (Author: tgraves):
More details:

TaskResourceRequest - this supports taking a resourceName and an amount (Double 
for fractional resources).  It only supports cpus (spark.task.cpus) and 
accelerator resource types (spark.*.resource.[resourceName].*.  So user can 
specify cpus and resources like GPU's and FPGAS. The accerator type resources 
match what we already have for configs in the acceralator aware scheduling. 
[https://github.com/apache/spark/blob/master/docs/configuration.md#custom-resource-scheduling-and-configuration-overview]

 

ExecutorResourceRequest - this supports specifying the requirements for the 
executors.  It supports all the configs needed for accelerator aware scheduling 
- {{spark.\{executor/driver}.resource.\{resourceName}.\{amount, 
discoveryScript, vendor} as well as heap memory, overhead memory, pyspark 
memory, and cores. In order to support memory types we added a "units" 
parameter into ExecutorResourceRequest.  the other parameters resourceName, 
vendor, discoveryScript, amount all match the accelerator aware scheduling 
parameters.}}

{{}}

ResourceProfile - this class takes in the executor and task requirement and 
holds them to be used by other components.  For instance, we have to pass the 
executor resources into the cluster managers so it can ask for the proper 
containers.  The requests have to also be passed into the executors when 
launched so they use the correct discovery Script.  The task requirements are 
used by the scheduler to assign tasks to proper containers. {{  We also have a 
ResourceProfile object that has an accessor to get the default ResourceProfile. 
This is the profile generated from the configs the user passes in when the 
spark application is submitted.  So it will have --executor-cores, memory, 
overhead memory, pyspark memory, accelerator resources the user all specified 
via --confs or properties file on submit.}}

> Stage Level Sched: Add base ResourceProfile and Request classes
> ---
>
> Key: SPARK-29415
> URL: https://issues.apache.org/jira/browse/SPARK-29415
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> this is just to add initial ResourceProfile, ExecutorResourceRequest and 
> taskResourceRequest classes that are used by the other parts of the code.
> Initially we will have them private until we have other pieces in place.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For 

[jira] [Commented] (SPARK-29415) Stage Level Sched: Add base ResourceProfile and Request classes

2019-10-31 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964360#comment-16964360
 ] 

Thomas Graves commented on SPARK-29415:
---

More details:

TaskResourceRequest - this supports taking a resourceName and an amount (Double 
for fractional resources).  It only supports cpus (spark.task.cpus) and 
accelerator resource types (spark.*.resource.[resourceName].*.  So user can 
specify cpus and resources like GPU's and FPGAS. The accerator type resources 
match what we already have for configs in the acceralator aware scheduling. 
[https://github.com/apache/spark/blob/master/docs/configuration.md#custom-resource-scheduling-and-configuration-overview]

 

ExecutorResourceRequest - this supports specifying the requirements for the 
executors.  It supports all the configs needed for accelerator aware scheduling 
- {{spark.\{executor/driver}.resource.\{resourceName}.\{amount, 
discoveryScript, vendor} as well as heap memory, overhead memory, pyspark 
memory, and cores. In order to support memory types we added a "units" 
parameter into ExecutorResourceRequest.  the other parameters resourceName, 
vendor, discoveryScript, amount all match the accelerator aware scheduling 
parameters.}}

{{}}

ResourceProfile - this class takes in the executor and task requirement and 
holds them to be used by other components.  For instance, we have to pass the 
executor resources into the cluster managers so it can ask for the proper 
containers.  The requests have to also be passed into the executors when 
launched so they use the correct discovery Script.  The task requirements are 
used by the scheduler to assign tasks to proper containers. {{  We also have a 
ResourceProfile object that has an accessor to get the default ResourceProfile. 
This is the profile generated from the configs the user passes in when the 
spark application is submitted.  So it will have --executor-cores, memory, 
overhead memory, pyspark memory, accelerator resources the user all specified 
via --confs or properties file on submit.}}

> Stage Level Sched: Add base ResourceProfile and Request classes
> ---
>
> Key: SPARK-29415
> URL: https://issues.apache.org/jira/browse/SPARK-29415
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> this is just to add initial ResourceProfile, ExecutorResourceRequest and 
> taskResourceRequest classes that are used by the other parts of the code.
> Initially we will have them private until we have other pieces in place.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29641) Stage Level Sched: Add python api's and test

2019-10-29 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29641:
-

 Summary: Stage Level Sched: Add python api's and test
 Key: SPARK-29641
 URL: https://issues.apache.org/jira/browse/SPARK-29641
 Project: Spark
  Issue Type: Story
  Components: PySpark
Affects Versions: 3.0.0
Reporter: Thomas Graves


For the Stage Level scheduling feature, add any missing python api and test it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29415) Stage Level Sched: Add base ResourceProfile and Request classes

2019-10-23 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957974#comment-16957974
 ] 

Thomas Graves commented on SPARK-29415:
---

From a high level design point, these are the base classes needed for the other
jiras/components to be implemented. You can see the design doc attached to
SPARK-27495 for the entire overview, but for this jira specifically this is what
we are looking to add. These will start out private until we have the other
parts implemented, and then be made public in case this isn't fully implemented
for a release.

 

ResourceProfile:

The user will have to build up a _ResourceProfile_ to pass into an RDD 
withResources call. This profile will have a limited set of resources the user 
is allowed to specify. It will allow both task and executor resources. It will 
be a builder type interface where the main function called will be 
_ResourceProfile.require._  Adding the ResourceProfile API class leaves it open 
to do more advanced things in the future. For instance, perhaps you want a 
_ResourceProfile.prefer_ option where it would run on a node with some 
resources if available but then fall back if they aren’t.   The config names 
supported correspond to the regular spark configs with the prefix removed. For 
instance overhead memory in this api is memoryOverhead, which is 
spark.executor.memoryOverhead with the spark.executor removed.  Resources like 
GPUs are resource.gpu (spark configs spark.executor.resource.gpu.*).
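
As a small illustration of that naming convention (a sketch only, not part of
the proposed API):

{code:scala}
// Profile-level keys are the regular executor configs with the "spark.executor."
// prefix removed, so translating back is just re-adding the prefix.
def toExecutorConfKey(profileKey: String): String = s"spark.executor.$profileKey"

toExecutorConfKey("memoryOverhead")      // spark.executor.memoryOverhead
toExecutorConfKey("resource.gpu.amount") // spark.executor.resource.gpu.amount
{code}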

def require(request: TaskResourceRequest): this.type

def require(request: ExecutorResourceRequest): this.type

It will also have functions to get the resources out for both scala and java.

 

*Resource Requests:*

class ExecutorResourceRequest(
    val resourceName: String,
    val amount: Int, // potentially make this handle fractional resources
    val units: String, // to handle memory unit types
    val discoveryScript: Option[String] = None,
    val vendor: Option[String] = None)

class TaskResourceRequest(
    val resourceName: String,
    val amount: Double) // double to handle fractional resources (i.e. 2 tasks using 1 resource)

 

This will allow the user to programmatically set the resources vs just using
the configs like they can in Spark 3.0 now.  The first implementation would
support cpu, memory (overhead, pyspark, on heap, off heap), and the generic
resources.

An example of the way this might work is:

val rp = new ResourceProfile()
rp.require(new ExecutorResourceRequest("memory", 2048))
rp.require(new ExecutorResourceRequest("cores", 2))
rp.require(new ExecutorResourceRequest("gpu", 1, Some("/opt/gpuScripts/getGpus")))
rp.require(new TaskResourceRequest("gpu", 1))

 

Internally we will also create a default profile, which will be based on the
normal spark configs passed in. This default one can be used everywhere the user
hasn't explicitly set a ResourceProfile.

> Stage Level Sched: Add base ResourceProfile and Request classes
> ---
>
> Key: SPARK-29415
> URL: https://issues.apache.org/jira/browse/SPARK-29415
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> this is just to add initial ResourceProfile, ExecutorResourceRequest and 
> taskResourceRequest classes that are used by the other parts of the code.
> Initially we will have them private until we have other pieces in place.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29329) maven incremental builds not working

2019-10-21 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956362#comment-16956362
 ] 

Thomas Graves commented on SPARK-29329:
---

From the linked issue above it sounds like the zinc compile server mode isn't
really supported anymore in scala-maven-plugin 4.x. The zinc version changed to
1 and the code is completely different; it sounds like they only support the
embedded mode and the zinc server option isn't available, so if that is the case
it isn't helping us and it recompiles everything every time.

There are still a few things in the wording of their response that are unclear
to me, and I haven't had time to look at it more since they responded. It sounds
like someone is working on "bloop" as a compile server replacement for Scala.

> maven incremental builds not working
> 
>
> Key: SPARK-29329
> URL: https://issues.apache.org/jira/browse/SPARK-29329
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> It looks like since we Upgraded scala-maven-plugin to 4.2.0 
> https://issues.apache.org/jira/browse/SPARK-28759 spark incremental builds 
> stop working.  Everytime you build its building all files, which takes 
> forever.
> It would be nice to fix this.
>  
> To reproduce, just build spark once ( I happened to be using the command 
> below):
> build/mvn -Phadoop-3.2 -Phive-thriftserver -Phive -Pyarn -Pkinesis-asl 
> -Pkubernetes -Pmesos -Phadoop-cloud -Pspark-ganglia-lgpl package -DskipTests
> Then build it again and you will see that it compiles all the files and takes 
> 15-30 minutes. With incremental it skips all unnecessary files and takes 
> closer to 5 minutes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-26154) Stream-stream joins - left outer join gives inconsistent output

2019-10-16 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-26154:
--
Affects Version/s: 3.0.0

> Stream-stream joins - left outer join gives inconsistent output
> ---
>
> Key: SPARK-26154
> URL: https://issues.apache.org/jira/browse/SPARK-26154
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.3.2, 3.0.0
> Environment: Spark version - Spark 2.3.2
> OS- Suse 11
>Reporter: Haripriya
>Priority: Blocker
>  Labels: correctness
>
> Stream-stream joins using left outer join gives inconsistent  output 
> The data processed once, is being processed again and gives null value. In 
> Batch 2, the input data  "3" is processed. But again in batch 6, null value 
> is provided for same data
> Steps
> In spark-shell
> {code:java}
> scala> import org.apache.spark.sql.functions.{col, expr}
> import org.apache.spark.sql.functions.{col, expr}
> scala> import org.apache.spark.sql.streaming.Trigger
> import org.apache.spark.sql.streaming.Trigger
> scala> val lines_stream1 = spark.readStream.
>  |   format("kafka").
>  |   option("kafka.bootstrap.servers", "ip:9092").
>  |   option("subscribe", "topic1").
>  |   option("includeTimestamp", true).
>  |   load().
>  |   selectExpr("CAST (value AS String)","CAST(timestamp AS 
> TIMESTAMP)").as[(String,Timestamp)].
>  |   select(col("value") as("data"),col("timestamp") 
> as("recordTime")).
>  |   select("data","recordTime").
>  |   withWatermark("recordTime", "5 seconds ")
> lines_stream1: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = 
> [data: string, recordTime: timestamp]
> scala> val lines_stream2 = spark.readStream.
>  |   format("kafka").
>  |   option("kafka.bootstrap.servers", "ip:9092").
>  |   option("subscribe", "topic2").
>  |   option("includeTimestamp", value = true).
>  |   load().
>  |   selectExpr("CAST (value AS String)","CAST(timestamp AS 
> TIMESTAMP)").as[(String,Timestamp)].
>  |   select(col("value") as("data1"),col("timestamp") 
> as("recordTime1")).
>  |   select("data1","recordTime1").
>  |   withWatermark("recordTime1", "10 seconds ")
> lines_stream2: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = 
> [data1: string, recordTime1: timestamp]
> scala> val query = lines_stream1.join(lines_stream2, expr (
>  |   """
>  | | data == data1 and
>  | | recordTime1 >= recordTime and
>  | | recordTime1 <= recordTime + interval 5 seconds
>  |   """.stripMargin),"left").
>  |   writeStream.
>  |   option("truncate","false").
>  |   outputMode("append").
>  |   format("console").option("checkpointLocation", 
> "/tmp/leftouter/").
>  |   trigger(Trigger.ProcessingTime ("5 seconds")).
>  |   start()
> query: org.apache.spark.sql.streaming.StreamingQuery = 
> org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@1a48f55b
> {code}
> Step2 : Start producing data
> kafka-console-producer.sh --broker-list ip:9092 --topic topic1
>  >1
>  >2
>  >3
>  >4
>  >5
>  >aa
>  >bb
>  >cc
> kafka-console-producer.sh --broker-list ip:9092 --topic topic2
>  >2
>  >2
>  >3
>  >4
>  >5
>  >aa
>  >cc
>  >ee
>  >ee
>  
> Output obtained:
> {code:java}
> Batch: 0
> ---
> ++--+-+---+
> |data|recordTime|data1|recordTime1|
> ++--+-+---+
> ++--+-+---+
> ---
> Batch: 1
> ---
> ++--+-+---+
> |data|recordTime|data1|recordTime1|
> ++--+-+---+
> ++--+-+---+
> ---
> Batch: 2
> ---
> ++---+-+---+
> |data|recordTime |data1|recordTime1|
> ++---+-+---+
> |3   |2018-11-22 20:09:35.053|3|2018-11-22 20:09:36.506|
> |2   |2018-11-22 20:09:31.613|2|2018-11-22 20:09:33.116|
> ++---+-+---+
> ---
> Batch: 3
> ---
> ++---+-+---+
> |data|recordTime |data1|recordTime1|
> ++---+-+---+
> |4   |2018-11-22 20:09:38.654|4|2018-11-22 20:09:39.818|
> 

[jira] [Commented] (SPARK-29465) Unable to configure SPARK UI (spark.ui.port) in spark yarn cluster mode.

2019-10-16 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952617#comment-16952617
 ] 

Thomas Graves commented on SPARK-29465:
---

So I'd be fine with a range of ports. I suggest updating the description and
supporting that for all of the port types. I'm pretty sure this has been brought
up before, see
https://issues.apache.org/jira/browse/SPARK-4449?jql=project%20%3D%20SPARK%20AND%20text%20~%20%22port%20range%22
and possibly others.

> Unable to configure SPARK UI (spark.ui.port) in spark yarn cluster mode. 
> -
>
> Key: SPARK-29465
> URL: https://issues.apache.org/jira/browse/SPARK-29465
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Submit, YARN
>Affects Versions: 3.0.0
>Reporter: Vishwas Nalka
>Priority: Major
>
>  I'm trying to restrict the ports used by spark app which is launched in yarn 
> cluster mode. All ports (viz. driver, executor, blockmanager) could be 
> specified using the respective properties except the ui port. The spark app 
> is launched using JAVA code and setting the property spark.ui.port in 
> sparkConf doesn't seem to help. Even setting a JVM option 
> -Dspark.ui.port="some_port" does not spawn the UI is required port. 
> From the logs of the spark app, *_the property spark.ui.port is overridden 
> and the JVM property '-Dspark.ui.port=0' is set_* even though it is never set 
> to 0. 
> _(Run in Spark 1.6.2) From the logs ->_
> _command:LD_LIBRARY_PATH="/usr/hdp/2.6.4.0-91/hadoop/lib/native:$LD_LIBRARY_PATH"
>  {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms4096m 
> -Xmx4096m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.blockManager.port=9900' 
> '-Dspark.driver.port=9902' '-Dspark.fileserver.port=9903' 
> '-Dspark.broadcast.port=9904' '-Dspark.port.maxRetries=20' 
> '-Dspark.ui.port=0' '-Dspark.executor.port=9905'_
> _19/10/14 16:39:59 INFO Utils: Successfully started service 'SparkUI' on port 
> 35167.19/10/14 16:39:59 INFO SparkUI: Started SparkUI at_ 
> [_http://10.65.170.98:35167_|http://10.65.170.98:35167/]
> Even tried using a *spark-submit command with --conf spark.ui.port* does 
> spawn UI in required port
> {color:#172b4d}_(Run in Spark 2.4.4)_{color}
>  {color:#172b4d}_./bin/spark-submit --class org.apache.spark.examples.SparkPi 
> --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g 
> --executor-cores 1 --conf spark.ui.port=12345 --conf spark.driver.port=12340 
> --queue default examples/jars/spark-examples_2.11-2.4.4.jar 10_{color}
> _From the logs::_
>  _19/10/15 00:04:05 INFO ui.SparkUI: Stopped Spark web UI at 
> [http://invrh74ace005.informatica.com:46622|http://invrh74ace005.informatica.com:46622/]_
> _command:{{JAVA_HOME}}/bin/java -server -Xmx2048m 
> -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.ui.port=0'  'Dspark.driver.port=12340' 
> -Dspark.yarn.app.container.log.dir= -XX:OnOutOfMemoryError='kill %p' 
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url 
> spark://coarsegrainedschedu...@invrh74ace005.informatica.com:12340 
> --executor-id  --hostname  --cores 1 --app-id 
> application_1570992022035_0089 --user-class-path 
> [file:$PWD/__app__.jar1|file://%24pwd/__app__.jar1]>/stdout2>/stderr_
>  
> Looks like the application master override this and set a JVM property before 
> launch resulting in random UI port even though spark.ui.port is set by the 
> user.
> In these links
>  # 
> [https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
>  (line 214)
>  # 
> [https://github.com/cloudera/spark/blob/master/yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
>  (line 75)
> I can see that the method _*run() in above files sets a system property 
> UI_PORT*_ and _*spark.ui.port respectively.*_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29465) Unable to configure SPARK UI (spark.ui.port) in spark yarn cluster mode.

2019-10-16 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952603#comment-16952603
 ] 

Thomas Graves commented on SPARK-29465:
---

Note the problem I see with just using the port the user specified is that many
users don't know what they are doing in different environments, and they end up
with random failures that they don't necessarily understand. The default port is
4040 and not 0, so I'm a bit on the fence about whether the solution here is
purely to use the port when a specific port is specified. If it's a range of
ports that makes more sense. So I would like to understand your use case and why
you are trying to specify a specific port.

> Unable to configure SPARK UI (spark.ui.port) in spark yarn cluster mode. 
> -
>
> Key: SPARK-29465
> URL: https://issues.apache.org/jira/browse/SPARK-29465
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Submit, YARN
>Affects Versions: 3.0.0
>Reporter: Vishwas Nalka
>Priority: Major
>
>  I'm trying to restrict the ports used by spark app which is launched in yarn 
> cluster mode. All ports (viz. driver, executor, blockmanager) could be 
> specified using the respective properties except the ui port. The spark app 
> is launched using JAVA code and setting the property spark.ui.port in 
> sparkConf doesn't seem to help. Even setting a JVM option 
> -Dspark.ui.port="some_port" does not spawn the UI is required port. 
> From the logs of the spark app, *_the property spark.ui.port is overridden 
> and the JVM property '-Dspark.ui.port=0' is set_* even though it is never set 
> to 0. 
> _(Run in Spark 1.6.2) From the logs ->_
> _command:LD_LIBRARY_PATH="/usr/hdp/2.6.4.0-91/hadoop/lib/native:$LD_LIBRARY_PATH"
>  {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms4096m 
> -Xmx4096m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.blockManager.port=9900' 
> '-Dspark.driver.port=9902' '-Dspark.fileserver.port=9903' 
> '-Dspark.broadcast.port=9904' '-Dspark.port.maxRetries=20' 
> '-Dspark.ui.port=0' '-Dspark.executor.port=9905'_
> _19/10/14 16:39:59 INFO Utils: Successfully started service 'SparkUI' on port 
> 35167.19/10/14 16:39:59 INFO SparkUI: Started SparkUI at_ 
> [_http://10.65.170.98:35167_|http://10.65.170.98:35167/]
> Even tried using a *spark-submit command with --conf spark.ui.port* does 
> spawn UI in required port
> {color:#172b4d}_(Run in Spark 2.4.4)_{color}
>  {color:#172b4d}_./bin/spark-submit --class org.apache.spark.examples.SparkPi 
> --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g 
> --executor-cores 1 --conf spark.ui.port=12345 --conf spark.driver.port=12340 
> --queue default examples/jars/spark-examples_2.11-2.4.4.jar 10_{color}
> _From the logs::_
>  _19/10/15 00:04:05 INFO ui.SparkUI: Stopped Spark web UI at 
> [http://invrh74ace005.informatica.com:46622|http://invrh74ace005.informatica.com:46622/]_
> _command:{{JAVA_HOME}}/bin/java -server -Xmx2048m 
> -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.ui.port=0'  'Dspark.driver.port=12340' 
> -Dspark.yarn.app.container.log.dir= -XX:OnOutOfMemoryError='kill %p' 
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url 
> spark://coarsegrainedschedu...@invrh74ace005.informatica.com:12340 
> --executor-id  --hostname  --cores 1 --app-id 
> application_1570992022035_0089 --user-class-path 
> [file:$PWD/__app__.jar1|file://%24pwd/__app__.jar1]>/stdout2>/stderr_
>  
> Looks like the application master override this and set a JVM property before 
> launch resulting in random UI port even though spark.ui.port is set by the 
> user.
> In these links
>  # 
> [https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
>  (line 214)
>  # 
> [https://github.com/cloudera/spark/blob/master/yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
>  (line 75)
> I can see that the method _*run() in above files sets a system property 
> UI_PORT*_ and _*spark.ui.port respectively.*_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29465) Unable to configure SPARK UI (spark.ui.port) in spark yarn cluster mode.

2019-10-16 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952542#comment-16952542
 ] 

Thomas Graves commented on SPARK-29465:
---

Thanks for copying me, yes, I agree this would be an improvement. So my
understanding is you want to restrict to a certain port range, or just a very
specific port? A specific port doesn't really make sense on YARN where you have
multiple users.

> Unable to configure SPARK UI (spark.ui.port) in spark yarn cluster mode. 
> -
>
> Key: SPARK-29465
> URL: https://issues.apache.org/jira/browse/SPARK-29465
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Submit, YARN
>Affects Versions: 3.0.0
>Reporter: Vishwas Nalka
>Priority: Major
>
>  I'm trying to restrict the ports used by spark app which is launched in yarn 
> cluster mode. All ports (viz. driver, executor, blockmanager) could be 
> specified using the respective properties except the ui port. The spark app 
> is launched using JAVA code and setting the property spark.ui.port in 
> sparkConf doesn't seem to help. Even setting a JVM option 
> -Dspark.ui.port="some_port" does not spawn the UI is required port. 
> From the logs of the spark app, *_the property spark.ui.port is overridden 
> and the JVM property '-Dspark.ui.port=0' is set_* even though it is never set 
> to 0. 
> _(Run in Spark 1.6.2) From the logs ->_
> _command:LD_LIBRARY_PATH="/usr/hdp/2.6.4.0-91/hadoop/lib/native:$LD_LIBRARY_PATH"
>  {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms4096m 
> -Xmx4096m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.blockManager.port=9900' 
> '-Dspark.driver.port=9902' '-Dspark.fileserver.port=9903' 
> '-Dspark.broadcast.port=9904' '-Dspark.port.maxRetries=20' 
> '-Dspark.ui.port=0' '-Dspark.executor.port=9905'_
> _19/10/14 16:39:59 INFO Utils: Successfully started service 'SparkUI' on port 
> 35167.19/10/14 16:39:59 INFO SparkUI: Started SparkUI at_ 
> [_http://10.65.170.98:35167_|http://10.65.170.98:35167/]
> Even tried using a *spark-submit command with --conf spark.ui.port* does 
> spawn UI in required port
> {color:#172b4d}_(Run in Spark 2.4.4)_{color}
>  {color:#172b4d}_./bin/spark-submit --class org.apache.spark.examples.SparkPi 
> --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g 
> --executor-cores 1 --conf spark.ui.port=12345 --conf spark.driver.port=12340 
> --queue default examples/jars/spark-examples_2.11-2.4.4.jar 10_{color}
> _From the logs::_
>  _19/10/15 00:04:05 INFO ui.SparkUI: Stopped Spark web UI at 
> [http://invrh74ace005.informatica.com:46622|http://invrh74ace005.informatica.com:46622/]_
> _command:{{JAVA_HOME}}/bin/java -server -Xmx2048m 
> -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.ui.port=0'  'Dspark.driver.port=12340' 
> -Dspark.yarn.app.container.log.dir= -XX:OnOutOfMemoryError='kill %p' 
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url 
> spark://coarsegrainedschedu...@invrh74ace005.informatica.com:12340 
> --executor-id  --hostname  --cores 1 --app-id 
> application_1570992022035_0089 --user-class-path 
> [file:$PWD/__app__.jar1|file://%24pwd/__app__.jar1]>/stdout2>/stderr_
>  
> Looks like the application master override this and set a JVM property before 
> launch resulting in random UI port even though spark.ui.port is set by the 
> user.
> In these links
>  # 
> [https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
>  (line 214)
>  # 
> [https://github.com/cloudera/spark/blob/master/yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
>  (line 75)
> I can see that the method _*run() in above files sets a system property 
> UI_PORT*_ and _*spark.ui.port respectively.*_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-29148) Modify dynamic allocation manager for stage level scheduling

2019-10-11 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-29148:
-

Assignee: Thomas Graves

> Modify dynamic allocation manager for stage level scheduling
> 
>
> Key: SPARK-29148
> URL: https://issues.apache.org/jira/browse/SPARK-29148
> Project: Spark
>  Issue Type: Story
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> To Support Stage Level Scheduling, the dynamic allocation manager has to 
> track the usage and need or executor per ResourceProfile.
> We will have to figure out what to do with the metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-29415) Stage Level Sched: Add base ResourceProfile and Request classes

2019-10-11 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-29415:
-

Assignee: Thomas Graves

> Stage Level Sched: Add base ResourceProfile and Request classes
> ---
>
> Key: SPARK-29415
> URL: https://issues.apache.org/jira/browse/SPARK-29415
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> this is just to add initial ResourceProfile, ExecutorResourceRequest and 
> taskResourceRequest classes that are used by the other parts of the code.
> Initially we will have them private until we have other pieces in place.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-29306) Executors need to track what ResourceProfile they are created with

2019-10-11 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-29306:
-

Assignee: Thomas Graves

> Executors need to track what ResourceProfile they are created with 
> ---
>
> Key: SPARK-29306
> URL: https://issues.apache.org/jira/browse/SPARK-29306
> Project: Spark
>  Issue Type: Story
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>
> For stage level scheduling, the Executors need to report what ResourceProfile 
> they are created with so that the ExecutorMonitor can track them and the 
> ExecutorAllocationManager can use that information to know how many to 
> request, etc.
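
As a rough sketch of the reporting described in the quoted issue above (the
message and field names here are assumptions for illustration, not Spark's
actual classes):

{code:scala}
// The executor's registration carries the id of the ResourceProfile it was
// launched with, so driver-side components can group executors per profile.
case class RegisterExecutor(executorId: String, host: String, cores: Int,
                            resourceProfileId: Int)

// e.g. the allocation manager can then count running executors per profile.
def executorsPerProfile(registered: Seq[RegisterExecutor]): Map[Int, Int] =
  registered.groupBy(_.resourceProfileId).map { case (id, execs) => id -> execs.size }
{code}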



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29396) Extend Spark plugin interface to driver

2019-10-11 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949460#comment-16949460
 ] 

Thomas Graves commented on SPARK-29396:
---

While I do like the idea, do we have some concrete use cases here?

It's easier to make sure the API makes sense when you have use cases. Normally I
would have thought that on the driver side it's easy to write whatever code you
want and spin it up yourself.

> Extend Spark plugin interface to driver
> ---
>
> Key: SPARK-29396
> URL: https://issues.apache.org/jira/browse/SPARK-29396
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Marcelo Masiero Vanzin
>Priority: Major
>
> Spark provides an extension API for people to implement executor plugins, 
> added in SPARK-24918 and later extended in SPARK-28091.
> That API does not offer any functionality for doing similar things on the 
> driver side, though. As a consequence of that, there is not a good way for 
> the executor plugins to get information or communicate in any way with the 
> Spark driver.
> I've been playing with such an improved API for developing some new 
> functionality. I'll file a few child bugs for the work to get the changes in.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-29396) Extend Spark plugin interface to driver

2019-10-11 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-29396:
-

Assignee: (was: Thomas Graves)

> Extend Spark plugin interface to driver
> ---
>
> Key: SPARK-29396
> URL: https://issues.apache.org/jira/browse/SPARK-29396
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Marcelo Masiero Vanzin
>Priority: Major
>
> Spark provides an extension API for people to implement executor plugins, 
> added in SPARK-24918 and later extended in SPARK-28091.
> That API does not offer any functionality for doing similar things on the 
> driver side, though. As a consequence of that, there is not a good way for 
> the executor plugins to get information or communicate in any way with the 
> Spark driver.
> I've been playing with such an improved API for developing some new 
> functionality. I'll file a few child bugs for the work to get the changes in.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-29396) Extend Spark plugin interface to driver

2019-10-11 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned SPARK-29396:
-

Assignee: Thomas Graves

> Extend Spark plugin interface to driver
> ---
>
> Key: SPARK-29396
> URL: https://issues.apache.org/jira/browse/SPARK-29396
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Marcelo Masiero Vanzin
>Assignee: Thomas Graves
>Priority: Major
>
> Spark provides an extension API for people to implement executor plugins, 
> added in SPARK-24918 and later extended in SPARK-28091.
> That API does not offer any functionality for doing similar things on the 
> driver side, though. As a consequence of that, there is not a good way for 
> the executor plugins to get information or communicate in any way with the 
> Spark driver.
> I've been playing with such an improved API for developing some new 
> functionality. I'll file a few child bugs for the work to get the changes in.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28859) Remove value check of MEMORY_OFFHEAP_SIZE in declaration section

2019-10-10 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948652#comment-16948652
 ] 

Thomas Graves commented on SPARK-28859:
---

I wouldn't expect users to specify the size when enabled is false. If they do
specify it with enabled false, I guess it's ok for it to be 0, but I'm not sure
we really need to special case this.

A default of 0 is fine; that is why I said if the user specifies a value it
should be > 0, but I haven't looked to see when the ConfigEntry does the
validation on this. If it validates the default value then we can't change it,
or the validator needs to change. This is what the Jira is to investigate.
Taking a skim of the code, it looks like the validator only runs on the
non-default value.
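
A standalone sketch of the semantics being discussed (not the actual
ConfigBuilder declaration): the default of 0 stays valid, an explicitly supplied
size must be positive, and enabling off-heap requires a positive size.

{code:scala}
def validateOffHeapSize(enabled: Boolean, userSize: Option[Long]): Long = {
  // Explicitly supplied values must be > 0; the built-in default of 0 is never validated.
  userSize.foreach { s =>
    require(s > 0, s"spark.memory.offHeap.size must be > 0 when set, got $s")
  }
  val size = userSize.getOrElse(0L)
  if (enabled) {
    require(size > 0,
      "spark.memory.offHeap.size must be > 0 when spark.memory.offHeap.enabled is true")
  }
  size
}
{code}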

> Remove value check of MEMORY_OFFHEAP_SIZE in declaration section
> 
>
> Key: SPARK-28859
> URL: https://issues.apache.org/jira/browse/SPARK-28859
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Yang Jie
>Assignee: yifan
>Priority: Minor
>
> Now MEMORY_OFFHEAP_SIZE has default value 0, but It should be greater than 0 
> when 
> MEMORY_OFFHEAP_ENABLED is true,, should we check this condition in code?
>  
> SPARK-28577 add this check before request memory resource to Yarn 
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29417) Resource Scheduling - add TaskContext.resource java api

2019-10-09 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29417:
-

 Summary: Resource Scheduling - add TaskContext.resource java api
 Key: SPARK-29417
 URL: https://issues.apache.org/jira/browse/SPARK-29417
 Project: Spark
  Issue Type: Story
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves
Assignee: Thomas Graves


I noticed the TaskContext.resource() API we added returns a Scala Map. This
isn't very nice for Java API usage, so we should add an API that returns a
java.util.Map (a rough sketch is below).
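
A small sketch of the kind of helper that would cover the Java side (the method
name and the ResourceInformation stand-in below are illustrative, not the final API):

{code:scala}
import java.util.{Map => JMap}
import scala.collection.JavaConverters._

// Stand-in for org.apache.spark.resource.ResourceInformation, just for this sketch.
case class ResourceInformation(name: String, addresses: Array[String])

// Wrap the Scala Map in a java.util.Map view so Java callers avoid scala.collection types.
def resourcesJMap(resources: Map[String, ResourceInformation]): JMap[String, ResourceInformation] =
  resources.asJava
{code}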



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28859) Remove value check of MEMORY_OFFHEAP_SIZE in declaration section

2019-10-09 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947908#comment-16947908
 ] 

Thomas Graves commented on SPARK-28859:
---

The comment came from the PR you linked to, but was referring to the config
definition and any other usages.

If you look in config/package.scala, the MEMORY_OFFHEAP_SIZE check is >= 0.
Really I think that should be > 0 when it's specified by the user. We need to
see if that > 0 is ok in checkValue and check the other usages of it.

> Remove value check of MEMORY_OFFHEAP_SIZE in declaration section
> 
>
> Key: SPARK-28859
> URL: https://issues.apache.org/jira/browse/SPARK-28859
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Yang Jie
>Priority: Minor
>
> Now MEMORY_OFFHEAP_SIZE has default value 0, but It should be greater than 0 
> when 
> MEMORY_OFFHEAP_ENABLED is true,, should we check this condition in code?
>  
> SPARK-28577 add this check before request memory resource to Yarn 
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29415) Stage Level Sched: Add base ResourceProfile and Request classes

2019-10-09 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29415:
-

 Summary: Stage Level Sched: Add base ResourceProfile and Request 
classes
 Key: SPARK-29415
 URL: https://issues.apache.org/jira/browse/SPARK-29415
 Project: Spark
  Issue Type: Story
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


this is just to add initial ResourceProfile, ExecutorResourceRequest and 
taskResourceRequest classes that are used by the other parts of the code.

Initially we will have them private until we have other pieces in place.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29329) maven incremental builds not working

2019-10-03 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943839#comment-16943839
 ] 

Thomas Graves commented on SPARK-29329:
---

filed issue with scala-maven-plugin, we will see what they say:

https://github.com/davidB/scala-maven-plugin/issues/364

> maven incremental builds not working
> 
>
> Key: SPARK-29329
> URL: https://issues.apache.org/jira/browse/SPARK-29329
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> It looks like since we Upgraded scala-maven-plugin to 4.2.0 
> https://issues.apache.org/jira/browse/SPARK-28759 spark incremental builds 
> stop working.  Everytime you build its building all files, which takes 
> forever.
> It would be nice to fix this.
>  
> To reproduce, just build Spark once (I happened to be using the command 
> below):
> build/mvn -Phadoop-3.2 -Phive-thriftserver -Phive -Pyarn -Pkinesis-asl 
> -Pkubernetes -Pmesos -Phadoop-cloud -Pspark-ganglia-lgpl package -DskipTests
> Then build it again and you will see that it compiles all the files and takes 
> 15-30 minutes. With incremental it skips all unnecessary files and takes 
> closer to 5 minutes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29329) maven incremental builds not working

2019-10-02 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942790#comment-16942790
 ] 

Thomas Graves commented on SPARK-29329:
---

There are a few comments on SPARK-28759 in regard to this; see:

 

https://issues.apache.org/jira/browse/SPARK-28759?focusedCommentId=16942407=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16942407

> maven incremental builds not working
> 
>
> Key: SPARK-29329
> URL: https://issues.apache.org/jira/browse/SPARK-29329
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> It looks like since we upgraded scala-maven-plugin to 4.2.0 
> (https://issues.apache.org/jira/browse/SPARK-28759), Spark incremental builds 
> stopped working.  Every time you build, it rebuilds all files, which takes 
> forever.
> It would be nice to fix this.
>  
> To reproduce, just build Spark once (I happened to be using the command 
> below):
> build/mvn -Phadoop-3.2 -Phive-thriftserver -Phive -Pyarn -Pkinesis-asl 
> -Pkubernetes -Pmesos -Phadoop-cloud -Pspark-ganglia-lgpl package -DskipTests
> Then build it again and you will see that it compiles all the files and takes 
> 15-30 minutes. With incremental it skips all unnecessary files and takes 
> closer to 5 minutes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29329) maven incremental builds not working

2019-10-02 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-29329:
--
Description: 
It looks like since we upgraded scala-maven-plugin to 4.2.0 
(https://issues.apache.org/jira/browse/SPARK-28759), Spark incremental builds stopped 
working.  Every time you build, it rebuilds all files, which takes forever.

It would be nice to fix this.

 

To reproduce, just build Spark once (I happened to be using the command below):

build/mvn -Phadoop-3.2 -Phive-thriftserver -Phive -Pyarn -Pkinesis-asl 
-Pkubernetes -Pmesos -Phadoop-cloud -Pspark-ganglia-lgpl package -DskipTests

Then build it again and you will see that it compiles all the files and takes 
15-30 minutes. With incremental it skips all unnecessary files and takes closer 
to 5 minutes.

  was:
It looks like since we upgraded scala-maven-plugin to 4.2.0 
(https://issues.apache.org/jira/browse/SPARK-28759), Spark incremental builds stopped 
working.  Every time you build, it rebuilds all files, which takes forever.

It would be nice to fix this.


> maven incremental builds not working
> 
>
> Key: SPARK-29329
> URL: https://issues.apache.org/jira/browse/SPARK-29329
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> It looks like since we upgraded scala-maven-plugin to 4.2.0 
> (https://issues.apache.org/jira/browse/SPARK-28759), Spark incremental builds 
> stopped working.  Every time you build, it rebuilds all files, which takes 
> forever.
> It would be nice to fix this.
>  
> To reproduce, just build Spark once (I happened to be using the command 
> below):
> build/mvn -Phadoop-3.2 -Phive-thriftserver -Phive -Pyarn -Pkinesis-asl 
> -Pkubernetes -Pmesos -Phadoop-cloud -Pspark-ganglia-lgpl package -DskipTests
> Then build it again and you will see that it compiles all the files and takes 
> 15-30 minutes. With incremental it skips all unnecessary files and takes 
> closer to 5 minutes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28759) Upgrade scala-maven-plugin to 4.2.0

2019-10-02 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942787#comment-16942787
 ] 

Thomas Graves commented on SPARK-28759:
---

I rolled back this commit and the incremental compile now works.  Without 
incremental compiles the build takes forever, so I'm against disabling them.  I 
filed https://issues.apache.org/jira/browse/SPARK-29329 for us to look at.

> Upgrade scala-maven-plugin to 4.2.0
> ---
>
> Key: SPARK-28759
> URL: https://issues.apache.org/jira/browse/SPARK-28759
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29329) maven incremental builds not working

2019-10-02 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29329:
-

 Summary: maven incremental builds not working
 Key: SPARK-29329
 URL: https://issues.apache.org/jira/browse/SPARK-29329
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 3.0.0
Reporter: Thomas Graves


It looks like since we upgraded scala-maven-plugin to 4.2.0 
(https://issues.apache.org/jira/browse/SPARK-28759), Spark incremental builds stopped 
working.  Every time you build, it rebuilds all files, which takes forever.

It would be nice to fix this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-28759) Upgrade scala-maven-plugin to 4.2.0

2019-10-01 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942308#comment-16942308
 ] 

Thomas Graves edited comment on SPARK-28759 at 10/1/19 9:05 PM:


[~dongjoon] [~hyukjin.kwon]  Did either of you notice that incremental 
re-compiles seem to have been broken (rebuilding the same thing again takes 
forever because it's recompiling all files again)? I was rolling back to see 
what caused it and it seems like it's this change.

Wondering if you are seeing the same thing?


was (Author: tgraves):
[~dongjoon] [~hyukjin.kwon]  Did either of you notice that incremental 
re-compiles seem to have been broken? I was rolling back to see what caused it 
and it seems like it's this change.

Wondering if you are seeing the same thing?

> Upgrade scala-maven-plugin to 4.2.0
> ---
>
> Key: SPARK-28759
> URL: https://issues.apache.org/jira/browse/SPARK-28759
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28759) Upgrade scala-maven-plugin to 4.2.0

2019-10-01 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942308#comment-16942308
 ] 

Thomas Graves commented on SPARK-28759:
---

[~dongjoon] [~hyukjin.kwon]  Did either of you notice that incremental 
re-compiles seem to have been broken? I was rolling back to see what caused it 
and it seems like it's this change.

Wondering if you are seeing the same thing?

> Upgrade scala-maven-plugin to 4.2.0
> ---
>
> Key: SPARK-28759
> URL: https://issues.apache.org/jira/browse/SPARK-28759
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-29306) Executors need to track what ResourceProfile they are created with

2019-09-30 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29306:
-

 Summary: Executors need to track what ResourceProfile they are 
created with 
 Key: SPARK-29306
 URL: https://issues.apache.org/jira/browse/SPARK-29306
 Project: Spark
  Issue Type: Story
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Thomas Graves


For stage level scheduling, the Executors need to report what ResourceProfile 
they are created with so that the ExecutorMonitor can track them and the 
ExecutorAllocationManager can use that information to know how many to request, 
etc.
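
A hypothetical sketch of one way the reporting side could look (message and 
field names here are made up, not the actual Spark protocol):

{code:scala}
import scala.collection.mutable

// Hypothetical: the executor registration carries the id of the ResourceProfile
// it was created with, so the driver side can group executors per profile.
case class RegisterExecutor(
    executorId: String,
    hostname: String,
    cores: Int,
    resourceProfileId: Int)

object ExecutorProfileTracking {
  private val executorsByProfile = mutable.Map.empty[Int, mutable.Set[String]]

  // Called when an executor registers; lets the allocation manager know how many
  // executors of each profile already exist when deciding how many to request.
  def onRegister(msg: RegisterExecutor): Unit =
    executorsByProfile.getOrElseUpdate(msg.resourceProfileId, mutable.Set.empty) += msg.executorId

  def runningCount(profileId: Int): Int =
    executorsByProfile.get(profileId).map(_.size).getOrElse(0)
}
{code}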



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-27396) SPIP: Public APIs for extended Columnar Processing Support

2019-09-30 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved SPARK-27396.
---
  Assignee: Robert Joseph Evans
Resolution: Fixed

> SPIP: Public APIs for extended Columnar Processing Support
> --
>
> Key: SPARK-27396
> URL: https://issues.apache.org/jira/browse/SPARK-27396
> Project: Spark
>  Issue Type: Epic
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
>Priority: Major
>
> *SPIP: Columnar Processing Without Arrow Formatting Guarantees.*
>  
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> The Dataset/DataFrame API in Spark currently only exposes to users one row at 
> a time when processing data.  The goals of this are to
>  # Add to the current sql extensions mechanism so advanced users can have 
> access to the physical SparkPlan and manipulate it to provide columnar 
> processing for existing operators, including shuffle.  This will allow them 
> to implement their own cost based optimizers to decide when processing should 
> be columnar and when it should not.
>  # Make any transitions between the columnar memory layout and a row based 
> layout transparent to the users so operations that are not columnar see the 
> data as rows, and operations that are columnar see the data as columns.
>  
> Not Requirements, but things that would be nice to have.
>  # Transition the existing in memory columnar layouts to be compatible with 
> Apache Arrow.  This would make the transformations to Apache Arrow format a 
> no-op. The existing formats are already very close to those layouts in many 
> cases.  This would not be using the Apache Arrow java library, but instead 
> being compatible with the memory 
> [layout|https://arrow.apache.org/docs/format/Layout.html] and possibly only a 
> subset of that layout.
>  
> *Q2.* What problem is this proposal NOT designed to solve? 
> The goal of this is not for ML/AI but to provide APIs for accelerated 
> computing in Spark primarily targeting SQL/ETL like workloads.  ML/AI already 
> have several mechanisms to get data into/out of them. These can be improved 
> but will be covered in a separate SPIP.
> This is not trying to implement any of the processing itself in a columnar 
> way, with the exception of examples for documentation.
> This does not cover exposing the underlying format of the data.  The only way 
> to get at the data in a ColumnVector is through the public APIs.  Exposing 
> the underlying format to improve efficiency will be covered in a separate 
> SPIP.
> This is not trying to implement new ways of transferring data to external 
> ML/AI applications.  That is covered by separate SPIPs already.
> This is not trying to add in generic code generation for columnar processing. 
>  Currently code generation for columnar processing is only supported when 
> translating columns to rows.  We will continue to support this, but will not 
> extend it as a general solution. That will be covered in a separate SPIP if 
> we find it is helpful.  For now columnar processing will be interpreted.
> This is not trying to expose a way to get columnar data into Spark through 
> DataSource V2 or any other similar API.  That would be covered by a separate 
> SPIP if we find it is needed.
>  
> *Q3.* How is it done today, and what are the limits of current practice?
> The current columnar support is limited to 3 areas.
>  # Internal implementations of FileFormats, optionally can return a 
> ColumnarBatch instead of rows.  The code generation phase knows how to take 
> that columnar data and iterate through it as rows for stages that wants rows, 
> which currently is almost everything.  The limitations here are mostly 
> implementation specific. The current standard is to abuse Scala’s type 
> erasure to return ColumnarBatches as the elements of an RDD[InternalRow]. The 
> code generation can handle this because it is generating java code, so it 
> bypasses scala’s type checking and just casts the InternalRow to the desired 
> ColumnarBatch.  This makes it difficult for others to implement the same 
> functionality for different processing because they can only do it through 
> code generation. There really is no clean separate path in the code 
> generation for columnar vs row based. Additionally, because it is only 
> supported through code generation if for any reason code generation would 
> fail there is no backup.  This is typically fine for input formats but can be 
> problematic when we get into more extensive processing.
>  # When caching data it can optionally be cached in a columnar format if the 
> input is also columnar.  This is similar to the first area and has the same 
> 

[jira] [Commented] (SPARK-27396) SPIP: Public APIs for extended Columnar Processing Support

2019-09-30 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940974#comment-16940974
 ] 

Thomas Graves commented on SPARK-27396:
---

The main objectives have actually already been implemented; see the linked 
JIRAs. I will close this.

> SPIP: Public APIs for extended Columnar Processing Support
> --
>
> Key: SPARK-27396
> URL: https://issues.apache.org/jira/browse/SPARK-27396
> Project: Spark
>  Issue Type: Epic
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Robert Joseph Evans
>Priority: Major
>
> *SPIP: Columnar Processing Without Arrow Formatting Guarantees.*
>  
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> The Dataset/DataFrame API in Spark currently only exposes to users one row at 
> a time when processing data.  The goals of this are to
>  # Add to the current sql extensions mechanism so advanced users can have 
> access to the physical SparkPlan and manipulate it to provide columnar 
> processing for existing operators, including shuffle.  This will allow them 
> to implement their own cost based optimizers to decide when processing should 
> be columnar and when it should not.
>  # Make any transitions between the columnar memory layout and a row based 
> layout transparent to the users so operations that are not columnar see the 
> data as rows, and operations that are columnar see the data as columns.
>  
> Not Requirements, but things that would be nice to have.
>  # Transition the existing in memory columnar layouts to be compatible with 
> Apache Arrow.  This would make the transformations to Apache Arrow format a 
> no-op. The existing formats are already very close to those layouts in many 
> cases.  This would not be using the Apache Arrow java library, but instead 
> being compatible with the memory 
> [layout|https://arrow.apache.org/docs/format/Layout.html] and possibly only a 
> subset of that layout.
>  
> *Q2.* What problem is this proposal NOT designed to solve? 
> The goal of this is not for ML/AI but to provide APIs for accelerated 
> computing in Spark primarily targeting SQL/ETL like workloads.  ML/AI already 
> have several mechanisms to get data into/out of them. These can be improved 
> but will be covered in a separate SPIP.
> This is not trying to implement any of the processing itself in a columnar 
> way, with the exception of examples for documentation.
> This does not cover exposing the underlying format of the data.  The only way 
> to get at the data in a ColumnVector is through the public APIs.  Exposing 
> the underlying format to improve efficiency will be covered in a separate 
> SPIP.
> This is not trying to implement new ways of transferring data to external 
> ML/AI applications.  That is covered by separate SPIPs already.
> This is not trying to add in generic code generation for columnar processing. 
>  Currently code generation for columnar processing is only supported when 
> translating columns to rows.  We will continue to support this, but will not 
> extend it as a general solution. That will be covered in a separate SPIP if 
> we find it is helpful.  For now columnar processing will be interpreted.
> This is not trying to expose a way to get columnar data into Spark through 
> DataSource V2 or any other similar API.  That would be covered by a separate 
> SPIP if we find it is needed.
>  
> *Q3.* How is it done today, and what are the limits of current practice?
> The current columnar support is limited to 3 areas.
>  # Internal implementations of FileFormats, optionally can return a 
> ColumnarBatch instead of rows.  The code generation phase knows how to take 
> that columnar data and iterate through it as rows for stages that wants rows, 
> which currently is almost everything.  The limitations here are mostly 
> implementation specific. The current standard is to abuse Scala’s type 
> erasure to return ColumnarBatches as the elements of an RDD[InternalRow]. The 
> code generation can handle this because it is generating java code, so it 
> bypasses scala’s type checking and just casts the InternalRow to the desired 
> ColumnarBatch.  This makes it difficult for others to implement the same 
> functionality for different processing because they can only do it through 
> code generation. There really is no clean separate path in the code 
> generation for columnar vs row based. Additionally, because it is only 
> supported through code generation if for any reason code generation would 
> fail there is no backup.  This is typically fine for input formats but can be 
> problematic when we get into more extensive processing.
>  # When caching data it can optionally be cached in a columnar format if the 
> input is also columnar.  This is similar 

[jira] [Created] (SPARK-29303) UI updates for stage level scheduling

2019-09-30 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29303:
-

 Summary: UI updates for stage level scheduling
 Key: SPARK-29303
 URL: https://issues.apache.org/jira/browse/SPARK-29303
 Project: Spark
  Issue Type: Story
  Components: Web UI
Affects Versions: 3.0.0
Reporter: Thomas Graves


Update the UI to show information about stage level scheduling.

The stage pages should show, for instance, what resources were required.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29206) Number of shuffle Netty server threads should be a multiple of number of chunk fetch handler threads

2019-09-23 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935968#comment-16935968
 ] 

Thomas Graves commented on SPARK-29206:
---

That is definitely not the behavior I was expecting from Netty. I'll have to 
look at the details further, but if you already have a patch that works, I would 
say put it up and we can discuss it there.
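
For reference, a quick arithmetic illustration of the round-robin mismatch 
described in the quoted report below (plain Scala, not Spark or Netty code; the 
thread counts are the ones from the report):

{code:scala}
object RoundRobinMismatch extends App {
  val serverThreads = 40      // Netty server (default EventLoopGroup) threads
  val chunkFetchThreads = 35  // chunk fetch handler threads from the example

  // Channels are assigned round-robin in both groups, so channel c lands on
  // server thread (c % serverThreads) and handler thread (c % chunkFetchThreads).
  val channels = 0 until 320
  val onServerThread0 = channels.filter(_ % serverThreads == 0)
  // Vector(0, 40, 80, 120, 160, 200, 240, 280)
  val handlerThreads = onServerThread0.map(_ % chunkFetchThreads).distinct
  // 7 distinct handler threads (only channels 0 and 280 share one), so up to 7
  // handler threads can all queue work onto the same server thread.
  println(s"channels on server thread 0: $onServerThread0")
  println(s"handler threads serving them: $handlerThreads (${handlerThreads.size} distinct)")
}
{code}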

> Number of shuffle Netty server threads should be a multiple of number of 
> chunk fetch handler threads
> 
>
> Key: SPARK-29206
> URL: https://issues.apache.org/jira/browse/SPARK-29206
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle
>Affects Versions: 3.0.0
>Reporter: Min Shen
>Priority: Major
>
> In SPARK-24355, we proposed to use a separate chunk fetch handler thread pool 
> to handle the slow-to-process chunk fetch requests in order to improve the 
> responsiveness of shuffle service for RPC requests.
> Initially, we thought by making the number of Netty server threads larger 
> than the number of chunk fetch handler threads, it would reserve some threads 
> for RPC requests thus resolving the various RPC request timeout issues we 
> experienced previously. The solution worked in our cluster initially. 
> However, as the number of Spark applications in our cluster continues to 
> increase, we saw the RPC request (SASL authentication specifically) timeout 
> issue again:
> {noformat}
> java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timeout 
> waiting for task.
>   at 
> org.spark-project.guava.base.Throwables.propagate(Throwables.java:160)
>   at 
> org.apache.spark.network.client.TransportClient.sendRpcSync(TransportClient.java:278)
>   at 
> org.apache.spark.network.sasl.SaslClientBootstrap.doBootstrap(SaslClientBootstrap.java:80)
>   at 
> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
>   at 
> org.apache.spark.network.client.TransportClientFactory.createUnmanagedClient(TransportClientFactory.java:181)
>   at 
> org.apache.spark.network.shuffle.ExternalShuffleClient.registerWithShuffleServer(ExternalShuffleClient.java:141)
>   at 
> org.apache.spark.storage.BlockManager$$anonfun$registerWithExternalShuffleServer$1.apply$mcVI$sp(BlockManager.scala:218)
>  {noformat}
> After further investigation, we realized that as the number of concurrent 
> clients connecting to a shuffle service increases, it becomes _VERY_ 
> important to configure the number of Netty server threads and number of chunk 
> fetch handler threads correctly. Specifically, the number of Netty server 
> threads needs to be a multiple of the number of chunk fetch handler threads. 
> The reason is explained in details below:
> When a channel is established on the Netty server, it is registered with both 
> the Netty server default EventLoopGroup and the chunk fetch handler 
> EventLoopGroup. Once registered, this channel sticks with a given thread in 
> both EventLoopGroups, i.e. all requests from this channel is going to be 
> handled by the same thread. Right now, Spark shuffle Netty server uses the 
> default Netty strategy to select a thread from a EventLoopGroup to be 
> associated with a new channel, which is simply round-robin (Netty's 
> DefaultEventExecutorChooserFactory).
> In SPARK-24355, with the introduced chunk fetch handler thread pool, all 
> chunk fetch requests from a given channel will be first added to the task 
> queue of the chunk fetch handler thread associated with that channel. When 
> the requests get processed, the chunk fetch request handler thread will 
> submit a task to the task queue of the Netty server thread that's also 
> associated with this channel. If the number of Netty server threads is not a 
> multiple of the number of chunk fetch handler threads, it would become a 
> problem when the server has a large number of concurrent connections.
> Assume we configure the number of Netty server threads as 40 and the 
> percentage of chunk fetch handler threads as 87, which leads to 35 chunk 
> fetch handler threads. Then according to the round-robin policy, channel 0, 
> 40, 80, 120, 160, 200, 240, and 280 will all be associated with the 1st Netty 
> server thread in the default EventLoopGroup. However, since the chunk fetch 
> handler thread pool only has 35 threads, out of these 8 channels, only 
> channel 0 and 280 will be associated with the same chunk fetch handler 
> thread. Thus, channel 0, 40, 80, 120, 160, 200, 240 will all be associated 
> with different chunk fetch handler threads but associated with the same Netty 
> server thread. This means, the 7 different chunk fetch handler threads 
> associated with these channels could potentially submit tasks to the 

[jira] [Updated] (SPARK-29151) Support fraction resources for task resource scheduling

2019-09-20 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-29151:
--
Summary: Support fraction resources for task resource scheduling  (was: 
Support fraction resources for resource scheduling)

> Support fraction resources for task resource scheduling
> ---
>
> Key: SPARK-29151
> URL: https://issues.apache.org/jira/browse/SPARK-29151
> Project: Spark
>  Issue Type: Story
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> The current resource scheduling code for GPU/FPGA, etc only supports amounts 
> as integers, so you can only schedule whole resources.  There are cases where 
> you may want to share the resources and schedule multiple tasks to run on the 
> same resources (GPU).  It would be nice to support fractional resources.  
> Somehow say we want a task to have 1/4 of a GPU for instance.  I think we 
> only want to support fractional when the resources amount is < 1.  Otherwise 
> you run into issues where someone asks for 2 1/8 GPU, which doesn't really 
> make sense to me and makes assigning addresses very complicated.
> Need to think about implementation details, for instance using a float can be 
> troublesome here due to floating point math precision issues.
> Another thing to consider, depending on the implementation, is limiting the 
> precision - go down to tenths, hundredths, thousandths, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29151) Support fraction resources for task resource scheduling

2019-09-20 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-29151:
--
Description: 
The current resource scheduling code for GPU/FPGA, etc only supports amounts as 
integers, so you can only schedule whole resources.  There are cases where you 
may want to share the resources and schedule multiple tasks to run on the same 
resources (GPU).  It would be nice to support fractional resources for the task 
level settings.  Somehow say we want a task to have 1/4 of a GPU for instance.  
I think we only want to support fractional when the resources amount is < 1.  
Otherwise you run into issues where someone asks for 2 1/8 GPU, which doesn't 
really make sense to me and makes assigning addresses very complicated.

Need to think about implementation details, for instance using a float can be 
troublesome here due to floating point math precision issues.

Another thing to consider, depending on the implementation, is limiting the 
precision - go down to tenths, hundredths, thousandths, etc.

  was:
The current resource scheduling code for GPU/FPGA, etc only supports amounts as 
integers, so you can only schedule whole resources.  There are cases where you 
may want to share the resources and schedule multiple tasks to run on the same 
resources (GPU).  It would be nice to support fractional resources.  Somehow 
say we want a task to have 1/4 of a GPU for instance.  I think we only want to 
support fractional when the resources amount is < 1.  Otherwise you run into 
issues where someone asks for 2 1/8 GPU, which doesn't really make sense to me 
and makes assigning addresses very complicated.

Need to think about implementation details, for instance using a float can be 
troublesome here due to floating point math precision issues.

Another thing to consider, depending on the implementation, is limiting the 
precision - go down to tenths, hundredths, thousandths, etc.


> Support fraction resources for task resource scheduling
> ---
>
> Key: SPARK-29151
> URL: https://issues.apache.org/jira/browse/SPARK-29151
> Project: Spark
>  Issue Type: Story
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> The current resource scheduling code for GPU/FPGA, etc only supports amounts 
> as integers, so you can only schedule whole resources.  There are cases where 
> you may want to share the resources and schedule multiple tasks to run on the 
> same resources (GPU).  It would be nice to support fractional resources for 
> the task level settings.  Somehow say we want a task to have 1/4 of a GPU for 
> instance.  I think we only want to support fractional when the resources 
> amount is < 1.  Otherwise you run into issues where someone asks for 2 1/8 
> GPU, which doesn't really make sense to me and makes assigning addresses very 
> complicated.
> Need to think about implementation details, for instance using a float can be 
> troublesome here due to floating point math precision issues.
> Another thing to consider, depending on the implementation, is limiting the 
> precision - go down to tenths, hundredths, thousandths, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29151) Support fraction resources for resource scheduling

2019-09-20 Thread Thomas Graves (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934419#comment-16934419
 ] 

Thomas Graves commented on SPARK-29151:
---

To keep this simple for a design, I think we change the .amount config to be a 
Double and essentially make it express tasks per GPU.

So we only allow 0-0.5 or whole numbers 1, 2, 3, 4. We don't allow 1.25, for 
instance, because we have no way to tell the user which GPU they get 1/4 of. We 
only allow 0-0.5 because anything larger than 0.5 essentially just gives you 1 
task per GPU.

For the scheduler math I think we can do floor(1/amount). This should give us a 
nice whole number of tasks per GPU for the scheduler to track: floor(1/0.333) = 3. 
Basically the scheduler internally treats it as an int at that point, so we don't 
have issues with weird floating point precision.

I think this will be OK if we document it clearly and have log messages and such 
indicating what it is really using.
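
A minimal sketch of that interpretation (names and validation below are made up 
for illustration):

{code:scala}
object GpuAmountParsing {
  // Returns (gpus per task, tasks per gpu). Whole numbers mean each task gets
  // N GPUs; amounts in (0, 0.5] mean floor(1/amount) tasks share one GPU.
  def parseTaskGpuAmount(amount: Double): (Int, Int) = {
    require(amount > 0, s"resource amount must be > 0, got $amount")
    if (amount >= 1.0) {
      require(amount == amount.toInt, s"amounts >= 1 must be whole numbers, got $amount")
      (amount.toInt, 1)
    } else {
      require(amount <= 0.5, s"fractions above 0.5 just give 1 task per GPU, got $amount")
      // e.g. floor(1 / 0.333) = 3, so the scheduler tracks a whole number of
      // task slots per GPU and avoids floating point precision issues.
      (1, math.floor(1.0 / amount).toInt)
    }
  }
}
{code}

So parseTaskGpuAmount(0.25) would give (1, 4), i.e. four tasks sharing each GPU, 
while parseTaskGpuAmount(2) gives (2, 1).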

> Support fraction resources for resource scheduling
> --
>
> Key: SPARK-29151
> URL: https://issues.apache.org/jira/browse/SPARK-29151
> Project: Spark
>  Issue Type: Story
>  Components: Scheduler
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Priority: Major
>
> The current resource scheduling code for GPU/FPGA, etc only supports amounts 
> as integers, so you can only schedule whole resources.  There are cases where 
> you may want to share the resources and schedule multiple tasks to run on the 
> same resources (GPU).  It would be nice to support fractional resources.  
> Somehow say we want a task to have 1/4 of a GPU for instance.  I think we 
> only want to support fractional when the resources amount is < 1.  Otherwise 
> you run into issues where someone asks for 2 1/8 GPU, which doesn't really 
> make sense to me and makes assigning addresses very complicated.
> Need to think about implementation details, for instance using a float can be 
> troublesome here due to floating point math precision issues.
> Another thing to consider, depending on the implementation, is limiting the 
> precision - go down to tenths, hundredths, thousandths, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-09-18 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Epic Name: Stage Level Scheduling

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>  Labels: SPIP
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job. 
> Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
> extra Resources (GPU/FPGA/etc). It has the potential to allow for other 
> things like limiting the number of tasks per stage, specifying other 
> parameters for things like shuffle, etc. Initially I would propose we only 
> support resources as they are now. So Task resources would be cpu and other 
> resources (GPU, FPGA), that way we aren't adding in extra scheduling things 
> at this point.  Executor resources would be cpu, memory, and extra 
> resources(GPU,FPGA, etc). Changing the executor resources will rely on 
> dynamic allocation being enabled.
> Main use cases:
>  # ML use case where user does ETL and feeds it into an ML algorithm where 
> it’s using the RDD API. This should work with barrier scheduling as well once 
> it supports dynamic allocation.
>  # This adds the framework/api for Spark's own internal use.  In the future 
> (not covered by this SPIP), Catalyst could control the stage level resources 
> as it finds the need to change it between stages for different optimizations. 
> For instance, with the new columnar plugin to the query planner we can insert 
> stages into the plan that would change running something on the CPU in row 
> format to running it on the GPU in columnar format. This API would allow the 
> planner to make sure the stages that run on the GPU get the corresponding GPU 
> resources it needs to run. Another possible use case for catalyst is that it 
> would allow catalyst to add in more optimizations to where the user doesn’t 
> need to configure container sizes at all. If the optimizer/planner can handle 
> that for the user, everyone wins.
> This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I 
> think the DataSet API will require more changes because it specifically hides 
> the RDD from the users via the plans and catalyst can optimize the plan and 
> insert things into the plan. The only way I’ve found to make this work with 
> the Dataset API would be modifying all the plans to be able to get the 
> resource requirements down into where it creates the RDDs, which I believe 
> would be a lot of change.  If other people know better options, it would be 
> great to hear them.
> *Q2.* What problem is this proposal NOT designed to solve?
> The initial implementation is not going to add Dataset APIs.
> We are starting with allowing users to specify a specific set of 
> task/executor resources and plan to design it to be extendable, but the first 
> implementation will not support changing generic SparkConf configs and only 
> specific limited resources.
> This initial version will have a programmatic API for specifying the resource 
> requirements per stage, we can add the ability to perhaps have profiles in 
> the configs later if its useful.
> *Q3.* How is it done today, and what are the limits of current practice?
> Currently this is either done by having multiple spark jobs or requesting 
> containers with the max resources needed for any part of the job.  To do this 
> today, you can break it into separate jobs where each job requests the 
> corresponding resources needed, but then you have to write the 
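
As a purely hypothetical usage sketch of the idea described above (none of these 
type or method names are a committed API; the stub types below just stand in for 
whatever the real interface ends up being):

{code:scala}
// Stub types standing in for the eventual per-stage resource API.
case class TaskReqs(cpus: Int, gpus: Double)
case class ExecutorReqs(cores: Int, memory: String, gpus: Int)
case class StageResourceProfile(executor: ExecutorReqs, task: TaskReqs)

object StageLevelSketch {
  def main(args: Array[String]): Unit = {
    // The ETL stage runs with the default (small) resources; the ML stage asks
    // for a few large, GPU-equipped executors.
    val mlProfile = StageResourceProfile(
      executor = ExecutorReqs(cores = 16, memory = "48g", gpus = 4),
      task = TaskReqs(cpus = 4, gpus = 1.0))

    // In the eventual API, something like `etlRdd.withResources(mlProfile)` would
    // tag the RDD feeding the ML stage; here we only show the profile being built.
    println(mlProfile)
  }
}
{code}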

[jira] [Updated] (SPARK-27495) SPIP: Support Stage level resource configuration and scheduling

2019-09-18 Thread Thomas Graves (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-27495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated SPARK-27495:
--
Labels: SPIP  (was: )

> SPIP: Support Stage level resource configuration and scheduling
> ---
>
> Key: SPARK-27495
> URL: https://issues.apache.org/jira/browse/SPARK-27495
> Project: Spark
>  Issue Type: Epic
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Major
>  Labels: SPIP
>
> *Q1.* What are you trying to do? Articulate your objectives using absolutely 
> no jargon.
> Objectives:
>  # Allow users to specify task and executor resource requirements at the 
> stage level. 
>  # Spark will use the stage level requirements to acquire the necessary 
> resources/executors and schedule tasks based on the per stage requirements.
> Many times users have different resource requirements for different stages of 
> their application so they want to be able to configure resources at the stage 
> level. For instance, you have a single job that has 2 stages. The first stage 
> does some  ETL which requires a lot of tasks, each with a small amount of 
> memory and 1 core each. Then you have a second stage where you feed that ETL 
> data into an ML algorithm. The second stage only requires a few executors but 
> each executor needs a lot of memory, GPUs, and many cores.  This feature 
> allows the user to specify the task and executor resource requirements for 
> the ETL Stage and then change them for the ML stage of the job. 
> Resources include cpu, memory (on heap, overhead, pyspark, and off heap), and 
> extra Resources (GPU/FPGA/etc). It has the potential to allow for other 
> things like limiting the number of tasks per stage, specifying other 
> parameters for things like shuffle, etc. Initially I would propose we only 
> support resources as they are now. So Task resources would be cpu and other 
> resources (GPU, FPGA), that way we aren't adding in extra scheduling things 
> at this point.  Executor resources would be cpu, memory, and extra 
> resources(GPU,FPGA, etc). Changing the executor resources will rely on 
> dynamic allocation being enabled.
> Main use cases:
>  # ML use case where user does ETL and feeds it into an ML algorithm where 
> it’s using the RDD API. This should work with barrier scheduling as well once 
> it supports dynamic allocation.
>  # This adds the framework/api for Spark's own internal use.  In the future 
> (not covered by this SPIP), Catalyst could control the stage level resources 
> as it finds the need to change it between stages for different optimizations. 
> For instance, with the new columnar plugin to the query planner we can insert 
> stages into the plan that would change running something on the CPU in row 
> format to running it on the GPU in columnar format. This API would allow the 
> planner to make sure the stages that run on the GPU get the corresponding GPU 
> resources it needs to run. Another possible use case for catalyst is that it 
> would allow catalyst to add in more optimizations to where the user doesn’t 
> need to configure container sizes at all. If the optimizer/planner can handle 
> that for the user, everyone wins.
> This SPIP focuses on the RDD API but we don’t exclude the Dataset API. I 
> think the DataSet API will require more changes because it specifically hides 
> the RDD from the users via the plans and catalyst can optimize the plan and 
> insert things into the plan. The only way I’ve found to make this work with 
> the Dataset API would be modifying all the plans to be able to get the 
> resource requirements down into where it creates the RDDs, which I believe 
> would be a lot of change.  If other people know better options, it would be 
> great to hear them.
> *Q2.* What problem is this proposal NOT designed to solve?
> The initial implementation is not going to add Dataset APIs.
> We are starting with allowing users to specify a specific set of 
> task/executor resources and plan to design it to be extendable, but the first 
> implementation will not support changing generic SparkConf configs and only 
> specific limited resources.
> This initial version will have a programmatic API for specifying the resource 
> requirements per stage, we can add the ability to perhaps have profiles in 
> the configs later if its useful.
> *Q3.* How is it done today, and what are the limits of current practice?
> Currently this is either done by having multiple spark jobs or requesting 
> containers with the max resources needed for any part of the job.  To do this 
> today, you can break it into separate jobs where each job requests the 
> corresponding resources needed, but then you have to write the data out 
> 

[jira] [Created] (SPARK-29154) Update Spark scheduler for stage level scheduling

2019-09-18 Thread Thomas Graves (Jira)
Thomas Graves created SPARK-29154:
-

 Summary: Update Spark scheduler for stage level scheduling
 Key: SPARK-29154
 URL: https://issues.apache.org/jira/browse/SPARK-29154
 Project: Spark
  Issue Type: Story
  Components: Scheduler
Affects Versions: 3.0.0
Reporter: Thomas Graves


Make the changes to the DAGScheduler, Stage, TaskSetManager and TaskScheduler to 
support scheduling based on resource profiles.  Note that the logic to merge 
profiles has a separate JIRA.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


