[jira] [Assigned] (SPARK-22124) Sample and Limit should also defer input evaluation under codegen

2017-09-26 Thread Wenchen Fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-22124:
---

Assignee: Liang-Chi Hsieh

> Sample and Limit should also defer input evaluation under codegen
> -
>
> Key: SPARK-22124
> URL: https://issues.apache.org/jira/browse/SPARK-22124
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Liang-Chi Hsieh
>Assignee: Liang-Chi Hsieh
>Priority: Minor
> Fix For: 2.3.0
>
>
> We can override {{usedInputs}} to claim that an operator defers input
> evaluation under whole-stage codegen. {{Sample}} and {{Limit}} are two
> operators that should make this claim but currently don't; we should add it.






[jira] [Resolved] (SPARK-22124) Sample and Limit should also defer input evaluation under codegen

2017-09-26 Thread Wenchen Fan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-22124.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

Issue resolved by pull request 19345
[https://github.com/apache/spark/pull/19345]

> Sample and Limit should also defer input evaluation under codegen
> -
>
> Key: SPARK-22124
> URL: https://issues.apache.org/jira/browse/SPARK-22124
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Liang-Chi Hsieh
>Priority: Minor
> Fix For: 2.3.0
>
>
> We can override {{usedInputs}} to claim that an operator defers input
> evaluation under whole-stage codegen. {{Sample}} and {{Limit}} are two
> operators that should make this claim but currently don't; we should add it.






[jira] [Resolved] (SPARK-15574) Python meta-algorithms in Scala

2017-09-26 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-15574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-15574.
---
Resolution: Won't Fix

> Python meta-algorithms in Scala
> ---
>
> Key: SPARK-15574
> URL: https://issues.apache.org/jira/browse/SPARK-15574
> Project: Spark
>  Issue Type: Improvement
>  Components: ML, PySpark
>Reporter: Joseph K. Bradley
>
> This is an experimental idea for implementing Python ML meta-algorithms 
> (CrossValidator, TrainValidationSplit, Pipeline, OneVsRest, etc.) in Scala.  
> This would require a Scala wrapper for algorithms implemented in Python, 
> somewhat analogous to Python UDFs.
> The benefit of this change would be that we could avoid currently awkward 
> conversions between Scala/Python meta-algorithms required for persistence.  
> It would let us have full support for Python persistence and would generally 
> simplify the implementation within MLlib.






[jira] [Resolved] (SPARK-21235) UTest should clear temp results when run case

2017-09-26 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-21235.
---
Resolution: Won't Fix

> UTest should clear temp results when run case 
> --
>
> Key: SPARK-21235
> URL: https://issues.apache.org/jira/browse/SPARK-21235
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.1.1
>Reporter: wangjiaochun
>Priority: Minor
>







[jira] [Resolved] (SPARK-21737) Create communication channel between arbitrary clients and the Spark AM in YARN mode

2017-09-26 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-21737.
---
Resolution: Won't Fix

> Create communication channel between arbitrary clients and the Spark AM in 
> YARN mode
> 
>
> Key: SPARK-21737
> URL: https://issues.apache.org/jira/browse/SPARK-21737
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Jong Yoon Lee
>Priority: Minor
>
> This JIRA adds code to create a communication channel between arbitrary
> clients and a Spark AM on YARN. The channel can be used to send commands such
> as getting application status, getting history info from the CLI, killing the
> application, and pushing new tokens.
> Design Doc:
> https://docs.google.com/document/d/1QMbWhg13ocIoADywZQBRRVj-b9Zf8CnBrruP5JhcOOY/edit?usp=sharing






[jira] [Resolved] (SPARK-21655) Kill CLI for Yarn mode

2017-09-26 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-21655.
---
Resolution: Won't Fix

> Kill CLI for Yarn mode
> --
>
> Key: SPARK-21655
> URL: https://issues.apache.org/jira/browse/SPARK-21655
> Project: Spark
>  Issue Type: Improvement
>  Components: YARN
>Affects Versions: 2.1.1
>Reporter: Jong Yoon Lee
>Priority: Minor
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Similar to how standalone and Mesos modes can safely shut down a Spark
> application, there should be a way to safely shut down Spark in YARN mode.
> This would ensure a clean shutdown and unregistration from YARN.
> This is the design doc:
> https://docs.google.com/document/d/1QG8hITjLNi1D9dVR3b_hZkyrGm5FFm0u9M1KGM4y1Ak/edit?usp=sharing
> and I will upload the patch soon






[jira] [Resolved] (SPARK-22016) Add HiveDialect for JDBC connection to Hive

2017-09-26 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-22016.
---
Resolution: Won't Fix

> Add HiveDialect for JDBC connection to Hive
> ---
>
> Key: SPARK-22016
> URL: https://issues.apache.org/jira/browse/SPARK-22016
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.6.3, 2.2.1
>Reporter: Daniel Fernandez
>
> I found out there is no JDBC dialect for Hive in Spark, so I would like to
> add HiveDialect.scala to the package org.apache.spark.sql.jdbc to support it.
> Only two functions should be overridden:
> * canHandle
> * quoteIdentifier
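
A minimal sketch of what such a dialect could look like, assuming Spark's public JdbcDialect developer API; the URL prefix check and backtick quoting below are illustrative choices, not the final implementation:

{code}
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Illustrative sketch only: a Hive dialect overriding the two methods named above.
object HiveDialect extends JdbcDialect {

  // Claim JDBC URLs that target HiveServer2 (assumed prefix).
  override def canHandle(url: String): Boolean =
    url.toLowerCase.startsWith("jdbc:hive2")

  // Hive quotes identifiers with backticks rather than double quotes.
  override def quoteIdentifier(colName: String): String =
    s"`$colName`"
}

// Usage: register the dialect before reading or writing through JDBC.
// JdbcDialects.registerDialect(HiveDialect)
{code}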






[jira] [Resolved] (SPARK-22080) Allow developers to add pre-optimisation rules

2017-09-26 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-22080.
---
Resolution: Won't Fix

> Allow developers to add pre-optimisation rules
> --
>
> Key: SPARK-22080
> URL: https://issues.apache.org/jira/browse/SPARK-22080
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Sathiya Kumar
>Priority: Minor
>
> [SPARK-9843] added support for adding custom rules for optimising a
> LogicalPlan, but those rules are applied only after all of Spark's native
> rules have run. Allowing users to plug in pre-optimisation rules would
> facilitate some advanced optimisations.
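
For context, a short sketch of the existing hook that SPARK-9843 added (rules registered this way run only after Spark's built-in optimizer batches, which is exactly the limitation described above). The rule is a deliberately trivial placeholder; the snippet assumes something like a spark-shell session:

{code}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Placeholder rule: a real rule would rewrite the plan here.
object MyPostOptimizationRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("extra-optimizations")
  .getOrCreate()

// Rules registered here are applied only after all of Spark's native optimizer rules.
spark.experimental.extraOptimizations = Seq(MyPostOptimizationRule)
{code}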






[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180506#comment-16180506
 ] 

Saisai Shao commented on SPARK-22074:
-

Hi [~XuanYuan], can you please help me understand your scenario: does it happen
only when task attempt 66.0 is lost (and therefore added to the pending list),
and at that moment another attempt, 66.1, finishes and tries to kill 66.0, but
because 66.0 is pending resubmission it is not truly killed, so attempt 66.0
lingers in stage 1046.0 and keeps 1046 from finishing? Do I understand that
right?

Can you please explain more if my assumption is wrong?

> Task killed by other attempt task should not be resubmitted
> ---
>
> Key: SPARK-22074
> URL: https://issues.apache.org/jira/browse/SPARK-22074
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Li Yuanjian
>
> When a task is killed by another task attempt, it is still resubmitted when
> its executor is lost. There is a certain probability that this unnecessary
> resubmit causes the stage to hang forever (see the scenario description
> below). Although the patch https://issues.apache.org/jira/browse/SPARK-13931
> can resolve the hanging problem (thx [~GavinGavinNo1] :) ), the unnecessary
> resubmit should still be abandoned.
> Detailed scenario description:
> 1. A ShuffleMapStage has many tasks, and some of them finish successfully.
> 2. An executor is lost, which triggers a resubmitted TaskSet that includes
> all missing partitions.
> 3. Before the resubmitted TaskSet completes, another executor, which holds
> only the task killed by the other attempt, is lost. This triggers the
> Resubmitted event, and the current stage's pendingPartitions is not empty.
> 4. The resubmitted TaskSet ends with shuffleMapStage.isAvailable == true, but
> pendingPartitions is not empty, so submitWaitingChildStages is never reached.
> The key logs of this scenario are below:
> {noformat}
> 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 120 missing tasks from ShuffleMapStage 1046 
> (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116)
> 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.0 with 120 tasks
> 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: 
> Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, 
> executor 15, partition 66, PROCESS_LOCAL, 6237 bytes)
> [1] Executor 15 lost, task 66.0 and 90.0 on it
> 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15.
> 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: 
> Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, 
> executor 70, partition 66, PROCESS_LOCAL, 6237 bytes)
> [2] Task 66.0 killed by 66.1
> 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing 
> attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on 
> hidden-baidu-host.baidu.com as the attempt 1 succeeded on 
> hidden-baidu-host.baidu.com
> 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished 
> task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on 
> hidden-baidu-host.baidu.com (executor 70) (115/120)
> [3] Executor 7 lost, task 0.0 72.0 7.0 on it
> 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s
> [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted 
> 1046.1
> 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of 
> its tasks had failed: 0, 72, 79
> 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at 
> AFDEntry.scala:116), which has no missing parents
> 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] 
> at rdd at AFDEntry.scala:116)
> 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.1 with 3 tasks
> 416034:17/09/11 13:46:59 [dispatcher-event-loop-21] INFO TaskSetManager: 
> Starting task 0.0 in stage 1046.1 (TID 112788, hidden-baidu-host.baidu.com, 
> executor 37, partition 0, PROCESS_LOCAL, 6237 bytes)
> 416037:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 1.0 in stage 1046.1 (TID 112789, 
> yq01-inf-nmg01-spark03-2016

[jira] [Created] (SPARK-22126) Fix model-specific optimization support for ML tuning

2017-09-26 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-22126:
--

 Summary: Fix model-specific optimization support for ML tuning
 Key: SPARK-22126
 URL: https://issues.apache.org/jira/browse/SPARK-22126
 Project: Spark
  Issue Type: Improvement
  Components: ML
Affects Versions: 2.3.0
Reporter: Weichen Xu


Fix model-specific optimization support for ML tuning. This is discussed in 
SPARK-19357
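
For context, a hedged sketch of the ML tuning API surface this touches: Estimator.fit(dataset, paramMaps) is one hook where an estimator could apply model-specific optimization across a parameter grid instead of fitting each setting independently. This is illustrative only, not the proposed change:

{code}
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.tuning.ParamGridBuilder
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("tuning-sketch").getOrCreate()

// Tiny toy training set in the (label, features) layout expected by ML estimators.
val training = spark.createDataFrame(Seq(
  (1.0, Vectors.dense(0.0, 1.1, 0.1)),
  (0.0, Vectors.dense(2.0, 1.0, -1.0)),
  (1.0, Vectors.dense(0.0, 1.2, -0.5))
)).toDF("label", "features")

val lr = new LogisticRegression().setMaxIter(5)
val grid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.1, 0.01))
  .build()

// fit(dataset, paramMaps) is one place where an estimator could share work
// across the whole grid; the default implementation simply fits once per ParamMap.
val models = lr.fit(training, grid)
{code}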






[jira] [Comment Edited] (SPARK-19606) Support constraints in spark-dispatcher

2017-09-26 Thread Pascal GILLET (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076510#comment-16076510
 ] 

Pascal GILLET edited comment on SPARK-19606 at 9/26/17 10:17 AM:
-

+1, but with 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver runs does not comply
with the given Mesos constraints; on the other hand, the Mesos constraints are
applied correctly to the Spark executors (without the need to patch anything).

*BUT* we do not necessarily want to apply the same Mesos constraints to the
driver and the executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive
types of Mesos slaves:
- The dispatcher is given Mesos resources only for drivers.
- Once a driver is launched, it becomes a Mesos framework itself and is
responsible for reserving resources for its executors.
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the
whole cluster is allocated to the drivers and there are no resources left for
the executors. A driver may be launched but then wait indefinitely for its
executors' resources, which leads to congestion and then to a deadlock.
- A way to work around this problem is to *not* mix drivers and executors on
the same machines, by passing different Mesos constraints for the driver and
for the executors.

The 'spark.mesos.constraints' property would still apply to executors. As for
the drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic
property seems ideal. By definition, it allows setting default properties for
drivers submitted through the dispatcher.

I propose to revise this patch and to use
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of
'spark.mesos.constraints'.
What do you guys think?
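
To make the two property names concrete, a hedged illustration of how they differ; the attribute names/values and the use of SparkConf are purely illustrative, and in practice the driverDefault.* setting lives in the dispatcher's own configuration rather than the application's:

{code}
import org.apache.spark.SparkConf

// Illustrative only: constraint attributes/values below are hypothetical.
val conf = new SparkConf()
  // Applies to executors (existing behaviour of spark.mesos.constraints).
  .set("spark.mesos.constraints", "node_type:executor")
  // Default that the dispatcher would apply to every submitted driver
  // (the property proposed in the comment above).
  .set("spark.mesos.dispatcher.driverDefault.spark.mesos.constraints", "node_type:driver")
{code}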


was (Author: pgillet):
+1
Need to run Spark drivers and executors on 2 exclusive types of Mesos slaves 
through a Mesos constraint.
What about the spark.mesos.dispatcher.driverDefault.spark.mesos.constraints 
property ?

> Support constraints in spark-dispatcher
> ---
>
> Key: SPARK-19606
> URL: https://issues.apache.org/jira/browse/SPARK-19606
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.1.0
>Reporter: Philipp Hoffmann
>
> The `spark.mesos.constraints` configuration is ignored by the 
> spark-dispatcher. The constraints need to be passed in the Framework 
> information when registering with Mesos.






[jira] [Assigned] (SPARK-22126) Fix model-specific optimization support for ML tuning

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22126:


Assignee: Apache Spark

> Fix model-specific optimization support for ML tuning
> -
>
> Key: SPARK-22126
> URL: https://issues.apache.org/jira/browse/SPARK-22126
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Weichen Xu
>Assignee: Apache Spark
>
> Fix model-specific optimization support for ML tuning. This is discussed in 
> SPARK-19357






[jira] [Assigned] (SPARK-22126) Fix model-specific optimization support for ML tuning

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22126:


Assignee: (was: Apache Spark)

> Fix model-specific optimization support for ML tuning
> -
>
> Key: SPARK-22126
> URL: https://issues.apache.org/jira/browse/SPARK-22126
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Weichen Xu
>
> Fix model-specific optimization support for ML tuning. This is discussed in 
> SPARK-19357






[jira] [Commented] (SPARK-22126) Fix model-specific optimization support for ML tuning

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180583#comment-16180583
 ] 

Apache Spark commented on SPARK-22126:
--

User 'WeichenXu123' has created a pull request for this issue:
https://github.com/apache/spark/pull/19350

> Fix model-specific optimization support for ML tuning
> -
>
> Key: SPARK-22126
> URL: https://issues.apache.org/jira/browse/SPARK-22126
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 2.3.0
>Reporter: Weichen Xu
>
> Fix model-specific optimization support for ML tuning. This is discussed in 
> SPARK-19357






[jira] [Comment Edited] (SPARK-19606) Support constraints in spark-dispatcher

2017-09-26 Thread Pascal GILLET (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076510#comment-16076510
 ] 

Pascal GILLET edited comment on SPARK-19606 at 9/26/17 10:19 AM:
-

+1, but with 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver runs does not comply
with the given Mesos constraints; on the other hand, the Mesos constraints are
applied correctly to the Spark executors (without the need to patch anything).

*BUT* we do not necessarily want to apply the same Mesos constraints to the
driver and the executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive
types of Mesos slaves:
- The dispatcher is given Mesos resources only for drivers.
- Once a driver is launched, it becomes a Mesos framework itself and is
responsible for reserving resources for its executors.
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the
whole cluster is allocated to the drivers and there are no resources left for
the executors. A driver may be launched but then wait indefinitely for its
executors' resources, which leads to congestion and then to a deadlock.
- A way to work around this problem is to *not* mix drivers and executors on
the same machines, by passing different Mesos constraints for the driver and
for the executors.

The 'spark.mesos.constraints' property would still apply to executors. As for
the drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic
property seems ideal. By definition, it allows setting default properties for
drivers submitted through the dispatcher.

I propose to revise this patch and to use
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of
'spark.mesos.constraints'.
What do you guys think?


was (Author: pgillet):
+1 but with 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the 
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver is running does not 
comply to the given Mesos constraints, but on the other hand, the Mesos 
constraints are well applied for the Spark executors (without the need to patch 
anything).

*BUT* we do not want necessarily apply the same Mesos constraints for the 
driver and executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive 
types of Mesos slaves: 
- The dispatcher is given Mesos resources only for drivers
- Once a driver is launched, it becomes a Mesos framework itself and is 
responsible for reserving resources for its executors
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the 
whole cluster is allocated for the drivers and there are no more resources 
available for the executors. A driver may be launched but it may be waiting for 
resources for its executors infinitely, which leads to a congestion then to a 
dead-lock situation.
- A solution to work around this problem is to *not* mix the drivers and 
executors on the same machines by passing different Mesos constraints for the 
driver and for executors.

The 'spark.mesos.constraints' property still apply for executors. As for the 
drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic 
property seems ideal. By defintion, it allows to set default properties for 
drivers submitted through the dispatcher.

I propose to revise this patch and to use 
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of 
'spark.mesos.constraints'.
What do you guys think?

> Support constraints in spark-dispatcher
> ---
>
> Key: SPARK-19606
> URL: https://issues.apache.org/jira/browse/SPARK-19606
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.1.0
>Reporter: Philipp Hoffmann
>
> The `spark.mesos.constraints` configuration is ignored by the 
> spark-dispatcher. The constraints need to be passed in the Framework 
> information when registering with Mesos.






[jira] [Comment Edited] (SPARK-19606) Support constraints in spark-dispatcher

2017-09-26 Thread Pascal GILLET (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076510#comment-16076510
 ] 

Pascal GILLET edited comment on SPARK-19606 at 9/26/17 10:27 AM:
-

+1, but through 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver runs does not comply
with the given Mesos constraints; on the other hand, the Mesos constraints are
applied correctly to the Spark executors (without the need to patch anything).

*BUT* we do not necessarily want to apply the same Mesos constraints to the
driver and the executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive
types of Mesos slaves:
- The dispatcher is given Mesos resources only for drivers.
- Once a driver is launched, it becomes a Mesos framework itself and is
responsible for reserving resources for its executors.
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the
whole cluster is allocated to the drivers and there are no resources left for
the executors. A driver may be launched but then wait indefinitely for its
executors' resources, which leads to congestion and then to a deadlock.
- A way to work around this problem is to *not* mix drivers and executors on
the same machines, by passing different Mesos constraints for the driver and
for the executors.

The 'spark.mesos.constraints' property would still apply to executors. As for
the drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic
property seems ideal. By definition, it allows one to "_set default properties
for drivers submitted through the dispatcher_".

I propose to revise this patch and to use
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of
'spark.mesos.constraints'.
What do you guys think?


was (Author: pgillet):
+1 but through 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the 
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver is running does not 
comply to the given Mesos constraints, but on the other hand, the Mesos 
constraints are well applied for the Spark executors (without the need to patch 
anything).

*BUT* we do not want necessarily apply the same Mesos constraints for the 
driver and executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive 
types of Mesos slaves: 
- The dispatcher is given Mesos resources only for drivers
- Once a driver is launched, it becomes a Mesos framework itself and is 
responsible for reserving resources for its executors
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the 
whole cluster is allocated for the drivers and there are no more resources 
available for the executors. A driver may be launched but it may be waiting for 
resources for its executors infinitely, which leads to a congestion then to a 
dead-lock situation.
- A solution to work around this problem is to *not* mix the drivers and 
executors on the same machines by passing different Mesos constraints for the 
driver and for executors.

The 'spark.mesos.constraints' property still apply for executors. As for the 
drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic 
property seems ideal. By definition, it allows to "set default properties for 
drivers submitted through the dispatcher".

I propose to revise this patch and to use 
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of 
'spark.mesos.constraints'.
What do you guys think?

> Support constraints in spark-dispatcher
> ---
>
> Key: SPARK-19606
> URL: https://issues.apache.org/jira/browse/SPARK-19606
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.1.0
>Reporter: Philipp Hoffmann
>
> The `spark.mesos.constraints` configuration is ignored by the 
> spark-dispatcher. The constraints need to be passed in the Framework 
> information when registering with Mesos.






[jira] [Comment Edited] (SPARK-19606) Support constraints in spark-dispatcher

2017-09-26 Thread Pascal GILLET (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076510#comment-16076510
 ] 

Pascal GILLET edited comment on SPARK-19606 at 9/26/17 10:27 AM:
-

+1, but through 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver runs does not comply
with the given Mesos constraints; on the other hand, the Mesos constraints are
applied correctly to the Spark executors (without the need to patch anything).

*BUT* we do not necessarily want to apply the same Mesos constraints to the
driver and the executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive
types of Mesos slaves:
- The dispatcher is given Mesos resources only for drivers.
- Once a driver is launched, it becomes a Mesos framework itself and is
responsible for reserving resources for its executors.
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the
whole cluster is allocated to the drivers and there are no resources left for
the executors. A driver may be launched but then wait indefinitely for its
executors' resources, which leads to congestion and then to a deadlock.
- A way to work around this problem is to *not* mix drivers and executors on
the same machines, by passing different Mesos constraints for the driver and
for the executors.

The 'spark.mesos.constraints' property would still apply to executors. As for
the drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic
property seems ideal. By definition, it allows one to "set default properties
for drivers submitted through the dispatcher".

I propose to revise this patch and to use
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of
'spark.mesos.constraints'.
What do you guys think?


was (Author: pgillet):
+1 but with 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the 
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver is running does not 
comply to the given Mesos constraints, but on the other hand, the Mesos 
constraints are well applied for the Spark executors (without the need to patch 
anything).

*BUT* we do not want necessarily apply the same Mesos constraints for the 
driver and executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive 
types of Mesos slaves: 
- The dispatcher is given Mesos resources only for drivers
- Once a driver is launched, it becomes a Mesos framework itself and is 
responsible for reserving resources for its executors
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the 
whole cluster is allocated for the drivers and there are no more resources 
available for the executors. A driver may be launched but it may be waiting for 
resources for its executors infinitely, which leads to a congestion then to a 
dead-lock situation.
- A solution to work around this problem is to *not* mix the drivers and 
executors on the same machines by passing different Mesos constraints for the 
driver and for executors.

The 'spark.mesos.constraints' property still apply for executors. As for the 
drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic 
property seems ideal. By definition, it allows to set default properties for 
drivers submitted through the dispatcher.

I propose to revise this patch and to use 
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of 
'spark.mesos.constraints'.
What do you guys think?

> Support constraints in spark-dispatcher
> ---
>
> Key: SPARK-19606
> URL: https://issues.apache.org/jira/browse/SPARK-19606
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.1.0
>Reporter: Philipp Hoffmann
>
> The `spark.mesos.constraints` configuration is ignored by the 
> spark-dispatcher. The constraints need to be passed in the Framework 
> information when registering with Mesos.






[jira] [Created] (SPARK-22127) The Master Register Application Function requires an warn log to increase the waiting status

2017-09-26 Thread guoxiaolongzte (JIRA)
guoxiaolongzte created SPARK-22127:
--

 Summary: The Master Register Application Function requires an warn 
log to increase the waiting status
 Key: SPARK-22127
 URL: https://issues.apache.org/jira/browse/SPARK-22127
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.3.0
Reporter: guoxiaolongzte



The Master's register-application path should log a warning when an
application stays in the WAITING state.

When I create a Spark application whose resource request hits the cluster's
ceiling, the current workers do not have enough resources to allocate an
executor, and the application state stays WAITING.
However, I cannot tell from the Spark Master log that this is the situation,
which makes the problem hard to diagnose; I mistakenly thought the Master or
Worker process had died.

So I added a warning log for the WAITING state, which helps us locate this
kind of problem.







[jira] [Comment Edited] (SPARK-19606) Support constraints in spark-dispatcher

2017-09-26 Thread Pascal GILLET (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076510#comment-16076510
 ] 

Pascal GILLET edited comment on SPARK-19606 at 9/26/17 11:37 AM:
-

+1, but through 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver runs does not comply
with the given Mesos constraints; on the other hand, the Mesos constraints are
applied correctly to the Spark executors (without the need to patch anything).

*BUT* we do not necessarily want to apply the same Mesos constraints to the
driver and the executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive
types of Mesos slaves:
- The dispatcher is given Mesos resources only for drivers.
- Once a driver is launched, it becomes a Mesos framework itself and is
responsible for reserving resources for its executors.
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the
whole cluster is allocated to the drivers and there are no resources left for
the executors. A driver may be launched but then wait indefinitely for its
executors' resources, which leads to congestion and then to a deadlock.
- A way to work around this problem is to *not* mix drivers and executors on
the same machines, by passing different Mesos constraints for the driver and
for the executors.

The 'spark.mesos.constraints' property would still apply to executors. As for
the drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic
property seems ideal. By definition, it allows one to "_set default properties
for drivers submitted through the dispatcher_".

Thus, I propose to revise this patch and to use
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of
'spark.mesos.constraints'.
What do you guys think?


was (Author: pgillet):
+1 but through 'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints'!

I tested the patch and it works well!
As stated originally, the 'spark.mesos.constraints' property is ignored by the 
Spark Dispatcher.
As a consequence, the Mesos slave where the Spark driver is running does not 
comply to the given Mesos constraints, but on the other hand, the Mesos 
constraints are well applied for the Spark executors (without the need to patch 
anything).

*BUT* we do not want necessarily apply the same Mesos constraints for the 
driver and executors.
For instance, we may need to run Spark drivers and executors on 2 exclusive 
types of Mesos slaves: 
- The dispatcher is given Mesos resources only for drivers
- Once a driver is launched, it becomes a Mesos framework itself and is 
responsible for reserving resources for its executors
- If we schedule too many jobs on a Mesos cluster through the dispatcher, the 
whole cluster is allocated for the drivers and there are no more resources 
available for the executors. A driver may be launched but it may be waiting for 
resources for its executors infinitely, which leads to a congestion then to a 
dead-lock situation.
- A solution to work around this problem is to *not* mix the drivers and 
executors on the same machines by passing different Mesos constraints for the 
driver and for executors.

The 'spark.mesos.constraints' property still apply for executors. As for the 
drivers, the 'spark.mesos.dispatcher.driverDefault.[PropertyName]' generic 
property seems ideal. By definition, it allows to "_set default properties for 
drivers submitted through the dispatcher_".

I propose to revise this patch and to use 
'spark.mesos.dispatcher.driverDefault.spark.mesos.constraints' instead of 
'spark.mesos.constraints'.
What do you guys think?

> Support constraints in spark-dispatcher
> ---
>
> Key: SPARK-19606
> URL: https://issues.apache.org/jira/browse/SPARK-19606
> Project: Spark
>  Issue Type: New Feature
>  Components: Mesos
>Affects Versions: 2.1.0
>Reporter: Philipp Hoffmann
>
> The `spark.mesos.constraints` configuration is ignored by the 
> spark-dispatcher. The constraints need to be passed in the Framework 
> information when registering with Mesos.






[jira] [Commented] (SPARK-22127) The Master Register Application Function requires an warn log to increase the waiting status

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180635#comment-16180635
 ] 

Apache Spark commented on SPARK-22127:
--

User 'guoxiaolongzte' has created a pull request for this issue:
https://github.com/apache/spark/pull/19351

> The Master Register Application Function requires an warn log to increase the 
> waiting status
> 
>
> Key: SPARK-22127
> URL: https://issues.apache.org/jira/browse/SPARK-22127
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: guoxiaolongzte
>
> The Master's register-application path should log a warning when an
> application stays in the WAITING state.
> When I create a Spark application whose resource request hits the cluster's
> ceiling, the current workers do not have enough resources to allocate an
> executor, and the application state stays WAITING.
> However, I cannot tell from the Spark Master log that this is the situation,
> which makes the problem hard to diagnose; I mistakenly thought the Master or
> Worker process had died.
> So I added a warning log for the WAITING state, which helps us locate this
> kind of problem.






[jira] [Assigned] (SPARK-22127) The Master Register Application Function requires an warn log to increase the waiting status

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22127:


Assignee: (was: Apache Spark)

> The Master Register Application Function requires an warn log to increase the 
> waiting status
> 
>
> Key: SPARK-22127
> URL: https://issues.apache.org/jira/browse/SPARK-22127
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: guoxiaolongzte
>
> The Master's register-application path should log a warning when an
> application stays in the WAITING state.
> When I create a Spark application whose resource request hits the cluster's
> ceiling, the current workers do not have enough resources to allocate an
> executor, and the application state stays WAITING.
> However, I cannot tell from the Spark Master log that this is the situation,
> which makes the problem hard to diagnose; I mistakenly thought the Master or
> Worker process had died.
> So I added a warning log for the WAITING state, which helps us locate this
> kind of problem.






[jira] [Assigned] (SPARK-22127) The Master Register Application Function requires an warn log to increase the waiting status

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22127:


Assignee: Apache Spark

> The Master Register Application Function requires an warn log to increase the 
> waiting status
> 
>
> Key: SPARK-22127
> URL: https://issues.apache.org/jira/browse/SPARK-22127
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: guoxiaolongzte
>Assignee: Apache Spark
>
> The Master's register-application path should log a warning when an
> application stays in the WAITING state.
> When I create a Spark application whose resource request hits the cluster's
> ceiling, the current workers do not have enough resources to allocate an
> executor, and the application state stays WAITING.
> However, I cannot tell from the Spark Master log that this is the situation,
> which makes the problem hard to diagnose; I mistakenly thought the Master or
> Worker process had died.
> So I added a warning log for the WAITING state, which helps us locate this
> kind of problem.






[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Li Yuanjian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180652#comment-16180652
 ] 

Li Yuanjian commented on SPARK-22074:
-

Hi [~jerryshao], thanks for your comment.
In my scenario, 66.0 is truly killed by 66.1. The root cause of 1046.0 failing
to finish is that the Resubmitted event for task 66.0 (which had been killed by
66.1) arrived while 1046.1 was running.

> Task killed by other attempt task should not be resubmitted
> ---
>
> Key: SPARK-22074
> URL: https://issues.apache.org/jira/browse/SPARK-22074
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Li Yuanjian
>
> When a task is killed by another task attempt, it is still resubmitted when
> its executor is lost. There is a certain probability that this unnecessary
> resubmit causes the stage to hang forever (see the scenario description
> below). Although the patch https://issues.apache.org/jira/browse/SPARK-13931
> can resolve the hanging problem (thx [~GavinGavinNo1] :) ), the unnecessary
> resubmit should still be abandoned.
> Detailed scenario description:
> 1. A ShuffleMapStage has many tasks, and some of them finish successfully.
> 2. An executor is lost, which triggers a resubmitted TaskSet that includes
> all missing partitions.
> 3. Before the resubmitted TaskSet completes, another executor, which holds
> only the task killed by the other attempt, is lost. This triggers the
> Resubmitted event, and the current stage's pendingPartitions is not empty.
> 4. The resubmitted TaskSet ends with shuffleMapStage.isAvailable == true, but
> pendingPartitions is not empty, so submitWaitingChildStages is never reached.
> The key logs of this scenario are below:
> {noformat}
> 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 120 missing tasks from ShuffleMapStage 1046 
> (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116)
> 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.0 with 120 tasks
> 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: 
> Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, 
> executor 15, partition 66, PROCESS_LOCAL, 6237 bytes)
> [1] Executor 15 lost, task 66.0 and 90.0 on it
> 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15.
> 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: 
> Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, 
> executor 70, partition 66, PROCESS_LOCAL, 6237 bytes)
> [2] Task 66.0 killed by 66.1
> 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing 
> attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on 
> hidden-baidu-host.baidu.com as the attempt 1 succeeded on 
> hidden-baidu-host.baidu.com
> 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished 
> task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on 
> hidden-baidu-host.baidu.com (executor 70) (115/120)
> [3] Executor 7 lost, task 0.0 72.0 7.0 on it
> 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s
> [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted 
> 1046.1
> 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of 
> its tasks had failed: 0, 72, 79
> 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at 
> AFDEntry.scala:116), which has no missing parents
> 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] 
> at rdd at AFDEntry.scala:116)
> 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.1 with 3 tasks
> 416034:17/09/11 13:46:59 [dispatcher-event-loop-21] INFO TaskSetManager: 
> Starting task 0.0 in stage 1046.1 (TID 112788, hidden-baidu-host.baidu.com, 
> executor 37, partition 0, PROCESS_LOCAL, 6237 bytes)
> 416037:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 1.0 in stage 1046.1 (TID 112789, 
> yq01-inf-nmg01-spark03-20160817113538.yq01.baidu.com, executor 69, partition 
> 72, PROCESS_LOCAL, 6237 bytes)
> 416039:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 2.0 in stage 1046.1 (TID 112790, hidden-baidu-host.baidu.com, 
> execu

[jira] [Commented] (SPARK-14540) Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner

2017-09-26 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180658#comment-16180658
 ] 

Sean Owen commented on SPARK-14540:
---

I might have spoken too soon. After solving some other 2.12 issues, I am now 
facing this:

{code}
[ERROR] Tests run: 18, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.165 
s <<< FAILURE! - in test.org.apache.spark.Java8RDDAPISuite
[ERROR] foldByKey(test.org.apache.spark.Java8RDDAPISuite)  Time elapsed: 0.084 
s  <<< ERROR!
org.apache.spark.SparkException: 
Job aborted due to stage failure: Task not serializable: 
java.io.NotSerializableException: scala.runtime.LazyRef
Serialization stack:
- object not serializable (class: scala.runtime.LazyRef, value: LazyRef 
thunk)
- element of array (index: 2)
- array (class [Ljava.lang.Object;, size 3)
- field (class: java.lang.invoke.SerializedLambda, name: capturedArgs, 
type: class [Ljava.lang.Object;)
- object (class java.lang.invoke.SerializedLambda, 
SerializedLambda[capturingClass=class org.apache.spark.rdd.PairRDDFunctions, 
functionalInterfaceMethod=scala/Function0.apply:()Ljava/lang/Object;, 
implementation=invokeStatic 
org/apache/spark/rdd/PairRDDFunctions.$anonfun$foldByKey$2:(Lorg/apache/spark/rdd/PairRDDFunctions;[BLscala/runtime/LazyRef;)Ljava/lang/Object;,
 instantiatedMethodType=()Ljava/lang/Object;, numCaptured=3])
- writeReplace data (class: java.lang.invoke.SerializedLambda)
- object (class 
org.apache.spark.rdd.PairRDDFunctions$$Lambda$1249/2053647669, 
org.apache.spark.rdd.PairRDDFunctions$$Lambda$1249/2053647669@2bf19860)
- element of array (index: 0)
- array (class [Ljava.lang.Object;, size 2)
- field (class: java.lang.invoke.SerializedLambda, name: capturedArgs, 
type: class [Ljava.lang.Object;)
- object (class java.lang.invoke.SerializedLambda, 
SerializedLambda[capturingClass=class org.apache.spark.rdd.PairRDDFunctions, 
functionalInterfaceMethod=scala/Function1.apply:(Ljava/lang/Object;)Ljava/lang/Object;,
 implementation=invokeStatic 
org/apache/spark/rdd/PairRDDFunctions.$anonfun$foldByKey$3:(Lscala/Function0;Lscala/Function2;Ljava/lang/Object;)Ljava/lang/Object;,
 instantiatedMethodType=(Ljava/lang/Object;)Ljava/lang/Object;, numCaptured=2])
- writeReplace data (class: java.lang.invoke.SerializedLambda)
- object (class 
org.apache.spark.rdd.PairRDDFunctions$$Lambda$1250/250144767, 
org.apache.spark.rdd.PairRDDFunctions$$Lambda$1250/250144767@36d4186f)
- field (class: org.apache.spark.Aggregator, name: createCombiner, 
type: interface scala.Function1)
- object (class org.apache.spark.Aggregator, 
Aggregator(org.apache.spark.rdd.PairRDDFunctions$$Lambda$1250/250144767@36d4186f,org.apache.spark.api.java.JavaPairRDD$$$Lambda$832/1799521220@551d5576,org.apache.spark.api.java.JavaPairRDD$$$Lambda$832/1799521220@551d5576))
- field (class: scala.Some, name: value, type: class java.lang.Object)
- object (class scala.Some, 
Some(Aggregator(org.apache.spark.rdd.PairRDDFunctions$$Lambda$1250/250144767@36d4186f,org.apache.spark.api.java.JavaPairRDD$$$Lambda$832/1799521220@551d5576,org.apache.spark.api.java.JavaPairRDD$$$Lambda$832/1799521220@551d5576)))
- field (class: org.apache.spark.ShuffleDependency, name: aggregator, 
type: class scala.Option)
- object (class org.apache.spark.ShuffleDependency, 
org.apache.spark.ShuffleDependency@aa278a4)
- field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
- object (class scala.Tuple2, (ParallelCollectionRDD[0] at 
parallelizePairs at 
Java8RDDAPISuite.java:137,org.apache.spark.ShuffleDependency@aa278a4))
at 
test.org.apache.spark.Java8RDDAPISuite.foldByKey(Java8RDDAPISuite.java:139)
{code}

This might be the current manifestation of the same problem, not sure.
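
For reference, a minimal standalone sketch that should reproduce the failure above, assuming a Spark build for Scala 2.12.0-2.12.3 (the non-serializable LazyRef comes from the lazily evaluated zero value inside foldByKey, per scala/bug#10522):

{code}
import org.apache.spark.{SparkConf, SparkContext}

object FoldByKeyRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("foldByKey-lazyref-repro"))
    try {
      // With Scala 2.12 older than 2.12.4, the closure generated inside foldByKey
      // captures a scala.runtime.LazyRef, which is not serializable, so this job
      // fails with the NotSerializableException shown above.
      val result = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))
        .foldByKey(0)(_ + _)
        .collect()
      println(result.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
{code}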


> Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner
> 
>
> Key: SPARK-14540
> URL: https://issues.apache.org/jira/browse/SPARK-14540
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Reporter: Josh Rosen
>
> Using https://github.com/JoshRosen/spark/tree/build-for-2.12, I tried running 
> ClosureCleanerSuite with Scala 2.12 and ran into two bad test failures:
> {code}
> [info] - toplevel return statements in closures are identified at cleaning 
> time *** FAILED *** (32 milliseconds)
> [info]   Expected exception 
> org.apache.spark.util.ReturnStatementInClosureException to be thrown, but no 
> exception was thrown. (ClosureCleanerSuite.scala:57)
> {code}
> and
> {code}
> [info] - user provided closures are actually cleaned *** FAILED *** (56 
> milliseconds)
> [info]   Expected ReturnStatementInClosureException, but got 
> org.apache.spark.SparkE

[jira] [Commented] (SPARK-14540) Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner

2017-09-26 Thread Lukas Rytz (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-14540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180689#comment-16180689
 ] 

Lukas Rytz commented on SPARK-14540:


Ah, you just found a new bug in Scala 2.12! I created a ticket with a small 
reproducer: https://github.com/scala/bug/issues/10522. We'll fix this for 
2.12.4, which will be out soon, probably in two weeks.

> Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner
> 
>
> Key: SPARK-14540
> URL: https://issues.apache.org/jira/browse/SPARK-14540
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Reporter: Josh Rosen
>
> Using https://github.com/JoshRosen/spark/tree/build-for-2.12, I tried running 
> ClosureCleanerSuite with Scala 2.12 and ran into two bad test failures:
> {code}
> [info] - toplevel return statements in closures are identified at cleaning 
> time *** FAILED *** (32 milliseconds)
> [info]   Expected exception 
> org.apache.spark.util.ReturnStatementInClosureException to be thrown, but no 
> exception was thrown. (ClosureCleanerSuite.scala:57)
> {code}
> and
> {code}
> [info] - user provided closures are actually cleaned *** FAILED *** (56 
> milliseconds)
> [info]   Expected ReturnStatementInClosureException, but got 
> org.apache.spark.SparkException: Job aborted due to stage failure: Task not 
> serializable: java.io.NotSerializableException: java.lang.Object
> [info]- element of array (index: 0)
> [info]- array (class "[Ljava.lang.Object;", size: 1)
> [info]- field (class "java.lang.invoke.SerializedLambda", name: 
> "capturedArgs", type: "class [Ljava.lang.Object;")
> [info]- object (class "java.lang.invoke.SerializedLambda", 
> SerializedLambda[capturingClass=class 
> org.apache.spark.util.TestUserClosuresActuallyCleaned$, 
> functionalInterfaceMethod=scala/runtime/java8/JFunction1$mcII$sp.apply$mcII$sp:(I)I,
>  implementation=invokeStatic 
> org/apache/spark/util/TestUserClosuresActuallyCleaned$.org$apache$spark$util$TestUserClosuresActuallyCleaned$$$anonfun$69:(Ljava/lang/Object;I)I,
>  instantiatedMethodType=(I)I, numCaptured=1])
> [info]- element of array (index: 0)
> [info]- array (class "[Ljava.lang.Object;", size: 1)
> [info]- field (class "java.lang.invoke.SerializedLambda", name: 
> "capturedArgs", type: "class [Ljava.lang.Object;")
> [info]- object (class "java.lang.invoke.SerializedLambda", 
> SerializedLambda[capturingClass=class org.apache.spark.rdd.RDD, 
> functionalInterfaceMethod=scala/Function3.apply:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;,
>  implementation=invokeStatic 
> org/apache/spark/rdd/RDD.org$apache$spark$rdd$RDD$$$anonfun$20$adapted:(Lscala/Function1;Lorg/apache/spark/TaskContext;Ljava/lang/Object;Lscala/collection/Iterator;)Lscala/collection/Iterator;,
>  
> instantiatedMethodType=(Lorg/apache/spark/TaskContext;Ljava/lang/Object;Lscala/collection/Iterator;)Lscala/collection/Iterator;,
>  numCaptured=1])
> [info]- field (class "org.apache.spark.rdd.MapPartitionsRDD", name: 
> "f", type: "interface scala.Function3")
> [info]- object (class "org.apache.spark.rdd.MapPartitionsRDD", 
> MapPartitionsRDD[2] at apply at Transformer.scala:22)
> [info]- field (class "scala.Tuple2", name: "_1", type: "class 
> java.lang.Object")
> [info]- root object (class "scala.Tuple2", (MapPartitionsRDD[2] at 
> apply at 
> Transformer.scala:22,org.apache.spark.SparkContext$$Lambda$957/431842435@6e803685)).
> [info]   This means the closure provided by user is not actually cleaned. 
> (ClosureCleanerSuite.scala:78)
> {code}
> We'll need to figure out a closure cleaning strategy which works for 2.12 
> lambdas.
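
For context, a minimal sketch of the kind of closure the first failing test exercises 
(illustrative only, not the suite's exact code): a lambda containing a non-local 
{{return}}, which ClosureCleaner is expected to reject at cleaning time with 
{{ReturnStatementInClosureException}}.
{code}
import org.apache.spark.SparkContext

def firstOverThreshold(sc: SparkContext): Int = {
  sc.parallelize(1 to 100).map { i =>
    if (i > 42) return i // non-local return escaping the enclosing method
    i
  }.first()
}
{code}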



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22128) Update paranamer to 2.8 to avoid BytecodeReadingParanamer ArrayIndexOutOfBoundsException with Scala 2.12 + Java 8 lambda

2017-09-26 Thread Sean Owen (JIRA)
Sean Owen created SPARK-22128:
-

 Summary: Update paranamer to 2.8 to avoid BytecodeReadingParanamer 
ArrayIndexOutOfBoundsException with Scala 2.12 + Java 8 lambda
 Key: SPARK-22128
 URL: https://issues.apache.org/jira/browse/SPARK-22128
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 2.3.0
Reporter: Sean Owen
Assignee: Sean Owen
Priority: Minor


Minor, but probably needs a JIRA.

After updating to Scala 2.12 I encountered this issue pretty quickly in tests: 
https://github.com/paul-hammant/paranamer/issues/17

{code}
java.lang.ArrayIndexOutOfBoundsException: 
at 
com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
...
{code}

Spark depends on jackson-module-paranamer 2.6.7 to match the other jackson 
deps, and that depends on paranamer 2.6. The bug above is fixed in 2.8.

However I noticed that, really, Spark uses jackson-module-scala 2.6.7.1, and 
that in turn actually depends on jackson-module-paranamer 2.7.9 (kind of odd), 
which already depends on 2.8.

So it seemed prudent not to manually manage this dependency down to an older 
version. We then still need to manage paranamer up to 2.8, because Avro 1.7 pulls 
in 2.3.

But it all seems to work in a quick test. And it's necessary to get 2.12 
working.
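
For an application build that runs into the same conflict, a hedged sbt-style 
override could look like the following (Spark's own build manages this in its POM; 
this snippet is only an illustration):
{code}
// Pin paranamer to 2.8 so the transitive 2.3/2.6 versions are overridden.
dependencyOverrides += "com.thoughtworks.paranamer" % "paranamer" % "2.8"
{code}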



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22128) Update paranamer to 2.8 to avoid BytecodeReadingParanamer ArrayIndexOutOfBoundsException with Scala 2.12 + Java 8 lambda

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22128:


Assignee: Apache Spark  (was: Sean Owen)

> Update paranamer to 2.8 to avoid BytecodeReadingParanamer 
> ArrayIndexOutOfBoundsException with Scala 2.12 + Java 8 lambda
> 
>
> Key: SPARK-22128
> URL: https://issues.apache.org/jira/browse/SPARK-22128
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Sean Owen
>Assignee: Apache Spark
>Priority: Minor
>
> Minor, but probably needs a JIRA.
> After updating to Scala 2.12 I encountered this issue pretty quickly in 
> tests: https://github.com/paul-hammant/paranamer/issues/17
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 
> at 
> com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
> ...
> {code}
> Spark depends on jackson-module-paranamer 2.6.7 to match the other jackson 
> deps, and that depends on paranamer 2.6. The bug above is fixed in 2.8.
> However I noticed that, really, Spark uses jackson-module-scala 2.6.7.1, and 
> that in turn actually depends on jackson-module-paranamer 2.7.9 (kind of 
> odd), which already depends on 2.8.
> So it seemed prudent not to manually manage this dependency down to an older 
> version. We then still need to manage paranamer up to 2.8, because Avro 1.7 
> pulls in 2.3.
> But it all seems to work in a quick test. And it's necessary to get 2.12 
> working.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22128) Update paranamer to 2.8 to avoid BytecodeReadingParanamer ArrayIndexOutOfBoundsException with Scala 2.12 + Java 8 lambda

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22128:


Assignee: Sean Owen  (was: Apache Spark)

> Update paranamer to 2.8 to avoid BytecodeReadingParanamer 
> ArrayIndexOutOfBoundsException with Scala 2.12 + Java 8 lambda
> 
>
> Key: SPARK-22128
> URL: https://issues.apache.org/jira/browse/SPARK-22128
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Sean Owen
>Assignee: Sean Owen
>Priority: Minor
>
> Minor, but probably needs a JIRA.
> After updating to Scala 2.12 I encountered this issue pretty quickly in 
> tests: https://github.com/paul-hammant/paranamer/issues/17
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 
> at 
> com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
> ...
> {code}
> Spark depends on jackson-module-paranamer 2.6.7 to match the other jackson 
> deps, and that depends on paranamer 2.6. The bug above is fixed in 2.8.
> However I noticed that, really, Spark uses jackson-module-scala 2.6.7.1, and 
> that in turn actually depends on jackson-module-paranamer 2.7.9 (kind of 
> odd), which already depends on 2.8.
> So it seemed prudent not to manually manage this dependency down to an older 
> version. We then still need to manage paranamer up to 2.8, because Avro 1.7 
> pulls in 2.3.
> But it all seems to work in a quick test. And it's necessary to get 2.12 
> working.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22128) Update paranamer to 2.8 to avoid BytecodeReadingParanamer ArrayIndexOutOfBoundsException with Scala 2.12 + Java 8 lambda

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180745#comment-16180745
 ] 

Apache Spark commented on SPARK-22128:
--

User 'srowen' has created a pull request for this issue:
https://github.com/apache/spark/pull/19352

> Update paranamer to 2.8 to avoid BytecodeReadingParanamer 
> ArrayIndexOutOfBoundsException with Scala 2.12 + Java 8 lambda
> 
>
> Key: SPARK-22128
> URL: https://issues.apache.org/jira/browse/SPARK-22128
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Sean Owen
>Assignee: Sean Owen
>Priority: Minor
>
> Minor, but probably needs a JIRA.
> After updating to Scala 2.12 I encountered this issue pretty quickly in 
> tests: https://github.com/paul-hammant/paranamer/issues/17
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 
> at 
> com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
> ...
> {code}
> Spark depends on jackson-module-paranamer 2.6.7 to match the other jackson 
> deps, and that depends on paranamer 2.6. The bug above is fixed in 2.8.
> However I noticed that, really, Spark uses jackson-module-scala 2.6.7.1, and 
> that in turn actually depends on jackson-module-paranamer 2.7.9 (kind of 
> odd), which already depends on 2.8.
> So it seemed prudent not to manually manage this dependency down to an older 
> version. We then still need to manage paranamer up to 2.8, because Avro 1.7 
> pulls in 2.3.
> But it all seems to work in a quick test. And it's necessary to get 2.12 
> working.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22103) Move HashAggregateExec parent consume to a separate function in codegen

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180801#comment-16180801
 ] 

Apache Spark commented on SPARK-22103:
--

User 'juliuszsompolski' has created a pull request for this issue:
https://github.com/apache/spark/pull/19353

> Move HashAggregateExec parent consume to a separate function in codegen
> ---
>
> Key: SPARK-22103
> URL: https://issues.apache.org/jira/browse/SPARK-22103
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Juliusz Sompolski
>Assignee: Juliusz Sompolski
> Fix For: 2.3.0
>
>
> HashAggregateExec codegen uses two paths: one for the fast hash table and a 
> generic one.
> It generates code paths for iterating over both, and both code paths generate 
> the consume code of the parent operator, resulting in that code being 
> expanded twice.
> This leads to a long generated function that might be an issue for the 
> compiler (see e.g. SPARK-21603).
> I propose to remove the double expansion by generating the consume code in a 
> helper function that can just be called from both iterating loops.
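
As a schematic of the change in plain Scala (not the actual generated Java code): 
the shared consume body is emitted once as a helper and invoked from both 
iteration paths instead of being expanded inline in each of them.
{code}
// Schematic only: consumeParent stands in for the parent operator's consume code.
def processBoth(fastRows: Iterator[Long], regularRows: Iterator[Long]): Long = {
  var acc = 0L
  def consumeParent(row: Long): Unit = { acc += row } // generated once, not twice
  fastRows.foreach(consumeParent)     // fast hash map iteration path
  regularRows.foreach(consumeParent)  // regular hash map iteration path
  acc
}
{code}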



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22129) Spark release scripts ignore the GPG_KEY and always sign with your default key

2017-09-26 Thread holdenk (JIRA)
holdenk created SPARK-22129:
---

 Summary: Spark release scripts ignore the GPG_KEY and always sign 
with your default key
 Key: SPARK-22129
 URL: https://issues.apache.org/jira/browse/SPARK-22129
 Project: Spark
  Issue Type: Bug
  Components: Build
Affects Versions: 2.2.1, 2.3.0
Reporter: holdenk
Assignee: holdenk
Priority: Blocker


Currently the release scripts require GPG_KEY be specified but the param is 
ignored and instead the default GPG key is used. Change this to sign with the 
specified key.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22123) Add latest failure reason for task set blacklist

2017-09-26 Thread Imran Rashid (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Rashid updated SPARK-22123:
-
Summary: Add latest failure reason for task set blacklist  (was: Add latest 
failure reason for task set blacklis)

> Add latest failure reason for task set blacklist
> 
>
> Key: SPARK-22123
> URL: https://issues.apache.org/jira/browse/SPARK-22123
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: zhoukang
>
> Until now, every job aborted because its task set is completely blacklisted just 
> shows a log like the one below, which carries no further information:
> {code:java}
> Aborting $taskSet because task $indexInTaskSet (partition $partition) cannot 
> run anywhere due to node and executor blacklist. Blacklisting behavior can be 
> configured via spark.blacklist.*.
> {code}
> We could add the most recent failure reason for the task set blacklist and show 
> it on the Spark UI, so users can see the failure reason directly.
> An example after the change:
> {code:java}
> User class threw exception: org.apache.spark.SparkException: Job aborted due 
> to stage failure: Aborting TaskSet 0.0 because task 0 (partition 0) cannot 
> run anywhere due to node and executor blacklist. **Latest failure reason is** 
> Some(Lost task 0.1 in stage 0.0 (TID 3,xxx, executor 1): java.lang.Exception: 
> Fake error! at 
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:73) at 
> org.apache.spark.scheduler.Task.run(Task.scala:99) at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:305) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:745) ). Blacklisting behavior can be 
> configured via spark.blacklist.*. at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1458)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1446)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1445)
>  at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1445) 
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:808)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:808)
>  at scala.Option.foreach(Option.scala:257) at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:808)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1681)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1636)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1625)
>  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:634) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1922) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1935) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1948) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1962) at 
> org.apache.spark.rdd.RDD.count(RDD.scala:1157) at 
> org.apache.spark.examples.GroupByTest$.main(GroupByTest.scala:50) at 
> org.apache.spark.examples.GroupByTest.main(GroupByTest.scala) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:653)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22123) Add latest failure reason for task set blacklist

2017-09-26 Thread Imran Rashid (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Rashid updated SPARK-22123:
-
Component/s: Scheduler

> Add latest failure reason for task set blacklist
> 
>
> Key: SPARK-22123
> URL: https://issues.apache.org/jira/browse/SPARK-22123
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler, Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: zhoukang
>
> Until now, every job aborted because its task set is completely blacklisted just 
> shows a log like the one below, which carries no further information:
> {code:java}
> Aborting $taskSet because task $indexInTaskSet (partition $partition) cannot 
> run anywhere due to node and executor blacklist. Blacklisting behavior can be 
> configured via spark.blacklist.*.
> {code}
> We could add the most recent failure reason for the task set blacklist and show 
> it on the Spark UI, so users can see the failure reason directly.
> An example after the change:
> {code:java}
> User class threw exception: org.apache.spark.SparkException: Job aborted due 
> to stage failure: Aborting TaskSet 0.0 because task 0 (partition 0) cannot 
> run anywhere due to node and executor blacklist. **Latest failure reason is** 
> Some(Lost task 0.1 in stage 0.0 (TID 3,xxx, executor 1): java.lang.Exception: 
> Fake error! at 
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:73) at 
> org.apache.spark.scheduler.Task.run(Task.scala:99) at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:305) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:745) ). Blacklisting behavior can be 
> configured via spark.blacklist.*. at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1458)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1446)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1445)
>  at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1445) 
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:808)
>  at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:808)
>  at scala.Option.foreach(Option.scala:257) at 
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:808)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1681)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1636)
>  at 
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1625)
>  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at 
> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:634) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1922) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1935) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1948) at 
> org.apache.spark.SparkContext.runJob(SparkContext.scala:1962) at 
> org.apache.spark.rdd.RDD.count(RDD.scala:1157) at 
> org.apache.spark.examples.GroupByTest$.main(GroupByTest.scala:50) at 
> org.apache.spark.examples.GroupByTest.main(GroupByTest.scala) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:653)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22104) Add new option to dataframe -> parquet ==> custom extension to file name

2017-09-26 Thread Anbu Cheeralan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180988#comment-16180988
 ] 

Anbu Cheeralan commented on SPARK-22104:


[~hyukjin.kwon] Yes. I am writing files from multiple processes into the same 
directory in append mode. I would like to be able to identify the files when 
troubleshooting.
Hive supports this option via "hive.output.file.extension" property.
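
A purely hypothetical sketch of how such an option could look from the user side; 
the option name below does not exist in Spark today and therefore has no effect:
{code}
import org.apache.spark.sql.SparkSession

object ExtensionExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ext-demo").master("local[*]").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")
    df.write
      .option("fileExtension", "important_info.parquet") // hypothetical option name
      .parquet("/tmp/important_info_out")
    spark.stop()
  }
}
{code}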
 

> Add new option to dataframe -> parquet ==> custom extension to file name
> 
>
> Key: SPARK-22104
> URL: https://issues.apache.org/jira/browse/SPARK-22104
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, SQL
>Affects Versions: 2.2.0
>Reporter: Anbu Cheeralan
>Priority: Minor
>
> While writing a dataframe to parquet, I would like to add a new option to 
> specify a custom extension for the file name (important_info.parquet) instead of 
> a plain .parquet.
> I can create a pull request for this solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-20992) Support for Nomad as scheduler backend

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181040#comment-16181040
 ] 

Apache Spark commented on SPARK-20992:
--

User 'barnardb' has created a pull request for this issue:
https://github.com/apache/spark/pull/19354

> Support for Nomad as scheduler backend
> --
>
> Key: SPARK-20992
> URL: https://issues.apache.org/jira/browse/SPARK-20992
> Project: Spark
>  Issue Type: New Feature
>  Components: Scheduler
>Affects Versions: 2.1.1
>Reporter: Ben Barnard
>
> It is convenient to have scheduler backend support for running applications 
> on [Nomad|https://github.com/hashicorp/nomad], as with YARN and Mesos, so 
> that users can run Spark applications on a Nomad cluster without the need to 
> bring up a Spark Standalone cluster in the Nomad cluster.
> Both client and cluster deploy modes should be supported.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22130) UTF8String.trim() inefficiently scans all white-space string twice.

2017-09-26 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-22130:


 Summary: UTF8String.trim() inefficiently scans all white-space 
string twice.
 Key: SPARK-22130
 URL: https://issues.apache.org/jira/browse/SPARK-22130
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.2.0
Reporter: Kazuaki Ishizaki
Priority: Minor


{{UTF8String.trim()}} scans a string consisting only of white space (e.g. {{"
"}}) twice, which is inefficient.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22130) UTF8String.trim() inefficiently scans all white-space string twice.

2017-09-26 Thread Kazuaki Ishizaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181070#comment-16181070
 ] 

Kazuaki Ishizaki commented on SPARK-22130:
--

I will submit a PR soon.

> UTF8String.trim() inefficiently scans all white-space string twice.
> ---
>
> Key: SPARK-22130
> URL: https://issues.apache.org/jira/browse/SPARK-22130
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Kazuaki Ishizaki
>Priority: Minor
>
> {{UTF8String.trim()}} scans a string consisting only of white space (e.g. {{"
> "}}) twice, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22121) Spark should fix hive table location for hdfs HA

2017-09-26 Thread Imran Rashid (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Rashid updated SPARK-22121:
-
Description: 
When converting an existing hdfs setup to have multiple namenodes, users 
*should* run the hive metatool to change the location of the metastore, to 
refer to the nameservice, instead of a specific namenode.  (See the 
{{updateLocation}} section of the [metatool 
docs|https://cwiki.apache.org/confluence/display/Hive/Hive+MetaTool].)

However, users tend to forget to do this.  If hdfs HA is turned on after a hive 
database is already created, the db location may still reference just one 
namenode, instead of the nameservice.  To be a little more user friendly, Spark 
should detect the misconfiguration and try to auto-adjust for it.  (This is the 
behavior from hive as well.)

An example exception is given below.  Users should run the hive metatool to 
update the database location if they see this.

{noformat}
Exception in thread "main" org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got 
exception: org.apache.hadoop.ipc.RemoteException Operation category READ is not 
supported in state standby. Visit https://s.apache.org/sbnn-error
at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1946)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1412)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2986)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1142)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:938)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
);
at 
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:108)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:217)
at 
org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
at 
org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:127)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
at org.apache.spark.sql.Dataset.(Dataset.scala:182)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:67)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:623)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691)
at com.cloudera.spark.RunHiveQl$$anonfun$run$1.apply(RunHiveQl.scala:50)
at com.cloudera.spark.RunHiveQl$$anonfun$run$1.apply(RunHiveQl.scala:48)
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at com.cloudera.spark.RunHiveQl.run(RunHiveQl.scala:48)
at com.cloudera.spark.RunHiveQl$.main(RunHiveQl.scala:181)
at com.cloudera.spark.RunHiveQl.main(RunHiveQl.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at 
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubm

[jira] [Commented] (SPARK-22130) UTF8String.trim() inefficiently scans all white-space string twice.

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181129#comment-16181129
 ] 

Apache Spark commented on SPARK-22130:
--

User 'kiszk' has created a pull request for this issue:
https://github.com/apache/spark/pull/19355

> UTF8String.trim() inefficiently scans all white-space string twice.
> ---
>
> Key: SPARK-22130
> URL: https://issues.apache.org/jira/browse/SPARK-22130
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Kazuaki Ishizaki
>Priority: Minor
>
> {{UTF8String.trim()}} scans a string consisting only of white space (e.g. {{"
> "}}) twice, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22130) UTF8String.trim() inefficiently scans all white-space string twice.

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22130:


Assignee: (was: Apache Spark)

> UTF8String.trim() inefficiently scans all white-space string twice.
> ---
>
> Key: SPARK-22130
> URL: https://issues.apache.org/jira/browse/SPARK-22130
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Kazuaki Ishizaki
>Priority: Minor
>
> {{UTF8String.trim()}} scans a string consisting only of white space (e.g. {{"
> "}}) twice, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22130) UTF8String.trim() inefficiently scans all white-space string twice.

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22130:


Assignee: Apache Spark

> UTF8String.trim() inefficiently scans all white-space string twice.
> ---
>
> Key: SPARK-22130
> URL: https://issues.apache.org/jira/browse/SPARK-22130
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Kazuaki Ishizaki
>Assignee: Apache Spark
>Priority: Minor
>
> {{UTF8String.trim()}} scans a string consisting only of white space (e.g. {{"
> "}}) twice, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2017-09-26 Thread Kazuaki Ishizaki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181190#comment-16181190
 ] 

Kazuaki Ishizaki commented on SPARK-16845:
--

[~mvelusce] Thank you for reporting an issue with repro. I can reproduce this.

If I am correct, Spark 2.2 can fall back to a path that disables code generation, 
thanks to [this PR|https://github.com/apache/spark/pull/17087]. We once tried to 
backport this to Spark 2.1, but it was rejected.

> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" 
> grows beyond 64 KB
> -
>
> Key: SPARK-16845
> URL: https://issues.apache.org/jira/browse/SPARK-16845
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: hejie
>Assignee: Liwei Lin
> Fix For: 1.6.4, 2.0.3, 2.1.1, 2.2.0
>
> Attachments: error.txt.zip
>
>
> I have a wide table(400 columns), when I try fitting the traindata on all 
> columns,  the fatal error occurs. 
>   ... 46 more
> Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method 
> "(Lorg/apache/spark/sql/catalyst/InternalRow;Lorg/apache/spark/sql/catalyst/InternalRow;)I"
>  of class 
> "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" 
> grows beyond 64 KB
>   at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
>   at org.codehaus.janino.CodeContext.write(CodeContext.java:854)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22130) UTF8String.trim() inefficiently scans all white-space string twice.

2017-09-26 Thread Kazuaki Ishizaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kazuaki Ishizaki updated SPARK-22130:
-
Issue Type: Improvement  (was: Bug)

> UTF8String.trim() inefficiently scans all white-space string twice.
> ---
>
> Key: SPARK-22130
> URL: https://issues.apache.org/jira/browse/SPARK-22130
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Kazuaki Ishizaki
>Priority: Minor
>
> {{UTF8String.trim()}} scans a string consisting only of white space (e.g. {{"
> "}}) twice, which is inefficient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22104) Add new option to dataframe -> parquet ==> custom extension to file name

2017-09-26 Thread Anbu Cheeralan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181285#comment-16181285
 ] 

Anbu Cheeralan commented on SPARK-22104:


Here is a similar ticket for csv ==> 
https://issues.apache.org/jira/browse/SPARK-20731

> Add new option to dataframe -> parquet ==> custom extension to file name
> 
>
> Key: SPARK-22104
> URL: https://issues.apache.org/jira/browse/SPARK-22104
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, SQL
>Affects Versions: 2.2.0
>Reporter: Anbu Cheeralan
>Priority: Minor
>
> While writing a dataframe to parquet, I would like to add a new option to 
> specify a custom extension for the file name (important_info.parquet) instead of 
> a plain .parquet.
> I can create a pull request for this solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22131) Add Mesos Secrets Support to the Mesos Driver

2017-09-26 Thread Arthur Rand (JIRA)
Arthur Rand created SPARK-22131:
---

 Summary: Add Mesos Secrets Support to the Mesos Driver
 Key: SPARK-22131
 URL: https://issues.apache.org/jira/browse/SPARK-22131
 Project: Spark
  Issue Type: New Feature
  Components: Mesos
Affects Versions: 2.3.0
Reporter: Arthur Rand


We recently added Secrets support to the Dispatcher (SPARK-20812). In order to 
have Driver-to-Executor TLS, we need the same support in the Mesos Driver so that 
a secret can be disseminated to the executors. This JIRA is to move the current 
secrets implementation so it can be used by both frameworks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22132) Document the Dispatcher REST API

2017-09-26 Thread Arthur Rand (JIRA)
Arthur Rand created SPARK-22132:
---

 Summary: Document the Dispatcher REST API
 Key: SPARK-22132
 URL: https://issues.apache.org/jira/browse/SPARK-22132
 Project: Spark
  Issue Type: Improvement
  Components: Mesos
Affects Versions: 2.3.0
Reporter: Arthur Rand
Priority: Minor


The Dispatcher has a REST API for managing jobs in a Mesos cluster, but it is 
currently undocumented, meaning that users have to read the source code to use it 
programmatically.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22133) Document Mesos reject offer duration configurations

2017-09-26 Thread Arthur Rand (JIRA)
Arthur Rand created SPARK-22133:
---

 Summary: Document Mesos reject offer duration configurations
 Key: SPARK-22133
 URL: https://issues.apache.org/jira/browse/SPARK-22133
 Project: Spark
  Issue Type: Improvement
  Components: Mesos
Affects Versions: 2.3.0
Reporter: Arthur Rand


Mesos has multiple configurable timeouts {{spark.mesos.rejectOfferDuration}}, 
{{spark.mesos.rejectOfferDurationForUnmetConstraints}}, and 
{{spark.mesos.rejectOfferDurationForReachedMaxCores}} that can have a large 
effect on Spark performance when sharing a Mesos cluster with other frameworks 
and users. These configurations aren't documented; add documentation and 
guidance for non-Mesos experts on how these settings should be used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-22121) Spark should fix hive table location for hdfs HA

2017-09-26 Thread Imran Rashid (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Rashid resolved SPARK-22121.
--
Resolution: Won't Fix

After discussing with [~smilegator] on the PR, we've decided for now to mark this 
as Won't Fix, as there isn't a clear home for making the auto-adjustment:

bq. Spark SQL might not be deployed in the HDFS system. Conceptually, this 
HDFS-specific codes should not be part of our HiveExternalCatalog . 
HiveExternalCatalog is just for using Hive metastore. It does not assume we use 
HDFS.

Hopefully the JIRA description is enough for users to search and find the 
workaround.

> Spark should fix hive table location for hdfs HA
> 
>
> Key: SPARK-22121
> URL: https://issues.apache.org/jira/browse/SPARK-22121
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Imran Rashid
>Assignee: Imran Rashid
>Priority: Minor
>
> When converting an existing hdfs setup to have multiple namenodes, users 
> *should* run the hive metatool to change the location of the metastore, to 
> refer to the nameservice, instead of a specific namenode.  (See the 
> {{updateLocation}} section of the [metatool 
> docs|https://cwiki.apache.org/confluence/display/Hive/Hive+MetaTool].)
> However, users tend to forget to do this.  If hdfs HA is turned on after a 
> hive database is already created, the db location may still reference just 
> one namenode, instead of the nameservice.  To be a little more user friendly, 
> Spark should detect the misconfiguration and try to auto-adjust for it.  
> (This is the behavior from hive as well.)
> An example exception is given below.  Users should run the hive metatool to 
> update the database location if they see this.
> {noformat}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got 
> exception: org.apache.hadoop.ipc.RemoteException Operation category READ is 
> not supported in state standby. Visit https://s.apache.org/sbnn-error
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1946)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1412)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2986)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1142)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:938)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> );
>   at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:108)
>   at 
> org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:217)
>   at 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
>   at 
> org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:127)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:182)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:67)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:623)
>   at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691)
>   at com.cloudera.spark.RunHiveQl$$anonfun$run$1.apply(RunHiveQl.scala:50)
>   at com.cloudera.spark.RunHiveQl$$anonf

[jira] [Commented] (SPARK-9103) Tracking spark's memory usage

2017-09-26 Thread Imran Rashid (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181543#comment-16181543
 ] 

Imran Rashid commented on SPARK-9103:
-

Hi [~jerryshao], sorry it took me a little while to respond.

I think having the info available in the metrics system is great, but I see two 
different types of shortcomings:

1) MetricsSystem / Graphite etc. are great, but it's really hard to correlate 
the timeline view you get with what is actually going on in your job. Did the 
tasks that took the longest also correlate with the tasks that used the most 
memory? Were there some phases of your application with pressure on one part of 
memory (e.g., execution memory) and other phases with pressure on another part 
(e.g., user memory)? What was the memory usage for tasks that failed? And for 
tasks that were slow?

It seems *really* hard to answer those questions via Graphite, as you have to 
do some mental joins from task to executor to the Graphite view in the right 
time frame, and the rollups aren't exactly right, etc. (Or maybe there is some 
sophisticated way to do this in Graphite that I don't know about?) It also 
just seems like something that should be baked into the Spark UI rather than 
require a third-party tool. Perhaps it's overkill for the Spark UI to handle 
this (too big a load on the driver for large apps?), but I would like to see if 
something better is possible.

2) Even within the metrics system, I'm not sure we are capturing everything we 
need. The most obvious gap to me is capturing the total process memory, 
including off-heap memory that isn't part of the JVM-managed memory, whether it 
is memory managed by Spark or by a third-party lib (e.g., Parquet). I have a 
feeling there are more things that would be useful to capture, though to be 
honest I haven't done a full audit of what is currently exposed in the metrics 
system.

> Tracking spark's memory usage
> -
>
> Key: SPARK-9103
> URL: https://issues.apache.org/jira/browse/SPARK-9103
> Project: Spark
>  Issue Type: Umbrella
>  Components: Spark Core, Web UI
>Reporter: Zhang, Liye
> Attachments: Tracking Spark Memory Usage - Phase 1.pdf
>
>
> Currently Spark provides only a little memory usage information (RDD cache on 
> the web UI) for the executors. Users have no idea what the memory consumption is 
> when they run Spark applications that use a lot of memory in the executors. 
> Especially when they hit an OOM, it's really hard to know the cause of the 
> problem. So it would be helpful to expose detailed memory consumption 
> information for each part of Spark, so that users can clearly see where the 
> memory is actually used.
> The memory usage info to expose should include, but not be limited to, shuffle, 
> cache, network, serializer, etc.
> Users can optionally enable this functionality since it is mainly for debugging 
> and tuning.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22134) StackOverflowError issue when applying large nested UDF calls

2017-09-26 Thread Andrew Hu Zheng (JIRA)
Andrew Hu Zheng created SPARK-22134:
---

 Summary: StackOverflowError issue when applying large nested UDF 
calls
 Key: SPARK-22134
 URL: https://issues.apache.org/jira/browse/SPARK-22134
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.1.0
 Environment: Spark 2.1.0 on Cloudera CDH 5u8
Reporter: Andrew Hu Zheng
Priority: Critical


Spark throws a StackOverflowError whenever there is a deeply nested chain of UDF calls.
I have tried increasing the memory, but the same issue still happens.

Sample code of the nested calls : 

{code:java}
val v4 = 
u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat($"C0_0",
 $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
$"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
$"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
$"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
$"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
$"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
$"C0_0"), $"C0_0"), $"C0_0");
{code}


stack trace
{code:java}
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.informatica.compiler.InfaSparkMain$.main(InfaSparkMain.scala:74)
at com.informatica.compiler.InfaSparkMain.main(InfaSparkMain.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: java.lang.StackOverflowError
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:360)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:360)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
{code}





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-

[jira] [Commented] (SPARK-22134) StackOverflowError issue when applying large nested UDF calls

2017-09-26 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181650#comment-16181650
 ] 

Sean Owen commented on SPARK-22134:
---

I mean, if you chain enough method calls, of course this will happen. This 
looks like a very contrived example. I don't think this can be considered a bug.

> StackOverflowError issue when applying large nested UDF calls
> -
>
> Key: SPARK-22134
> URL: https://issues.apache.org/jira/browse/SPARK-22134
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0
> Environment: Spark 2.1.0 on Cloudera CDH 5u8
>Reporter: Andrew Hu Zheng
>Priority: Critical
>
> Spark throws a StackOverflowError whenever there is a deeply nested chain of 
> UDF calls.
> I have tried increasing the memory, but the same issue still happens.
> Sample code of the nested calls : 
> {code:java}
> val v4 = 
> u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat($"C0_0",
>  $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0");
> {code}
> stack trace
> {code:java}
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.informatica.compiler.InfaSparkMain$.main(InfaSparkMain.scala:74)
>   at com.informatica.compiler.InfaSparkMain.main(InfaSparkMain.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
> Caused by: java.lang.StackOverflowError
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:360)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply

[jira] [Created] (SPARK-22135) metrics in spark-dispatcher not being registered properly

2017-09-26 Thread paul mackles (JIRA)
paul mackles created SPARK-22135:


 Summary: metrics in spark-dispatcher not being registered properly
 Key: SPARK-22135
 URL: https://issues.apache.org/jira/browse/SPARK-22135
 Project: Spark
  Issue Type: Bug
  Components: Deploy, Mesos
Affects Versions: 2.2.0, 2.1.0
Reporter: paul mackles
Priority: Minor


There is a bug in the way that the metrics in 
org.apache.spark.scheduler.cluster.mesos.MesosClusterSchedulerSource are 
initialized such that they are never registered with the underlying registry. 
Basically, each call to the overridden "metricRegistry" function results in the 
creation of a new registry. Patch is forthcoming.
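
A hedged sketch of the suspected pattern (class and metric names simplified, not 
the actual Spark source): a {{def}} builds a fresh MetricRegistry on every access, 
so metrics end up on throwaway registries, whereas a {{val}} keeps one registry 
that retains everything registered on it.
{code}
import com.codahale.metrics.{Counter, MetricRegistry}

class BuggySource {
  def metricRegistry: MetricRegistry = new MetricRegistry() // new registry per call
  val submittedDrivers: Counter = metricRegistry.counter("submitted-drivers")
  // Reading metricRegistry again here returns yet another, empty registry.
}

class FixedSource {
  val metricRegistry: MetricRegistry = new MetricRegistry() // one shared registry
  val submittedDrivers: Counter = metricRegistry.counter("submitted-drivers")
}
{code}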



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22135) metrics in spark-dispatcher not being registered properly

2017-09-26 Thread paul mackles (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

paul mackles updated SPARK-22135:
-
Description: There is a bug in the way that the metrics in 
org.apache.spark.scheduler.cluster.mesos.MesosClusterSchedulerSource are 
initialized such that they are never registered with the underlying registry. 
Basically, each call to the overridden "metricRegistry" function results in the 
creation of a new registry. PR is forthcoming.  (was: There is a bug in the way 
that the metrics in 
org.apache.spark.scheduler.cluster.mesos.MesosClusterSchedulerSource are 
initialized such that they are never registered with the underlying registry. 
Basically, each call to the overridden "metricRegistry" function results in the 
creation of a new registry. Patch is forthcoming.)

> metrics in spark-dispatcher not being registered properly
> -
>
> Key: SPARK-22135
> URL: https://issues.apache.org/jira/browse/SPARK-22135
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Mesos
>Affects Versions: 2.1.0, 2.2.0
>Reporter: paul mackles
>Priority: Minor
>
> There is a bug in the way that the metrics in 
> org.apache.spark.scheduler.cluster.mesos.MesosClusterSchedulerSource are 
> initialized such that they are never registered with the underlying registry. 
> Basically, each call to the overridden "metricRegistry" function results in 
> the creation of a new registry. PR is forthcoming.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-8288) ScalaReflection should also try apply methods defined in companion objects when inferring schema from a Product type

2017-09-26 Thread Jithin Thomas (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181679#comment-16181679
 ] 

Jithin Thomas commented on SPARK-8288:
--

Hi, is there a temporary fix for this issue?
I've also been trying to get Spark SQL to infer the schema of my 
Scrooge-generated Scala classes.

Thanks,
Jithin

> ScalaReflection should also try apply methods defined in companion objects 
> when inferring schema from a Product type
> 
>
> Key: SPARK-8288
> URL: https://issues.apache.org/jira/browse/SPARK-8288
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.4.0
>Reporter: Cheng Lian
>
> This ticket is derived from PARQUET-293 (which actually describes a Spark SQL 
> issue).
> My comment on that issue quoted below:
> {quote}
> ...  The reason of this exception is that, the Scala code Scrooge generates 
> is actually a trait extending {{Product}}:
> {code}
> trait Junk
>   extends ThriftStruct
>   with scala.Product2[Long, String]
>   with java.io.Serializable
> {code}
> while Spark expects a case class, something like:
> {code}
> case class Junk(junkID: Long, junkString: String)
> {code}
> The key difference here is that the latter case class version has a 
> constructor whose arguments can be transformed into fields of the DataFrame 
> schema.  The exception was thrown because Spark can't find such a constructor 
> from trait {{Junk}}.
> {quote}
> We can make {{ScalaReflection}} try {{apply}} methods in companion objects, 
> so that trait types generated by Scrooge can also be used for Spark SQL 
> schema inference.
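
To make the proposal concrete, here is a hedged sketch (hand-written and simplified, not actual Scrooge output; the ThriftStruct parent is dropped) of a {{Product}}-based trait plus a companion {{apply}} whose parameter list carries exactly the field names and types that schema inference could pick up:

{code:java}
trait Junk extends Product2[Long, String] with Serializable {
  def junkID: Long
  def junkString: String
  def _1: Long = junkID
  def _2: String = junkString
  def canEqual(that: Any): Boolean = that.isInstanceOf[Junk]
}

object Junk {
  private class JunkImpl(val junkID: Long, val junkString: String) extends Junk
  // If ScalaReflection also tried companion `apply` methods, this signature
  // (junkID: Long, junkString: String) would be enough to infer the schema,
  // even though Junk itself is a trait rather than a case class.
  def apply(junkID: Long, junkString: String): Junk = new JunkImpl(junkID, junkString)
}
{code}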



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-8288) ScalaReflection should also try apply methods defined in companion objects when inferring schema from a Product type

2017-09-26 Thread Drew Robb (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181716#comment-16181716
 ] 

Drew Robb commented on SPARK-8288:
--

I do not yet have a fully working fix. I think the best approach might instead 
be to change things on the Scrooge end.

> ScalaReflection should also try apply methods defined in companion objects 
> when inferring schema from a Product type
> 
>
> Key: SPARK-8288
> URL: https://issues.apache.org/jira/browse/SPARK-8288
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.4.0
>Reporter: Cheng Lian
>
> This ticket is derived from PARQUET-293 (which actually describes a Spark SQL 
> issue).
> My comment on that issue quoted below:
> {quote}
> ...  The reason of this exception is that, the Scala code Scrooge generates 
> is actually a trait extending {{Product}}:
> {code}
> trait Junk
>   extends ThriftStruct
>   with scala.Product2[Long, String]
>   with java.io.Serializable
> {code}
> while Spark expects a case class, something like:
> {code}
> case class Junk(junkID: Long, junkString: String)
> {code}
> The key difference here is that the latter case class version has a 
> constructor whose arguments can be transformed into fields of the DataFrame 
> schema.  The exception was thrown because Spark can't find such a constructor 
> from trait {{Junk}}.
> {quote}
> We can make {{ScalaReflection}} try {{apply}} methods in companion objects, 
> so that trait types generated by Scrooge can also be used for Spark SQL 
> schema inference.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3181) Add Robust Regression Algorithm with Huber Estimator

2017-09-26 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181724#comment-16181724
 ] 

Joseph K. Bradley commented on SPARK-3181:
--

Re: [~sethah]'s comment about separating Huber Estimator from regular 
LinearRegression in the PR:

This was my initial reaction too, but I can see both sides:
* Technically, robust regression using the Huber loss is Linear Regression.  As 
far as I know (and as far as I can tell from Wikipedia), "Linear Regression" 
refers to the predictive model, not to the loss model.  Using Huber loss 
instead of squared error does not change the predictive model.
* Users should get what they expect when they use an Estimator.  The average 
user would not expect Linear Regression to do fancy Huber loss regression.  
That said, the default behavior *would* be least squares regression, so it's 
not a real problem.

I don't have strong feelings here, but I'm fine with them being combined.  
Thinking about the past, it was definitely overkill to separate 
LinearRegression, RidgeRegression, and Lasso in the old RDD-based API.

> Add Robust Regression Algorithm with Huber Estimator
> 
>
> Key: SPARK-3181
> URL: https://issues.apache.org/jira/browse/SPARK-3181
> Project: Spark
>  Issue Type: New Feature
>  Components: ML
>Affects Versions: 2.2.0
>Reporter: Fan Jiang
>Assignee: Yanbo Liang
>  Labels: features
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Linear least squares estimates assume the errors have a normal distribution and 
> can behave badly when the errors are heavy-tailed. In practice we encounter 
> various types of data, so we need to include Robust Regression to employ a 
> fitting criterion that is not as vulnerable as least squares.
> In 1973, Huber introduced M-estimation ("maximum likelihood type" estimation) 
> for regression. The method is resistant to outliers in the response variable 
> and has been widely used.
> The new feature for MLlib will contain 3 new files
> /main/scala/org/apache/spark/mllib/regression/RobustRegression.scala
> /test/scala/org/apache/spark/mllib/regression/RobustRegressionSuite.scala
> /main/scala/org/apache/spark/examples/mllib/HuberRobustRegression.scala
> and one new class HuberRobustGradient in 
> /main/scala/org/apache/spark/mllib/optimization/Gradient.scala
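
For reference, the Huber loss discussed above is the standard M-estimation objective (textbook definition, not quoted from the PR), with residual r = y - w^T x and threshold delta:

{noformat}
L_\delta(r) =
  \begin{cases}
    \tfrac{1}{2} r^2                               & \text{if } |r| \le \delta \\
    \delta \left( |r| - \tfrac{1}{2}\delta \right) & \text{otherwise}
  \end{cases}
{noformat}

It behaves like squared error for small residuals and like absolute error for large ones, which is what makes the fit resistant to outliers in the response variable.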



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-19141) VectorAssembler metadata causing memory issues

2017-09-26 Thread Weichen Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174685#comment-16174685
 ] 

Weichen Xu edited comment on SPARK-19141 at 9/27/17 12:12 AM:
--

Maybe we need to design a sparse format of AttributeGroup for ML vector columns. 
We don't need to create an Attribute for each vector dimension; a better way, I 
think, is to create them only when needed. But `VectorAssembler` creates an 
attribute for every dimension, in all cases. 


was (Author: weichenxu123):
Maybe we need design a sparse format of AttributeGroup for vector ML column. We 
don't need create Attribute for each vector dimension. The better way I think 
is only when needed we create it. But `VectorAssembler` create attribute for 
each dimension, in any case. Current design looks stupid.


> VectorAssembler metadata causing memory issues
> --
>
> Key: SPARK-19141
> URL: https://issues.apache.org/jira/browse/SPARK-19141
> Project: Spark
>  Issue Type: Bug
>  Components: ML, MLlib
>Affects Versions: 1.6.0, 2.0.0, 2.1.0
> Environment: Windows 10, Ubuntu 16.04.1, Scala 2.11.8, Spark 1.6.0, 
> 2.0.0, 2.1.0
>Reporter: Antonia Oprescu
>
> VectorAssembler produces unnecessary metadata that overflows the Java heap in 
> the case of sparse vectors. In the example below, the logical length of the 
> vector is 10^6, but the number of non-zero values is only 2.
> The problem arises when the vector assembler creates metadata (ML attributes) 
> for each of the 10^6 slots, even if this metadata didn't exist upstream (i.e. 
> HashingTF doesn't produce metadata per slot). Here is a chunk of metadata it 
> produces:
> {noformat}
> {"ml_attr":{"attrs":{"numeric":[{"idx":0,"name":"HashedFeat_0"},{"idx":1,"name":"HashedFeat_1"},{"idx":2,"name":"HashedFeat_2"},{"idx":3,"name":"HashedFeat_3"},{"idx":4,"name":"HashedFeat_4"},{"idx":5,"name":"HashedFeat_5"},{"idx":6,"name":"HashedFeat_6"},{"idx":7,"name":"HashedFeat_7"},{"idx":8,"name":"HashedFeat_8"},{"idx":9,"name":"HashedFeat_9"},...,{"idx":100,"name":"Feat01"}]},"num_attrs":101}}
> {noformat}
> In this lightweight example, the feature size limit seems to be 1,000,000 
> when run locally, but this scales poorly with more complicated routines. With 
> a larger dataset and a learner (say LogisticRegression), it maxes out 
> anywhere between 10k and 100k hash size even on a decent sized cluster.
> I did some digging, and it seems that the only metadata necessary for 
> downstream learners is the one indicating categorical columns. Thus, I 
> thought of the following possible solutions:
> 1. Compact representation of ml attributes metadata (but this seems to be a 
> bigger change)
> 2. Removal of non-categorical tags from the metadata created by the 
> VectorAssembler
> 3. An option on the existent VectorAssembler to skip unnecessary ml 
> attributes or create another transformer altogether
> I would happy to take a stab at any of these solutions, but I need some 
> direction from the Spark community.
> {code:title=VABug.scala |borderStyle=solid}
> import org.apache.spark.SparkConf
> import org.apache.spark.ml.feature.{HashingTF, VectorAssembler}
> import org.apache.spark.sql.SparkSession
> object VARepro {
>   case class Record(Label: Double, Feat01: Double, Feat02: Array[String])
>   def main(args: Array[String]) {
> val conf = new SparkConf()
>   .setAppName("Vector assembler bug")
>   .setMaster("local[*]")
> val spark = SparkSession.builder.config(conf).getOrCreate()
> import spark.implicits._
> val df = Seq(Record(1.0, 2.0, Array("4daf")), Record(0.0, 3.0, 
> Array("a9ee"))).toDS()
> val numFeatures = 1000
> val hashingScheme = new 
> HashingTF().setInputCol("Feat02").setOutputCol("HashedFeat").setNumFeatures(numFeatures)
> val hashedData = hashingScheme.transform(df)
> val vectorAssembler = new 
> VectorAssembler().setInputCols(Array("HashedFeat","Feat01")).setOutputCol("Features")
> val processedData = vectorAssembler.transform(hashedData).select("Label", 
> "Features")
> processedData.show()
>   }
> }
> {code}
> *Stacktrace from the example above:*
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit 
> exceeded
>   at 
> org.apache.spark.ml.attribute.NumericAttribute.copy(attributes.scala:272)
>   at 
> org.apache.spark.ml.attribute.NumericAttribute.withIndex(attributes.scala:215)
>   at 
> org.apache.spark.ml.attribute.NumericAttribute.withIndex(attributes.scala:195)
>   at 
> org.apache.spark.ml.attribute.AttributeGroup$$anonfun$3$$anonfun$apply$1.apply(AttributeGroup.scala:71)
>   at 
> org.apache.spark.ml.attribute.AttributeGroup$$anonfun$3$$anonfun$apply$1.apply(AttributeGroup.scala:70)
>   at scala.collection.Iterator$$anon$11.next(Iterator.s

[jira] [Commented] (SPARK-22134) StackOverflowError issue when applying large nested UDF calls

2017-09-26 Thread Andrew Hu Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181782#comment-16181782
 ] 

Andrew Hu Zheng commented on SPARK-22134:
-

Sean, I understand what you are trying to say. However, we had a similar example 
working fine in Hive, so I expected that Spark would be able to handle something 
of this magnitude. 

> StackOverflowError issue when applying large nested UDF calls
> -
>
> Key: SPARK-22134
> URL: https://issues.apache.org/jira/browse/SPARK-22134
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0
> Environment: Spark 2.1.0 on Cloudera CDH 5u8
>Reporter: Andrew Hu Zheng
>Priority: Critical
>
> Spark throws a StackOverflowError whenever there is a large nested call of 
> UDFs.
> I have tried increasing the memory, but the same issue still happens.
> Sample code of the nested calls : 
> {code:java}
> val v4 = 
> u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat($"C0_0",
>  $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0");
> {code}
> stack trace
> {code:java}
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.informatica.compiler.InfaSparkMain$.main(InfaSparkMain.scala:74)
>   at com.informatica.compiler.InfaSparkMain.main(InfaSparkMain.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
> Caused by: java.lang.StackOverflowError
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:360)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.

[jira] [Commented] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181784#comment-16181784
 ] 

Apache Spark commented on SPARK-17075:
--

User 'ron8hu' has created a pull request for this issue:
https://github.com/apache/spark/pull/19357

> Cardinality Estimation of Predicate Expressions
> ---
>
> Key: SPARK-17075
> URL: https://issues.apache.org/jira/browse/SPARK-17075
> Project: Spark
>  Issue Type: Sub-task
>  Components: Optimizer
>Affects Versions: 2.0.0
>Reporter: Ron Hu
>Assignee: Ron Hu
> Fix For: 2.2.0
>
>
> A filter condition is the predicate expression specified in the WHERE clause 
> of a SQL select statement.  A predicate can be a compound logical expression 
> with logical AND, OR, NOT operators combining multiple single conditions.  A 
> single condition usually has comparison operators such as =, <, <=, >, >=, 
> ‘like’, etc.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-21322) support histogram in filter cardinality estimation

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-21322:


Assignee: (was: Apache Spark)

> support histogram in filter cardinality estimation
> --
>
> Key: SPARK-21322
> URL: https://issues.apache.org/jira/browse/SPARK-21322
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Ron Hu
>
> Histograms are effective in dealing with skewed distributions.  After we 
> generate histogram information for column statistics, we need to adjust 
> filter estimation based on the histogram data structure.
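
As a concrete illustration of the idea (a hedged sketch of the standard equi-height technique, not the code from the associated pull request): with equi-height bins every bin covers the same fraction of rows, so the selectivity of a range predicate is the number of fully covered bins plus a linear interpolation inside the boundary bin.

{code:java}
// Equi-height histogram selectivity for a predicate of the form `col <= value`.
final case class Bin(lo: Double, hi: Double) // value range covered by one bin

object HistogramSelectivity {
  /** Estimated fraction of rows satisfying `col <= value`. */
  def lessOrEqual(bins: Seq[Bin], value: Double): Double = {
    if (bins.isEmpty) return 1.0
    val perBin = 1.0 / bins.size // equi-height: each bin holds the same share of rows
    val covered = bins.map { b =>
      if (value >= b.hi) 1.0              // the whole bin qualifies
      else if (value < b.lo) 0.0          // no row in this bin qualifies
      else (value - b.lo) / (b.hi - b.lo) // boundary bin: interpolate
    }
    (covered.sum * perBin).min(1.0)
  }
}

// Example: four bins over [0, 100]; selectivity of col <= 30 is 0.25 + 0.2 * 0.25 = 0.3.
// HistogramSelectivity.lessOrEqual(Seq(Bin(0, 25), Bin(25, 50), Bin(50, 75), Bin(75, 100)), 30)
{code}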



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21322) support histogram in filter cardinality estimation

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181787#comment-16181787
 ] 

Apache Spark commented on SPARK-21322:
--

User 'ron8hu' has created a pull request for this issue:
https://github.com/apache/spark/pull/19357

> support histogram in filter cardinality estimation
> --
>
> Key: SPARK-21322
> URL: https://issues.apache.org/jira/browse/SPARK-21322
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Ron Hu
>
> Histograms are effective in dealing with skewed distributions.  After we 
> generate histogram information for column statistics, we need to adjust 
> filter estimation based on the histogram data structure.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-21322) support histogram in filter cardinality estimation

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-21322:


Assignee: Apache Spark

> support histogram in filter cardinality estimation
> --
>
> Key: SPARK-21322
> URL: https://issues.apache.org/jira/browse/SPARK-21322
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Ron Hu
>Assignee: Apache Spark
>
> Histograms are effective in dealing with skewed distributions.  After we 
> generate histogram information for column statistics, we need to adjust 
> filter estimation based on the histogram data structure.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22136) Implement stream-stream outer joins in append mode

2017-09-26 Thread Jose Torres (JIRA)
Jose Torres created SPARK-22136:
---

 Summary: Implement stream-stream outer joins in append mode
 Key: SPARK-22136
 URL: https://issues.apache.org/jira/browse/SPARK-22136
 Project: Spark
  Issue Type: Sub-task
  Components: Structured Streaming
Affects Versions: 2.3.0
Reporter: Jose Torres


Followup to inner join subtask. We can implement outer joins by generating null 
rows when old state gets cleaned up, given that the join has watermarks 
allowing that cleanup to happen.
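
For context, a hedged sketch of the kind of query this subtask targets (the rate source, column names, watermark delays, and time bound are made-up placeholders; the join call itself is the existing DataFrame API, used with "leftOuter" as proposed here):

{code:java}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

object StreamStreamLeftOuterJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("stream-stream-left-outer-join").getOrCreate()

    // Two toy streams with distinct column names so the join condition is unambiguous.
    val impressions = spark.readStream.format("rate").load()
      .selectExpr("value AS impressionAdId", "timestamp AS impressionTime")
      .withWatermark("impressionTime", "10 minutes")
    val clicks = spark.readStream.format("rate").load()
      .selectExpr("value AS clickAdId", "timestamp AS clickTime")
      .withWatermark("clickTime", "20 minutes")

    // Watermarks plus the event-time range condition bound how long per-key state is kept;
    // once that state is cleaned up, unmatched impressions are emitted with null click columns.
    val joined = impressions.join(
      clicks,
      expr("""clickAdId = impressionAdId AND
              clickTime >= impressionTime AND
              clickTime <= impressionTime + interval 1 hour"""),
      "leftOuter")

    joined.writeStream
      .format("console")
      .outputMode("append")
      .start()
      .awaitTermination()
  }
}
{code}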



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22136) Implement stream-stream outer joins in append mode

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181789#comment-16181789
 ] 

Apache Spark commented on SPARK-22136:
--

User 'joseph-torres' has created a pull request for this issue:
https://github.com/apache/spark/pull/19327

> Implement stream-stream outer joins in append mode
> --
>
> Key: SPARK-22136
> URL: https://issues.apache.org/jira/browse/SPARK-22136
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Jose Torres
>
> Followup to inner join subtask. We can implement outer joins by generating 
> null rows when old state gets cleaned up, given that the join has watermarks 
> allowing that cleanup to happen.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22136) Implement stream-stream outer joins in append mode

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22136:


Assignee: Apache Spark

> Implement stream-stream outer joins in append mode
> --
>
> Key: SPARK-22136
> URL: https://issues.apache.org/jira/browse/SPARK-22136
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Jose Torres
>Assignee: Apache Spark
>
> Followup to inner join subtask. We can implement outer joins by generating 
> null rows when old state gets cleaned up, given that the join has watermarks 
> allowing that cleanup to happen.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22136) Implement stream-stream outer joins in append mode

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22136:


Assignee: (was: Apache Spark)

> Implement stream-stream outer joins in append mode
> --
>
> Key: SPARK-22136
> URL: https://issues.apache.org/jira/browse/SPARK-22136
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Jose Torres
>
> Followup to inner join subtask. We can implement outer joins by generating 
> null rows when old state gets cleaned up, given that the join has watermarks 
> allowing that cleanup to happen.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17075) Cardinality Estimation of Predicate Expressions

2017-09-26 Thread Ron Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181793#comment-16181793
 ] 

Ron Hu commented on SPARK-17075:


This JIRA has been resolved.  Pull request 19357 was added here by accident; it 
is for [SPARK-21322].  

> Cardinality Estimation of Predicate Expressions
> ---
>
> Key: SPARK-17075
> URL: https://issues.apache.org/jira/browse/SPARK-17075
> Project: Spark
>  Issue Type: Sub-task
>  Components: Optimizer
>Affects Versions: 2.0.0
>Reporter: Ron Hu
>Assignee: Ron Hu
> Fix For: 2.2.0
>
>
> A filter condition is the predicate expression specified in the WHERE clause 
> of a SQL select statement.  A predicate can be a compound logical expression 
> with logical AND, OR, NOT operators combining multiple single conditions.  A 
> single condition usually has comparison operators such as =, <, <=, >, >=, 
> ‘like’, etc.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22134) StackOverflowError issue when applying large nested UDF calls

2017-09-26 Thread Hyukjin Kwon (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-22134:
-
Priority: Major  (was: Critical)

> StackOverflowError issue when applying large nested UDF calls
> -
>
> Key: SPARK-22134
> URL: https://issues.apache.org/jira/browse/SPARK-22134
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0
> Environment: Spark 2.1.0 on Cloudera CDH 5u8
>Reporter: Andrew Hu Zheng
>
> Spark throws a StackOverflowError whenever there is a large nested call of 
> UDFs.
> I have tried increasing the memory, but the same issue still happens.
> Sample code of the nested calls : 
> {code:java}
> val v4 = 
> u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat($"C0_0",
>  $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0");
> {code}
> stack trace
> {code:java}
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.informatica.compiler.InfaSparkMain$.main(InfaSparkMain.scala:74)
>   at com.informatica.compiler.InfaSparkMain.main(InfaSparkMain.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
> Caused by: java.lang.StackOverflowError
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:360)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:360)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scal

[jira] [Commented] (SPARK-22135) metrics in spark-dispatcher not being registered properly

2017-09-26 Thread paul mackles (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181804#comment-16181804
 ] 

paul mackles commented on SPARK-22135:
--

here is the PR: https://github.com/apache/spark/pull/19358

> metrics in spark-dispatcher not being registered properly
> -
>
> Key: SPARK-22135
> URL: https://issues.apache.org/jira/browse/SPARK-22135
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Mesos
>Affects Versions: 2.1.0, 2.2.0
>Reporter: paul mackles
>Priority: Minor
>
> There is a bug in the way that the metrics in 
> org.apache.spark.scheduler.cluster.mesos.MesosClusterSchedulerSource are 
> initialized such that they are never registered with the underlying registry. 
> Basically, each call to the overridden "metricRegistry" function results in 
> the creation of a new registry. PR is forthcoming.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22135) metrics in spark-dispatcher not being registered properly

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181809#comment-16181809
 ] 

Apache Spark commented on SPARK-22135:
--

User 'pmackles' has created a pull request for this issue:
https://github.com/apache/spark/pull/19358

> metrics in spark-dispatcher not being registered properly
> -
>
> Key: SPARK-22135
> URL: https://issues.apache.org/jira/browse/SPARK-22135
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Mesos
>Affects Versions: 2.1.0, 2.2.0
>Reporter: paul mackles
>Priority: Minor
>
> There is a bug in the way that the metrics in 
> org.apache.spark.scheduler.cluster.mesos.MesosClusterSchedulerSource are 
> initialized such that they are never registered with the underlying registry. 
> Basically, each call to the overridden "metricRegistry" function results in 
> the creation of a new registry. PR is forthcoming.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22135) metrics in spark-dispatcher not being registered properly

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22135:


Assignee: (was: Apache Spark)

> metrics in spark-dispatcher not being registered properly
> -
>
> Key: SPARK-22135
> URL: https://issues.apache.org/jira/browse/SPARK-22135
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Mesos
>Affects Versions: 2.1.0, 2.2.0
>Reporter: paul mackles
>Priority: Minor
>
> There is a bug in the way that the metrics in 
> org.apache.spark.scheduler.cluster.mesos.MesosClusterSchedulerSource are 
> initialized such that they are never registered with the underlying registry. 
> Basically, each call to the overridden "metricRegistry" function results in 
> the creation of a new registry. PR is forthcoming.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22135) metrics in spark-dispatcher not being registered properly

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22135:


Assignee: Apache Spark

> metrics in spark-dispatcher not being registered properly
> -
>
> Key: SPARK-22135
> URL: https://issues.apache.org/jira/browse/SPARK-22135
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Mesos
>Affects Versions: 2.1.0, 2.2.0
>Reporter: paul mackles
>Assignee: Apache Spark
>Priority: Minor
>
> There is a bug in the way that the metrics in 
> org.apache.spark.scheduler.cluster.mesos.MesosClusterSchedulerSource are 
> initialized such that they are never registered with the underlying registry. 
> Basically, each call to the overridden "metricRegistry" function results in 
> the creation of a new registry. PR is forthcoming.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22134) StackOverflowError issue when applying large nested UDF calls

2017-09-26 Thread guoxiaolongzte (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181874#comment-16181874
 ] 

guoxiaolongzte commented on SPARK-22134:


Increase the driver's JVM thread stack size, e.g. set 
spark.driver.extraJavaOptions='-Xss5M', i.e. increase the -Xss value.


> StackOverflowError issue when applying large nested UDF calls
> -
>
> Key: SPARK-22134
> URL: https://issues.apache.org/jira/browse/SPARK-22134
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0
> Environment: Spark 2.1.0 on Cloudera CDH 5u8
>Reporter: Andrew Hu Zheng
>
> Spark throws a StackOverflowError whenever there is a large nested call of 
> UDFs.
> I have tried increasing the memory, but the same issue still happens.
> Sample code of the nested calls : 
> {code:java}
> val v4 = 
> u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat(u_concat($"C0_0",
>  $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), $"C0_0"), 
> $"C0_0");
> {code}
> stack trace
> {code:java}
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.informatica.compiler.InfaSparkMain$.main(InfaSparkMain.scala:74)
>   at com.informatica.compiler.InfaSparkMain.main(InfaSparkMain.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
> Caused by: java.lang.StackOverflowError
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:360)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:358)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:307)
>   at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5$$anonfun$apply$11.apply(TreeNode.scala:360)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(

[jira] [Created] (SPARK-22137) Failed to insert VectorUDT to hive table with DataFrameWriter.insertInto(tableName: String)

2017-09-26 Thread yzheng616 (JIRA)
yzheng616 created SPARK-22137:
-

 Summary: Failed to insert VectorUDT to hive table with 
DataFrameWriter.insertInto(tableName: String)
 Key: SPARK-22137
 URL: https://issues.apache.org/jira/browse/SPARK-22137
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.1.1
Reporter: yzheng616


Failed to insert a VectorUDT column into a Hive table with 
DataFrameWriter.insertInto(tableName: String). The issue seems similar to 
SPARK-17765, which was resolved in 2.1.0. 

Error message: 
{color:red}Exception in thread "main" org.apache.spark.sql.AnalysisException: 
cannot resolve '`features`' due to data type mismatch: cannot cast 
org.apache.spark.ml.linalg.VectorUDT@3bfc3ba7 to 
StructType(StructField(type,ByteType,true), StructField(size,IntegerType,true), 
StructField(indices,ArrayType(IntegerType,true),true), 
StructField(values,ArrayType(DoubleType,true),true));;
'InsertIntoTable Relation[id#21,features#22] parquet, 
OverwriteOptions(false,Map()), false
+- 'Project [cast(id#13L as int) AS id#27, cast(features#14 as 
struct<type:tinyint,size:int,indices:array<int>,values:array<double>>) AS 
features#28]
   +- LogicalRDD [id#13L, features#14]{color}

Reproduce code:

{code:java}
import scala.annotation.varargs
import org.apache.spark.ml.linalg.SQLDataTypes
import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.LongType
import org.apache.spark.sql.types.StructField
import org.apache.spark.sql.types.StructType


case class UDT(`id`: Long, `features`: org.apache.spark.ml.linalg.Vector)

object UDTTest {

  def main(args: Array[String]): Unit = {
val tb = "table_udt"
val spark = 
SparkSession.builder().master("local[4]").appName("UDTInsertInto").enableHiveSupport().getOrCreate()

spark.sql("drop table if exists " + tb)

/*
 * VectorUDT sql type definition:
 * 
 *   override def sqlType: StructType = {
 *   StructType(Seq(
 *  StructField("type", ByteType, nullable = false),
 *  StructField("size", IntegerType, nullable = true),
 *  StructField("indices", ArrayType(IntegerType, containsNull = 
false), nullable = true),
 *  StructField("values", ArrayType(DoubleType, containsNull = 
false), nullable = true)))
 *   }
*/

//Create Hive table base on VectorUDT sql type
spark.sql("create table if not exists "+tb+"(id int, features 
struct,values:array>)" +
  " row format serde 
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'"+
  " stored as"+
" inputformat 
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'"+
" outputformat 
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'")

var seq = new scala.collection.mutable.ArrayBuffer[UDT]()
for (x <- 1 to 2) {
  seq += (new UDT(x, org.apache.spark.ml.linalg.Vectors.dense(0.2, 0.21, 
0.44)))
}

val rowRDD = (spark.sparkContext.makeRDD[UDT](seq)).map { x => 
Row.fromSeq(Seq(x.id,x.features)) }
val schema = StructType(Array(StructField("id", 
LongType,false),StructField("features", SQLDataTypes.VectorType,false)))
val df = spark.createDataFrame(rowRDD, schema)
 
//insert into hive table
df.write.insertInto(tb)
  }
}
{code}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22123) Add latest failure reason for task set blacklist

2017-09-26 Thread zhoukang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoukang updated SPARK-22123:
-
Description: 
Until now, every job aborted by the completed blacklist just shows a log like the 
one below, which carries no further information:

{code:java}
Aborting $taskSet because task $indexInTaskSet (partition $partition) cannot 
run anywhere due to node and executor blacklist. Blacklisting behavior can be 
configured via spark.blacklist.*.
{code}
We could add the most recent failure reason for the task set blacklist and show 
it on the Spark UI, so users can see the failure reason directly.
An example after the change:

{code:java}
Aborting TaskSet 0.0 because task 0 (partition 0) cannot run anywhere due to 
node and executor blacklist.
 Most recent failure:
 Some(Lost task 0.1 in stage 0.0 (TID 3,xxx, executor 1): java.lang.Exception: 
Fake error!
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:73)
 at org.apache.spark.scheduler.Task.run(Task.scala:99)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:305)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 ). 

Blacklisting behavior can be configured via spark.blacklist.*.
{code}



  was:
Till now , every job which aborted by completed blacklist just show log like 
below which has no more information:

{code:java}
Aborting $taskSet because task $indexInTaskSet (partition $partition) cannot 
run anywhere due to node and executor blacklist. Blacklisting behavior cannot 
run anywhere due to node and executor blacklist.Blacklisting behavior can be 
configured via spark.blacklist.*."
{code}
We could add most recent failure reason for taskset blacklist which can be 
showed on spark ui to let user know failure reason directly.
An example after modifying:

{code:java}
User class threw exception: org.apache.spark.SparkException: Job aborted due to 
stage failure: Aborting TaskSet 0.0 because task 0 (partition 0) cannot run 
anywhere due to node and executor blacklist. **Latest failure reason is** 
Some(Lost task 0.1 in stage 0.0 (TID 3,xxx, executor 1): java.lang.Exception: 
Fake error! at 
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:73) at 
org.apache.spark.scheduler.Task.run(Task.scala:99) at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:305) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745) ). Blacklisting behavior can be 
configured via spark.blacklist.*. at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1458)
 at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1446)
 at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1445)
 at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1445) at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:808)
 at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:808)
 at scala.Option.foreach(Option.scala:257) at 
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:808)
 at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1681)
 at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1636)
 at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1625)
 at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at 
org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:634) at 
org.apache.spark.SparkContext.runJob(SparkContext.scala:1922) at 
org.apache.spark.SparkContext.runJob(SparkContext.scala:1935) at 
org.apache.spark.SparkContext.runJob(SparkContext.scala:1948) at 
org.apache.spark.SparkContext.runJob(SparkContext.scala:1962) at 
org.apache.spark.rdd.RDD.count(RDD.scala:1157) at 
org.apache.spark.examples.GroupByTest$.main(GroupByTest.scala:50) at 
org.apache.spark.examples.GroupByTest.main(GroupByTest.scala) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.apache.spark

[jira] [Resolved] (SPARK-22112) Add missing method to pyspark api: spark.read.csv(Dataset)

2017-09-26 Thread Hyukjin Kwon (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-22112.
--
   Resolution: Fixed
Fix Version/s: 2.3.0

Issue resolved by pull request 19339
[https://github.com/apache/spark/pull/19339]

> Add missing method to pyspark api: spark.read.csv(Dataset)
> --
>
> Key: SPARK-22112
> URL: https://issues.apache.org/jira/browse/SPARK-22112
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 2.2.0
>Reporter: Andrew Ash
> Fix For: 2.3.0
>
>
> https://issues.apache.org/jira/browse/SPARK-15463 added a method to the scala 
> API without adding an equivalent in pyspark: 
> {{spark.read.csv(Dataset)}}
> I was writing some things with pyspark but had to switch it to scala/java to 
> use that method -- since equivalency between python/java/scala is a Spark 
> goal, we should make sure this functionality exists in all the supported 
> languages.
> https://github.com/apache/spark/pull/16854/files#diff-f70bda59304588cc3abfa3a9840653f4R408
> cc [~hyukjin.kwon]
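
For reference, a hedged sketch of the existing Scala entry point referenced above (the {{csv(Dataset[String])}} overload added by SPARK-15463); this ticket asks for the same capability in PySpark:

{code:java}
import org.apache.spark.sql.SparkSession

object CsvFromDatasetSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("csv-from-dataset").getOrCreate()
    import spark.implicits._

    val lines = Seq("id,name", "1,alice", "2,bob").toDS()   // a Dataset[String] of CSV lines
    val df = spark.read.option("header", "true").csv(lines) // column names come from the header line
    df.show()
    spark.stop()
  }
}
{code}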



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22112) Add missing method to pyspark api: spark.read.csv(Dataset)

2017-09-26 Thread Hyukjin Kwon (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-22112:


Assignee: Jia-Xuan Liu

> Add missing method to pyspark api: spark.read.csv(Dataset)
> --
>
> Key: SPARK-22112
> URL: https://issues.apache.org/jira/browse/SPARK-22112
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 2.2.0
>Reporter: Andrew Ash
>Assignee: Jia-Xuan Liu
> Fix For: 2.3.0
>
>
> https://issues.apache.org/jira/browse/SPARK-15463 added a method to the scala 
> API without adding an equivalent in pyspark: 
> {{spark.read.csv(Dataset)}}
> I was writing some things with pyspark but had to switch it to scala/java to 
> use that method -- since equivalency between python/java/scala is a Spark 
> goal, we should make sure this functionality exists in all the supported 
> languages.
> https://github.com/apache/spark/pull/16854/files#diff-f70bda59304588cc3abfa3a9840653f4R408
> cc [~hyukjin.kwon]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22138) Allow retry during release-build

2017-09-26 Thread holdenk (JIRA)
holdenk created SPARK-22138:
---

 Summary: Allow retry during release-build
 Key: SPARK-22138
 URL: https://issues.apache.org/jira/browse/SPARK-22138
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 2.2.1, 2.3.0
 Environment: Right now the build script is configured with no retries, 
but since transient issues exist with networking let's allow a small number of 
retries.
Reporter: holdenk
Assignee: holdenk
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-9103) Tracking spark's memory usage

2017-09-26 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181905#comment-16181905
 ] 

Saisai Shao commented on SPARK-9103:


Hi [~irashid], thanks a lot for your response.

I agree that your concern is very valid, especially on how to correlate overall 
memory usage with task execution. But that is hard to do at the task level with 
Spark's current design, in which some memory usage is shared between tasks, like 
Netty memory and the storage and execution memory. As for user memory, I think it 
is a missing piece in current Spark, but tracking that part of memory seems quite 
expensive, since we cannot predict what the user will do in a task, e.g. memory 
used by third-party libraries.

So let me think a bit about how to further extend this feature (though it looks a 
little difficult to do) :).

> Tracking spark's memory usage
> -
>
> Key: SPARK-9103
> URL: https://issues.apache.org/jira/browse/SPARK-9103
> Project: Spark
>  Issue Type: Umbrella
>  Components: Spark Core, Web UI
>Reporter: Zhang, Liye
> Attachments: Tracking Spark Memory Usage - Phase 1.pdf
>
>
> Currently Spark only provides a little memory usage information (the RDD cache 
> on the web UI) for the executors. Users have no idea what the memory consumption 
> is when they run Spark applications that use a lot of memory in the executors. 
> Especially when they encounter an OOM, it's really hard to know the cause of the 
> problem. So it would be helpful to give detailed memory consumption information 
> for each part of Spark, so that users can clearly see where the memory is 
> actually used. 
> The memory usage info to expose should include, but not be limited to, shuffle, 
> cache, network, serializer, etc.
> Users can optionally choose to enable this functionality since it is mainly 
> for debugging and tuning.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22129) Spark release scripts ignore the GPG_KEY and always sign with your default key

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22129:


Assignee: holdenk  (was: Apache Spark)

> Spark release scripts ignore the GPG_KEY and always sign with your default key
> --
>
> Key: SPARK-22129
> URL: https://issues.apache.org/jira/browse/SPARK-22129
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 2.2.1, 2.3.0
>Reporter: holdenk
>Assignee: holdenk
>Priority: Blocker
>
> Currently the release scripts require GPG_KEY be specified but the param is 
> ignored and instead the default GPG key is used. Change this to sign with the 
> specified key.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22138) Allow retry during release-build

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22138:


Assignee: Apache Spark  (was: holdenk)

> Allow retry during release-build
> 
>
> Key: SPARK-22138
> URL: https://issues.apache.org/jira/browse/SPARK-22138
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.2.1, 2.3.0
> Environment: Right now the build script is configured with no 
> retries, but since transient networking issues exist, let's allow a small 
> number of retries (see the sketch below).
>Reporter: holdenk
>Assignee: Apache Spark
>Priority: Minor
>
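
The retry itself is just a bounded loop around the flaky step. A minimal sketch of the pattern (illustration only; the real change lives in the release build script, and the attempt count, delay, and uploadArtifacts helper below are arbitrary assumptions):

{noformat}
import scala.util.{Failure, Success, Try}

// Retry a flaky action a bounded number of times, pausing between attempts.
def withRetries[T](maxAttempts: Int, delayMs: Long)(action: => T): T =
  Try(action) match {
    case Success(result) => result
    case Failure(_) if maxAttempts > 1 =>
      Thread.sleep(delayMs)
      withRetries(maxAttempts - 1, delayMs)(action)
    case Failure(e) => throw e
  }

// e.g. withRetries(maxAttempts = 3, delayMs = 30000) { uploadArtifacts() }
{noformat}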




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22129) Spark release scripts ignore the GPG_KEY and always sign with your default key

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181916#comment-16181916
 ] 

Apache Spark commented on SPARK-22129:
--

User 'holdenk' has created a pull request for this issue:
https://github.com/apache/spark/pull/19359

> Spark release scripts ignore the GPG_KEY and always sign with your default key
> --
>
> Key: SPARK-22129
> URL: https://issues.apache.org/jira/browse/SPARK-22129
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 2.2.1, 2.3.0
>Reporter: holdenk
>Assignee: holdenk
>Priority: Blocker
>
> Currently the release scripts require GPG_KEY to be specified, but the parameter 
> is ignored and the default GPG key is used instead. Change this to sign with the 
> specified key.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22129) Spark release scripts ignore the GPG_KEY and always sign with your default key

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22129:


Assignee: Apache Spark  (was: holdenk)

> Spark release scripts ignore the GPG_KEY and always sign with your default key
> --
>
> Key: SPARK-22129
> URL: https://issues.apache.org/jira/browse/SPARK-22129
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 2.2.1, 2.3.0
>Reporter: holdenk
>Assignee: Apache Spark
>Priority: Blocker
>
> Currently the release scripts require GPG_KEY to be specified, but the parameter 
> is ignored and the default GPG key is used instead. Change this to sign with the 
> specified key.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22138) Allow retry during release-build

2017-09-26 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181917#comment-16181917
 ] 

Apache Spark commented on SPARK-22138:
--

User 'holdenk' has created a pull request for this issue:
https://github.com/apache/spark/pull/19359

> Allow retry during release-build
> 
>
> Key: SPARK-22138
> URL: https://issues.apache.org/jira/browse/SPARK-22138
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.2.1, 2.3.0
> Environment: Right now the build script is configured with no 
> retries, but since transient networking issues exist, let's allow a small 
> number of retries.
>Reporter: holdenk
>Assignee: holdenk
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22138) Allow retry during release-build

2017-09-26 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-22138:


Assignee: holdenk  (was: Apache Spark)

> Allow retry during release-build
> 
>
> Key: SPARK-22138
> URL: https://issues.apache.org/jira/browse/SPARK-22138
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.2.1, 2.3.0
> Environment: Right now the build script is configured with no 
> retries, but since transient networking issues exist, let's allow a small 
> number of retries.
>Reporter: holdenk
>Assignee: holdenk
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3165) DecisionTree does not use sparsity in data

2017-09-26 Thread 颜发才

[ 
https://issues.apache.org/jira/browse/SPARK-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181925#comment-16181925
 ] 

Yan Facai (颜发才) commented on SPARK-3165:


The PR I proposed has been closed because a better solution exists. So please 
feel free to take over this JIRA if you are interested in it. Thanks.

> DecisionTree does not use sparsity in data
> --
>
> Key: SPARK-3165
> URL: https://issues.apache.org/jira/browse/SPARK-3165
> Project: Spark
>  Issue Type: Improvement
>  Components: MLlib
>Reporter: Joseph K. Bradley
>Priority: Minor
>
> Improvement: computation
> DecisionTree should take advantage of sparse feature vectors.  Aggregation 
> over training data could handle the empty/zero-valued data elements more 
> efficiently.
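
For illustration of the kind of saving meant here (not the actual DecisionTree aggregation code), iterating only over the active entries of a sparse vector skips the zero-valued elements entirely; the per-feature stats array below is a stand-in for the real aggregator:

{noformat}
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Stand-in for the real aggregator: one running sum per feature.
// foreachActive visits only non-zero entries, so zeros cost nothing.
def aggregateSparse(stats: Array[Double], features: Vector): Unit =
  features.foreachActive { (featureIndex, value) =>
    stats(featureIndex) += value
  }

val stats = Array.ofDim[Double](4)
aggregateSparse(stats, Vectors.sparse(4, Array(1, 3), Array(2.0, 5.0)))
// stats is now Array(0.0, 2.0, 0.0, 5.0)
{noformat}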



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181924#comment-16181924
 ] 

Saisai Shao commented on SPARK-22074:
-

Hey [~XuanYuan], I'm a little confused about why there is a resubmit event after 
66.0 is killed, since this kill is expected and Spark should not launch another 
attempt.

> Task killed by other attempt task should not be resubmitted
> ---
>
> Key: SPARK-22074
> URL: https://issues.apache.org/jira/browse/SPARK-22074
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Li Yuanjian
>
> When a task is killed by another task attempt, the task is still resubmitted when 
> its executor is lost. With a certain probability this causes the stage to hang 
> forever because of the unnecessary resubmit (see the scenario description 
> below). Although the patch https://issues.apache.org/jira/browse/SPARK-13931 
> can resolve the hanging problem (thx [~GavinGavinNo1] :) ), the unnecessary 
> resubmit should be avoided.
> Detailed scenario description:
> 1. A ShuffleMapStage has many tasks; some of them finish successfully.
> 2. An executor is lost, which triggers a new TaskSet to be resubmitted, 
> including all missing partitions.
> 3. Before the resubmitted TaskSet completes, another executor, which only hosts 
> the task killed by the other attempt, is lost, triggering the Resubmitted event; 
> the current stage's pendingPartitions is not empty.
> 4. The resubmitted TaskSet ends, shuffleMapStage.isAvailable == true, but 
> pendingPartitions is not empty, so we never step into submitWaitingChildStages.
> Leave the key logs of this scenario below:
> {noformat}
> 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 120 missing tasks from ShuffleMapStage 1046 
> (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116)
> 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.0 with 120 tasks
> 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: 
> Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, 
> executor 15, partition 66, PROCESS_LOCAL, 6237 bytes)
> [1] Executor 15 lost, task 66.0 and 90.0 on it
> 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15.
> 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: 
> Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, 
> executor 70, partition 66, PROCESS_LOCAL, 6237 bytes)
> [2] Task 66.0 killed by 66.1
> 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing 
> attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on 
> hidden-baidu-host.baidu.com as the attempt 1 succeeded on 
> hidden-baidu-host.baidu.com
> 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished 
> task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on 
> hidden-baidu-host.baidu.com (executor 70) (115/120)
> [3] Executor 7 lost, task 0.0 72.0 7.0 on it
> 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s
> [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted 
> 1046.1
> 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of 
> its tasks had failed: 0, 72, 79
> 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at 
> AFDEntry.scala:116), which has no missing parents
> 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] 
> at rdd at AFDEntry.scala:116)
> 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.1 with 3 tasks
> 416034:17/09/11 13:46:59 [dispatcher-event-loop-21] INFO TaskSetManager: 
> Starting task 0.0 in stage 1046.1 (TID 112788, hidden-baidu-host.baidu.com, 
> executor 37, partition 0, PROCESS_LOCAL, 6237 bytes)
> 416037:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 1.0 in stage 1046.1 (TID 112789, 
> yq01-inf-nmg01-spark03-20160817113538.yq01.baidu.com, executor 69, partition 
> 72, PROCESS_LOCAL, 6237 bytes)
> 416039:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 2.0 in stage 1046.1 (TID 112790, hidden-baidu-host.baidu.com, 
> executor 26, partition 79, PROCESS_LOCAL, 6237 bytes)
>

[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Li Yuanjian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181943#comment-16181943
 ] 

Li Yuanjian commented on SPARK-22074:
-

Hi [~jerryshao] Saisai, 66.0 was resubmitted because its executor was lost 
while 1046.1 was running. I also reproduced this in the UT added in my patch 
and added a detailed scenario description in a comment; the test fails without 
the changes in this PR and passes with them. Could you help me check whether 
the UT recreates the scenario correctly? Thanks a lot. :)

> Task killed by other attempt task should not be resubmitted
> ---
>
> Key: SPARK-22074
> URL: https://issues.apache.org/jira/browse/SPARK-22074
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Li Yuanjian
>
> When a task is killed by another task attempt, the task is still resubmitted when 
> its executor is lost. With a certain probability this causes the stage to hang 
> forever because of the unnecessary resubmit (see the scenario description 
> below). Although the patch https://issues.apache.org/jira/browse/SPARK-13931 
> can resolve the hanging problem (thx [~GavinGavinNo1] :) ), the unnecessary 
> resubmit should be avoided.
> Detailed scenario description:
> 1. A ShuffleMapStage has many tasks; some of them finish successfully.
> 2. An executor is lost, which triggers a new TaskSet to be resubmitted, 
> including all missing partitions.
> 3. Before the resubmitted TaskSet completes, another executor, which only hosts 
> the task killed by the other attempt, is lost, triggering the Resubmitted event; 
> the current stage's pendingPartitions is not empty.
> 4. The resubmitted TaskSet ends, shuffleMapStage.isAvailable == true, but 
> pendingPartitions is not empty, so we never step into submitWaitingChildStages.
> Leave the key logs of this scenario below:
> {noformat}
> 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 120 missing tasks from ShuffleMapStage 1046 
> (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116)
> 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.0 with 120 tasks
> 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: 
> Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, 
> executor 15, partition 66, PROCESS_LOCAL, 6237 bytes)
> [1] Executor 15 lost, task 66.0 and 90.0 on it
> 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15.
> 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: 
> Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, 
> executor 70, partition 66, PROCESS_LOCAL, 6237 bytes)
> [2] Task 66.0 killed by 66.1
> 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing 
> attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on 
> hidden-baidu-host.baidu.com as the attempt 1 succeeded on 
> hidden-baidu-host.baidu.com
> 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished 
> task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on 
> hidden-baidu-host.baidu.com (executor 70) (115/120)
> [3] Executor 7 lost, task 0.0 72.0 7.0 on it
> 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s
> [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted 
> 1046.1
> 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of 
> its tasks had failed: 0, 72, 79
> 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at 
> AFDEntry.scala:116), which has no missing parents
> 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] 
> at rdd at AFDEntry.scala:116)
> 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.1 with 3 tasks
> 416034:17/09/11 13:46:59 [dispatcher-event-loop-21] INFO TaskSetManager: 
> Starting task 0.0 in stage 1046.1 (TID 112788, hidden-baidu-host.baidu.com, 
> executor 37, partition 0, PROCESS_LOCAL, 6237 bytes)
> 416037:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 1.0 in stage 1046.1 (TID 112789, 
> yq01-inf-nmg01-spark03-20160817113538.yq01.baidu.com, executor 69, partition 
> 72, PROCESS_LOCAL, 6237 bytes)
> 416039:17/09/11 13:46:59 [dispatcher-event-

[jira] [Comment Edited] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Li Yuanjian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181943#comment-16181943
 ] 

Li Yuanjian edited comment on SPARK-22074 at 9/27/17 4:46 AM:
--

Hi [~jerryshao] Saisai, 66.0 was resubmitted because its executor was lost 
while 1046.1 was running. I also reproduced this in the 
UT (https://github.com/apache/spark/pull/19287/files#diff-8425e96a6c100b5f368b8e520ad80068R748) 
added in my patch and added a detailed scenario description in a comment; the 
test fails without the changes in this PR and passes with them. Could you help 
me check whether the UT recreates the scenario correctly? Thanks a lot. :)


was (Author: xuanyuan):
Hi [~jerryshao] Saisai, 66.0 was resubmitted because its executor was lost 
while 1046.1 was running. I also reproduced this in the UT added in my patch 
and added a detailed scenario description in a comment; the test fails without 
the changes in this PR and passes with them. Could you help me check whether 
the UT recreates the scenario correctly? Thanks a lot. :)

> Task killed by other attempt task should not be resubmitted
> ---
>
> Key: SPARK-22074
> URL: https://issues.apache.org/jira/browse/SPARK-22074
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Li Yuanjian
>
> When a task is killed by another task attempt, the task is still resubmitted when 
> its executor is lost. With a certain probability this causes the stage to hang 
> forever because of the unnecessary resubmit (see the scenario description 
> below). Although the patch https://issues.apache.org/jira/browse/SPARK-13931 
> can resolve the hanging problem (thx [~GavinGavinNo1] :) ), the unnecessary 
> resubmit should be avoided.
> Detailed scenario description:
> 1. A ShuffleMapStage has many tasks; some of them finish successfully.
> 2. An executor is lost, which triggers a new TaskSet to be resubmitted, 
> including all missing partitions.
> 3. Before the resubmitted TaskSet completes, another executor, which only hosts 
> the task killed by the other attempt, is lost, triggering the Resubmitted event; 
> the current stage's pendingPartitions is not empty.
> 4. The resubmitted TaskSet ends, shuffleMapStage.isAvailable == true, but 
> pendingPartitions is not empty, so we never step into submitWaitingChildStages.
> Leave the key logs of this scenario below:
> {noformat}
> 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 120 missing tasks from ShuffleMapStage 1046 
> (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116)
> 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.0 with 120 tasks
> 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: 
> Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, 
> executor 15, partition 66, PROCESS_LOCAL, 6237 bytes)
> [1] Executor 15 lost, task 66.0 and 90.0 on it
> 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15.
> 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: 
> Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, 
> executor 70, partition 66, PROCESS_LOCAL, 6237 bytes)
> [2] Task 66.0 killed by 66.1
> 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing 
> attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on 
> hidden-baidu-host.baidu.com as the attempt 1 succeeded on 
> hidden-baidu-host.baidu.com
> 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished 
> task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on 
> hidden-baidu-host.baidu.com (executor 70) (115/120)
> [3] Executor 7 lost, task 0.0 72.0 7.0 on it
> 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s
> [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted 
> 1046.1
> 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of 
> its tasks had failed: 0, 72, 79
> 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at 
> AFDEntry.scala:116), which has no missing parents
> 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] 
> at rdd at AFDEntry.scala:116)
> 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task se

[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182062#comment-16182062
 ] 

Saisai Shao commented on SPARK-22074:
-

So if I understand correctly, this happens when speculation kicks in: once one 
task attempt finishes (66.1), it tries to kill all other attempts (66.0), but 
before that attempt (66.0) is fully killed, the executor running it is lost, so 
the scheduler resubmits the attempt because of the executor loss and ignores 
the other successful attempt. Am I right?



> Task killed by other attempt task should not be resubmitted
> ---
>
> Key: SPARK-22074
> URL: https://issues.apache.org/jira/browse/SPARK-22074
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Li Yuanjian
>
> When a task is killed by another task attempt, the task is still resubmitted when 
> its executor is lost. With a certain probability this causes the stage to hang 
> forever because of the unnecessary resubmit (see the scenario description 
> below). Although the patch https://issues.apache.org/jira/browse/SPARK-13931 
> can resolve the hanging problem (thx [~GavinGavinNo1] :) ), the unnecessary 
> resubmit should be avoided.
> Detailed scenario description:
> 1. A ShuffleMapStage has many tasks; some of them finish successfully.
> 2. An executor is lost, which triggers a new TaskSet to be resubmitted, 
> including all missing partitions.
> 3. Before the resubmitted TaskSet completes, another executor, which only hosts 
> the task killed by the other attempt, is lost, triggering the Resubmitted event; 
> the current stage's pendingPartitions is not empty.
> 4. The resubmitted TaskSet ends, shuffleMapStage.isAvailable == true, but 
> pendingPartitions is not empty, so we never step into submitWaitingChildStages.
> Leave the key logs of this scenario below:
> {noformat}
> 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 120 missing tasks from ShuffleMapStage 1046 
> (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116)
> 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.0 with 120 tasks
> 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: 
> Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, 
> executor 15, partition 66, PROCESS_LOCAL, 6237 bytes)
> [1] Executor 15 lost, task 66.0 and 90.0 on it
> 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15.
> 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: 
> Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, 
> executor 70, partition 66, PROCESS_LOCAL, 6237 bytes)
> [2] Task 66.0 killed by 66.1
> 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing 
> attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on 
> hidden-baidu-host.baidu.com as the attempt 1 succeeded on 
> hidden-baidu-host.baidu.com
> 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished 
> task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on 
> hidden-baidu-host.baidu.com (executor 70) (115/120)
> [3] Executor 7 lost, task 0.0 72.0 7.0 on it
> 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s
> [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted 
> 1046.1
> 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of 
> its tasks had failed: 0, 72, 79
> 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at 
> AFDEntry.scala:116), which has no missing parents
> 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] 
> at rdd at AFDEntry.scala:116)
> 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.1 with 3 tasks
> 416034:17/09/11 13:46:59 [dispatcher-event-loop-21] INFO TaskSetManager: 
> Starting task 0.0 in stage 1046.1 (TID 112788, hidden-baidu-host.baidu.com, 
> executor 37, partition 0, PROCESS_LOCAL, 6237 bytes)
> 416037:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 1.0 in stage 1046.1 (TID 112789, 
> yq01-inf-nmg01-spark03-20160817113538.yq01.baidu.com, executor 69, partition 
> 72, PROCESS_LOCAL, 6237 bytes)
> 416039:17/09/11 13

[jira] [Commented] (SPARK-22074) Task killed by other attempt task should not be resubmitted

2017-09-26 Thread Li Yuanjian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16182095#comment-16182095
 ] 

Li Yuanjian commented on SPARK-22074:
-

Yes, that's right.
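
With the scenario confirmed, here is a minimal sketch of the proposed guard, assuming a per-partition successful flag as the scheduler's record of completed output; this is not the actual TaskSetManager code, only the shape of the check.

{noformat}
import scala.collection.mutable

// Sketch only: when a task attempt comes back as Resubmitted because its
// executor was lost, skip it if another attempt of the same partition has
// already succeeded, instead of putting the partition back into the pending set.
def handleResubmitted(partition: Int,
                      successful: Array[Boolean],
                      pendingPartitions: mutable.Set[Int]): Unit = {
  if (!successful(partition)) {
    pendingPartitions += partition
  }
  // else: output already exists from the other attempt, nothing to re-run
}
{noformat}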

> Task killed by other attempt task should not be resubmitted
> ---
>
> Key: SPARK-22074
> URL: https://issues.apache.org/jira/browse/SPARK-22074
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Li Yuanjian
>
> When a task is killed by another task attempt, the task is still resubmitted when 
> its executor is lost. With a certain probability this causes the stage to hang 
> forever because of the unnecessary resubmit (see the scenario description 
> below). Although the patch https://issues.apache.org/jira/browse/SPARK-13931 
> can resolve the hanging problem (thx [~GavinGavinNo1] :) ), the unnecessary 
> resubmit should be avoided.
> Detailed scenario description:
> 1. A ShuffleMapStage has many tasks; some of them finish successfully.
> 2. An executor is lost, which triggers a new TaskSet to be resubmitted, 
> including all missing partitions.
> 3. Before the resubmitted TaskSet completes, another executor, which only hosts 
> the task killed by the other attempt, is lost, triggering the Resubmitted event; 
> the current stage's pendingPartitions is not empty.
> 4. The resubmitted TaskSet ends, shuffleMapStage.isAvailable == true, but 
> pendingPartitions is not empty, so we never step into submitWaitingChildStages.
> Leave the key logs of this scenario below:
> {noformat}
> 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 120 missing tasks from ShuffleMapStage 1046 
> (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116)
> 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.0 with 120 tasks
> 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: 
> Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, 
> executor 15, partition 66, PROCESS_LOCAL, 6237 bytes)
> [1] Executor 15 lost, task 66.0 and 90.0 on it
> 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15.
> 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: 
> Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, 
> executor 70, partition 66, PROCESS_LOCAL, 6237 bytes)
> [2] Task 66.0 killed by 66.1
> 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing 
> attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on 
> hidden-baidu-host.baidu.com as the attempt 1 succeeded on 
> hidden-baidu-host.baidu.com
> 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished 
> task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on 
> hidden-baidu-host.baidu.com (executor 70) (115/120)
> [3] Executor 7 lost, task 0.0 72.0 7.0 on it
> 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO 
> YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7.
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s
> [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted 
> 1046.1
> 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of 
> its tasks had failed: 0, 72, 79
> 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at 
> AFDEntry.scala:116), which has no missing parents
> 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: 
> Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] 
> at rdd at AFDEntry.scala:116)
> 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO 
> YarnClusterScheduler: Adding task set 1046.1 with 3 tasks
> 416034:17/09/11 13:46:59 [dispatcher-event-loop-21] INFO TaskSetManager: 
> Starting task 0.0 in stage 1046.1 (TID 112788, hidden-baidu-host.baidu.com, 
> executor 37, partition 0, PROCESS_LOCAL, 6237 bytes)
> 416037:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 1.0 in stage 1046.1 (TID 112789, 
> yq01-inf-nmg01-spark03-20160817113538.yq01.baidu.com, executor 69, partition 
> 72, PROCESS_LOCAL, 6237 bytes)
> 416039:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: 
> Starting task 2.0 in stage 1046.1 (TID 112790, hidden-baidu-host.baidu.com, 
> executor 26, partition 79, PROCESS_LOCAL, 6237 bytes)
> [5] ShuffleMapStage 1046.1 still running, the attempted task killed by other 
> trigger the Resubmitted event
> 416646:17/09/11 13:47:01 [dispatcher-event-loop-2

[jira] [Created] (SPARK-22139) Remove the variable which is never used in SparkConf.scala

2017-09-26 Thread guoxiaolongzte (JIRA)
guoxiaolongzte created SPARK-22139:
--

 Summary: Remove the variable which is never used in SparkConf.scala
 Key: SPARK-22139
 URL: https://issues.apache.org/jira/browse/SPARK-22139
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.3.0
Reporter: guoxiaolongzte
Priority: Trivial


Remove the variables that are never used in SparkConf.scala.

val executorClasspathKey = "spark.executor.extraClassPath"
val driverOptsKey = "spark.driver.extraJavaOptions"
val driverClassPathKey = "spark.driver.extraClassPath"
val sparkExecutorInstances = "spark.executor.instances"

These variables are never used, because the implementation code for the 
validation rule was removed in SPARK-17979.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org