[jira] [Updated] (SPARK-24519) MapStatus has 2000 hardcoded

2018-06-19 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24519: -- Description: MapStatus uses hardcoded value of 2000 partitions to determine if it should use

[jira] [Updated] (SPARK-13343) speculative tasks that didn't commit shouldn't be marked as success

2018-07-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-13343: -- Description: Currently Speculative tasks that didn't commit can show up as success 

[jira] [Created] (SPARK-24124) Spark history server should create spark.history.store.path and set permissions properly

2018-04-30 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-24124: - Summary: Spark history server should create spark.history.store.path and set permissions properly Key: SPARK-24124 URL: https://issues.apache.org/jira/browse/SPARK-24124

[jira] [Commented] (SPARK-24124) Spark history server should create spark.history.store.path and set permissions properly

2018-04-30 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458835#comment-16458835 ] Thomas Graves commented on SPARK-24124: --- [~vanzin]  any objections to this? > Spark history server

[jira] [Commented] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2018-01-18 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331045#comment-16331045 ] Thomas Graves commented on SPARK-20928: --- what is status of this, it looks like subtasks are

[jira] [Commented] (SPARK-23189) reflect stage level blacklisting on executor tab

2018-01-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338005#comment-16338005 ] Thomas Graves commented on SPARK-23189: --- for large jobs the specific stage page is a pain to

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350428#comment-16350428 ] Thomas Graves commented on SPARK-23304: --- well I guess that give you end # of partitions and not the

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350440#comment-16350440 ] Thomas Graves commented on SPARK-23304: --- ok so I guess by that logic then the coalesce won't every

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350423#comment-16350423 ] Thomas Graves commented on SPARK-23304: --- it doesn't look like sql("xyz").rdd.partitions.length

[jira] [Resolved] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-23304. --- Resolution: Invalid > Spark SQL coalesce() against hive not working >

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350533#comment-16350533 ] Thomas Graves commented on SPARK-23309: --- Note the schema of "something" here is a "string". I'll

[jira] [Comment Edited] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350813#comment-16350813 ] Thomas Graves edited comment on SPARK-23309 at 2/2/18 8:15 PM: --- I'm still

[jira] [Comment Edited] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350813#comment-16350813 ] Thomas Graves edited comment on SPARK-23309 at 2/2/18 7:04 PM: --- I'm still

[jira] [Comment Edited] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350901#comment-16350901 ] Thomas Graves edited comment on SPARK-23309 at 2/2/18 8:29 PM: --- I should

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350900#comment-16350900 ] Thomas Graves commented on SPARK-23309: --- So the last test I did was spark 2.3 with the old hive

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350901#comment-16350901 ] Thomas Graves commented on SPARK-23309: --- I should ask is there a log statement or query plan I can

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350813#comment-16350813 ] Thomas Graves commented on SPARK-23309: --- I'm still seeing spark 2.3 slower by about 15% for the

[jira] [Created] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-23304: - Summary: Spark SQL coalesce() against hive not working Key: SPARK-23304 URL: https://issues.apache.org/jira/browse/SPARK-23304 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349367#comment-16349367 ] Thomas Graves commented on SPARK-23304: --- ok I've attached 2 files one with spark 2.3 and one with

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349446#comment-16349446 ] Thomas Graves commented on SPARK-23304: --- It still seems like a bug to me since the coalesce isn't

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349444#comment-16349444 ] Thomas Graves commented on SPARK-23304: --- [~smilegator] just to make sure you saw my comment above,

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349212#comment-16349212 ] Thomas Graves commented on SPARK-23304: --- yes there are difference in the # of partitions between

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349220#comment-16349220 ] Thomas Graves commented on SPARK-23304: --- If it helps , spark 2.3 # partitions is 317531 and spark

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349227#comment-16349227 ] Thomas Graves commented on SPARK-23304: --- Ok, I just realized what you are getting at, I tried on

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23304: -- Priority: Major (was: Blocker) > Spark SQL coalesce() against hive not working >

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23304: -- Description: The query below seems to ignore the coalesce. This is running spark 2.2 or spark

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23304: -- Attachment: spark23_oldorc_explain.txt spark22_oldorc_explain.txt > Spark SQL

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349369#comment-16349369 ] Thomas Graves commented on SPARK-23304: --- Note I've removed some of the columns from the output, if

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349483#comment-16349483 ] Thomas Graves commented on SPARK-23309: --- sure, I can also run with the  --conf

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349270#comment-16349270 ] Thomas Graves commented on SPARK-23304: --- so with the new ORC code is there anyway to control the #

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349431#comment-16349431 ] Thomas Graves commented on SPARK-23309: --- I'm curious if anyone else is seeing the same behavior? 

[jira] [Updated] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23304: -- Attachment: spark23_oldorc_explain_convermetastoreorcfalse.txt > Spark SQL coalesce() against

[jira] [Created] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-23309: - Summary: Spark 2.3 cached query performance 20-30% worse then spark 2.2 Key: SPARK-23309 URL: https://issues.apache.org/jira/browse/SPARK-23309 Project: Spark

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349430#comment-16349430 ] Thomas Graves commented on SPARK-23304: ---   I filed Jira

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349480#comment-16349480 ] Thomas Graves commented on SPARK-23304: --- I just ran the query (show()) and saw the # of partitions. 

[jira] [Commented] (SPARK-23304) Spark SQL coalesce() against hive not working

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349555#comment-16349555 ] Thomas Graves commented on SPARK-23304: --- I don't have any hive tables backed by parquet to compare

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349553#comment-16349553 ] Thomas Graves commented on SPARK-23309: --- [~dongjoon] is there any native way with the native hive

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349551#comment-16349551 ] Thomas Graves commented on SPARK-23309: --- seeing the same time difference after adding in the   

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-02-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355920#comment-16355920 ] Thomas Graves commented on SPARK-22683: --- If the config is set to 1 which keeps the current behavior

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-08 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357346#comment-16357346 ] Thomas Graves commented on SPARK-23309: --- sorry I haven't had time to make a query/dataset to

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-02-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356168#comment-16356168 ] Thomas Graves commented on SPARK-22683: --- I agree, I think default behavior stays 1.  I ran a few

[jira] [Comment Edited] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-02-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356168#comment-16356168 ] Thomas Graves edited comment on SPARK-22683 at 2/7/18 10:24 PM: I agree,

[jira] [Commented] (SPARK-23309) Spark 2.3 cached query performance 20-30% worse then spark 2.2

2018-02-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354336#comment-16354336 ] Thomas Graves commented on SPARK-23309: --- I pulled in that patch

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-02-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1637#comment-1637 ] Thomas Graves commented on SPARK-22683: --- ok thanks,  I would like to try this out myself on a few

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-01-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309910#comment-16309910 ] Thomas Graves commented on SPARK-22683: --- [~jcuquemelle] . just to confirm your applications

[jira] [Commented] (SPARK-24622) Task attempts in other stage attempts not killed when one task attempt succeeds

2018-06-21 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519416#comment-16519416 ] Thomas Graves commented on SPARK-24622: --- Need to investigate further/test to make sure I am not

[jira] [Created] (SPARK-24622) Task attempts in other stage attempts not killed when one task attempt succeeds

2018-06-21 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-24622: - Summary: Task attempts in other stage attempts not killed when one task attempt succeeds Key: SPARK-24622 URL: https://issues.apache.org/jira/browse/SPARK-24622

[jira] [Comment Edited] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-21 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519445#comment-16519445 ] Thomas Graves edited comment on SPARK-24552 at 6/21/18 3:02 PM: more

[jira] [Comment Edited] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-21 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519445#comment-16519445 ] Thomas Graves edited comment on SPARK-24552 at 6/21/18 3:01 PM: more

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-21 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519445#comment-16519445 ] Thomas Graves commented on SPARK-24552: --- more details on hadoop committer side: So I think the

[jira] [Commented] (SPARK-24552) Task attempt numbers are reused when stages are retried

2018-06-21 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519430#comment-16519430 ] Thomas Graves commented on SPARK-24552: --- this is actually a problem with hadoop committers, v1 and

[jira] [Commented] (SPARK-24611) Clean up OutputCommitCoordinator

2018-06-21 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519420#comment-16519420 ] Thomas Graves commented on SPARK-24611: --- [~joshrosen]  just noticed you were the last one to

[jira] [Resolved] (SPARK-24589) OutputCommitCoordinator may allow duplicate commits

2018-06-21 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-24589. --- Resolution: Fixed Assignee: Marcelo Vanzin Fix Version/s: 2.4.0

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567045#comment-16567045 ] Thomas Graves commented on SPARK-24924: --- why are we doing this? If a user ships the spark-avro

[jira] [Commented] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-08-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565437#comment-16565437 ] Thomas Graves commented on SPARK-24909: --- this is unfortunately not a straight forward fix, the

[jira] [Commented] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-08-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565513#comment-16565513 ] Thomas Graves commented on SPARK-24909: --- looking more I think the fix may actually just be to

[jira] [Commented] (SPARK-24986) OOM in BufferHolder during writes to a stream

2018-08-01 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565296#comment-16565296 ] Thomas Graves commented on SPARK-24986: --- fyi [~irashid] I know you were looking at memory related

[jira] [Updated] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-08-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24909: -- Target Version/s: 2.4.0 > Spark scheduler can hang when fetch failures, executor lost, task

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568199#comment-16568199 ] Thomas Graves commented on SPARK-24924: --- Hmm, so we are adding this for ease of upgrading I guess

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568204#comment-16568204 ] Thomas Graves commented on SPARK-24924: --- [~felixcheung] did your discussion on the same thing with

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568393#comment-16568393 ] Thomas Graves commented on SPARK-24924: --- | It wouldn't be very different for 2.4.0. It could be

[jira] [Created] (SPARK-25016) remove Support for hadoop 2.6

2018-08-03 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-25016: - Summary: remove Support for hadoop 2.6 Key: SPARK-25016 URL: https://issues.apache.org/jira/browse/SPARK-25016 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-25016) remove Support for hadoop 2.6

2018-08-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-25016: -- Target Version/s: 3.0.0 > remove Support for hadoop 2.6 > - > >

[jira] [Commented] (SPARK-24934) Complex type and binary type in in-memory partition pruning does not work due to missing upper/lower bounds cases

2018-07-30 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561944#comment-16561944 ] Thomas Graves commented on SPARK-24934: --- what is the real affected versions here?  Since it went

[jira] [Resolved] (SPARK-13343) speculative tasks that didn't commit shouldn't be marked as success

2018-07-27 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-13343. --- Resolution: Fixed Assignee: Hieu Tri Huynh Fix Version/s: 2.4.0 >

[jira] [Commented] (SPARK-24579) SPIP: Standardize Optimized Data Exchange between Spark and DL/AI frameworks

2018-07-31 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563689#comment-16563689 ] Thomas Graves commented on SPARK-24579: --- going from Spark feeds data into DL/AI frameworks for

[jira] [Commented] (SPARK-24615) Accelerator-aware task scheduling for Spark

2018-07-31 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563711#comment-16563711 ] Thomas Graves commented on SPARK-24615: --- so I guess my question is this the right approach at all. 

[jira] [Issue Comment Deleted] (SPARK-25024) Update mesos documentation to be clear about security supported

2018-08-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-25024: -- Comment: was deleted (was: I'm going to work on this.) > Update mesos documentation to be

[jira] [Commented] (SPARK-25023) Clarify Spark security documentation

2018-08-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568767#comment-16568767 ] Thomas Graves commented on SPARK-25023: --- I'm going to work on this > Clarify Spark security

[jira] [Created] (SPARK-25024) Update mesos documentation to be clear about security supported

2018-08-03 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-25024: - Summary: Update mesos documentation to be clear about security supported Key: SPARK-25024 URL: https://issues.apache.org/jira/browse/SPARK-25024 Project: Spark

[jira] [Commented] (SPARK-25023) Clarify Spark security documentation

2018-08-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568805#comment-16568805 ] Thomas Graves commented on SPARK-25023: --- note some of this was already updated with

[jira] [Commented] (SPARK-24918) Executor Plugin API

2018-07-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558331#comment-16558331 ] Thomas Graves commented on SPARK-24918: --- I think this is a good idea. I thought I had seen a Jira

[jira] [Updated] (SPARK-23243) Shuffle+Repartition on an RDD could lead to incorrect answers

2018-07-27 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-23243: -- Priority: Blocker (was: Major) > Shuffle+Repartition on an RDD could lead to incorrect

[jira] [Commented] (SPARK-25081) Nested spill in ShuffleExternalSorter may access a released memory page

2018-08-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576700#comment-16576700 ] Thomas Graves commented on SPARK-25081: --- thanks, wanted to clarify since the description only

[jira] [Commented] (SPARK-23298) distinct.count on Dataset/DataFrame yields non-deterministic results

2018-08-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576709#comment-16576709 ] Thomas Graves commented on SPARK-23298: --- [~mjukiewicz] have you tried spark with fix for

[jira] [Commented] (SPARK-23207) Shuffle+Repartition on an DataFrame could lead to incorrect answers

2018-08-09 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574956#comment-16574956 ] Thomas Graves commented on SPARK-23207: --- ok, I guess I disagree with that. Any correctness bug is

[jira] [Commented] (SPARK-23207) Shuffle+Repartition on an DataFrame could lead to incorrect answers

2018-08-09 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574886#comment-16574886 ] Thomas Graves commented on SPARK-23207: --- [~jiangxb1987] ^ > Shuffle+Repartition on an DataFrame

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570736#comment-16570736 ] Thomas Graves commented on SPARK-24924: --- so if the user includes the databricks jar and they

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570840#comment-16570840 ] Thomas Graves commented on SPARK-24924: --- so officially the spark api compatibility is only at the

[jira] [Created] (SPARK-25023) Clarify Spark security documentation

2018-08-03 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-25023: - Summary: Clarify Spark security documentation Key: SPARK-25023 URL: https://issues.apache.org/jira/browse/SPARK-25023 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-25024) Update mesos documentation to be clear about security supported

2018-08-03 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568766#comment-16568766 ] Thomas Graves commented on SPARK-25024: --- I'm going to work on this. > Update mesos documentation

[jira] [Commented] (SPARK-24598) SPARK SQL:Datatype overflow conditions gives incorrect result

2018-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572011#comment-16572011 ] Thomas Graves commented on SPARK-24598: --- In the very least we should file a separate Jira to track

[jira] [Commented] (SPARK-23207) Shuffle+Repartition on an DataFrame could lead to incorrect answers

2018-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572232#comment-16572232 ] Thomas Graves commented on SPARK-23207: --- does this affect spark 2.2 and < ? from the description

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-15 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581117#comment-16581117 ] Thomas Graves commented on SPARK-24924: --- [~cloud_fan] [~hyukjin.kwon] seems no one else has a

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-15 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581608#comment-16581608 ] Thomas Graves commented on SPARK-24924: --- I'd be ok with that but CSV has been that way already for

[jira] [Resolved] (SPARK-25043) spark-sql should print the appId and master on startup

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-25043. --- Resolution: Fixed Fix Version/s: 2.4.0 > spark-sql should print the appId and master

[jira] [Assigned] (SPARK-25043) spark-sql should print the appId and master on startup

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-25043: - Assignee: Alessandro Bellina > spark-sql should print the appId and master on startup

[jira] [Updated] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-08-16 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24909: -- Description: The DAGScheduler can hang if the executor was lost (due to fetch failure) and

[jira] [Updated] (SPARK-24909) Spark scheduler can hang when fetch failures, executor lost, task running on lost executor, and multiple stage attempts

2018-08-16 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-24909: -- Description: The DAGScheduler can hang if the executor was lost (due to fetch failure) and

[jira] [Commented] (SPARK-25024) Update mesos documentation to be clear about security supported

2018-08-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570194#comment-16570194 ] Thomas Graves commented on SPARK-25024: --- We need to make it clear what mesos supports for security

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570220#comment-16570220 ] Thomas Graves commented on SPARK-24924: --- {quote}I have followed the changes in Avro and I don't

[jira] [Resolved] (SPARK-24981) ShutdownHook timeout causes job to fail when succeeded when SparkContext stop() not called by user program

2018-08-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-24981. --- Resolution: Fixed Assignee: Hieu Tri Huynh Fix Version/s: 2.4.0 >

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571852#comment-16571852 ] Thomas Graves commented on SPARK-24924: --- thanks, I missed it in the output for spark as I was just

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-06 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570586#comment-16570586 ] Thomas Graves commented on SPARK-24924: --- For compatibility we can't remove it unless major

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-07 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571908#comment-16571908 ] Thomas Graves commented on SPARK-24924: --- so originally when I started on this I didn't know about

[jira] [Commented] (SPARK-25024) Update mesos documentation to be clear about security supported

2018-08-09 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575267#comment-16575267 ] Thomas Graves commented on SPARK-25024: --- ok, I'm not familiar with mesos hardly at all so I

[jira] [Commented] (SPARK-25081) Nested spill in ShuffleExternalSorter may access a released memory page

2018-08-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576234#comment-16576234 ] Thomas Graves commented on SPARK-25081: --- Does this ever result in the task reading the wrong data

[jira] [Commented] (SPARK-24918) Executor Plugin API

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579851#comment-16579851 ] Thomas Graves commented on SPARK-24918: --- Personally I like the explicit config on better

[jira] [Commented] (SPARK-24787) Events being dropped at an alarming rate due to hsync being slow for eventLogging

2018-08-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579854#comment-16579854 ] Thomas Graves commented on SPARK-24787: --- Yes it was caused by hsync, hsync has to go to the
