[jira] [Commented] (SPARK-27663) Task accomplished incompletely but marked as success

2019-05-09 Thread Hyukjin Kwon (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836435#comment-16836435
 ] 

Hyukjin Kwon commented on SPARK-27663:
--

+1. This should be narrowed down further; please also check whether the issue 
still exists in a newer version.

> Task accomplished incompletely but marked as success
> 
>
> Key: SPARK-27663
> URL: https://issues.apache.org/jira/browse/SPARK-27663
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 2.1.0
>Reporter: Yunbo Fan
>Priority: Major
> Attachments: image-2019-05-09-11-10-04-602.png, incomplte-task-0.png, 
> incomplte-task-1.png, incomplte-task-2.png, reran-0.png, reran-1.png
>
>
> This happens when running SQL queries with Spark SQL.
> The task finished incompletely but was marked as successful, since there were 
> no exceptions and no failed or killed tasks.
> When I checked the query result, about 4000 records were missing.
> The history web UI shows a task input size of 23.5 MB, but the executor log 
> shows a split size of 326992763 bytes, about 300 MB.
> This task finished in 1 second, while the others took about 15 
> seconds.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27663) Task accomplished incompletely but marked as success

2019-05-09 Thread Sean Owen (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836392#comment-16836392
 ] 

Sean Owen commented on SPARK-27663:
---

Also, you'd need to reproduce on master. 2.1.0 is very old.




[jira] [Commented] (SPARK-27663) Task accomplished incompletely but marked as success

2019-05-09 Thread Sean Owen (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836391#comment-16836391
 ] 

Sean Owen commented on SPARK-27663:
---

No idea, I don't think this narrows it down enough to say. You might search 
JIRA for similar sounding issues first.




[jira] [Commented] (SPARK-27663) Task accomplished incompletely but marked as success

2019-05-08 Thread Fan Yunbo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836055#comment-16836055
 ] 

Fan Yunbo commented on SPARK-27663:
---

[~jerryshao], [~srowen]

Could you please take a look?




[jira] [Commented] (SPARK-27663) Task accomplished incompletely but marked as success

2019-05-08 Thread Fan Yunbo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836054#comment-16836054
 ] 

Fan Yunbo commented on SPARK-27663:
---

Two thoughts:
 # The executor shutdown hook was triggered while the task was running; it 
ended the task, which was nevertheless marked as successful.
 # An exception occurred in the task, but the executor either did not catch it 
or ignored it.

 




[jira] [Commented] (SPARK-27663) Task accomplished incompletely but marked as success

2019-05-08 Thread Fan Yunbo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836047#comment-16836047
 ] 

Fan Yunbo commented on SPARK-27663:
---

When I reran the query, it finished normally.

!reran-0.png!

!reran-1.png!




[jira] [Commented] (SPARK-27663) Task accomplished incompletely but marked as success

2019-05-08 Thread Fan Yunbo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836044#comment-16836044
 ] 

Fan Yunbo commented on SPARK-27663:
---

The incomplete task is task 17.0 in stage 98517.0.

!incomplte-task-1.png!

The input size is 23.5 MB, and the task finished in 1 s: !incomplte-task-2.png!

The log shows the input split as:
{code:java}
Input split: hdfs://cqocdc/user/hive/warehouse/dw_user_useage_privilege_dt_mmdd/month_id=201904/day_id=20190422/17_0.snappy:0+326992763{code}
{code:java}
19/04/23 12:09:18 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 6835988
19/04/23 12:09:18 INFO executor.Executor: Running task 17.0 in stage 98517.0 (TID 6835988)
19/04/23 12:09:18 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 173456
19/04/23 12:09:18 INFO memory.MemoryStore: Block broadcast_173456_piece0 stored as bytes in memory (estimated size 13.4 KB, free 15.2 GB)
19/04/23 12:09:18 INFO broadcast.TorrentBroadcast: Reading broadcast variable 173456 took 4 ms
19/04/23 12:09:18 INFO memory.MemoryStore: Block broadcast_173456 stored as values in memory (estimated size 30.3 KB, free 15.2 GB)
19/04/23 12:09:18 INFO rdd.HadoopRDD: Input split: hdfs://cqocdc/user/hive/warehouse/dw_user_useage_privilege_dt_mmdd/month_id=201904/day_id=20190422/17_0.snappy:0+326992763
19/04/23 12:09:18 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 173452
19/04/23 12:09:18 INFO memory.MemoryStore: Block broadcast_173452_piece0 stored as bytes in memory (estimated size 30.8 KB, free 15.2 GB)
19/04/23 12:09:18 INFO broadcast.TorrentBroadcast: Reading broadcast variable 173452 took 3 ms
19/04/23 12:09:18 INFO memory.MemoryStore: Block broadcast_173452 stored as values in memory (estimated size 365.1 KB, free 15.3 GB)
19/04/23 12:09:18 INFO codegen.CodeGenerator: Code generated in 6.949728 ms
19/04/23 12:09:18 INFO codegen.CodeGenerator: Code generated in 20.909883 ms
19/04/23 12:09:18 INFO output.FileOutputCommitter: Saved output of task 'attempt_20190423120856_98508_m_47_0' to hdfs://cqocdc/tmp/.staging/hive_hive_2019-04-23_12-08-56_154_3110404551071203558-1370/-ext-1/_temporary/0/task_20190423120856_98508_m_47
19/04/23 12:09:18 INFO mapred.SparkHadoopMapRedUtil: attempt_20190423120856_98508_m_47_0: Committed
19/04/23 12:09:18 INFO executor.Executor: Finished task 47.0 in stage 98508.0 (TID 6835975). 3217 bytes result sent to driver
19/04/23 12:09:19 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
19/04/23 12:09:19 INFO storage.DiskBlockManager: Shutdown hook called
19/04/23 12:09:19 INFO util.ShutdownHookManager: Shutdown hook called
19/04/23 12:09:19 INFO executor.Executor: Finished task 17.0 in stage 98517.0 (TID 6835988). 3188 bytes result sent to driver
{code}
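In the log above, task 17.0 is reported as finished only after the executor received SIGTERM and the shutdown hooks had run. A rough way to look for the same pattern in other executor logs is sketched below. It is only a sketch: the function name and the regular expression are made up for illustration and assume the log line format shown above, and a match is a hint worth investigating, not proof that a task produced incomplete output.

```python
import re

# Sample lines copied from the executor log quoted above.
LOG = """\
19/04/23 12:09:18 INFO executor.Executor: Running task 17.0 in stage 98517.0 (TID 6835988)
19/04/23 12:09:18 INFO executor.Executor: Finished task 47.0 in stage 98508.0 (TID 6835975). 3217 bytes result sent to driver
19/04/23 12:09:19 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
19/04/23 12:09:19 INFO storage.DiskBlockManager: Shutdown hook called
19/04/23 12:09:19 INFO executor.Executor: Finished task 17.0 in stage 98517.0 (TID 6835988). 3188 bytes result sent to driver
"""

# Hypothetical pattern; it assumes the "Finished task X in stage Y (TID Z)" format above.
FINISHED = re.compile(r"Finished task (\S+) in stage (\S+) \(TID (\d+)\)")


def tasks_finished_after_sigterm(log_text):
    """Return (task, stage, tid) for every task logged as finished after SIGTERM."""
    seen_sigterm = False
    suspects = []
    for line in log_text.splitlines():
        if "RECEIVED SIGNAL TERM" in line:
            seen_sigterm = True
            continue
        match = FINISHED.search(line)
        if match and seen_sigterm:
            suspects.append((match.group(1), match.group(2), int(match.group(3))))
    return suspects


print(tasks_finished_after_sigterm(LOG))
```

On the sample lines, only task 17.0 in stage 98517.0 is flagged; task 47.0 finished before the signal arrived.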
The file size and last modified time:

!image-2019-05-09-11-10-04-602.png!

The total input of the query's stage is 14.9 GB:

!incomplte-task-0.png!
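To put the size mismatch for task 17.0 in numbers, the small calculation below compares the input size reported in the web UI (23.5 MB) with the split length from the log (326992763 bytes). It assumes that the UI's "MB" means 2**20 bytes and that both figures count bytes of the compressed file; neither assumption is confirmed in this thread, so the result is only a rough indicator. The helper name is made up for illustration.

```python
SPLIT_BYTES = 326992763   # split length from "...17_0.snappy:0+326992763"
REPORTED_MB = 23.5        # task input size shown in the history web UI
MIB = 1024 * 1024         # assumption: the UI's "MB" means 2**20 bytes


def fraction_read(reported_mb, split_bytes, unit=MIB):
    """Fraction of the split that the reported input size accounts for."""
    return reported_mb * unit / split_bytes


split_mib = SPLIT_BYTES / MIB
ratio = fraction_read(REPORTED_MB, SPLIT_BYTES)
print(f"split: {split_mib:.1f} MiB, reported input: {ratio:.1%} of the split")
```

Under these assumptions the split is roughly 312 MiB and the task read only about 7.5% of it, which is consistent with the task ending early.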

 
