[ 
https://issues.apache.org/jira/browse/HIVE-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493897#comment-16493897
 ] 

Sahil Takiar commented on HIVE-19508:
-------------------------------------

A single Spark stage can be attempted multiple times - e.g. something like 
{{Stage-1_0: ... Stage-1_1 ... Stage-2_0 ...}}. The comparator needs to compare 
based on both stage id and attempt id.

If you are up for a bit of refactoring, the implementation of {{getStageNum}} 
isn't ideal. We shouldn't rely on string parsing to extract the stage id and 
attempt id. {{SparkJobStatus#getSparkStageProgress}} should return a {{Map}} 
whose key isn't a string; instead, it should be a POJO that contains the stage 
id and the attempt id.
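
A rough sketch of what such a key could look like is below. The class name 
{{StageKey}} and its accessors are hypothetical and not taken from the patch; 
the point is only that ordering by stage id first and attempt id second falls 
out of a numeric comparison, with no string parsing involved.

```java
import java.util.Objects;

/**
 * Hypothetical map key (name and shape are assumptions, not the patch's API):
 * identifies one attempt of one Spark stage.
 */
final class StageKey implements Comparable<StageKey> {
  private final int stageId;
  private final int attemptId;

  StageKey(int stageId, int attemptId) {
    this.stageId = stageId;
    this.attemptId = attemptId;
  }

  int getStageId() {
    return stageId;
  }

  int getAttemptId() {
    return attemptId;
  }

  /** Orders numerically by stage id, then by attempt id. */
  @Override
  public int compareTo(StageKey other) {
    int byStage = Integer.compare(stageId, other.stageId);
    return byStage != 0 ? byStage : Integer.compare(attemptId, other.attemptId);
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof StageKey)) {
      return false;
    }
    StageKey that = (StageKey) o;
    return stageId == that.stageId && attemptId == that.attemptId;
  }

  @Override
  public int hashCode() {
    return Objects.hash(stageId, attemptId);
  }

  /** Renders in the same form the progress report uses, e.g. Stage-8_0. */
  @Override
  public String toString() {
    return "Stage-" + stageId + "_" + attemptId;
  }
}
```

If the progress map were keyed by something like this and held in a 
{{TreeMap}}, iteration would visit Stage-8_0 before Stage-10_0, and Stage-1_0 
before Stage-1_1, which is the order the report should print.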

Please add a unit test for this.

> SparkJobMonitor getReport doesn't print stage progress in order
> ---------------------------------------------------------------
>
>                 Key: HIVE-19508
>                 URL: https://issues.apache.org/jira/browse/HIVE-19508
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Bharathkrishna Guruvayoor Murali
>            Priority: Major
>         Attachments: HIVE-19508.1.patch
>
>
> You can end up with a progress output like this:
> {code}
> Stage-10_0: 0/29      Stage-11_0: 0/44        Stage-12_0: 0/11        
> Stage-13_0: 0/1 Stage-8_0: 258(+76)/468 Stage-9_0: 0/165
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
