[jira] [Updated] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled

2017-06-27 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3769:
--
Attachment: TEZ-3769.3.patch

Uploading .3 with review comments addressed.  

Agreed that the unordered writer needs refactoring to reduce its complexity.

> Unordered: Fix wrong stats being sent out in the last event, when final merge 
> is disabled
> -
>
> Key: TEZ-3769
> URL: https://issues.apache.org/jira/browse/TEZ-3769
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Attachments: TEZ-3769.1.patch, TEZ-3769.2.patch, TEZ-3769.3.patch
>
>
> When final merge is disabled (without pipelining), wrong stats were sent out 
> in the last event. 
> They were based on {{numRecordsPerPartition}}, which contains the overall 
> partition data. Ideally, they should be based on the spill result and its 
> buffers.
> Also, {{finalSpill}} was unnecessarily sending events when no data was present 
> (i.e., when currentBuffer didn't have any data).  This can be optimized to 
> reduce the number of events being sent across.
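
The behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code from the attached patches: the Buffer and SpillEvent types, the finalSpill signature, and the choice of per-partition record counts as the "stats" are all assumptions made for the example. It shows the two points from the description: the last event carries stats derived from the buffer being spilled, not from the overall per-partition totals, and no event is produced when the current buffer holds no data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; none of these types are Tez classes.
final class FinalSpillSketch {

    /** Hypothetical in-memory buffer that tracks record counts per partition. */
    static final class Buffer {
        final int[] recordsPerPartition;

        Buffer(int numPartitions) {
            this.recordsPerPartition = new int[numPartitions];
        }

        void add(int partition) {
            recordsPerPartition[partition]++;
        }

        boolean isEmpty() {
            return Arrays.stream(recordsPerPartition).allMatch(n -> n == 0);
        }
    }

    /** Hypothetical event carrying the stats of exactly one spill. */
    static final class SpillEvent {
        final int spillId;
        final int[] recordsPerPartition;

        SpillEvent(int spillId, int[] recordsPerPartition) {
            this.spillId = spillId;
            // Copy so later writes to the buffer cannot change the event.
            this.recordsPerPartition = recordsPerPartition.clone();
        }
    }

    /**
     * Returns the events for the final spill. The stats come from the buffer
     * being spilled, not from totals accumulated over all earlier spills.
     */
    static List<SpillEvent> finalSpill(int spillId, Buffer currentBuffer) {
        List<SpillEvent> events = new ArrayList<>();
        if (currentBuffer.isEmpty()) {
            // Nothing was written since the last spill, so send no event.
            return events;
        }
        events.add(new SpillEvent(spillId, currentBuffer.recordsPerPartition));
        return events;
    }
}
```

With two partitions, if an earlier spill wrote records to both partitions and the final buffer holds records only for partition 1, the event built here reports zero records for partition 0, whereas a count taken from the overall totals would report partition 0 as non-empty.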



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled

2017-06-26 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3769:
--
Attachment: TEZ-3769.2.patch

Rebased the patch (TEZ-3762 got committed) and added a few more tests.

> Unordered: Fix wrong stats being sent out in the last event, when final merge 
> is disabled
> -
>
> Key: TEZ-3769
> URL: https://issues.apache.org/jira/browse/TEZ-3769
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Attachments: TEZ-3769.1.patch, TEZ-3769.2.patch





[jira] [Updated] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled

2017-06-22 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3769:
--
Attachment: TEZ-3769.1.patch

[~sseth], [~aplusplus], [~harishjp], [~jeagles] - Please review when you find 
time. 

The patch contains the TEZ-3762 changes as well.

> Unordered: Fix wrong stats being sent out in the last event, when final merge 
> is disabled
> -
>
> Key: TEZ-3769
> URL: https://issues.apache.org/jira/browse/TEZ-3769
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Attachments: TEZ-3769.1.patch





[jira] [Updated] (TEZ-3769) Unordered: Fix wrong stats being sent out in the last event, when final merge is disabled

2017-06-22 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3769:
--
Summary: Unordered: Fix wrong stats being sent out in the last event, when 
final merge is disabled  (was: Unordered: Fix wrong stats being sent out in the 
last event when final merge is disabled)

> Unordered: Fix wrong stats being sent out in the last event, when final merge 
> is disabled
> -
>
> Key: TEZ-3769
> URL: https://issues.apache.org/jira/browse/TEZ-3769
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan


