[ https://issues.apache.org/jira/browse/PIG-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-4259:
------------------------------------
Attachment: PIG-4259-1.patch
Review board link - https://reviews.apache.org/r/27429/
The patch addresses several issues encountered while debugging wrong
results from a production script.
Issues addressed:
- Optimized a union followed directly by Limit. This also fixes possible
incorrect results where the Limit could be removed entirely by UnionOptimizer
if the parallelism of the union was also 1.
- Fixed wrong results for a group by with secondary key followed by Union
(Union_14).
- Fixed CROSS for Union and multiquery.
- Fixed/optimized POLimit so that it does not redundantly process the next
input in the bag once the limit is already reached.
- Fixed some issues in auto parallelism and restricted the overriding of
parallelism of intermediate reducers (PIG-4162) to the cases that require it.
- Adjusted the AM size based on the total number of tasks. It is a pain to keep
adjusting the memory size after a task runs for a long time and then fails
with OOM.
- Fixed NPE in logs while fetching counters when a job fails.
- Avoid printing counters every time while printing dagStatus. Only print tasks
and diagnostics.
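
As a rough illustration of the POLimit item above, here is a minimal Python
sketch of a limit operator that stops pulling from its input as soon as the
limit is reached. This is not the Pig code from the patch; the function name
`limit_bag` and the `pulled` bookkeeping list are hypothetical and exist only
to show that no extra input is consumed.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def limit_bag(bag: Iterable[T], limit: int, pulled: List[T]) -> Iterator[T]:
    """Yield at most `limit` items from `bag`.

    `pulled` (hypothetical, for illustration) records every item actually
    fetched from the input, so a caller can check that nothing beyond the
    limit was processed.
    """
    if limit <= 0:
        return  # nothing to emit, so never touch the input
    emitted = 0
    for item in bag:
        pulled.append(item)  # this item was fetched from the input
        yield item
        emitted += 1
        if emitted >= limit:
            return  # limit reached: stop before fetching the next input


pulled: List[int] = []
result = list(limit_bag(iter(range(10)), 3, pulled))
```

The check for the limit sits after the yield, so the loop exits before asking
the input for another item; a check placed at the top of the loop would fetch
one extra item first.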
> Fix few issues with Union and CROSS in Tez
> ------------------------------------------
>
> Key: PIG-4259
> URL: https://issues.apache.org/jira/browse/PIG-4259
> Project: Pig
> Issue Type: Sub-task
> Components: tez
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.14.0
>
> Attachments: PIG-4259-1.patch
>
>