[ 
https://issues.apache.org/jira/browse/PIG-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696209#comment-13696209
 ] 

Dmitriy V. Ryaboy commented on PIG-3325:
----------------------------------------

OK, I started looking at this and will update with a patch shortly. In the 
meantime, my benchmark shows Mark's patch improves perf on small bags of 20-100 
elements, but makes bags of 1000 elements roughly 6x slower than trunk.

I created a benchmark that does 100 rounds of creating a bag of N elements, for 
values of N in [1, 20, 100, 1000]. Each set of 100 rounds is run 15 times; the 
first 5 runs are thrown out to account for system warmup / JIT 
optimizations.
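
For reference, here is a minimal sketch of the harness shape described above. It is not the actual benchmark code: an ArrayList stands in for a Pig bag and an Object[] for a tuple, and the class name, method names, and the choice of milliseconds as the reporting unit are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BagBenchmark {
    static final int ROUNDS = 100;     // rounds per run
    static final int RUNS = 15;        // total runs per bag size
    static final int WARMUP_RUNS = 5;  // leading runs discarded (warmup / JIT)

    // One round: build a "bag" of n single-field "tuples".
    // ArrayList and Object[] are stand-ins, not Pig's bag or tuple classes.
    static int buildBag(int n) {
        List<Object[]> bag = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            bag.add(new Object[] { i });
        }
        return bag.size(); // returned so the work is observable to the caller
    }

    // Average time per round over the measured (non-warmup) runs.
    // Milliseconds are an assumed unit for this sketch.
    static double avgMillisPerRound(int n) {
        long measuredNanos = 0;
        long sink = 0;
        for (int run = 0; run < RUNS; run++) {
            long start = System.nanoTime();
            for (int round = 0; round < ROUNDS; round++) {
                sink += buildBag(n);
            }
            long elapsed = System.nanoTime() - start;
            if (run >= WARMUP_RUNS) {
                measuredNanos += elapsed; // only post-warmup runs count
            }
        }
        // Sanity check that every round built a full bag.
        if (sink != (long) RUNS * ROUNDS * n) {
            throw new IllegalStateException("unexpected bag sizes");
        }
        int measuredRounds = (RUNS - WARMUP_RUNS) * ROUNDS;
        return measuredNanos / 1e6 / measuredRounds;
    }

    public static void main(String[] args) {
        for (int n : new int[] { 1, 20, 100, 1000 }) {
            System.out.printf("%d tuples: avg %.2f ms per round%n",
                    n, avgMillisPerRound(n));
        }
    }
}
```

To compare trunk against a patch with a harness like this, buildBag would be pointed at the real bag implementation from each build; the absolute numbers printed by the stand-in version say nothing about Pig itself.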

Results:
|| Num tuples in bag || Trunk avg per round || Patch 1 avg per round ||
| 1 | 0.00 | 0.00 |
| 20 | 0.01 | 0.00 |
| 100 | 0.13 | 0.00 |
| 1000 | 0.19 | 1.20 |

                
> Adding a tuple to a bag is slow
> -------------------------------
>
>                 Key: PIG-3325
>                 URL: https://issues.apache.org/jira/browse/PIG-3325
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11, 0.11.1, 0.11.2
>            Reporter: Mark Wagner
>            Assignee: Mark Wagner
>            Priority: Critical
>         Attachments: PIG-3325.demo.patch, PIG-3325.optimize.1.patch
>
>
> The time it takes to add a tuple to a bag has increased significantly, 
> causing some jobs to take about 50x longer compared to 0.10.1. I've tracked 
> this down to PIG-2923, which has made adding a tuple heavier weight (it now 
> includes some memory estimation).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
