orhankislal opened a new pull request #413: Pivot: Fix array_agg + distinct 
scaling issue on gpdb
URL: https://github.com/apache/madlib/pull/413
 
 
   JIRA: MADLIB-1361
   
   With large datasets, pivot fails on gpdb because of the
   array_agg(DISTINCT) query. array_agg collects all of the values first
   and filters out the duplicates afterwards, so the aggregate can run
   out of memory.
   
   This commit fixes the issue by separating distinct from array_agg.
   We use a subquery to get the distinct values. Then we aggregate these
   values using array_agg.
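
   As a rough illustration of the rewrite described above, the sketch
   below builds both query shapes as strings. The helper name
   distinct_values_query, the subquery alias distinct_subq, and the
   table and column names are hypothetical, chosen only for this
   example; they are not taken from the actual pivot code.

```python
def distinct_values_query(table, column, use_subquery=True):
    """Build SQL that collects the distinct values of `column` into an array.

    Hypothetical helper for illustration only; identifiers are interpolated
    as-is, so callers must pass trusted (already quoted) names.
    """
    if use_subquery:
        # New form: de-duplicate in a subquery first, so array_agg only
        # ever receives the distinct values.
        return (
            "SELECT array_agg({col}) AS distinct_values "
            "FROM (SELECT DISTINCT {col} FROM {tbl}) AS distinct_subq"
        ).format(col=column, tbl=table)
    # Old form: DISTINCT is applied inside the aggregate itself.
    return (
        "SELECT array_agg(DISTINCT {col}) AS distinct_values FROM {tbl}"
    ).format(col=column, tbl=table)


if __name__ == "__main__":
    print(distinct_values_query("sales", "region", use_subquery=False))
    print(distinct_values_query("sales", "region"))
```

   Both queries should produce the same set of values. The order of the
   elements in the resulting array is not guaranteed in either form
   unless an ORDER BY is added.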
   
   Closes #413
   
   Co-authored-by: Ekta Khanna <[email protected]>

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
