njayaram2 opened a new pull request #422: Correlation: Process deconstruction in chunks for grouping
URL: https://github.com/apache/madlib/pull/422

JIRA: MADLIB-1301

While deconstructing the correlation matrix to create the output table, a single large UNION ALL query was created, with one sub-query for each distinct grouping value. This caused memory, stack, and performance issues as the number of groups grew.

The fix is to run multiple queries, each deconstructing the correlation matrix for a limited number of groups (10 by default). Users can change this limit through a new optional parameter, `n_groups_per_run`, available for both correlation and covariance.

Co-authored-by: Orhan Kislal <[email protected]>
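
For illustration only, the chunking idea described above can be sketched in Python roughly as follows. This is not the code from the PR: the helper names (`chunk_groups`, `build_batch_query`, `build_all_queries`) and the placeholder SQL (`deconstruct_placeholder`, `grp`) are hypothetical, and only the `n_groups_per_run` name and its default of 10 come from the description.

```python
from typing import Iterator, List, Sequence


def chunk_groups(group_values: Sequence[str],
                 n_groups_per_run: int = 10) -> Iterator[List[str]]:
    """Yield consecutive batches of at most n_groups_per_run group values."""
    if n_groups_per_run < 1:
        raise ValueError("n_groups_per_run must be a positive integer")
    for start in range(0, len(group_values), n_groups_per_run):
        yield list(group_values[start:start + n_groups_per_run])


def build_batch_query(batch: Sequence[str]) -> str:
    """Join one sub-query per group value in the batch with UNION ALL."""
    # Placeholder SQL (assumption): the actual deconstruction query is not
    # shown in the PR description. Real code should also quote values safely.
    subqueries = [
        "SELECT * FROM deconstruct_placeholder WHERE grp = '{0}'".format(value)
        for value in batch
    ]
    return "\nUNION ALL\n".join(subqueries)


def build_all_queries(group_values: Sequence[str],
                      n_groups_per_run: int = 10) -> List[str]:
    """Build one bounded-size query per batch instead of one huge query."""
    return [build_batch_query(batch)
            for batch in chunk_groups(group_values, n_groups_per_run)]


# 25 hypothetical group values, processed with the default batch size of 10.
groups = ["g{0}".format(i) for i in range(25)]
queries = build_all_queries(groups)
print(len(queries))
```

Each generated query contains at most `n_groups_per_run` sub-queries, so the size of any single statement stays bounded regardless of how many distinct grouping values exist. A smaller value means more round trips, a larger one means bigger statements.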
