njayaram2 opened a new pull request #423: Association Rules: Improve performance
URL: https://github.com/apache/madlib/pull/423

JIRA: MADLIB-1327

Association rules was slow due to a blow-up in the number of candidate itemsets generated before checking their support and graduating them to frequent itemsets. This PR ensures that the candidate itemsets are ordered, so that only a few consecutive itemsets need to be considered when merging to create a new candidate itemset for checking support. This also significantly reduces the size of the join query result (previously, the join returned all potential candidates, including redundant ones and possibly itemsets larger than the size considered in the current iteration).

This commit also adds relevant tests in dev-check.

Co-authored-by: Orhan Kislal <[email protected]>
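
For readers unfamiliar with the idea, the following is a minimal Python sketch of ordered candidate generation as in the classic Apriori join step. It is an illustration only, not the SQL used in this PR; the function name `generate_candidates` and the sample data are hypothetical. Once the frequent k-itemsets are sorted, itemsets sharing the same first k-1 items sit next to each other, so only those consecutive itemsets need to be merged, and every result has exactly k+1 items.

```python
from itertools import groupby


def generate_candidates(frequent):
    """Merge frequent k-itemsets that share a (k-1)-item prefix.

    Hypothetical helper for illustration; not MADlib's implementation.
    """
    # Sort the items inside each itemset, then sort the itemsets themselves,
    # so that itemsets with a common prefix become consecutive.
    ordered = sorted(set(tuple(sorted(s)) for s in frequent))
    candidates = []
    for _, group in groupby(ordered, key=lambda t: t[:-1]):
        group = list(group)
        # Only itemsets within the same prefix group are merged.
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                candidates.append(group[i] + (group[j][-1],))
    return candidates


frequent_pairs = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")]
print(generate_candidates(frequent_pairs))
```

Each candidate produced this way would still need its support checked before it graduates to a frequent itemset.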
