[ https://issues.apache.org/jira/browse/DERBY-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724860#action_12724860 ]
Bryan Pendleton commented on DERBY-3002:
----------------------------------------

Indeed, I *am* concerned that there may be some circumstances where the new technique is less efficient.

An example might be a case where there is a large amount of data to be processed, but a very small number of groups. In this case, most of the raw data ends up being discarded during the GROUP BY processing, and the sort-observer technique will discard the unneeded data sooner, which means it may have a substantial edge over the new algorithm.

However, other circumstances may not see much impact at all:
- those in which an index exists, so the data can be processed in sorted order automatically
- those in which the data is relatively small, and thus is not expensive to sort
- those in which there is a lot of data, but there are also many different values for the grouping column

And I am just guessing about the performance impacts; I don't know how important this distinction will be, given the multitude of other things that occur during query processing.

Earlier in the project, I intended to support multiple algorithms. However, I then realized:
- we'd have to have multiple sets of code for the same functionality
- we'd have to have some way of determining, at runtime, which implementation to choose.

Both problems seemed quite troubling, and the second seemed especially important: if the selection of the appropriate algorithm is based on information about the size and distribution of the data being processed by the query, then the decision ought to be made by the optimizer, which looked like it would dramatically increase the complexity of this project.

So it would be great if the single implementation were "good enough" for the queries we expect to run.

I'll try to put a benchmark together so we can see what the results say, and then we'll have a better idea of how big a problem we have here.
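To make the "many rows, few groups" concern concrete, here is a minimal sketch in Python. It is not Derby code: the function names and the sample data are hypothetical. `aggregate_while_reading` only stands in for the early-discard behaviour attributed to the sort-observer technique, and `sort_then_scan` stands in for an approach that sorts everything before grouping. The point is simply how many rows each approach has to hold on to.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_while_reading(rows):
    """Fold each row into its group's running total as soon as it is read.

    Stand-in for the "discard early" behaviour: only one entry per group
    is retained, no matter how many input rows there are.
    Returns (sorted group totals, number of entries retained).
    """
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items()), len(totals)


def sort_then_scan(rows):
    """Sort every row first, then aggregate adjacent rows with equal keys.

    All input rows are retained until the sort finishes.
    Returns (sorted group totals, number of rows retained).
    """
    ordered = sorted(rows, key=itemgetter(0))
    result = [
        (key, sum(value for _, value in group))
        for key, group in groupby(ordered, key=itemgetter(0))
    ]
    return result, len(ordered)


# Hypothetical data: 10,000 rows, but only two distinct grouping values.
rows = [("east" if i % 2 == 0 else "west", 1) for i in range(10_000)]

early_result, early_retained = aggregate_while_reading(rows)
sorted_result, sorted_retained = sort_then_scan(rows)
```

Both functions produce the same totals, but the first retains 2 entries while the second retains all 10,000 rows until the sort completes. Whether that gap matters in a real query is exactly what the benchmark would need to show.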

> Add support for GROUP BY ROLLUP
> -------------------------------
>
>                 Key: DERBY-3002
>                 URL: https://issues.apache.org/jira/browse/DERBY-3002
>             Project: Derby
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 10.4.1.3
>            Reporter: Bryan Pendleton
>            Assignee: Bryan Pendleton
>            Priority: Minor
>         Attachments: fixWhiteSpace.diff, IncludesASimpleTest.diff, passesRegressionTests.diff, prototypeChangeNoTests.diff, rewriteGroupByRS.diff, rollupNullability.diff, useLookahead.diff
>
>
> Provide an implementation of the ROLLUP form of multi-dimensional grouping
> according to the SQL standard.
> See http://wiki.apache.org/db-derby/OLAPRollupLists for some more detailed
> information about this aspect of the SQL standard.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.