[ 
https://issues.apache.org/jira/browse/OFBIZ-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540723#comment-13540723
 ] 

Leon edited comment on OFBIZ-5115 at 12/29/12 1:57 AM:
-------------------------------------------------------

Jacques,

We use OFBiz as the back-end framework of our LIMS (Laboratory Information 
Management System). It requires very heavy data statistics. In this case, we 
defined a static view entity to do some basic summary job and use "select" with 
distinct options to get desired result data-set. The data itself is no problem, 
but we found the total count it returned is always a little bigger than 
expected.

After some research, we found the root cause: the SQL with group by implies to 
select all the fields to be grouped by, that's why "count" and /or 
"count(distinct)" does not work as usual. e.g.

SELECT COUNT(*) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID -> its count result 
is same as SELECT COUNTRY_GEO_ID, COUNT(*) FROM POSTAL_ADDRESS GROUP BY 
COUNTRY_GEO_ID. and, 

SELECT COUNT(DISTINCT COUNTRY_GEO_ID) FROM POSTAL_ADDRESS GROUP BY 
COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID -> its data set result size equals that 
of  SELECT COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID, COUNT(*) FROM POSTAL_ADDRESS 
GROUP BY COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID.

                
      was (Author: utcb):
    Jacques,

We use OFBiz as the back-end framework of our LIMS (Laboratory Information 
Management System). It requires very heavy data statistics. In this case, we 
defined a static view entity to get some basic summary job and use "select" 
with distinct options to get desired result data-set. The data itself is no 
problem, but we found the total count it returned is always a little bigger 
than expected.

After some research, we found the root cause: the SQL with group by implies to 
select all the fields to be grouped by, that's why "count" and /or 
"count(distinct)" does not work as usual. e.g.

SELECT COUNT(*) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID -> its count result 
is same as SELECT COUNTRY_GEO_ID, COUNT(*) FROM POSTAL_ADDRESS GROUP BY 
COUNTRY_GEO_ID. and, 

SELECT COUNT(DISTINCT COUNTRY_GEO_ID) FROM POSTAL_ADDRESS GROUP BY 
COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID -> its data set result size equals that 
of  SELECT COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID, COUNT(*) FROM POSTAL_ADDRESS 
GROUP BY COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID.

                  
> wrong COUNT DISTINCT if there's GROUP BY clause
> -----------------------------------------------
>
>                 Key: OFBIZ-5115
>                 URL: https://issues.apache.org/jira/browse/OFBIZ-5115
>             Project: OFBiz
>          Issue Type: Bug
>          Components: framework
>    Affects Versions: SVN trunk
>            Reporter: Leon
>            Priority: Minor
>             Fix For: SVN trunk
>
>         Attachments: OFBIZ-5115.patch
>
>
> if there's a "group by" and distinct find options set, then the count of 
> select returned is always the total number of non-distinct.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to