jadami10 opened a new issue, #15238:
URL: https://github.com/apache/pinot/issues/15238
This is a simplified version of a user query:
```
SELECT
  start_month,
  end_month,
  array_agg(segment_category, 'STRING', true) AS segment_category,
  distinctcount(segment_category) AS distinctcount
FROM metrics_table
WHERE subset_name = 'All Data'
  AND start_month = '2024-01-01'
  AND end_month = '2024-09-01'
GROUP BY 1, 2
UNION ALL
SELECT
  'no group by' AS start_month,
  'no group by' AS end_month,
  array_agg(segment_category, 'STRING', true) AS segment_category,
  distinctcount(segment_category) AS distinctcount
FROM metrics_table
WHERE subset_name = 'All Data'
  AND start_month = '2024-01-01'
  AND end_month = '2024-09-01'
```
What we see here is
| start_month | end_month | segment_category | distinctcount |
|-------------|-----------|-----------------|--------------|
| 2024-01-01 | 2024-09-01 | segment_a,segment_b,segment_c,segment_d,segment_e,segment_f,segment_g,segment_h | 8 |
| no group by | no group by | segment_a,segment_b,segment_c,segment_d,segment_f,segment_g,segment_h | 8 |
Without the GROUP BY, we still get the right distinct count (8), but array_agg
returns only 7 values: segment_e is missing. This happens consistently.
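
To make the discrepancy concrete, here is a small sketch that diffs the two array_agg results. The values are copied by hand from the result rows above; the variable names are just for illustration and nothing here talks to Pinot.

```python
# Values copied from the two result rows reported in this issue.
with_group_by = (
    "segment_a,segment_b,segment_c,segment_d,"
    "segment_e,segment_f,segment_g,segment_h"
).split(",")
without_group_by = (
    "segment_a,segment_b,segment_c,segment_d,"
    "segment_f,segment_g,segment_h"
).split(",")

# Both rows report distinctcount = 8, so both arrays should hold 8 values.
reported_distinctcount = 8

# Anything present with GROUP BY but absent without it was dropped by array_agg.
missing = sorted(set(with_group_by) - set(without_group_by))

print("values with GROUP BY:   ", len(with_group_by))
print("values without GROUP BY:", len(without_group_by))
print("missing without GROUP BY:", missing)
```

The array returned without GROUP BY is one element short of the distinct count reported in the same row, which is what makes this look like an array_agg problem rather than a filtering difference.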