jamangstangs opened a new issue, #16982:
URL: https://github.com/apache/druid/issues/16982

   ### Environment
   - Apache Druid: 26.0.0
   - Kafka: 2.7.1
   
   ### Description
   Using Kafka ingestion and submitting the ingestion task as follows.
   ```
   ...
       "metricsSpec": [
         {
           "name": "uniq_column1",
           "type": "thetaSketch",
           "fieldName": "uniq_column1",
           "size": 16384
         },
         {
           "name": "uniq_column2",
           "type": "thetaSketch",
           "fieldName": "uniq_column1",
           "size": 16384
         }
       ]
   ...
       "tuningConfig": {
         "type": "kafka",
         "maxRowsPerSegment": 1000000000,
         "maxTotalRows": 1000000000,
         "maxBytesInMemory": -1
       },
   ...
       "granularitySpec": {
         "type": "uniform",
         "segmentGranularity": "HOUR",
         "queryGranularity": "SECOND",
         "rollup": true
       }
   ...
       "taskDuration": "PT1H"
   ```
   
   When I run a segment metadata query, the thetaSketch columns report both 
`type` and `typeSignature` as STRING instead of thetaSketch.
   ```
   {
         "queryType": "segmentMetadata",
         "dataSource": "datasource",
         "merge": true
   }
   ```
   column | typeSignature | type | errorMessage
   -- | -- | -- | --
   uniq_column1 | STRING | STRING | error:cannot_merge_diff_types: [thetaSketch] and [thetaSketchBuild]
   uniq_column2 | STRING | STRING | error:cannot_merge_diff_types: [thetaSketch] and [thetaSketchBuild]
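
To narrow down which segments report which type, the same query can be run with `merge` set to false and the per-segment results grouped by column. The following is a minimal Python sketch of that grouping. The helper names, the segment ids, and the sample response are hypothetical; the response shape (a list of objects with `id` and a `columns` map) is assumed from the segment metadata query output, and the query body would be POSTed to the Broker's native query endpoint.

```python
import json
from collections import defaultdict


def build_segment_metadata_query(datasource, intervals=None, merge=False):
    """Build a native segmentMetadata query body as a plain dict."""
    query = {
        "queryType": "segmentMetadata",
        "dataSource": datasource,
        "merge": merge,
    }
    if intervals:
        query["intervals"] = list(intervals)
    return query


def types_by_column(segment_analyses):
    """Map column name -> {reported type -> [segment ids]} for an unmerged response."""
    grouped = defaultdict(lambda: defaultdict(list))
    for analysis in segment_analyses:
        for column, info in analysis.get("columns", {}).items():
            grouped[column][info.get("type")].append(analysis.get("id"))
    return {column: dict(types) for column, types in grouped.items()}


# Hypothetical unmerged response: one handed-off segment and one real-time segment.
SAMPLE_RESPONSE = [
    {"id": "historical-segment", "columns": {"uniq_column1": {"type": "thetaSketch"}}},
    {"id": "realtime-segment", "columns": {"uniq_column1": {"type": "thetaSketchBuild"}}},
]

print(json.dumps(build_segment_metadata_query("datasource"), indent=2))
print(json.dumps(types_by_column(SAMPLE_RESPONSE), indent=2))
```

If the real-time segments are the only ones listed under `thetaSketchBuild`, that would support the observation that the merge fails only across the real-time range.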
   
   However, when I restrict the segment metadata query to an interval that 
excludes the real-time ingestion range, it returns the correct type.
   ```
   {
         "queryType": "segmentMetadata",
         "dataSource": "datasource",
         "merge": true,
         "intervals": ["2024-08-30T04:00:00.000Z/2024-09-01T23:00:00.000Z"]
   }
   ```
   column | typeSignature | type | errorMessage
   -- | -- | -- | --
   uniq_column1 | COMPLEX\<thetaSketch\>  | thetaSketch | null
   uniq_column2 | COMPLEX\<thetaSketch\>  | thetaSketch | null
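
As a stopgap, the end of such an interval can be computed rather than hard-coded. The sketch below assumes hourly segments (as in the `granularitySpec` above) and treats the current hour plus a configurable number of earlier hours as the real-time range. The function name and the `extra_hours` parameter are hypothetical, and how far back real-time segments actually extend depends on task duration and handoff timing.

```python
from datetime import datetime, timedelta, timezone


def historical_interval(start, now, extra_hours=1):
    """Return an ISO-8601 interval that ends before the assumed real-time range.

    Both datetimes must be timezone-aware. The end is the start of the current
    UTC hour, moved back by `extra_hours` more hours.
    """
    start_utc = start.astimezone(timezone.utc)
    now_utc = now.astimezone(timezone.utc)
    # Truncate to the start of the current hour (segmentGranularity is HOUR).
    hour_start = now_utc.replace(minute=0, second=0, microsecond=0)
    end_utc = hour_start - timedelta(hours=extra_hours)
    if end_utc <= start_utc:
        raise ValueError("no range left before the assumed real-time window")
    fmt = "%Y-%m-%dT%H:%M:%S.000Z"
    return f"{start_utc.strftime(fmt)}/{end_utc.strftime(fmt)}"


start = datetime(2024, 8, 30, 4, 0, tzinfo=timezone.utc)
now = datetime(2024, 9, 2, 0, 17, 45, tzinfo=timezone.utc)
print(historical_interval(start, now))
```

The returned string can be used as the single entry of the query's `intervals` array.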
   
   
   I also run a Druid 0.21.0 cluster, and the same query there returns the 
correct type.
   ```
   {
         "queryType": "segmentMetadata",
         "dataSource": "datasource",
         "merge": true
   }
   ```
   column | type | errorMessage
   -- | -- | -- 
   uniq_column1  | thetaSketch | null
   uniq_column2 | thetaSketch | null
   
   
   The merge seems to fail specifically for thetaSketch columns when the query 
covers the real-time ingestion range.
   A similar issue was already fixed in 
https://github.com/apache/druid/issues/3339, but version 26.0.0 is still 
affected.
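
The reported output is consistent with a merge step that compares the per-segment type names and falls back to an error entry when they differ. The toy model below only illustrates that reading. It is an assumption based on the error message and the STRING type shown in the merged result, not Druid's actual code, and the function name is hypothetical.

```python
def fold_column_types(type_a, type_b):
    """Toy model: merge two per-segment column types into one merged entry."""
    if type_a == type_b:
        return {"type": type_a, "errorMessage": None}
    # Assumed fallback: the merged entry is reported as STRING with an error message.
    return {
        "type": "STRING",
        "errorMessage": f"error:cannot_merge_diff_types: [{type_a}] and [{type_b}]",
    }


print(fold_column_types("thetaSketch", "thetaSketch"))
print(fold_column_types("thetaSketch", "thetaSketchBuild"))
```

Under this reading, the column data itself is not a string; the STRING type would be a side effect of the failed merge between `thetaSketch` and `thetaSketchBuild`.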
   
   Is there a workaround for this, or has it been fixed in a newer version of 
Druid?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org