joshlemer opened a new issue #4029: Postaggregation of `finalizingFieldAccess` 
fields does not include those fields in query aggregations
URL: https://github.com/apache/incubator-superset/issues/4029
 
 
   I am trying to run a query against Druid that computes the number of 
records per unique user, like the following (this is the EXPECTED query 
that Superset should produce):
   
   ```json
   {
       "aggregations": [
           {
               "type": "longSum", 
               "fieldName": "count", 
               "name": "sum__count"
           }, 
           {
               "type": "cardinality", 
               "name": "count_distinct__dmuid", 
               "fieldNames": [
                   "dmuid"
               ]
           }
       ], 
       "intervals": "2017-11-30T00:00:00+00:00/2017-12-07T13:08:53+00:00", 
       "dataSource": "responses", 
       "granularity": {
           "duration": 600000.0, 
           "timeZone": "UTC", 
           "type": "duration"
       }, 
       "postAggregations": [
           {
               "fields": [
                   {
                       "fieldName": "sum__count", 
                       "type": "fieldAccess", 
                       "name": "sum__count"
                   }, 
                   {
                       "fieldName": "count_distinct__dmuid", 
                       "type": "finalizingFieldAccess", 
                       "name": "count_distinct__dmuid"
                   }
               ], 
               "type": "arithmetic", 
               "name": "Ads per user", 
               "fn": "/"
           }
       ], 
       "queryType": "timeseries"
   }
   ```
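
One property of the expected query above is that every name referenced under `postAggregations` is also defined under `aggregations`. A minimal Python 3 sketch of that check follows; the helper `missing_aggregations` is hypothetical (it is not part of Superset or pydruid), and it assumes that only `fieldAccess` and `finalizingFieldAccess` entries refer to aggregations by `fieldName`. The query is trimmed to the two keys the check needs.

```python
def missing_aggregations(query):
    """Return post-aggregation field references with no matching aggregation."""
    defined = {agg["name"] for agg in query.get("aggregations", [])}

    def referenced(post_agg):
        # Assumption: these two accessor types name an aggregation via "fieldName".
        if post_agg.get("type") in ("fieldAccess", "finalizingFieldAccess"):
            yield post_agg["fieldName"]
        # Arithmetic post-aggregators nest further entries under "fields".
        for child in post_agg.get("fields", []):
            yield from referenced(child)

    wanted = set()
    for post_agg in query.get("postAggregations", []):
        wanted.update(referenced(post_agg))
    return sorted(wanted - defined)


expected_query = {
    "aggregations": [
        {"type": "longSum", "fieldName": "count", "name": "sum__count"},
        {"type": "cardinality", "name": "count_distinct__dmuid",
         "fieldNames": ["dmuid"]},
    ],
    "postAggregations": [
        {
            "type": "arithmetic", "name": "Ads per user", "fn": "/",
            "fields": [
                {"type": "fieldAccess", "name": "sum__count",
                 "fieldName": "sum__count"},
                {"type": "finalizingFieldAccess",
                 "name": "count_distinct__dmuid",
                 "fieldName": "count_distinct__dmuid"},
            ],
        }
    ],
}

print(missing_aggregations(expected_query))  # []
```

Running the same helper on a query whose `aggregations` list lacks the `cardinality` entry would report `count_distinct__dmuid` as missing.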
   
   To do this in Superset, I created a post-aggregation metric on my 
datasource with the following JSON:
   
   ```json
   {
       "type": "arithmetic",
       "name": "div",
       "fn": "/",
       "fields": [
           { "type": "fieldAccess", "name": "sum__count", "fieldName": "sum__count" },
           { "type": "finalizingFieldAccess", "name": "count_distinct__dmuid", "fieldName": "count_distinct__dmuid" }
       ]
   }
   ```
   
   However, the `finalizingFieldAccess` type seems to prevent the referenced 
field from being added to the query's aggregations: the generated query 
contains `sum__count` but not `count_distinct__dmuid`, and Druid returns an 
error. Here is a sample error:
   
   ```
    2017-12-07 13:23:50,739:ERROR:root:HTTP Error 500: Internal Server Error 
    Druid Error: Unknown exception 
    Query is: {
       "aggregations": [
           {
               "type": "longSum", 
               "fieldName": "count", 
               "name": "sum__count"
           }
       ], 
       "intervals": "2017-11-30T00:00:00+00:00/2017-12-07T13:23:50+00:00", 
       "dataSource": "responses", 
       "granularity": {
           "duration": 600000.0, 
           "timeZone": "UTC", 
           "type": "duration"
       }, 
       "postAggregations": [
           {
               "fields": [
                   {
                       "fieldName": "sum__count", 
                       "type": "fieldAccess", 
                       "name": "sum__count"
                   }, 
                   {
                       "fieldName": "count_distinct__dmuid", 
                       "type": "finalizingFieldAccess", 
                       "name": "count_distinct__dmuid"
                   }
               ], 
               "type": "arithmetic", 
               "name": "Ads per user", 
               "fn": "/"
           }
       ], 
       "queryType": "timeseries"
   }
   Traceback (most recent call last):
     File "/Users/joshlemer/venv/lib/python2.7/site-packages/superset/viz.py", line 275, in get_payload
       df = self.get_df()
     File "/Users/joshlemer/venv/lib/python2.7/site-packages/superset/viz.py", line 98, in get_df
       self.results = self.datasource.query(query_obj)
     File "/Users/joshlemer/venv/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 1069, in query
       client=client, query_obj=query_obj, phase=2)
     File "/Users/joshlemer/venv/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 865, in get_query_str
       return self.run_query(client=client, phase=phase, **query_obj)
     File "/Users/joshlemer/venv/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 979, in run_query
       client.timeseries(**qry)
     File "/Users/joshlemer/venv/lib/python2.7/site-packages/pydruid/client.py", line 141, in timeseries
       return self._post(query)
     File "/Users/joshlemer/venv/lib/python2.7/site-packages/pydruid/client.py", line 409, in _post
       e, err, json.dumps(query.query_dict, indent=4)))
   ```
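
The symptom in the log above can be reproduced in isolation. The following Python sketch is only an illustration of the suspected cause, not Superset's actual code: `referenced_metrics` is a hypothetical helper that walks a post-aggregation and collects the aggregation names it depends on, with the set of recognised accessor types passed in as a parameter.

```python
def referenced_metrics(post_agg, leaf_types):
    """Collect aggregation names that a post-aggregation depends on."""
    names = set()
    # A leaf accessor points at an aggregation through "fieldName".
    if post_agg.get("type") in leaf_types:
        names.add(post_agg["fieldName"])
    # Recurse into nested entries, e.g. the operands of "arithmetic".
    for child in post_agg.get("fields", []):
        names |= referenced_metrics(child, leaf_types)
    return names


# The post-aggregation metric defined in this issue.
div = {
    "type": "arithmetic",
    "name": "div",
    "fn": "/",
    "fields": [
        {"type": "fieldAccess", "name": "sum__count",
         "fieldName": "sum__count"},
        {"type": "finalizingFieldAccess", "name": "count_distinct__dmuid",
         "fieldName": "count_distinct__dmuid"},
    ],
}

# Recognising only "fieldAccess" drops the cardinality metric,
# which matches the aggregations list in the failing query.
narrow = referenced_metrics(div, {"fieldAccess"})
# Recognising both accessor types yields both dependencies.
wide = referenced_metrics(div, {"fieldAccess", "finalizingFieldAccess"})

print(sorted(narrow))  # ['sum__count']
print(sorted(wide))    # ['count_distinct__dmuid', 'sum__count']
```

If Superset's dependency resolution behaves like the narrow case, treating `finalizingFieldAccess` the same way as `fieldAccess` would presumably produce the expected query.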
   - [ ] I have checked the superset logs for python stacktraces and included it here as text if any
   - [ ] I have reproduced the issue with at least the latest released version of superset
   - [ ] I have checked the issue tracker for the same issue and I haven't found one similar
   
   
   ### Superset version
   0.20.6
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
