It looks like there is a FLOAT field called MinutesTotal that is only present in some documents. Can you try writing a query that uses explicit column specs like this?

WITH element AS (
  -- Name each column explicitly so that fields present in only some
  -- documents (e.g. MinutesTotal) are not projected.
  SELECT
    _id,
    ElementTypeName,
    PlanID,
    ...
  FROM
    `mongo.grounds`.`Elements`
), element_effort AS (
  SELECT
    _id,
    EffortTypeName
  FROM
    `mongo.grounds`.`Elements_Efforts`
)
SELECT
  *
FROM
  element
JOIN
  element_effort ON element._id = element_effort._id;
                
(The column lists need some fleshing out.) You can also experiment with the UNION type (exec.enable_union_type) for this situation, but I understand that one should be cautious about using it in production.
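
For the UNION type experiment, a minimal sketch of a session-scoped toggle might look like the following. It assumes the option name exec.enable_union_type quoted elsewhere in this thread and Drill's ALTER SESSION SET syntax; whether it actually resolves this query is untested here.

```sql
-- Experimental: enable the UNION type for the current session only.
ALTER SESSION SET `exec.enable_union_type` = true;

-- ... re-run the query under test here ...

-- Switch it back off afterwards (false is the value reported for both environments).
ALTER SESSION SET `exec.enable_union_type` = false;
```

Keeping the change at session scope rather than system scope limits the effect to the connection doing the experiment.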

I still cannot say why 1.19 has no problem here but it could perhaps be a batch ordering thing. I think that whether or not the _first_ batch includes a MinutesTotal field can make a difference to subsequent schema handling (would need to confirm this last bit).


On 2022/02/03 15:33, Daniel Clark wrote:
Hi James,

Please see the attached.

On Wed, Feb 2, 2022 at 2:35 AM Daniel Clark <[email protected]> wrote:

    Hi James,

    There initially weren’t any differences between the 1.19 environment
    and the 1.20.0-SNAPSHOT environment. The config options that worked
    in the 1.19 environment were carried over when I installed the
    snapshot build.  The recent change made to the snapshot build was
    setting store.mongo.bson.record.reader to true. The original query
    worked in the 1.19 environment, with the parameter set to false.

    Yes, I’m running the exact same query against the exact same data
    sources. I’ll attach a copy of the stack trace and profile, later
    this morning. I’ll also see about reducing the dataset. Thanks for
    following up.

    Sent from my iPhone

     > On Feb 2, 2022, at 2:03 AM, James Turton <[email protected]> wrote:
     >
     > Okay.  It's always a good idea to attach a stack trace and a
     > query profile when you have an error to send in, so maybe you can
     > add those?
     >
     > Next, we're left with a reproducibility challenge.  Are there
     > other config option differences between your two Drill environments,
     > beyond the one we've uncovered?  Are you running exactly the same
     > query against exactly the same data source in both environments?
     > Can you reduce the collections involved in the query to minimal (and
     > obfuscated if need be) datasets that we can use to reproduce the
     > problem?
     >
     >> On 2022/02/01 18:15, Daniel Clark wrote:
     >> No, exec.enable_union_type is set to false.
     >> On Tue, Feb 1, 2022 at 10:59 AM James Turton <[email protected]> wrote:
     >>    Do you have exec.enable_union_type = true in your 1.19 environment?
     >>    On 2022/02/01 17:30, Daniel Clark wrote:
     >>     > Hi James,
     >>     >
     >>     > Yes, the store.mongo.bson.record.reader was set to false. I set
     >>     > it to true and re-ran the original query. It returned an error:
     >>     > UNSUPPORTED_OPERATION ERROR: Schema changes not supported in
     >>     > External Sort. Please enable Union type.
     >>     >
     >>     >
     >>     >
     >>     > On Tue, Feb 1, 2022 at 9:19 AM James Turton <[email protected]> wrote:
     >>     >
     >>     >> Hi Daniel
     >>     >>
     >>     >> Please let us know if you have set the config option
     >>     >> store.mongo.bson.record.reader = false and, if so, please set it to true.
     >>     >>
     >>     >> Thanks
     >>     >> James
     >>     >>
     >>     >> On 2022/01/31 17:45, Daniel Clark wrote:
     >>     >>
     >>     >> Here it is. Please see the attached file.
     >>     >>
     >>     >> On Mon, Jan 31, 2022 at 4:22 AM James Turton <[email protected]> wrote:
     >>     >>
     >>     >>> Please also attach the query profile if you can.
     >>     >>>
     >>     >>> Thanks
     >>     >>> James
     >>     >>>
     >>     >>> On 2022/01/31 08:09, luoc wrote:
     >>     >>>> Hi Daniel,
     >>     >>>>     What is the data type of the `_id` field? The default
     >>     >>>> ObjectId, or String or key-value pair (Struct)?
     >>     >>>>
     >>     >>>>> On Jan 31, 2022, at 11:12, Daniel Clark <[email protected]> wrote:
     >>     >>>>>
     >>     >>>>> 
     >>     >>>>> Hello,
     >>     >>>>>
     >>     >>>>> I'm running this mongo query on the 1.20.0-SNAPSHOT build. It
     >>     >>>>> runs without error on the 1.19 release.
     >>     >>>>>
     >>     >>>>> SELECT `Elements_Efforts`.`EffortTypeName` AS `EffortTypeName`,
     >>     >>>>>     `Elements`.`ElementSubTypeName` AS `ElementSubTypeName`,
     >>     >>>>>     `Elements`.`ElementTypeName` AS `ElementTypeName`,
     >>     >>>>>     `Elements`.`PlanID` AS `PlanID`
     >>     >>>>> FROM `mongo.grounds`.`Elements` `Elements`
     >>     >>>>>     INNER JOIN `mongo.grounds`.`Elements_Efforts` `Elements_Efforts`
     >>     >>>>>     ON (`Elements`.`_id` = `Elements_Efforts`.`_id`)
     >>     >>>>> WHERE (`Elements`.`PlanID` = '1623263140')
     >>     >>>>> GROUP BY `Elements_Efforts`.`EffortTypeName`,
     >>     >>>>>     `Elements`.`ElementSubTypeName`,
     >>     >>>>>     `Elements`.`ElementTypeName`,
     >>     >>>>>     `Elements`.`PlanID`
     >>     >>>>>
     >>     >>>>> The error message returned is:
     >>     >>>>>
     >>     >>>>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
     >>     >>>>> UnsupportedOperationException: Map, Array, Union or repeated scalar type
     >>     >>>>> should not be used in group by, order by or in a comparison operator. Drill
     >>     >>>>> does not support compare between MAP:REQUIRED and MAP:REQUIRED.
     >>     >>>>>
     >>     >>>>> Fragment: 0:0
     >>     >>>>>
     >>     >>>>> Please, refer to logs for more information.
     >>     >>>>>
     >>     >>>>> [Error Id: 21b3260d-9ebf-4156-a5fa-4748453b5465 on localhost:31010]
     >>     >>>>>
     >>     >>>>> I've tried searching the mailing list archives, as well as googling
     >>     >>>>> the error. The stack trace mentions that memory was leaked by the query.
     >>     >>>>> Any ideas? Full stack trace attached.
     >>     >>>>> <stacktrace.txt>
     >>     >>>
     >>     >>>
     >>     >>
     >>     >
