It looks like there is a FLOAT field called MinutesTotal that is only
present in some documents. Can you try writing a query that uses
explicit column specs like this?
with element as (
select
_id,
ElementTypeName,
PlanId,
...
FROM
`mongo.grounds`.`Elements`
), element_effort as (
select
_id,
EffortTypeName
FROM
`mongo.grounds`.`Elements_Efforts`
)
select
*
from
element
join
element_effort on element._id = element_effort._id;
(Needs some fleshing out). You can also experiment with the UNION type
for this situation but I understand that one should be cautious about
using it in production.
I still cannot say why 1.19 has no problem here but it could perhaps be
a batch ordering thing. I think that whether or not the _first_ batch
includes a MinutesTotal field can make a difference to subsequent schema
handling (would need to confirm this last bit).
On 2022/02/03 15:33, Daniel Clark wrote:
Hi James,
Please see the attached.
On Wed, Feb 2, 2022 at 2:35 AM Daniel Clark <[email protected]
<mailto:[email protected]>> wrote:
Hi James,
There initially weren’t any differences between the 1.19 environment
and the 1.20.0-SNAPSHOT environment. The config options that worked
in the 1.19 environment were carried over when I installed the
snapshot build. The recent change made to the snapshot build was
setting store.mongo.bson.record.reader to true. The original query
worked in the 1.19 environment, with the parameter set to false.
Yes, I’m running the exact same query against the exact same data
sources. I’ll attach a copy of the stack trace and profile, later
this morning. I’ll also see about reducing the dataset. Thanks for
following up.
Sent from my iPhone
> On Feb 2, 2022, at 2:03 AM, James Turton <[email protected]
<mailto:[email protected]>> wrote:
>
> Okay. It's always a good idea to attach a stack trace and a
query profile when you have an error to send in, so maybe you can
add those?
>
> Next, we're left with a reproducibility challenge. Are there
other config option differences between your two Drill environments,
beyond the one we've uncovered? Are you running exactly the same
query against exactly the same data source in both environments?
Can you reduce the collections involved in the query to minimal (and
obfuscated if need be) datasets that we can use to reproduce the
problem?
>
>> On 2022/02/01 18:15, Daniel Clark wrote:
>> No, exec.enable_union_type is set tofalse.
>> On Tue, Feb 1, 2022 at 10:59 AM James Turton <[email protected]
<mailto:[email protected]> <mailto:[email protected]
<mailto:[email protected]>>> wrote:
>> Do you have exec.enable_union_type = true in your 1.19
environment?
>> On 2022/02/01 17:30, Daniel Clark wrote:
>> > Hi James,
>> >
>> > Yes, the store.mongo.bson.record.reader was set to false.
I set
>> it to true
>> > and re-ran the original query. It returned an error:
>> > UNSUPPORTED_OPERATION ERROR: Schema changes not supported in
>> External Sort.
>> > Please enable Union type.
>> >
>> >
>> >
>> > On Tue, Feb 1, 2022 at 9:19 AM James Turton
<[email protected] <mailto:[email protected]>
>> <mailto:[email protected] <mailto:[email protected]>>> wrote:
>> >
>> >> Hi Daniel
>> >>
>> >> Please let us know if you have set the config option
>> store.mongo.bson.record.reader
>> >> = false and, if so, please set it to true.
>> >>
>> >> Thanks
>> >> James
>> >>
>> >> On 2022/01/31 17:45, Daniel Clark wrote:
>> >>
>> >> Here it is. Please see the attached file.
>> >>
>> >> On Mon, Jan 31, 2022 at 4:22 AM James Turton
<[email protected] <mailto:[email protected]>
>> <mailto:[email protected] <mailto:[email protected]>>> wrote:
>> >>
>> >>> Please also attach the query profile if you can.
>> >>>
>> >>> Thanks
>> >>> James
>> >>>
>> >>> On 2022/01/31 08:09, luoc wrote:
>> >>>> Hi Daniel,
>> >>>> What is the data type of the `_id` field? The default
>> ObjectId, or
>> >>> String or key-value pair (Struct)?
>> >>>>
>> >>>>> On Jan 31, 2022, at 11:12, Daniel Clark
<[email protected] <mailto:[email protected]>
>> <mailto:[email protected] <mailto:[email protected]>>> wrote:
>> >>>>>
>> >>>>>
>> >>>>> Hello,
>> >>>>>
>> >>>>> I'm running this mongo query on the 1.20.0-SNAPSHOT
build. It
>> runs
>> >>> without error on the 1.19 release.
>> >>>>>
>> >>>>> SELECT `Elements_Efforts`.`EffortTypeName` AS
`EffortTypeName`,
>> >>>>> `Elements`.`ElementSubTypeName` AS
`ElementSubTypeName`,
>> >>>>> `Elements`.`ElementTypeName` AS `ElementTypeName`,
>> >>>>> `Elements`.`PlanID` AS `PlanID`
>> >>>>> FROM `mongo.grounds`.`Elements` `Elements`
>> >>>>> INNER JOIN `mongo.grounds`.`Elements_Efforts`
>> `Elements_Efforts` ON
>> >>> (`Elements`.`_id` = `Elements_Efforts`.`_id`)
>> >>>>> WHERE (`Elements`.`PlanID` = '1623263140')
>> >>>>> GROUP BY `Elements_Efforts`.`EffortTypeName`,
>> >>>>> `Elements`.`ElementSubTypeName`,
>> >>>>> `Elements`.`ElementTypeName`,
>> >>>>> `Elements`.`PlanID`
>> >>>>>
>> >>>>> The error message returned is:
>> >>>>>
>> >>>>> org.apache.drill.common.exceptions.UserRemoteException:
>> SYSTEM ERROR:
>> >>> UnsupportedOperationException: Map, Array, Union or repeated
>> scalar type
>> >>> should not be used in group by, order by or in a comparison
>> operator. Drill
>> >>> does not support compare between MAP:REQUIRED and
MAP:REQUIRED.
>> >>>>>
>> >>>>> Fragment: 0:0
>> >>>>>
>> >>>>> Please, refer to logs for more information.
>> >>>>>
>> >>>>> [Error Id: 21b3260d-9ebf-4156-a5fa-4748453b5465 on
>> localhost:31010]
>> >>>>>
>> >>>>> I've tried searching the mailing list archives, as well as
>> googling
>> >>> the error. The stack trace mentions that memory was
leaked by
>> the query.
>> >>> Any ideas? Full stack trace attached.
>> >>>>> <stacktrace.txt>
>> >>>
>> >>>
>> >>
>> >