Good question. I don't know enough about the Mongo configuration to answer
that, but let me look into it.
Best,
-- C
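
For reference, a minimal sketch of the point raised in the quoted message below: `allowDiskUse` is an option carried on the `aggregate` command that a client sends to the server. The Python here only builds that command document as a plain dict; it uses no driver and opens no connection. The collection name and `PlanID` value come from the query in this thread. The pipeline stages and the helper name `build_aggregate_command` are assumptions for illustration and are not what Drill actually generates.

```python
import json


def build_aggregate_command(collection, pipeline, allow_disk_use=True):
    """Build a MongoDB `aggregate` command document as a plain dict.

    Hypothetical helper for illustration; it does not talk to a server.
    """
    return {
        "aggregate": collection,
        "pipeline": pipeline,
        "cursor": {},  # an empty cursor document requests the default batch size
        "allowDiskUse": allow_disk_use,  # lets large sorts spill to temporary files
    }


# Assumed pipeline: filter on PlanID, then sort on one of the grouped columns.
pipeline = [
    {"$match": {"PlanID": "1623263140"}},
    {"$sort": {"ElementTypeName": 1}},
]

command = build_aggregate_command("Elements", pipeline)
print(json.dumps(command, indent=2))
```

With a driver, the same option is normally set on the aggregate call itself, for example `collection.aggregate(pipeline, allowDiskUse=True)` in PyMongo. Whether and how the Drill mongo storage plugin exposes it is the open question in this thread.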

> On Jan 28, 2022, at 10:20 AM, Daniel Clark <[email protected]> wrote:
> 
> Hi Charles,
> 
> I was under the impression that the allowDiskUse parameter is passed by the
> client making the call to the mongodb server. Is it possible to add this
> parameter to the mongo storage plugin, similar to how you added the
> "batchSize" parameter for the 1.20 release?
> 
> On Fri, Jan 28, 2022 at 9:54 AM Charles Givre <[email protected]> wrote:
> 
>> Daniel,
>> Thanks for flagging this.  One thing I noticed in your logs is this:
>> 
>> Sort exceeded memory limit of 104857600 bytes, but did not opt in to
>> external sorting. Aborting operation. Pass allowDiskUse:true to opt in.
>> 
>> What's happening here is that the newer version of Drill pushes the sort
>> operation down to Mongo, which (in theory) should be faster.  In contrast,
>> Drill 1.19 would receive the unsorted data from Mongo and then sort it
>> itself.  If you set up your Mongo so that the `allowDiskUse` parameter is
>> true, you might get better results when Mongo sorts the data.
>> 
>> -- C
>> 
>> 
>> 
>>> On Jan 28, 2022, at 9:43 AM, Daniel Clark <[email protected]> wrote:
>>> 
>>> Hi Charles,
>>> 
>>> Yes, "supportsSortPushdown" is set to true. I left it at the default. I'll
>>> try setting it to false and run the query again. Thanks for the feedback.
>>> 
>>> On Fri, Jan 28, 2022 at 9:38 AM Charles Givre <[email protected]> wrote:
>>> 
>>>> Hey Daniel,
>>>> Did you have the sort pushdown enabled?  This is one change that we
>>>> added to the mongo pushdown since 1.19 and might be affecting your query.
>>>> Best,
>>>> -- C
>>>> 
>>>> 
>>>>> On Jan 28, 2022, at 9:32 AM, Daniel Clark <[email protected]> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> While evaluating 1.20.0-SNAPSHOT release performance, I ran a mongo
>>>>> query that runs in 15 minutes in the 1.19 release (below).
>>>>> 
>>>>> SELECT `Elements_Efforts`.`EffortTypeName` AS `EffortTypeName`,
>>>>> `Elements`.`ElementSubTypeName` AS `ElementSubTypeName`,
>>>>> `Elements`.`ElementTypeName` AS `ElementTypeName`,
>>>>> `Elements`.`PlanID` AS `PlanID`
>>>>> FROM `mongo.grounds`.`Elements` `Elements`
>>>>> INNER JOIN `mongo.grounds`.`Elements_Efforts` `Elements_Efforts` ON
>>>>> (`Elements`.`_id` = `Elements_Efforts`.`_id`)
>>>>> WHERE (`Elements`.`PlanID` = '1623263140')
>>>>> GROUP BY `Elements_Efforts`.`EffortTypeName`,
>>>>> `Elements`.`ElementSubTypeName`,
>>>>> `Elements`.`ElementTypeName`,
>>>>> `Elements`.`PlanID`
>>>>> 
>>>>> The query runs for 34 minutes before returning this error; "Sort
>>>>> exceeded memory limit of 104857600 bytes, but did not opt in to external
>>>>> sorting. Aborting operation. Pass allowDiskUse:true to opt in.' on
>>>>> server localhost:27017." Any ideas? I realize that it's a mongodb error,
>>>>> but the mongo database doesn't raise this error with the 1.19 release. I
>>>>> was expecting improved performance with the mongo storage plugin in the
>>>>> upcoming 1.20 release. Nothing in my environment has changed. I've
>>>>> attached the full stacktrace.
>>>>> 
>>>>> <stacktrace.txt>
>>>> 
>>>> 
>> 
>> 
