Hi, 
Using db._explain() can clearly identify the source of the additional 
resource usage: 

  6   CalculationNode              0       - LET #11 = 
(IS_NULL(l.`Acquisition`.`BORROWER_CREDIT_SCORE`) ? 0 : 
l.`Acquisition`.`ORIGINAL_UPB`)*   /* v8 expression */   */* collections 
used: l : loans2 */

shelling out to V8 and evaluating parts of the query in V8 is not as 
effective as native execution.
There is a simple workaround to avoid the teneray expression which always 
brings a V8 expression into the game: 

  MAX([l.Acquisition.BORROWER_CREDIT_SCORE, 0])

if the first expression is undefined, that will be lower in value than 0, 
thus 0 is used here.

This should in term be a lot faster.

Cheers, 
Willi


On Saturday, August 6, 2016 at 11:07:52 PM UTC+2, BrandonJS wrote:
>
> Hi,
>
> I'm trying to compute a non-null weighted average in an AQL query.  In the 
> same query I would like to also retrieve some other aggregates such as 
> regular averages and sums.  The collection I'm using has ~300k records that 
> are fairly small.
>
> When I use the ternary operator to compute my weight total for all 
> non-null entries it significantly degrades performance.  A 1 second query 
> now takes 8 seconds.  What is the best way to optimize the following query 
> while still being able to compute a non-null weighted average?  I'm trying 
> not to query multiple times to generate what will be one result set.  As a 
> separate query when using a filter to remove all null credit scores it 
> takes about ~2 seconds.
>
> Here is my query with the ternary operator.  This query is taking ~8 
> seconds to complete.
>
>  FOR l in loans2
>     COLLECT origDate = l.Acquisition.ORIGINATION_DATE
>     AGGREGATE 
>         sumUPB = SUM(l.Acquisition.ORIGINAL_UPB),
>         weightedFico = SUM(l.Acquisition.BORROWER_CREDIT_SCORE * 
> l.Acquisition.ORIGINAL_UPB),
>         ficoUPB = SUM( IS_NULL(l.Acquisition.BORROWER_CREDIT_SCORE) ? 0 : 
> l.Acquisition.ORIGINAL_UPB )
>
>     SORT origDate
>
>   RETURN {
>     OrigDate: DATE_FORMAT(origDate, "%m-%yyyy"), 
>     TotalUPB: sumUPB,
>     wavgFICO: weightedFico / ficoUPB}
>
>
> Now that same query without the ternary operator and using the regular sum 
> of UPB (unpaid principal balance)  takes only ~1 second to complete.
>
>  FOR l in loans2
>     COLLECT origDate = l.Acquisition.ORIGINATION_DATE
>     AGGREGATE 
>         sumUPB = SUM(l.Acquisition.ORIGINAL_UPB),
>         weightedFico = SUM(l.Acquisition.BORROWER_CREDIT_SCORE * 
> l.Acquisition.ORIGINAL_UPB),
>
>     SORT origDate
>   RETURN {
>     OrigDate: DATE_FORMAT(origDate, "%m-%yyyy"), 
>     TotalUPB: sumUPB,
>     wavgFICO: weightedFico / sumUPB}
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to