Hi,
the skiplist index is not used in the original query because the index is
declared sparse.
Sparse indexes exclude documents for which the indexed attribute (i.e.
"created") is either null or not present.
Because some documents may be missing from the index, it will not be used
for sorting when there is no further condition.
Consider the following documents in the collection:
{ "_key" : "a", "created" : null }
{ "_key" : "b", "created" : null }
{ "_key" : "c", "created" : 1 }
{ "_key" : "d", "created" : 2 }
{ "_key" : "e", "created" : 3 }
If you would sort descending by "created" and only return the top 4
documents, then a sparse index could only return 3 documents. A non-sparse
index however would return 4 documents.
Because the results would differ when using the index, the query optimizer
will ignore it.
The second query has a condition "created > 1000000", so the query
optimizer can figure out that "created" cannot be null. So it can use the
skiplist index and the query result will still be the same.
So the solution in your case will be to declare the index as non-sparse,
and it should be used for sorting even without further conditions.
Best regards
Jan
Am Mittwoch, 7. Dezember 2016 15:06:05 UTC+1 schrieb Ilkka Huotari:
>
> Hi,
>
> I have a query that's not performing well, even though I have created a
> skiplist index. It's a simple query:
>
> 127.0.0.1:8529@some4> db._explain("FOR c IN items sort c.created desc
> limit 10 RETURN c");
> Query string:
> FOR c IN items sort c.created desc limit 10 RETURN c
>
> Execution plan:
> Id NodeType Est. Comment
> 1 SingletonNode 1 * ROOT
> 2 EnumerateCollectionNode 1040 - FOR c IN items /* full
> collection scan */
> 3 CalculationNode 1040 - LET #1 = c.`created` /*
> attribute expression */ /* collections used: c : items */
> 4 SortNode 1040 - SORT #1 DESC
> 5 LimitNode 10 - LIMIT 0, 10
> 6 ReturnNode 10 - RETURN c
>
> Indexes used:
> none
>
> Optimization rules applied:
> none
>
> There is no skiplist index used. If I change it like this, thn the index
> is used:
>
> 127.0.0.1:8529@some4> db._explain("FOR c IN items filter c.created >
> 1000000 sort c.created desc limit 10 RETURN c");
> Query string:
> FOR c IN items filter c.created > 1000000 sort c.created desc limit 10
> RETURN c
>
> Execution plan:
> Id NodeType Est. Comment
> 1 SingletonNode 1 * ROOT
> 9 IndexNode 520 - FOR c IN items /* skiplist index scan
> */
> 5 CalculationNode 520 - LET #3 = c.`created` /* attribute
> expression */ /* collections used: c : items */
> 6 SortNode 520 - SORT #3 DESC
> 7 LimitNode 10 - LIMIT 0, 10
> 8 ReturnNode 10 - RETURN c
>
> Indexes used:
> By Type Collection Unique Sparse Selectivity
> Fields Ranges
> 9 skiplist items false true n/a [ `created`
> ] (c.`created` > 1000000)
>
> Optimization rules applied:
> Id RuleName
> 1 move-calculations-up
> 2 move-filters-up
> 3 move-calculations-up-2
> 4 move-filters-up-2
> 5 use-indexes
> 6 remove-filter-covered-by-index
> 7 remove-unnecessary-calculations-2
>
> Now, this is a workaround I could do, but it's not easy to determine a
> good value for items.created - a proper value could be a second, a hour or
> a day. It's not easy to determine without trying different values.
> "items.created" is a numerical timestamp.
>
> Without this workaround the query is very slow.
>
> I don't think it would be difficult to optimize this type of query in
> Arangodb? I'm using Arangodb 3.1.3.
>
> Thanks.
>
>
>
--
You received this message because you are subscribed to the Google Groups
"ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.