I was trying to say that in the current version it will always do a full scan to find the distinct values, as the indexes currently do not participate in any operations such as DISTINCT / COLLECT. What I was also trying to say is that the best remedy for this would be to implement the DISTINCT operation directly in the index scan, but that this is a feature that is not there yet.
Best regards Jan Am Montag, 20. Mai 2019 09:58:12 UTC+2 schrieb filtered: > > The answer does not convince me. > > Are you saying that a full table scan is always required e.g. when you > want to determine > the distinct values of a particular field and count the related documents > with this value independent of > the existence of an index or not? > > Am Mo., 20. Mai 2019 um 09:10 Uhr schrieb Jan <[email protected] > <javascript:>>: > >> Hi, >> >> I can't tell if views will handle this query very differently, but I >> doubt that when using a view the extra `COLLECT` step will be removed. >> That's not to say that there will be no performance difference when using >> views (I simply don't know). But when it comes to the `COLLECT`, I guess it >> will still stay around in the query even if a view is used. >> >> I guess the query could be most improved by implementing the distinct >> operation directly on the index, which would remove the extra `COLLECT` >> step entirely. >> But that functionality is still a future to-do and not available yet. >> >> Best regards >> Jan >> >> Am Sonntag, 19. Mai 2019 15:29:45 UTC+2 schrieb Andreas Jung: >>> >>> Running ArangoDB 3.4.5 on a collection with 100.000 items. >>> Each item has singe-valued field _type and there are about 20 different >>> values for this field. >>> The field is indexed by a hash index: https://nimb.ws/yh4rRb >>> >>> This query performs a full table scan instead of using the index. >>> Am I missing something here? >>> >>> For doc in import >>> return distinct doc._type >>> >>> Query String: >>> For doc in import >>> return distinct doc._type >>> >>> Execution plan: >>> Id NodeType Est. Comment >>> 1 SingletonNode 1 * ROOT >>> 2 EnumerateCollectionNode 95238 - FOR doc IN import /* full >>> collection scan, projections: `_type` */ >>> 3 CalculationNode 95238 - LET #1 = doc.`_type` /* >>> attribute expression */ /* collections used: doc : import */ >>> 4 CollectNode 76190 - COLLECT #3 = #1 /* >>> distinct */ >>> 5 ReturnNode 76190 - RETURN #3 >>> >>> Indexes used: >>> none >>> >>> Optimization rules applied: >>> Id RuleName >>> 1 reduce-extraction-to-projection >>> >>> >>> -- >> You received this message because you are subscribed to the Google Groups >> "ArangoDB" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected] <javascript:>. >> To view this discussion on the web visit >> https://groups.google.com/d/msgid/arangodb/0c325e78-45fd-4224-8de3-38dc49ecec4f%40googlegroups.com >> >> <https://groups.google.com/d/msgid/arangodb/0c325e78-45fd-4224-8de3-38dc49ecec4f%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> For more options, visit https://groups.google.com/d/optout. >> > -- You received this message because you are subscribed to the Google Groups "ArangoDB" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/arangodb/dccb7377-9e20-4623-9fcc-3f40fce4e4af%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
