Hi Nigel,
did you define any indexes? For movies.csv, it would probably make sense to
use movieId as _key. The primary index always covers _key, which spares
you an additional index.
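To illustrate, here is a rough sketch of what the join could look like once
movieId is stored as _key. It assumes the collection and attribute names from
your query (Movies, TagLinks, movieId). TO_STRING() is only needed if
tl.movieId was imported as a number, since _key is always a string.

```aql
// Sketch only: assumes Movies was imported with movieId as _key.
FOR tl IN TagLinks
  FOR m IN Movies
    // _key is a string; convert tl.movieId in case it is stored as a number
    FILTER m._key == TO_STRING(tl.movieId)
    RETURN { movie: m._id, tag: tl.tag }
```

The lookup on m._key can be served by the primary index, so no extra index
on Movies is required for this part.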
Another way would be to assume that tags are always written in the same case,
or to update them to be all lowercase, for instance. You can then get rid of
the function calls to LOWER(), which keep an index from being used, and create
an index on movieId, tag in collection TagLinks.
The index can then be utilized for the following filter condition:
FILTER m.movieId == tl.movieId AND t.tag == tl.tag
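Put together, this could look roughly like the sketch below. The first two
queries are my assumption of how to lowercase the tags once (run each query
separately). The last one is the join as I read it from your execution plan,
without the LOWER() calls. It assumes the index on TagLinks already exists.

```aql
// One-time normalization (assumption: tags may be stored in mixed case).
FOR t IN Tags
  UPDATE t WITH { tag: LOWER(t.tag) } IN Tags

// Run as a separate query.
FOR tl IN TagLinks
  UPDATE tl WITH { tag: LOWER(tl.tag) } IN TagLinks

// Run as a separate query. Assumes a hash index on [ "movieId", "tag" ] in
// TagLinks, e.g. created in arangosh with
//   db.TagLinks.ensureIndex({ type: "hash", fields: [ "movieId", "tag" ] })
FOR m IN Movies
  FOR t IN Tags
    FOR tl IN TagLinks
      // plain attribute comparisons, so the index can be used
      FILTER m.movieId == tl.movieId AND t.tag == tl.tag
      INSERT { _from: m._id, _to: t._id } IN MyTags
```

Note that Movies and Tags are still scanned in full here. The index only
replaces the innermost scan over TagLinks.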
Be sure to have a look at the execution plan to see what will be done:
Execution plan:
 Id   NodeType                        Est.   Comment
  1   SingletonNode                      1   * ROOT
  3   EnumerateCollectionNode        10000     - FOR m IN Movies   /* full collection scan */
  4   EnumerateCollectionNode    100000000       - FOR t IN Tags   /* full collection scan */
  7   CalculationNode            100000000         - LET #6 = { "_from" : m.`_id`, "_to" : t.`_id` }   /* simple expression */   /* collections used: m : Movies, t : Tags */
  9   IndexNode                  100000000         - FOR tl IN TagLinks   /* hash index scan, scan only */
  8   InsertNode                         0           - INSERT #6 IN MyTags

Indexes used:
 By   Type   Collection   Unique   Sparse   Selectivity   Fields                 Ranges
  9   hash   TagLinks     false    false        91.46 %   [ `movieId`, `tag` ]   ((m.`movieId` == tl.`movieId`) && (t.`tag` == tl.`tag`))