I've been working on a config management system built on ArangoDB. It
collects config data for some common software and streams it to a program
that derives the relationships among those software components from some
predefined rules and then saves the relations into ArangoDB. Once the
relations are established, I provide APIs to query the data. One important
query generates the topology of these components. I use a graph traversal
to generate the topology with the following AQL:
    FOR n IN nginx
      FOR v, e, p IN 0..4 OUTBOUND n forward, dispatch, route, INBOUND deployto, referto, monitoron
        FILTER @domain IN p.edges[0].server_name
        RETURN {id: v._id, type: v.ci_type}
which can generate the following topology:
<https://lh3.googleusercontent.com/-Qb39JcyO_HM/WH7hwJeTRUI/AAAAAAAAAG8/v7pDHlhmPGwATo5HIoXyB56ri1S4Y9daQCLcB/s1600/generated-topology.png>
This looks fine. However, the query takes around 10 seconds to finish,
which is not acceptable given that the data volume is not very large. I
checked all the collections, and the largest one, the "forward" edge
collection, has only around 28000 documents. So I ran some tests:
- I changed the depth from 0..4 to 0..2, and the query takes only 0.3
seconds to finish
- I changed the depth from 0..4 to 0..3, and it takes around 3 seconds
- for 0..4, it takes around 10 seconds
- Since there is a server_name property on the "forward" edges, I added
a hash index on server_name[*], but according to the explain output
ArangoDB does not use the index
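
Regarding the index, here is one possible direction, as an untested sketch.
As far as I understand, an array index on server_name[*] is only considered
when the filter itself uses the [*] operator on a document of the indexed
collection, and the traversal filter on p.edges[0].server_name does not have
that form. The variant below looks up the matching "forward" edges first and
then traverses from their targets with the depth reduced by one. It assumes
that only "forward" edges carry server_name and that the first edge of every
matching path starts at an nginx vertex. The results may differ slightly from
the original query, because the starting edge can be visited again during the
traversal.

```aql
// Sketch only: find the first-hop "forward" edges via the array index,
// then traverse the remaining hops from each edge's target vertex.
FOR f IN forward
  // the [*] form is what allows an array index on server_name[*] to be used
  FILTER @domain IN f.server_name[*]
  // assumption: only edges leaving an nginx vertex are relevant
  FILTER IS_SAME_COLLECTION("nginx", f._from)
  // depth 0..3 from f._to corresponds to depth 1..4 from the nginx vertex
  FOR v, e, p IN 0..3 OUTBOUND f._to forward, dispatch, route, INBOUND deployto, referto, monitoron
    RETURN {id: v._id, type: v.ci_type}
```

Running explain on this form should show whether the hash index is picked
up for the first FOR loop.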
Any tips on how I can optimize the query? And why can't the index be used
in this case?
Hope someone can help me out with this. Thanks in advance,