While working on a social network project, I recently realized that the way 
I was fetching a user's feed does not scale:
FOR f IN 1 OUTBOUND 'users/<user-id>' follows
  FOR c, p IN 1 OUTBOUND f hasPublished
    SORT p.publishedAt
    RETURN c
As the number of users and the amount of published content grow, this query 
consumes ever more CPU and memory.

To optimize that, I've started to denormalize my data by using an 
'isSharedWith' edge collection that links users to published content and 
has a skiplist index on a field named 'lastPublishedAt'.
My query now looks like this:
FOR c, s IN 1 INBOUND 'users/<user-id>' isSharedWith
  SORT s.lastPublishedAt DESC
  RETURN c

The "explanation" of this query is:
Execution plan:
 Id   NodeType          Est.   Comment
  1   SingletonNode        1   * ROOT
  2   TraversalNode       84     - FOR c  /* vertex */, s  /* edge */ IN 1..1  /* min..maxPathDepth */ INBOUND 'users/283139442928' /* startnode */  isSharedWith
  3   CalculationNode     84     - LET #4 = s.`lastPublishedAt`   /* attribute expression */
  4   SortNode            84     - SORT #4 DESC
  5   ReturnNode          84     - RETURN c


Indexes used:
 By   Type   Collection     Unique   Sparse   Selectivity   Fields               Ranges
  2   edge   isSharedWith   false    false         1.18 %   [ `_from`, `_to` ]   base INBOUND


Traversals on graphs:
 Id   Depth   Vertex collections   Edge collections   Options                                   Filter conditions
  2   1..1                         isSharedWith       uniqueVertices: none, uniqueEdges: path


Optimization rules applied:
 none

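But this still doesn't look good to me: the plan uses only the edge index, 
not the skiplist index on lastPublishedAt. It seems that a full traversal is 
performed first in order to retrieve lastPublishedAt, and only then are the 
results sorted on that field.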

So my question is: is there a way to denormalize and query this kind of 
data so that the cost of the query (getting the x most recent elements) 
doesn't grow with the amount of data?
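
To make the goal concrete, here is a rough, untested sketch of the kind of 
query I have in mind. It reads the 'isSharedWith' edges as plain documents 
instead of traversing, and adds a LIMIT so that only the x most recent items 
are needed. It assumes a combined skiplist index on [ `_to`, `lastPublishedAt` ] 
(I am not sure the server accepts `_to` in a user-defined index, or that the 
optimizer would use such an index for both the filter and the sort; the 
explain output would have to confirm that). The bind parameters @userId and 
@count are placeholders of mine.

```aql
// Assumption: a skiplist index on [ `_to`, `lastPublishedAt` ] exists on isSharedWith.
// @userId is the full user document id, e.g. 'users/<user-id>'.
// @count is the number of most recent items wanted (the "x" above).
FOR s IN isSharedWith
  FILTER s._to == @userId          // same edges the INBOUND traversal visits
  SORT s.lastPublishedAt DESC      // hoped to be served by the index, not a SortNode
  LIMIT @count
  RETURN DOCUMENT(s._from)         // the published content the edge starts from
```

If the index were used as hoped, the work per request should depend on 
@count rather than on the total number of edges pointing at the user.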

Thanks in advance,
Thomas
