Hi,
one possible solution would be to add more precision to the timing events,
e.g. by adding milliseconds to each timestamp. This would make it less
likely that two events get the exact same timestamp, though it could still
happen in rare cases.
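For illustration, millisecond-precision timestamps could look like this
(the exact format is up to you; the values here are made up, and with a
fixed-width format the `time` strings also sort chronologically):
{"id":"917", "date":"2016-08-01", "time":"10:33:45.120", "location":"category/4"},
{"id":"917", "date":"2016-08-01", "time":"10:33:45.384", "location":"item/6"}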
It sounds like you've already considered this option and don't want to go
that way.
Another option is to use the `_key` attribute of each document and compare
these values. By default, the `_key` values generated by ArangoDB are
increasing numbers packaged into strings. If there is only one process that
inserts documents sequentially into the collection, then documents inserted
later will get "higher" `_key` values (technically they are not higher
because `_key` is a string, but the assumption holds once `_key` is
converted into a number).
For example,
FOR doc IN collection
  FILTER doc.time == '10:33:45'
  SORT TO_NUMBER(doc._key) ASC
  RETURN doc
would give you the documents for time `10:33:45` in insertion order.
A third alternative is to have the insertion process generate an increasing
value per document it inserts. This will work fine if only one process
inserts documents into the collection. The process can then compare the data
by time and assign an increasing sequence number to each document that
shares the same time value, e.g.
{"id":"917", "date":"2016-08-01", "time":"10:33:37",
"location":"home","seq":0},
{"id":"917", "date":"2016-08-01", "time":"10:33:39",
"location":"category/1","seq":0},
{"id":"917", "date":"2016-08-01", "time":"10:33:45",
"location":"category/4","seq":0},
{"id":"917", "date":"2016-08-01", "time":"10:33:45",
"location":"item/6","seq":1},
{"id":"917", "date":"2016-08-01", "time":"10:33:50",
"location":"home","seq":0}
You could then query events in order by sorting by time first and then by
seq.
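For example, a query along these lines (just a sketch; the filter on `id`
and `date` is only for illustration) would return that user's events for the
day in insertion order:
FOR doc IN collection
  FILTER doc.id == '917' AND doc.date == '2016-08-01'
  SORT doc.time ASC, doc.seq ASC
  RETURN doc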
I hope this helps.
Best regards
Jan
On Friday, August 19, 2016 at 00:30:37 UTC+2, Daniel wrote:
>
> How can I get the behaviour flow of a user from a collection of logs?
>
>
> Background:
>
> I'm using ArangoDB to hold an app's log data that is similar to web
> traffic (I hope that this is a good use case)
>
> I've got two things I'm doing from the Node.js app that processes it:
> 1) insert parsed data
> 2) pre-aggregate data to get event counts quickly
>
> In some cases I may need to get more advanced information such as the most
> common behaviour flow.
>
> Assuming I've got data such as this:
> [
> {"id":"917", "date":"2016-08-01", "time":"10:33:37", "location":"home"},
> {"id":"917", "date":"2016-08-01", "time":"10:33:39",
> "location":"category/1"},
> {"id":"917", "date":"2016-08-01", "time":"10:33:45",
> "location":"category/4"},
> {"id":"917", "date":"2016-08-01", "time":"10:33:45", "location":"item/6"},
> {"id":"917", "date":"2016-08-01", "time":"10:33:50", "location":"home"},
> etc...
> ]
>
> The problem I've found already is that even though I'm inserting them
> sequentially, once I add two lines with the same timestamp (no millisecond
> info) I can't tell which one came first.
>
> Is this something I could use the graph component for?
>