Hey Simran,

In our case we do precisely want to scan the whole content of the collection, but we want to make sure we don't spend 90% of our processing time on the "SKIP" step of the query. (Sorry, I should have said SKIP in the previous post.)
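To make the concern concrete, here is a rough, purely illustrative sketch in Python (an in-memory list stands in for the collection; nothing here is ArangoDB API) of why an offset-based SKIP costs time proportional to the offset, while a pointer/key-based seek does not:

```python
import bisect

# Illustrative only: 10,000 "documents" with sorted keys.
docs = [{"_key": f"{i:06d}", "value": i} for i in range(10_000)]
keys = [d["_key"] for d in docs]

def page_by_offset(docs, offset, limit):
    """Offset pagination: the 'SKIP' step still walks past `offset` documents."""
    it = iter(docs)
    for _ in range(offset):          # work grows linearly with the offset
        next(it)
    return [next(it) for _ in range(limit)]

def page_by_cursor(docs, keys, last_key, limit):
    """Cursor-style pagination: seek directly past the last key seen."""
    start = bisect.bisect_right(keys, last_key)   # O(log n) seek, no linear skip
    return docs[start:start + limit]

# Both return the same page; only the cost of getting there differs.
a = page_by_offset(docs, 100, 10)
b = page_by_cursor(docs, keys, "000099", 10)
assert a == b
```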
We have several use cases. I can't share much detail (it's a company project), but as you can guess, if we want to dump the whole collection, it is legitimate to read all of its documents. There are no "relevant" documents to select, because all of them are relevant. I just read the issue you opened on GitHub; I guess that once it's resolved, it could do the trick.

Don't you have anything like physical pointers? To be honest, we are currently using OrientDB, and it has a very useful feature: iterating over RIDs (Record IDs). RIDs are physical pointers that support the comparison operators (<, <=, =, >=, >), so you can run a query such as "select * from my_collection where @rid > #10:100". If "my_collection" starts at RID #10:100, this is effectively "select * from my_collection SKIP 100". The main benefit is that it runs extremely fast, because the time spent "skipping" records is almost nonexistent. I'm no OrientDB evangelist, but for clarity you can look at https://orientdb.com/docs/3.0.x/sql/Pagination.html#use-the-rid-limit if you are curious.

Thank you for your answer,
Cyprien

On Monday, December 10, 2018 at 16:54:11 UTC+1, Simran Brucherseifer wrote:
>
> Hey Cyprien,
>
> it's going well, thanks!
>
> In general, yes, using LIMIT with a high offset can take more time than
> returning the first few documents if there's a huge dataset to process to
> answer the query. But it depends on the exact query. Reading all content of
> the documents can be avoided in several cases, but it may still be
> necessary to walk through an index data structure up to the documents that
> need to be processed and returned.
>
> Can you post the exact query (or queries) you need so that we can better
> understand the goal and check what the options are to optimize it?
> Maybe a secondary index can be utilized to select the relevant documents?
> Or maybe you know the document keys and can do point lookups for them?
>
> Regarding range FILTERs: It is currently not possible to do that using an
> index on the _key attribute, but this may change:
> https://github.com/arangodb/arangodb/issues/7720
>
> Best,
> Simran

--
You received this message because you are subscribed to the Google Groups "ArangoDB" group.
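The RID pattern described above generalizes to any key with a total order: each page resumes strictly after the last key seen, so a full dump never re-skips earlier records. A minimal, self-contained sketch of that dump loop (plain Python; the sorted list is a hypothetical stand-in for OrientDB RIDs or ArangoDB _key values, not a real driver call) might look like:

```python
import bisect

# Hypothetical in-memory "collection" with sorted keys.
collection = [{"_key": f"{i:05d}"} for i in range(1000)]
keys = [d["_key"] for d in collection]

def fetch_after(last_key, limit):
    # Stands in for: select * from my_collection where @rid > :last limit :limit
    start = 0 if last_key is None else bisect.bisect_right(keys, last_key)
    return collection[start:start + limit]

def dump_all(limit=128):
    """Dump the whole collection page by page without any SKIP step."""
    last_key, out = None, []
    while True:
        page = fetch_after(last_key, limit)
        if not page:
            break
        out.extend(page)
        last_key = page[-1]["_key"]   # resume strictly after the last key seen
    return out

assert dump_all() == collection       # every document visited exactly once
```

Each iteration seeks directly to its start position, so the total work stays linear in the collection size instead of quadratic, which is the behavior lost when paginating with a growing SKIP offset.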
