Hi, Vicențiu!

On Dec 11, Vicențiu Ciorbaru wrote:
> On Tue, 11 Dec 2018 at 14:33 Sergei Golubchik <s...@mariadb.org> wrote:
> >
> > But then I was thinking, why do you need to specify an index at all?
> > Shouldn't it be just "get me a random row"? Index or whatever -
> > that's engine implementation detail. For example, MyISAM with a
> > fixed-size rows can just read from
> > lseek(floor((file_size/row_size)*rand())*row_size).
>
> I agree that the need for an index seems a bit much. My reasoning was
> that I wanted to allow random sampling on a particular range. This
> could help for example when one wants to collect histograms for a
> multi-distribution dataset, to get individual distributions (if the
> indexed column is able to separate them).
>
> A more generic idea would be if one could pass some conditions for
> random row retrieval to the storage engine, but it feels like this
> would complicate storage engine implementation by quite a bit.
>
> For the first iteration, after considering your input, I'd go with
> "init function", "get random row", "end function", without imposing an
> index, but somehow passing a (COND or similar) arg to the init
> function.

For the first iteration I'd go without a condition. You probably
shouldn't add an API that you won't use, and in the first iteration you
won't use it, right? It can be added later when needed.

Regards,
Sergei
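
To make the two ideas in this thread concrete, here is a minimal C++ sketch of the fixed-size-row offset formula quoted above, wrapped in an "init", "get random row", "end" sequence with no index and no condition argument. Every name in it (`random_row_offset`, `FixedRowSampler`, `sample_init`, `sample_next`, `sample_end`) is hypothetical and not part of MariaDB's handler API or of MyISAM. It also assumes the data file holds nothing but whole fixed-size rows, ignores deleted rows, and returns the offset instead of actually calling lseek().

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper (not MyISAM code): byte offset of a randomly chosen
// row slot in a data file made of fixed-size rows, given u in [0, 1).
// Mirrors floor((file_size/row_size)*rand())*row_size from the thread.
std::uint64_t random_row_offset(std::uint64_t file_size,
                                std::uint64_t row_size, double u)
{
  assert(row_size > 0 && file_size >= row_size);
  const std::uint64_t row_count = file_size / row_size;  // whole rows only
  std::uint64_t row_index =
      static_cast<std::uint64_t>(u * static_cast<double>(row_count));
  if (row_index >= row_count)  // guard against u rounding up to 1.0
    row_index = row_count - 1;
  return row_index * row_size;
}

// Hypothetical three-call sampler: init, get random row, end.
// No index and no condition argument, as suggested for a first iteration.
class FixedRowSampler
{
public:
  FixedRowSampler(std::uint64_t file_size, std::uint64_t row_size)
      : file_size_(file_size), row_size_(row_size) {}

  // Seeds the generator; returns 0 on success, 1 if there are no rows.
  int sample_init(unsigned seed)
  {
    if (row_size_ == 0 || file_size_ < row_size_)
      return 1;
    rng_.seed(seed);
    active_ = true;
    return 0;
  }

  // Stores the offset of one random row; a real engine would seek to it
  // and read the row. Returns 1 if sample_init() was not called first.
  int sample_next(std::uint64_t *offset)
  {
    if (!active_)
      return 1;
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    *offset = random_row_offset(file_size_, row_size_, dist(rng_));
    return 0;
  }

  // Ends the sampling scan.
  int sample_end()
  {
    active_ = false;
    return 0;
  }

private:
  std::uint64_t file_size_;
  std::uint64_t row_size_;
  std::mt19937_64 rng_;
  bool active_ = false;
};
```

A caller would construct the sampler, call `sample_init()` once, call `sample_next()` as many times as it needs rows, and finish with `sample_end()`. Since the init call takes nothing beyond a seed, a condition argument could be added to it later without changing the other two calls.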