Re: pgbench-ycsb

Fabien COELHO Sun, 22 Jul 2018 13:44:09 -0700

Just to clarify - if I understand Anthony correctly, this proposal is not about implementing exactly YCSB as it is, but more about using a zipfian distribution for an id in the regular pgbench table structure in conjunction with read/write balance to simulate something similar to it.

Ok, I misunderstood. My 0.02€: If it does not implement YCSB, and the
point is not to implement YCSB, then do not call it YCSB :-)

Maybe there could be other, simpler builtins that use non-uniform
distributions: {zipf,exp,...}-{simple,select}, with default values
(exp_param, zipf_param?) for the random distribution parameters.

  \set id random_zipfian(1, 100000*:scale, :zipf_param)
  \set val random(-5000, 5000)
  UPDATE pgbench_whatever ...;

Then

  pgbench -b zipf-se@1 -b zipf-si@1 [ -D zipf_param=1.1 ... ] -T 10000 ...

And probably instead of implementing the exact YCSB workload inside pgbench, it makes more sense to add PostgreSQL Jsonb as one of the options into the framework itself (I was in the middle of it a few years ago, but then was distracted by some interesting benchmarking results).

Sure.

Hello,
thank you for your interest. I'm still improving this idea and the patch,
and I'm very happy about the discussion we have. It really helps.

The idea was to implement the workloads as close to YCSB as possible
using pgbench.


Basically I'm against having something called YCSB if it is not YCSB ;-)

So, the schema it should be applied to is the default schema generated by
pgbench -i (pgbench_accounts).

This is a contradiction, because the pgbench_accounts table is in no way, even remotely, conformant to the YCSB benchmark test table.
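
To illustrate the mismatch, here is a sketch of the two tables. The pgbench_accounts columns are those created by "pgbench -i"; the YCSB layout is an assumption based on the usertable used by YCSB's JDBC binding (a string key plus ten text fields), not something taken from the patch:

  -- created by "pgbench -i" (primary key on aid is added after loading)
  CREATE TABLE pgbench_accounts (
    aid      int NOT NULL,
    bid      int,
    abalance int,
    filler   char(84)
  );

  -- assumed YCSB layout, as used by its JDBC binding
  CREATE TABLE usertable (
    ycsb_key VARCHAR(255) PRIMARY KEY,
    field0 TEXT, field1 TEXT, field2 TEXT, field3 TEXT, field4 TEXT,
    field5 TEXT, field6 TEXT, field7 TEXT, field8 TEXT, field9 TEXT
  );

Under that assumption, the key type, the number of columns and the payload columns all differ.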


So for me there are three possibilities:

(1) do nothing, always an option as committers may be against extending pgbench in this direction anyway. Personally I'm fine with having it.

(2) implement YCSB cleanly, i.e. both initialization and operations, at least if this is "reasonable" (i.e. it does not result in 2000 lines of new code). ISTM that it can be done, given that the YCSB schema is very simple, hence I suggested "pgbench -i --schema ycsb" to trigger a non-default initialization.

(3) if you are interested in demonstrating non-uniform distribution on pgbench_accounts, I'm also fine with it, just do so, but do *NOT* call it YCSB.

Also it seems that the YCSB bench uses some hashing to mix keys and avoid having 1 as the most frequent, 2 as the second, and so on. There is a hash function in pgbench which can be used (although the solution is not perfect, some values cannot be reached), but it is used by YCSB. Otherwise I'm planning to submit a pseudo-random permutation function to ease this some day, provided that the size of the table stays constant.
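
For illustration, a minimal sketch of the hash-based mixing as a custom script, assuming a pgbench version which provides both random_zipfian() and hash() (pgbench 11), and that zipf_param is supplied with -D. The script is only an example, it is not part of the patch:

  \set size 100000 * :scale
  \set r random_zipfian(1, :size, :zipf_param)
  -- scatter the hot keys over the key space; this is not a permutation,
  -- so distinct :r may collide and some aid values are never drawn
  \set aid 1 + abs(hash(:r)) % :size
  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

It could be run with something like "pgbench -f script.sql -D zipf_param=1.1 -T 60 ...", where script.sql is a hypothetical file name.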


--
Fabien.
