On Tue, Sep 2, 2014 at 9:11 AM, Rahila Syed <rahilasye...@gmail.com> wrote:

> Hello,
>
> > It'd be interesting to check avg CPU usage as well
>
> I have collected average CPU utilization numbers by capturing sar output
> at an interval of 10 seconds for the following benchmark (sampling
> command sketched below, after the settings):
>
> Server specifications:
> Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2
> RAM: 32GB
> Disk: 450 GB 10K rpm Hot Plug 2.5-inch SAS HDD (6 Gb/s) * 8
>
> Benchmark:
>
> Scale : 16
> Command: java JR /home/postgres/jdbcrunner-1.2/scripts/tpcc.js
>   -sleepTime 550,250,250,200,200
>
> Warmup time          : 1 sec
> Measurement time     : 900 sec
> Number of tx types   : 5
> Number of agents     : 16
> Connection pool size : 16
> Statement cache size : 40
> Auto commit          : false
>
>
> Checkpoint segments: 1024
> Checkpoint timeout: 5 mins
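>
> Something like the following collects those samples (illustrative
> invocation; 90 samples at a 10-second interval cover the 900-second
> measurement window):
>
>   sar -u 10 90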
>
>
> Average % CPU utilization at user level for multiple-block compression:
>
>   Compression off = 3.34133
>   Snappy          = 3.41044
>   LZ4             = 3.59556
>   pglz            = 3.66422
>
>
> The numbers show that average CPU utilization is in the following order:
> pglz > LZ4 > Snappy > no compression.
> Attached is a graph plotting % CPU utilization against elapsed time for
> each of the compression algorithms.
> Also, overall CPU utilization during the tests was very low, i.e. below
> 10%; the CPU remained idle for a large percentage (~90%) of the time. I
> will repeat the above tests under higher CPU load, using the benchmark
> given by Fujii-san, and post the results.
>
>
> Thank you,
>
>
>
> On Wed, Aug 27, 2014 at 9:16 PM, Arthur Silva <arthur...@gmail.com> wrote:
>
>>
>> On 26/08/2014 09:16, "Fujii Masao" <masao.fu...@gmail.com> wrote:
>>
>> >
>> > On Tue, Aug 19, 2014 at 6:37 PM, Rahila Syed <rahilasye...@gmail.com>
>> wrote:
>> > > Hello,
>> > > Thank you for comments.
>> > >
>> > >>Could you tell me where the patch for "single block in one run" is?
>> > > Please find attached patch for single block compression in one run.
>> >
>> > Thanks! I ran the benchmark using pgbench and compared the results.
>> > I'd like to share the results.
>> >
>> > [RESULT]
>> > Amount of WAL generated during the benchmark. Unit is MB.
>> >
>> >                Multiple      Single
>> >     off           202.0       201.5
>> >     on           6051.0      6053.0
>> >     pglz         3543.0      3567.0
>> >     lz4          3344.0      3485.0
>> >     snappy       3354.0      3449.5
>> >
>> > Latency average during the benchmark. Unit is ms.
>> >
>> >                Multiple      Single
>> >     off            19.1        19.0
>> >     on             55.3        57.3
>> >     pglz           45.0        45.9
>> >     lz4            44.2        44.7
>> >     snappy         43.4        43.3
>> >
>> > These results show that FPW compression is really helpful for decreasing
>> > the WAL volume and improving the performance.
>> >
>> > The compression ratio with lz4 or snappy is better than with pglz. But
>> > it's difficult to conclude from these results which of lz4 and snappy
>> > is better.
>> >
>> > ISTM that the compression-of-multiple-pages-at-a-time approach can
>> > compress WAL more than compression-of-a-single-page-at-a-time does.
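>> >
>> > To illustrate the intuition (a toy sketch, not the patch's code: one
>> > compressor call over the concatenated blocks can reuse matches across
>> > page boundaries), using the classic lz4 C API:
>> >
>> >     #include <stdio.h>
>> >     #include "lz4.h"        /* LZ4_compress, LZ4_COMPRESSBOUND */
>> >
>> >     #define BLCKSZ  8192    /* PostgreSQL page size */
>> >     #define NBLOCKS 3       /* pretend backup blocks of one record */
>> >
>> >     static char blocks[NBLOCKS][BLCKSZ];    /* stand-in page images */
>> >     static char out[LZ4_COMPRESSBOUND(NBLOCKS * BLCKSZ)];
>> >
>> >     int main(void)
>> >     {
>> >         int i, single = 0, multi;
>> >
>> >         /* "Single": one compression run per block */
>> >         for (i = 0; i < NBLOCKS; i++)
>> >             single += LZ4_compress(blocks[i], out, BLCKSZ);
>> >
>> >         /* "Multiple": all blocks compressed in one run */
>> >         multi = LZ4_compress((const char *) blocks, out,
>> >                              NBLOCKS * BLCKSZ);
>> >
>> >         printf("per-block total: %d, one run: %d\n", single, multi);
>> >         return 0;
>> >     }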
>> >
>> > [HOW TO BENCHMARK]
>> > Create the pgbench database with scale factor 1000.
>> >
>> > Change the data type of the column "filler" on each pgbench table
>> > from CHAR(n) to TEXT, and fill it with the result of pgcrypto's
>> > gen_random_uuid(), in order to avoid empty columns, e.g.,
>> >
>> >  alter table pgbench_accounts alter column filler type text using
>> > gen_random_uuid()::text
>> >
>> > After creating the test database, run pgbench as follows. The number
>> > of transactions executed is almost the same in each benchmark run
>> > because the -R (rate) option is used.
>> >
>> >   pgbench -c 64 -j 64 -r -R 400 -T 900 -M prepared
>> >
>> > checkpoint_timeout is 5min, so at least two checkpoints are expected
>> > to have occurred during the 900-second benchmark.
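>> >
>> > Putting the steps together, the setup looks roughly like this (a
>> > sketch; the database name "bench" and the CREATE EXTENSION step for
>> > pgcrypto are assumed):
>> >
>> >   pgbench -i -s 1000 bench
>> >   psql bench -c "CREATE EXTENSION pgcrypto;"
>> >   psql bench -c "ALTER TABLE pgbench_accounts ALTER COLUMN filler
>> >                  TYPE text USING gen_random_uuid()::text;"
>> >   # ...same ALTER for pgbench_branches, pgbench_tellers, pgbench_history
>> >   pgbench -c 64 -j 64 -r -R 400 -T 900 -M prepared bench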
>> >
>> > Regards,
>> >
>> > --
>> > Fujii Masao
>> >
>> >
>>
>> It'd be interesting to check avg cpu usage as well.
>>
>
>
Is there any reason to default to LZ4-HC? Shouldn't we try the default as
well? LZ4-default is known for its near-realtime speed in exchange for a
few percent of compression ratio, which sounds optimal for this use case.
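
For reference, the difference is just which lz4 entry point gets called.
A minimal sketch against the classic lz4 C API (the 8 kB input and the
buffer handling here are illustrative, not the patch's code):

    #include <stdio.h>
    #include "lz4.h"    /* LZ4_compress, LZ4_COMPRESSBOUND */
    #include "lz4hc.h"  /* LZ4_compressHC */

    int main(void)
    {
        static char src[8192];  /* stand-in for a page image */
        static char dst[LZ4_COMPRESSBOUND(sizeof(src))];
        int n;

        /* Fast default compressor: near-realtime speed */
        n = LZ4_compress(src, dst, (int) sizeof(src));
        printf("LZ4 default: %d bytes\n", n);

        /* High-compression variant: slower, a few percent smaller */
        n = LZ4_compressHC(src, dst, (int) sizeof(src));
        printf("LZ4-HC:      %d bytes\n", n);
        return 0;
    }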

Also, we might want to compile these libraries with -O3 instead of the
default -O2. They're finely tuned to take advantage of every compiler
optimization, with hints and other tricks; this is especially true for
LZ4, though I'm not sure about snappy.
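
For the PostgreSQL tree that would be something along the lines of (a
sketch; configure defaults to -O2 when building with gcc):

    ./configure CFLAGS="-O3"
    make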

In my virtual machine, LZ4 with -O3 compresses at twice the speed
(950 MB/s) of -O2 (450 MB/s), at a 61.79% ratio; LZ4-HC seems unaffected,
though (58 MB/s at a 60.27% ratio).

Yes, that's right, almost 1 GB/s! And the compression ratio is only 1.5%
short of LZ4-HC's.
