On Sat, Sep 5, 2015 at 12:26 PM, Fabien COELHO <coe...@cri.ensmp.fr> wrote:

>>>> I would be curious whether flushing helps, though.
>>>
>>> Yes, me too. I think we should try to reach a consensus on the exact
>>> scenarios and configurations where this patch (or patches) can give a
>>> benefit, or where we want to verify whether there is any regression,
>>> as I have access to this m/c for a very, very limited time. This m/c
>>> might get formatted soon for some other purpose.
>>
>> Yep, it would be great if you have time for a flush test before it
>> disappears... I think it is advisable to disable the write cache, as it
>> may also hide the impact of flushing.
>
> Still thinking... Depending on the results, it might be interesting to
> have these tests run with the write cache enabled as well, to check how
> much it interferes positively with performance.

I have done some tests with both the patches (sort+flush); the results
are below.
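To make the two options concrete: checkpoint_sort makes the checkpointer
write dirty buffers in file/block order rather than in buffer-pool order,
and checkpoint_flush_to_disk makes it periodically ask the kernel to start
writeback of the pages it has just written. Below is a minimal sketch of
the sorting idea using a simplified, hypothetical buffer tag; this is my
illustration, not the patch's actual code:

    /*
     * Sketch of the checkpoint_sort idea: order the dirty buffers by
     * (tablespace, relation, block) so that each file receives its
     * checkpoint writes sequentially. Simplified model, not patch code.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    typedef struct DirtyBuf
    {
        uint32_t    tablespace;   /* hypothetical simplified buffer tag */
        uint32_t    relation;
        uint32_t    block;
    } DirtyBuf;

    static int
    dirtybuf_cmp(const void *a, const void *b)
    {
        const DirtyBuf *x = a;
        const DirtyBuf *y = b;

        if (x->tablespace != y->tablespace)
            return x->tablespace < y->tablespace ? -1 : 1;
        if (x->relation != y->relation)
            return x->relation < y->relation ? -1 : 1;
        if (x->block != y->block)
            return x->block < y->block ? -1 : 1;
        return 0;
    }

    int
    main(void)
    {
        DirtyBuf    bufs[] = {
            {1, 42, 7}, {1, 17, 3}, {1, 42, 2}, {2, 5, 0}, {1, 17, 1}
        };
        size_t      n = sizeof(bufs) / sizeof(bufs[0]);

        /* checkpoint_sort=on: issue the writes in file/block order */
        qsort(bufs, n, sizeof(DirtyBuf), dirtybuf_cmp);

        for (size_t i = 0; i < n; i++)
            printf("write ts=%u rel=%u block=%u\n",
                   bufs[i].tablespace, bufs[i].relation, bufs[i].block);
        return 0;
    }

The point of the sort is that the storage and the kernel I/O scheduler
handle sequential writes to each relation file far better than the random
order in which dirty buffers happen to sit in shared_buffers.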
M/c details
--------------------
IBM POWER-8
24 cores, 192 hardware threads
RAM = 492GB

Test - 1 (Data fits in shared_buffers)
--------------------------------------------------------
Non-default settings were used in the script provided by Fabien upthread.
The pgbench options below were used for this test and for the rest of the
tests as well:

    fw) ## full speed parallel write pgbench
        run="FW"
        opts="-M prepared -P 1 -T $time $para"
        ;;

warmup=1000
scale=300
max_connections=300
shared_buffers=32GB
checkpoint_timeout=10min
time=7200
synchronous_commit=on
max_wal_size=15GB
para="-j 64 -c 128"
checkpoint_completion_target=0.8
checkpoint_flush_to_disk="on off"
checkpoint_sort="on off"

In the results below, the five bracketed numbers are the min, q1, median,
q3 and max of the per-second TPS.

Flush - off and Sort - off
avg over 7203: 27480.350104 ± 12791.098857 [0.000000, 16009.400000,
32109.200000, 37629.000000, 51671.400000]
percent of values below 10.0: 2.8%

Flush - off and Sort - on
avg over 7200: 27482.501264 ± 12552.036065 [0.000000, 16587.250000,
31225.950000, 37516.450000, 51296.900000]
percent of values below 10.0: 2.8%

Flush - on and Sort - off
avg over 7200: 25214.757292 ± 11059.709509 [5268.000000, 14188.400000,
26472.450000, 35626.100000, 51479.000000]
percent of values below 10.0: 0.0%

Flush - on and Sort - on
avg over 7200: 26819.631722 ± 10589.745016 [5191.700000, 16825.450000,
29429.750000, 35707.950000, 51475.100000]
percent of values below 10.0: 0.0%

For this test run, the best results are with both sort and flush enabled:
the lowest TPS value increases substantially without sacrificing much in
the average TPS, though there is a ~9% dip in the median TPS. With only
sorting enabled, there is neither a significant gain nor any loss. With
only flush enabled, there is a significant degradation in both the
average and median TPS values, ~8% and ~21% respectively.

Test - 2 (Data doesn't fit in shared_buffers, but fits in RAM)
----------------------------------------------------------------------------------------
warmup=1000
scale=3000
max_connections=300
shared_buffers=32GB
checkpoint_timeout=10min
time=7200
synchronous_commit=on
max_wal_size=25GB
para="-j 64 -c 128"
checkpoint_completion_target=0.8
checkpoint_flush_to_disk="on off"
checkpoint_sort="on off"

Flush - off and Sort - off
avg over 7200: 5050.059444 ± 4884.528702 [0.000000, 98.100000,
4699.100000, 10125.950000, 13631.000000]
percent of values below 10.0: 7.7%

Flush - off and Sort - on
avg over 7200: 6194.150264 ± 4913.525651 [0.000000, 98.100000,
8982.000000, 10558.000000, 14035.200000]
percent of values below 10.0: 11.0%

Flush - on and Sort - off
avg over 7200: 2771.327472 ± 1860.963043 [287.900000, 2038.850000,
2375.500000, 2679.000000, 12862.000000]
percent of values below 10.0: 0.0%

Flush - on and Sort - on
avg over 7200: 6110.617722 ± 1939.381029 [1652.200000, 5215.100000,
5724.000000, 6196.550000, 13828.000000]
percent of values below 10.0: 0.0%

For this test run, again the best results are with both sort and flush
enabled: the lowest TPS value increases substantially, and the average
and median TPS values increase by ~21% and ~22% respectively. With only
sorting enabled, there is a significant gain in the average and median
TPS values, but the fraction of time the TPS drops below 10 also
increases (7.7% -> 11.0%), which is bad. With only flush enabled, there
is a significant degradation in both the average and median TPS values,
~82% and ~97% respectively; I am not sure whether such a big degradation
is to be expected for this case or whether it is just a problem with this
particular run, as I have not repeated the test.

Test - 3 (Data doesn't fit in shared_buffers, but fits in RAM)
----------------------------------------------------------------------------------------
Same configuration and settings as above, but this time the flush was
forced to use posix_fadvise() rather than sync_file_range() (basically, I
changed the code to comment out the sync_file_range() call and enable the
posix_fadvise() one).

Flush - on and Sort - on
avg over 7200: 3400.915069 ± 739.626478 [1642.100000, 2965.550000,
3271.900000, 3558.800000, 6763.000000]
percent of values below 10.0: 0.0%

With posix_fadvise(), the result for the previously best case (both flush
and sort on) shows a significant degradation in the average and median
TPS values, ~48% and ~43% respectively, which indicates that using
posix_fadvise() with the current advice options is probably not the best
way to implement the flush.
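For reference, here is a sketch of the two kernel hints compared in
Test 3 (Linux-specific; flush_range() and its parameters are my
hypothetical illustration, not the patch's actual call sites):

    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <fcntl.h>

    void
    flush_range(int fd, off_t offset, off_t nbytes, int use_fadvise)
    {
        if (!use_fadvise)
        {
            /*
             * Start asynchronous writeback of just this range; do not
             * wait for completion, and leave the pages in the OS cache.
             */
            (void) sync_file_range(fd, offset, nbytes,
                                   SYNC_FILE_RANGE_WRITE);
        }
        else
        {
            /*
             * This may also initiate writeback of dirty pages in the
             * range, but in addition it tells the kernel the pages are
             * no longer needed, so they can be evicted from the OS
             * cache.
             */
            (void) posix_fadvise(fd, offset, nbytes,
                                 POSIX_FADV_DONTNEED);
        }
    }

If POSIX_FADV_DONTNEED indeed evicts still-hot pages from the OS cache,
that would be one plausible explanation for the much lower TPS in Test 3:
the data fits in RAM but not in shared_buffers, so backends depend on the
OS cache for reads.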
Overall, I think this patch (sort+flush) brings a lot of value to the
table in terms of stabilizing the TPS during checkpoints. However, a few
of the cases, such as the use of posix_fadvise() and the
all-data-fits-in-shared_buffers case where the median TPS regresses,
could be investigated further to see what can be done to improve them.
More tests could be done to confirm the benefits or regressions of this
patch, but for now this is the best I can do.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com