> On 06 Sep 2017, at 08:42, Fabien COELHO wrote:
>
> Hello Alik,
>
> Applies, compiles, works for me.
>
> Some minor comments and suggestions.
>
> Two typos:
> - "usinng" -> "using"
> - "a rejection method used" -> "a rejection method is used"
>
> I'm not sure of "least_recently_used_i", this
Hello Alik,
Applies, compiles, works for me.
Some minor comments and suggestions.
Two typos:
- "usinng" -> "using"
- "a rejection method used" -> "a rejection method is used"
I'm not sure of "least_recently_used_i", this naming style is not used in
pgbench. "least_recently_used" would be o
Hello Fabien,
Thank you for detailed review. I hope I have fixed all the issues you mentioned
in your letter.
pgbench-zipf-08v.patch
Description: Binary data
—
Thanks and Regards,
Alik Khilazhev
Postgres Professional:
http://www.postgrespro.com
The Russian Postgres Company
Hello Alik,
I am attaching patch v7.
Patch generates multiple warnings with "git apply", apparently because of
end-of-line spaces, and fails:
pgbench-zipf-07v.patch:52: trailing whitespace.
{
pgbench-zipf-07v.patch:53: trailing whitespace.
"random_zipfian", 3, P
Hello, Fabien
I am attaching patch v7.
> Yes, I agree. a >= 1 does not make much sense... If you want uniform you
> should use random(), not call random_zipfian with a = 1. Basically it
> suggests that too large values of "a" should be rejected. Not sure where to
> put the limit, thoug
Hello Alik,
Now “a” does not have an upper bound, which is why, when using the iterative
algorithm with a >= 1, the program can get stuck in an infinite loop because of
the following line of code:
double b = pow(2.0, s - 1.0);
After the overflow, “b” becomes “+Inf”.
Yep, overflow can happen.
So should upper b
Hello Fabien,
>
> I think that this method should be used for a>1, and the other very rough one
> can be kept for parameter a in [0, 1), a case which does not make much sense
> to a mathematician as it diverges if unbounded.
Now “a” does not have an upper bound, which is why, when using the iterative algo
Hello Fabien,
> On 5 Aug 2017, at 12:15, Fabien COELHO wrote:
>
>
> Hello Alik,
>
> I've done some math investigations, which consisted in spending one hour with
> Christian, a statistician colleague of mine. He took an old book out of a
> shelf, opened it to page 550 (roughly in the middle
On Fri, Aug 4, 2017 at 3:30 PM, Peter Geoghegan wrote:
> Yura Sokolov of Postgres Pro performed this benchmark at my request.
> He took the 9.5 commit immediately preceding 2ed5b87f9 as a baseline.
I attach a simple patch that comments out the release of the buffer
pin for logged tables where an
Hello Alik,
So I would be in favor of expanding the documentation but not
restricting the parameter beyond avoiding value 1.0.
I have removed the restriction and expanded the documentation in the attached patch v5.
I've done some math investigations, which consisted in spending one hour
with Christia
Hello Peter,
I think that it would also be nice if there was an option to make
functions like random_zipfian() actually return a value that has
undergone perfect hashing. When this option is used, any given value
that the function returns would actually be taken from a random mapping
to some
On Fri, Jul 21, 2017 at 4:51 AM, Alik Khilazhev
wrote:
> (Latest version of pgbench Zipfian patch)
While I'm +1 on this idea, I think that it would also be nice if there
was an option to make functions like random_zipfian() actually return
a value that has undergone perfect hashing. When this opt
On Mon, Jul 31, 2017 at 10:54 AM, Peter Geoghegan wrote:
> Let's wait to see what difference it makes if Alik's zipfian
> distribution pgbench test case uses unlogged tables. That may give us a
> good sense of the problem for cases with contention/concurrency.
Yura Sokolov of Postgres Pro perfor
Robert Haas wrote:
On Mon, Jul 31, 2017 at 1:54 PM, Peter Geoghegan wrote:
That is hard to justify. I don't think that failing to set LP_DEAD hints
is the cost that must be paid to realize a benefit elsewhere, though. I
don't see much problem with having both benefits consistently. It's
actual
On Mon, Jul 31, 2017 at 1:54 PM, Peter Geoghegan wrote:
> That is hard to justify. I don't think that failing to set LP_DEAD hints
> is the cost that must be paid to realize a benefit elsewhere, though. I
> don't see much problem with having both benefits consistently. It's
> actually very unlikel
Robert Haas wrote:
On Thu, Jul 27, 2017 at 10:05 PM, Peter Geoghegan wrote:
I really don't know if that would be worthwhile. It would certainly fix
the regression shown in my test case, but that might not go far enough.
I strongly suspect that there are more complicated workloads where
LP_DEAD
On Thu, Jul 27, 2017 at 10:05 PM, Peter Geoghegan wrote:
> I really don't know if that would be worthwhile. It would certainly fix
> the regression shown in my test case, but that might not go far enough.
> I strongly suspect that there are more complicated workloads where
> LP_DEAD cleanup from S
Amit Kapila wrote:
Isn't it possible to confirm if the problem is due to commit
2ed5b87f9? Basically, if we have unlogged tables, then it won't
release the pin. So if the commit in question is the culprit, then
the same workload should not lead to bloat.
That's a great idea. Alik?
--
Peter
On Wed, Jul 26, 2017 at 3:32 AM, Peter Geoghegan wrote:
> On Fri, Jul 14, 2017 at 5:06 PM, Peter Geoghegan wrote:
>> I think that what this probably comes down to, more than anything
>> else, is that you have leftmost hot/bloated leaf pages like this:
>>
>>
>> idx | level | l_i
Peter Geoghegan wrote:
In Alik's workload, there are two queries: One UPDATE, one SELECT. Even
though the bloated index was a unique index, and so still gets
_bt_check_unique() item killing, the regression is still going to block
LP_DEAD cleanup by the SELECTs, which seems like it might be quit
Robert Haas wrote:
We now see that no update ever kills items within _bt_killitems(),
because our own update to the index leaf page itself nullifies our
ability to kill anything, by changing the page LSN from the one
stashed in the index scan state variable. Fortunately, we are not
really "self-
On Tue, Jul 25, 2017 at 11:02 PM, Peter Geoghegan wrote:
> While the benchmark Alik came up with is non-trivial to reproduce, I
> can show a consistent regression for a simple case with only one
> active backend.
That's not good.
> We now see that no update ever kills items within _bt_killitems(
On Tue, Jul 25, 2017 at 8:02 PM, Peter Geoghegan wrote:
> Setup:
>
> Initialize pgbench (any scale factor).
> create index on pgbench_accounts (aid);
That "create index" was meant to be on "abalance", to make the UPDATE
queries always HOT-unsafe. (You'll want to *also* create this index
once the
On Tue, Jul 25, 2017 at 3:02 PM, Peter Geoghegan wrote:
> I've been thinking about this a lot, because this really does look
> like a pathological case to me. I think that this workload is very
> sensitive to how effective kill_prior_tuples/LP_DEAD hinting is. Or at
> least, I can imagine that mec
On Fri, Jul 14, 2017 at 5:06 PM, Peter Geoghegan wrote:
> I think that what this probably comes down to, more than anything
> else, is that you have leftmost hot/bloated leaf pages like this:
>
>
> idx | level | l_item | blkno | btpo_prev |
> btpo_next | btpo_flags | type | live
Hello!
I realized that I was sending emails as HTML and the latest patch is not visible
in the archive now.
That’s why I am attaching it again.
I am sorry for that.
pgbench-zipf-05v.patch
Description: Binary data
—
Thanks and Regards,
Alik Khilazhev
Postgres Professional:
http://www.postgrespro.
Hmmm. On second thought, maybe one or the other is enough: either restrict the parameter to values where the approximation is good, or provide clear documentation about when the approximation is not very good, but it may still be useful even if not precise. So I would be in favor of expanding the
> I think that developping a test would be much simpler with the improved tap
> test infrastructure, so I would suggest to wait to know the result of the
> corresponding patch.
Ok, I will wait then.
> Also, could you record the patch to CF 2017-09?
> https://commitfest.postgresql.org/14/
Hello Alik,
About the maths: As already said, I'm not at ease with a random_zipfian
function which does not display a (good) zipfian distribution. At the
minimum the documentation should be clear about the approximations
implied depending on the parameter value.
I added one more sentence to d
Hello Fabien,
I am attaching patch v4.
> On 19 Jul 2017, at 17:21, Fabien COELHO wrote:
> About the maths: As already said, I'm not at ease with a random_zipfian
> function which does not display a (good) zipfian distribution. At the
> minimum the documentation should be clear about the
Hello Alik,
I am attaching patch v3.
Among other things I fixed a small typo in the description of the
random_exponential function in the pgbench.sgml file.
Ok. Probably this typo should be committed separately and independently.
A few comments about v3:
Patch applies cleanly, compiles, works.
About t
Hello,
Is this bias expected from the drawing method, say because it is
approximated and the approximation is weak at some points, or is there
an issue with its implementation, say some shift which gets smoothed
down for higher indexes?
I have checked the paper where such an implementation was pr
> On 17 Jul 2017, at 13:51, Fabien COELHO wrote:
>
>
> Is this bias expected from the drawing method, say because it is approximated
> and the approximation is weak at some points, or is there an issue with its
> implementation, say some shift which gets smoothed down for higher indexes?
>
Ok, so you did not get the large bias for i=3. Strange.
I got a large bias for i=3 and theta > 1 even with a million outcomes,
Ok. So this is similar to what I got.
Is this bias expected from the drawing method, say because it is
approximated and the approximation is weak at some points, or
Hello Fabien,
> On 14 Jul 2017, at 17:51, Fabien COELHO wrote:
> Ok, so you did not get the large bias for i=3. Strange.
I got a large bias for i=3 and theta > 1 even with a million outcomes, but for
theta < 1 (I have tested theta = 0.1 and 0.3) it showed quite good results.
I am att
On Fri, Jul 14, 2017 at 6:37 AM, Alik Khilazhev
wrote:
> I am attaching results of tests for 32 and 128 clients that were running for
> 10 minutes, and TPS remained 305 and 115 ktps respectively.
>
> Tests were executed with the configuration set for YCSB. And there is a very
> aggressive autovacuum,
The algorithm works with theta less than 1. The only problem here is that
theta cannot be 1, because of the next line of code:
cell->alpha = 1. / (1 - theta);
That’s why I put such a restriction. Now I see 2 possible solutions for that:
1) Exclude 1, and allow everything in range (0;+∞).
Yep.
2) Or
> On 13 Jul 2017, at 23:13, Peter Geoghegan wrote:
>
> I just figured out that "result.txt" is only a 60 second pgbench run.
> Is the same true for other tests?
Yes, other tests ran 60 seconds too.
>
> It would be much more interesting to see runs that lasted 10 minutes
> or more.
I am atta
> On 13 Jul 2017, at 19:14, Fabien COELHO wrote:
>
> Documentation says that the closer theta is to 0, the flatter the
> distribution,
> but the implementation requires at least 1, including strange error messages:
>
> zipfian parameter must be greater than 1.00 (not 1.00)
>
> Could
On Thu, Jul 13, 2017 at 12:49 PM, Peter Geoghegan wrote:
> To reiterate what I say above:
>
> The number of leaf pages with dead items is 20 with this most recent
> run (128 clients, patched + unpatched). The leftmost internal page one
> level up from the leaf level contains 289 items. Whereas las
On Thu, Jul 13, 2017 at 10:02 AM, Peter Geoghegan wrote:
> The number of leaf pages at the left hand side of the leaf level seems
> to be ~50 less than the unpatched 128 client case was the first time
> around, which seems like a significant difference. I wonder why. Maybe
> autovacuum ran at the
On Thu, Jul 13, 2017 at 4:38 AM, Alik Khilazhev
wrote:
> I am attaching results of test for 32 and 128 clients for original and
> patched(_bt_doinsert) variants.
Thanks.
The number of leaf pages at the left hand side of the leaf level seems
to be ~50 less than the unpatched 128 client case was t
Hello Alik,
A few comments about the patch v2.
Patch applies and compiles.
Documentation says that the closer theta is to 0, the flatter the distribution,
but the implementation requires at least 1, including strange error messages:
zipfian parameter must be greater than 1.00 (not 1.000
> On 13 Jul 2017, at 00:20, Peter Geoghegan wrote:
> Actually, I mean that I wonder how much of a difference it would make
> if this entire block was commented out within _bt_doinsert():
> if (checkUnique != UNIQUE_CHECK_NO)
> {
>     …
> }
I am attaching results of test for 32 and 128 clients for origin
On Wed, Jul 12, 2017 at 2:17 PM, Peter Geoghegan wrote:
> I'd be interested in seeing the difference it makes if Postgres is
> built with the call to _bt_check_unique() commented out within
> nbtinsert.c.
Actually, I mean that I wonder how much of a difference it would make
if this entire block w
On Wed, Jul 12, 2017 at 1:55 PM, Alvaro Herrera
wrote:
> Not to mention work done with a "buffer cleanup lock" held -- which is
> compounded by the fact that acquiring such a lock is prone to starvation
> if there are many scanners of that index. I've seen a case where a very
> hot table is scann
Peter Geoghegan wrote:
> Now, that might not seem like that much of a difference, but if you
> consider how duplicates are handled in the B-Tree code, and how unique
> index enforcement works, I think it could be. It could lead to heavy
> buffer lock contention, because we sometimes do a lot of wo
On Wed, Jul 12, 2017 at 12:30 PM, Peter Geoghegan wrote:
> On Wed, Jul 12, 2017 at 4:28 AM, Alik Khilazhev
> wrote:
>> I am attaching results of the query that you sent. It shows that nothing
>> has changed after executing the tests.
>
> But something did change! In the case where performance w
On Wed, Jul 12, 2017 at 4:28 AM, Alik Khilazhev
wrote:
> I am attaching results of the query that you sent. It shows that nothing
> has changed after executing the tests.
But something did change! In the case where performance was good, all
internal pages on the level above the leaf level have
Hello!
I want to say that our company is already investigating the causes of the
problem and their solution. We also have a few experimental patches that
increase performance for 1000 clients several times over.
In addition, I have fixed thread-safety issues and implemented per-thr
> On 7 Jul 2017, at 21:53, Peter Geoghegan wrote:
> Is it possible for you to instrument the number of B-Tree page
> accesses using custom instrumentation for pgbench_accounts_pkey?
> If that seems like too much work, then it would still be interesting
> to see what the B-Tree keyspace looks like
Hello Alik,
Your description is not very precise. What version of Postgres is used?
If there is a decline, compared to which version? Is there a link to
these results?
The benchmark was done on master (v10). I am attaching an image with the
results.
Ok, thanks.
More precision would be helpful,
On Mon, Jul 10, 2017 at 12:19 PM, Alik Khilazhev wrote:
> Hello, Fabien!
>
> Your description is not very precise. What version of Postgres is used? If
> there is a decline, compared to which version? Is there a link to these
> results?
>
>
> The benchmark was done on master (v10). I am attaching
Hello, Fabien!
> Your description is not very precise. What version of Postgres is used? If
> there is a decline, compared to which version? Is there a link to these
> results?
The benchmark was done on master (v10). I am attaching an image with the
results.
> Indeed, the function computation is ov
On Fri, Jul 7, 2017 at 12:59 PM, Alvaro Herrera
wrote:
> Hmm, this seems potentially very useful. Care to upload it to
> https://wiki.postgresql.org/wiki/Category:Snippets ?
Sure. I've added it here, under "index maintenance":
https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspac
Peter Geoghegan wrote:
> Here is the query:
>
> with recursive index_details as (
> select
> 'pgbench_accounts_pkey'::text idx
> ), [...]
Hmm, this seems potentially very useful. Care to upload it to
https://wiki.postgresql.org/wiki/Category:Snippets ?
--
Álvaro Herrera
On Fri, Jul 7, 2017 at 12:45 AM, Alik Khilazhev
wrote:
> On scale = 10 (1 million rows) it gives the following results on a machine
> with 144 cores (with synchronous_commit=off):
> nclients  tps
> 1         8842.401870
> 2        18358.140869
> 4
On Fri, Jul 7, 2017 at 5:17 AM, Robert Haas wrote:
> How is that possible? In a Zipfian distribution, no matter how big
> the table is, almost all of the updates will be concentrated on a
> handful of rows - and updates to any given row are necessarily
> serialized, or so I would think. Maybe Mo
On Fri, Jul 7, 2017 at 3:45 AM, Alik Khilazhev
wrote:
> PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50%
> UPDATE of a random row by PK) when benchmarking with a big number of clients
> using a Zipfian distribution. MySQL also shows a decline, but it is not as
> significant as it is in
Hello Alik,
PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50%
UPDATE of a random row by PK) when benchmarking with a big number of clients
using a Zipfian distribution. MySQL also shows a decline, but it is not as
significant as it is in PostgreSQL. MongoDB does not have a decline at
al
Hello!
PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50% UPDATE
of a random row by PK) when benchmarking with a big number of clients using a
Zipfian distribution. MySQL also shows a decline, but it is not as significant
as it is in PostgreSQL. MongoDB does not have a decline at all. And i