Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
Hi Rob:

On Tue, Aug 23, 2016 at 4:52 PM, Rob Sargent wrote:
> By 'this' I was referring to the optimizations mentioned, and am wondering
> if this holds true under user load.

For that you'll have to refer to the source, or ask someone more versed in pg source arcana.

> Much magic can happen in a custom data load, but do these optimizations
> apply to an application loading single (or perhaps several) records per
> transaction. Does one, in that scenario, not suffer any consequence for
> continuously loading one side of the tree (the rightmost node?).

Not that much magic is necessary. The time I did it, I just needed to detect on every insertion whether I was at the rightmost position (made easier because I had minimum/maximum keys cached in the tree object header) and have a special routine for inserting a new last node (put it in the last page, whose pointer I had, grabbing a new one if needed, whose pointer would be appended at the tail of the parent, etc.). It was just a pruned-down version of the general insert routine, but it made insertions run easily 20 times faster by avoiding nearly every check, since I knew I was on the right edge.

I do not know if pg inserts several items at a time in bulk loading, but I doubt it. Normally every btree indexing library has some optimization for these cases, as they are common, just like every real sort routine has some optimization for presorted input.

Francisco Olarte.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
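The rightmost fast path Francisco describes can be sketched with a toy ordered container in Python (purely illustrative: the class, the names, and the caching scheme are made up here and are not PostgreSQL's actual code):

```python
import bisect


class SortedKeyStore:
    """Toy ordered container showing the rightmost fast path: cache the
    maximum key so an insert at the right edge skips the general search."""

    def __init__(self):
        self.keys = []        # stand-in for the tree's ordered pages
        self.max_key = None   # cached maximum, as in the tree object header

    def insert(self, key):
        if self.max_key is None or key >= self.max_key:
            self.keys.append(key)       # fast path: rightmost position
            self.max_key = key
            return "fast"
        bisect.insort(self.keys, key)   # slow path: general ordered insert
        return "slow"
```

A sequential load takes the append branch almost every time, which is where speedups like the "easily 20 times faster" figure from Francisco's custom tree come from; random keys keep falling back to the general routine.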
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
On 08/23/2016 08:34 AM, Francisco Olarte wrote:
> On Tue, Aug 23, 2016 at 4:28 PM, Rob Sargent wrote:
>> On 08/23/2016 07:44 AM, Francisco Olarte wrote:
>>> On Tue, Aug 23, 2016 at 2:26 PM, pinker wrote:
>>>> I am just surprised by the order of magnitude in the difference though.
>>>> 2 and 27 minutes, that's a huge difference... I did another, simplified
>>>> test, to make sure there are no duplicates and the only difference
>>>> between both sets is the order:
>>> ...
>>>> INSERT INTO t_sequential SELECT * FROM source_sequential;
>>>> 102258,949 ms
>>>> INSERT INTO t_random SELECT * FROM source_random;
>>>> 1657575,699 ms
>>>
>>> If I read correctly, you are getting 100s/10Mkeys=10us/key in
>>> sequential, and 165 in random.
>>>
>>> I'm not surprised at all. I've got greater differences on a memory
>>> tree; sorted insertion can easily be optimized to be very fast. As an
>>> example, sequential insertion can easily avoid moving data while
>>> filling the pages and, with a little care, it can also avoid some of
>>> those moves when splitting. I'm not current with the postgres details,
>>> but it does not surprise me that they have big optimizations for this,
>>> especially since index-ordered insertion is quite common in things like
>>> bulk loads or timestamped log lines.
>>
>> And if each insert is in a separate transaction, does this still hold true?
>
> What are you referring to by 'this'? (BTW, bear in mind one transaction
> needs at least a disk flush and, if done via network, at least one RTT,
> so I doubt you can achieve 10us/transaction unless you have very special
> conditions.)
>
> Francisco Olarte.

By 'this' I was referring to the optimizations mentioned, and am wondering if this holds true under user load. Much magic can happen in a custom data load, but do these optimizations apply to an application loading single (or perhaps several) records per transaction? Does one, in that scenario, not suffer any consequence for continuously loading one side of the tree (the rightmost node)?
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
On Tue, Aug 23, 2016 at 4:28 PM, Rob Sargent wrote:
> On 08/23/2016 07:44 AM, Francisco Olarte wrote:
>> On Tue, Aug 23, 2016 at 2:26 PM, pinker wrote:
>>> I am just surprised by the order of magnitude in the difference though.
>>> 2 and 27 minutes, that's a huge difference... I did another, simplified
>>> test, to make sure there are no duplicates and the only difference
>>> between both sets is the order:
>> ...
>>> INSERT INTO t_sequential SELECT * FROM source_sequential;
>>> 102258,949 ms
>>> INSERT INTO t_random SELECT * FROM source_random;
>>> 1657575,699 ms
>>
>> If I read correctly, you are getting 100s/10Mkeys=10us/key in
>> sequential, and 165 in random.
>>
>> I'm not surprised at all. I've got greater differences on a memory
>> tree; sorted insertion can easily be optimized to be very fast. As an
>> example, sequential insertion can easily avoid moving data while
>> filling the pages and, with a little care, it can also avoid some of
>> those moves when splitting. I'm not current with the postgres details,
>> but it does not surprise me that they have big optimizations for this,
>> especially since index-ordered insertion is quite common in things like
>> bulk loads or timestamped log lines.
>
> And if each insert is in a separate transaction, does this still hold true?

What are you referring to by 'this'? (BTW, bear in mind one transaction needs at least a disk flush and, if done via network, at least one RTT, so I doubt you can achieve 10us/transaction unless you have very special conditions.)

Francisco Olarte.
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
On 08/23/2016 07:44 AM, Francisco Olarte wrote:
> Hi pinker:
>
> On Tue, Aug 23, 2016 at 2:26 PM, pinker wrote:
>> I am just surprised by the order of magnitude in the difference though.
>> 2 and 27 minutes, that's a huge difference... I did another, simplified
>> test, to make sure there are no duplicates and the only difference
>> between both sets is the order:
> ...
>> INSERT INTO t_sequential SELECT * FROM source_sequential;
>> 102258,949 ms
>> INSERT INTO t_random SELECT * FROM source_random;
>> 1657575,699 ms
>
> If I read correctly, you are getting 100s/10Mkeys=10us/key in
> sequential, and 165 in random.
>
> I'm not surprised at all. I've got greater differences on a memory
> tree; sorted insertion can easily be optimized to be very fast. As an
> example, sequential insertion can easily avoid moving data while
> filling the pages and, with a little care, it can also avoid some of
> those moves when splitting. I'm not current with the postgres details,
> but it does not surprise me that they have big optimizations for this,
> especially since index-ordered insertion is quite common in things like
> bulk loads or timestamped log lines.
>
> Francisco Olarte.

And if each insert is in a separate transaction, does this still hold true?
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
Hi pinker:

On Tue, Aug 23, 2016 at 2:26 PM, pinker wrote:
> I am just surprised by the order of magnitude in the difference though. 2
> and 27 minutes, that's a huge difference... I did another, simplified test,
> to make sure there are no duplicates and the only difference between both
> sets is the order:
...
> INSERT INTO t_sequential SELECT * FROM source_sequential;
> 102258,949 ms
> INSERT INTO t_random SELECT * FROM source_random;
> 1657575,699 ms

If I read correctly, you are getting 100s/10Mkeys=10us/key in sequential, and 165 in random.

I'm not surprised at all. I've got greater differences on a memory tree; sorted insertion can easily be optimized to be very fast. As an example, sequential insertion can easily avoid moving data while filling the pages and, with a little care, it can also avoid some of those moves when splitting. I'm not current with the postgres details, but it does not surprise me that they have big optimizations for this, especially since index-ordered insertion is quite common in things like bulk loads or timestamped log lines.

Francisco Olarte.
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
Francisco Olarte wrote
> It's already been said that btrees work that way; if you find it strange,
> read a bit about them, this is completely normal, but ...

I am just surprised by the order of magnitude in the difference though. 2 and 27 minutes, that's a huge difference... I did another, simplified test, to make sure there are no duplicates and the only difference between both sets is the order:

CREATE TABLE source_sequential AS SELECT s FROM generate_series(1,10000000) AS s;
CREATE TABLE source_random AS SELECT * FROM source_sequential ORDER BY random();
CREATE TABLE t_sequential (id bigint);
CREATE INDEX i_sequential ON t_sequential (id);
CREATE TABLE t_random (id bigint);
CREATE INDEX i_random ON t_random (id);
INSERT INTO t_sequential SELECT * FROM source_sequential;
*102258,949 ms*
INSERT INTO t_random SELECT * FROM source_random;
*1657575,699 ms*

--
View this message in context: http://postgresql.nabble.com/Sequential-vs-random-values-number-of-pages-in-B-tree-tp5916956p5917292.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
On Fri, Aug 19, 2016 at 3:20 PM, Daniel Verite wrote:
> There's a simple technique that works on top of a Feistel network,
> called the cycle-walking cipher. Described for instance at:
> http://web.cs.ucdavis.edu/~rogaway/papers/subset.pdf
> I'm using the opportunity to add a wiki page:
> https://wiki.postgresql.org/wiki/Pseudo_encrypt_constrained_to_an_arbitrary_range
> with sample plpgsql code for the [0..10,000,000] range that might be useful
> for other cases.

Nice reference, nice wiki page, nice job. Bookmarking everything.

> But for the btree fragmentation and final size issue, TBH I don't expect
> that constraining the values within that smaller range will make any
> difference in the tests, because it's the dispersion that matters, not
> the values themselves.

Neither do I; that is why I stated it is probably good enough for tests.

> I mean that, whether the values are well dispersed in the [0..1e7] range or
> equally well dispersed in the [0..2**32] range, the probability of a newly
> inserted value to compare greater or lower to any previous values of the list
> should be the same, so shouldn't the page splits be the same, statistically
> speaking?

I know some btrees do prefix-coding of the keys, and some do it even with long integers, and some others do delta-coding for integer keys. I seem to recall postgres does not; that's why I do not think it will make a difference. But anyway, to compare two things like that, as the original poster was doing, I normally prefer to test just one thing at a time. That's why I would normally try to do it by writing a sorted file, shuffling it with sort -R, and copying it, server side if possible, to eliminate so both

Francisco Olarte.
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
Francisco Olarte wrote:
> I think there are some pseudo-random number generators which
> can be made to work with any range, but do not recall which ones right
> now.

There's a simple technique that works on top of a Feistel network, called the cycle-walking cipher. Described for instance at:
http://web.cs.ucdavis.edu/~rogaway/papers/subset.pdf

I'm using the opportunity to add a wiki page:
https://wiki.postgresql.org/wiki/Pseudo_encrypt_constrained_to_an_arbitrary_range
with sample plpgsql code for the [0..10,000,000] range that might be useful for other cases.

But for the btree fragmentation and final size issue, TBH I don't expect that constraining the values within that smaller range will make any difference in the tests, because it's the dispersion that matters, not the values themselves.

I mean that, whether the values are well dispersed in the [0..1e7] range or equally well dispersed in the [0..2**32] range, the probability of a newly inserted value to compare greater or lower to any previous values of the list should be the same, so shouldn't the page splits be the same, statistically speaking?

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
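The cycle-walking idea can be sketched in Python rather than plpgsql (a hedged illustration, not the wiki page's code: the round function, key, and round count here are arbitrary choices). Build a Feistel permutation over the smallest even-bit domain covering the target range, then re-encrypt any out-of-range output until it falls back inside the range:

```python
import hashlib


def feistel_permute(value, n, rounds=3, key=b"demo-key"):
    """Map value in [0, n) to a unique value in [0, n): a Feistel network
    over the smallest 2k-bit domain >= n, plus cycle-walking."""
    k = 1
    while (1 << (2 * k)) < n:
        k += 1                      # domain is [0, 4**k), just covering n
    mask = (1 << k) - 1

    def round_fn(r, half):          # arbitrary keyed round function
        h = hashlib.sha256(key + bytes([r]) + half.to_bytes(8, "big")).digest()
        return int.from_bytes(h[:4], "big") & mask

    def encrypt(x):                 # balanced Feistel: always a bijection
        left, right = x >> k, x & mask
        for r in range(rounds):
            left, right = right, left ^ round_fn(r, right)
        return (left << k) | right

    x = encrypt(value)
    while x >= n:                   # cycle-walk back into [0, n)
        x = encrypt(x)
    return x
```

Because the cipher's domain is less than 4x the target range, the walk terminates after a handful of re-encryptions on average, which is the argument made in the Rogaway paper linked above.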
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
Daniel:

On Thu, Aug 18, 2016 at 5:24 PM, Daniel Verite wrote:
>> unless you know of an easy way to generate a random permutation on the
>> fly without using a lot of memory, I do not.
> It could be done by encrypting the stream.
> For 32-bit integers:
> https://wiki.postgresql.org/wiki/Skip32
> For 64-bit integers:
> https://wiki.postgresql.org/wiki/XTEA

Nearly, and probably good enough for tests, but that only generates a pseudorandom permutation if you encrypt the full 2**32 (or 2**64) domain; applied to the 1..1E7 range it will map those values to 1E7 different numbers scattered across the 2**32 (or 2**64) range. I think there are some pseudo-random number generators which can be made to work with any range, but do not recall which ones right now.

Francisco Olarte.
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
Francisco Olarte wrote:
> unless you know of an easy way to generate a random permutation on the
> fly without using a lot of memory, I do not.

It could be done by encrypting the stream.
For 32-bit integers:
https://wiki.postgresql.org/wiki/Skip32
For 64-bit integers:
https://wiki.postgresql.org/wiki/XTEA

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
Hi:

On Thu, Aug 18, 2016 at 1:32 PM, pinker wrote:
...
> create table t01 (id bigint);
> create index i01 on t01(id);
> insert into t01 SELECT s from generate_series(1,10000000) as s;
>
> and random values:
> create table t02 (id bigint);
> create index i02 on t02(id);
> insert into t02 SELECT random()*100 from generate_series(1,10000000) as s;

It's already been said that btrees work that way; if you find it strange, read a bit about them, this is completely normal, but ...

... what I come to point out is that your test is severely flawed. It probably does not matter in this case, but you are inserting 10M DIFFERENT VALUES in the first case and only 100 in the second one, which is an average of 100K DUPLICATES of each. This affects btrees too. You could try using random()*1G, or at least 100M, for a better test (which may have even worse behaviour; ideally I would just write 10M integers to a disk file, then shuffle it and compare COPY FROM times for both) (unless you know of an easy way to generate a random permutation on the fly without using a lot of memory; I do not).

Francisco Olarte.
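Francisco's proposed methodology (the same keys written twice, differing only in order, then fed to COPY) could be prepared with a small script like this; the helper and file names are invented for illustration, and the in-memory shuffle stands in for his sort -R suggestion, so it only works when the keys fit in RAM:

```python
import random


def write_copy_inputs(n, sorted_path="sorted.txt", shuffled_path="shuffled.txt"):
    """Write the same n keys twice, once sorted and once shuffled, so a
    COPY FROM benchmark differs only in the input order."""
    keys = list(range(1, n + 1))
    with open(sorted_path, "w") as f:
        f.write("\n".join(map(str, keys)) + "\n")
    random.shuffle(keys)            # in-RAM stand-in for `sort -R`
    with open(shuffled_path, "w") as f:
        f.write("\n".join(map(str, keys)) + "\n")
```

The two files can then be loaded server side, e.g. with COPY t_sequential FROM '/path/to/sorted.txt' and the same for the shuffled file, so only the key order varies between the two timed runs.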
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
On 2016-08-18 14:19:25, Ilya Kazakevich wrote:
> >Thank you. So if that is the reason changing the fillfactor parameter
> >should help?
>
> Fillfactor is not about rebalancing, but about page splits. If you have
> many insertions you may decrease fillfactor to minimize page splits, but
> I am not sure it will help in your case. But you should try)
> A better approach is to create the index _after_ insertion, but it is not
> always possible.
>
> Ilya Kazakevich
> JetBrains
> http://www.jetbrains.com
> The Drive to Develop

From the link you pasted:

"Both insertions and deletions are fast as long as space is available on a block. If an insertion won't fit on the block, then some free space on some nearby block must be found and the auxiliary indices adjusted. The hope is that enough space is nearby such that a lot of blocks do not need to be reorganized."

and from the postgres documentation:

fillfactor
The fillfactor for an index is a percentage that determines how full the index method will try to pack index pages. For B-trees, leaf pages are filled to this percentage during initial index build, and also when extending the index at the right (adding new largest key values).

So splitting happens when no room is left on the page. But before that, the room can be used for further insertions...
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
>Thank you. So if that is the reason changing the fillfactor parameter should
>help?

Fillfactor is not about rebalancing, but about page splits. If you have many insertions you may decrease fillfactor to minimize page splits, but I am not sure it will help in your case. But you should try)

A better approach is to create the index _after_ insertion, but it is not always possible.

Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
On 2016-08-18 14:00:31, Ilya Kazakevich wrote:
> Hi,
>
> >What's the reason that postgres needs more index pages to store random
> >data than sequential ones?
>
> I assume that is because a B-tree is a self-balanced tree, so it needs to
> be rebalanced after insertions. Random insertions may go to a part of the
> index where no space is left, leading to a lot of data movement.
> https://en.wikipedia.org/wiki/B-tree#Insertions_and_deletions
>
> Ilya Kazakevich
> JetBrains
> http://www.jetbrains.com
> The Drive to Develop

Thank you. So if that is the reason, changing the fillfactor parameter should help?
Re: [GENERAL] Sequential vs. random values - number of pages in B-tree
Hi,

>What's the reason that postgres needs more index pages to store random
>data than sequential ones?

I assume that is because a B-tree is a self-balanced tree, so it needs to be rebalanced after insertions. Random insertions may go to a part of the index where no space is left, leading to a lot of data movement.
https://en.wikipedia.org/wiki/B-tree#Insertions_and_deletions

Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop
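The page-count effect asked about here can be seen in a toy simulation (illustrative only, not PostgreSQL's actual split logic): pages hold a fixed number of sorted keys, an ordinary overflow splits in half, and an append at the right edge splits unevenly so the left page stays packed, loosely mimicking the special right-edge fillfactor behaviour quoted elsewhere in the thread:

```python
import bisect


def count_splits(keys, cap=8):
    """Insert keys into toy leaf pages of capacity `cap`; return
    (number of page splits, final page count)."""
    pages = [[]]                    # each page is a sorted list of keys
    splits = 0
    for k in keys:
        # pick the page whose key range should contain k
        firsts = [p[0] for p in pages if p]
        i = max(bisect.bisect_right(firsts, k) - 1, 0)
        page = pages[i]
        bisect.insort(page, k)
        if len(page) > cap:         # overflow: the page must split
            if i == len(pages) - 1 and k == page[-1]:
                # right-edge split: leave the left page completely full
                pages[i:i + 1] = [page[:cap], page[cap:]]
            else:
                mid = len(page) // 2
                pages[i:i + 1] = [page[:mid], page[mid:]]
            splits += 1
    return splits, len(pages)
```

With sequential input every finished page is packed full, while shuffled input leaves pages roughly 70% full on average after half-and-half splits, so the same keys end up needing noticeably more pages, which is the index-size difference observed in this thread.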