Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-15 Thread Andrew Borodin
>So on average in a large randomly filled index, pages spend more time nearer 50% full than 100% full. I think we can make this number more...controllable. Before split we can check whether left and right pages both are in shared buffer and if they are seriously under certain fillfactor, say under

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-15 Thread Mark Kirkwood
On 13/08/16 05:44, Jeff Janes wrote: On Fri, Aug 12, 2016 at 1:40 AM, Mark Kirkwood However your index rebuild gets you from 5 to 3 GB - does that really help performance significantly? It can make a big difference, depending on how much RAM you have. Yeah - I suspect this is the issue - loa

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-12 Thread Greg Stark
On Sat, Aug 13, 2016 at 1:18 AM, Andrew Gierth wrote: > > Hmm? The code in _bt_findsplitloc and _bt_checksplitloc doesn't seem to > agree with this. > > (Inserting on the high leaf page is a special case, which is where the > fillfactor logic kicks in; that's why sequentially filled indexes are >

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-12 Thread Andrew Gierth
> "Greg" == Greg Stark writes: >> No, because as the pages split, they fill more slowly (because there >> are now more pages). So on average in a large randomly filled index, >> pages spend more time nearer 50% full than 100% full. This is easy >> to demonstrate by creating a table with a

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-12 Thread Greg Stark
On Fri, Aug 12, 2016 at 8:13 PM, Andrew Gierth wrote: > No, because as the pages split, they fill more slowly (because there are > now more pages). So on average in a large randomly filled index, pages > spend more time nearer 50% full than 100% full. This is easy to > demonstrate by creating a ta

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-12 Thread Andrew Gierth
> "Jeff" == Jeff Janes writes: Jeff> But shouldn't that still leave us with a 75% full index, rather Jeff> than slightly over 50% full? Average is usually about 67%-70%. (For capacity estimation I always assume 66% for a non-sequentially-filled btree.) Jeff> The leaf pages start at 50%,

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-12 Thread Jeff Janes
On Fri, Aug 12, 2016 at 1:40 AM, Mark Kirkwood wrote: > After examining the benchmark design - I see we are probably not being > helped by the repeated insertion of keys all of form 'userxxx' leading > to some page splitting. But shouldn't that still leave us with a 75% full index, rather tha

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-12 Thread Kisung Kim
You're right. Reindex improves the performance of the benchmark workloads dramatically. I'm gathering results and will announce them. But I think we should notice that the results before Reindexing is poorer than MongoDB. It seems that this is because of Btree bloating (not exact expression). The

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-12 Thread Mark Kirkwood
After examining the benchmark design - I see we are probably not being helped by the repeated insertion of keys all of form 'userxxx' leading to some page splitting. However your index rebuild gets you from 5 to 3 GB - does that really help performance significantly? regards Mark On 11

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-10 Thread Kisung Kim
Thank you for your information. Here is the result: After insertions: ycsb=# select * from pgstatindex('usertable_pkey'); version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation -++-

Re: [HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-10 Thread Lukas Fittl
On Wed, Aug 10, 2016 at 4:24 PM, Kisung Kim wrote: > > When I used the index bloating estimation script in > https://github.com/ioguix/pgsql-bloat-estimation, > the result is as follows: > Regardless of the issue at hand, it might make sense to verify these statistics using pgstattuple - those bl

[HACKERS] Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

2016-08-10 Thread Kisung Kim
Hi, I've run YCSB(Yahoo! Cloud Service Benchmark) on PostgreSQL and MongoDB with WiredTiger. And I found some interesting results and some issues(maybe) on Btree index of PostgreSQL. Here is my experiments and results. YCSB is for document store benchmark and I build following schema in PG. CREA