Re: [PERFORM] PostgreSQL disk fragmentation causes performance problems on Windows

2015-05-21 Thread Josh Berkus
On 04/23/2015 12:47 PM, Jan Gunnar Dyrset wrote:
 I think that preallocating lumps of a given, configurable size, say 4
 MB, for the tables would remove this problem. The max number of
 fragments on a 1 GB file would then be  250, which is no problem. Is
 this possible to configure in PostgreSQL? If not, how difficult is it to
 implement in the database?

It is not currently possible to configure.

This has been talked about as a feature, but would require major work on
PostgreSQL to make it possible.  You'd be looking at several months of
effort by a really good hacker, and then a whole bunch of performance
testing.  If you have the budget for this, then please let's talk about
it because right now nobody is working on it.

Note that this could be a dead end; it's possible that preallocating
large extents could cause worse problems than the current fragmentation
issues.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL disk fragmentation causes performance problems on Windows

2015-05-21 Thread Andres Freund
On 2015-05-21 11:54:40 -0700, Josh Berkus wrote:
 This has been talked about as a feature, but would require major work on
 PostgreSQL to make it possible.  You'd be looking at several months of
 effort by a really good hacker, and then a whole bunch of performance
 testing.  If you have the budget for this, then please let's talk about
 it because right now nobody is working on it.

I think this is overestimating the required effort quite a bit. While
not trivial, it's also not that complex to make this work.

Greetings,

Andres Freund


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] union all and filter / index scan - seq scan

2015-05-21 Thread Florian Lohoff

Hi,
i stumbled over something i cant seem to find a workaround. I create a view like

create view v_test as
select  a,b
frombig_table
union all
select  a,b
fromsmall_table;

When i now use the view like

select * from v_test where a = 42;

I can see an index scan happening on big_table. When i issue
something like 

select * from v_test where a in ( select 42 );

or joining to another table i see that there will be seq scan on big
table. First the union will be executed and later the filter e.g. a in (
select 42 ) will be done on the huge result. My use case is that
big_table is 70mio entries growing fast and small_table is like 4
entries, growing little.  The filter e.g. a in ( select 42 ) will
typically select 50-1000 entries of the 70mio. So i now create a union
with 70mio + 4 entries to then filter all with a = 42.

It seems the planner is not able to rewrite a union all e.g. the above
statement could be rewritten from:

select  *
from(
select  a,b
frombig_table
union all
select  a,b
fromsmall_table;
) foo
where   a in ( select 42 );

to 

select  *
from(
select  a,b
frombig_table
where a in ( select 42 )
union all
select  a,b
fromsmall_table
where a in ( select 42 )
) foo

which would then use an index scan not a seq scan and execution times
would be acceptable. 

I have now tried to wrap my head around the problem for 2 days and i am 
unable to find a workaround to using a union but the filter optimisation
is impossible with a view construct.

Flo
PS: Postgres 9.1 - I tried 9.4 on Debian/jessie with IIRC same results.
-- 
Florian Lohoff f...@zz.de
 We need to self-defense - GnuPG/PGP enable your email today!


signature.asc
Description: Digital signature


Re: [PERFORM] union all and filter / index scan - seq scan

2015-05-21 Thread Vitalii Tymchyshyn
It looks pretty much like partitioning. You should check partitioning
recipes.

Чт, 21 трав. 2015 06:41 Florian Lohoff f...@zz.de пише:


 Hi,
 i stumbled over something i cant seem to find a workaround. I create a
 view like

 create view v_test as
 select  a,b
 frombig_table
 union all
 select  a,b
 fromsmall_table;

 When i now use the view like

 select * from v_test where a = 42;

 I can see an index scan happening on big_table. When i issue
 something like

 select * from v_test where a in ( select 42 );

 or joining to another table i see that there will be seq scan on big
 table. First the union will be executed and later the filter e.g. a in (
 select 42 ) will be done on the huge result. My use case is that
 big_table is 70mio entries growing fast and small_table is like 4
 entries, growing little.  The filter e.g. a in ( select 42 ) will
 typically select 50-1000 entries of the 70mio. So i now create a union
 with 70mio + 4 entries to then filter all with a = 42.

 It seems the planner is not able to rewrite a union all e.g. the above
 statement could be rewritten from:

 select  *
 from(
 select  a,b
 frombig_table
 union all
 select  a,b
 fromsmall_table;
 ) foo
 where   a in ( select 42 );

 to

 select  *
 from(
 select  a,b
 frombig_table
 where a in ( select 42 )
 union all
 select  a,b
 fromsmall_table
 where a in ( select 42 )
 ) foo

 which would then use an index scan not a seq scan and execution times
 would be acceptable.

 I have now tried to wrap my head around the problem for 2 days and i am
 unable to find a workaround to using a union but the filter optimisation
 is impossible with a view construct.

 Flo
 PS: Postgres 9.1 - I tried 9.4 on Debian/jessie with IIRC same results.
 --
 Florian Lohoff f...@zz.de
  We need to self-defense - GnuPG/PGP enable your email today!



Re: [PERFORM] union all and filter / index scan - seq scan

2015-05-21 Thread Tom Lane
Florian Lohoff f...@zz.de writes:
 It seems the planner is not able to rewrite a union all

I do not see any problems with pushing indexable conditions down through a
UNION ALL when I try it.  I speculate that either you are using a very old
9.1.x minor release, or the actual view is more complex than you've let on.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL disk fragmentation causes performance problems on Windows

2015-05-21 Thread Vitalii Tymchyshyn
It may be even easier. AFAIR, it's possible just to tell OS expected
allocation without doing it. This way nothing changes for general code,
it's only needed to specify expected file size on creation.

Please see FILE_ALLOCATION_INFO:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364214(v=vs.85).aspx

Чт, 21 трав. 2015 16:39 Andres Freund and...@anarazel.de пише:

 On 2015-05-21 11:54:40 -0700, Josh Berkus wrote:
  This has been talked about as a feature, but would require major work on
  PostgreSQL to make it possible.  You'd be looking at several months of
  effort by a really good hacker, and then a whole bunch of performance
  testing.  If you have the budget for this, then please let's talk about
  it because right now nobody is working on it.

 I think this is overestimating the required effort quite a bit. While
 not trivial, it's also not that complex to make this work.

 Greetings,

 Andres Freund


 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Fastest way / best practice to calculate next birthdays

2015-05-21 Thread David G. Johnston
On Thursday, May 21, 2015, Bosco Rama postg...@boscorama.com wrote:

 On 05/20/15 20:22, David G. Johnston wrote:
  On Monday, May 18, 2015, er.tejaspate...@gmail.com javascript:; 
  er.tejaspate...@gmail.com javascript:; wrote:
 
  If I have to find upcoming birthdays in current week and the current
 week
  fall into different months - how would you handle that?
 
 
  Extract(week from timestamptz_column)
 
  ISO weeks are not affected by month boundaries but do start on Monday.

 There is the year start/end boundary conditions to worry about there.

 If the current week covers Dec28-Jan02 then week of year won't help for
 a birthday on Jan01 or Jan02 if 'today' is in the Dec portion.  Ditto
 for birthday in Dec portion when 'today' is in the Jan portion.


You need to read the documentation regarding ISO year and ISO week more
carefully.  There is no issue with years only ensuring that your definition
of week starts with Monday and contains 7 days.  The ISO year for January
1st can be different than the Gregorian year for the same.

David J.


Re: [PERFORM] Fastest way / best practice to calculate next birthdays

2015-05-21 Thread Bosco Rama
On 05/20/15 20:22, David G. Johnston wrote:
 On Monday, May 18, 2015, er.tejaspate...@gmail.com 
 er.tejaspate...@gmail.com wrote:
 
 If I have to find upcoming birthdays in current week and the current week
 fall into different months - how would you handle that?

 
 Extract(week from timestamptz_column)
 
 ISO weeks are not affected by month boundaries but do start on Monday.

There is the year start/end boundary conditions to worry about there.

If the current week covers Dec28-Jan02 then week of year won't help for
a birthday on Jan01 or Jan02 if 'today' is in the Dec portion.  Ditto
for birthday in Dec portion when 'today' is in the Jan portion.

There is probably a better way to do it than what I'm showing here, but
here's an example:

with x as (
   select now() - (extract(dow from now()) || ' days')::interval as
weekstart
)
select to_char(x.weekstart, '-MM-DD') as first_day,
to_char(x.weekstart + '6 days', '-MM-DD') as last_day
from x;

You could probably make some of that into a function that accepts a
timestamptz and generates the two days.  Or even does the compare too.

HTH.

Bosco.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance