Re: [GENERAL] Rounding Double Precision or Numeric

2017-06-01 Thread Torsten Förtsch
This is documented in section 8.1.2 in the manual. (
https://www.postgresql.org/docs/9.6/static/datatype-numeric.html)

NUMERIC rounds ties (.5 cases) away from zero.

IEEE 754 based data types (FLOAT, DOUBLE PRECISION) round ties to the
nearest even number ("banker's rounding").
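
If the away-from-zero behaviour is wanted for a DOUBLE PRECISION value,
casting to NUMERIC before rounding should do (a quick sketch):

select round(1232.5::double precision::numeric) as away_from_zero,  -- 1233
       round(1232.5::double precision)          as ties_to_even;    -- 1232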

On Thu, Jun 1, 2017 at 6:26 PM, Louis Battuello  wrote:

> Is the round() function implemented differently for double precision than
> for numeric? Forgive me if this exists somewhere in the documentation, but
> I can't seem to find it.
>
> I've noticed with 9.6 on OSX, the .5 rounding is handled differently
> between the types. (I haven't tested other versions, yet.) For double
> precision values, even whole numbers are rounded down, yet for odds they
> are rounded up. For numeric values, all .5 numbers are rounded up.
>
> psql (9.6.3)
> Type "help" for help.
>
> postgres=# \x
> Expanded display is on.
> postgres=# select round(cast(1230.5 as double precision)) as
> round_double_even_0
> postgres-#   ,round(cast(1231.5 as double precision)) as
> round_double_odd_1
> postgres-#   ,round(cast(1232.5 as double precision)) as
> round_double_even_2
> postgres-#   ,round(cast(1233.5 as double precision)) as
> round_double_odd_3
> postgres-#   ,round(cast(1234.5 as double precision)) as
> round_double_even_4
> postgres-#   ,round(cast(1235.5 as double precision)) as
> round_double_odd_5
> postgres-# ;
>
> -[ RECORD 1 ]---+-
> round_double_even_0 | 1230
> round_double_odd_1  | 1232
> round_double_even_2 | 1232
> round_double_odd_3  | 1234
> round_double_even_4 | 1234
> round_double_odd_5  | 1236
>
> postgres=# select round(cast(1230.5 as numeric)) as round_numeric_even_0
> postgres-#   ,round(cast(1231.5 as numeric)) as round_numeric_odd_1
> postgres-#   ,round(cast(1232.5 as numeric)) as round_numeric_even_2
> postgres-#   ,round(cast(1233.5 as numeric)) as round_numeric_odd_3
> postgres-#   ,round(cast(1234.5 as numeric)) as round_numeric_even_4
> postgres-#   ,round(cast(1235.5 as numeric)) as round_numeric_odd_5
> postgres-# ;
>
> -[ RECORD 1 ]+-
> round_numeric_even_0 | 1231
> round_numeric_odd_1  | 1232
> round_numeric_even_2 | 1233
> round_numeric_odd_3  | 1234
> round_numeric_even_4 | 1235
> round_numeric_odd_5  | 1236
>
> postgres=# select round(1230.5) as round_even_0
>  ,round(1231.5) as round_odd_1
>  ,round(1232.5) as round_even_2
>  ,round(1233.5) as round_odd_3
>  ,round(1234.5) as round_even_4
>  ,round(1235.5) as round_odd_5
> ;
>
> -[ RECORD 1 ]+-
> round_even_0 | 1231
> round_odd_1  | 1232
> round_even_2 | 1233
> round_odd_3  | 1234
> round_even_4 | 1235
> round_odd_5  | 1236
>
> postgres=# \q
>
> Why does the algorithm vary by data type?
>
> Or is something entirely different happening?
>
>


[GENERAL] How to assign default values to psql variables?

2017-02-21 Thread Torsten Förtsch
Hi,

I have a psql script relying on variables passed in on the command line
with -v.

Is there any way to assign a default value in case the -v option is
forgotten?

Here is an example with pgtap:

select diag('should be printed only in verbose mode') where :testverbose;

This can be called with -v testverbose=true or -v testverbose=false.
However, if the assignment is forgotten, it breaks. I'd like it to behave
as if testverbose=false had been passed.
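
One possible workaround (untested sketch): re-assign the variable to itself
and compare against the unexpanded reference; if -v was omitted, psql leaves
the reference as literal text and a default can be picked with \gset:

\set testverbose :testverbose
-- if -v testverbose=... was not given, the variable now holds the literal
-- string ':testverbose', so fall back to false
select case when :'testverbose' = ':testverbose'
            then 'false'
            else :'testverbose'
       end as testverbose \gset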

Thanks.


Re: [GENERAL] change type from NUMERIC(14,4) to NUMERIC(24,12)

2017-01-24 Thread Torsten Förtsch
I found that myself. But ...

postgres=# create table x(n14_4 NUMERIC(14,4), n24_12 NUMERIC(24,12), n
NUMERIC);
CREATE TABLE
postgres=# insert into x select i+.4, i+.12, i+.5234543 from
generate_series(1,100) i;
INSERT 0 100
postgres=# select * from x order by n limit 5;
 n14_4  |     n24_12     |     n
--------+----------------+-----------
 1.4000 | 1.120000000000 | 1.5234543
 2.4000 | 2.120000000000 | 2.5234543
 3.4000 | 3.120000000000 | 3.5234543
 4.4000 | 4.120000000000 | 4.5234543
 5.4000 | 5.120000000000 | 5.5234543
(5 rows)

postgres=# \d x
      Table "tf.x"
 Column |      Type      | Modifiers
--------+----------------+-----------
 n14_4  | numeric(14,4)  |
 n24_12 | numeric(24,12) |
 n      | numeric        |

postgres=# begin; alter table x alter column n14_4 type NUMERIC(24,12);
select * from x order by n limit 5; abort;
BEGIN
ALTER TABLE
     n14_4      |     n24_12     |     n
----------------+----------------+-----------
 1.400000000000 | 1.120000000000 | 1.5234543
 2.400000000000 | 2.120000000000 | 2.5234543
 3.400000000000 | 3.120000000000 | 3.5234543
 4.400000000000 | 4.120000000000 | 4.5234543
 5.400000000000 | 5.120000000000 | 5.5234543
(5 rows)

ROLLBACK
postgres=# \d x
      Table "tf.x"
 Column |      Type      | Modifiers
--------+----------------+-----------
 n14_4  | numeric(14,4)  |
 n24_12 | numeric(24,12) |
 n      | numeric        |

postgres=# select * from x order by n limit 5;
 n14_4  |     n24_12     |     n
--------+----------------+-----------
 1.4000 | 1.120000000000 | 1.5234543
 2.4000 | 2.120000000000 | 2.5234543
 3.4000 | 3.120000000000 | 3.5234543
 4.4000 | 4.120000000000 | 4.5234543
 5.4000 | 5.120000000000 | 5.5234543
(5 rows)

postgres=# begin; update pg_attribute set atttypmod=1572880 where
attrelid='x'::regclass::oid and attname='n14_4'; select * from x order by n
limit 5;
BEGIN
UPDATE 1
 n14_4  |     n24_12     |     n
--------+----------------+-----------
 1.4000 | 1.120000000000 | 1.5234543
 2.4000 | 2.120000000000 | 2.5234543
 3.4000 | 3.120000000000 | 3.5234543
 4.4000 | 4.120000000000 | 4.5234543
 5.4000 | 5.120000000000 | 5.5234543
(5 rows)

postgres=# \d x
      Table "tf.x"
 Column |      Type      | Modifiers
--------+----------------+-----------
 n14_4  | numeric(24,12) |
 n24_12 | numeric(24,12) |
 n      | numeric        |

postgres=# abort;
ROLLBACK

As you can see, after the ALTER TABLE command the n14_4 column is shown
with 12 places after the dot. If I just update atttypmod, it's still only 4
places.

Why is that so? I checked ctid. The ALTER TABLE version does not actually
update the tuple.
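
For reference, the 1572880 used in the UPDATE above can be derived from the
target precision and scale, assuming the usual encoding of numeric typmods
as (precision << 16 | scale) plus the 4-byte varlena header:

select ((24 << 16) | 12) + 4 as atttypmod;   -- 1572880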


On Tue, Jan 24, 2017 at 11:48 AM, Albe Laurenz <laurenz.a...@wien.gv.at>
wrote:

> Torsten Förtsch wrote:
> > we have a large table and want to change the type of one column from
> NUMERIC(14,4) to NUMERIC(24,12).
> > If the new type is just NUMERIC without any boundaries, the operation is
> fast. If (24,12) is
> > specified, it takes ages.
> >
> > I think it takes so long because the database wants to check that all
> data in the table is compatible
> > with the new type. But the old type has stricter boundaries both before
> and after the dot. So, it is
> > compatible. It has to be.
> >
> > Is there a way to change the type skipping the additional check?
> >
> > This is 9.6.
>
> If you don't mind doing something unsupported, you could just modify
> the attribute metadata in the catalog:
>
> test=# CREATE TABLE a(x numeric(14,4));
> CREATE TABLE
> test=# INSERT INTO a VALUES (1234567890.1234);
> INSERT 0 1
> test=# UPDATE pg_attribute
>SET atttypmod = atttypmod + (24 - 14) * 65536 + (12 - 4)
>WHERE attrelid = 'a'::regclass AND attname = 'x';
> UPDATE 1
> test=# \d a
>   Table "public.a"
>  Column |  Type  | Modifiers
> ++---
>  x  | numeric(24,12) |
>
> test=# SELECT * FROM a;
> x
> -
>  1234567890.1234
> (1 row)
>
> Yours,
> Laurenz Albe
>


[GENERAL] change type from NUMERIC(14,4) to NUMERIC(24,12)

2017-01-24 Thread Torsten Förtsch
Hi,

we have a large table and want to change the type of one column
from NUMERIC(14,4) to NUMERIC(24,12). If the new type is just NUMERIC
without any boundaries, the operation is fast. If (24,12) is specified, it
takes ages.

I think it takes so long because the database wants to check that all data
in the table is compatible with the new type. But the old type has stricter
boundaries both before and after the dot. So, it is compatible. It has to
be.

Is there a way to change the type skipping the additional check?

This is 9.6.

Thanks,
Torsten


Re: [GENERAL] psql error (encoding related?)

2017-01-05 Thread Torsten Förtsch
This hex string decodes to something sensible:

$ perl -le 'print pack "H*", shift'
246c69626469722f757466385f616e645f69736f383835395f31
$libdir/utf8_and_iso8859_1

Maybe it rings a bell.
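
The same decoding also works from SQL, for what it's worth:

select convert_from(
         decode('246c69626469722f757466385f616e645f69736f383835395f31', 'hex'),
         'UTF8');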


On Thu, Jan 5, 2017 at 7:57 PM, BRUSSER Michael 
wrote:

> I see this with PostgreSQL 9.4.7 and some 8.x versions running on Linux
> Red Hat.
> Older versions "supposedly" do not exhibit this behavior, but I didn't
> check.
>
> $ psql
> Password:
> psql: FATAL:  could not access file 
> "\x246c69626469722f757466385f616e645f69736f383835395f31":
> No such file or directory
>
> The same error is written to the database log:
> FATAL:  could not access file 
> "\x246c69626469722f757466385f616e645f69736f383835395f31":
> No such file or directory
>
> This is the environment where this problem was reported
> $ locale
> LANG=en_US
> LC_CTYPE="en_US"
> LC_NUMERIC="en_US"
> LC_COLLATE="en_US"
> . . .   . . .. . .
>
> This is the "normal" environment where I don't see any problems launching
> psql
> $  locale
> LANG=en_US.UTF-8
> LC_CTYPE="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> . . .   . . .. . .
>
> The database was created using these parameters "initdb  -E UTF8
>  --locale=C   -D  ... "
>
> When I display the list of databases with \l I see that all of them have
> these parameters:
> EncodingUTF8
> CollateC
> Ctype   C
>
> I can work around this error by doing this "setenv PGCLIENTENCODING UTF-8"
>
> I would like to understand what happens... is this some kind of a bug or
> just an inherent incompatibility between the client (psql) and the database
> settings?
> Is there a different and maybe a better remedy than using PGCLIENTENCODING?
>
> And if anyone from the Postgres team listening... in the old tradition of
> whining I would add that the error message referring to a long hex string
> is not helpful!
> Thanks in advance.
> Michael.
>


[GENERAL] streaming replication protocol in Perl?

2016-12-18 Thread Torsten Förtsch
Hi,

is there a Perl module that can speak the streaming replication
protocol? Can DBD::Pg do that somehow?

I think I could just pipe from pg_recvlogical. But it would be cool to have
it directly in Perl.

Thanks,
Torsten


Re: [GENERAL] Checking data checksums...

2016-12-17 Thread Torsten Förtsch
I use this:

create extension pageinspect;

SELECT count(*) AS pages_read
  FROM (
SELECT c.oid::regclass::text AS rel,
   f.fork,
   ser.i AS blocknr,
   page_header(get_raw_page(c.oid::regclass::text,
f.fork,
ser.i))
  FROM pg_class c
 CROSS JOIN (values ('main'::text),
('fsm'::text),
('vm'::text)) f(fork)
 CROSS JOIN pg_relation_size(c.oid::regclass, f.fork) sz(sz)
 CROSS JOIN generate_series(0,(sz.sz/8192)::int-1) ser(i)
 WHERE sz.sz>0
   ) t1;

The idea is to read just everything. Since a select works only inside one
database, this works only for that database. If you have multiple databases
in a cluster, you need to run it in every one of them.

Note this only works if your page size is the usual 8k. If you have
compiled your postgres otherwise then change 8192 to whatever it is.
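
Alternatively, the hard-coded 8192 could be replaced by the server's actual
block size (a small sketch):

select current_setting('block_size')::int as page_size;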

Also, PG verifies the checksum when it reads a page from storage. So, this
will miss pages that are present in shared_buffers. But assuming that they
came there from storage in the first place, that should be good enough.

Alternatives are something like pg_dumpall >/dev/null. This reads all data
files but probably won't detect problems in indexes. Still, it's a good idea
to do once in a while, to check toasted data for instance.


On Fri, Dec 16, 2016 at 11:07 AM,  wrote:

> Hi,
>
> I enabled data checksums (initdb --data-checksums) on a new instance and
> was wandering is there a command in the psql console, or from the linux
> console, to force a checksum check on the entire cluster and get error
> reports if it finds some corrupted pages.
>
> Regards,
> Mladen Marinović
>
>


Re: [GENERAL] SQL query problem of a Quiz program

2016-12-17 Thread Torsten Förtsch
Did you try DISTINCT ON?

postgres=# table x;
id | qid |  uid
----+-----+------
 1 |  25 |  1
 2 |  25 |  1
 3 |  25 |  1
 4 |  26 |  1
 5 |  26 |  1
 6 |  27 |  1
 7 |  27 |  1
 8 |  25 |  2
 9 |  25 |  2
10 |  25 |  2
11 |  26 |  2
12 |  26 |  2
13 |  27 |  2
14 |  27 |  2
15 |  25 | 
16 |  25 | 
17 |  25 | 
18 |  26 | 
19 |  26 | 
20 |  27 | 
21 |  27 | 
(21 rows)

postgres=# select distinct on (qid, uid) * from x order by uid, qid, id
desc;
id | qid |  uid
----+-----+------
 3 |  25 |  1
 5 |  26 |  1
 7 |  27 |  1
10 |  25 |  2
12 |  26 |  2
14 |  27 |  2
17 |  25 | 
19 |  26 | 
21 |  27 | 
(9 rows)


Is that what you need?
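
If the goal is specifically the last run of the quiz per user, something
along these lines might also work (untested sketch, table and column names
taken from the sample below):

select r.*
  from quiz_results r
  join (select user_id, max(id) as start_id
          from quiz_results
         where question_id = 25
         group by user_id) last_run      -- latest start of the quiz per user
    on r.user_id = last_run.user_id      -- note: rows with NULL user_id drop out here
   and r.id >= last_run.start_id
 order by r.user_id, r.id;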


On Sat, Dec 17, 2016 at 4:25 PM, Arup Rakshit 
wrote:

> Hi,
>
> Here is a sample data from table "quiz_results":
>
> id | question_id |  user_id
> ----+-------------+------------
>   2 |  25 | 5142670086
>   3 |  26 |
>   4 |  26 |
>   5 |  27 |
>   6 |  25 | 5142670086
>   7 |  25 | 5142670086
>   8 |  25 | 5142670086
>   9 |  26 |
>  10 | 40 | 5142670086
>  11 |  29 | 5142670086
>
>
> As you see above, question id 25 appears more than once. This is basically
> a quiz result table that records users' answers as they give them.
> question_id 25 is always the first question. Any user can go through the
> quiz any number of times. So, I want to find the last occurrence of
> question_id 25 for any specific user in the table, and select that and all
> answers the user gave after it, till the end of the quiz. Any idea how to
> solve this in a single efficient query? All my tries didn't work out.
>
>
>
> 
> Regards,
> Arup Rakshit
>


Re: [GENERAL] vacuum freeze in 96

2016-12-14 Thread Torsten Förtsch
On Wed, Dec 14, 2016 at 5:59 AM, Michael Paquier <michael.paqu...@gmail.com>
wrote:

> On Wed, Dec 14, 2016 at 5:00 AM, Torsten Förtsch <tfoertsch...@gmail.com>
> wrote:
> > one of the major enhancements in 96 is skipping completely frozen pages
> in
> > vacuum freeze. I assume that requires a special bit on the page.
>
> The freeze map uses an additional bit in the vm, and pg_upgrade would
> take care of the conversion.
>

Thanks. That makes sense.
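
As an aside, on 9.6 the all-frozen information can be inspected directly
with the pg_visibility extension (a sketch, assuming a table named mytable):

create extension pg_visibility;
-- pages marked all-visible / all-frozen in the visibility map
select * from pg_visibility_map_summary('mytable');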


[GENERAL] vacuum freeze in 96

2016-12-13 Thread Torsten Förtsch
Hi,

one of the major enhancements in 96 is skipping completely frozen pages in
vacuum freeze. I assume that requires a special bit on the page.

If I upgrade from 93 using pg_upgrade, that is re-using the data files, can
it still do that? Or do I have to recreate the table?

Thanks,
Torsten


Re: [GENERAL] is this a known bug in 9.6?

2016-12-13 Thread Torsten Förtsch
Thanks Tom

On Tue, Dec 13, 2016 at 7:22 PM, Tom Lane  wrote:

> I wrote:
> > Yup, sure looks like a bug to me, especially since it seems to work as
> > expected before 9.5.  No idea offhand what broke it.
>
> The answer is, I broke it, through some ill-advised neatnik-ism :-(,
> ie clearing a field I thought would be unused but it wasn't.
>
> Fix pushed.  Thanks for the report!
>
> regards, tom lane
>


Re: [GENERAL] is this a known bug in 9.6?

2016-12-13 Thread Torsten Förtsch
Thanks for confirming.

Here are a few more examples that also work:

with i(x) as (values (1::int)) select x from (select x from i union all
select 3::int) b order by x desc limit 1;

with i(x) as (values (1::int)) select max(x) from (select x from i union
select 3::int) b;

It also works with EXCEPT or INTERSECT, both with or without ALL.

The UNION ALL version fails with MIN and MAX but it works with all other
aggregates that I have tested.


On Tue, Dec 13, 2016 at 4:31 PM, Tom Lane  wrote:

> Torsten Förtsch  writes:
> > postgres=# with i(x) as (values (1::int)), j(y) as (values (2::int))
> select
> > x from (select x from i union all select y from j) b;
> > x
> > ---
> > 1
> > 2
> > (2 rows)
>
> > postgres=# with i(x) as (values (1::int)), j(y) as (values (2::int))
> select
> > max(x) from (select x from i union all select y from j) b;
> > ERROR:  could not find plan for CTE "i"
>
> Yup, sure looks like a bug to me, especially since it seems to work as
> expected before 9.5.  No idea offhand what broke it.
>
> regards, tom lane
>


[GENERAL] is this a known bug in 9.6?

2016-12-13 Thread Torsten Förtsch
Hi,

this is a stock PGDG 9.6:

postgres=# with i(x) as (values (1::int)), j(y) as (values (2::int)) select
x from (select x from i union all select y from j) b;
x
---
1
2
(2 rows)

postgres=# with i(x) as (values (1::int)), j(y) as (values (2::int)) select
max(x) from (select x from i union all select y from j) b;
ERROR:  could not find plan for CTE "i"

The same on 9.5:

postgres=# with i(x) as (values (1::int)), j(y) as (values (2::int)) select
x from (select x from i union all select y from j) b;
x
---
1
2
(2 rows)

postgres=# with i(x) as (values (1::int)), j(y) as (values (2::int)) select
max(x) from (select x from i union all select y from j) b;
max
-
  2
(1 row)

Is this a bug or is my assumption that this should work wrong?

Both the aggregate and the UNION are required to trigger the bug:

postgres=# with i(x) as (values (1::int)) select x from (select x from i
union all select 3::int) b;
x
---
1
3
(2 rows)

postgres=# with i(x) as (values (1::int)) select max(x) from (select x from
i) b;
max
-
  1
(1 row)

postgres=# with i(x) as (values (1::int)) select max(x) from (select x from
i union all select 3::int) b;
ERROR:  could not find plan for CTE "i"


Thanks,
Torsten


Re: [GENERAL] WAL

2016-12-12 Thread Torsten Förtsch
On Mon, Dec 12, 2016 at 12:37 PM, Albe Laurenz <laurenz.a...@wien.gv.at>
wrote:

> Torsten Förtsch wrote:
> > if I do something like this:
> >
> > BEGIN;
> > UPDATE tbl SET data='something' WHERE pkey='selector';
> > UPDATE tbl SET data=NULL WHERE pkey='selector';
> > COMMIT;
> >
> > Given 'selector' actually exists, I get a separate WAL entry for each of
> the updates. My question is,
> > does the first update actually hit the data file?
>
> It should, yes.
>
> > If I am only interested in the first update hitting the WAL, does it
> make sense to do something like
> > the above in a transaction? Would that help to keep the table small in a
> high concurrency situation?
> > The table itself has a small fillfactor. So, in most cases there should
> be enough space to do a HOT
> > update. For that HOT update, is that second update setting data to NULL
> beneficial or rather adverse?
>
> How could the second update *not* be WAL logged?
>
> Maybe you could explain what you are trying to achieve.
>

I am working on a logical decoder plugin, something similar to wal2json. I
don't need 'something' to be permanently in the database. I just need it in
the WAL stream. So, I was thinking that maybe keeping data=NULL permanently
in the table might help to keep the table small. This table has only about
500 rows in my current case. But it is updated very often by different
processes.

So far I have tried it out with short strings for 'something'. The size of
tbl remains fixed and the same with or without the data=NULL update. Also,
I ran both versions 3 times each for 200 transactions. The run time
difference is marginal. In the final version 'something' will be a string
of up to 1kb, I think.

Torsten


[GENERAL] WAL

2016-12-12 Thread Torsten Förtsch
Hi,

if I do something like this:

BEGIN;
UPDATE tbl SET data='something' WHERE pkey='selector';
UPDATE tbl SET data=NULL WHERE pkey='selector';
COMMIT;

Given 'selector' actually exists, I get a separate WAL entry for each of
the updates. My question is, does the first update actually hit the data
file?

If I am only interested in the first update hitting the WAL, does it make
sense to do something like the above in a transaction? Would that help to
keep the table small in a high concurrency situation? The table itself has
a small fillfactor. So, in most cases there should be enough space to do a
HOT update. For that HOT update, is that second update setting data to NULL
beneficial or rather adverse?

Thanks,
Torsten


Re: [GENERAL] logical decoding output plugin

2016-12-11 Thread Torsten Förtsch
On Sun, Dec 11, 2016 at 12:10 AM, Andres Freund <and...@anarazel.de> wrote:

> On 2016-12-10 22:34:02 +0100, Torsten Förtsch wrote:
> > I am working on a logical decoding output plugin. I am only interested in
> > the changes in one particular table.
> >
> > One way to check for the table would be to compare class_form->relname
> > and get_namespace_name(class_form->relnamespace). But I'd much prefer to
> > just compare the OID of the table.
> >
> > Is there a cheap way to do something like "SELECT
> > 'schema.table'::regclass::oid" that can be called in the startup
> callback?
>
> You shouldn't do it in the startup callback (no catalog access there),
> but you can do it the first time through the change callback.
>
> For lookups the most complete lookup is to use RangeVarGetRelid() to get
> the oid. There's other variants, but that's the easiest approach.
>
> Do you have to care about the table being renamed?
>

Thanks, Andres. That would be nice to have but not really necessary, at
least not at the beginning. Is it possible to catch CREATE TABLE and ALTER
TABLE RENAME in 9.5?

Thanks,
Torsten


[GENERAL] logical decoding output plugin

2016-12-10 Thread Torsten Förtsch
Hi,

I am working on a logical decoding output plugin. I am only interested in
the changes in one particular table.

One way to check for the table would be to compare class_form->relname
and get_namespace_name(class_form->relnamespace). But I'd much prefer to
just compare the OID of the table.

Is there a cheap way to do something like "SELECT
'schema.table'::regclass::oid" that can be called in the startup callback?

Thanks,
Torsten


Re: [GENERAL] high transaction rate

2016-12-07 Thread Torsten Förtsch
I think I found it. classid 1262 is pg_database and I seem to remember that
NOTIFY takes that lock. I dropped pg_notify from my function and immediately
got >3500 tx/sec.

On Wed, Dec 7, 2016 at 11:31 AM, Torsten Förtsch <tfoertsch...@gmail.com>
wrote:

> On Wed, Dec 7, 2016 at 11:21 AM, Torsten Förtsch <tfoertsch...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I need to tune my database for a high update rate of a single small
>> table. A little simplified it looks like this:
>>
>> CREATE TABLE temp_agg(
>>   topic TEXT PRIMARY KEY,
>>   tstmp TIMESTAMP,
>>   cnt BIGINT,
>>   sum NUMERIC
>> )
>>
>> The table has 500 rows.
>>
>> A transaction looks simplified like this:
>>
>> 1) select * from temp_agg where topic=$1 for update
>>
>> 2) if date_trunc('second', tstmp)=date_trunc('second', $3) then:
>> 2a) update temp_agg set cnt=cnt+1, sum=sum+$2 where topic=$1
>> 2b) set local synchronous_commit to off
>>
>> 3) if a new second has started:
>> 3a) insert into other_table select * from temp_agg where topic=$1
>> 3b) update temp_agg set tstmp=date_trunc('second', $3) cnt=1, sum=$2
>> where topic=$1
>> 3c) emit a notification (pg_notify(...)) with the new data
>>
>> sorry, hit the wrong button.
>
> As a first test my program starts 10 writers each of which serves 50
> topics. The timestamps are generated in a way that on average 10 timestamps
> per second per topic are produced. That means on average the 2) branch is
> hit 10 times more often.
>
> These 10 writers then flood the database. At first I see a really good
> transaction rate of more than 6500 tx/sec. But after a while it suddenly
> drops to less than 1/10 of that. I also monitored the number of processes
> waiting for locks. As can be seen in this picture, this is a locking
> problem.
>
>
> Here is a snapshot of not granted locks:
> ​
>  locktype | database | relation |  page  | tuple  | virtualxid |
> transactionid | classid | objid | objsubid | virtualtransaction |  pid  |
>mode | granted | fastpath
> --+--+--+++-
> ---+---+-+---+--+---
> -+---+-+-+--
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 15/6381185 | 11468 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 13/6375058 | 11465 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 9/6373397  | 11463 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 11/6380027 | 11464 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 3/447139   | 11133 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 7/6375244  | 11461 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 16/6381714 | 11467 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 10/6380578 | 11460 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 8/6374490  | 11459 |
> AccessExclusiveLock | f   | f
> object   |0 ||  |  |  |
> |1262 | 0 |0 | 12/6377255 | 11462 |
> AccessExclusiveLock | f   | f
>
> What are these object locks here? How can I prevent contention?
>
> This is pgdg postgres 9.5
>
> Thanks,
> Torsten
>


Re: [GENERAL] high transaction rate

2016-12-07 Thread Torsten Förtsch
On Wed, Dec 7, 2016 at 11:21 AM, Torsten Förtsch <tfoertsch...@gmail.com>
wrote:

> Hi,
>
> I need to tune my database for a high update rate of a single small table.
> A little simplified it looks like this:
>
> CREATE TABLE temp_agg(
>   topic TEXT PRIMARY KEY,
>   tstmp TIMESTAMP,
>   cnt BIGINT,
>   sum NUMERIC
> )
>
> The table has 500 rows.
>
> A transaction looks simplified like this:
>
> 1) select * from temp_agg where topic=$1 for update
>
> 2) if date_trunc('second', tstmp)=date_trunc('second', $3) then:
> 2a) update temp_agg set cnt=cnt+1, sum=sum+$2 where topic=$1
> 2b) set local synchronous_commit to off
>
> 3) if a new second has started:
> 3a) insert into other_table select * from temp_agg where topic=$1
> 3b) update temp_agg set tstmp=date_trunc('second', $3) cnt=1, sum=$2 where
> topic=$1
> 3c) emit a notification (pg_notify(...)) with the new data
>
> sorry, hit the wrong button.

As a first test my program starts 10 writers each of which serves 50
topics. The timestamps are generated in a way that on average 10 timestamps
per second per topic are produced. That means on average the 2) branch is
hit 10 times more often.

These 10 writers then flood the database. At first I see a really good
transaction rate of more than 6500 tx/sec. But after a while it suddenly
drops to less than 1/10 of that. I also monitored the number of processes
waiting for locks. As can be seen in this picture, this is a locking
problem.


Here is a snapshot of not granted locks:
​
 locktype | database | relation |  page  | tuple  | virtualxid |
transactionid | classid | objid | objsubid | virtualtransaction |  pid  |
   mode | granted | fastpath
--+--+--++++---+-+---+--++---+-+-+--

object   |0 ||  |  |  |
|1262 | 0 |0 | 15/6381185 | 11468 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 13/6375058 | 11465 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 9/6373397  | 11463 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 11/6380027 | 11464 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 3/447139   | 11133 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 7/6375244  | 11461 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 16/6381714 | 11467 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 10/6380578 | 11460 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 8/6374490  | 11459 |
AccessExclusiveLock | f   | f
object   |0 ||  |  |  |
|1262 | 0 |0 | 12/6377255 | 11462 |
AccessExclusiveLock | f   | f

What are these object locks here? How can I prevent contention?

This is pgdg postgres 9.5

Thanks,
Torsten


[GENERAL] high transaction rate

2016-12-07 Thread Torsten Förtsch
Hi,

I need to tune my database for a high update rate of a single small table.
A little simplified it looks like this:

CREATE TABLE temp_agg(
  topic TEXT PRIMARY KEY,
  tstmp TIMESTAMP,
  cnt BIGINT,
  sum NUMERIC
)

The table has 500 rows.

A transaction looks simplified like this:

1) select * from temp_agg where topic=$1 for update

2) if date_trunc('second', tstmp)=date_trunc('second', $3) then:
2a) update temp_agg set cnt=cnt+1, sum=sum+$2 where topic=$1
2b) set local synchronous_commit to off

3) if a new second has started:
3a) insert into other_table select * from temp_agg where topic=$1
3b) update temp_agg set tstmp=date_trunc('second', $3) cnt=1, sum=$2 where
topic=$1
3c) emit a notification (pg_notify(...)) with the new data
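
Put together, the steps above could look roughly like this as a PL/pgSQL
function (untested sketch; the function and parameter names are made up,
temp_agg and other_table are the tables from above):

create or replace function record_tick(p_topic text, p_val numeric, p_tstmp timestamp)
returns void as $$
declare
    cur temp_agg;
begin
    -- 1) lock the row for this topic
    select * into cur from temp_agg where topic = p_topic for update;

    if date_trunc('second', cur.tstmp) = date_trunc('second', p_tstmp) then
        -- 2) still the same second: just accumulate
        update temp_agg set cnt = cnt + 1, sum = sum + p_val where topic = p_topic;
        set local synchronous_commit to off;
    else
        -- 3) a new second has started: flush the finished row and start over
        insert into other_table select * from temp_agg where topic = p_topic;
        update temp_agg
           set tstmp = date_trunc('second', p_tstmp), cnt = 1, sum = p_val
         where topic = p_topic;
        -- 3c) pg_notify(...) would go here; as it turned out elsewhere in this
        --     thread, it serializes on a pg_database-level lock and kills throughput
    end if;
end;
$$ language plpgsql;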

As a first test my program starts 10 writers each of which serves 50
topics. The timestamps are generated in a way that on average 10 timestamps
per second per topic are produced.


[GENERAL] hot_standby_feedback

2016-11-28 Thread Torsten Förtsch
Hi,

I am in the process of reviewing our configs for a number of 9.3 databases
and found a replica with hot_standby_feedback=on. I remember when we set it
long ago we were fighting cancelled queries. I also remember that it never
really worked for us. In the end we set up 2 replicas, one suitable for
short queries where we prefer low replication lag, and another one where we
allow for long running queries but sacrifice timeliness
(max_standby_*_delay=-1).

I have a hunch why hot_standby_feedback=on didn't work. But I never
verified it. So, here it is. The key is this sentence:

"Feedback messages will not be sent more frequently than once per
wal_receiver_status_interval."

That interval is 10 sec. So, suppose a transaction on the replica starts
using a row right after the feedback message has been sent. Then there is a
10 sec window in which the master cannot know that the row is needed on the
replica and may vacuum it away. If the transaction on the replica then takes
longer than max_standby_*_delay, the only option is to cancel it.

Is that explanation correct?

What is the correct way to use hot_standby_feedback to prevent
cancellations reliably? (and accepting the bloat)
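
Incidentally, the standby keeps per-database counters of such cancellations,
which makes it easy to see whether a setting actually helps:

-- run on the standby
select datname, confl_snapshot, confl_lock, confl_bufferpin, confl_deadlock
  from pg_stat_database_conflicts;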

Thanks,
Torsten


Re: [GENERAL] min/max_wal_size

2016-11-23 Thread Torsten Förtsch
On Tue, Nov 22, 2016 at 10:34 PM, Adrian Klaver <adrian.kla...@aklaver.com>
wrote:

> On 11/22/2016 12:51 PM, Torsten Förtsch wrote:
>
>>
>> Now, I have a database with very low write activity. Archive_command is
>> called about once per hour to archive one segment. When the database was
>> moved to PG 9.5, it was initially configured with insanely high settings
>> for max_wal_size, min_wal_size and wal_keep_segments. I reset
>> min/max_wal_size to the default settings of 80MB and 1GB and reduced
>> wal_keep_segments to 150.
>>
>> I am seeing in pg_xlog the WAL segments from
>>
>> -rw--- 1 postgres postgres 16777216 Nov 17 04:01
>> pg_xlog/0001000400F9
>> ...
>> -rw--- 1 postgres postgres 16777216 Nov 22 20:00
>> pg_xlog/00010005008E
>> -rw--- 1 postgres postgres 16777216 Nov 22 20:19
>> pg_xlog/00010005008F
>> -rw--- 1 postgres postgres 16777216 Nov 15 07:50
>> pg_xlog/000100050090
>> ...
>> -rw--- 1 postgres postgres 16777216 Nov 15 07:52
>> pg_xlog/000100060017
>>
>> As you can see, the files from 1/4/F9 to 1/5/8E are old. That are 150
>> files which matches exactly wal_keep_segments. If I understand
>> correctly, the file 1/5/8F is currently written. Further, the files from
>> 1/5/90 to 1/6/17 seem to be old WAL files that have been renamed to be
>> reused in the future. Their count is 136.
>>
>> Why does a database that generates a little more than 1 WAL file per
>> hour and has a checkpoint_timeout of 30 minutes with a
>> completion_target=0.7 need so many of them? The default value for
>> min_wal_size is 80MB which amounts to 5 segments. That should be totally
>> enough for this database.
>>
>> Is this because of the previously insanely high setting (min=1GB,
>> max=9GB)? Should I expect this value to drop in a week's time? Or is
>> there anything that I am not aware of?
>>
>
> Are you talking about the recycled files?


Yes, I was talking about recycled files, and I think the suspicion that the
number of recycled files will drop over time to min_wal_size is correct.
Overnight the number of recycled files dropped to 123 and, according to the
log, many checkpoints have removed a file and none was added.


[GENERAL] min/max_wal_size

2016-11-22 Thread Torsten Förtsch
Hi,

I am a bit confused about min_wal_size and max_wal_size. Previously, there
was this formula to estimate the max number of WAL segment files in
pg_xlog/ (https://www.postgresql.org/docs/9.4/static/wal-configuration.html
):

  (2 + checkpoint_completion_target) * checkpoint_segments + 1 or
checkpoint_segments + wal_keep_segments + 1

I don't exactly know what the operation "or" means. Before writing this
email I always thought of wal_keep_segments as a parameter that configures
how many segments to keep that would otherwise be deleted and
checkpoint_segments as the number of WAL files the database is allowed to
work with within a checkpoint_timeout interval.

The formula above makes more or less sense. The database is allowed to
write one set of WAL files during the checkpoint interval. While performing
the checkpoint it needs the previous set of WAL files. I don't know where
that checkpoint_completion_target comes in. But I trust the wisdom of the
author of the documentation.

Now, I have a database with very low write activity. Archive_command is
called about once per hour to archive one segment. When the database was
moved to PG 9.5, it was initially configured with insanely high settings
for max_wal_size, min_wal_size and wal_keep_segments. I reset
min/max_wal_size to the default settings of 80MB and 1GB and reduced
wal_keep_segments to 150.

I am seeing in pg_xlog the WAL segments from

-rw--- 1 postgres postgres 16777216 Nov 17 04:01
pg_xlog/0001000400F9
...
-rw--- 1 postgres postgres 16777216 Nov 22 20:00
pg_xlog/00010005008E
-rw--- 1 postgres postgres 16777216 Nov 22 20:19
pg_xlog/00010005008F
-rw--- 1 postgres postgres 16777216 Nov 15 07:50
pg_xlog/000100050090
...
-rw--- 1 postgres postgres 16777216 Nov 15 07:52
pg_xlog/000100060017

As you can see, the files from 1/4/F9 to 1/5/8E are old. That are 150 files
which matches exactly wal_keep_segments. If I understand correctly, the
file 1/5/8F is currently written. Further, the files from 1/5/90 to 1/6/17
seem to be old WAL files that have been renamed to be reused in the future.
Their count is 136.
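
For reference, the segment files can also be counted from SQL (superuser
only; just a convenience sketch):

select count(*) as wal_segments
  from pg_ls_dir('pg_xlog') f
 where f ~ '^[0-9A-F]{24}$';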

Why does a database that generates a little more than 1 WAL file per hour
and has a checkpoint_timeout of 30 minutes with a completion_target=0.7
need so many of them? The default value for min_wal_size is 80MB which
amounts to 5 segments. That should be totally enough for this database.

Is this because of the previously insanely high setting (min=1GB, max=9GB)?
Should I expect this value to drop in a week's time? Or is there anything
that I am not aware of?

Thanks,
Torsten


Re: [GENERAL] Pgbouncer

2015-11-30 Thread Torsten Förtsch
On 30/11/15 22:50, Steve Crawford wrote:
> Do you have any clients connected that are idle in transaction?

No, the backend database is a streaming replica. Also, as I said, the
backend process is in "idle" state not "idle in transaction".

backend_start| 2015-11-30 18:58:30.639962+00
xact_start   | 
query_start  | 2015-11-30 20:56:36.479003+00
state_change | 2015-11-30 20:56:36.480199+00
waiting  | f
state| idle





[GENERAL] Pgbouncer

2015-11-30 Thread Torsten Förtsch
Hi,

I am not sure if this is the right place to ask this question. If not,
please point me to it.

I am trying out the new pgbouncer (latest git). "SHOW SERVERS" is
telling me 2 connections in "active" state. Both show
connect_time=2015-11-30 18:58:30. Request_time is 19:01:35 and 20:56:36.
Both use pool_mode=transaction.

server_lifetime=600, and it is now 21:38:55.

Shouldn't those connections have been closed long ago?

I also checked on the backend. The backend processes are there, are idle
and query_start and state_change times match the request_times above.

Also, since pool_mode=transaction, why are these connections "active"?

Thanks,
Torsten




Re: [GENERAL] postgres function

2015-10-15 Thread Torsten Förtsch
On 15/10/15 14:32, Ramesh T wrote:
>  select position('-' in '123-987-123')
> position
> ---
> 4
> But I want second occurrence,
> position
> -
> 8
> 
> plz any help..?


For instance:

# select char_length(substring('123-987-123' from '^[^-]*-[^-]*-'));
 char_length
-------------
           8
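
Another option, if the goal is literally the position of the second '-',
is to chain position() calls (untested):

select position('-' in '123-987-123')
       + position('-' in substring('123-987-123'
                                   from position('-' in '123-987-123') + 1))
       as second_dash;   -- 8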

Best,
Torsten




Re: [GENERAL] xmin horizon?

2015-07-29 Thread Torsten Förtsch
On 29/07/15 21:13, CS DBA wrote:
 The documentation for pg_stat_activity lists this column:
 
 backend_xmin xid The current backend's xmin horizon.
 
 Can someone point me to a better understanding on xmin horizon?

https://momjian.us/main/writings/pgsql/mvcc.pdf

you can find this talk also on youtube. It's worth watching.

Torsten




[GENERAL] Question about timelines

2015-07-28 Thread Torsten Förtsch
Hi,

we have a complex structure of streaming replication (PG 9.3) like:

master -- replica1
   |
   +- replica2 -- replica21
   |
   +-- replica22 -- replica221

Now I want to retire master and make replica2 the new master:

   +-- replica1
   |
   replica2 -- replica21
   |
   +-- replica22 -- replica221

replica2 is currently a synchronous replica.

If I promote replica2 a new timeline is created. Hence, I have to
instruct all other replicas to follow that new timeline
(recovery_target_timeline = 'latest' in recovery.conf).

On the other hand, since replica2 is synchronous it should have all
transactions when master is shut down. So, I am thinking do I really
need a new timeline? Can't I just remove recovery.conf on replica2 and
restart it as a master. The only thing I then have to do is to point
replica1 to it.

Is that a way to go? If promoting is better, then why?

Thanks,
Torsten




Re: [GENERAL] pg_xlog Concern

2015-05-18 Thread Torsten Förtsch
On 18/05/15 13:44, Sachin Srivastava wrote:
 But currently my pg_xlog size is 60 GB and there are 3740 WAL file in
 this folder and in Last week this was 400 GB(pg_xlog folder) and WAL
 file were approx. 3. Due to archiving pg_xlog folder size is
 decreasing now but it’s taking one week to come in normal size.

Any chance you have unfinished transactions running for a week?

pg_stat_activity should be able to tell you that.
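
For example, something like this should list them (adjust the age threshold
as needed):

select pid, xact_start, state, query
  from pg_stat_activity
 where xact_start < now() - interval '1 day'
 order by xact_start;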

Torsten




[GENERAL] Locking question

2015-02-26 Thread Torsten Förtsch
Hi,

given a query like this:

select *
  from account a
 cross join lateral (
   select rate
 from exchange
where target='USD'
  and source=a.currency
order by date desc
limit 1) e
 where a.id=19
   for update;

If I understand the documentation correctly, both rows, the one from
exchange and the one from account are locked, right?

In fact, I have tried it. This query blocks (currency is 'AUD' for
account #19):

select *
  from exchange
 where target='USD'
   and source='AUD'
 order by date desc
 limit 1
   for update;

However, if I create a SQL function like this:

CREATE OR REPLACE FUNCTION
exchangetousd_rate(
cur CHAR(3),
tm TIMESTAMP DEFAULT now()
) RETURNS TABLE(rate NUMERIC)
AS $def$

SELECT rate
  FROM exchange
 WHERE source = $1
   AND target = 'USD'
   AND date = $2::TIMESTAMP
  ORDER BY date DESC
 LIMIT 1

$def$ LANGUAGE sql STABLE;

and use it here:

select *
  from account a
 cross join exchangeToUSD_rate(a.currency) e
 where a.id=19
   for update;

Then the 2nd query above does not block. So, the row from the exchange
table is not locked.

Is that documented somewhere? Can I rely on it?

The plan for the last query tells me the function call is inlined. So,
in principle it's not different from the first one.
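
For completeness: if the intent is to lock only the account row anyway, the
locking can be restricted explicitly, which sidesteps the question (untested
here):

select *
  from account a
 cross join lateral (
   select rate
     from exchange
    where target='USD'
      and source=a.currency
    order by date desc
    limit 1) e
 where a.id=19
   for update of a;   -- lock only the row from account, not exchange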

Thanks,
Torsten




Re: [GENERAL] 9.3: bug related to json

2015-02-25 Thread Torsten Förtsch
On 25/02/15 07:22, David G Johnston wrote:
 I'm doubting you intended to join a bunch of commas using the field value as
 the delimiter...methinks your got the argument order reversed for
 string_agg.

OMG, I am so stupid. Thanks.




Re: [GENERAL] 9.3: bug related to json

2015-02-25 Thread Torsten Förtsch
On 25/02/15 07:34, David G Johnston wrote:
 Torsten Förtsch wrote
  Is there anything similar for JSON scalars?
 IDK, but have you tried ::text?

yes. Here is the difference

select * from (values (('{"a":"b"}'::json -> 'a')::text),
                      ('{"a":"b"}'::json ->> 'a')) t;
 column1
---------
 "b"
 b

Torsten




[GENERAL] 9.3: bug related to json

2015-02-24 Thread Torsten Förtsch
Hi,

I think I found a json related bug in 9.3.

Given this query:

select *
  from json_array_elements('[{"s":[{"x":1},{"x":2},{"x":5}]},
                             {"s":[{"x":3},{"x":4},{"x":6}]}]')
       t(el)
 cross join lateral (
     select syms.sym ->> 'x' as x
       from json_array_elements(t.el -> 's')
            syms(sym)
 ) s;

It gives me this table:

               el                | x
---------------------------------+---
 {"s":[{"x":1},{"x":2},{"x":5}]} | 1
 {"s":[{"x":1},{"x":2},{"x":5}]} | 2
 {"s":[{"x":1},{"x":2},{"x":5}]} | 5
 {"s":[{"x":3},{"x":4},{"x":6}]} | 3
 {"s":[{"x":3},{"x":4},{"x":6}]} | 4
 {"s":[{"x":3},{"x":4},{"x":6}]} | 6
(6 rows)

So far so good. Now I want to aggregate all the x's:

select *
  from json_array_elements('[{"s":[{"x":1},{"x":2},{"x":5}]},
                             {"s":[{"x":3},{"x":4},{"x":6}]}]')
       t(el)
 cross join lateral (
     select array_agg(syms.sym ->> 'x') as xx
       from json_array_elements(t.el -> 's')
            syms(sym)
 ) s;
               el                |   xx
---------------------------------+---------
 {"s":[{"x":1},{"x":2},{"x":5}]} | {1,2,5}
 {"s":[{"x":3},{"x":4},{"x":6}]} | {3,4,6}
(2 rows)

Still works.

But if I want to string_agg them, I get this:

select *
  from json_array_elements('[{"s":[{"x":1},{"x":2},{"x":5}]},
                             {"s":[{"x":3},{"x":4},{"x":6}]}]')
       t(el)
 cross join lateral (
     select string_agg(', ', syms.sym ->> 'x') as xx
       from json_array_elements(t.el -> 's')
            syms(sym)
 ) s;
               el                |    xx
---------------------------------+----------
 {"s":[{"x":1},{"x":2},{"x":5}]} | , 2, 5,
 {"s":[{"x":3},{"x":4},{"x":6}]} | , 4, 6,
(2 rows)

Note, the first element of the resulting string is always missing.

If the xx is first aggregated as array and then converted to a string,
it works as expected:

select *
  from json_array_elements('[{"s":[{"x":1},{"x":2},{"x":5}]},
                             {"s":[{"x":3},{"x":4},{"x":6}]}]')
       t(el)
 cross join lateral (
     select array_to_string(array_agg(syms.sym ->> 'x'), ', ') as xx
       from json_array_elements(t.el -> 's')
            syms(sym)
 ) s;
               el                |   xx
---------------------------------+---------
 {"s":[{"x":1},{"x":2},{"x":5}]} | 1, 2, 5
 {"s":[{"x":3},{"x":4},{"x":6}]} | 3, 4, 6
(2 rows)

One more question. Originally, my JSON data looked like this:

select *
  from json_array_elements('[{"s":[1,2,5]},
                             {"s":[3,4,6]}]')
       t(el)
 cross join lateral (
     select syms.sym as x   -- problem
       from json_array_elements(t.el -> 's')
            syms(sym)
 ) s;
      el       | x
---------------+---
 {"s":[1,2,5]} | 1
 {"s":[1,2,5]} | 2
 {"s":[1,2,5]} | 5
 {"s":[3,4,6]} | 3
 {"s":[3,4,6]} | 4
 {"s":[3,4,6]} | 6
(6 rows)

The syms.sym field in the x column is a JSON scalar. How do I convert
that to simple TEXT? For JSON objects there is the ->> operator. Is
there anything similar for JSON scalars?


Torsten




[GENERAL] PG user group in the Kuala Lumpur area?

2015-01-18 Thread Torsten Förtsch
Hi,

I was looking for a PG user group around KL. I know there is one in
Singapore. As it happens, Chris Travers, the PG contact for Malaysia is
a friend of mine. So, I asked him. He wasn't aware of one either.
However, he very much appreciated the idea of founding one. I know there
are lots of PG users in the area.

But is there enough demand for a user group? If you are interested,
please contact me.

My idea behind this whole thing is to eventually have a regular PG
conference in South East Asia. I have been to PGconf.eu several times and I
know from experience that it is a great opportunity to learn new stuff,
meet people and also have a lot of fun. I think esp. Malaysia is a good
place for such an event. There are many people out there that could
never come to PGconf.eu or similar in the US and in many other places
because of their passport. Getting a visa to Malaysia is possible for
almost everyone. I don't know about North Korea, but there are many
Iranians around here.

About myself, I am German, currently traveling back and forth between
Germany and Malaysia.

Torsten




Re: [GENERAL] Planet Postgres

2014-10-31 Thread Torsten Förtsch
On 31/10/14 09:38, Magnus Hagander wrote:
 I think in this particular case, much of the blame can be placed on
 everybody being at pgconf.eu last week, and struggling under
 backlogs. But as Stephen says, the backlog will eventually get worked
 through, and the moderation happen.

I see. No problem, I'll just wait. BTW, the reason, or rather the final
trigger, for starting the blog was pgconf.eu.

Thanks for organizing it.

Torsten




[GENERAL] Planet Postgres

2014-10-30 Thread Torsten Förtsch
Hi,

I don't know if this is the right mailing list for my question but I
have already emailed pla...@postgresql.org without any response.

What is the correct way to get a blog listed on planet postgres?

The documented way does not work. I registered my blog on Monday. It
is still in "Not approved yet" state. Yesterday I asked planet@... no
response.

So, what did I do wrong?

Thanks,
Torsten




Re: [GENERAL] some queries on standby preventing replication updates

2014-10-28 Thread Torsten Förtsch
On 28/10/14 19:37, Joe Van Dyk wrote:
 On Mon, Oct 27, 2014 at 6:22 AM, Emanuel Calvo
 <emanuel.ca...@2ndquadrant.com>
 wrote:
 
 
  On 23/10/14 at 17:40, Joe Van Dyk wrote:
  Hi,
 
  I have a master and a slave database.
 
  I've got hot_standby_feedback turned on,
  max_standby_streaming_delay=-1. I've configured the master and slave
  to keep a few days of WALs around.
 
  I've noticed that when some large queries are run on the standby
  machine (ones that take more than a minute or so), replication updates
  are paused. Is there a way to fix this?
 
 You may need to set a value on max_standby_streaming_delay, which
 controls the time
 before cancelling the standby queries when a conflict occurs on a
 wal-records-about-to-be-applied.
 
 Source:
 http://www.postgresql.org/docs/9.3/static/runtime-config-replication.html
 
 
 I'm using -1 for that option, would using something different be better? 

It depends on what you want to achieve. If you want to sacrifice your
long-running query to keep replication going, set the value to 0. If
you (like me) are using the slave to run analytical queries that can
take many hours or even days, I'd rather live with the current
behaviour. When the long-running query is over the wal receiver
automatically reconnects to the master. The only thing you should make
sure is to keep enough wal segments. With 9.4 even that gets easier.
There you can assign a replication slot to the replica and the master
then knows which segments are still needed when the slave reconnects.
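
To illustrate the 9.4 part: a physical replication slot is created on the
master and then referenced from the standby's recovery.conf (the slot name
here is made up):

-- on the master, 9.4+
select pg_create_physical_replication_slot('analytics_replica');
-- and in recovery.conf on the standby:
--   primary_slot_name = 'analytics_replica'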

Torsten




[GENERAL] pg_stat_replication in 9.3

2014-09-14 Thread Torsten Förtsch
Hi,

I noticed a strange behaviour regarding pg_stat_replication in 9.3. If
called from psql using the \watch command, I see all my replicas. From
time to time one of them drops out and reconnects in a short period of
time, typically ~30 sec.

If I use the same select in plpgsql like:

  FOR r in SELECT application_name,
  client_addr,
  flush_location, clock_timestamp() AS lmd
 FROM pg_stat_replication
ORDER BY application_name, client_addr
  LOOP
RAISE NOTICE 'aname=%, ca=%, lmd=%, loc=%, cur=%, lag=%',
 r.application_name, r.client_addr, r.lmd,
 r.flush_location,
 pg_current_xlog_location(),
 pg_size_pretty(
   pg_xlog_location_diff(
 pg_current_xlog_location(),
 r.flush_location
   )
 );
  END LOOP;

I see one of the replicas dropping out but never coming back again, while
in a parallel psql session using \watch it does indeed come back.

Is that intended?

Torsten




Re: [GENERAL] pg_stat_replication in 9.3

2014-09-14 Thread Torsten Förtsch
On 14/09/14 16:24, Andy Colson wrote:
 I wonder if its a transaction thing?  Maybe \watch is using a
 transaction for each (or isn't using transactions at all), whereas the
 plpgsql is one long transaction?
 
 Also if one of your replicas is far away, it doesn't really surprise me
 that it might loose connection every once and a while.  On the other
 hand, if the box is on the same subnet, right next to the master, and it
 was loosing connection, that would be a bad thing.
 
 So, how far away is the replica?  And does 'ps ax|grep postgr' show
 'idle' or 'idle in transaction' on the \watch and the plpgsql?

The replicas are far away, intercontinentally far. I am not complaining
that the replica loses the connection. What makes me wonder is that
within a transaction, pg_stat_replication can forget rows but cannot
acquire new ones. I'd think it should either report the state at the
beginning of the transaction, like now(), or the current state, like
clock_timestamp(). But currently it's reporting half the current state.

Torsten




Re: [GENERAL] pg_stat_replication in 9.3

2014-09-14 Thread Torsten Förtsch
On 14/09/14 18:55, Tom Lane wrote:
 Are you watching the state in a loop inside a single plpgsql function?
 If so, I wonder whether the problem is that the plpgsql function's
 snapshot isn't changing.  From memory, marking the function VOLATILE
 would help if that's the issue.

The function is VOLATILE. I attached 2 versions of it. fn-old.sql does
not work because once a slave has disconnected it drops out and does not
come back. fn.sql uses dblink to work around the problem. But it
consumes 2 db connections.

The intent of the function is to be called between operations that may
cause slaves to lag behind. If the lag is below a certain limit, it
simply returns. Otherwise, it waits until the lag drops below a second
limit.

If it were a VOLATILE problem, the functions would not be able to see
when a slave drops out, nor changes in the data. But they do see these
changes. Only when a slave comes back online is it not seen in the
current transaction.

Torsten
CREATE OR REPLACE FUNCTION wait_for_streaming_lag(low_water_mark BIGINT DEFAULT 100, high_water_mark BIGINT DEFAULT 2000, tmout INTERVAL DEFAULT '4h')
RETURNS BIGINT
AS $def$
DECLARE r  RECORD;
water_mark BIGINT;
BEGIN
SET LOCAL client_min_messages TO ERROR;
CREATE TEMP TABLE IF NOT EXISTS lag (
gen  INT,
application_name TEXT,
client_addr  INET,
flush_location   TEXT,
lmd  TIMESTAMP
);
SET LOCAL client_min_messages TO NOTICE;

water_mark := $2;   -- use high_water_mark for the first loop

LOOP
WITH g AS (SELECT max(gen) AS gen FROM lag),
 r AS (SELECT 1 AS ord, application_name, client_addr, flush_location, clock_timestamp() AS lmd
 FROM pg_stat_replication
UNION ALL
   SELECT 2 AS ord, application_name, client_addr, flush_location, lmd
 FROM lag)
INSERT INTO lag
SELECT coalesce(g.gen+1, 1), rx.*
  FROM (SELECT DISTINCT ON (application_name, client_addr)
   application_name, client_addr, flush_location, lmd
  FROM r
 ORDER BY application_name,
  client_addr,
  ord ASC,
  pg_xlog_location_diff(flush_location, '0/0') ASC) rx CROSS JOIN g;
DELETE FROM lag WHERE gen<(SELECT max(gen) FROM lag);
DELETE FROM lag WHERE lmd<clock_timestamp() - '5min'::INTERVAL;

SELECT INTO r coalesce(max(pg_xlog_location_diff(pg_current_xlog_location(), flush_location)), 0) AS lag,
  clock_timestamp()-now() AS tm FROM lag;

EXIT WHEN r.lag <= water_mark;

IF r.tm>$3 THEN
RAISE EXCEPTION USING
MESSAGE='Timeout while waiting for streaming lag to drop below ' || $1,
ERRCODE='TF001';
END IF;

water_mark := $1;
PERFORM pg_sleep(1);
END LOOP;

RETURN r.lag;
END;
$def$ LANGUAGE plpgsql VOLATILE SECURITY invoker;
BEGIN;

CREATE OR REPLACE FUNCTION wait_for_streaming_lag(low_water_mark BIGINT DEFAULT 100, high_water_mark BIGINT DEFAULT 2000, tmout INTERVAL DEFAULT '4h')
RETURNS BIGINT
AS $def$
DECLARE r  RECORD;
water_mark BIGINT;
BEGIN
-- we need dblink here because pg_stat_replication at least in 9.3,
-- although it does report replicas dropping out, it does not report
-- replicas reconnecting if called in a transaction.
PERFORM dblink_connect('wait_for_streaming_lag', 'dbname=' || current_database() || ' application_name=wait_for_streaming_lag')
  WHERE NOT EXISTS (SELECT 1 FROM unnest(dblink_get_connections()) c(c)
 WHERE c='wait_for_streaming_lag');

SET LOCAL client_min_messages TO ERROR;
CREATE TEMP TABLE IF NOT EXISTS lag (
gen  INT,
application_name TEXT,
client_addr  INET,
flush_location   TEXT,
lmd  TIMESTAMP
);
SET LOCAL client_min_messages TO NOTICE;

water_mark := $2;   -- use high_water_mark for the first loop

LOOP
WITH g AS (SELECT max(gen) AS gen FROM lag),
 r AS (SELECT 1 AS ord, application_name, client_addr, flush_location, clock_timestamp() AS lmd
 FROM dblink('wait_for_streaming_lag', $$
  SELECT application_name, client_addr, flush_location
FROM pg_stat_replication
  $$) repl(application_name TEXT, client_addr INET, flush_location TEXT)
UNION ALL
   SELECT 2 AS ord, application_name, client_addr, flush_location, lmd
 FROM lag)
INSERT INTO lag
SELECT coalesce(g.gen+1, 1), rx.*
  FROM (SELECT DISTINCT ON (application_name, client_addr)
   application_name, client_addr, flush_location, lmd

Re: [GENERAL] check database integrity

2014-07-22 Thread Torsten Förtsch
On 21/07/14 16:17, Tom Lane wrote:
  db=# select page_header(get_raw_page(2836::oid::regclass::text, 'fsm',
  1));
  ERROR:  block number 1 is out of range for relation pg_toast_1255
  db=# select pg_relation_size(2836::oid::regclass, 'fsm');
   pg_relation_size
  --
  24576
 That's bizarre.  AFAICS, pg_relation_size() reduces to a stat() call,
 while the other error looks like it's coming from rawpage.c's check on
 RelationGetNumberOfBlocks() which depends on mdnblocks() which prefers
 to look at the result of lseek(SEEK_END).  But both of those should
 surely get the same answer, if the file's not changing.
 
 Could you trace through it and see where the results diverge?  Also,
 what's the actual size of the file on disk?

# select pg_relation_filepath(2836::oid::regclass);
pg_relation_filepath
--
 base/25317/11790


# ls -l data/base/25317/11790*
-rw--- 1 postgres postgres  8192 Jul 21 07:31 data/base/25317/11790
-rw--- 1 postgres postgres 24576 Jul 21 07:33 data/base/25317/11790_fsm
-rw--- 1 postgres postgres  8192 Jul 21 07:33 data/base/25317/11790_vm

You see, main and vm forks of the relation are one page. Only fsm is 3
pages.

After a fresh restart of the database I attached strace to the backend.
There are only 2 lines in the output that mention that relation:

open(base/25317/11790, O_RDWR)= 35
lseek(35, 0, SEEK_END)  = 8192

This happened during this query:

select get_raw_page(2836::oid::regclass::text, 'fsm', 1);

Shouldn't it rather open 11790_fsm?

Or is there something that first checks the main fork to see if the fsm
page makes sense?

It seems so because here is the same query for a relation where it works:

open(base/25317/60966, O_RDWR)= 39
lseek(39, 0, SEEK_END)  = 1490944
open(base/25317/60966_fsm, O_RDWR)= 40
lseek(40, 8192, SEEK_SET)   = 8192
read(40, \37\1\0\0\360\371\275\212\305\35\0\0\30\0\0 \0 \4
\0\0\0\0\0\0\0\0\372\372\0\372..., 8192) = 8192

First it opens the main fork, then *_fsm where it reads a page at offset
8192.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] check database integrity

2014-07-22 Thread Torsten Förtsch
On 22/07/14 16:58, Tom Lane wrote:
 Doh.  I looked right at this code in get_raw_page yesterday:
 
 if (blkno >= RelationGetNumberOfBlocks(rel))
 elog(ERROR, "block number %u is out of range for relation \"%s\"",
  blkno, RelationGetRelationName(rel));
 
 RelationGetNumberOfBlocks reports the length of the main fork ... but
 this check is applied regardless of which fork we're reading.  Should
 be using RelationGetNumberOfBlocksInFork, of course.

Thanks for fixing it. I saw your GIT commit.

Do you know how long it takes for the change to make it into a new pgdg
debian package?

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] check database integrity

2014-07-20 Thread Torsten Förtsch
Hi,

we are using 9.3 with data checksums enabled. Now I am looking for a way
to check if all database blocks are still intact. First I tried
pg_filedump. In many cases it simply ignored tampered data blocks. It is
probably not made for this task.

Then I remembered about the pageinspect extension. The following select
is a bit too verbose but it seems to do the job for everything except
fsm files.

SELECT c.oid::regclass::text as rel,
   f.fork,
   ser.i as blocknr,
   pg.*
  FROM pg_class c
 CROSS JOIN (values ('main'::text), ('vm'::text)) f(fork)
 CROSS JOIN pg_relation_size(c.oid::regclass, f.fork) sz(sz)
 CROSS JOIN generate_series(0,(sz.sz/8192)::int-1) ser(i)
 CROSS JOIN page_header(get_raw_page(c.oid::regclass::text,
 f.fork,
 ser.i)) pg
 WHERE sz.sz > 0

Am I right?

The problem with the select above is that either page_header() or
get_raw_page() seems to allocate the memory for the page without freeing
it again. The process size grew to ~12.5 GB and the query returned
~1,500,000 rows. And 1.5E6 * 8 kB gives roughly 12 GB. Shared buffers is
~120 MB for this database.

I ran this query in a separate transaction. The memory was freed only
when the backend process exited.

Is there a way to work around this memory leak?

Is there a better way to do what I want? I also thought about pg_dump.
But that does not read indexes, AFAIK. Best if the solution would avoid
expensive locks. Then I could also use it in production. But currently I
need it only to verify a backup.

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] check database integrity

2014-07-20 Thread Torsten Förtsch
On 20/07/14 17:35, Tom Lane wrote:
 Torsten Förtsch <torsten.foert...@gmx.net> writes:
 Then I remembered about the pageinspect extension. The following select
 is a bit too verbose but it seems to do the job for everything except
 fsm files.
 
 SELECT c.oid::regclass::text as rel,
f.fork,
ser.i as blocknr,
pg.*
   FROM pg_class c
  CROSS JOIN (values ('main'::text), ('vm'::text)) f(fork)
  CROSS JOIN pg_relation_size(c.oid::regclass, f.fork) sz(sz)
  CROSS JOIN generate_series(0,(sz.sz/8192)::int-1) ser(i)
  CROSS JOIN page_header(get_raw_page(c.oid::regclass::text,
  f.fork,
  ser.i)) pg
  WHERE sz.sz > 0
 
 The problem with the select above is that either page_header() or
 get_raw_page() seems to allocate the memory for the page without freeing
 it again.
 
 Probably commit 45b0f3572 will help you with that.

Thanks, Tom. At least the patch description helped. I moved the
page_header() call to output column list and now it works perfectly.
I'll try the patch next weekend.

 I ran this query in a separate transaction. The memory was freed only
 when the backend process exited.
 
 AFAIK such memory is released at end of query, even without the patch.
 Are you sure you aren't looking at shared-buffer usage?  Or maybe you're
 on a platform where libc doesn't release freed memory back to the OS.

You are right here. When I wrote the email I restored the behaviour from
my faulty memory. Today I tried it again and the memory is indeed freed
at the end of the query.

Another question, just out of curiosity, for vm and main forks I use
pg_relation_size to figure out the highest page number. That does not
work for fsm. I have at least one fsm file that it 24 kb. Fetching page
0 works, page 1 and above gives an error:

db=# select page_header(get_raw_page(2836::oid::regclass::text, 'fsm', 0));
 page_header
--
 (114/23485F78,19084,0,24,8192,8192,8192,4,0)
(1 row)

db=# select page_header(get_raw_page(2836::oid::regclass::text, 'fsm',
1));
ERROR:  block number 1 is out of range for relation pg_toast_1255
db=# select pg_relation_size(2836::oid::regclass, 'fsm');

 pg_relation_size
--
24576


For other relations it works:

db=# select page_header(get_raw_page(60966::oid::regclass::text, 'fsm',
i)) from generate_series(0,2) i;
  page_header
---
 (11F/76884610,-4342,0,24,8192,8192,8192,4,0)
 (11F/768825C0,22465,0,24,8192,8192,8192,4,0)
 (11F/83E9EC38,-29015,0,24,8192,8192,8192,4,0)
(3 rows)

db=# select pg_relation_size(60966::oid::regclass, 'fsm');
 pg_relation_size
--
24576


Is there a way to figure out the highest page number for fsm forks? Is
there perhaps a common way that works for all forks?
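
Right now the best I can come up with is to derive the page count from the
fork size, something like this (untested sketch, reusing the toast table
from above and reading the block size from the server instead of assuming 8k):

SELECT f.fork,
       pg_relation_size(2836::oid::regclass, f.fork)
         / current_setting('block_size')::int - 1 AS last_page
  FROM unnest(ARRAY['main','fsm','vm']) AS f(fork)
 WHERE pg_relation_size(2836::oid::regclass, f.fork) > 0;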


Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] check database integrity

2014-07-20 Thread Torsten Förtsch
On 20/07/14 16:02, Andrew Sullivan wrote:
 Then I could also use it in production. But currently I
  need it only to verify a backup.

 If you need to verify a backup, why isn't pg_dump acceptable?  Or is
 it that you are somehow trying to prove that what you have on the
 target (backup) machine is in fact production-ready?  I guess I don't
 really understand what you are trying to do.

Sorry, for kind-of misusing the word backup. What I am doing is this. I
took a base backup and replayed a few xlogs. This is what I meant with
backup.

What I want to verify is whether all pages in all files match their
checksums. So, I have to make postgres read all pages at least once.
Pg_dump does this for normal tables and toast. But it does not read
index relations as far as I know. A

  select count(*)

from all tables would also do the job, again without indexes.

The sentence about the backup was only to point out that I don't really
care if the query locks the database for concurrent transactions. But
better if it would not acquire an exclusive lock on all tables.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] updates not causing changes

2014-06-12 Thread Torsten Förtsch
Hi,

our developers use a ORM tool which generates updates that write all
columns of a table. In most cases, however, very few columns actually
change. So, those updates mostly write the same value that already is
there in the column.

Now, if there is an index on such columns, does Postgres recognize the
situation and skip the index update? Similarly, does it skip the FK or
UNIQUE check?
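
Related: I know about the built-in suppress_redundant_updates_trigger(),
which skips such no-op updates entirely before they happen. That is not
quite what I am asking, but for reference a sketch (table name made up):

CREATE TRIGGER z_suppress_redundant_updates
    BEFORE UPDATE ON some_table
    FOR EACH ROW
    EXECUTE PROCEDURE suppress_redundant_updates_trigger();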

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] locking order

2014-06-12 Thread Torsten Förtsch
Hi,

when a row is updated a RowExclusiveLock is obtained on the updated row
and on every related index.

Is the order in which these locks are acquired defined in some way?

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] WAL bandwidth

2014-05-22 Thread Torsten Förtsch
Hi,

time and again I need to build indexes. If they are big, that generates
a lot of WAL data that needs to be replicated to streaming replication
slaves. Usually these slaves don't lag behind noticeably. So, the
application often reads from them. Well, unless I build indexes and,
thus, create a huge amount of WAL in a short period of time.

What I'd like to have is something where I can set the max. bandwidth
with which the index generating backend may generate WAL data. I seem to
remember to have seen a discussion about something similar but can't
recall where.

Is there anything I can do about that problem in 9.3 or 9.4?

I already have a function that waits for the streaming slaves to catch
up. But that mitigates the problem only at a very crude level. I'd like
to be able to set that bandwidth to, say, 10mbit/sec. Then I can be sure
that all my replicas are fine. How long the index creation takes, does
not matter.

Thanks,
Torsten



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] WAL bandwidth

2014-05-22 Thread Torsten Förtsch
On 22/05/14 21:05, Jeff Janes wrote:
 time and again I need to build indexes. If they are big, that
 generates
 a lot of WAL data that needs to be replicated to streaming
 replication
 slaves. Usually these slaves don't lag behind noticeably. So, the
 application often reads from them. Well, unless I build indexes and,
 thus, create a huge amount of WAL in a short period of time.
 
 
 Are these built CONCURRENTLY?

yes

 What I'd like to have is something where I can set the max.
 bandwidth
 with which the index generating backend may generate WAL data. I
 seem to
 remember to have seen a discussion about something similar but can't
 recall where.
 
 Is there anything I can do about that problem in 9.3 or 9.4?
 
 I already have a function that waits for the streaming slaves to
 catch
 up. But that mitigates the problem only at a very crude level.
 I'd like
 to be able to set that bandwidth to, say, 10mbit/sec. Then I can
 be sure
 that all my replicas are fine. How long the index creation
 takes, does
 not matter.
 
 
 This does not appear the domain of PostgreSQL as much as the domain
 of your OS and network layer.
 
 
 The OS and network have little choice but to process the WAL in the
 order it is generated.   If you want to throttle the generation of WAL
 by background maintenance operations so they don't interfere with the
 processing of WAL generated by bread-and-butter transaction processing,
 that is something that only PostgreSQL can do.

That's what I want, to throttle the rate at which WAL is generated by
maintenance operations.

I take it, there is no such thing by now. Would it be a useful addition?

I am not sure if I have the time to implement it. I have had a cursory
look at the code before, just to find out how things work, but never
changed something. What do you think, is it complicated to implement?

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] break table into portions for writing to separate files

2014-05-01 Thread Torsten Förtsch
On 01/05/14 19:50, Seb wrote:
 Hello,
 
 I've been looking for a way to write a table into multiple files, and am
 wondering if there are some clever suggestions.  Say we have a table
 that is too large (several Gb) to write to a file that can be used for
 further analyses in other languages.  The table consists of a timestamp
 field and several numeric fields, with records every 10th of a second.
 It could be meaningfully broken down into subsets of say 20 minutes
 worth of records.  One option is to write a shell script that loops
 through the timestamp, selects the corresponding subset of the table,
 and writes it as a unique file.  However, this would be extremely slow
 because each select takes several hours, and there can be hundreds of
 subsets.  Is there a better way?

# copy (select * from generate_series(1,1000)) to program 'split -l 100
- /tmp/xxx';
COPY 1000
# \q

$ ls -l /tmp/xxxa*
-rw--- 1 postgres postgres 292 May  1 19:08 /tmp/xxxaa
-rw--- 1 postgres postgres 400 May  1 19:08 /tmp/xxxab
-rw--- 1 postgres postgres 400 May  1 19:08 /tmp/xxxac
-rw--- 1 postgres postgres 400 May  1 19:08 /tmp/xxxad
-rw--- 1 postgres postgres 400 May  1 19:08 /tmp/xxxae
-rw--- 1 postgres postgres 400 May  1 19:08 /tmp/xxxaf
-rw--- 1 postgres postgres 400 May  1 19:08 /tmp/xxxag
-rw--- 1 postgres postgres 400 May  1 19:08 /tmp/xxxah
-rw--- 1 postgres postgres 400 May  1 19:08 /tmp/xxxai
-rw--- 1 postgres postgres 401 May  1 19:08 /tmp/xxxaj

Each of those contains 100 lines.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] arrays of rows and dblink

2014-04-30 Thread Torsten Förtsch
Hi,

we have the ROW type and we have arrays. We also can create arrays of
rows like:

select array_agg(r)
  from (values (1::int, 'today'::timestamp, 'a'::text),
   (2, 'yesterday', 'b')) r(a,b,c);
 array_agg
---
 {"(1,\"2014-04-30 00:00:00\",a)","(2,\"2014-04-29 00:00:00\",b)"}


Now I want to execute that query via dblink on a remote server. How do I
specify the result type?

select tb.* from dblink( 'dbname=postgres', $$
select array_agg(r)
  from (values (1::int, 'today'::timestamp, 'a'::text),
               (2, 'yesterday', 'b')) r(a,b,c)
$$) tb( WHAT DO I PUT HERE? )

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] arrays of rows and dblink

2014-04-30 Thread Torsten Förtsch
On 30/04/14 20:19, David G Johnston wrote:
 ISTM that you have to CREATE TYPE ... as appropriate then 
 
 ... tb ( col_alias type_created_above[] )
 
 There is only so much you can do with anonymous types (which is what the ROW
 construct creates; ROW is not a type but an expression anchor - like
 ARRAY[...]) that tells the parser how to interpret what follows.

I thought so. Do I have to create the type in both databases or only on
the receiving site?
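
In other words, something like this (untested sketch, assuming the type
only needs to exist on the receiving side where dblink parses the text):

CREATE TYPE abc AS (a INT, b TIMESTAMP, c TEXT);

SELECT tb.*
  FROM dblink('dbname=postgres', $$
    SELECT array_agg(r)
      FROM (VALUES (1::int, 'today'::timestamp, 'a'::text),
                   (2, 'yesterday', 'b')) r(a,b,c)
  $$) AS tb(arr abc[]);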

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How to find out PIDs of transactions older than the current?

2014-04-25 Thread Torsten Förtsch
Hi,

I think I can find out the transaction ids of concurrent transactions
older than the current one by:

  select * from txid_snapshot_xip(txid_current_snapshot())
  union
  select * from txid_snapshot_xmax(txid_current_snapshot());

Now, I want to map these transaction ids to backend process ids.
pg_stat_activity does not provide the transaction id. So, I turned to
pg_locks.

select l.pid
  from (
  select * from txid_snapshot_xip(txid_current_snapshot())
  union
  select * from txid_snapshot_xmax(txid_current_snapshot())) tx(id)
  join pg_locks l
on (l.locktype='transactionid'
and l.transactionid::TEXT::BIGINT=tx.id);

This works. But my transaction ids are still far less than 2^32.

Will it also work after the wraparound? I am worried because there is no
default cast from XID to INT or BIGINT.

Is there a better way?
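
If I remember correctly, 9.4 will expose the transaction id directly in
pg_stat_activity, which would make this much simpler (untested, not
available in 9.3):

-- 9.4+: backend_xid / backend_xmin are reported per backend
SELECT pid, backend_xid, backend_xmin
  FROM pg_stat_activity
 WHERE backend_xid IS NOT NULL;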

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find out PIDs of transactions older than the current?

2014-04-25 Thread Torsten Förtsch
On 25/04/14 13:26, Torsten Förtsch wrote:
 I think I can find out the transaction ids of concurrent transactions
 older than the current one by:
 
   select * from txid_snapshot_xip(txid_current_snapshot())
   union
   select * from txid_snapshot_xmax(txid_current_snapshot());
 
 Now, I want to map these transaction ids to backend process ids.
 pg_stat_activity does not provide the transaction id. So, I turned to
 pg_locks.
 
 select l.pid
   from (
   select * from txid_snapshot_xip(txid_current_snapshot())
   union
   select * from txid_snapshot_xmax(txid_current_snapshot())) tx(id)
   join pg_locks l
 on (l.locktype='transactionid'
 and l.transactionid::TEXT::BIGINT=tx.id);
 
 This works. But my transaction ids are still far less than 2^32.

I think I got it. pg_locks.transactionid is a 4-byte quantity. But
I can safely ignore the upper half of the BIGINT that comes out of
txid_snapshot_xip(). So, the query becomes:

select l.pid
  from (
  select * from txid_snapshot_xip(txid_current_snapshot())
  union
  select * from txid_snapshot_xmax(txid_current_snapshot())) tx(id)
  join pg_locks l
on (l.locktype='transactionid'
    and l.transactionid::TEXT::BIGINT=tx.id & (1::BIGINT<<32)-1)

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Is this a planner bug?

2014-04-22 Thread Torsten Förtsch
Hi,

I got this plan:

Limit  (cost=0.00..1.12 rows=1 width=0)
   ->  Seq Scan on fmb  (cost=0.00..6964734.35 rows=6237993 width=0)
 Filter: ...

The table has ~80,000,000 rows. So, the filter, according to the plan,
filters out 90% of the rows. Although the cost for the first row to
come out of the seqscan might be 0, the cost for the first row to pass
the filter and, hence, to hit the limit node is probably higher.

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is this a planner bug?

2014-04-22 Thread Torsten Förtsch
On 22/04/14 14:24, Pavel Stehule wrote:
 what is your effective_cache_size in postgresql.conf?
 
 What is random_page_cost and seq_page_cost?
 

8GB, 4, 1

But I am not asking about how to get a different plan or how to optimize
the query. I know that.

What I'm asking is the following. Assuming a node without any filter has a
startup cost C1, a total cost of C2 and produces N rows. Now, a filter
is applied which passes through M rows. Then the startup cost for the
node *with* the filter applied should be different from C1 because a
certain amount of rows from the beginning is filtered out, right?

I think the startup cost should be something like

  C1 + (C2+N*F-C1)*M/N   or   C1 + 0.5*(C2+N*F-C1)*M/N

where F is the cost to apply the filter to one row.

On average only one out of N/M rows matches the filter. So we need to
fetch N/M rows to produce the first row out of the filter. Now, you can
argue that we don't know where in that set the first matching row is. On
average it would probably be in the middle. That's where the 0.5 comes from.

I certainly got it wrong somewhere. But I think you got the idea.

If not the seqscan node, then the limit node should have a startup cost
greater than 0 (depending where the filter is taken into account). In my case the
startup cost for the limit node should be somewhere between 25 and
30.

Torsten

 2014-04-22 14:10 GMT+02:00 Torsten Förtsch <torsten.foert...@gmx.net>:
 
 Hi,
 
 I got this plan:
 
 Limit  (cost=0.00..1.12 rows=1 width=0)
 ->  Seq Scan on fmb  (cost=0.00..6964734.35 rows=6237993 width=0)
  Filter: ...
 
 The table has ~80,000,000 rows. So, the filter, according to the plan,
 filters out 90% of the rows. Although the cost for the first row to
 come out of the seqscan might be 0, the cost for the first row to pass
 the filter and, hence, to hit the limit node is probably higher.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is this a planner bug?

2014-04-22 Thread Torsten Förtsch
On 22/04/14 16:39, Albe Laurenz wrote:
 Could you run EXPLAIN ANALYZE for the query with enable_seqscan
 on and off?  I'd be curious
 a) if the index can be used
 b) if it can be used, if that is actually cheaper
 c) how the planner estimates compare with reality.
 

Using the index:

Limit  (cost=0.57..2.95 rows=1 width=0)
   (actual time=0.095..0.095 rows=1 loops=1)
   ->  Index Scan ... (cost=0.57..14857285.83 rows=6240539 width=0)
  (actual time=0.095..0.095 rows=1 loops=1)
 Index Cond:...
 Filter: ...
 Rows Removed by Filter: 4
 Total runtime: 0.147 ms


seq scan:

Limit  (cost=0.00..1.12 rows=1 width=0)
   (actual time=0.943..0.944 rows=1 loops=1)
   ->  Seq Scan ...  (cost=0.00..6967622.77 rows=6240580 width=0)
 (actual time=0.940..0.940 rows=1 loops=1)
 Filter: ...
 Rows Removed by Filter: 215
 Total runtime: 0.997 ms

In these cases all the stuff comes from cache hits. When I first tried
the query it used a seq scan and it took several seconds. In this case
only setting random_page_cost less than seq_page_cost would make the
planner use the index.


I think if we had separate filter nodes, just like SORT nodes, then it
would be clearer that the setup cost of the seq scan with filter cannot
be 0.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is this a planner bug?

2014-04-22 Thread Torsten Förtsch
On 22/04/14 16:45, Tom Lane wrote:
 No.  The model is that startup cost is what's expended before the scan can
 start, and then the run cost (total_cost - startup_cost) is expended while
 scanning.  Applying a filter increases the run cost and also reduces the
 number of rows returned, but that's got nothing to do with startup cost.
 
 As a comparison point, imagine an index scan that has a filter condition
 in addition to the indexable condition (which let's assume selects
 multiple rows).  The startup cost for such a plan corresponds to the index
 descent costs.  The run cost corresponds to scanning the index entries
 matching the indexable condition, fetching the heap rows, and applying the
 filter condition.
 
 Or in other words, time to get the first result row is not just startup
 cost; it's startup cost plus run_cost/N, if the plan is estimated to
 return N rows altogether.

Ok, I understand that's the way the model is.

The point is that especially in presence of a LIMIT 1 there is a
difference between a seq scan that has to fetch a few 10MB to find the
first and only row and an index scan that has to process perhaps  a few
kb. And in this case even setting random_page_cost=seq_page_cost didn't
help.

If that query were part of a larger one, I wouldn't want to fiddle with
the cost parameters to get one part of the query fast only to sacrifice
performance in another part.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_stat_replication.state: streaming/catchup

2014-04-21 Thread Torsten Förtsch
Hi,

just out of curiosity, what's the difference between streaming and
catchup state in pg_stat_replication.

According to the documentation this field is Current WAL sender state.
But that does not tell me anything.

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_stat_replication.state: streaming/catchup

2014-04-21 Thread Torsten Förtsch
On 21/04/14 13:18, Michael Paquier wrote:
 When a standby connects for the first time to a primary, it is not yet
 synchronized, this is the catchup phase. Once the lag between the
 standby and the master is reduced to zero for the first time,
 replication state changes to streaming.

Thanks.

I am seeing several standbys changing from streaming to catchup and
back. Sometimes they also get completely lost. This happens when the lag
becomes high, hundreds of MB or even GB. The standby servers are
relatively far away on the internet. And the operation to generate this
kind of lag is index creation on large tables.

Is there a difference in the protocol used in those phases? Maybe the
catchup phase is optimized for bulk throughput?

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Disable an index temporarily

2014-04-20 Thread Torsten Förtsch
On 20/04/14 03:02, Sergey Konoplev wrote:

Thanks for you reply.

 an index can be INVALID (pg_index.indisvalid=false).

 I want to temporarily disable an index so that it won't be used to
 access data but will still be updated.

 Can I simply set pg_index.indisvalid=false and later turn it true again?
 
 It works on a quick test, but I'm not sure how safe it is.
 
 If you need to test a query without the index use a transaction:
 
 Begin;
 Drop index ...;
 Explain ... select ...;
 Rollback;

I know about that.

The problem is I have a number of indexes in a large system that are
very similar. And I suspect some of them are superfluous.

Example:

btree (fmb_id, action_type)
btree (fmb_id)

Action_type in this case is one out of a handful of values (should
perhaps be an ENUM but is TEXT) and for most of the table the
combination of (fmb_id, action_type) is unique. The table itself has
~2E8 rows. So it takes a while to build these indexes from scratch.

Now, we have several performance counters in place. I want to disable
these indexes one by one and see what happens. I am probably not able to
find all of the queries that use them. But I believe that nothing much
happens if I drop one of them (preferably the former?).

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Disable an index temporarily

2014-04-20 Thread Torsten Förtsch
On 20/04/14 12:08, Thomas Kellerer wrote:
 Example:

  btree (fmb_id, action_type)
  btree (fmb_id)

[...]
 
 Btw: in the above example the second one is definitely not needed.
 Any query that uses the second one (single column) can also use the
 first one.

I know.

But the single column index is 3534 MB, the 2-column one 4963 MB. The
number of rows per distinct fmb_id is very small, usually 1 or 2. So, if
a query looks for fmb_id=A and action_type=B, it has to filter out
only one row if the first index is not available. Hence, I thought maybe
the lower size of the index and the higher compactness per page could
outweigh the more direct access provided by the 2-column index.

I am quite sure there is no query that qualifies for an index-only scan
on the 2-column index.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Disable an index temporarily

2014-04-19 Thread Torsten Förtsch
Hi,

an index can be INVALID (pg_index.indisvalid=false).

I want to temporarily disable an index so that it won't be used to
access data but will still be updated.

Can I simply set pg_index.indisvalid=false and later turn it true again?
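
Something like this is what I have in mind (untested, needs superuser,
index name made up):

UPDATE pg_index SET indisvalid = false
 WHERE indexrelid = 'my_schema.my_index'::regclass;

-- run the workload, watch the performance counters, then:

UPDATE pg_index SET indisvalid = true
 WHERE indexrelid = 'my_schema.my_index'::regclass;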

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Why do we need an AccessExclusiveLock to validate a FK constraint marked as NOT VALID?

2014-04-13 Thread Torsten Förtsch
Hi,

currently, ALTER TABLE VALIDATE CONSTRAINT for foreign key constraints
acquires an AccessExclusiveLock on the referencing table.

Why?

If the constraint is in place but not validated (ADD CONSTRAINT ... NOT
VALID) it already prevents new modifications from violating the constraint.

The code that is called to validate the constraint, RI_Initial_Check,
contains this comment:

 * We expect that the caller has made provision to prevent any problems
 * caused by concurrent actions. This could be either by locking rel and
 * pkrel at ShareRowExclusiveLock or higher, or by otherwise ensuring
 * that triggers implementing the checks are already active.
 * Hence, we do not need to lock individual rows for the check.

Doesn't the presence of the NOT VALID constraint qualify as otherwise
ensuring that triggers implementing the checks are already active?

Is there any deeper reason? Or is it simply not implemented yet?

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why do we need an AccessExclusiveLock to validate a FK constraint marked as NOT VALID?

2014-04-13 Thread Torsten Förtsch
On 13/04/14 13:34, Vik Fearing wrote:
 Actually, it is implemented yet.
 
 http://www.postgresql.org/message-id/e1wwovd-0004ts...@gemulon.postgresql.org
 
 It'll be in 9.4.

That's good news.

So, I could validate a FK constraint this way:

UPDATE pg_constraint
   SET convalidated = NOT EXISTS(
SELECT 1
  FROM ONLY fkrel a
  LEFT JOIN ONLY pkrel b
ON (a.fkcol1=b.pkcol1 AND ...)   -- all fk columns
 WHERE b.pkcol1 IS NULL  -- inner join failed
   AND (a.fkcol1 IS NOT NULL
OR/AND   -- MATCH SIMPLE: AND; FULL: OR
a.fkcol2 IS NOT NULL
...)
   )
 WHERE contype='f'
   AND ...

fkrel is confrelid::regclass and pkrel conrelid::regclass.

That's essentially what AT VALIDATE CONSTRAINT does.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Indices and Foreign Tables

2014-04-04 Thread Torsten Förtsch
On 04/04/14 17:14, Kohler Manuel (ID SIS) wrote:
 Here is the same query with fdw:
 
 db=# EXPLAIN VERBOSE select ds.code, count(*), sum(dsf.size_in_bytes) as 
 raw_size,pg_size_pretty(sum(dsf.size_in_bytes)) as size from 
 data_set_files_fdw dsf, data_sets_fdw ds where dsf.parent_id is null and 
 dsf.dase_id=ds.id group by ds.code order by raw_size desc;
   QUERY PLAN
 ---
  Sort  (cost=327.81..327.97 rows=64 width=40)
Output: ds.code, (count(*)), (sum(dsf.size_in_bytes)), 
 (pg_size_pretty(sum(dsf.size_in_bytes)))
Sort Key: (sum(dsf.size_in_bytes))
    ->  HashAggregate  (cost=325.09..325.89 rows=64 width=40)
  Output: ds.code, count(*), sum(dsf.size_in_bytes), 
 pg_size_pretty(sum(dsf.size_in_bytes))
          ->  Hash Join  (cost=270.61..324.45 rows=64 width=40)
Output: dsf.size_in_bytes, ds.code
Hash Cond: (ds.id = (dsf.dase_id)::bigint)
                ->  Foreign Scan on public.data_sets_fdw ds  
 (cost=100.00..148.40 rows=1280 width=40)
  Output: ds.id, ds.code, ds.location
  Remote SQL: SELECT id, code FROM public.data_sets
                ->  Hash  (cost=170.48..170.48 rows=10 width=16)
  Output: dsf.size_in_bytes, dsf.dase_id
                      ->  Foreign Scan on public.data_set_files_fdw dsf  
 (cost=100.00..170.48 rows=10 width=16)
Output: dsf.size_in_bytes, dsf.dase_id
Filter: (dsf.parent_id IS NULL)
Remote SQL: SELECT dase_id, parent_id, 
 size_in_bytes FROM public.data_set_files
 (17 rows)
 
 And also with ANALYZE:
 
 db=# EXPLAIN ANALYZE select ds.code, count(*), sum(dsf.size_in_bytes) as 
 raw_size,pg_size_pretty(sum(dsf.size_in_bytes)) as size from 
 data_set_files_fdw dsf, data_sets_fdw ds where dsf.parent_id is null and 
 dsf.dase_id=ds.id group by ds.code order by raw_size desc;
   
 QUERY PLAN
 --
  Sort  (cost=327.81..327.97 rows=64 width=40) (actual 
 time=219401.864..219414.641 rows=13839 loops=1)
Sort Key: (sum(dsf.size_in_bytes))
Sort Method: quicksort  Memory: 2283kB
    ->  HashAggregate  (cost=325.09..325.89 rows=64 width=40) (actual 
 time=219327.664..219363.709 rows=13839 loops=1)
          ->  Hash Join  (cost=270.61..324.45 rows=64 width=40) (actual 
 time=219127.848..219277.308 rows=13839 loops=1)
Hash Cond: (ds.id = (dsf.dase_id)::bigint)
                ->  Foreign Scan on data_sets_fdw ds  (cost=100.00..148.40 
 rows=1280 width=40) (actual time=1.057..77.415 rows=13839 loops=1)
                ->  Hash  (cost=170.48..170.48 rows=10 width=16) (actual 
 time=219126.713..219126.713 rows=13839 loops=1)
  Buckets: 1024  Batches: 1  Memory Usage: 649kB
                      ->  Foreign Scan on data_set_files_fdw dsf  
 (cost=100.00..170.48 rows=10 width=16) (actual time=1082.614..219083.326 
 rows=13839 loops=1)
Filter: (parent_id IS NULL)
Rows Removed by Filter: 35726596
  Total runtime: 219438.925 ms
 (1

Do you know that you can combine VERBOSE and ANALYZE in one EXPLAIN?

  EXPLAIN (ANALYZE, VERBOSE) SELECT ...

The problem is that your qualifier is not pushed down to the backend
database. So, you pull (35726596+13839) rows (which is the complete
table) from data_set_files in the remote database, filter them locally
and throw 35726596 of them away. Then you copy the remote data_sets
table (13839 rows). And then you join them.

I don't think use_remote_estimate can help a lot in this situation.

We tried to use postgres_fdw in a similar approach last year. Then
switched to dblink because, although it is possible for FDW to push
qualifiers to the backend, it too often does not do so. Then it copies
large tables over the network and scans them sequentially on the local
machine. Also, LIMIT for instance is never pushed to the backend.

Now, we create functions like this:

CREATE OR REPLACE FUNCTION __get_client_metrics_on(srv TEXT)
RETURNS TABLE(loginid TEXT,
  ...)
AS $def$

SELECT tb.*
  FROM dblink($1, $$

-- remote query starts here

SELECT c.loginid, ...
  FROM clients c
  JOIN ...

-- remote query ends here

  $$) AS tb(loginid TEXT,
...)

$def$ LANGUAGE sql VOLATILE SECURITY definer ROWS ... COST ...;

CREATE OR REPLACE FUNCTION get_client_metrics()
RETURNS TABLE(srv TEXT,
  loginid TEXT,
  ...)
AS $def$

SELECT s.srvname, rem.*
  FROM production_servers() s
 CROSS JOIN 

Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Torsten Förtsch
On 03/04/14 15:34, Leonardo M. Ramé wrote:
 Hi, in one of our systems, we added a kind of pagination feature, that
 shows N records of Total records.
 
 To do this, we added a count(*) over() as Total field in our queries
 in replacement of doing two queries, one for fetching the records, and
 other for getting the count. This improved the performance, but we aren't
 happy with the results yet, by removing the count, the query takes
 200ms vs 2000ms with it.
 
 We are thinking of removing the count, but if we do that, the system
 will lack an interesting feature.
 
 What strategy for showing the total number of records returned do you
 recommend?.

If you need only an estimated number and if your planner statistics are
up to date, you can use the planner.

Here is my implementation of the explain function. The COMMENT below
shows how to use it:

CREATE OR REPLACE FUNCTION explain(VARIADIC TEXT[])
RETURNS JSON AS $$
DECLARE
tmp TEXT;
BEGIN
EXECUTE 'EXPLAIN ('
 || array_to_string(array_append($1[2:array_upper($1, 1)],
'FORMAT JSON'), ', ')
 || ') '
 || $1[1] INTO tmp;
RETURN tmp::JSON;
END;
$$ LANGUAGE plpgsql VOLATILE;

COMMENT ON FUNCTION explain(VARIADIC TEXT[])
IS $def$
This function is a SQL interface to the planner. It returns the plan
(result of EXPLAIN) of the query passed as TEXT string as the first
parameter as JSON object.

The remaining parameters are EXPLAIN-modifiers, like ANALYZE or
BUFFERS.

The function can be used to store plans in the database.

Another interesting usage is when you need only an estimated row
count for a query. You can use

  SELECT count(*) ...

This gives you an exact number but is usually slow. If your planner
statistics are up to date and the query is not too complicated, the
planner usually gives a good estimate and is much faster.

  SELECT explain('SELECT 1 FROM tb WHERE id>8000')
       ->0->'Plan'->>'Plan Rows';
$def$;


Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pagination count strategies

2014-04-03 Thread Torsten Förtsch
On 03/04/14 17:44, Leonardo M. Ramé wrote:
 Nice!, do you know if this will work on 8.4?.
 
no way


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] SQL advice needed

2014-03-17 Thread Torsten Förtsch
Hi,

I have a volatile function that returns multiple rows. It may also
return nothing. Now, I want to write an SQL statement that calls this
function until it returns an empty result set and returns all the rows.

So, in principle I want to:

WITH RECURSIVE
t AS (
  SELECT * FROM xx()
  UNION ALL
  SELECT * FROM xx()
)
SELECT * FROM t;

But that's not recursive because the union all part lacks a reference to t.

Next I tried this:

WITH RECURSIVE
t AS (
  SELECT * FROM xx()
  UNION ALL
  SELECT * FROM xx() WHERE EXISTS (SELECT 1 FROM t)
)
SELECT * FROM t;

But the reference to t is not allowed in a subquery.

What's the best (or at least a working) way to achieve what I want?

I can do it in plpgsql. But that would mean accumulating the complete
result in memory first, right? I need to avoid that.

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SQL advice needed

2014-03-17 Thread Torsten Förtsch
On 17/03/14 21:42, Merlin Moncure wrote:
 I can do it in plpgsql. But that would mean to accumulate the complete
  result in memory first, right? I need to avoid that.
 I would test that assumption.   This is better handled in loop IMO.
 
 LOOP
   RETURN QUERY SELECT * FROM xx();
   IF NOT found
   THEN
 RETURN;
   END IF;
 END LOOP;

At least according to the manual it is stored in memory:

<cite>
Note: The current implementation of RETURN NEXT and RETURN QUERY stores
the entire result set before returning from the function, as discussed
above. That means that if a PL/pgSQL function produces a very large
result set, performance might be poor: data will be written to disk to
avoid memory exhaustion, but the function itself will not return until
the entire result set has been generated. A future version of PL/pgSQL
might allow users to define set-returning functions that do not have
this limitation. Currently, the point at which data begins being written
to disk is controlled by the work_mem configuration variable.
Administrators who have sufficient memory to store larger result sets in
memory should consider increasing this parameter.
</cite>

I didn't test that, though.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How to make PG use work_mem?

2014-03-11 Thread Torsten Förtsch
Hi,

I have a query that involves an external sort:

   ->  Sort  (cost=13662680.01..13850498.48 rows=75127389 width=16)
 (actual time=980098.397..1021411.862 rows=74181544 loops=1)
 Sort Key: (ROW(account_id, (purchase_time)::date))
 Sort Method: external merge  Disk: 3118088kB
 Buffers: shared hit=1568637 read=1327223,
  temp read=389763 written=389763

What puzzles me is that this happens even when I set work_mem to 50GB in
the session.

Why does it still use the external merge?

The query runs on a streaming replica if that matters.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to make PG use work_mem?

2014-03-11 Thread Torsten Förtsch
On 11/03/14 14:36, Tom Lane wrote:
 Perhaps you fat-fingered the SET somehow?

I just repeated it:

# select * from pg_settings where name='work_mem';
-[ RECORD 1 ]
name   | work_mem
setting| 52428800
unit   | kB
...

# explain (analyze,buffers)
  select distinct(account_id, purchase_time::date) from fmb;

   QUERY PLAN
-
 Unique  (cost=13666972.01..14042722.46 rows=12894641 width=16)
 (actual time=1000989.364..1058273.210 rows=2200442 loops=1)
   Buffers: shared hit=1570088 read=1326647,
temp read=389842 written=389842
   ->  Sort (cost=13666972.01..13854847.24 rows=75150089 width=16)
(actual time=1000989.362..1035694.670 rows=74196802 loops=1)
 Sort Key: (ROW(account_id, (purchase_time)::date))
 Sort Method: external merge  Disk: 3118720kB
 Buffers: shared hit=1570088 read=1326647,
  temp read=389842 written=389842
         ->  Seq Scan on fmb
 (cost=0.00..3836111.11 rows=75150089 width=16)
 (actual time=0.021..35520.901 rows=74196802 loops=1)
   Buffers: shared hit=1570088 read=1326647
 Total runtime: 1059324.646 ms

# show work_mem;
 work_mem
--
 50GB

This is 9.3.3 from the pgdg debian repository.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to make PG use work_mem?

2014-03-11 Thread Torsten Förtsch
On 11/03/14 16:03, Tom Lane wrote:
 [ thinks for awhile... ]  Oh, I know what's happening: your sort is so
 large that it's being constrained by the MaxAllocSize limit on the tuple
 pointer array.  This has been fixed in HEAD, but it's not yet in any
 shipping release.  According to the log entry for commit
 263865a48973767ce8ed7b7788059a38a24a9f37, the previous limit on the number
 of tuples that could be sorted in memory was INT_MAX/48 or about 44
 million; I've not done the arithmetic to check that, but it seems about
 right seeing that you're having trouble with 75 million.

Thanks, that makes sense. BTW, I solved my problem w/o that sort. I was
just curious what happened here.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to continue streaming replication after this error?

2014-02-23 Thread Torsten Förtsch
On 22/02/14 03:21, Torsten Förtsch wrote:
 2014-02-21 05:17:10 UTC PANIC:  heap2_redo: unknown op code 32
  2014-02-21 05:17:10 UTC CONTEXT:  xlog redo UNKNOWN
  2014-02-21 05:17:11 UTC LOG:  startup process (PID 1060) was terminated
  by signal 6: Aborted
  2014-02-21 05:17:11 UTC LOG:  terminating any other active server processes
  2014-02-21 05:17:11 UTC WARNING:  terminating connection because of
  crash of another server process
  2014-02-21 05:17:11 UTC DETAIL:  The postmaster has commanded this
  server process to roll back the current transaction and exit, because
  another server process exited abnormally and possibly corrupted shared
  memory.
  2014-02-21 05:17:11 UTC HINT:  In a moment you should be able to
  reconnect to the database and repeat your command.

 Any idea what that means?

Updating the replica to 9.3.3 cured it. The master was already on 9.3.3.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How to continue streaming replication after this error?

2014-02-21 Thread Torsten Förtsch
Hi,

one of our streaming replicas died with

2014-02-21 05:17:10 UTC PANIC:  heap2_redo: unknown op code 32
2014-02-21 05:17:10 UTC CONTEXT:  xlog redo UNKNOWN
2014-02-21 05:17:11 UTC LOG:  startup process (PID 1060) was terminated
by signal 6: Aborted
2014-02-21 05:17:11 UTC LOG:  terminating any other active server processes
2014-02-21 05:17:11 UTC WARNING:  terminating connection because of
crash of another server process
2014-02-21 05:17:11 UTC DETAIL:  The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2014-02-21 05:17:11 UTC HINT:  In a moment you should be able to
reconnect to the database and repeat your command.


Now, if I try to restart it, I get this:


The PostgreSQL server failed to start. Please check the log output:
2014-02-21 07:42:53 UTC LOG:  database system was interrupted while in
recovery at log time 2014-02-21 05:02:45 UTC
2014-02-21 07:42:53 UTC HINT:  If this has occurred more than once some
data might be corrupted and you might need to choose an earlier recovery
target.
2014-02-21 07:42:53 UTC LOG:  incomplete startup packet
2014-02-21 07:42:53 UTC LOG:  entering standby mode
2014-02-21 07:42:53 UTC LOG:  redo starts at 11C/B2211778
2014-02-21 07:42:53 UTC FATAL:  the database system is starting up
2014-02-21 07:42:54 UTC LOG:  consistent recovery state reached at
11C/B4234108
2014-02-21 07:42:54 UTC LOG:  database system is ready to accept read
only connections
2014-02-21 07:42:54 UTC PANIC:  heap2_redo: unknown op code 32
2014-02-21 07:42:54 UTC CONTEXT:  xlog redo UNKNOWN
2014-02-21 07:42:54 UTC LOG:  startup process (PID 38187) was terminated
by signal 6: Aborted
2014-02-21 07:42:54 UTC LOG:  terminating any other active server processes


This is 9.3.2. What is the supposed way to continue replication? Or do I
need to start from a fresh base backup?

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to continue streaming replication after this error?

2014-02-21 Thread Torsten Förtsch
On 21/02/14 09:17, Torsten Förtsch wrote:
 one of our streaming replicas died with
 
 2014-02-21 05:17:10 UTC PANIC:  heap2_redo: unknown op code 32
 2014-02-21 05:17:10 UTC CONTEXT:  xlog redo UNKNOWN
 2014-02-21 05:17:11 UTC LOG:  startup process (PID 1060) was terminated
 by signal 6: Aborted
 2014-02-21 05:17:11 UTC LOG:  terminating any other active server processes
 2014-02-21 05:17:11 UTC WARNING:  terminating connection because of
 crash of another server process
 2014-02-21 05:17:11 UTC DETAIL:  The postmaster has commanded this
 server process to roll back the current transaction and exit, because
 another server process exited abnormally and possibly corrupted shared
 memory.
 2014-02-21 05:17:11 UTC HINT:  In a moment you should be able to
 reconnect to the database and repeat your command.

Any idea what that means?

I have got a second replica dying with the same symptoms.

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How to get rid of superfluous WAL segments?

2014-02-05 Thread Torsten Förtsch
Hi,

we decreased wal_keep_segments quite a lot. What is the supposed way to
get rid of the now superfluous files in pg_xlog?

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to get rid of superfluous WAL segments?

2014-02-05 Thread Torsten Förtsch
On 06/02/14 06:46, Torsten Förtsch wrote:
 we decreased wal_keep_segments quite a lot. What is the supposed way to
 get rid of the now superfluous files in pg_xlog?

Nothing special. The database did it for me.

Sorry for the noise.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help with details of what happens when I create a constraint NOT VALID

2014-01-23 Thread Torsten Förtsch
On 23/01/14 14:46, Bill Moran wrote:
 
 Some quickie background: I'm on a project to migrate a fairly large
 database from MySQL to PostgreSQL (~2T).  As a result of a number of
 factors, I have to do it in one shot and I have a limited time window
 in which things can be down while I switch it over.
 
 As one of many, many things I'm considering to make this work, I'm
 looking at adding constraints after the data move using NOT VALID to
 allow them to be applied quickly.  This seems pretty straight forward,
 but I'm trying to understand if there are any troublesome side-effects
 to leaving the constraints unvalidated.
 
 Because of the uptime requirements, there are some very large tables
 with many foreign keys that I will never be allowed to take a lock on
 long enough to validate all the constraints.  It was suggested that
 leaving the constraints as NOT VALID might affect the planner, causing
 it to use less optimal plans because it doesn't think it can trust
 the constraint.  Is this true?
 
 It has also been suggested that manually changing the status to valid
 in the catalog without going through the validation process could cause
 problems, although I haven't found an explanation of what those
 problems might be.
 
 I understand that the best way is to go through and do all the steps,
 but that may simply be impossible for me because of the lock it
 requires and the time involved.  Is there any negative effect to
 leaving the constraint unvalidated?  Is there any actual danger in
 manually flipping the value in the catalog (The constraint can be
 considered safe because it was previously enforced on the source
 database system)


I had a similar problem some time ago. The way I solved it is as
follows. First, add the constraint as NOT VALID. That prevents further
changes to violate it. Then make sure the constraint is met. Then update
pg_constraint.

UPDATE pg_constraint
   SET convalidated = true
 WHERE conrelid='schema.table'::regclass::oid
   AND conname='constraintname'

Not sure if that way can be recommended but it worked for me. In my case
it was a check constraint ensuring an interdependence between the
columns in a row.
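
For a foreign key the whole sequence would look roughly like this
(untested sketch, table and constraint names made up):

ALTER TABLE child
  ADD CONSTRAINT child_parent_fk
  FOREIGN KEY (parent_id) REFERENCES parent (id) NOT VALID;

-- verify the data really satisfies the constraint, then:
UPDATE pg_constraint
   SET convalidated = true
 WHERE conrelid = 'public.child'::regclass
   AND conname = 'child_parent_fk';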

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Is http://www.postgresql.org/docs/9.3/static/row-estimation-examples.html up to date?

2014-01-15 Thread Torsten Förtsch
Hi,

I am asking because ...

I have a table with

relpages  | 19164
reltuples | 194775

pg_relation_size / 8192 yields the same number as relpages. So, there is
no need to scale reltuples. Relcardinality is therefore 194775.

Statistics target is the default, 100. So, I assume each of the 100
buckets contains 1947.75 tuples.

Now, I have a timestamp column and a query for col > '2013-01-01'. There
are 27 buckets out of the 100 where col is > '2013-01-01'. The boundaries
of the bucket that 2013-01-01 falls into are

hist | 2013-01-08 20:48:52
hist | 2012-12-13 12:36:30

There is no / operation for INTERVAL types. So, I calculate in seconds:

select (27
+ extract('epoch' from
  '2013-01-08 20:48:52'::timestamp
   - '2013-01-01'::timestamp)
  / extract('epoch' from
'2013-01-08 20:48:52'::timestamp
- '2012-12-13 12:36:30'::timestamp))
   * 1947.75;

That results in 53170.9642980797 and would be rounded to 53171.

However, EXPLAIN shows:

  ->  Seq Scan on client  (cost=0.00..21731.03 rows=52939 width=29)
        Filter: (date_joined > '2013-01-01 00:00:00'::timestamp
        without time zone)

The numbers are of the same order of magnitude, but they are too
different to be rounding errors.

So, what did I do wrong?

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] question about checksum in 9.3

2014-01-14 Thread Torsten Förtsch
On 13/01/14 17:50, Mike Broers wrote:
 Is there a built-in method of scanning the server to check for
 corruption or will I have to wait for a corrupt object to be accessed to
 see the log entry?

This presentation:

  http://www.youtube.com/watch?v=TovZ0lb16-Q

suggests pg_filedump.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] window function and order by

2013-12-20 Thread Torsten Förtsch
Hi,

assuming I have a query that computes a running sum like this:

  select id, buy_price, sum(buy_price) over (order by id) sum
from fmb
   where 202300=id and id=202400
   order by id;

Do I need the ORDER BY clause at the end? Or does the ORDER BY in the
window function already define the order of the result set?

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Is it possible to avoid the VACUUM FREEZE when upgrading to 9.3.2?

2013-12-07 Thread Torsten Förtsch
Hi,

can the VACUUM FREEZE be skipped if

a) txid_current < vacuum_freeze_min_age or if

b) txid_current < vacuum_freeze_table_age or if

c) txid_current < autovacuum_freeze_max_age and no manual VACUUM has
been done?

Or is the operation perhaps only necessary for tables matching

 select n.nspname, c.relname, c.relfrozenxid
   from pg_class c join pg_namespace n on c.relnamespace=n.oid
  where c.relfrozenxid::text::int=(select setting::int
  from pg_settings
 where name='vacuum_freeze_min_age')

or similar?

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG replication across DataCenters

2013-11-22 Thread Torsten Förtsch
On 22/11/13 11:57, Albe Laurenz wrote:
 Don't use synchronous replication if you have a high transaction
 rate and a noticable network latency between the sites.
 
 Wait for the next bugfix release, since a nasty bug has just
 been discovered.

Can you please explain or provide a pointer for more information?

We have recently started to use sync replication over a line with 80ms
latency. It works for small transactions with a relatively low
transaction rate.

Avoid transactions using NOTIFY. Those acquire an exclusive lock during
commit that is released only when the remote host has also done its
commit. So, only one such transaction can be committing at time.

Async replication works just fine.

Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioned table question

2013-11-13 Thread Torsten Förtsch
Hi,

we have a table partitioned by time. Each month goes into a separate
child table. Primary key in each table is (underlying, ts). The
resulting index is perfect for ordering like in the query below. Each
child table has a constraint like:

  CHECK(ts >= '2011-1-1' and ts < '2011-1-1'::DATE + interval '1 month')

Now, we have queries of this type:

SELECT * FROM tick
 WHERE underlying = 'R_50' AND ts <= '2013-05-02'
 ORDER BY ts DESC LIMIT 100

The query plan for this is at http://explain.depesz.com/s/fB6

According to this plan it fetches all the result tuples from tick_2013_4
which is fine because tick_2013_5 obviously does not contain matches.

My question is, why does it then try to fetch one row from every other
index? Can that be avoided without a lower bound on ts?
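
By lower bound I mean something like the following, where the one month
window is just made up for illustration:

SELECT * FROM tick
 WHERE underlying = 'R_50'
   AND ts <= '2013-05-02'
   AND ts >= '2013-05-02'::date - interval '1 month'
 ORDER BY ts DESC LIMIT 100;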

Thanks,
Torsten


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioned table question

2013-11-13 Thread Torsten Förtsch
On 13/11/13 13:49, Gabriel Sánchez Martínez wrote:
 My question is, why does it then try to fetch one row from every other
 index? Can that be avoided without a lower bound on ts?

 If you don't set a lower bound, since every other table has dates below
 2013-05-01, they have to be scanned too.  I'm not sure what happens on
 actual execution if it searches in '2013_4' first and finds 100 or more
 rows.  I don't know if the query planner uses constraint exclusion rules
 to figure out the order in which tables will be scanned.

It probably does. According to the analyze part of the query plan it
does not find any match in 2013_5. But from 2013_4 it fetches 100 rows.

->  Index Scan Backward using tick_2013_4_pkey on tick_2013_4 tick
      (cost=0.00..5025184.53 rows=1336481 width=40)
      (actual time=0.047..0.124 rows=100 loops=1)   <== rows=100

Of course, it's a good idea to add a lower bound to the query.

I also know that the planner does not know how many rows it can fetch
from each table (though it can make a quite accurate guess). So, the
plan must include all tables up to and including 2013_5.

The question, however, was why the executor tries to fetch rows from
the other tables at all.

Torsten




Re: [GENERAL] Partitioned table question

2013-11-13 Thread Torsten Förtsch
On 13/11/13 20:21, Jeff Janes wrote:
> The planner uses the check constraints to reason about the relation
> between each partition separately and the query, not between the
> different partitions.  So while it may be possible to know that all rows
> in 2013_4 must be greater than all in 2013_3, it doesn't make use of
> that, instead taking the greatest value from each partition and putting
> it in a priority queue. So the one row from each table acts as a
> sentinel to prove that more rows from that table are not needed.

That makes perfect sense. Thank you very much.

Torsten




Re: [GENERAL] locks held during commit with synchronous replication

2013-10-22 Thread Torsten Förtsch
On 21/10/13 20:46, Tom Lane wrote:
> Torsten Förtsch torsten.foert...@gmx.net writes:
>> I noticed that with synchronous replication I often see locks like this:
>> [ AccessExclusiveLock on database 0 ]
> You did not say what PG version you're talking about, but if it's recent

It's 9.3.1

> then this must be coming from PreCommit_Notify, which takes such a lock
> while pushing entries into the shared notification-event queue:
>
> /*
>  * Serialize writers by acquiring a special lock that we hold till
>  * after commit.  This ensures that queue entries appear in commit
>  * order, and in particular that there are never uncommitted queue
>  * entries ahead of committed ones, so an uncommitted transaction
>  * can't block delivery of deliverable notifications.
>  *
>  * We use a heavyweight lock so that it'll automatically be released
>  * after either commit or abort.  This also allows deadlocks to be
>  * detected, though really a deadlock shouldn't be possible here.
>  *
>  * The lock is on database 0, which is pretty ugly but it doesn't
>  * seem worth inventing a special locktag category just for this.
>  * (Historical note: before PG 9.0, a similar lock on database 0 was
>  * used by the flatfiles mechanism.)
>  */
> LockSharedObject(DatabaseRelationId, InvalidOid, 0,
>                  AccessExclusiveLock);
>
> This has nothing to do with synchronous replication, only with use of
> LISTEN/NOTIFY.
>
>> Does that mean that only one transaction can be committed at a time?
> If they're sending notifies, yes.

Thanks, Tom! I think that explains it. We are also using Bucardo. So
when a transaction commits, the Bucardo notification acquires the lock.
But now, with synchronous replication and synchronous_commit=on, the
commit waits for the remote host to complete. Thus, the lock is held
longer than before, and I see it.

The problem is that we use synchronous replication over long distances
on the internet to bring the data closer to the customer. It turns out
that this, combined with Bucardo, creates quite a bottleneck.
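
In case anyone wants to watch this on a busy system, the serialization
shows up in pg_locks as an object lock on pg_database with objid 0,
something like this (just a sketch):

  select l.pid, l.mode, l.granted, a.state, a.query
    from pg_locks l
    join pg_stat_activity a on a.pid = l.pid
   where l.locktype = 'object'
     and l.classid = 'pg_database'::regclass
     and l.objid = 0;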

Best,
Torsten




[GENERAL] locks held during commit with synchronous replication

2013-10-21 Thread Torsten Förtsch
Hi,

I noticed that with synchronous replication I often see locks like this:

-[ RECORD 10 ]-+
locktype   | object
database   | 0
relation   |
page   |
tuple  |
virtualxid |
transactionid  |
classid| 1262
objid  | 0
objsubid   | 0
virtualtransaction | 13/42806
pid| 9794
mode   | AccessExclusiveLock
granted| t
fastpath   | f

According to the manual classid is an oid in pg_class. If so, the 1262
resolves to pg_database. What objid=0 means, I have no idea.

For how long is this lock held?

Am I right in assuming that it is held from the start of the commit
until the transaction is streamed and written to disk by the synchronous
replica?

Does that mean that only one transaction can be committed at a time? I
have to admit I haven't thought much about that before. Lock contention
has never been a problem here.

Thanks,
Torsten




Re: [GENERAL] like optimization

2013-10-12 Thread Torsten Förtsch
On 12/10/13 20:08, Scott Ribe wrote:
> select * from test where tz >= start and tz < end and colb like '%foobar%'

I think you can use an index only for wildcard expressions that are
anchored at the beginning. So,

  select * from test where tz >= start and tz < end
 and colb like 'foobar%'

can use an index on colb.

You could perhaps do something like this:

  select * from test where tz >= start and tz < end
 and colb like 'foobar%'
  union all
  select * from test where tz >= start and tz < end
 and reverse(colb) like 'raboof%'

Then you need 2 indexes, one on colb, the other on reverse(colb).
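
Something like this, assuming the database doesn't run under the C
collation (index names made up; otherwise text_pattern_ops can be
dropped):

  create index test_colb_idx     on test (colb text_pattern_ops);
  create index test_colb_rev_idx on test (reverse(colb) text_pattern_ops);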

You can get duplicates in the result set if colb both starts and ends
with 'foobar' (e.g. colb = 'foobar'). If that's a problem, use UNION
DISTINCT instead of UNION ALL.

Alternatively, if foobar is kind of a word (with boundaries), you could
consider full-text search.
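
A rough sketch of that approach (the 'simple' configuration is chosen
arbitrarily here):

  create index test_colb_fts_idx
    on test using gin (to_tsvector('simple', colb));

  select * from test
   where to_tsvector('simple', colb) @@ to_tsquery('simple', 'foobar');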

Just my 2¢,
Torsten




[GENERAL] declare constraint as valid

2013-10-09 Thread Torsten Förtsch
Hi,

assuming a constraint is added to a table as NOT VALID. Now I know it IS
valid. Can I simply declare it as valid by

update pg_constraint
   set convalidated='t'
 where conrelid=(select c.oid
   from pg_class c
   join pg_namespace n on (n.oid=c.relnamespace)
  where c.relname='tablename'
and n.nspname='schemaname')
   and conname='constraintname';

instead of

alter table tablename validate constraint ...

Or does the latter have other side effects?

I am asking because I want to avoid the ACCESS EXCLUSIVE lock required
by the ALTER TABLE. I am sure there are no rows violating the constraint.
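
For what it's worth, the same pg_constraint row can also be located a bit
more compactly with a regclass cast:

 select conname, convalidated
   from pg_constraint
  where conrelid = 'schemaname.tablename'::regclass
    and conname = 'constraintname';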

Thanks,
Torsten




[GENERAL] ALTER TABLE VALIDATE CONSTRAINT w/o lock

2013-10-08 Thread Torsten Förtsch
Hi,

I want to add a new column named sell_time to a table which already has
a boolean field named is_sold. Then I want to create a new check constraint:

ALTER TABLE ...
ADD CHECK(is_sold AND sell_time IS NOT NULL OR
  NOT is_sold AND sell_time IS NULL)
NOT VALID;

The constraint is added as NOT VALID for 2 reasons: 1) it is not valid
at that point, and 2) I cannot afford the lock required to validate it.

According to the docs subsequent operations on the table must pass the
constraint.

Now I can slowly populate the new column. After a while I'll reach a
point where the constraint is valid for all rows.
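
At that point a quick check like this (just a sketch, real table name
omitted) should come back with 0:

  select count(*)
    from the_table
   where not (is_sold and sell_time is not null
          or not is_sold and sell_time is null);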

How can I tell the database that, without the re-validation?

Alternatively, is there a way to make the constraint validation really
fast (a few seconds at most for 50,000,000 rows)? (This is 9.3.)

Thanks,
Torsten




[GENERAL] the new checksum feature

2013-09-19 Thread Torsten Förtsch
Hi,

is it possible to turn on checksums in an existing database? Or do I
have to dump/initdb/restore?

Thanks,
Torsten

