Re: [GENERAL] PG 10 and perl

2017-09-29 Thread Andy Colson

On 09/29/2017 09:32 AM, Tom Lane wrote:

Andy Colson  writes:

I started playing with 10, and cannot get it to use the right version of perl.  
It will only use the system version:
root@firefly:/usr/local/pg10/lib/postgresql# ldd plperl.so
libperl.so => /usr/lib64/perl5/CORE/libperl.so (0x7fc9c67d4000)


Hm, I don't think we've changed anything about that in v10.


But, here's the thing.  The config.log reports 5.26 version:


That's good, but what's probably missing is an "rpath" specification when
linking plperl.so.  Did you build with --disable-rpath, perhaps?  That's
the usual recommendation when building for Linux, but if you want to use
any non-default libraries, you can't do it.

Go into src/pl/plperl, remove and remake plperl.so, and see whether the
link command includes anything like

-Wl,-rpath,'/usr/lib64/perl5/CORE',--enable-new-dtags

(That's what I see when building with a stock Linux Perl configuration and
rpath enabled.)  If there's no such switch, or if it doesn't point to
where the libperl.so that you want to use is, then there's your problem.

regards, tom lane



tl;dr: PG 10 compiles and works fine with perlbrew; I had confused the install 
paths.

I removed the plperl.so, and typed make:

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Wendif-labels -Wmissing-format-attribute -Wformat-security 
-fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC 
-DLINUX_OOM_SCORE_ADJ=0 -fPIC -shared -o plperl.so plperl.o SPI.o Util.o  
-L../../../src/port -L../../../src/common -Wl,--as-needed 
-Wl,-rpath,'/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE',--enable-new-dtags
  -fstack-protector-strong -L/usr/local/lib  
-L/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE -lperl 
-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc


And in the src dir:
root@firefly:/tmp/SBo/postgresql-10rc1# ldd ./src/pl/plperl/plperl.so
libperl.so => 
/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE/libperl.so 
(0x7f33b9f87000)


I had 10beta1 installed to /usr/local/pg10.
The SlackBuild script I'm using installs to /usr/local/pg95 still.  So I was 
compiling and installing 10rc1 into pg95.

Sorry for the noise (and thanks Tom).

-Andy




Re: [GENERAL] PG 10 and perl

2017-09-29 Thread Andy Colson

On 9/29/2017 9:28 AM, kbran...@pwhome.com wrote:

Andy Colson wrote:
In PG 9.5 this worked fine, and compiling and running PG used the 5.26 version 
of perl:
root@firefly:/usr/local/pg95/lib/postgresql# ldd plperl.so
libperl.so => 
/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE/libperl.so 
(0x7f81e8bde000)

I started playing with 10, and cannot get it to use the right version of perl.  
It will only use the system version:
root@firefly:/usr/local/pg10/lib/postgresql# ldd plperl.so
libperl.so => /usr/lib64/perl5/CORE/libperl.so (0x7fc9c67d4000)

But, here's the thing.  The config.log reports 5.26 version:

configure:7752: checking for perl
configure:7770: found /opt/perl5/perls/perl-5.26/bin/perl
configure:7782: result: /opt/perl5/perls/perl-5.26/bin/perl
configure:7804: using perl 5.26.1
...



I'm about to go down this path to prepare for 10.0, but I haven't had time yet.

You don't show your configure command, but did you set PERL there as well as 
have it in your PATH? Here's what I've done in the past:

 ./configure --prefix=/opt/pg-9.5 --with-system-tzdata --with-perl 
PERL=/opt/perl/bin/perl

and it used our custom Perl just fine. If you did that, I'm not sure what else 
to say at the moment. I hope to get around to checking out RC1 early next week 
though and I'll be using 5.26 as well.

Hmm, given your report, I'll try to move it up my to-do list to give myself more 
time. I'll report back here if I get it to work and haven't seen a reply from you 
saying you got it working.

Kevin




I only use --with-perl; that's all I've ever used.  I didn't know you 
could pass PERL=/opt/perl/bin/perl.


-Andy




[GENERAL] PG 10 and perl

2017-09-28 Thread Andy Colson

I'm not subscribed to the -devel list, so hopefully I can post this here.

I'm running slackware 14.2

I have the system perl (5.24) installed, and also perlbrew with perl 5.26 
installed.

In PG 9.5 this worked fine, and compiling and running PG used the 5.26 version 
of perl:
root@firefly:/usr/local/pg95/lib/postgresql# ldd plperl.so
libperl.so => 
/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE/libperl.so 
(0x7f81e8bde000)

I started playing with 10, and cannot get it to use the right version of perl.  
It will only use the system version:
root@firefly:/usr/local/pg10/lib/postgresql# ldd plperl.so
libperl.so => /usr/lib64/perl5/CORE/libperl.so (0x7fc9c67d4000)

But, here's the thing.  The config.log reports 5.26 version:

configure:7752: checking for perl
configure:7770: found /opt/perl5/perls/perl-5.26/bin/perl
configure:7782: result: /opt/perl5/perls/perl-5.26/bin/perl
configure:7804: using perl 5.26.1
configure:7837: checking for Perl archlibexp
configure:7841: result: /opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi
configure:7843: checking for Perl privlibexp
configure:7847: result: /opt/perl5/perls/perl-5.26/lib/5.26.1
configure:7849: checking for Perl useshrplib
configure:7853: result: true
configure:7862: checking for CFLAGS recommended by Perl
configure:7865: result: -D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2
configure:7867: checking for CFLAGS to compile embedded Perl
configure:7870: result:
configure:7874: checking for flags to link embedded Perl
configure:7891: result: -fstack-protector-strong -L/usr/local/lib -L/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE -lperl -lpthread -lnsl -ldl -lm -lcrypt -lutil -lc



configure:15976: checking for perl.h
configure:15976: gcc -c -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -DLINUX_OOM_SCORE_ADJ=0 -D_GNU_SOURCE -I/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE conftest.c >&5
configure:15976: $? = 0
configure:15976: result: yes
configure:15991: checking for libperl
configure:16007: gcc -o conftest -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -DLINUX_OOM_SCORE_ADJ=0 -D_GNU_SOURCE -I/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE conftest.c -fstack-protector-strong -L/usr/local/lib -L/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE -lperl -lpthread -lnsl -ldl -lm -lcrypt -lutil -lc >&5
configure:16007: $? = 0
configure:16008: result: yes




ac_cv_path_PERL=/opt/perl5/perls/perl-5.26/bin/perl


perl_archlibexp='/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi'
perl_embed_ccflags=''
perl_embed_ldflags='  -fstack-protector-strong -L/usr/local/lib -L/opt/perl5/perls/perl-5.26/lib/5.26.1/x86_64-linux-thread-multi/CORE -lperl -lpthread -lnsl -ldl -lm -lcrypt -lutil -lc'
perl_privlibexp='/opt/perl5/perls/perl-5.26/lib/5.26.1'
perl_useshrplib='true'


Thanks for your time,

-Andy




Re: [GENERAL] pg_test_fsync performance

2017-04-25 Thread Andy Colson

On 04/25/2017 05:40 AM, Karri Niemelä wrote:

Hi. Wondering what sort of results people are seeing on modern servers when 
running pg_test_fsync?

Thanks,
Karri



18 sata laptop hd's (some 5400, some 7200), hardware raid 10
12 core Xeon E5645

5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 24903.540 ops/sec  40 usecs/op
fdatasync 21800.982 ops/sec  46 usecs/op
fsync 20762.443 ops/sec  48 usecs/op
fsync_writethrough                        n/a
open_sync 23394.041 ops/sec  43 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 12301.561 ops/sec  81 usecs/op
fdatasync 17146.136 ops/sec  58 usecs/op
fsync 15853.102 ops/sec  63 usecs/op
fsync_writethrough                        n/a
open_sync 11182.711 ops/sec  89 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
 1 * 16kB open_sync write 18419.537 ops/sec  54 usecs/op
 2 *  8kB open_sync writes    11215.533 ops/sec  89 usecs/op
 4 *  4kB open_sync writes 6250.670 ops/sec 160 usecs/op
 8 *  2kB open_sync writes 3245.170 ops/sec 308 usecs/op
16 *  1kB open_sync writes 1648.117 ops/sec 607 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
write, fsync, close   17929.396 ops/sec  56 usecs/op
write, close, fsync   17782.500 ops/sec  56 usecs/op

Non-sync'ed 8kB writes:
write                     421898.978 ops/sec   2 usecs/op





Re: [GENERAL] pg_test_fsync performance

2017-04-25 Thread Andy Colson

On 04/25/2017 05:40 AM, Karri Niemelä wrote:

Hi. Wondering what sort of results people are seeing on modern servers when 
running pg_test_fsync?

Thanks,
Karri


4 sata HD, hardware raid 10
4 core Xeon E5504
Box is Busy


5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 15984.370 ops/sec  63 usecs/op
fdatasync 16100.336 ops/sec  62 usecs/op
fsync 15109.988 ops/sec  66 usecs/op
fsync_writethrough                        n/a
open_sync 16849.201 ops/sec  59 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync  8638.443 ops/sec 116 usecs/op
fdatasync 12036.359 ops/sec  83 usecs/op
fsync 11807.447 ops/sec  85 usecs/op
fsync_writethrough                        n/a
open_sync  8532.569 ops/sec 117 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
 1 * 16kB open_sync write 14205.165 ops/sec  70 usecs/op
 2 *  8kB open_sync writes 8506.969 ops/sec 118 usecs/op
 4 *  4kB open_sync writes 4979.092 ops/sec 201 usecs/op
 8 *  2kB open_sync writes 2658.138 ops/sec 376 usecs/op
16 *  1kB open_sync writes 1343.359 ops/sec 744 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
write, fsync, close   13331.347 ops/sec  75 usecs/op
write, close, fsync   13435.198 ops/sec  74 usecs/op

Non-sync'ed 8kB writes:
write                     297771.626 ops/sec   3 usecs/op





Re: [GENERAL] Merging records in a table with 2-columns primary key

2017-04-02 Thread Andy Colson

On 04/02/2017 09:26 AM, Alexander Farber wrote:

Good afternoon,

I have prepared a simple test case for my question -

CREATE TABLE users (
uid SERIAL PRIMARY KEY,
name varchar(255) NOT NULL
);

CREATE TABLE reviews (
uid integer NOT NULL CHECK (uid <> author) REFERENCES users ON DELETE 
CASCADE,
author integer NOT NULL REFERENCES users(uid) ON DELETE CASCADE,
review varchar(255),
PRIMARY KEY(uid, author)
);

Here I fill the above tables with sample data -

INSERT INTO users (uid, name) VALUES (1, 'User 1');
INSERT INTO users (uid, name) VALUES (2, 'User 2');
INSERT INTO users (uid, name) VALUES (3, 'User 3');
INSERT INTO users (uid, name) VALUES (4, 'User 4');

INSERT INTO reviews (uid, author, review) VALUES (1, 2, 'User 2 says: 1 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (1, 3, 'User 3 says: 1 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (1, 4, 'User 4 says: 1 is 
nice');

INSERT INTO reviews (uid, author, review) VALUES (2, 1, 'User 1 says: 2 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (2, 3, 'User 3 says: 2 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (2, 4, 'User 4 says: 2 is 
ugly');

INSERT INTO reviews (uid, author, review) VALUES (3, 1, 'User 1 says: 3 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (3, 2, 'User 2 says: 3 is 
ugly');
INSERT INTO reviews (uid, author, review) VALUES (3, 4, 'User 4 says: 3 is 
ugly');

INSERT INTO reviews (uid, author, review) VALUES (4, 1, 'User 1 says: 4 is 
ugly');
INSERT INTO reviews (uid, author, review) VALUES (4, 2, 'User 2 says: 4 is 
ugly');
INSERT INTO reviews (uid, author, review) VALUES (4, 3, 'User 3 says: 4 is 
ugly');

And finally here is my problematic custom stored function:

CREATE OR REPLACE FUNCTION merge_users(
in_uids integer[],
OUT out_uid integer
) RETURNS integer AS
$func$
BEGIN
SELECT
MIN(uid)
INTO STRICT
out_uid
FROM users
WHERE uid = ANY(in_uids);

-- delete self-reviews
DELETE FROM reviews
WHERE uid = out_uid
AND author = ANY(in_uids);

DELETE FROM reviews
WHERE author = out_uid
AND uid = ANY(in_uids);

-- try to copy as many reviews OF this user as possible
INSERT INTO reviews (
uid,
author,
review
) SELECT
out_uid,-- change to out_uid
author,
review
FROM reviews
WHERE uid <> out_uid
AND uid = ANY(in_uids)
ON CONFLICT DO NOTHING;

DELETE FROM reviews
WHERE uid <> out_uid
AND uid = ANY(in_uids);

-- try to copy as many reviews BY this user as possible
INSERT INTO reviews (
uid,
author,
review
) SELECT
uid,
out_uid,-- change to out_uid
review
FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids)
ON CONFLICT DO NOTHING;

DELETE FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids);

DELETE FROM users
WHERE uid <> out_uid
AND uid = ANY(in_uids);
END
$func$ LANGUAGE plpgsql;

The purpose of the function is to merge several user records to one (with the 
lowest uid).

While merging the reviews records I delete all self-reviews and try to copy 
over as many remaining reviews as possible.

However with PostgreSQL 9.5 the following 2 calls fail:

test=> SELECT out_uid FROM merge_users(ARRAY[1,2]);
 out_uid
---------
       1
(1 row)

test=> SELECT out_uid FROM merge_users(ARRAY[1,2,3,4]);
ERROR:  new row for relation "reviews" violates check constraint "reviews_check"
DETAIL:  Failing row contains (1, 1, User 4 says: 3 is ugly).
CONTEXT:  SQL statement "INSERT INTO reviews (
uid,
author,
review
) SELECT
uid,
out_uid,-- change to out_uid
review
FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids)
ON CONFLICT DO NOTHING"
PL/pgSQL function merge_users(integer[]) line 38 at SQL statement

I have provided more context at
http://stackoverflow.com/questions/43168406/merging-records-in-a-table-with-2-columns-primary-key

Also I have tried to create an SQL Fiddle at
http://sqlfiddle.com/#!15/5f37e/2
for your convenience

Regards
Alex


I'm not sure what you are trying to do.  You posted a sample starting point, 
which is great.  Perhaps you could post how you want the tables to look in 
the end?

-Andy




Re: [GENERAL] My humble tribute to psql -- usql v0.5.0

2017-04-02 Thread Andy Colson

On 04/02/2017 07:55 AM, Kenneth Shaw wrote:

Hi All,


usql is built in Go, and as of today supports all the major databases
(PostgreSQL, MySQL, SQLite3, Microsoft SQL Server, Oracle Database)
and more! Additionally, with v0.5.0, released today, usql now has
implemented most of the basic, and much of the advanced features of
psql:


-Ken


I have to admit, I have typed \d at the sqlite3 prompt before.  And I am being 
forced to support a mysql database against my will, so I'll be trying this out. 
 Thanks!

-Andy





Re: [GENERAL] What is the purpose of PostGIS on PostgreSQL?

2017-01-24 Thread Andy Colson

On 01/23/2017 11:08 AM, Kased, Razy (Chemistry and Geosciences) wrote:

I recently came across this question: "What is the purpose of PostGIS on 
PostgreSQL?" and wanted to know what this mailing list had to respond with.


Thanks,




PostGIS is to shapefile
as PostgreSQL is to csv

-Andy




Re: [GENERAL] Queries on very big table

2017-01-02 Thread Andy Colson

On 01/02/2017 05:23 AM, Job wrote:

Hello guys and very good new year to everybody!

We are now approaching some queries and statistics on a very big table (about 180 
million records).
The table is partitioned by day (about ~3 GB of data for every partition/day).
We use PostgreSQL 9.6.1.

I am experiencing quite significant slowdowns on queries.
I manually ran a "vacuum full" and a "reindex" on every partition in order to 
reclaim free space and reorder records.

I have a BRIN index on the timestamp and a btree index on another field.

Starting with a simple query: explain analyze select count(domain) from 
webtraffic_archive:


Other more complex queries are slower.

How can I improve it?
The number of records can grow to 1,000 million.
Do I need a third-party tool for big data?

THANK YOU!
/F



I do a very similar thing, logging all my web stats to PG, but querying millions of 
rows is always going to be slow.  I use a summary table.  Actually, several.
My detail table is like yours, but every 5 minutes I query out the last hour 
and summarize into a by_hour table.  Every night I query out the last 24 hours 
and summarize into a by_day table.  The detail table and by_hour table never 
have more than 24 hours' worth of data; by_day goes back many years.

My stats pages all query the by_hour and by_day tables, and it's very fast.
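
A minimal sketch of that hourly roll-up (table and column names here are 
illustrative assumptions, not my actual schema):

-- run from cron once an hour; summarize the hour that just finished
INSERT INTO stats_by_hour (hr, site, hits)
SELECT date_trunc('hour', ts), site, count(*)
FROM webhits_detail
WHERE ts >= date_trunc('hour', now()) - interval '1 hour'
  AND ts <  date_trunc('hour', now())
GROUP BY 1, 2;

-- the detail table can then be trimmed so it only ever holds about a day
DELETE FROM webhits_detail WHERE ts < now() - interval '24 hours';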

-Andy




Re: [GENERAL] Absence of synonym

2016-11-07 Thread Andy Colson

On 11/7/2016 1:39 PM, Rakesh Kumar wrote:


I need some ideas here.

Let us say we have a bunch of tables, called a,b,c,d. We will get a batch of 
data in files which need to be processed. At the end of processing, the tables 
will get a fresh set of data.  The problem we would like to solve is to allow 
access to the tables while they are getting processed by the new batch. 
Obviously, during the processing the data will be in an inconsistent state, 
which the app should not be exposed to. At the end of processing, the app can 
see the new data.

In RDBMS where synonyms are allowed we would do this:

keep two sets of tables, a_1 and a_2
A synonym a will point to either a_1 or a_2.
Applications will refer to synonym a only, never a_1 or a_2.
While the synonym is referring to a_1, we can process a_2.
At the end of processing, change the synonym to point to a_2.
Next time, repeat it, just flipping the tables from a_2 to a_1.

How can this be done in PG 9.5.3?

Thanks.



I do it with schemas.  A live and a work schema.  When batch processing 
is finished live tables get removed and work gets renamed, in a single 
transaction:


begin transaction;

drop table live.junk1;
alter table work.junk1 set schema live;

drop table live.junk2;
alter table work.junk2 set schema live;

.. about 200 more tables ...

commit;


works like a champ.

-Andy




Re: [GENERAL] tablesample performance

2016-10-18 Thread Andy Colson

On 10/18/2016 11:44 AM, Francisco Olarte wrote:

On Tue, Oct 18, 2016 at 5:06 PM, Andy Colson  wrote:

I wanted to report an awesome performance boost using tablesample.
In my stored function I was getting a random row using:
select one into x from ones order by random() limit 1;
When the table was smaller it worked fine, but the performance has slowly
gotten worse.  This morning I was getting around 8 transactions a second.


Which is not a surprise, as it has to at least read all the rows and
generate a random() for each one and keep track of the minimum.


I just replaced it with:
select one into x from ones tablesample bernoulli(1) limit 1;


This should be faster, but to me it seems it does a different thing.
This seems to select each row of the table with probability 1% and
return the first selected, i.e., something similar to

select one into x from ones where random() < 0.01 limit 1.

Which has the ( diminishing with table size ) risk of selecting zero
rows and is going to select one of the first 100 or so rows with high
probability, unless I'm missing something.

I say this because the docs state it returns a 'randomly chosen' sample,
not a 'randomly ORDERED' one, and the straightforward implementation
of sampling returns rows in the primitive scan order. I suppose it
could be easily tested by selecting bernoulli(100), but I don't have
server access right now to verify it.

With a big table it seems:

select one into x from ones where random() < 0.01 order by random() limit 1
or
select one into x from ones tablesample bernoulli(1) order by random() limit 1;

Is more similar to what you originally did ( and the run time should
possibly be something in between ).


I would recommend you execute the function and verify it does what
you want ( as you say it's fast, I would try selecting several
thousand and eyeballing the result; if it does what I fear, the
grouping should be obvious ).

Maybe you do not mind it, in which case it's ok, but a one minute run
should let you know what exactly you are doing.

Francisco Olarte.




Ah, yes, you're right, there is a bit of a difference there.

Speed wise:
1) select one from ones order by random() limit 1;
> about 360ms
2) select one from ones tablesample bernoulli(1) limit 1 ;
> about 4ms
3) select one from ones tablesample bernoulli(1) order by random() limit 1;
> about 80ms

Using the third option in batch, I'm getting about 15 transactions a second.

Oddly:
select one from ones tablesample bernoulli(0.25) order by random()

takes almost 80ms also.

bernoulli(0.25) returns 3k rows
bernoulli(1) returns 14k rows


Thanks,

-Andy




[GENERAL] tablesample performance

2016-10-18 Thread Andy Colson

I wanted to report an awesome performance boost using tablesample.

In my stored function I was getting a random row using:
select one into x from ones order by random() limit 1;

When the table was smaller it worked fine, but the performance has 
slowly gotten worse.  This morning I was getting around 8 transactions a 
second.


I just replaced it with:
select one into x from ones tablesample bernoulli(1) limit 1;

And now I'm getting 376 transactions a second!

Thank you devs!  Thank you PG 9.5!

-Andy




Re: [GENERAL] PG vs ElasticSearch for Logs

2016-08-22 Thread Andy Colson

On 8/22/2016 2:39 AM, Thomas Güttler wrote:



On 19.08.2016 at 19:59, Andy Colson wrote:

On 8/19/2016 2:32 AM, Thomas Güttler wrote:

I want to store logs in a simple table.

Here my columns:

  Primary-key (auto generated)
  timestamp
  host
  service-on-host
  loglevel
  msg
  json (optional)

I am unsure which DB to choose: Postgres, ElasticSearch or ...?

We don't have high traffic. About 200k rows per day.

My heart beats for postgres. We use it since several years.

On the other hand, the sentence "Don't store logs in a DB" is
somewhere in my head.

What do you think?





I played with ElasticSearch a little, mostly because I wanted to use
Kibana which looks really pretty.  I dumped a ton
of logs into it, and made a pretty dashboard ... but in the end it
didn't really help me, and wasn't that useful.  My
problem is, I don't want to have to go look at it.  If something goes
bad, then I want an email alert, at which point
I'm going to go run top, and tail the logs.

Another problem I had with kibana/ES is the syntax to search stuff is
different than I'm used to.  It made it hard to
find stuff in kibana.

Right now, I have a perl script that reads apache logs and fires off
updates into PG to keep stats.  But its an hourly
summary, which the website turns around and queries the stats to show
pretty usage graphs.


You use Perl to read apache logs. Does this work?

Forwarding logs reliably is not easy. Logs are streams, files in unix
are not streams. Sooner or later
the files get rotated. RELP exists, but AFAIK its usage is not widespread:

  https://en.wikipedia.org/wiki/Reliable_Event_Logging_Protocol

Let's see how to get the logs into postgres 


In the end, PG or ES, all depends on what you want.


Most of my logs start from an http request. I want a unique id per request
in every log line which gets created. This way I can trace the request,
even if its impact spans several hosts and systems which do not
receive http requests.

Regards,
  Thomas Güttler




I don't read the file.  In apache.conf:

# v, countyia, ip, sess, ts, url, query, status
LogFormat 
"3,%{countyName}e,%a,%{VCSID}C,%{%Y-%m-%dT%H:%M:%S%z}t,\"%U\",\"%q\",%>s" 
csv3


CustomLog "|/usr/local/bin/statSender.pl -r 127.0.0.1" csv3

I think I read somewhere that if you pipe to a script (like above) and 
you don't read fast enough, it could slow apache down.  That's why the 
script above dumps to redis first.  That way I can move processes 
around, restart the database, etc, etc, and not break apache in any way.


The important part of the script:

while (my $x = <>)              # log lines piped in from apache
{
    chomp($x);
    next unless ($x);
  try_again:
    if ($redis)
    {
        # push the raw log line onto the redis queue
        eval {
            $redis->lpush($qname, $x);
        };
        if ($@)
        {
            # connection dropped; reconnect and retry this line
            $redis = redis_connect();
            goto try_again;
        }
        # just silence this one
        eval {
            $redis->ltrim($qname, 0, 1000);
        };
    }
}

Any other machine, or even multiple, then reads from redis and inserts 
into PG.


You can see, in my script, I trim the queue to 1000 items, but that's 
because I'm not as worried about losing results.  Your setup would 
probably be different.  I also set up redis to not save anything to disk, 
again, because I don't mind if I lose a few hits here or there.  But 
you get the idea.


-Andy




Re: [GENERAL] PG vs ElasticSearch for Logs

2016-08-19 Thread Andy Colson

On 8/19/2016 2:32 AM, Thomas Güttler wrote:

I want to store logs in a simple table.

Here my columns:

  Primary-key (auto generated)
  timestamp
  host
  service-on-host
  loglevel
  msg
  json (optional)

I am unsure which DB to choose: Postgres, ElasticSearch or ...?

We don't have high traffic. About 200k rows per day.

My heart beats for postgres. We use it since several years.

On the other hand, the sentence "Don't store logs in a DB" is
somewhere in my head.

What do you think?





I played with ElasticSearch a little, mostly because I wanted to use 
Kibana which looks really pretty.  I dumped a ton of logs into it, and 
made a pretty dashboard ... but in the end it didn't really help me, and 
wasn't that useful.  My problem is, I don't want to have to go look at 
it.  If something goes bad, then I want an email alert, at which point 
I'm going to go run top, and tail the logs.


Another problem I had with kibana/ES is the syntax to search stuff is 
different than I'm used to.  It made it hard to find stuff in kibana.


Right now, I have a perl script that reads apache logs and fires off 
updates into PG to keep stats.  But it's an hourly summary, which the 
website turns around and queries the stats to show pretty usage graphs.


In the end, PG or ES, all depends on what you want.
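
If the logs do land in Postgres, the columns listed above map onto a small 
table; a minimal sketch (the names and type choices are my assumptions, not 
a given):

CREATE TABLE logs (
    id        bigserial PRIMARY KEY,            -- auto-generated primary key
    ts        timestamptz NOT NULL DEFAULT now(),
    host      text NOT NULL,
    service   text NOT NULL,                    -- service-on-host
    loglevel  text NOT NULL,
    msg       text,
    extra     jsonb                             -- optional json payload
);
CREATE INDEX ON logs (ts);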

-Andy




Re: [GENERAL] pg_logical_slot_get_changes

2016-08-11 Thread Andy Colson
After testing this more, maybe it does work ok just calling 
pg_logical_slot_get_changes().


I'm making the assumption that you'd like 
pg_replication_slots.restart_lsn to be close to 
pg_current_xlog_location(), correct?


The further apart they are, the more pg_xlog you have to store, yes?
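
A quick way to watch that gap from SQL (these are the 9.x function names; 
pg_xlog_location_diff() became pg_wal_lsn_diff() in later releases):

select slot_name,
       pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) as retained_bytes
from pg_replication_slots;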

When I had a gig of space in pg_xlog, I know I called 
pg_logical_slot_get_changes() but it did not make restart_lsn progress. 
Could it have gotten stuck somehow?  As soon as I dropped that slot 
pg_xlog dropped to 80 meg, so I'm pretty sure that was the hang-up.


After dropping the logical slot and making a new one, and testing it 
out, restart_lsn is progressing just fine.


Here is a bit of terminal history that shows the problem.  Sorry for the 
formatting  (here is a pretty version http://pastebin.com/YgYuiR3U):


andy@katniss:~$ psql vcstimes
psql (9.5.3)
Type "help" for help.

vcstimes=# select * from  pg_replication_slots;
  slot_name  |   plugin    | slot_type |  datoid  | database | active | active_pid | xmin | catalog_xmin | restart_lsn
-------------+-------------+-----------+----------+----------+--------+------------+------+--------------+--------------
 counts_slot | decoder_raw | logical   | 23704565 | vcstimes | f      |            |      |     30293428 | 111/A235FEA0
(1 row)



root@katniss:/db/pg95# du1
1009M   ./pg_xlog
528K./global
0   ./pg_commit_ts
0   ./pg_dynshmem
8.0K./pg_notify
0   ./pg_serial
0   ./pg_snapshots
208K./pg_subtrans
0   ./pg_twophase
16K ./pg_multixact
47G ./base
4.0K./pg_replslot
0   ./pg_tblspc
0   ./pg_stat
152K./pg_stat_tmp
22M ./pg_logical
7.3M./pg_clog
49G .

andy@katniss:~$ psql vcstimes
psql (9.5.3)
Type "help" for help.

vcstimes=# select * from pg_stat_replication;
 pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmi
-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+------------
(0 rows)

vcstimes=# select * from  pg_replication_slots ;
  slot_name  |   plugin    | slot_type |  datoid  | database | active | active_pid | xmin | catalog_xmin | restart_lsn
-------------+-------------+-----------+----------+----------+--------+------------+------+--------------+--------------
 counts_slot | decoder_raw | logical   | 23704565 | vcstimes | f      |            |      |     30293428 | 111/A235FEA0
(1 row)

vcstimes=# select pg_current_xlog_location();
 pg_current_xlog_location
---------------------------
 111/E0D25628
(1 row)

vcstimes=# SELECT * FROM pg_logical_slot_peek_changes('counts_slot', null, null);
   location   |   xid    |                                             data
--------------+----------+---------------------------------------------------------------------------------------------
 111/E0360170 | 30330537 | INSERT INTO public.webcount (custid, type, numfull, reccount, tblver) VALUES (2, 'W', 1, 501,
(1 row)

vcstimes=# SELECT * FROM pg_logical_slot_get_changes('counts_slot', null, null);
   location   |   xid    |                                             data
--------------+----------+---------------------------------------------------------------------------------------------
 111/E0360170 | 30330537 | INSERT INTO public.webcount (custid, type, numfull, reccount, tblver) VALUES (2, 'W', 1, 501,
(1 row)

vcstimes=# select * from  pg_replication_slots ;
  slot_name  |   plugin    | slot_type |  datoid  | database | active | active_pid | xmin | catalog_xmin | restart_lsn
-------------+-------------+-----------+----------+----------+--------+------------+------+--------------+--------------
 counts_slot | decoder_raw | logical   | 23704565 | vcstimes | f      |            |      |     30293435 | 111/A235FEA0
(1 row)


vcstimes=#  select pg_drop_replication_slot('counts_slot');
 pg_drop_replication_slot
---------------------------

(1 row)

vcstimes=# select * from  pg_replication_slots ;
 slot_name | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn
-----------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------
(0 rows)


root@katniss:/db/pg95# du1
81M ./pg_xlog
528K./global
0   ./pg_commit_ts
0   ./pg_dynshmem
8.0K./pg_notify
0   ./pg_serial
0   ./pg_snapshots
216K./pg_subtrans
0   ./pg_twophase
16K ./pg_multixact
47G ./base
0   ./pg_replslot
0   ./pg_tblspc
0   ./pg_stat
152K./pg_stat_tmp
216K./pg_logical
7.3M./pg_clog
48G .






Re: [GENERAL] pglogical cross subscribe

2016-08-11 Thread Andy Colson

On 8/3/2016 8:27 PM, Andy Colson wrote:

Hi all.

I have a localdb and a remotedb.

A) I'd like to send table1 and table2 from localdb to remotedb.
B) I'd like to send table3 from remotedb to localdb.

I have PG9.5 and pglogical setup, and (A) works fine.

Well, worked.  I then tried to set up (B) which seems to work, but now
(A) is broken.



Since pglogical won't work, I dropped part (B) and went back to just (A).

To implement part (B) I figured I could schedule a perl script to pull 
changes.



On remote db I:
select pg_create_logical_replication_slot('custom_slot','decoder_raw');

And wrote myself a perl script to run in a cron job every hour that would:
SELECT * FROM pg_logical_slot_get_changes('custom_slot', null, null);

and send it to localdb.

Super.  Except pg_xlog grows and grows.  pg_replication_slots shows it 
as active=false, and restart_lsn never changes.  Even after calling 
pg_logical_slot_get_changes many times.


If I: select pg_drop_replication_slot('custom_slot');

pg_xlog shrinks right away.

Does that mean I cannot poll pg_logical_slot_get_changes()?  Do I have 
to use something like pg_recvlogical or pglogical?



Thanks for your time,

-Andy




[GENERAL] pglogical cross subscribe

2016-08-03 Thread Andy Colson

Hi all.

I have a localdb and a remotedb.

A) I'd like to send table1 and table2 from localdb to remotedb.
B) I'd like to send table3 from remotedb to localdb.

I have PG9.5 and pglogical setup, and (A) works fine.

Well, worked.  I then tried to set up (B) which seems to work, but now (A) is 
broken.

remotedb is logging things like:
2016-08-03 20:20:17.086 CDT,: ERROR:  cache lookup failed for replication 
origin 'pgl_vcstimes_katniss_subscrib73496c2'
2016-08-03 20:20:17.087 CDT,: LOG:  worker process: pglogical apply 
23704565:3237263450 (PID 21981) exited with exit code 1

I did find:
https://www.postgresql.org/message-id/1691408856.31522.1468333259045.javamail.zim...@cobra.ru
which leads to this:
https://github.com/2ndQuadrant/pglogical/issues/23

But I'm not sure if that helps me, or if it does I don't understand what I'd 
need to do.

On remotedb, I see:
vcstimes=# select * from pglogical.show_subscription_status();
-[ RECORD 1 ]-+---
subscription_name | subscribe_katniss
status| down
provider_node | payroll_provider
provider_dsn  | host=localhost port=9090 dbname=vcstimes user=postgres
slot_name | pgl_vcstimes_payroll_provider_subscribc0f4b85
replication_sets  | {default,default_insert_only,ddl_sql}
forward_origins   | {all}

I cannot find any description of status = down, and I don't know how to bring 
it back up.  Any help would be appreciated.

Thanks,

-Andy




Re: [GENERAL] 9.3 to 9.5 upgrade problems

2016-07-03 Thread Andy Colson

On 07/03/2016 11:42 AM, Andy Colson wrote:


Now that I think about this more, I think you're on to something.
I'm trying to get an _exact_ copy of the master db onto the slave.
Checking rsync man, it matches only on size and modified time, and I
didn't include deletes.

I'm going to re-try with this:

select pg_start_backup('clone',true);
rsync -av --delete --checksum --exclude pg_xlog --exclude postgresql.conf /pub/pg95/* web2:/pub/pg95/
select pg_stop_backup();
rsync -av --delete --checksum /pub/pg95/pg_xlog web2:/pub/pg95/

That should make sure the copies are exact.


-Andy





That seemed to have worked.  Took forever, but once I started the slave on web2 
I was able to login and run queries without any errors or problems.

Thanks Adrian,

-Andy




Re: [GENERAL] 9.3 to 9.5 upgrade problems

2016-07-03 Thread Andy Colson

On 07/03/2016 11:04 AM, Adrian Klaver wrote:

On 07/03/2016 08:49 AM, Andy Colson wrote:

On 07/03/2016 10:35 AM, Adrian Klaver wrote:

On 07/03/2016 08:06 AM, Andy Colson wrote:

Hi all,

I have a master (web1) and two slaves (web2, webserv), one slave is
quite far from the master, the db is 112 Gig, so pg_basebackup is my
last resort.

I followed the page here:
https://www.postgresql.org/docs/9.5/static/pgupgrade.html

including the rsync stuff.  I practiced it _twice_, once in PG 9.5 beta,
and again a week ago, on two VM's I created locally.  Both practice
sessions worked perfectly.

I just ran it on the live databases.  The master seems ok, its running
PG 9.5 now, I can login to it, and no errors in the log.

Neither slave works.  After I'd gotten done with the pgupgrade steps,
both slaves gave me this error:

FATAL:  database system identifier differs between the primary and
standby

Sure enough pg_controldata showed their database system ids were different
(all three web1, web2, webserv were different.  no matches at all), so
I'm assuming the rsync didn't rsync right, or I missed a step and ran it
too early, or something ... I'm not quite sure.

I needed to get the live website back up and running again, so I let the
master go, ran analyze, and when it was finished, used the steps here to
try and resync:

https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial

on Master:
select pg_start_backup('clone',true);
rsync -av --exclude pg_xlog --exclude postgresql.conf /pub/pg95/*
web2:/pub/pg95/
select pg_stop_backup();
rsync -av /pub/pg95/pg_xlog web2:/pub/pg95/


Not sure about above rsync, that seems to undo what you did previously.

Also was the remote directory empty when you did this?



Not sure what you mean by undo.  The pgupgrade.html page, step 10, has you
rsync the master to the slave, so the pg95 directory is hard linked to
the pg93, which saves a ton of bandwidth when your servers are cross
country.


I understand; I am just trying to figure out what mixing methods (pg_upgrade, 
pg_start_backup) is doing.

In particular the section on pg_start_backup:

https://www.postgresql.org/docs/9.5/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

starts with:

"Ensure that WAL archiving is enabled and working."

and from that I gather that is not the case.



The slave log says it reached a consistent state, and is accepting connections, 
which tells me it should run ok, maybe not with the newest data, but at least 
not get:

FATAL:  cache lookup failed for database 16401



Also was the remote directory empty when you did this?


Now that I think about this more, I think you're on to something.  I'm trying 
to get an _exact_ copy of the master db onto the slave.  Checking rsync man, it 
matches only on size and modified time, and I didn't include deletes.

I'm going to re-try with this:

select pg_start_backup('clone',true);
rsync -av --delete --checksum --exclude pg_xlog --exclude postgresql.conf 
/pub/pg95/* web2:/pub/pg95/
select pg_stop_backup();
rsync -av --delete --checksum /pub/pg95/pg_xlog web2:/pub/pg95/

That should make sure the copies are exact.


-Andy





Re: [GENERAL] 9.3 to 9.5 upgrade problems

2016-07-03 Thread Andy Colson

On 07/03/2016 10:35 AM, Adrian Klaver wrote:

On 07/03/2016 08:06 AM, Andy Colson wrote:

Hi all,

I have a master (web1) and two slaves (web2, webserv), one slave is
quite far from the master, the db is 112 Gig, so pg_basebackup is my
last resort.

I followed the page here:
https://www.postgresql.org/docs/9.5/static/pgupgrade.html

including the rsync stuff.  I practiced it _twice_, once in PG 9.5 beta,
and again a week ago, on two VM's I created locally.  Both practice
sessions worked perfectly.

I just ran it on the live databases.  The master seems ok, its running
PG 9.5 now, I can login to it, and no errors in the log.

Neither slave works.  After I'd gotten done with the pgupgrade steps,
both slaves gave me this error:

FATAL:  database system identifier differs between the primary and standby

Sure enough pg_controldata showed their database system ids were different
(all three web1, web2, webserv were different.  no matches at all), so
I'm assuming the rsync didn't rsync right, or I missed a step and ran it
too early, or something ... I'm not quite sure.

I needed to get the live website back up and running again, so I let the
master go, ran analyze, and when it was finished, used the steps here to
try and resync:

https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial

on Master:
select pg_start_backup('clone',true);
rsync -av --exclude pg_xlog --exclude postgresql.conf /pub/pg95/*
web2:/pub/pg95/
select pg_stop_backup();
rsync -av /pub/pg95/pg_xlog web2:/pub/pg95/


Not sure about above rsync, that seems to undo what you did previously.

Also was the remote directory empty when you did this?



Not sure what you mean by undo.  The pgupgrade.html page, step 10, has you rsync 
the master to the slave, so the pg95 directory is hard linked to the pg93, 
which saves a ton of bandwidth when your servers are cross-country.

My second rsync did the same thing, but only on the pg95 directory (my db lives 
in /pub/pg95).

No, the directory was not empty, and I'm really trying to avoid a fresh copy of 
112 Gig.

-Andy






Re: [GENERAL] 9.3 to 9.5 upgrade problems

2016-07-03 Thread Andy Colson

On Sun, Jul 3, 2016 at 11:06 AM, Andy Colson <a...@squeakycode.net> wrote:

Hi all,

I have a master (web1) and two slaves (web2, webserv), one slave is quite 
far from the master, the db is 112 Gig, so pg_basebackup is my last resort.

I followed the page here:
https://www.postgresql.org/docs/9.5/static/pgupgrade.html

including the rsync stuff.  I practiced it _twice_, once in PG 9.5 beta, 
and again a week ago, on two VM's I created locally.  Both practice sessions 
worked perfectly.



On 07/03/2016 10:11 AM, Vick Khera wrote:

binary replication requires the versions be identical. Also, once you ran 
pg_upgrade you altered one of the copies so binary replication can no longer 
work on that either.




Yes, all three boxes are running Pg 9.5.

I've uninstalled the PG 9.3 package, and deleted the /pub/pg93 database 
directory.

My rsync copied the pg95 folder from web1 to web2.

-Andy





[GENERAL] 9.3 to 9.5 upgrade problems

2016-07-03 Thread Andy Colson

Hi all,

I have a master (web1) and two slaves (web2, webserv), one slave is quite far 
from the master, the db is 112 Gig, so pg_basebackup is my last resort.

I followed the page here:
https://www.postgresql.org/docs/9.5/static/pgupgrade.html

including the rsync stuff.  I practiced it _twice_, once in PG 9.5 beta, and 
again a week ago, on two VM's I created locally.  Both practice sessions worked 
perfectly.

I just ran it on the live databases.  The master seems ok, its running PG 9.5 
now, I can login to it, and no errors in the log.

Neither slave works.  After I'd gotten done with the pgupgrade steps, both 
slaves gave me this error:

FATAL:  database system identifier differs between the primary and standby

Sure enough pg_controldata showed their database system ids were different (all three 
web1, web2, webserv were different.  no matches at all), so I'm assuming the 
rsync didn't rsync right, or I missed a step and ran it too early, or something 
... I'm not quite sure.

I needed to get the live website back up and running again, so I let the master 
go, ran analyze, and when it was finished, used the steps here to try and 
resync:

https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial

on Master:
select pg_start_backup('clone',true);
rsync -av --exclude pg_xlog --exclude postgresql.conf /pub/pg95/* 
web2:/pub/pg95/
select pg_stop_backup();
rsync -av /pub/pg95/pg_xlog web2:/pub/pg95/


That ran pretty quick, and pg_controldata shows matching numbers, but when I 
start the slave I get:

,,2016-07-03 06:06:57.173 CDT,: LOG:  entering standby mode
,,2016-07-03 06:06:57.205 CDT,: LOG:  redo starts at 369/D6002228
,,2016-07-03 06:06:57.984 CDT,: LOG:  consistent recovery state reached at 
369/DCC5DB90
,,2016-07-03 06:06:57.984 CDT,: LOG:  database system is ready to accept read 
only connections
,,2016-07-03 06:06:57.984 CDT,: LOG:  invalid record length at 369/DD038ED0
,,2016-07-03 06:06:58.344 CDT,: LOG:  started streaming WAL from primary at 
369/DD00 on timeline 1
web,[unknown],2016-07-03 06:07:11.176 CDT,[local]: FATAL:  role "andy" does not 
exist

I can login as myself on the master, but not on the slave.  when I "psql -U 
postgres" on the slave I get:

psql: FATAL:  cache lookup failed for database 16401

This is only on web2, its close to web1, so I'm hoping I can get it fixed and 
then rsync it quickly to the far away slave.

I'm at a loss here, any hints or suggestions would be appreciated.

Thanks,

-Andy




Re: [GENERAL] OT hardware recommend

2016-06-18 Thread Andy Colson

On 06/17/2016 04:39 PM, John R Pierce wrote:

On 6/17/2016 2:33 PM, John W Higgins wrote:

http://www.ebay.com/itm/2U-24-bay-2-5-Supermicro-Server-X8DTH-iF-2x-Xeon-Quad-Core-32GB-RAM-SAS2-216EL1-/222132081393?hash=item33b81a92f1:g:UzYAAOSwR5dXSQVw

With it being 2U you can then pop out the motherboard and go with anything more 
modern you wanted in terms of the motherboard/cpu/ram.


I would, however, also buy a LSI/Avago SAS 9207-8i card and remove that 3ware 
9750 'hardware' raid controller, which you can probably get $200 for on ebay.



Thanks John and John.  (Thank the John's!)

Seems like I could also get an HP D2700 off ebay pretty cheap, but I'd probably 
also have to buy a raid card that supports mini-sas, and I'd have to find a 
computer to attach it to.  Anyone with experience with those (or similar)?

Any ideas if 15 5400rpm laptop drives in raid 10 attached to a D2700 would be 
performant at all?

There's a pretty good chance I'll get a few more.  I might get 5 more, would 
that make a difference?

Would the Supermicro and an HP D2700 use about the same amount of power?

Thanks all.

-Andy




Re: [GENERAL] OT hardware recommend

2016-06-18 Thread Andy Colson

On 06/17/2016 09:23 PM, Scott Marlowe wrote:

On Fri, Jun 17, 2016 at 2:36 PM, Andy Colson <a...@squeakycode.net> wrote:

Hi all.

I have access to quite a few laptop HD's (10 to 15 of them at least), and 
thought that might make a neat test box that might have some good IO speed.

Needs to be cheap though, so linux with software raid, rack mount preferred 
but not required.

Anyone have any experience with anything like that?  $2K might be possible, 
painful, but possible.

Suggestions?


Sell them all and buy a couple of 800G SSDs? :)

Gaaa.. math!

Yep, had not thought about that.  But do I want speed or space?  How much could we sell 
them for?  So many questions I had not thought about.  I'd just thought, "huh, a 
pile of laptop drives, I'd better raid them", and not much else.

We have good production boxes, so this would only be for test/play.  I don't 
even really have a purpose, other than maybe learning something new.

Thanks all for the suggestions.

-Andy




Re: [GENERAL] OT hardware recommend

2016-06-18 Thread Andy Colson

On 06/17/2016 04:33 PM, Gavin Flower wrote:

On 18/06/16 08:36, Andy Colson wrote:

Hi all.

I have access to quite a few laptop HD's (10 to 15 of them at least), and 
thought that might make a neat test box that might have some good IO speed.

Needs to be cheap though, so linux with software raid, rack mount preferred but 
not required.

Anyone have any experience with anything like that?  $2K might be possible, 
painful, but possible.

Suggestions?

Thanks for your time,

-Andy



It would be a good idea to say what country you are in, and what city (or 
locality).

What I know about Auckland in New Zealand may not be relevant to you...  :-)


Cheers,
Gavin





Iowa, USA.





[GENERAL] OT hardware recommend

2016-06-17 Thread Andy Colson

Hi all.

I have access to quite a few laptop HD's (10 to 15 of them at least), 
and thought that might make a neat test box that might have some good IO 
speed.


Needs to be cheap though, so linux with software raid, rack mount 
preferred but not required.


Anyone have any experience with anything like that?  $2K might be 
possible, painful, but possible.


Suggestions?

Thanks for your time,

-Andy




Re: [GENERAL] Using a VIEW as a temporary mechanism for renaming a table

2016-06-08 Thread Andy Colson

On 6/8/2016 12:57 PM, Ben Buckman wrote:

Hello,
I would like to rename a table with ~35k rows (on pgsql 9.4), let's say
from `oldthings` to `newthings`.
Our application is actively reading from and writing to this table, and
the code will break if the table name suddenly changes at runtime. So I
can't simply run an `ALTER TABLE oldthings RENAME TO newthings`, unless
we take downtime, which we'd prefer not to do. (I'd also prefer to avoid
a data migration from one table to another, which would require
dual-writes or some other way to handle data written during the transition.)

It seems that a reasonable approach to do this without downtime, would
be to use a temporary VIEW. We can `CREATE VIEW newthings AS SELECT *
FROM oldthings;`. Views in pg9.4 that are backed by a single table
support writes. So my plan is like this:

1. Create the view, essentially as an alias to the table.
2. In the code, change all references from the old name to the new name.
The code would "think" it's using a renamed table, but would really be
using a view.
  (At this point, I expect that all basic CRUD operations on the view
should behave as if they were on the table, and that the added
performance impact would be negligible.)
3. In a transaction, drop the view and rename the table, so `newthings`
is now the original table and `oldthings` no longer exists. (In my
testing, this operation took <10ms.)
  (When this is done, the view will have only existed and been used by
the application for a few minutes.)

What are people's thoughts on this approach? Is there a flaw or
potential danger that I should be aware of? Is there a simpler approach
I should consider instead?

Thank you


--



Oh, one other minor comment.  I usually have a temp schema staging area 
with exact table structures but new data, and when everything is ready I 
run:


start transaction;

drop table public.tableA;
alter table tmp.tableA set schema public;

... same for 100 more tables ...
commit;

99% of the time it works great, but every once in a while I get a 
deadlock error.  I just re-run it real quick and it works fine.


When you do your drop view / rename table, if you happen to get a 
deadlock, I wouldn't worry too much.  Just re-run it.  Also, I'm still on 
9.3 so maybe it's not as much of a problem anymore.


-Andy




Re: [GENERAL] Using a VIEW as a temporary mechanism for renaming a table

2016-06-08 Thread Andy Colson

On 6/8/2016 2:57 PM, Ben Buckman wrote:

Thanks Andy.

My understanding, and please correct me if I'm wrong, is that the view
will effectively inherit the table's constraints, because writes to the
view that can't be written to the table will fail on the table. Re:
"will the data be good data," what risks should I be considering?

In terms of rollout, we would 1) create the view, 2) deploy code that
uses the new [view] name, 3) drop the view and rename the table.
Deployments are "rolling" so there would be no downtime. The app and
users shouldn't notice/care if they're hitting the table or the view.

Thank you


I'd assumed the new version of the app would have new columns in the table. 
That's what I meant by good data.  New columns would not get populated 
by the old app.


But if the table structure isn't changing, then I'd say your plan sounds 
like it should work.  I've never tried it, personally, but I would if I 
were in the same boat.
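
For what it's worth, the whole dance you described is only a few statements; 
a rough sketch using the names from your example:

-- step 1: alias the table under its new name
CREATE VIEW newthings AS SELECT * FROM oldthings;

-- step 2: deploy the code that references newthings instead of oldthings

-- step 3: swap the real table in, atomically
BEGIN;
DROP VIEW newthings;
ALTER TABLE oldthings RENAME TO newthings;
COMMIT;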


-Andy




Re: [GENERAL] Using a VIEW as a temporary mechanism for renaming a table

2016-06-08 Thread Andy Colson

On 6/8/2016 12:57 PM, Ben Buckman wrote:

Hello,
I would like to rename a table with ~35k rows (on pgsql 9.4), let's say
from `oldthings` to `newthings`.
Our application is actively reading from and writing to this table, and
the code will break if the table name suddenly changes at runtime. So I
can't simply run an `ALTER TABLE oldthings RENAME TO newthings`, unless
we take downtime, which we'd prefer not to do. (I'd also prefer to avoid
a data migration from one table to another, which would require
dual-writes or some other way to handle data written during the transition.)

It seems that a reasonable approach to do this without downtime, would
be to use a temporary VIEW. We can `CREATE VIEW newthings AS SELECT *
FROM oldthings;`. Views in pg9.4 that are backed by a single table
support writes. So my plan is like this:

1. Create the view, essentially as an alias to the table.
2. In the code, change all references from the old name to the new name.
The code would "think" it's using a renamed table, but would really be
using a view.
  (At this point, I expect that all basic CRUD operations on the view
should behave as if they were on the table, and that the added
performance impact would be negligible.)
3. In a transaction, drop the view and rename the table, so `newthings`
is now the original table and `oldthings` no longer exists. (In my
testing, this operation took <10ms.)
  (When this is done, the view will have only existed and been used by
the application for a few minutes.)

What are people's thoughts on this approach? Is there a flaw or
potential danger that I should be aware of? Is there a simpler approach
I should consider instead?

Thank you


--


Shyp
*Ben Buckman / Platform Engineering*
www.shyp.com
Shipping made easy 





I think it kinda depends on how you roll out an application.  Will you 
kick everyone out, update it, and let them back in?  Sounds like you 
want to avoid that to avoid downtime.


But, if your old app is hitting a view, and inserts data via the view 
into the new table, will it be good data?  Will the new app be ok with it?


-Andy




Re: [GENERAL] Best approach for multi-database system

2016-03-10 Thread Andy Colson

Please don't top-post.
> On 10/03/2016 16:56, Andy Colson wrote:

On 3/10/2016 1:41 PM, Edson Richter wrote:

Hi!

Using async replication between geographically distributed database
servers. I've got some 9.3 and some 9.4 servers, so the version doesn't
matter (the only reason I've not migrated everything to 9.4 yet is that
I'm short on time).

I've experienced some synchronization failures between master and slave
servers (for example, if the connection is lost for 4 to 6 hours,
wal_keep_segments may not suffice).
I've created some bash scripts that recreate the slave from the master
using rsync, which seems to be fairly easy - but really slow.
Depending on server size (130GB, for example) it can take really long on
a 5Mbps link - in some cases, all night long.

This is the first approach: all databases in one server, and if I lose
the sync between master and slave, all the data has to be transferred
from master to slave, which is really slow.

The other approach I've tested is to put each database in its own
cluster. With the second approach, I can resync each cluster separately
(for example 15GB instead of 130GB).
This approach has the problem that another cluster has to be configured
and maintained for each database.

Is there a best practice for multi-database systems with async
replication to avoid such "resync" problems?


Thanks for your orientation,



I wrote about my setup here:

http://www.postgresql.org/message-id/548f3954.8040...@squeakycode.net

Would be happy to answer questions.

-Andy









On 3/10/2016 2:14 PM, Edson Richter wrote:
> Interesting. This logic (only remove WAL files after they have been
> applied) should be available in PostgreSQL by default.
> I mean, to have an option where either the master queries the slave
> every 15 minutes for completeness, or the slave notifies the master of
> the last successfully imported WAL.
>
>
> Atenciosamente,
>
> Edson Carlos Ericksson Richter
>


It is.  Starting in 9.4 you can use replication slots.
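
A minimal sketch (the slot name 'standby1' is just an example, and 
max_replication_slots must be > 0 on the master):

-- on the master:
SELECT pg_create_physical_replication_slot('standby1');

-- then in the standby's recovery.conf:
-- primary_slot_name = 'standby1'

With a slot in place, the master keeps whatever WAL the slave still 
needs instead of relying on wal_keep_segments.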

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best approach for multi-database system

2016-03-10 Thread Andy Colson

On 3/10/2016 1:41 PM, Edson Richter wrote:

Hi!

Using async replication between geographically distributed database
servers. I've got some 9.3 and some 9.4 servers, so the version doesn't
matter (the only reason I've not migrated everything to 9.4 yet is that
I'm short on time).

I've experienced some synchronization failures between master and slave
servers (for example, if the connection is lost for 4 to 6 hours,
wal_keep_segments may not suffice).
I've created some bash scripts that recreate the slave from the master
using rsync, which seems to be fairly easy - but really slow.
Depending on server size (130GB, for example) it can take really long on
a 5Mbps link - in some cases, all night long.

This is the first approach: all databases in one server, and if I lose
the sync between master and slave, all the data has to be transferred
from master to slave, which is really slow.

The other approach I've tested is to put each database in its own
cluster. With the second approach, I can resync each cluster separately
(for example 15GB instead of 130GB).
This approach has the problem that another cluster has to be configured
and maintained for each database.

Is there a best practice for multi-database systems with async
replication to avoid such "resync" problems?


Thanks for your orientation,



I wrote about my setup here:

http://www.postgresql.org/message-id/548f3954.8040...@squeakycode.net

Would be happy to answer questions.

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test CMake build

2016-02-13 Thread Andy Colson

On 02/13/2016 04:33 AM, Yury Zhuravlev wrote:

Andy Colson wrote:

cmake and make -j2 fine, but then

You can try again; I removed some CMake 3.x-only features.

Really big thanks for testing!

My pleasure.

Still didn't work.  'make check' seems to make it through make install first, 
but then same sort of error:


andy@mapper:~/projects/postgres_cmake/build$ make check
[  1%] Built target gen_errorcodes
[  3%] Built target port

-- Installing: 
/home/andy/projects/postgres_cmake/build/src/test/regress/tmp_install/tmp/pg99/share/postgresql/extension/timetravel--1.0.sql
-- Installing: 
/home/andy/projects/postgres_cmake/build/src/test/regress/tmp_install/tmp/pg99/share/postgresql/extension/timetravel--unpackaged--1.0.sql
Built target tablespace-setup
CMake Error: cmake version 2.8.12
Usage: /usr/bin/cmake -E [command] [arguments ...]
Available commands:
  chdir dir cmd [args]...   - run command in a given directory
  compare_files file1 file2 - check if file1 is same as file2
  copy file destination - copy file to destination (either file or 
directory)
  copy_directory source destination   - copy directory 'source' content to 
directory 'destination'
  copy_if_different in-file out-file  - copy file if input has changed
  echo [string]...  - displays arguments as text
  echo_append [string]...   - displays arguments as text but no new line
  environment   - display the current environment
  make_directory dir- create a directory
  md5sum file1 [...]- compute md5sum of files
  remove [-f] file1 file2 ... - remove the file(s), use -f to force it
  remove_directory dir  - remove a directory and its contents
  rename oldname newname- rename a file or directory (on one volume)
  tar [cxt][vfz][cvfj] file.tar [file/dir1 file/dir2 ...]
- create or extract a tar or zip archive
  time command [args] ...   - run command and return elapsed time
  touch file- touch a file.
  touch_nocreate file   - touch a file but do not create it.
Available on UNIX only:
  create_symlink old new- create a symbolic link new -> old

make[7]: *** [src/test/regress/CMakeFiles/installcheck_tmp] Error 1
make[6]: *** [src/test/regress/CMakeFiles/installcheck_tmp.dir/all] Error 2
make[5]: *** [src/test/regress/CMakeFiles/installcheck_tmp.dir/rule] Error 2
make[4]: *** [installcheck_tmp] Error 2
make[3]: *** [src/test/regress/CMakeFiles/check] Error 2
make[2]: *** [src/test/regress/CMakeFiles/check.dir/all] Error 2
make[1]: *** [src/test/regress/CMakeFiles/check.dir/rule] Error 2
make: *** [check] Error 2



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test CMake build

2016-02-12 Thread Andy Colson

On 2/12/2016 9:47 AM, Yury Zhuravlev wrote:

Andy Colson wrote:

Is the installcheck important to you?


Hello!
You can try the new "make check". Also, "make install" now supports DESTDIR.



cmake and make -j2 fine, but then

andy@mapper:~/projects/postgres_cmake/build$ make check
Scanning dependencies of target check
CMake Error: cmake version 2.8.12
Usage: /usr/bin/cmake -E [command] [arguments ...]
Available commands:
  chdir dir cmd [args]...   - run command in a given directory
  compare_files file1 file2 - check if file1 is same as file2
  copy file destination - copy file to destination (either file or 
directory)
  copy_directory source destination   - copy directory 'source' content 
to directory 'destination'

  copy_if_different in-file out-file  - copy file if input has changed
  echo [string]...  - displays arguments as text
  echo_append [string]...   - displays arguments as text but no new line
  environment   - display the current environment
  make_directory dir- create a directory
  md5sum file1 [...]- compute md5sum of files
  remove [-f] file1 file2 ... - remove the file(s), use -f to force it
  remove_directory dir  - remove a directory and its contents
  rename oldname newname- rename a file or directory (on one volume)
  tar [cxt][vfz][cvfj] file.tar [file/dir1 file/dir2 ...]
- create or extract a tar or zip archive
  time command [args] ...   - run command and return elapsed time
  touch file- touch a file.
  touch_nocreate file   - touch a file but do not create it.
Available on UNIX only:
  create_symlink old new- create a symbolic link new -> old

make[3]: *** [src/test/regress/CMakeFiles/check] Error 1
make[2]: *** [src/test/regress/CMakeFiles/check.dir/all] Error 2
make[1]: *** [src/test/regress/CMakeFiles/check.dir/rule] Error 2
make: *** [check] Error 2





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test CMake build

2016-02-11 Thread Andy Colson

On 2/10/2016 12:09 PM, Yury Zhuravlev wrote:

Hello all.
Please test building Postgres using CMake, if you are interested of course.
Not everything is ready yet, but most of the work is done.  Assembly
instructions, as well as the repository, are on GitHub:
https://github.com/stalkerg/postgres_cmake

Compiling it will be enough (tests are even better).  I need feedback so
that I can create issues on GitHub.
NetBSD, OpenBSD, and Solaris are especially interesting.
Thanks!


On a side note, how hard would it be to print a small summary of options 
after the cmake step?


It's not important, but it is pretty nice.  mapserver, for example, shows this:


-- * Summary of configured options for this build
--  * Mandatory components
--   * png: /usr/lib64/libpng.so
--   * jpeg: /usr/lib64/libjpeg.so
--   * freetype: /usr/lib64/libfreetype.so
--  * Optional components
--   * GDAL: /usr/local/lib/libgdal.so
--   * OGR: /usr/local/lib/libgdal.so
--   * GD: disabled
--   * GIF: /usr/lib64/libgif.so
--   * MYSQL: disabled
--   * FRIBIDI: /usr/lib64/libfribidi.so
--   * GIF: /usr/lib64/libgif.so
--   * CAIRO: /usr/lib64/libcairo.so
--   * SVGCAIRO: disabled
--   * RSVG: disabled
--   * CURL: disabled
--   * PROJ: /usr/lib64/libproj.so
--   * LIBXML2: /usr/lib64/libxml2.so
--   * POSTGIS: /usr/local/pg93/lib/libpq.so
--   * GEOS: /usr/lib64/libgeos_c.so
--   * FastCGI: /usr/lib64/libfcgi.so
--   * Oracle Spatial: disabled
--   * SDE: disabled
--   * Exempi XMP: disabled
--  * Optional features
--   * WMS SERVER: ENABLED
--   * WFS SERVER: ENABLED
--   * WCS SERVER: ENABLED
--   * SOS SERVER: disabled
--   * WMS CLIENT: disabled
--   * WFS CLIENT: disabled
--   * ICONV: ENABLED
--   * Thread-safety support: disabled
--   * KML output: disabled
--   * Z+M point coordinate support: disabled
--   * XML Mapfile support: disabled
--  * Mapscripts
--   * Python: disabled
--   * PHP: disabled
--   * PERL: ENABLED
--   * RUBY: disabled
--   * JAVA: disabled
--   * C#: disabled
--   * Apache Module (Experimental): disabled
--
-- Will install files to /usr/local
-- Will install libraries to /usr/local/lib64


We would not care about the libs, but things like integer dates; perl, 
python, etc.; and the ssl version.  Looks like you calc TABLE_BLOCKSIZE and 
WAL_BLOCKSIZE; those might be nice to see.



Anyway, thanks for all your work on this.  Looking good.

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test CMake build

2016-02-11 Thread Andy Colson

On 2/11/2016 10:44 AM, Andy Colson wrote:

On 2/11/2016 9:49 AM, Yury Zhuravlev wrote:

Yury Zhuravlev wrote:

I will try to fix it soon.
I'll write once it's corrected.


You can try again. Thanks!



That seems better:

-- Found Readline: /usr/include
-- Found Curses: /usr/lib64/libcurses.so

Is this bad?
-- Check size of int64
-- Check size of int64 - failed
-- Check size of uint64
-- Check size of uint64 - failed
-- Check size of int8
-- Check size of int8 - failed

make -j2
running now...

-Andy





make finished ok.

I build slackware packages, so it needs to install into a temp spot, 
which kinda works:


andy@mapper:~/projects/postgres_cmake/build$ mkdir /tmp/pg999

I used to use "make install-strip", is that not a thing anymore?

andy@mapper:~/projects/postgres_cmake/build$ make install-strip 
DESTDIR=/tmp/pg999

make: *** No rule to make target `install-strip'.  Stop.

Ok, we'll install, but into a tmp folder:

andy@mapper:~/projects/postgres_cmake/build$ make install DESTDIR=/tmp/pg999
[  0%] Built target gen_errorcodes
[  3%] Built target port
[  5%] Built target port_srv


[100%] Built target refint
[100%] Built target timetravel
Install the project...
-- Install configuration: ""
-- Installing: /tmp/pg999/usr/local/pg99/lib/libpq.so
-- Installing: /tmp/pg999/usr/local/pg99/bin/initdb

-- Removed runtime path from "/tmp/pg999/usr/local/pg99/bin/initdb"



-- Installing: /tmp/pg999/usr/local/pg99/share/timezonesets/Default
-- Installing: /tmp/pg999/usr/local/pg99/share/timezonesets/Australia
-- Installing: /tmp/pg999/usr/local/pg99/share/timezonesets/India
/home/andy/projects/postgres_cmake/build/src/timezone//zic: Cannot 
create directory /usr/local/pg99: Permission denied

-- Installing: /tmp/pg999/usr/local/pg99/lib/plpgsql.so
-- Installing: /tmp/pg999/usr/local/pg99/share/extension/plpgsql.control
-- Installing: /tmp/pg999/usr/local/pg99/share/extension/plpgsql--1.0.sql
-- Installing: 
/tmp/pg999/usr/local/pg99/share/extension/plpgsql--unpackaged--1.0.sql





I start with:
cmake .. -DCMAKE_INSTALL_PREFIX="/usr/local/pg99"

I'm not building/installing as root, so cannot actually write to 
/usr/local/pg99


I dunno what version we are building, and don't want to replace anything 
currently installed, so I went for pg99.




andy@mapper:~/projects/postgres_cmake/build$ make installcheck
Scanning dependencies of target tablespace-setup
[  0%] Built target tablespace-setup
[  0%] Built target gen_errorcodes
[ 42%] Built target port
[ 85%] Built target pq
[100%] Built target pgcommon
[100%] Built target pg_regress
Scanning dependencies of target installcheck
== creating temporary instance==
== initializing database system   ==

pg_regress: initdb failed
Examine 
/home/andy/projects/postgres_cmake/src/test/regress/log/initdb.log for 
the reason.
Command was: "/usr/local/pg99/bin/initdb" -D 
"/home/andy/projects/postgres_cmake/src/test/regress/tmp_check/data" 
--noclean --nosync --no-locale > 
"/home/andy/projects/postgres_cmake/src/test/regress/log/initdb.log" 2>&1

make[3]: *** [src/test/regress/CMakeFiles/installcheck] Error 2
make[2]: *** [src/test/regress/CMakeFiles/installcheck.dir/all] Error 2
make[1]: *** [src/test/regress/CMakeFiles/installcheck.dir/rule] Error 2
make: *** [installcheck] Error 2


andy@mapper:~/projects/postgres_cmake/build$ cat 
/home/andy/projects/postgres_cmake/src/test/regress/log/initdb.log

sh: /usr/local/pg99/bin/initdb: No such file or directory


Yeah, that makes sense, it's not actually there yet.  Is the installcheck 
important to you?  I can do the install if you like.


-Andy





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test CMake build

2016-02-11 Thread Andy Colson

On 2/11/2016 9:49 AM, Yury Zhuravlev wrote:

Yury Zhuravlev wrote:

I will try to fix it soon.
I'll write once it's corrected.


You can try again. Thanks!



That seems better:

-- Found Readline: /usr/include
-- Found Curses: /usr/lib64/libcurses.so

Is this bad?
-- Check size of int64
-- Check size of int64 - failed
-- Check size of uint64
-- Check size of uint64 - failed
-- Check size of int8
-- Check size of int8 - failed

make -j2
running now...

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test CMake build

2016-02-10 Thread Andy Colson

On 2/10/2016 2:50 PM, Andy Colson wrote:

On 2/10/2016 2:45 PM, Andy Colson wrote:

On 2/10/2016 12:09 PM, Yury Zhuravlev wrote:

Hello all.
Please test building Postgres using CMake, if you are interested of course.
Not everything is ready yet, but most of the work is done.  Assembly
instructions, as well as the repository, are on GitHub:
https://github.com/stalkerg/postgres_cmake

Compiling it will be enough (tests are even better).  I need feedback so
that I can create issues on GitHub.
NetBSD, OpenBSD, and Solaris are especially interesting.
Thanks!



Slackware64, 14.1





/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so:
undefined reference to `tputs'



tputs is in ncurses?

I did not see a:

-- Looking for curses

And it didn't try to link with it:

/usr/bin/cc 
CMakeFiles/cmTryCompileExec301475258.dir/HAVE_LONG_LONG_INT_64.c.o  -o 
cmTryCompileExec301475258 -rdynamic -lreadline


-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test CMake build

2016-02-10 Thread Andy Colson

On 2/10/2016 2:45 PM, Andy Colson wrote:

On 2/10/2016 12:09 PM, Yury Zhuravlev wrote:

Hello all.
Please test building Postgres using CMake, if you are interested of course.
Not everything is ready yet, but most of the work is done.  Assembly
instructions, as well as the repository, are on GitHub:
https://github.com/stalkerg/postgres_cmake

Compiling it will be enough (tests are even better).  I need feedback so
that I can create issues on GitHub.
NetBSD, OpenBSD, and Solaris are especially interesting.
Thanks!

Slackware64, 14.1


-- The C compiler identification is GNU 4.9.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works


This might be important:

-- Looking for include file pwd.h
-- Looking for include file pwd.h - found
-- Found Readline: /usr/include
-- Looking for include files stdio.h, readline.h
-- Looking for include files stdio.h, readline.h - not found
-- Looking for include files stdio.h, history.h
-- Looking for include files stdio.h, history.h - not found
-- Looking for include files stdio.h, readline/history.h
-- Looking for include files stdio.h, readline/history.h - found
-- Looking for include files stdio.h, readline/readline.h
-- Looking for include files stdio.h, readline/readline.h - found
-- Looking for include files stdio.h, editline/history.h
-- Looking for include files stdio.h, editline/history.h - not found
-- Looking for include files stdio.h, editline/readline.h
-- Looking for include files stdio.h, editline/readline.h - not found

> -- Check size of long long int - failed
> CMake Error at CMakeLists.txt:262 (message):
>Cannot find a working 64-bit integer type.
>



the end of CMakeFiles/CMakeError.log shows:

Determining size of long long int failed with the following output:
Change Dir: /home/andy/projects/postgres_cmake/build/CMakeFiles/CMakeTmp

Run Build Command:/usr/bin/gmake "cmTryCompileExec301475258/fast"
/usr/bin/gmake -f CMakeFiles/cmTryCompileExec301475258.dir/build.make 
CMakeFiles/cmTryCompileExec301475258.dir/build
gmake[1]: Entering directory 
`/home/andy/projects/postgres_cmake/build/CMakeFiles/CMakeTmp'
/usr/bin/cmake -E cmake_progress_report 
/home/andy/projects/postgres_cmake/build/CMakeFiles/CMakeTmp/CMakeFiles 1
Building C object 
CMakeFiles/cmTryCompileExec301475258.dir/HAVE_LONG_LONG_INT_64.c.o
/usr/bin/cc-o 
CMakeFiles/cmTryCompileExec301475258.dir/HAVE_LONG_LONG_INT_64.c.o   -c 
/home/andy/projects/postgres_cmake/build/CMakeFiles/CheckTypeSize/HAVE_LONG_LONG_INT_64.c

Linking C executable cmTryCompileExec301475258
/usr/bin/cmake -E cmake_link_script 
CMakeFiles/cmTryCompileExec301475258.dir/link.txt --verbose=1
/usr/bin/cc 
CMakeFiles/cmTryCompileExec301475258.dir/HAVE_LONG_LONG_INT_64.c.o  -o 
cmTryCompileExec301475258 -rdynamic -lreadline
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `tputs'
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `tgoto'
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `tgetflag'
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `UP'
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `tgetent'
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `tgetnum'
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `PC'
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `tgetstr'
/usr/lib64/gcc/x86_64-slackware-linux/4.9.2/../../../../lib64/libreadline.so: 
undefined reference to `BC'

collect2: error: ld returned 1 exit status
gmake[1]: *** [cmTryCompileExec301475258] Error 1
gmake[1]: Leaving directory 
`/home/andy/projects/postgres_cmake/build/CMakeFiles/CMakeTmp'

gmake: *** [cmTryCompileExec301475258/fast] Error 2






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test CMake build

2016-02-10 Thread Andy Colson

On 2/10/2016 12:09 PM, Yury Zhuravlev wrote:

Hello all.
Please test building Postgres using CMake, if you are interested of course.
Not everything is ready yet, but most of the work is done.  Assembly
instructions, as well as the repository, are on GitHub:
https://github.com/stalkerg/postgres_cmake

Compiling it will be enough (tests are even better).  I need feedback so
that I can create issues on GitHub.
NetBSD, OpenBSD, and Solaris are especially interesting.
Thanks!

Slackware64, 14.1


-- The C compiler identification is GNU 4.9.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
etc, etc,
-- Performing Test AC_FUNC_ACCEPT
-- Performing Test AC_FUNC_ACCEPT - Failed
-- Performing Test AC_FUNC_ACCEPT
-- Performing Test AC_FUNC_ACCEPT - Failed
...A TON of the above...
-- Performing Test AC_FUNC_ACCEPT
-- Performing Test AC_FUNC_ACCEPT - Failed
ERRORcould not determine argument types
-- Looking for accept function args - found , , ,  *
-- Check alignment of double
-- Check alignment of double - 8
-- Check alignment of int
-- Check alignment of int - 4
-- Check alignment of long
-- Check alignment of long - 8
-- Check alignment of long long int
-- Check alignment of long long int - 8
-- Check alignment of short
-- Check alignment of short - 2
-- Check size of int64
-- Check size of int64 - failed
-- Check size of uint64
-- Check size of uint64 - failed
-- Check size of int8
-- Check size of int8 - failed
-- Check size of void *
-- Check size of void * - failed
-- Check size of long int
-- Check size of long int - failed
-- Check size of long
-- Check size of long - failed
-- Check size of size_t
-- Check size of size_t - failed
-- Check size of locale_t
-- Check size of locale_t - failed
-- Check size of long long int
-- Check size of long long int - failed
CMake Error at CMakeLists.txt:262 (message):
  Cannot find a working 64-bit integer type.





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs Firebird SQL

2016-02-10 Thread Andy Colson

On 2/9/2016 10:10 PM, ioan ghip wrote:

I have a Firebird SQL database running on one of my servers which has
about 50k inserts, about 100k updates and about 30k deletes every day.
There are about 4 million records in 24 tables. I have a bunch of stored
procedures, triggers, events and views that I'm using.
Firebird works fairly well, but from time to time the database gets
corrupted and I couldn't figure out yet (after many years of running)
what's the reason. When this happens I run "gfix -mend -full -ignore",
backup and restore the db and everything is fine until next problem in a
week, or a month.

I never used PostgreSQL. Yesterday I installed it on my development
machine and after few tests I saw that it's fairly easy to use.

Does anyone have experience with both, Firebird and PostgreSQL? Is
PostgreSQL way better performing than Firebird? Is it worth the effort
moving away from Firebird? Would I gain stability and increased performance?

Thanks.




One of our windows apps runs on a client/server setup in the office, and 
then on a laptop for remote use.  We use Firebird (FB) for both.  It's a 
quick, simple install, runs in 8 megs of ram, and has zero maintenance.


The only time I've seen corruptions is anti-virus scanning the db, and 
HD/raid problems.


FB is a nice little db.  That said, I can wholeheartedly recommend PG! 
It could still run on a laptop, might require a bit more maintenance, 
but on a dedicated server, it would be able to grow and use all the 
resources available.


If you have HD/raid problems, then you won't gain stability.  Upgrading 
between major versions is also more difficult.


That said, yes, you'd gain stability and performance, and not only that, 
a huge amount of functionality.  A Huge Amount!


FB has replace() for string ops, oh, and substring().  Bah.  That's 
nothing compared to PG's.  It's like that Aladdin song, 'A Whole New World'!


You know, in FB, when one person does a large delete on a table?  The 
next person that runs a select will perform the vacuum on it.  It's the 
person running the select that pays the time for a huge delete.  In PG, 
there is a background vacuum task, so users don't pay the price.
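
A quick sketch, if you're curious, of watching autovacuum do that work:

show autovacuum;
select relname, last_autovacuum, last_autoanalyze
from pg_stat_user_tables
order by last_autovacuum desc nulls last;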


Respect for FB, but my heart belongs to PG.

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] long transfer time for binary data

2016-01-21 Thread Andy Colson

Am 21.01.2016 um 03:33 schrieb Andy Colson:

On 01/20/2016 03:29 PM, Johannes wrote:

I noticed transferring a large object or bytea data between client and
server takes a long time.
For example: An image with a real size of 11 MB could be read on server
side (explain analyze) in 81ms. Fine.

But on client side the result was completed after 6.7 seconds without
ssl compression and 4.5 seconds with ssl compression (both via 100MBit
ethernet).

SSL compression seems to be not a good idea anymore, since it has
become a security risk. It's still possible with pgadmin, but AFAIK not
with java/jdbc.

Are there any other solutions available to display my images in my
client application more quickly? Or are there planned improvements to
postgresql (transferring the real binary data)?

Best regards
Johannes



Yep, that's slow.  The ssl compression is very odd if the image is
jpeg'ish and already compressed.  If it's a bitmap or uncompressed tif
then it's not so surprising.

A few tests you could try:

1) copy the same 11 meg file from server to client via regular file copy
and time it.  At 100 Mbit/s it should take about a second.  If it takes
6 you have network problems, not PG problems.

2) try it via psql command line (or at least something other than java),
to see if it's java that's the problem.

3) watch wireshark/tcpdump, maybe you'll see something glaring that'll
point you in the right direction.

-Andy

PS: I've never heard that ssl compression was a security risk, got
links/proof?






On 01/21/2016 03:59 PM, Johannes wrote:

Here are some transferring measurements (from server to client) with the
same file.

scp
+ssl -compression 1.3 sec
+ssl +compression 4.6 sec

pgadmin
select lo_get(12345);
-ssl  3.4 sec
+ssl +compression 5.5 sec
+ssl -compression 4.5 sec

psql
select lo_get(12345);
+ssl -compression 6.0 sec
-ssl  4.4 sec

java/jdbc
only while(in.read(buf,0,len))
+ssl -compression 6.0 sec
-ssl  3.0 sec (+ 1.8 sec for new Image())

Here is a link for insecure ssl compression:
https://en.wikipedia.org/wiki/Comparison_of_TLS_implementations#Compression

Best Regargs
Johannes



Please don't top post.

Thanks for the link on ssl compression, I'd not seen that before.  I'm going to 
have to read up.

Your numbers ... look ... odd.  scp +compression is slower?  pgadmin -ssl and 
psql -ssl and java -ssl are all different speeds?  ssl always adds extra time?  
Maybe a high-latency thing?  If you ping the other box, what sort of times do 
you get?  Maybe the extra ssl handshakes up front + high latency are causing 
it.  You could try a shared/cached ssh connection to avoid the overhead.

Best case though, your file copy was 1.3 seconds and with PG it was 3 seconds.  
Even getting ssl fixed, you probably won't get faster than 3 seconds.  Is that 
enough?

-Andy




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] long transfer time for binary data

2016-01-20 Thread Andy Colson

On 01/20/2016 03:29 PM, Johannes wrote:

I noticed transferring a large object or bytea data between client and
server takes a long time.
For example: An image with a real size of 11 MB could be read on server
side (explain analyze) in 81ms. Fine.

But on client side the result was completed after 6.7 seconds without
ssl compression and 4.5 seconds with ssl compression (both via 100MBit
ethernet).

SSL compression seems to be not a good idea anymore, since it has
become a security risk. It's still possible with pgadmin, but AFAIK not
with java/jdbc.

Are there any other solutions available to display my images in my
client application more quickly? Or are there planned improvements to
postgresql (transferring the real binary data)?

Best regards
Johannes



Yep, that's slow.  The ssl compression is very odd if the image is jpeg'ish and 
already compressed.  If it's a bitmap or uncompressed tif then it's not so 
surprising.

A few tests you could try:

1) copy the same 11 meg file from server to client via regular file copy and 
time it.  At 100 Mbit/s it should take about a second.  If it takes 6 you have 
network problems, not PG problems.

2) try it via psql command line (or at least something other than java), to see 
if it's java that's the problem (see the sketch after this list).

3) watch wireshark/tcpdump, maybe you'll see something glaring that'll point 
you in the right direction.
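
For (2), a minimal psql sketch (assuming the image is the large object 
with OID 12345; \o sends the output to /dev/null so you mostly measure 
the transfer rather than terminal rendering):

\timing on
\o /dev/null
select lo_get(12345);
\o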

-Andy

PS: I've never heard that ssl compression was a security risk, got links/proof?


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] regexp_replace

2016-01-14 Thread Andy Colson

On 1/14/2016 2:06 PM, David G. Johnston wrote:

select regexp_replace('71.09.6.01.3', '(\d)[.-](?=\d)', '\1\2', 'g');


Thanks David!

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] regexp_replace

2016-01-14 Thread Andy Colson

On 1/14/2016 2:02 PM, John McKown wrote:

How about:

select regexp_replace('71.09.6.01.3', '(\d)[.-]', '\1', 'g');

match is 1.3 and result is 13 ( 71096.013). If you don't want to
eliminate the period or dash unless it is _between_ two digits, try:

select regexp_replace('71.09.6.01.3', '(\d)[.-](?=\d)', '\1', 'g');

(?=\d) is a "look ahead" match which says that the period or dash must
be followed by a digit, but the expression _does not_ "consume" the
digit matched.



Maranatha! <><
John McKown


Yes, excellent, both seem to work.  I'll run a bunch of data through 
them both and see what happens.


Thanks much for the help!

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] regexp_replace

2016-01-14 Thread Andy Colson

On 1/14/2016 1:59 PM, Tom Lane wrote:

Andy Colson  writes:

This is not doing as I'd expected:



select regexp_replace('71.09.6.01.3', '(\d)[.-](\d)', '\1\2', 'g');



   regexp_replace

   71096.013
(1 row)


I think regexp_replace considers only non-overlapping substrings,
eg, once it's replaced 1.0 with 10, it then picks up searching at
the 9 rather than starting over.  The dot after 6 doesn't get
removed because the 6 can't belong to two replaceable substrings, and
it already got consumed in the process of removing the dot before 6.

I might be wrong, but I think two passes of regexp_replace would
do what you want in this example.

regards, tom lane



Ah, that would make sense, and seems to explain:

select regexp_replace('7-9-6-1-3', '(\d)[.-](\d)', '\1\2', 'g');

 regexp_replace

 79-61-3
(1 row)


select regexp_replace('71-09-56-01-53', '(\d)[.-](\d)', '\1\2', 'g');

 regexp_replace

 7109560153
(1 row)


I can work two passes in.  Thanks Tom!
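
Something like this, for the archives - the same pattern applied twice, 
so the second pass picks up the separator the first pass had to skip:

select regexp_replace(
         regexp_replace('71.09.6.01.3', '(\d)[.-](\d)', '\1\2', 'g'),
         '(\d)[.-](\d)', '\1\2', 'g');

 regexp_replace
----------------
 71096013
(1 row)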

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] regexp_replace

2016-01-14 Thread Andy Colson

Hi all.

This is not doing as I'd expected:

select regexp_replace('71.09.6.01.3', '(\d)[.-](\d)', '\1\2', 'g');

 regexp_replace

 71096.013
(1 row)

It acts the same with dashes:
select regexp_replace('71-09-6-01-3', '(\d)[.-](\d)', '\1\2', 'g');

 regexp_replace

 71096-013
(1 row)

I cannot use translate because there is other text in the field.  I'm 
trying to strip masking characters from a parcel number in a larger text 
field (for example:  "the parcel 12-34-56 has caught on fire")


I seem to be missing something, any hints?

I'm on PG 9.3.9 on Slackware64.

Thanks for your time,

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Question -- Session Operations - Feasibility Of Proposed Synchronization Method?

2016-01-07 Thread Andy Colson

On 01/07/2016 06:30 PM, Steve Petrie, P.Eng. wrote:

Thanks to forum members for the four helpful replies, to my earlier message 
that initiated this thread.

The replies expressed concerns, with the feasibility of my proposal to use 
postgres tables to store short-lived context data, for dialog continuity during 
website app transient sessions, with visitor browsers over modeless HTTP 
connections.

Hope the four emails I sent in response (5 January 2016), went some way to 
satisfying the concerns expressed.

Here is a list of the issues discussed, in the dialog mentioned above:

1. "Session" defined;
2. Avoid row DELETEs;
3. Periodically TRUNCATE each table in a pool of session context tables;
4. Embed a session ID key parameter in an HTML "hidden" field (optional);
5. Use sequence generators as rapid global iterators controlling access to 
session context tables;





Thanks to forum members for taking the time to read my email.



This feels hugely overcomplicated.  I also didn't read most of the last thread, 
so forgive me if you've answered this already:  How many website requests a 
second (that actually need to touch session data) are you expecting?  How much 
space is the session data going to take?  (like, 5 Gig a day?)

If it's a huge number, you should put effort into growing horizontally, not all 
of this stuff.
If it's a small number, you'll spend more time fixing all the broken things than 
it's worth.
Have you benchmarked this?  In my mind, complicated == slow.

Sorry if I'm raining on your parade, it looks like you have really put a lot of 
work into this.

Have you considered that saving session data to disk is faster than saving it 
to the db?  A good reverse web proxy can stick a session to the same backend.  
1 web proxy up front, 5 web servers behind it.  I'd bet it's way faster.

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] A unique pairs version of UNNEST() ?

2016-01-04 Thread Andy Colson

On 1/4/2016 2:08 PM, Wells Oliver wrote:

Hey all, happy new year.

I am trying to get unique pairs from an array of N numbered items,
usually 5, but possibly 4 or 6.

If I just wanted unique values, I could do SELECT UNNEST(list_of_ids) AS
id, COUNT(*) FROM table GROUP BY id but in this situation I want all
unique pairs and a COUNT.

For those familiar with python, this is the functionality found in
itertools.combinations. I'm leaning towards just doing this in python,
but I really like keeping as much in SQL as possible.

So in an example where list_of_ids is {1,2,3,4,5} I would essentially get:

{1, 2}
{1, 3}
{1, 4}
{1, 5}
{2, 3}
{2, 4}
{2, 5}
{3, 4}
{3, 5}
{4, 5}


Any tips? Thanks!

--
Wells Oliver
wells.oli...@gmail.com 


If you convert the array to a table, you can then cross join it.  Something 
like:


select a.*, b.*
from unnest( '{1,2,3,4,5}'::int[] ) a
cross join unnest( '{1,2,3,4,5}'::int[] ) b
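
That gives every ordered pair, including a value paired with itself.  To 
keep only the unique pairs and get the count from your original 
question, one way (a sketch, assuming a table t with an integer array 
column list_of_ids) is:

select a.x as id1, b.x as id2, count(*)
from t,
     unnest(t.list_of_ids) as a(x),
     unnest(t.list_of_ids) as b(x)
where a.x < b.x
group by a.x, b.x
order by id1, id2;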

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 2:39 PM, Andy Colson wrote:

On 12/30/2015 2:33 PM, Tom Lane wrote:

Andy Colson  writes:

On 12/30/2015 2:18 PM, Tom Lane wrote:

Maybe something weird about the build you're using?  What does
pg_config print?



[ output ]


No smoking gun there either.

It might be worthwhile to update to 9.3.10, just in case there is
something wonky about this particular build you've got.  But I'm
starting to get the feeling that you may not get an answer short
of tracing through gincostestimate to see where it's going nuts.

regards, tom lane



The entire database is 78Gig, would you expect a "vacuum analyze" to fix
it?  I never run it.



Ok, I can reproduce this now.  The full vacuum analyze isn't needed.

If I drop and recreate the table it goes back to preferring table scan. 
 I can "analyze search" and it still table scans.


But once I "vacuum analyze search", then it starts index scanning.


-Andy




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 2:33 PM, Tom Lane wrote:

Andy Colson  writes:

On 12/30/2015 2:18 PM, Tom Lane wrote:

Maybe something weird about the build you're using?  What does
pg_config print?



[ output ]


No smoking gun there either.

It might be worthwhile to update to 9.3.10, just in case there is
something wonky about this particular build you've got.  But I'm
starting to get the feeling that you may not get an answer short
of tracing through gincostestimate to see where it's going nuts.

regards, tom lane



The entire database is 78Gig, would you expect a "vacuum analyze" to fix 
it?  I never run it.


Cuz I started one, and it's still going, but at this point it's 
preferring index scans.  So it seems fixed.


I'd run: analyze jasperia.search

many times, before and after I'd emailed the list. I've rebuilt the 
search table several times over, but never vacuumed it.



explain analyze
select *
from jasperia.search
where search_vec @@ to_tsquery_partial('213 E 13 ST N')

   QUERY PLAN 


-
 Bitmap Heap Scan on search  (cost=76.01..80.03 rows=1 width=73) 
(actual time=62.803..62.804 rows=1 loops=1)
   Recheck Cond: (search_vec @@ 
to_tsquery((array_to_string('{213,E,13,ST,N}'::text[], ' & '::text) || 
':*'::text)))
   ->  Bitmap Index Scan on search_key  (cost=0.00..76.01 rows=1 
width=0) (actual time=62.797..62.797 rows=1 loops=1)
 Index Cond: (search_vec @@ 
to_tsquery((array_to_string('{213,E,13,ST,N}'::text[], ' & '::text) || 
':*'::text)))

 Total runtime: 62.869 ms



(* The vacuum analyze is still running *)





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 2:18 PM, Tom Lane wrote:

Andy Colson  writes:

On 12/30/2015 1:55 PM, Tom Lane wrote:

Are you using any nondefault planner settings?  Anything else
unusual about your installation?



There are others, but I'll bet it's:
random_page_cost = 1


Nope...

Maybe something weird about the build you're using?  What does
pg_config print?

regards, tom lane



BINDIR = /usr/local/pg93/bin
DOCDIR = /usr/local/pg93/share/doc/postgresql
HTMLDIR = /usr/local/pg93/share/doc/postgresql
INCLUDEDIR = /usr/local/pg93/include
PKGINCLUDEDIR = /usr/local/pg93/include/postgresql
INCLUDEDIR-SERVER = /usr/local/pg93/include/postgresql/server
LIBDIR = /usr/local/pg93/lib
PKGLIBDIR = /usr/local/pg93/lib/postgresql
LOCALEDIR = /usr/local/pg93/share/locale
MANDIR = /usr/local/pg93/share/man
SHAREDIR = /usr/local/pg93/share/postgresql
SYSCONFDIR = /usr/local/pg93/etc/postgresql
PGXS = /usr/local/pg93/lib/postgresql/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--prefix=/usr/local/pg93' '--with-perl' 
'--enable-thread-safety' '--build=x86_64-slackware-linux' 
'build_alias=x86_64-slackware-linux' 'CFLAGS=-O2 -fPIC 
-DLINUX_OOM_SCORE_ADJ=0'

CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -O2 -fPIC -DLINUX_OOM_SCORE_ADJ=0 -Wall -Wmissing-prototypes 
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels 
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing 
-fwrapv -fexcess-precision=standard

CFLAGS_SL = -fpic
LDFLAGS = -L../../../src/common -Wl,--as-needed 
-Wl,-rpath,'/usr/local/pg93/lib',--enable-new-dtags

LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgport -lpgcommon -lz -lreadline -ltermcap -lcrypt -ldl -lm
VERSION = PostgreSQL 9.3.9




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 2:03 PM, Andy Colson wrote:

On 12/30/2015 1:55 PM, Tom Lane wrote:

Andy Colson  writes:

On 12/30/2015 1:07 PM, Tom Lane wrote:

it seems like you've got some weird data statistics that are causing a
misestimate.  Could we see the pg_stats row for that tsvector column?
Or maybe even the actual data?



The table exists in a schema named jasperia, I've been removing the name
for simplicity.  The dump of the table is here:


Thanks very much for sharing the data.  But now I am well and truly
confused, because I still can't reproduce your results.  I get

regression=# explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N:*');
  QUERY PLAN


  Bitmap Heap Scan on search  (cost=76.00..80.02 rows=1 width=72)
(actual time=8.119..8.119 rows=1 loops=1)
Recheck Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST &
N:*'::text))
->  Bitmap Index Scan on search_key  (cost=0.00..76.00 rows=1
width=0) (actual time=8.113..8.113 rows=1 loops=1)
  Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST &
N:*'::text))
  Total runtime: 8.210 ms

This is on 9.3 branch tip, not 9.3.9 which I don't have installed;
but I see no bug fixes related to GIN estimation in the commit logs
since 9.3.9.

Are you using any nondefault planner settings?  Anything else
unusual about your installation?

regards, tom lane



There are others, but I'll bet it's:

random_page_cost = 1



Humm, nope.  I removed the config option, restarted PG, then analyzed the 
search table:


# show random_page_cost ;
 random_page_cost
--
 4

# analyze search;

And it still won't use the index.  I'll tool around a little more and see 
if I can find something.


Thanks much for all your help on this.

-Andy





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 1:55 PM, Tom Lane wrote:

Andy Colson  writes:

On 12/30/2015 1:07 PM, Tom Lane wrote:

it seems like you've got some weird data statistics that are causing a
misestimate.  Could we see the pg_stats row for that tsvector column?
Or maybe even the actual data?



The table exists in a schema named jasperia, I've been removing the name
for simplicity.  The dump of the table is here:


Thanks very much for sharing the data.  But now I am well and truly
confused, because I still can't reproduce your results.  I get

regression=# explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N:*');
  QUERY PLAN

  Bitmap Heap Scan on search  (cost=76.00..80.02 rows=1 width=72) (actual 
time=8.119..8.119 rows=1 loops=1)
Recheck Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & N:*'::text))
->  Bitmap Index Scan on search_key  (cost=0.00..76.00 rows=1 width=0) 
(actual time=8.113..8.113 rows=1 loops=1)
  Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & 
N:*'::text))
  Total runtime: 8.210 ms

This is on 9.3 branch tip, not 9.3.9 which I don't have installed;
but I see no bug fixes related to GIN estimation in the commit logs
since 9.3.9.

Are you using any nondefault planner settings?  Anything else
unusual about your installation?

regards, tom lane



There are others, but I'll bet it's:

random_page_cost = 1


The Others:

max_connections = 20
shared_buffers = 400MB
work_mem = 5MB
maintenance_work_mem = 64MB
effective_cache_size = 1700MB
synchronous_commit = off
effective_io_concurrency = 3
track_io_timing = on
max_locks_per_transaction = 2300


I can't honestly say why I've set random_page_cost.  It's been way too 
long for me to remember.  The box is running a 4-drive sata software 
raid 10, on Slackware64.


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

Wow, that's bad.

Here's another link:

http://camavision.com/dn/stats.txt

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 1:07 PM, Tom Lane wrote:

I wrote:

This says there's only about a 25% runtime penalty for the partial match,
at least on your example, compared to the planner's estimate of 2700x
penalty :-(.  Definitely need to fix that.


I tried to reproduce this behavior with simple generated data, and could
not: the estimates seem to track the actual cost reasonably well.  So
it seems like you've got some weird data statistics that are causing a
misestimate.  Could we see the pg_stats row for that tsvector column?
Or maybe even the actual data?

regards, tom lane



The table exists in a schema named jasperia, I've been removing the name 
for simplicity.  The dump of the table is here:


http://camavision.com/dn/search.sql.bz2

Here is the pg_stats, I hope word wrap doesn't mess this up too bad:

 schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation | most_common_elems | most_common_elem_freqs | elem_count_histogram
------------+-----------+---------+-----------+-----------+-----------+------------+------------------+-------------------+------------------+-------------+-------------------+------------------------+-----------------------
 [the row values did not survive the word wrap]

Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 10:09 AM, Tom Lane wrote:

Andy Colson  writes:

 ->  Bitmap Index Scan on search_key  (cost=0.00..6.00 rows=1 width=0) 
(actual time=0.025..0.025 rows=0 loops=1)
   Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & N'::text))


Hmm ... so the partial case actually is significantly more expensive than
the non-partial case: 4 msec vs .025 msec.  Still, that's about a 200x
penalty, not the 1x penalty the planner is ascribing to it.

Thanks for the data!  I'll go moan about this on -hackers.

regards, tom lane



No, that's not right, the table was empty.  I rebuilt the table as it 
was before, here are all three queries again:


explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N:*');

  QUERY PLAN 


---
 Seq Scan on search  (cost=0.00..2211.08 rows=1 width=73) (actual 
time=31.904..380.568 rows=1 loops=1)

   Filter: (search_vec @@ to_tsquery('213 & E & 13 & ST & N:*'::text))
   Rows Removed by Filter: 79071
 Total runtime: 380.609 ms



set enable_seqscan TO 0;
explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N:*');

  QUERY PLAN 


---
 Bitmap Heap Scan on search  (cost=63623.00..63624.02 rows=1 width=73) 
(actual time=5.004..5.004 rows=1 loops=1)
   Recheck Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & 
N:*'::text))
   ->  Bitmap Index Scan on search_key  (cost=0.00..63623.00 rows=1 
width=0) (actual time=4.996..4.996 rows=1 loops=1)
 Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & 
N:*'::text))

 Total runtime: 5.045 ms


explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N');

 QUERY PLAN 



 Bitmap Heap Scan on search  (cost=23.00..24.02 rows=1 width=73) 
(actual time=4.067..4.067 rows=1 loops=1)

   Recheck Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & N'::text))
   ->  Bitmap Index Scan on search_key  (cost=0.00..23.00 rows=1 
width=0) (actual time=4.057..4.057 rows=1 loops=1)
 Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & 
N'::text))

 Total runtime: 4.109 ms



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 9:55 AM, Andy Colson wrote:

On 12/30/2015 9:53 AM, Tom Lane wrote:

Andy Colson  writes:

Here are my results, if there are any others you'd like to see please
let me know.  Thanks Tom.


For comparison, could we see the results for the non-partial case, ie

explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N');

regards, tom lane



 QUERY PLAN
---

  Bitmap Heap Scan on search  (cost=6.00..7.02 rows=1 width=100) (actual
time=0.029..0.029 rows=0 loops=1)
Recheck Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & N'::text))
->  Bitmap Index Scan on search_key  (cost=0.00..6.00 rows=1
width=0) (actual time=0.025..0.025 rows=0 loops=1)
  Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST &
N'::text))
  Total runtime: 0.060 ms
(5 rows)






Oh!  I just realized... this is on my test box and I just blew away that 
table to rebuild it.  And I have to run off to a meeting, so if this 
analyze makes no sense, that's why.


I'll re-re-do it within the hour.  Sorry about that.

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/30/2015 9:53 AM, Tom Lane wrote:

Andy Colson  writes:

Here are my results, if there are any others you'd like to see please
let me know.  Thanks Tom.


For comparison, could we see the results for the non-partial case, ie

explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N');

regards, tom lane



QUERY PLAN 


---
 Bitmap Heap Scan on search  (cost=6.00..7.02 rows=1 width=100) (actual 
time=0.029..0.029 rows=0 loops=1)

   Recheck Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & N'::text))
   ->  Bitmap Index Scan on search_key  (cost=0.00..6.00 rows=1 
width=0) (actual time=0.025..0.025 rows=0 loops=1)
 Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & 
N'::text))

 Total runtime: 0.060 ms
(5 rows)




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/29/2015 6:35 PM, Tom Lane wrote:

Andy Colson  writes:

I cannot get this sql to use the index:



explain analyze
select *
from search
where search_vec @@ to_tsquery_partial('213 E 13 ST N')



--
Seq Scan on search  (cost=0.00..2526.56 rows=1 width=69) (actual
time=68.033..677.490 rows=1 loops=1)
 Filter: (search_vec @@
to_tsquery((array_to_string('{213,E,13,ST,N}'::text[], ' & '::text) ||
':*'::text)))
 Rows Removed by Filter: 76427
   Total runtime: 677.548 ms
(4 rows)


If you force it with enable_seqscan = off, you'll soon see that it's
capable of picking the indexscan plan, but it doesn't want to because it
estimates that the cost will be much higher, which seems to be a
consequence of the ":*" in the query.  (Even though the functions involved
are only stable, the planner is capable of seeing through them to look at
the pattern that will be fed to the GIN index search.)  You get the same
results if you use the resulting tsquery without any function at all.
For example (with dummy data), I get

regression=# explain select * from search
where search_vec @@ '213 & e & 13 & st & n:*'::tsquery;
 QUERY PLAN
--
  Seq Scan on search  (cost=0.00..3774.01 rows=1 width=21)
Filter: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & 
''n'':*'::tsquery)
(2 rows)

regression=# set enable_seqscan TO 0;
SET
regression=# explain select * from search
where search_vec @@ '213 & e & 13 & st & n:*'::tsquery;
  QUERY PLAN

  Bitmap Heap Scan on search  (cost=10.00..104448.01 rows=1 width=21)
Recheck Cond: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & 
''n'':*'::tsquery)
->  Bitmap Index Scan on search_key  (cost=0.00..10.00 rows=1 width=0)
  Index Cond: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & 
''n'':*'::tsquery)
(4 rows)

but for comparison, with a pattern without ':*', I get

regression=# explain select * from search
where search_vec @@ '213 & e & 13 & st & n'::tsquery;
 QUERY PLAN
--
  Bitmap Heap Scan on search  (cost=44.00..48.01 rows=1 width=21)
Recheck Cond: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & 
''n'''::tsquery)
->  Bitmap Index Scan on search_key  (cost=0.00..44.00 rows=1 width=0)
  Index Cond: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & 
''n'''::tsquery)
(4 rows)

I'm inclined to think this is a bug in the estimator; it seems to be
charging for many more "entry page" fetches than there are pages in
the index.  But maybe it's right and there will be lots of repeated
work involved.  It would be interesting to see EXPLAIN ANALYZE results
from your data for these examples.

regards, tom lane




Here are my results, if there are any others you'd like to see please 
let me know.  Thanks Tom.



# explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N:*');

  QUERY PLAN
---
 Seq Scan on search  (cost=0.00..2144.42 rows=1 width=69) (actual 
time=30.584..361.147 rows=1 loops=1)

   Filter: (search_vec @@ to_tsquery('213 & E & 13 & ST & N:*'::text))
   Rows Removed by Filter: 76427
 Total runtime: 361.181 ms
(4 rows)

Time: 363.012 ms



# set enable_seqscan TO 0;
SET
Time: 0.185 ms

# explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N:*');
  QUERY PLAN
---
 Bitmap Heap Scan on search  (cost=63716.00..63717.02 rows=1 width=69) 
 (actual time=4.354..4.355 rows=1 loops=1)
   Recheck Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & 
N:*'::text))
   ->  Bitmap Index Scan on search_key  (cost=0.00..63716.00 rows=1 
width=0) (actual time=4.351..4.351 rows=1 loops=1)
 Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & 
N:*'::text))

 Total runtime: 4.370 ms
(5 rows)

Time: 4.794 ms





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] cannot get stable function to use index

2015-12-30 Thread Andy Colson

On 12/29/2015 6:03 PM, Jim Nasby wrote:


If I'm reading EXPLAIN ANALYZE correctly, to_tsquery_partial is being
simplified out of the query entirely:

Filter: (search_vec @@
to_tsquery((array_to_string('{213,E,13,ST,N}'::text[], ' & '::text) ||
':*'::text)))

Part of this could well be that you're not feeding the same data to
to_tsquery. Your hard-coded example is

where search_vec @@ to_tsquery('213 & E & 13 & ST & N');

but your second query becomes '213 & E & 13 & ST & N:*'. Have you tried
that as a hard-coded value?


Ahh!  Yep, that was the missing link.  Jeez, I can't believe I couldn't 
find it.  The :* is for matching partials, it's even IN the name 
to_tsquery_partial.


Indeed, this does not use an index:

explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N:*')

Thank you!

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] cannot get stable function to use index

2015-12-29 Thread Andy Colson

Hi all,

I seem to be missing something.

I'm using PG 9.3.9 on Slackware64.

My table:


create table search (
gid integer,
descr text,
search_vec tsvector
);
create index search_key on search using gin(search_vec);

I've put a bunch of data in it, and using to_tsquery uses the index fine:

explain analyze
select *
from search
where search_vec @@ to_tsquery('213 & E & 13 & ST & N');

QUERY PLAN
---
 Bitmap Heap Scan on search  (cost=16.00..17.02 rows=1 width=69) 
(actual time=87.493..87.494 rows=1 loops=1)

   Recheck Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & N'::text))
   ->  Bitmap Index Scan on search_key  (cost=0.00..16.00 rows=1 
width=0) (actual time=87.478..87.478 rows=1 loops=1)
 Index Cond: (search_vec @@ to_tsquery('213 & E & 13 & ST & 
N'::text))

 Total runtime: 87.554 ms
(5 rows)


Here is the problem, I'd like to use this function from
http://workshops.boundlessgeo.com/tutorial-autocomplete/


CREATE OR REPLACE FUNCTION public.to_tsquery_partial(text)
  RETURNS tsquery AS $$
SELECT to_tsquery(
   array_to_string(
   regexp_split_to_array(
   trim($1),E'\\s+'),' & ') ||
   CASE WHEN $1 ~ ' $' THEN '' ELSE ':*' END)
  $$ LANGUAGE 'sql' STABLE STRICT;



Originally it didn't have "STABLE STRICT", but I added it.  Doesn't seem 
to matter though.  I cannot get this sql to use the index:


explain analyze
select *
from search
where search_vec @@ to_tsquery_partial('213 E 13 ST N')

--
Seq Scan on search  (cost=0.00..2526.56 rows=1 width=69) (actual 
time=68.033..677.490 rows=1 loops=1)
   Filter: (search_vec @@ 
to_tsquery((array_to_string('{213,E,13,ST,N}'::text[], ' & '::text) || 
':*'::text)))

   Rows Removed by Filter: 76427
 Total runtime: 677.548 ms
(4 rows)


to_tsquery_partial() calls to_tsquery() and array_to_string(), both of 
which I checked, and all of them are marked as stable.


Any hints why this is happening?

Thanks,

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Session Identifiers

2015-12-20 Thread Andy Colson

On 12/20/2015 09:16 AM, oleg yusim wrote:

Greetings!

I'm new to PostgreSQL, working on it from the point of view of Cyber Security 
assessment. In that regard, here are my questions:

 From the security standpoint we have to ensure that the database invalidates 
session identifiers upon user logout or other session termination (timeout 
counts too).

Does PostgreSQL perform this type of action? If so, where are those Session 
IDs stored, so I can verify it?

Thanks,

Oleg


Are you talking about a website session?  Does this website session happen to 
be stored in PG?

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] using a postgres table as a multi-writer multi-updater queue

2015-11-23 Thread Andy Colson

On 11/23/2015 4:41 AM, Chris Withers wrote:

Hi All,

I wondered if any of you could recommend best practices for using a
postgres table as a queue. Roughly speaking, 100-200 workers will vomit
rows at rates of a few hundred per second into the table, leaving the
status as new, and then as many workers as needed to keep up with the
load will plough through the queue changing the status to something
other than new.

My naive implementation would be something along the lines of:

CREATE TABLE event (
     ts timestamp,
     event char(40),
     status char(10),
 CONSTRAINT pkey PRIMARY KEY(ts, event)
);


...with writers doing INSERT or COPY to get data into the table and
readers doing something like:

SELECT FOR UPDATE * FROM event WHERE status='new' LIMIT 1000;

...so, grabbing batches of 1,000, working on them and then setting their
status.

But, am I correct in thinking that SELECT FOR UPDATE will not prevent
multiple workers selecting the same rows?

Anyway, is this approach reasonable? If so, what tweaks/optimisations
should I be looking to make?

If it's totally wrong, how should I be looking to approach the problem?

cheers,

Chris


Have you tried Redis?  It's really good at that sort of thing.
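
If it has to stay in Postgres, the pattern that usually gets used for 
claiming batches is FOR UPDATE with SKIP LOCKED (new in 9.5); a rough 
sketch against the table above:

with batch as (
    select ts, event
    from   event
    where  status = 'new'
    order  by ts
    limit  1000
    for update skip locked
)
update event e
set    status = 'taken'          -- 'taken' is just an example status
from   batch
where  e.ts = batch.ts and e.event = batch.event
returning e.*;

Plain FOR UPDATE makes a second worker block on the rows the first one 
already locked; SKIP LOCKED lets each worker grab a disjoint batch.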

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hierarchical Query Question (PHP)

2015-10-31 Thread Andy Colson

On 10/30/2015 05:10 PM, David Blomstrom wrote:

Just so I understand what's going on, I can create a lookup table by pasting 
this code...



I don't know anything about biology so this data might be laughable, but it's 
based on your original question:

http://stackoverflow.com/questions/33402831/count-descendants-in-hierarchical-query


It seemed like the parent_id column was really the rank, and I couldn't tell if 
it should be taxon_rank or parent_rank.  Thinking more about the problem, I 
normalized a little, renamed some, and came up with this script.  It lacks 
indexes and proper names, etc, etc, because I wanted to tackle one problem at a 
time.  Hopefully it's a good example.

create table rank (
id integer,
descr text
);

insert into rank values(1, 'Classes');
insert into rank values(2, 'Orders');
insert into rank values(3, 'Families');
insert into rank values(4, 'Genera');
insert into rank values(5, 'Species');

create table mammals (
id integer,
taxon text,
rank integer,
parentid integer
);

insert into mammals values (0, 'Chordata',  1, null);
insert into mammals values (1, 'Mammalia',  1, 0);
insert into mammals values (2, 'Carnivora', 2, 1);
insert into mammals values (3, 'Canidae'  , 3, 2);
insert into mammals values (4, 'Canis', 4, 3);
insert into mammals values (5, 'Canis-lupus',   5, 4);
insert into mammals values (6, 'Canis-latrans', 5, 4);


-- This query shows you the basic results.  It only
-- returns the id columns.  further queries build on this base one.
--  (you could think of this query as Order Chordata :-) )
with recursive heir(id, rank, parentid) as (
select m.id, 0, m.parentid
from mammals m
where taxon = 'Canis'
union all
select m.id, m.rank, m.parentid
from heir
inner join mammals m on m.parentid = heir.id
)
select * from heir;

Results:
 id | rank | parentid
+--+--
  4 |0 |3
  5 |5 |4
  6 |5 |4
(3 rows)



-- This looks up the columns for a more meaningful result:
with recursive heir(id, rank, parentid) as (
select m.id, 0, m.parentid
from mammals m
where taxon = 'Canidae'
union all
select m.id, m.rank, m.parentid
from heir
inner join mammals m on m.parentid = heir.id
)
select m.taxon, r.descr
from heir
inner join mammals m on m.id = heir.id
inner join rank r on heir.rank=r.id

Results:
 taxon |  descr
---+-
 Canis | Genera
 Canis-lupus   | Species
 Canis-latrans | Species


-
-- This, finally, groups and counts, like your original question
with recursive heir(id, rank, parentid) as (
select m.id, 0, m.parentid
from mammals m
where taxon = 'Canidae'
union all
select m.id, m.rank, m.parentid
from heir
inner join mammals m on m.parentid = heir.id
)
select r.id, r.descr, count(*)
from heir
inner join mammals m on m.id = heir.id
inner join rank r on heir.rank=r.id
group by r.id, r.descr
order by r.id


Results:
 id |  descr  | count
+-+---
  4 | Genera  | 1
  5 | Species | 2




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Selectively Importing Data

2015-10-30 Thread Andy Colson

On 10/30/2015 07:24 PM, David Blomstrom wrote:

First consider the following table:

create table taxon (
   taxonid serial,
   descr text
);

As I understand it, "serial" means that column will automatically populate with 
a numerical key.

If I want to fill the field 'descr' with a list of scientific names stored in a 
spreadsheet, then how would I proceed?

I have a spreadsheet that has about a dozen columns. I copied the field with 
scientific names into a new spreadsheet. Then I created an empty field on the 
left. So I have a spreadsheet with two columns. I saved it as a CSV file.

When I try to import it with pgAdmin III, I get the error message

WARNING: null value in column "taxonid" violates non-null constraint

How can I import that single column into the second column in this table? 
Thanks.


Try using just one column?

Or, create taxonid as integer, then fill the spreadsheet column with ints, and 
import both columns.
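
If you go the single-column route, from psql you could load just that one 
column and let the serial column fill itself in (the file name here is just 
a placeholder):

-- names.csv: one scientific name per line, no header
\copy taxon (descr) from 'names.csv' with (format csv)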

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hierarchical Query Question (PHP)

2015-10-30 Thread Andy Colson

On 10/30/2015 4:36 PM, Andy Colson wrote:

On 10/30/2015 3:47 PM, David Blomstrom wrote:

No, I get the same T_FUNCTION error.

Someone commented that the function...

   create function tax_rank(id integer) returns text as $$
 select case id
  when 1 then 'Classes'
  when 2 then 'Orders'
  when 3 then 'Families'
  when 4 then 'Genera'
  when 5 then 'Species'
end;
   $$ language sql;

...should ideally be part of the table schema. Does that mean I need to
go into pgAdmin, open up my table and paste this in somehow?

I wonder if the function is even necessary. My goal is to create a
hierarchical query that displays the number of children, grandchildren,
etc. And, depending on the taxonomic level, it might display the result
as "20 families, 74 genera and 413 species." With MySQL I could probably
turn that into a series of echo values, which I could then display like
this:

echo ''.$NumberChildren.' families
'.$NumberGrandchildren.' genera
'.$NumberGreatgrandchildren.' species';

I'm wondering if I should figure out this query or go back to square one
with a simpler query. I've never seen a query with a function before. ;)

Thanks for the tips.


Shoot, I should have read this before responding to the first one.

Yes, create function tax_rank, should be done in pgAdmin.

 > I wonder if the function is even necessary.

Correct, it's not.  It should probably be a lookup table:

create table taxon (
   taxonid serial,
   descr text
);
create table gz_life_mammals (
   id serial,
   taxonid integer, -- use the lookup table
   parentid integer -- use the lookup table
);


-Andy






Humm, after looking at this further, my answer isn't right.  I did not 
notice that rank (Classes, Orders, Families, ...) is different from taxon 
(Mammalia, Carnivora, ...).


But still, lookup table is better than function.

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hierarchical Query Question (PHP)

2015-10-30 Thread Andy Colson

On 10/30/2015 3:47 PM, David Blomstrom wrote:

No, I get the same T_FUNCTION error.

Someone commented that the function...

   create function tax_rank(id integer) returns text as $$
 select case id
  when 1 then 'Classes'
  when 2 then 'Orders'
  when 3 then 'Families'
  when 4 then 'Genera'
  when 5 then 'Species'
end;
   $$ language sql;

...should ideally be part of the table schema. Does that mean I need to
go into pgAdmin, open up my table and paste this in somehow?

I wonder if the function is even necessary. My goal is to create a
hierarchical query that displays the number of children, grandchildren,
etc. And, depending on the taxonomic level, it might display the result
as "20 families, 74 genera and 413 species." With MySQL I could probably
turn that into a series of echo values, which I could then display like
this:

echo ''.$NumberChildren.' families
'.$NumberGrandchildren.' genera
'.$NumberGreatgrandchildren.' species';

I'm wondering if I should figure out this query or go back to square one
with a simpler query. I've never seen a query with a function before. ;)

Thanks for the tips.


Shoot, I should have read this before responding to the first one.

Yes, create function tax_rank, should be done in pgAdmin.

> I wonder if the function is even necessary.

Correct, it's not.  It should probably be a lookup table:

create table taxon (
  taxonid serial,
  descr text
);
create table gz_life_mammals (
  id serial,
  taxonid integer, -- use the lookup table
  parentid integer -- use the lookup table
);
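
Then, instead of calling a function, the descriptive text comes from a join; 
for example (a sketch using the tables above):

select m.id, t.descr
from gz_life_mammals m
join taxon t on t.taxonid = m.taxonid;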


-Andy




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Hierarchical Query Question (PHP)

2015-10-30 Thread Andy Colson

On 10/29/2015 7:18 PM, David Blomstrom wrote:

Can anyone tell me how to write the query described @
http://stackoverflow.com/questions/33402831/count-descendants-in-hierarchical-query
?

The answer's very thorough, but I don't know how to string two queries
and a function together like that. This doesn't work:

$sql = "select * from gz_life_mammals;";

create function tax_rank(id integer) returns text as $$
 select case id
  when 1 then 'Classes'
  when 2 then 'Orders'
  when 3 then 'Families'
  when 4 then 'Genera'
  when 5 then 'Species'
end;
$$ language sql;

$sql = "with recursive hier(taxon,parent_id) as (
   select m.taxon, null::integer
   from gz_life_mammals m
   where taxon='Mammalia' --<< substitute me
   union all
   select m.taxon, m.parent_id
   from hier, gz_life_mammals m
   where m.parent=hier.taxon
)
select tax_rank(parent_id),
count(*) num_of_desc
from   hier
where  parent_id is not null
group by parent_id
order by parent_id;";

Thanks.



The function is created once (like with your create tables).  Don't use 
it in PHP.


Your PHP should only be like:

> $sql = "with recursive hier(taxon,parent_id) as (
>select m.taxon, null::integer
>from gz_life_mammals m
>where taxon='Mammalia' --<< substitute me
>union all
>select m.taxon, m.parent_id
>from hier, gz_life_mammals m
>where m.parent=hier.taxon
> )
> select tax_rank(parent_id),
> count(*) num_of_desc
> from   hier
> where  parent_id is not null
> group by parent_id
> order by parent_id;";

$result = pg_query($dbh, $sql);

while ($row = pg_fetch_array($result)) {
    // e.g. use the columns returned by the query above
    echo $row['tax_rank'] . ': ' . $row['num_of_desc'] . "\n";
}


-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.5 beta pg_upgrade documentation

2015-09-01 Thread Andy Colson

On 09/01/2015 08:46 PM, Andy Colson wrote:

On 09/01/2015 07:00 PM, Bruce Momjian wrote:

On Tue, Sep  1, 2015 at 06:56:11PM -0500, Andy Colson wrote:

I also added a mention that rsync, not pg_upgrade, will be run on the
standbys.  You can see all the results of the patch here:

http://momjian.us/pgsql_docs/pgupgrade.html

Thanks.



Sweet, I'm glad I stopped where I did.  I think I'm safe to pick up at step f.  
Which seemed to work ok.  But now we get to step g (run rsync).  I checked the 
rsync manual and don't see anything like this three directory argument thing 
you are trying to run.  Unless you want to use --link-dest.  In which case I 
think the cmd would be like:

rsync --archive --delete --hard-links --size-only --link-dest=old_pgdata 
new_pgdata remote_dir

I'm gonna try this now, will report back.


No, you are copying "old_pgdata and new_pgdata" to remote_dir.



Ohhh... I'll try again.  (The method above (run from the standby) doesn't work 
well at all).

-Andy




Yeah, much better:

postgres@test1:/pub$ rsync --archive --stats --delete --hard-links --size-only 
pg93 pg95 test2:/pub/

Number of files: 373,393 (reg: 373,340, dir: 53)
Number of created files: 186,779 (reg: 186,751, dir: 28)
Number of deleted files: 1 (reg: 1)
Number of regular files transferred: 1,480
Total file size: 201,329,799,433 bytes
Total transferred file size: 1,445,463,198 bytes
Literal data: 1,445,463,198 bytes
Matched data: 0 bytes
File list size: 7,208,811
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 1,458,717,642
Total bytes received: 5,629,388

sent 1,458,717,642 bytes received 5,629,388 bytes 8,956,250.95 bytes/sec
total size is 201,329,799,433 speedup is 137.49


test1 is the primary, and test2 the standby.  Both have /pub/pg93 and /pub/pg95
I copied over a recovery.conf and the standby started up.
I started pg95 on the primary and am running analyze now.  I'll copy over wal 
to the standby and continue testing.

Looks good so far!

Thanks,

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.5 beta pg_upgrade documentation

2015-09-01 Thread Andy Colson

On 09/01/2015 07:00 PM, Bruce Momjian wrote:

On Tue, Sep  1, 2015 at 06:56:11PM -0500, Andy Colson wrote:

I also added a mention that rsync, not pg_upgrade, will be run on the
standbys.  You can see all the results of the patch here:

http://momjian.us/pgsql_docs/pgupgrade.html

Thanks.



Sweet, I'm glad I stopped where I did.  I think I'm safe to pick up at step f.  
Which seemed to work ok.  But now we get to step g (run rsync).  I checked the 
rsync manual and don't see anything like this three directory argument thing 
you are trying to run.  Unless you want to use --link-dest.  In which case I 
think the cmd would be like:

rsync --archive --delete --hard-links --size-only --link-dest=old_pgdata 
new_pgdata remote_dir

I'm gonna try this now, will report back.


No, you are copying "old_pgdata and new_pgdata" to remote_dir.



Ohhh... I'll try again.  (The method above (run from the standby) doesn't work 
well at all).

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.5 beta pg_upgrade documentation

2015-09-01 Thread Andy Colson

On 09/01/2015 04:09 PM, Bruce Momjian wrote:

On Sun, Aug 23, 2015 at 09:45:50AM -0500, Andy Colson wrote:

I think we should add a step 6.5 (before step 7 Stop both servers) with 
something like:

If you are upgrading both a primary and standby, then we need to make sure the 
standby is caught up.
If you are wal shipping then on primary run: select pg_switch_xlog();
shut down primary
before you shut down the standby make sure it gets caught up to the primary.

I don't think its 100% required for them to be exact, is it?  If they are a 
little different then rsync has more data to xfer from primary to standby ... 
but it would still work.  Right?


You are one of the first to use this new ability so it is good to get
your feedback.  I have developed the attached applied patch to address
the problems you saw.

First, the verification has to happen earlier, before pg_upgrade is run.
I think what is happening is that a checkpoint on server shutdown is
changing the value while pg_upgrade is running, and the rename of the
controldata file is another issue, so doing it right after the primary
is shut down is the right place.

I also added a mention that rsync, not pg_upgrade, will be run on the
standbys.  You can see all the results of the patch here:

http://momjian.us/pgsql_docs/pgupgrade.html

Thanks.



Sweet, I'm glad I stopped where I did.  I think I'm safe to pick up at step f.  
Which seemed to work ok.  But now we get to step g (run rsync).  I checked the 
rsync manual and don't see anything like this three directory argument thing 
you are trying to run.  Unless you want to use --link-dest.  In which case I 
think the cmd would be like:

rsync --archive --delete --hard-links --size-only --link-dest=old_pgdata 
new_pgdata remote_dir

I'm gonna try this now, will report back.

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.5 beta pg_upgrade documentation

2015-08-23 Thread Andy Colson

I think we should add a step 6.5 (before step 7 Stop both servers) with 
something like:

If you are upgrading both a primary and standby, then we need to make sure the 
standby is caught up.
If you are wal shipping then on primary run: select pg_switch_xlog();
shut down primary
before you shut down the standby make sure it gets caught up to the primary.

I don't think its 100% required for them to be exact, is it?  If they are a 
little different then rsync has more data to xfer from primary to standby ... 
but it would still work.  Right?

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-22 Thread Andy Colson

On Sat, Aug 22, 2015 at 1:16 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

Melvin Davidson <melvin6...@gmail.com> writes:
>> Best Practice would rather be something along the lines:
 Avoid coding in a way that triggers "WARNING:
 nonstandard use of escape in a string literal". If you
 cannot comply with this rule document your reasons.

> Thanks for the suggestion. For the past few months I've been dealing with
> an error log that is filled with these warnings simply because
> the developers do not comprehend how to use ( or the requirement to use)
> an escape clause.

IMO best practice in this area is "run with standard_conforming_strings = 
ON".
If you're seeing this warning at all, it's because you aren't doing that,
which means your code is unnecessarily unportable to other DBMSes.
Adopting a coding policy of always using E'' would make that worse.

 regards, tom lane



On 08/22/2015 02:40 PM, Melvin Davidson wrote:

Tom,

Thank you for pointing out "run with standard_conforming_strings = ON"..
However, that is NOT the problem.
What is occurring is that the developers are sending strings like 'Mr. M\'vey',
which, if we set standard_conforming_strings = ON, would, and does, result in 
errors and the statement failing,
which is a lot less desirable that a simple warning.

Therefore, I am trying to educate the developers in the proper method of 
escaping strings,
instead of loading up the error log with annoying warnings.




Please don't top post.

But you are not educating them correctly.  Using E'' isn't right.  The correct 
way to escape a quote is to double the quote character:  'Mr. M''vey'
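
For example, both of these are accepted regardless of the 
standard_conforming_strings setting, but only the first is portable:

select 'Mr. M''vey';    -- standard: double the quote character
select E'Mr. M\'vey';   -- PostgreSQL escape-string syntax, nonstandard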

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.5 beta pg_upgrade documentation

2015-08-22 Thread Andy Colson

I finished running pg_upgrade on the primary, so far so good, and now I'm at 
step 9.5 (Verify).

On the primary I see:
root@test1:/pub/pg95# /usr/local/pg95/bin/pg_controldata -D /pub/pg95|grep "Latest 
check"
Latest checkpoint location:   1D2/3628

I cannot run pg93 pg_controldata because pg_control was renamed with a .old, 
but I'm not sure if I even need to.
pg_controldata: could not open file "/pub/pg93/global/pg_control" for reading: 
No such file or directory



On the standby I see:
postgres@test2:~$ /usr/local/pg93/bin/pg_controldata /pub/pg93|grep "Latest 
check"
Latest checkpoint location:   1D1/AF60


So these numbers don't match, so does that mean:
1) I compared the wrong ones
2) Its broke, dont go any further
3) Its fine, rsync away

Thanks for your time,

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] 9.5 beta pg_upgrade documentation

2015-08-21 Thread Andy Colson

Hi All.

I setup two test VM's with my PG93 database to test upgrading to PG95. 
I have a primary and standby using wal shipping.
The database is about 150Gig, and the two servers (the real servers) are 
far apart.  The vm's are both running on my desktop.


I would love to use pg_upgrade on both primary and standby to avoid 
copying 150Gig a very long distance.


I'm reading:
http://www.postgresql.org/docs/9.5/static/pgupgrade.html

At step 9, the first sentence says "If you have ... Log-Shipping ... 
follow these steps .. before starting any servers".


At first I thought a step was missing because it never says to run 
pg_upgrade on the standby.  Then I realized you only run pg_upgrade on 
the primary then rsync the standby.  Would a quick description of the 
process be helpful?  Something like:


"If you have Streaming Replication (Section 25.2.5) or Log-Shipping 
(Section 25.2) standby servers, follow these steps to upgrade them. 
After you have performed pg_upgrade on the primary don't start it up yet 
because you can rsync it to the standby for an efficient standby upgrade."




I'm confused by step 5 (verify).  There are 4 PG instances we're talking 
about (primary new/old and standby new/old)  Which two do I run 
pg_controldata on?  And how does running it "prevent old standby servers 
from being modified"?  And if step 5 requires standby shutdown after the 
primary, isn't that an important thing to say near the top?  Maybe by 
step 7: Stop both servers.



On a side note, I'm confusing myself with the step numbers.  There are two 
step 7's.  Can we renumber the step 9 sub-steps to be 9.1, 9.2, etc.?



I think I understand step 9.7: of the four PG instances, "the new 
master" tells me which to start and stop.  Although I'm not sure how 
long I need to keep it up.  Is as fast as I can type enough time? 
Should I count a few potatoes between stop and start?


Step 9.8 sounds scary.  Can we specifically say that old_pgdata refers 
to the new pg on the primary and new_pgdata refers to the new pg on the 
standby?  Is that even right?  Any chance we could get examples of 
setting old/new_pgdata?


Thanks all on another great release!

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Extension to rewrite queries before execution

2015-08-14 Thread Andy Colson

On 8/13/2015 2:49 PM, Jeff Janes wrote:

I am looking for an extension or a technique that will allow me to
intercept a query by the exact query text, and replace that query with a
different one.

The context is running a third-party app which issues queries I have no
control over.  I'd like to intercept a specific query (which has no bind
parameters) and either replace the query text with a different text
which, for example, swaps out an "in list" clause to instead be an
"exists (subquery)".

Or just wrap the query in a "set enable_* =off;" or "set work_mem=*;"
before and a reset of it after.

Is there anything out there like this?  This would be for 9.4.

I'm willing to put the query text, and its replacement, directly into
the extension source code and compile it, but of course something more
flexible would be ideal.

Thanks,

Jeff


Have you looked at pg_bouncer?  You'd have to make changes to the 
source, but it's got a bunch of what you'd need.  It sits in the middle 
and could swap out text pretty easily, I'd guess.


-Andy




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] could not create shared memory segment: Invalid argument

2015-07-16 Thread Andy Colson

On Tue, Jul 14, 2015 at 8:59 AM, Andy Colson <a...@squeakycode.net> wrote:

On 7/13/2015 7:08 PM, Ryan King - NOAA Affiliate wrote:

Apologies ahead of time for not knowing which group to send to,
but I
wanted to see if anyone has encountered and resolved this type
of error.
I'm setting up postgresql 9.2 streaming replication on RH and after
copying the master data directory over to the slave, the psql
service
refuses start and gives the following errors.



 2015-07-13 23:55:41.224 UTC FATAL:  could not create shared
memory
segment: Invalid argument
 2015-07-13 23:55:41.224 UTC DETAIL:  Failed system call was
shmget(key=5432001, size=1146945536, 03600).
 2015-07-13 23:55:41.224 UTC HINT:  This error usually means
that
PostgreSQL's request for a shared memory segment exceeded your
kernel's
SHMMAX parameter.  You can either reduce the request size or
reconfigure
the kernel with larger SHMMAX.  To reduce the request size
(currently
1146945536 bytes), reduce PostgreSQL's shared memory usage,
perhaps by
reducing shared_buffers or max_connections.
  If the request size is already small, it's possible
that it is
less than your kernel's SHMMIN parameter, in which case raising the
request size or reconfiguring SHMMIN is called for.
  The PostgreSQL documentation contains more information
about
shared memory configuration.
 2015-07-13 23:56:21.344 UTC FATAL:  could not create shared
memory
segment: Invalid argument
 2015-07-13 23:56:21.344 UTC DETAIL:  Failed system call was
shmget(key=5432001, size=58302464, 03600).
 2015-07-13 23:56:21.344 UTC HINT:  This error usually means
that
PostgreSQL's request for a shared memory segment exceeded your
kernel's
SHMMAX parameter.  You can either reduce the request size or
reconfigure
the kernel with larger SHMMAX.  To reduce the request size
(currently
58302464 bytes), reduce PostgreSQL's shared memory usage, perhaps by
reducing shared_buffers or max_connections.
  If the request size is already small, it's possible
that it is
less than your kernel's SHMMIN parameter, in which case raising the
request size or reconfiguring SHMMIN is called for.
  The PostgreSQL documentation contains more information
about
shared memory configuration.



I've set shared_buffer way down to next to nothing along with
kernel.shmmax and kernel.shmall per some blogs. However, the
same error
persists, and I'm getting no where. I think ultimately the
solution is
to upgrade, but the devs may not be ready for an upgrade at this
point.
Any help would be greatly appreciated. Thanks!


You don't want to decrease kernel.shmmax; you want to set it to the
request size:

sysctl -w kernel.shmmax=1146945536

shmmax is the only thing you really need to play with.

-Andy




On 7/15/2015 9:13 AM, Ryan King - NOAA Affiliate wrote:
> I tried that too - same result. I updated another box w/ the same issue
> to 9.4.4, and all is well there. Thanks for your reply.
>


Ah, I assume then that something else is already using some shared memory.

PG needs:

 To reduce the request size (currently 58302464 bytes),


That much shared memory *free*.  You can check current usage with:  ipcs -m

Add what PG needs to what you are already using, and you should be good 
to go.


-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Index Only Scan vs Cache

2015-07-14 Thread Andy Colson

On 7/14/2015 1:19 PM, Marc Mamin wrote:



On 7/9/2015 12:41 PM, Tom Lane wrote:

Andy Colson  writes:

My question is:  Will PG cache only the index (assuming it can
always do an Index Only Scan), or will it cache the table as
well?


I'm not sure that indexes on tiny tables are useful. They add options
for the query planner to consider, which has its own small cost. I'd be
interested in other opinions on this. Is there any rule of thumb for how
many pages a relation needs before it is worth indexing?

And still another question: I have tiny static tables too that never
got analyzed. Can this fool the query planner in a negative way?

regards,

Marc Mamin



They can be.  A unique constraint to ensure correctness, for example.  In
my case some of my "tiny" tables can be different sizes from 100 to 500
rows.  I want indexes on those in case they get bigger and start to get
slow.  PG can figure out when to use and not to use the index.  I'd 
rather have the safety net.




And still another question: I have tiny static tables too that never got 
analyzed.
Can this fool the query planner in a negative way?


I would say yes.  A tiny table is quickest when it is table scanned, but 
it's only going to be a few milliseconds more if it uses the index (also 
depending on how much the table and index are cached).  For a small 
table I can't imagine the speed difference would even be noticeable.


In my testing, with tables of 100 rows the speed was almost the same 
with an index, a covering index, and no index.


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] could not create shared memory segment: Invalid argument

2015-07-14 Thread Andy Colson

On 7/13/2015 7:08 PM, Ryan King - NOAA Affiliate wrote:

Apologies ahead of time for not knowing which group to send to, but I
wanted to see if anyone has encountered and resolved this type of error.
I'm setting up postgresql 9.2 streaming replication on RH and after
copying the master data directory over to the slave, the psql service
refuses start and gives the following errors.



2015-07-13 23:55:41.224 UTC FATAL:  could not create shared memory
segment: Invalid argument
2015-07-13 23:55:41.224 UTC DETAIL:  Failed system call was
shmget(key=5432001, size=1146945536, 03600).
2015-07-13 23:55:41.224 UTC HINT:  This error usually means that
PostgreSQL's request for a shared memory segment exceeded your kernel's
SHMMAX parameter.  You can either reduce the request size or reconfigure
the kernel with larger SHMMAX.  To reduce the request size (currently
1146945536 bytes), reduce PostgreSQL's shared memory usage, perhaps by
reducing shared_buffers or max_connections.
 If the request size is already small, it's possible that it is
less than your kernel's SHMMIN parameter, in which case raising the
request size or reconfiguring SHMMIN is called for.
 The PostgreSQL documentation contains more information about
shared memory configuration.
2015-07-13 23:56:21.344 UTC FATAL:  could not create shared memory
segment: Invalid argument
2015-07-13 23:56:21.344 UTC DETAIL:  Failed system call was
shmget(key=5432001, size=58302464, 03600).
2015-07-13 23:56:21.344 UTC HINT:  This error usually means that
PostgreSQL's request for a shared memory segment exceeded your kernel's
SHMMAX parameter.  You can either reduce the request size or reconfigure
the kernel with larger SHMMAX.  To reduce the request size (currently
58302464 bytes), reduce PostgreSQL's shared memory usage, perhaps by
reducing shared_buffers or max_connections.
 If the request size is already small, it's possible that it is
less than your kernel's SHMMIN parameter, in which case raising the
request size or reconfiguring SHMMIN is called for.
 The PostgreSQL documentation contains more information about
shared memory configuration.



I've set shared_buffer way down to next to nothing along with
kernel.shmmax and kernel.shmall per some blogs. However, the same error
persists, and I'm getting no where. I think ultimately the solution is
to upgrade, but the devs may not be ready for an upgrade at this point.
Any help would be greatly appreciated. Thanks!


You don't want to decrease kernel.shmmax; you want to set it to the 
request size:


sysctl -w kernel.shmmax=1146945536

shmmax is the only thing you really need to play with.

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to get total count of queries hitting DB per day or per hour?

2015-07-09 Thread Andy Colson

On 7/9/2015 1:04 PM, Sheena, Prabhjot wrote:

Guys

   Is there a way to get the total count of queries hitting a
PostgreSQL DB (version 9.3) per day or per hour? I cannot turn on
log_statement=all because that's too much logging for our log file to
handle. Is there another way to do that?

Thanks

Prabhjot
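
If you just need a rough number without extra tooling, one option is to 
sample pg_stat_database yourself and diff consecutive samples; a minimal 
sketch (it counts transactions rather than individual statements, but it 
trends the same way):

-- run this every hour (or day) and subtract the previous sample
select now() as sampled_at,
       sum(xact_commit + xact_rollback) as total_xacts
from pg_stat_database;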



My collectd config:

<Plugin postgresql>
  <Query ttlbackends>
    Statement "SELECT sum(numbackends) AS count, sum(xact_commit)
               as ttlcommit FROM pg_stat_database;"
    <Result>
      Type "pg_numbackends"
      ValuesFrom "count"
    </Result>
    <Result>
      Type "pg_xact"
      ValuesFrom "ttlcommit"
    </Result>
  </Query>
  <Database ...>
    #Interval 60
    Host "/tmp"
    Port "5432"
    SSLMode "disable"
    Query ttlbackends
  </Database>
</Plugin>





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Index Only Scan vs Cache

2015-07-09 Thread Andy Colson

On 7/9/2015 12:41 PM, Tom Lane wrote:

Andy Colson  writes:

My question is:  Will PG cache only the index (assuming it can always do
an Index Only Scan), or will it cache the table as well?


The table blocks would fall out of cache if they're never touched.

regards, tom lane




Sweet!  Thanks Tom.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] regexp_matches for digit

2015-07-09 Thread Andy Colson

On 7/9/2015 11:24 AM, Ramesh T wrote:

Hi,
   In Oracle: regexp_like(entered
date, '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}', 'i')

For Postgres I have regexp_matches, but how do I match [:digit:] in
Postgres when we pass a date?
Any help?


\d

per:
http://www.postgresql.org/docs/devel/static/functions-matching.html

# select 1 where '1234-56-78' ~ '\d{4}-\d{2}-\d{2}';
 ?column?
--
1
(1 row)



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Index Only Scan vs Cache

2015-07-09 Thread Andy Colson

Hi All.

I have a website db that is 90% read-only.  I have 50 (or so) tiny 
lookup tables, something like:


\d m_zone
 Column  |  Type   | Modifiers
-+-+---
 code| integer | not null
 zone_id | text|
 descr   | text|

This one has fewer than 10 rows; others might hit 100.  I doubt any would 
make it over 500 rows.


All of them have an index on code.  I'm thinking of dropping it and 
creating the index on (code, descr) so that I'd get Index Only Scans.
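
That is, something like this (the index names here are made up):

drop index if exists m_zone_code_idx;   -- whatever the existing index is called
create index m_zone_code_descr_idx on m_zone (code, descr);

-- a lookup like this can then be answered from the index alone,
-- assuming vacuum keeps the visibility map current:
select descr from m_zone where code = 3;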


I host 100 counties (one database, 100 schemas); each will have these 50 
tables, so 5,000 small lookup tables.


My question is:  Will PG cache only the index (assuming it can always do 
an Index Only Scan), or will it cache the table as well?


There is very little difference speed wise, my purpose is to reduce 
cache usage.  Right now I assume I have 5000 tables + 5000 indexes in cache.


I'd like to try to cut that down to either 5000 tables, or 5000 indexes 
(not both).


Thanks for your time,

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] DB access speeds, App(linux)<->PG(linux) vs App(linux) <->MSSql(Windows)

2015-06-24 Thread Andy Colson

-Original Message-
From: Andy Colson [mailto:a...@squeakycode.net]
Sent: Wednesday, June 24, 2015 10:10 AM
To: Gauthier, Dave; Postgres General
Subject: Re: [GENERAL] DB access speeds, App(linux)<->PG(linux) vs App(linux) 
<->MSSql(Windows)

On 6/24/2015 8:44 AM, Gauthier, Dave wrote:

Hi:  I'm trying to get a 10,000 ft understanding of the difference in
DB access speeds for two different scenarios...

Scenario 1: Apps are on linux.  PG DB is on linux (different server
than apps)

Scenario 2: Apps are on linux.  MSSql DB is on Windows (obviously a
different server)

The apps are typically perl scripts using Perl DBI.

I'm thinking that since perl has a DBD driver for PG, that would be
faster than going through ODBC to get to the MsSQL DB, but I'm not
sure about that.

Any insights/guesses ?

Thanks in Advance.



I think it'll depend on your usage.

If you fire off one really big analytic sql that chews on a huge table and does 
stats and counts, then odbc vs native will make no difference.

If you fire off lots and lots of small querys that hit an index and return very 
fast, then native is the way to go.

-Andy




On 6/24/2015 10:13 AM, Gauthier, Dave wrote:>
>>> 1) The same schema, data on both DB servers?
> Yes, identical
>>> 2) Are they the same apps?
> Yes, the same.  The only diff is the db connect.
>>> 3) What are the apps doing?
> The vast, vast majority are small queries, often run inside large 
programming loops.

>
>>> If you fire off lots and lots of small querys that hit an index and 
return very fast, then native is the way to go.
> By this are you implying that there is more overhead using ODBC vs 
Perl DBI-DBD ?   If so, any guess, percentage wise, of how much slower 
it would be? 50% slower?  2x?  Other?

>
> Some (who really don't understand DB mechanics, or even know what 
odbc is) are telling me that there is no difference.  I say there is, 
but this is an intuitive response.  I'm trying to tap more experienced 
people for something more solid.

>
>
>
>


I am totally guessing!  I have never benchmarked odbc vs native.

But you have bigger problems.  You have win vs linux, pg vs mssql, odbc 
vs native to contend with.


Benchmarking a bunch of queries to one vs the other won't tell you which 
(os, db, method) is the slow part.


To only measure odbc vs native you'd need everything else to be the same.

I'd recommend benchmarking your app against each db and just seeing which is 
faster.  You won't know why it's faster, but do you really care?


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] DB access speeds, App(linux)<->PG(linux) vs App(linux) <->MSSql(Windows)

2015-06-24 Thread Andy Colson

On 6/24/2015 8:44 AM, Gauthier, Dave wrote:

Hi:  I'm trying to get a 10,000 ft understanding of the difference in DB
access speeds for two different scenarios...

Scenario 1: Apps are on linux.  PG DB is on linux (different server
than apps)

Scenario 2: Apps are on linux.  MSSql DB is on Windows (obviously a
different server)

The apps are typically perl scripts using Perl DBI.

I'm thinking that since perl has a DBD driver for PG, that would be
faster than going through ODBC to get to the MsSQL DB, but I'm not sure
about that.

Any insights/guesses ?

Thanks in Advance.



I think it'll depend on your usage.

If you fire off one really big analytic sql that chews on a huge table 
and does stats and counts, then odbc vs native will make no difference.


If you fire off lots and lots of small querys that hit an index and 
return very fast, then native is the way to go.


-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Counting the occurences of a substring within a very large text

2015-06-24 Thread Andy Colson

On 6/24/2015 5:55 AM, Marc Mamin wrote:

Hello,

I'd like to count the number of line breaks within a string,
but I get a memory allocation error when using regexp_matches or 
regexp_split_to_table.

Any idea for an alternative approach to this problem?

select count(*)-1 from
(  select regexp_split_to_table(full_message,'(\n)', 'g')
from mytable
where id =-2146999703
)foo;

ERROR:  invalid memory alloc request size 1447215584

regards,

Marc Mamin




If it's a large enough string, switching to plperl/plpython might give 
you a pleasant speedup.  There is a small overhead getting the string 
to/from pg, but the string ops will be much faster.
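
For example, a quick plperl sketch (assuming plperl is installed; the table 
and column names are the ones from your query):

create or replace function count_newlines(t text) returns bigint as $$
  # count newline matches without materializing one row per line
  my $n = () = $_[0] =~ /\n/g;
  return $n;
$$ language plperl immutable strict;

select count_newlines(full_message) from mytable where id = -2146999703;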


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Reg: BULK COLLECT

2015-05-27 Thread Andy Colson





On 05/25/2015 07:24 AM, Medhavi Mahansaria wrote:
 > Hello,
 >
 > I am porting my application from Oracle to PostgreSQL.
 >
 > We are using BULK COLLECT functionality of oracle.
 > How can I change the 'BULK COLLECT' fetch of the data from the cursor
to make it compatible for pg/plsql?
 >
 > A small example is below (this is just an example; the actual query is
much more complex and returns a huge amount of data)
 >
 >
 > CREATE OR REPLACE FUNCTION abc() RETURNS VOID AS $body$
 >
 > DECLARE
 > l_data b%ROWTYPE;
 >
 > POPULATE_STATS CURSOR IS
 > (
 > SELECT * from a
 > )
 > ; // query returning a huge amount of data
 >
 > BEGIN
 >   OPEN POPULATE_STATS;
 >   LOOP
 >   FETCH POPULATE_STATS BULK COLLECT INTO l_data LIMIT 1000;
 > IF POPULATE_STATS%ROWCOUNT > 0
 > THEN
 > FORALL i IN 1..l_data.COUNT
 >   INSERT INTO b VALUES l_data(i);
 > END IF;
 >   IF NOT FOUND THEN EXIT; END IF;
 >   END LOOP;
 >   CLOSE POPULATE_STATS;
 > EXCEPTION
 >   WHEN OTHERS THEN
 >   CODE := SQLSTATE;
 >   MSG := SQLERRM;
 >   INSERT INTO tracker VALUES (CODE,MSG,LOCALTIMESTAMP);
 >   RAISE NOTICE 'SQLERRM';
 >   RAISE NOTICE '%', SQLSTATE;
 >   RAISE NOTICE '%', MSG;
 > END;
 >
 > $body$
 > LANGUAGE PLPGSQL;
 >
 > How can I change the 'BULK COLLECT' fetch of the data from the cursor
to make it compatible for pg/plsql?
 >
 >
 > Thanks & Regards
 > Medhavi Mahansaria
 > Mailto: medhavi.mahansa...@tcs.com
 >

That seems like a pretty overcomplicated version of:

insert into b select * from a;

Which is all you'll need in PG.  If it does something else, then I
failed to understand the stored proc.

-Andy




On 5/27/2015 12:52 AM, Medhavi Mahansaria wrote:
> Dear Andy,
>
> We are using bulk collect to enhance the performance as the data is huge.
>
> But as you said it is ideally insert into b select * from a;
>
> So now I am using the looping through query result option as Adrian
> suggested.
>
> 
http://www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING 


>
>
> Thank You Adrian.
>
>
> Thanks & Regards
> Medhavi Mahansaria




Did you time it?  I'll bet "insert into b select * from a" is the 
fastest method.
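
For comparison, the row-by-row version Adrian pointed at boils down to 
something like this (a rough sketch, assuming a and b have the same row 
shape):

create or replace function abc() returns void as $body$
declare
  r a%rowtype;
begin
  for r in select * from a loop
    insert into b select r.*;   -- one insert per row
  end loop;
end;
$body$ language plpgsql;

Every row becomes its own INSERT, which is exactly the per-row overhead 
that the single "insert into b select * from a" avoids.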


-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Reg: BULK COLLECT

2015-05-25 Thread Andy Colson

On 05/25/2015 07:24 AM, Medhavi Mahansaria wrote:

Hello,

I am porting my application from Oracle to PostgreSQL.

We are using BULK COLLECT functionality of oracle.
How can I change the 'BULK COLLECT' fetch of the data from the cursor to make 
it compatible for pg/plsql?

A small example is below (this is just an example; the actual query is much more 
complex and returns a huge amount of data)


CREATE OR REPLACE FUNCTION abc() RETURNS VOID AS $body$

DECLARE
l_data b%ROWTYPE;

POPULATE_STATS CURSOR IS
(
SELECT * from a
)
; // query returning a huge amount of data

BEGIN
  OPEN POPULATE_STATS;
  LOOP
  FETCH POPULATE_STATS BULK COLLECT INTO l_data LIMIT 1000;
IF POPULATE_STATS%ROWCOUNT > 0
THEN
FORALL i IN 1..l_data.COUNT
  INSERT INTO b VALUES l_data(i);
END IF;
  IF NOT FOUND THEN EXIT; END IF;
  END LOOP;
  CLOSE POPULATE_STATS;
EXCEPTION
  WHEN OTHERS THEN
  CODE := SQLSTATE;
  MSG := SQLERRM;
  INSERT INTO tracker VALUES (CODE,MSG,LOCALTIMESTAMP);
  RAISE NOTICE 'SQLERRM';
  RAISE NOTICE '%', SQLSTATE;
  RAISE NOTICE '%', MSG;
END;

$body$
LANGUAGE PLPGSQL;

How can i change the 'BULK COLLECT' fetch of the data from the cursor to make 
if compatible for pg/plsql?


Thanks & Regards
Medhavi Mahansaria
Mailto: medhavi.mahansa...@tcs.com



That seems like a pretty overcomplicated version of:

insert into b select * from a;

Which is all you'll need in PG.  If it does something else, then I failed to 
understand the stored proc.

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Setting up a database for 10000 concurrent users

2015-05-04 Thread Andy Colson


On 05/04/2015 02:02 AM, sanjeetkamble wrote:

Hello,

Please let me know how the database server is started with max_connections =
10000 ???
1 ???

I have the same issue, but I have SAN storage where Postgresql is 
installed.


Sanjeet


On Mon, May 4, 2015 at 10:08 AM, Andy Colson mailto:a...@squeakycode.net>> wrote:

No doubt that would be a problem.  It's a bad idea.  Set max_connections to 
core count * 2, then put pg_pool in front, and set pg_pool's max count to 10000.

-Andy




On 05/04/2015 09:22 AM, Melvin Davidson wrote:


I suggest pg_bouncer as opposed to pg_pool. My testing showed it handled 
connections better. Ultimately the choice is yours, but with 10,000 connections, 
you absolutely need a connection manager.





Oops.  I meant pg_bouncer too.  (I haven't had caffeine yet).

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Setting up a database for 10000 concurrent users

2015-05-04 Thread Andy Colson

On 05/04/2015 02:02 AM, sanjeetkamble wrote:

Hello,

Please let me know how the database server is started with max_connections =
10000 ???

I have the same issue, but I have SAN storage where Postgresql is installed.


Sanjeet




No doubt that would be a problem.  It's a bad idea.  Set max_connections to core 
count * 2, then put pg_pool in front, and set pg_pool's max count to 10000.

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COALESCE woes

2015-04-24 Thread Andy Colson

On 04/24/2015 08:06 AM, Greg Spiegelberg wrote:

Hi PG List,

I'm missing something or haven't had enough coffee yet.  What gives with the 
COALESCE in the view below?

mxl_sqr=# \d users
  Table "public.users"
  Column  |  Type   | Modifiers
-+-+---
  user_id | integer | not null
Indexes:
 "users_pkey" PRIMARY KEY, btree (user_id)

CREATE TABLE ts1 (
  user_id int references users(user_id),
  ts  timestamptz default now()
);

CREATE TABLE ts2 (
  user_id int references users(user_id),
  ts  timestamptz default now()
);

CREATE TABLE ts3 (
  user_id int references users(user_id),
  ts  timestamptz default now()
);

CREATE OR REPLACE VIEW user_timestamps
AS
SELECT COALESCE(ts1.user_id, ts2,user_id, ts3.user_id) AS user_id,
max(ts1.ts) AS ts_x,
max(ts2.ts) AS ts_y,
max(ts3.ts) AS ts_z
   FROM   ts1
LEFT JOIN ts2 USING (user_id)
LEFT JOIN ts3 USING (user_id)
  GROUP BY 1;
ERROR:  COALESCE types integer and ts2 cannot be matched
LINE 3: SELECT COALESCE(ts1.user_id, ts2,user_id, ts3.user_id) AS us...
                                     ^


All types match from start to finish.

Thanks,
-Greg


Maybe dot instead of comma?  (ts2.user_id instead of ts2,user_id)
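
That is, the first line of the select list would become:

SELECT COALESCE(ts1.user_id, ts2.user_id, ts3.user_id) AS user_id,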

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What constitutes "reproducible" numbers from pgbench?

2015-04-23 Thread Andy Colson

On 4/23/2015 4:07 AM, holger.friedrich-fa-triva...@it.nrw.de wrote:

On Tuesday, April 21, 2015 7:43 PM, Andy Colson wrote:

On 4/21/2015 9:21 AM, holger.friedrich-fa-triva...@it.nrw.de wrote:

Exactly what constitutes "reproducible" values from pgbench?  I keep
getting a range between 340 tps and 440 tps or something like that

I think it's common to get different timings.  I think it's ok because things are 
changing (files, caches, indexes, etc).


Qingqing Zhou wrote that the range between 340 tps and 440 tps I keep getting 
is not ok and numbers should be the same within several per cent.  Of course, 
if other things are going on on the physical server, I can't always expect a 
close match.



I disagree.  Having a reproducible test within a few percent is a great 
result.  But any result is informative.  Your tests tell you an upper 
and lower bound on performance.  It tells you to expect a little 
variance in your work load.  It probably tells you a little about how 
your vm host is caching writes to disk.  You are feeling the pulse of 
your hardware.  Each hardware setup has its own pulse, and understanding 
it will help you understand how it'll handle a load.


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What constitutes "reproducible" numbers from pgbench?

2015-04-21 Thread Andy Colson

On 4/21/2015 9:21 AM, holger.friedrich-fa-triva...@it.nrw.de wrote:

Hello list,
Exactly what constitutes "reproducible" values from pgbench?  I keep
getting a range between 340 tps and 440 tps or something like that using
the same command line on the same machine.  Is that reproducible enough?
The docs state that one should verify that the numbers are reproducible,
so I repeat any test run ten times before believing the results.  I’ve
tried increasing the test duration (-T) from one minute to five minutes,
then turning off autovacuum (in postgresql.conf) as recommended by the
docs, but the range of results is not getting any narrower.  So what
does “reproducible” mean as applied to pgbench?
Obviously I could be doing something wrong, such as missing some vital
configuration option…
Thanks in advance for any insights.
Cheers,
Holger Friedrich


I think it's common to get different timings.  I think it's ok because 
things are changing (files, caches, indexes, etc).


If you run three to five short runs, they should all be within the same 
range (say 340 to 440).  If you are planning hardware, you might take 
the worst case and purchase based on that.  If you are planning 
schedules you might use the average case.  If you are bragging on the 
newsgroups use the best case :-).


If you want more realistic then keep vacuum enabled and run for 24 
hours.  In the real world, you are going to vacuum, so benchmark it too.


If you are playing with postgres.conf settings, then three runs of a few 
minutes each will give you an average, and you can compare different 
settings based on that.


As Qingqing said, a read-only test should be more stable, because you 
are comparing apples to apples.  A read-write test is changing under the 
hood so expect some differences.


Also, if your test data is small, or large, you are benchmarking 
different things. (lock speed, cpu speed, disk io, etc)


pgbench is good for a first test, but its going to act different than 
your real world work load.


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best way to migrate a 200 GB database from PG 2.7 to 3.6

2015-04-15 Thread Andy Colson

On 4/15/2015 2:03 PM, Filip Lyncker wrote:

Dear List,

I need to migrate my database from a 2.x to 3.x.  Usually I'm using
pg_basebackup, but this is not possible with different versions.
pg_dump seems to be an option, but it is slow as hell and I don't want
to stay offline all that time.
Is there another possibility to migrate a database with 200 GB?

Thanks a lot,

cheers

Filip




Postgresql is on version 9.  What do you mean version 2 or 3?

Maybe use replication?

http://www.slony.info/documentation/1.2/versionupgrade.html

Or something similar?

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partial index-based load balancing

2015-03-31 Thread Andy Colson

On 3/31/2015 3:28 PM, Fabio Ugo Venchiarutti wrote:



On 01/04/15 06:12, Andy Colson wrote:

On 3/31/2015 1:58 AM, Fabio Ugo Venchiarutti wrote:

Greetings


I'm working for a startup and our core DB is growing rather fast.

Our target scale is large enough that we expect some of our core tables'
indexes to grow bigger than the memory on any single node over the next
couple years (our current intended design involves conventional
stream-replication-based, write-on-one-read-from-many load balancing).

We don't fancy the idea of using inheritance through partitioning due to
the maintenance overhead and our reliance on validation
constraints/triggers.

My proposal will be to instead create a number of partial indexes
covering predefined ranges of client IDs, then use a connection-level
routing mechanism that relies on what range the relevant client's data
belongs to in order to address the right node and match the right
partial index.

The idea is to have any given read-only node hold just one of the
partial indexes in its cache and never fetch index pages off its
secondary storage.
Scaling would just be a matter of increasing the partitioning density.


I'm going to assume that I'm not the first one to come up with this
strategy (and that there already is a name for it. If so, what is it?).


Is it a valid setup or am I missing some key aspect of how index
partitioning is meant to work?


TIA


Best regards


Fabio Ugo Venchiarutti




Have you timed it?  It sounds like it should work ok, but just to play
devil's advocate, it sounds a little overly complex.

Here's my thinking: have one index, have lots of read-only nodes, and do the
connection-level routing.  I think the PG caching and OS caching will be
smart enough to cache the parts of the index that are hot per node.  You
are also less likely to confuse the planner.

For example, I have one db, with web hit logging, with a timestamp
index.  It's upward of 25 gig, but I only ever query the last 24 hours
to get ip/session/hit counts.  That part of the index/db is cached very
well and responds very quickly.  If I ever hit data that's many days old,
it slows down, hitting the HD.

I don't know for sure which method would be faster though.  The only
real difference between the two methods is the number of indexes.  I'd
bet there is no speed difference between them.  PG will figure out what
data is hot and cache it.

-Andy





I always assumed that an index, or at least the root level of a BTREE
one, had to be completely cached in order to guarantee instantaneous
lookups, but your thinking seems sound, as after the first-level
resolution only certain branches will be walked on any given slave.

Will try both approaches and update you with my findings if you're
interested (won't happen before a few weeks at least due to management
bottlenecks).

Many thanks


F





Yes, interested.  Everyone loves a good benchmark.  And if your solution
with multiple partial indexes (which I've never seen discussed on this
list before) is faster, then I propose we name it after you.  :-)


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] inputs into query planner costing

2015-03-31 Thread Andy Colson

On 3/31/2015 10:31 AM, Mike Roest wrote:

Hi there,
I'm having an issue with query performance between 2 different pgsql
environments.

The first is our current production postgres server, which is running
9.3.5 on CentOS 5 x64.  The second system is Amazon's RDS postgres as a
service.  On our local DB server we have a query that executes in a
reasonable amount of time (600 msec).  On RDS the query will run for
more than 10 minutes on a similarly CPU-specced system.  I've been
working through this with Amazon support and I'm looking for more
suggestions on where to look (both for me and to direct Amazon).  The RDS
system does use a network filesystem while our production server is a
local RAID10 array; I can see that affecting the actual performance of the
query but not the query planner costing (unless there's an input to
query planner costing that I can't find).

The query plan costs generated by the 2 systems are vastly different,
while the plans themselves are basically identical other than the
materialization that RDS is doing (if I disable the materialization then
they are almost the same other than a seq scan/heap scan on one small
<2000-row table).  All the tables in the query have been analyzed on
each server without any impact.

Current Production
Explain:
http://explain.depesz.com/s/Tkyc
Explain Analyze
http://explain.depesz.com/s/UnQt

RDS: (with enable_material=off)
http://explain.depesz.com/s/vDiV

(with enable_material=on)
http://explain.depesz.com/s/HUjx

I have validated that all the query planning configuration variables on
this page
http://www.postgresql.org/docs/9.3/static/runtime-config-query.html are
the same between the 2 environments.  If I modify the local production
system values for things like random_page_cost and seq_page_cost to
absurdly high values like 6, I can get it to generate a similar planner
cost.  Similarly, if I lower the RDS values to absurdly low values like
.0001, I can get it to generate a similarly costed plan (while still
performing horridly).

I've reached the end of things I can think of (I'm also working on
rewriting the query, but it's a generated query out of an infrastructure
component so it's not a simple change).

Just looking for any ideas on additional things to look into.

The query is available here:
https://www.dropbox.com/s/m31ct6k0mod0576/simplifiedquery.sql?dl=0

--
Data's inconvenient when people have opinions.


Seems like there is no usable index on table timesheet.  It always
seems to table scan all 99K rows, several times.


I'll bet on RDS that table scan is super slow.

The RDS plans seem to be just explain?  I assume it's too slow to run an
explain analyze on?  Would be neat to see explain analyze from RDS.
(Any way to add a little extra where magic to cut the rows down to a
usable, but still slow, sample?)



On this one:
http://explain.depesz.com/s/UnQt

Line 11 table scans 99K rows, then all those rows are carried up the 
chain (lines 10, 9, 8, 7 and 6).  Any way to reduce the row count 
earlier?   Line 5 finally seems to filter out 94K rows.


Would be neat to see if these are buffered reads or are actually hitting
disk too (something like EXPLAIN (ANALYZE, BUFFERS) SELECT ...).
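
For example, something along these lines.  This is only a sketch; the date
column is made up, the point is just to restrict the rows and show the
buffer counts:

-- sketch only: assumes timesheet has some date column to filter on
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM timesheet
WHERE some_date_column >= current_date - 7;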


-Andy




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partial index-based load balancing

2015-03-31 Thread Andy Colson

On 3/31/2015 1:58 AM, Fabio Ugo Venchiarutti wrote:

Greetings


I'm working for a startup and our core DB is growing rather fast.

Our target scale is large enough that we expect some of our core tables'
indexes to grow bigger than the memory on any single node over the next
couple years (our current intended design involves conventional
stream-replication-based, write-on-one-read-from-many load balancing).

We don't fancy the idea of using inheritance through partitioning due to
the maintenance overhead and our reliance on validation
constraints/triggers.

My proposal will be to instead create a number of partial indexes
covering predefined ranges of client IDs, then use a connection-level
routing mechanism that relies on what range the relevant client's data
belongs to in order to address the right node and match the right
partial index.

The idea is to have any given read-only node hold just one of the
partial indexes in its cache and never fetch index pages off its
secondary storage.
Scaling would just be a matter of increasing the partitioning density.


I'm going to assume that I'm not the first one to come up with this
strategy (and that there already is a name for it. If so, what is it?).


Is it a valid setup or am I missing some key aspect of how index
partitioning is meant to work?


TIA


Best regards


Fabio Ugo Venchiarutti




Have you timed it?  It sounds like it should work ok, but just to play
devil's advocate, it sounds a little overly complex.


Here's my thinking: have one index, have lots of read-only nodes, and do the
connection-level routing.  I think the PG caching and OS caching will be
smart enough to cache the parts of the index that are hot per node.  You
are also less likely to confuse the planner.


For example, I have one db, with web hit logging, with a timestamp
index.  It's upward of 25 gig, but I only ever query the last 24 hours
to get ip/session/hit counts.  That part of the index/db is cached very
well and responds very quickly.  If I ever hit data that's many days old,
it slows down, hitting the HD.


I don't know for sure which method would be faster though.  The only 
real difference between the two methods is the number of indexes.  I'd 
bet there is no speed difference between them.  PG will figure out what 
data is hot and cache it.
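
For concreteness, here is roughly what the two approaches look like.  This is
only a sketch; the table, column, and range boundaries are made up:

-- one full index, shared by every node; caching keeps the hot part in memory
CREATE INDEX orders_client_idx ON orders (client_id);

-- versus one partial index per predefined client-id range
CREATE INDEX orders_client_r1_idx ON orders (client_id)
    WHERE client_id BETWEEN 1 AND 1000000;
CREATE INDEX orders_client_r2_idx ON orders (client_id)
    WHERE client_id BETWEEN 1000001 AND 2000000;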


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Link Office Word form document with data from PostgreSQL

2015-03-30 Thread Andy Colson

On 3/30/2015 10:18 AM, John R Pierce wrote:

On 3/30/2015 1:20 AM, avpro avpro wrote:


Have you ever tried populating fields in an MS Office Word document
with PostgreSQL data available in a table or a view?

I haven’t seen anything on the web; it only seems possible with MS
products, VS or Access.

Thank you for your input



Microsoft Office tools like Word should be able to directly access
Postgres databases via ODBC or OLEDB with a suitable Postgres driver
(odbc or oledb) installed on the Windows system and configured to
access the Postgres database.  I've done this in the past with Excel;
no reason it wouldn't work in Word.





Agreed.  I too have used PG odbc to link with Excel.  If it works in
Excel, it should work in Word.
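
As a rough illustration only, a DSN-less connection string for the psqlODBC
driver looks something like the line below.  The server, database, and
credentials are placeholders, and the exact keyword names can vary with the
driver version:

Driver={PostgreSQL Unicode};Server=dbhost;Port=5432;Database=mydb;Uid=myuser;Pwd=secret;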


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How does one make the following psql statement sql-injection resilient?

2015-03-16 Thread Andy Colson

On 3/16/2015 4:45 PM, Andy Colson wrote:

On 3/16/2015 4:30 PM, David G. Johnston wrote:

psql "$SERVICE" \
  --echo-queries \
  --set=string_input="${1:-ok_to_return}" \
  --set=start="${2:-5}" \
  --set=end="${3:-10}" \
<<'SQL'
 SELECT idx
 FROM generate_series(1, 20) gs (idx)
 WHERE 'short-circuit' != :'string_input'
 AND idx BETWEEN :start AND :end;
SQL

# (6 rows)

--set=end="${3:-10 AND false}"

# (0 rows)

Am I forced to represent the input as text (using :'end') and then
perform a conversion to integer?

Thanks!

David J.




The --set options make it a little complicated.  How about:

string_input="${1:-ok_to_return}"
start="${2:-5}"
end="${3:-10}"

psql "$SERVICE" --echo-queries <<'SQL'
  prepare tmp as SELECT idx
  FROM generate_series(1, 20) gs (idx)
  WHERE 'short-circuit' != $1
  AND idx BETWEEN $2 AND :$3;

   execute tmp($string_input, $start, $end);
   deallocate tmp;
SQL

That's untested, and probably won't work.  The "execute tmp($1, $2, $3)"
needs to be passed to psql as-is, but $string_input, $start and $end need
to be replaced in bash before it's sent to psql.  Maybe use \$1?

Docs here:

http://www.postgresql.org/docs/9.4/static/sql-prepare.html


-Andy




Wow.  Sorry.  What a mess.

>   AND idx BETWEEN $2 AND :$3;
should be:
    AND idx BETWEEN $2 AND $3;


> That's untested, and probably won't work.  The "execute tmp($1, $2, $3)"
should be: execute tmp($string_input, $start, $end);
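
For the integer inputs, another option is to keep the --set variables but
quote them and cast, so a value like "10 AND false" fails with a cast error
instead of being injected.  Just a sketch of that variant, using the same
script from above:

SELECT idx
FROM generate_series(1, 20) gs (idx)
WHERE 'short-circuit' != :'string_input'
AND idx BETWEEN :'start'::int AND :'end'::int;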

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How does one make the following psql statement sql-injection resilient?

2015-03-16 Thread Andy Colson

On 3/16/2015 4:30 PM, David G. Johnston wrote:

psql "$SERVICE" \
  --echo-queries \
  --set=string_input="${1:-ok_to_return}" \
  --set=start="${2:-5}" \
  --set=end="${3:-10}" \
<<'SQL'
 SELECT idx
 FROM generate_series(1, 20) gs (idx)
 WHERE 'short-circuit' != :'string_input'
 AND idx BETWEEN :start AND :end;
SQL

# (6 rows)

--set=end="${3:-10 AND false}"

# (0 rows)

Am I forced to represent the input as text (using :'end') and then
perform a conversion to integer?

Thanks!

David J.




The --set options make it a little complicated.  How about:

string_input="${1:-ok_to_return}"
start="${2:-5}"
end="${3:-10}"

psql "$SERVICE" --echo-queries <<'SQL'
 prepare tmp as SELECT idx
 FROM generate_series(1, 20) gs (idx)
 WHERE 'short-circuit' != $1
 AND idx BETWEEN $2 AND :$3;

  execute tmp($string_input, $start, $end);
  deallocate tmp;
SQL

That's untested, and probably won't work.  The "execute tmp($1, $2, $3)"
needs to be passed to psql as-is, but $string_input, $start and $end need
to be replaced in bash before it's sent to psql.  Maybe use \$1?


Docs here:

http://www.postgresql.org/docs/9.4/static/sql-prepare.html


-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

