Re: [GENERAL] Performance degradation when using auto_explain

2017-01-04 Thread Andreas Kretschmer
Kisung Kim  wrote:

> And finally I found that auto_explain is the cause of the problem.

Real hardware or virtual hardware? On virtual hardware there are sometimes
problems with exact timings; please read:

https://www.postgresql.org/docs/current/static/pgtesttiming.html


Regards, Andreas Kretschmer
-- 
Andreas Kretschmer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread vod vos
Now I am confused: I can create 1100 columns in a table in PostgreSQL, but 
I can't copy 1100 values into the table. And I really don't want to split the 
csv file into pieces, to avoid mistakes after doing so.



I created a table with 1100 columns, all of data type varchar, hoping the COPY 
command would automatically load the csv data, which contains some character and 
date values, though most of it is numeric.

 

I use the command: COPY rius FROM "/var/www/test/test.csv" WITH DELIMITER ';' ;



Then it shows: 



ERROR:  row is too big: size 11808, maximum size 8160

 On 星期二, 03 一月 2017 05:24:18 -0800 John McKown 
 wrote 




On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent  wrote:

Perhaps this is your opportunity to correct someone else's mistake. You need to 
show the table definition to convince us that it cannot be improved. That it 
may be hard work really doesn't mean it's not the right path.




​This may not be possible. The data might be coming in from an external source. 
I imagine you've run into the old "well, _we_ don't have any problems, so it 
must be on your end!" scenario. 



Example: we receive CSV files from an external source. These files are 
_supposed_ to be validated. But we have often received files where NOT NULL 
fields have "nothing" in them. E.g. a customer bill which has _everything_ 
in it _except_ the customer number (or an invalid one such as "123{"); or 
missing some other vital piece of information.



In this particular case, the OP might want to do what we did in a similar case. 
We had way too many columns in a table. The performance was horrible. We did an 
analysis and, as usual, the majority of the selects were for a subset of the 
columns, about 15% of the total. We "split" the table into the "high use" 
columns table & the "low use" columns table. We then used triggers to make 
sure that if we added a new / deleted an old row from one table, the 
corresponding row in the other was created / deleted.
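
A minimal sketch of the trigger-based row sync described above (all names are
hypothetical, not from the thread; both tables are assumed to share the same
primary key value):

CREATE TABLE high_use (id bigint PRIMARY KEY /*, frequently queried columns */);
CREATE TABLE low_use  (id bigint PRIMARY KEY /*, rarely queried columns */);

CREATE FUNCTION sync_low_use() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO low_use (id) VALUES (NEW.id);  -- create the companion row
        RETURN NEW;
    ELSE  -- DELETE
        DELETE FROM low_use WHERE id = OLD.id;     -- drop the companion row
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER high_use_sync
    AFTER INSERT OR DELETE ON high_use
    FOR EACH ROW EXECUTE PROCEDURE sync_low_use();
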
--
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general


-- 
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown

Re: [GENERAL] Row value expression much faster than equivalent OR clauses

2017-01-04 Thread Kevin Grittner
On Wed, Dec 28, 2016 at 10:58 AM, Steven Grimm  wrote:

> WHERE ((e.timeStamp > '2016-12-19T20:34:22.315Z')
> OR (e.timeStamp = '2016-12-19T20:34:22.315Z'
> AND e.sequenceNumber > 0)
> OR (e.timeStamp = '2016-12-19T20:34:22.315Z'
> AND e.sequenceNumber = 0
> AND e.aggregateIdentifier >
> 'dev:642e1953-2562-4768-80d9-0c3af9b0ff84'))

> This uses the index on the three columns it's using for ordering of events,
> but (if I'm reading the explain output correctly) does a full scan of the
> index.

> I played around with it a little and one thing I tried was to restructure
> the WHERE clause using a row value expression that's semantically equivalent
> to the original.

> WHERE (e.timeStamp, e.sequenceNumber, e.aggregateIdentifier) >
> ('2016-11-19T20:34:22.315Z', 0, 
> 'dev:642e1953-2562-4768-80d9-0c3af9b0ff84')

> This ends up being a LOT faster:

Yup.

> I wonder if the query planner could recognize that the two queries are
> equivalent and choose the second plan for the OR-clause version, or at least
> use the index more efficiently.

Theoretically it could, but that would add significant time to
planning for a large number of queries, with no benefit to those
who explicitly write the query in the faster (and more concise!)
fashion.

You could come a lot closer to the performance of the row value
expression technique by using the logical equivalent of your
original query that puts AND at the higher level and OR at the
lower level.  (Having OR at the top is generally inefficient.)

WHERE (e.timeStamp >= '2016-12-19T20:34:22.315Z'
       AND (e.timeStamp > '2016-12-19T20:34:22.315Z'
            OR (e.sequenceNumber >= 0
                AND (e.sequenceNumber > 0
                     OR e.aggregateIdentifier >
                        'dev:642e1953-2562-4768-80d9-0c3af9b0ff84'))))

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Adrian Klaver

On 01/04/2017 05:00 AM, vod vos wrote:

Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I really


As pointed out previously:

https://www.postgresql.org/about/
Maximum Columns per Table   250 - 1600 depending on column types

That being dependent on both the number of columns and the actual data 
in the columns. Empty columns are not the problem, it is when you start 
filling them that you get the error.



dont want to split the csv file to pieces to avoid mistakes after this
action.

I create a table with 1100 columns with data type of varchar, and hope
the COPY command will auto transfer the csv data that contains some


I am afraid the solution is going to require more than hope. You are 
going to need to break the data up. I suspect that just splitting it 
in half would do the trick. So:


Table 1
column 1 for a primary key(assuming first column of your present data)
columns 2-550

Table 2
column 1 for a primary key(assuming first column of your present data)
columns 551-1100

Using the program I mentioned previously:

http://csvkit.readthedocs.io/en/1.0.1/scripts/csvcut.html

That translates into:

csvcut -c 1,2-550 your_big.csv > table_1.csv

csvcut -c 1,551-1100 your_big.csv > table_2.csv
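
The corresponding loads would then be along these lines (a sketch, not from the
thread; paths are placeholders, and the delimiter/HEADER options must match what
the split files actually contain - csvkit writes standard comma-separated output
by default):

COPY table_1 FROM '/path/to/table_1.csv' WITH (FORMAT csv);
COPY table_2 FROM '/path/to/table_2.csv' WITH (FORMAT csv);
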



character and date, most of which are numeric.


Is this a different data set?
Previously you said:
"The most of the data type are text or varhcar, ..."


I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
DELIMITER ';' ;

Then it shows:

ERROR:  row is too big: size 11808, maximum size 8160







 On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown
* wrote 

On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent mailto:robjsarg...@gmail.com>>wrote:

Perhaps this is your opportunity to correct someone else's
mistake. You need to show the table definition to convince us
that it cannot be improved. That it may be hard work really
doesn't mean it's not the right path.


​This may not be possible. The data might be coming in from an
external source. I imagine you've run into the old "well, _we_ don't
have any problems, so it must be on your end!" scenario.

Example: we receive CSV files from an external source. These files
are _supposed_ to be validated. But we have often received files
where NOT NULL fields have "nothing" in them them. E.g. a customer
bill which has _everything_ in it _except_ the customer number (or
an invalid one such as "123{"); or missing some other vital piece of
information.

In this particular case, the OP might want to do what we did in a
similar case. We had way too many columns in a table. The
performance was horrible. We did an analysis and, as usual, the
majority of the selects were for a subset of the columns, about 15%
of the total. We "split" the table into the "high use" columns table
& the "low use" columns table. We then used triggers to make sure
that if we added a new / deleted an old row from one table, the
corresponding row in the other was created / deleted.





--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org )
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




--
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Pavel Stehule
Hi

2017-01-04 14:00 GMT+01:00 vod vos :

> Now I am confused about I can create 1100 columns in a table in
> postgresql, but I can't copy 1100 values into the table. And I really dont
> want to split the csv file to pieces to avoid mistakes after this action.
>

The PostgreSQL limit is "Maximum Columns per Table: 250 - 1600 depending on
column types" - this limit is related to fitting the values, or pointers to
the values, onto one page (8KB).

You can hit this limit not at CREATE TABLE time, but at INSERT time.
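
An illustrative sketch of that distinction (hypothetical table, not the OP's
schema): the wide table is created without complaint, and the error only
appears once a row's on-page data exceeds the ~8 kB limit.

DO $$
DECLARE
    cols text;
BEGIN
    -- build "c1 varchar, c2 varchar, ... c1100 varchar"
    SELECT string_agg('c' || i || ' varchar', ', ') INTO cols
    FROM generate_series(1, 1100) AS i;
    EXECUTE 'CREATE TABLE wide_demo (' || cols || ')';  -- succeeds
END;
$$;
-- a later INSERT/COPY whose values do not fit one page fails with
--   ERROR:  row is too big: size ..., maximum size 8160
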


>
> I create a table with 1100 columns with data type of varchar, and hope the
> COPY command will auto transfer the csv data that contains some character
> and date, most of which are numeric.
>

Numeric is an expensive type - try to use float instead, maybe double precision.
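
For a quick feel for the difference (a sketch; the literal is arbitrary),
pg_column_size() shows the per-value storage:

SELECT pg_column_size(123456.789::numeric)          AS numeric_bytes,
       pg_column_size(123456.789::double precision) AS float8_bytes;  -- float8 is always 8 bytes
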

Regards

Pavel


> I use the command: COPY rius FROM "/var/www/test/test.csv" WITH DELIMITER
> ';' ;
>
> Then it shows:
>
> ERROR:  row is too big: size 11808, maximum size 8160
>
>
>
>
>
>
>
>  On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown
> >* wrote 
>
> On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent  wrote:
>
> Perhaps this is your opportunity to correct someone else's mistake. You
> need to show the table definition to convince us that it cannot be
> improved. That it may be hard work really doesn't mean it's not the right
> path.
>
>
> ​This may not be possible. The data might be coming in from an external
> source. I imagine you've run into the old "well, _we_ don't have any
> problems, so it must be on your end!" scenario.
>
> Example: we receive CSV files from an external source. These files are
> _supposed_ to be validated. But we have often received files where NOT NULL
> fields have "nothing" in them them. E.g. a customer bill which has
> _everything_ in it _except_ the customer number (or an invalid one such as
> "123{"); or missing some other vital piece of information.
>
> In this particular case, the OP might want to do what we did in a similar
> case. We had way too many columns in a table. The performance was horrible.
> We did an analysis and, as usual, the majority of the selects were for a
> subset of the columns, about 15% of the total. We "split" the table into
> the "high use" columns table & the "low use" columns table. We then used
> triggers to make sure that if we added a new / deleted an old row from one
> table, the corresponding row in the other was created / deleted.
>
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>
> --
> There’s no obfuscated Perl contest because it’s pointless.
>
> —Jeff Polk
>
> Maranatha! <><
> John McKown
>
>
>


[GENERAL] vacuum of empty table slows down as database table count grows

2017-01-04 Thread marcin kowalski
I am experiencing an odd issue. I've noticed it on 9.3, but I can
reproduce it on 9.6.

Basically, I have a database with a lot of schemas, but not that much data.
Each schema is maybe 2-4 GB in size, and often much less than that.

The database has ~300-500 schemas, each with ~100-300 tables. Generally a
few hundred thousand tables total. The entire cluster has 2 or 3 such databases.

As the number of tables grows, the time it takes to vacuum an _empty_ table
grows as well. The table is in the public schema, and it is the only table
there.

I made a simple testing script to make sure that these things are related.
I set up a blank database, create a table with one column in public and
restore one schema. Then I vacuum that table three times, measure the
execution times and repeat the process, adding another schema to the db.
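
A minimal psql sketch of that measurement loop (names are placeholders; run it
against a scratch database):

\timing on
CREATE TABLE public.probe (id int);
VACUUM VERBOSE public.probe;
VACUUM VERBOSE public.probe;
VACUUM VERBOSE public.probe;
-- restore one more schema, then repeat the three VACUUMs and compare
-- the reported times as the total table count grows
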

At ~200 tables it takes ~100ms for psql to issue a vacuum verbose and exit.
At 83K tables the time is already at ~1.5 seconds. The time appears to be
directly proportional to the table count and grows linearly, eventually
crossing past 3 seconds - for a blank table with no data.

I think this may severely impact the entire vacuumdb run, but I have not
verified that yet.

This is independent of the amount of data restored; I see the same
behavior with schema-only restores as well as with schema+data restores.

If anyone is interested I can upload the schema data + my benchmarking
script with the collected whisper data from my test run (I've been plotting it
in grafana via carbon).

Is this a known issue? Can I do anything to improve performance here?


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Adrian Klaver

On 01/04/2017 06:54 AM, Pavel Stehule wrote:

Hi

2017-01-04 14:00 GMT+01:00 vod vos mailto:vod...@zoho.com>>:

__
Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I
really dont want to split the csv file to pieces to avoid mistakes
after this action.


The PostgreSQL limit is "Maximum Columns per Table250 - 1600 depending
on column types" - this limit is related to placing values or pointers
to values to one page (8KB).

You can hit this limit not in CREATE TABLE time, but in INSERT time.



I create a table with 1100 columns with data type of varchar, and
hope the COPY command will auto transfer the csv data that contains
some character and date, most of which are numeric.


Numeric is expensive type - try to use float instead, maybe double.


If I am following the OP correctly the table itself has all the columns 
declared as varchar. The data in the CSV file is a mix of text, date and 
numeric, presumably cast to text on entry into the table.




Regards

Pavel


I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
DELIMITER ';' ;

Then it shows:

ERROR:  row is too big: size 11808, maximum size 8160







 On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown
mailto:john.archie.mck...@gmail.com>>* wrote 

On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent
mailto:robjsarg...@gmail.com>>wrote:

Perhaps this is your opportunity to correct someone else's
mistake. You need to show the table definition to convince
us that it cannot be improved. That it may be hard work
really doesn't mean it's not the right path.


​This may not be possible. The data might be coming in from an
external source. I imagine you've run into the old "well, _we_
don't have any problems, so it must be on your end!" scenario.

Example: we receive CSV files from an external source. These
files are _supposed_ to be validated. But we have often received
files where NOT NULL fields have "nothing" in them them. E.g. a
customer bill which has _everything_ in it _except_ the customer
number (or an invalid one such as "123{"); or missing some other
vital piece of information.

In this particular case, the OP might want to do what we did in
a similar case. We had way too many columns in a table. The
performance was horrible. We did an analysis and, as usual, the
majority of the selects were for a subset of the columns, about
15% of the total. We "split" the table into the "high use"
columns table & the "low use" columns table. We then used
triggers to make sure that if we added a new / deleted an old
row from one table, the corresponding row in the other was
created / deleted.





--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





--
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Re: could not load library "$libdir/sslutils": in pg_upgrade process

2017-01-04 Thread Adrian Klaver

On 01/03/2017 09:01 PM, DrakoRod wrote:

Yes, I installed the Postgres Enterprise Manager Agent on this server some time
ago to test the agent, but I don't use it now.

Hmm, if you mean the EDB one-click PostgreSQL binary install, yes, but it is
not an EDB Advanced Server; it is a normal cluster installed from the EDB
binaries.


Best guess is you are dealing with an apples and oranges situation, 
where you changed the environment and the extension is looking for 
something that is not there. See Tom's answer. Just out of curiosity, 
where do you get the extension from?






-
Dame un poco de fe, eso me bastará.
Rozvo Ware Solutions
--
View this message in context: 
http://postgresql.nabble.com/could-not-load-library-libdir-sslutils-in-pg-upgrade-process-tp5937304p5937324.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread vod vos
OK, maybe the final solution is to split it in half.


 On 星期三, 04 一月 2017 06:53:31 -0800 Adrian Klaver 
 wrote 




On 01/04/2017 05:00 AM, vod vos wrote: 

> Now I am confused about I can create 1100 columns in a table in 

> postgresql, but I can't copy 1100 values into the table. And I really 

 

As pointed out previously: 

 

https://www.postgresql.org/about/ 

Maximum Columns per Table250 - 1600 depending on column types 

 

That being dependent on both the number of columns and the actual data 

in the columns. Empty columns are not the problem, it is when you start 

filling them that you get the error. 

 

> dont want to split the csv file to pieces to avoid mistakes after this 

> action. 

> 

> I create a table with 1100 columns with data type of varchar, and hope 

> the COPY command will auto transfer the csv data that contains some 

 

I am afraid the solution is going to require more then hope. You are 

going to need to break the data up. I suspect that just splitting it 

into half would do the trick. So: 

 

Table 1 

column 1 for a primary key(assuming first column of your present data) 

columns 2-550 

 

Table 2 

column 1 for a primary key(assuming first column of your present data) 

columns 551-1100 

 

Using the program I mentioned previously: 

 

http://csvkit.readthedocs.io/en/1.0.1/scripts/csvcut.html 

 

That translates into: 

 

csvcut -c 1,2-550 your_big.csv > table_1.csv 

 

csvcut -c 1,551-1100 your_big.csv > table_2.csv 

 

 

> character and date, most of which are numeric. 

 

Is this a different data set? 

Previously you said: 

"The most of the data type are text or varhcar, ..." 

> 

> I use the command: COPY rius FROM "/var/www/test/test.csv" WITH 

> DELIMITER ';' ; 

> 

> Then it shows: 

> 

> ERROR: row is too big: size 11808, maximum size 8160 

> 

> 

> 

> 

> 

> 

> 

>  On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown 

> * wrote  

> 

> On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent  >wrote: 

> 

> Perhaps this is your opportunity to correct someone else's 

> mistake. You need to show the table definition to convince us 

> that it cannot be improved. That it may be hard work really 

> doesn't mean it's not the right path. 

> 

> 

> ​This may not be possible. The data might be coming in from an 

> external source. I imagine you've run into the old "well, _we_ don't 

> have any problems, so it must be on your end!" scenario. 

> 

> Example: we receive CSV files from an external source. These files 

> are _supposed_ to be validated. But we have often received files 

> where NOT NULL fields have "nothing" in them them. E.g. a customer 

> bill which has _everything_ in it _except_ the customer number (or 

> an invalid one such as "123{"); or missing some other vital piece of 

> information. 

> 

> In this particular case, the OP might want to do what we did in a 

> similar case. We had way too many columns in a table. The 

> performance was horrible. We did an analysis and, as usual, the 

> majority of the selects were for a subset of the columns, about 15% 

> of the total. We "split" the table into the "high use" columns table 

> & the "low use" columns table. We then used triggers to make sure 

> that if we added a new / deleted an old row from one table, the 

> corresponding row in the other was created / deleted. 

> 

> 

> 

> 

> 

> -- 

> Sent via pgsql-general mailing list 

> (pgsql-general@postgresql.org ;) 

> To make changes to your subscription: 

> http://www.postgresql.org/mailpref/pgsql-general 

> 

> 

> 

> 

> -- 

> There’s no obfuscated Perl contest because it’s pointless. 

> 

> —Jeff Polk 

> 

> Maranatha! <>< 

> John McKown 

> 

> 

 

 

-- 

Adrian Klaver 

adrian.kla...@aklaver.com 








Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Pavel Stehule
2017-01-04 16:11 GMT+01:00 Adrian Klaver :

> On 01/04/2017 06:54 AM, Pavel Stehule wrote:
>
>> Hi
>>
>> 2017-01-04 14:00 GMT+01:00 vod vos > >:
>>
>> __
>> Now I am confused about I can create 1100 columns in a table in
>> postgresql, but I can't copy 1100 values into the table. And I
>> really dont want to split the csv file to pieces to avoid mistakes
>> after this action.
>>
>>
>> The PostgreSQL limit is "Maximum Columns per Table250 - 1600 depending
>> on column types" - this limit is related to placing values or pointers
>> to values to one page (8KB).
>>
>> You can hit this limit not in CREATE TABLE time, but in INSERT time.
>>
>>
>>
>> I create a table with 1100 columns with data type of varchar, and
>> hope the COPY command will auto transfer the csv data that contains
>> some character and date, most of which are numeric.
>>
>>
>> Numeric is expensive type - try to use float instead, maybe double.
>>
>
> If I am following the OP correctly the table itself has all the columns
> declared as varchar. The data in the CSV file is a mix of text, date and
> numeric, presumably cast to text on entry into the table.
>

Table column types are important - Postgres enforces the necessary
transformations.

Regards

Pavel


>
>
>> Regards
>>
>> Pavel
>>
>>
>> I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
>> DELIMITER ';' ;
>>
>> Then it shows:
>>
>> ERROR:  row is too big: size 11808, maximum size 8160
>>
>>
>>
>>
>>
>>
>>
>>  On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown
>> > >* wrote 
>>
>> On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent
>> mailto:robjsarg...@gmail.com>>wrote:
>>
>>
>> Perhaps this is your opportunity to correct someone else's
>> mistake. You need to show the table definition to convince
>> us that it cannot be improved. That it may be hard work
>> really doesn't mean it's not the right path.
>>
>>
>> ​This may not be possible. The data might be coming in from an
>> external source. I imagine you've run into the old "well, _we_
>> don't have any problems, so it must be on your end!" scenario.
>>
>> Example: we receive CSV files from an external source. These
>> files are _supposed_ to be validated. But we have often received
>> files where NOT NULL fields have "nothing" in them them. E.g. a
>> customer bill which has _everything_ in it _except_ the customer
>> number (or an invalid one such as "123{"); or missing some other
>> vital piece of information.
>>
>> In this particular case, the OP might want to do what we did in
>> a similar case. We had way too many columns in a table. The
>> performance was horrible. We did an analysis and, as usual, the
>> majority of the selects were for a subset of the columns, about
>> 15% of the total. We "split" the table into the "high use"
>> columns table & the "low use" columns table. We then used
>> triggers to make sure that if we added a new / deleted an old
>> row from one table, the corresponding row in the other was
>> created / deleted.
>>
>>
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list
>> (pgsql-general@postgresql.org
>> )
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>> 
>>
>>
>>
>>
>> --
>> There’s no obfuscated Perl contest because it’s pointless.
>>
>> —Jeff Polk
>>
>> Maranatha! <><
>> John McKown
>>
>>
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Fwd: [GENERAL] Cannot recover from backup on barman

2017-01-04 Thread Alfredo Palhares
Sorry, this message went off-list.
-- Forwarded message --
From: Alfredo Palhares 
Date: Wed, Jan 4, 2017 at 11:59 AM
Subject: Re: [GENERAL] Cannot recover from backup on barman
To: Michael Paquier 


Hello Michael,


> I don't understand why Postgres-XC is part of this discussion as
> pg_resetxlog is part of PostgreSQL.


Well I did an apt-file search pg_resetxlog and this is what I found:
manpages-zh: /usr/share/man/zh_CN/man1/pg_resetxlog.1.gz
manpages-zh: /usr/share/man/zh_TW/man1/pg_resetxlog.1.gz
postgres-xc: /usr/bin/pg_resetxlog
postgres-xc: /usr/share/locale/cs/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/de/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/es/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/fr/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/it/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/ja/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/ko/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/pl/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/pt_BR/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/ro/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/ru/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/tr/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/zh_CN/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/locale/zh_TW/LC_MESSAGES/pg_resetxlog-9.2.mo
postgres-xc: /usr/share/man/man1/pg_resetxlog.1.gz
postgres-xc-dbg: /usr/lib/debug/usr/bin/pg_resetxlog
postgresql-9.3: /usr/lib/postgresql/9.3/bin/pg_resetxlog
postgresql-9.3: /usr/share/postgresql/9.3/man/man1/pg_resetxlog.1.gz
postgresql-9.3-dbg: /usr/lib/debug/usr/lib/postgresql/9.3/bin/pg_resetxlog

I did use barman recover to recover the directory.

The backup_label file is there, and these are the contents:

START WAL LOCATION: 1BE/7BA0 (file 000101BE007B)
CHECKPOINT LOCATION: 1BE/7B6CB130
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2016-10-26 08:01:46 UTC
LABEL: Barman backup thingie 20161026T080002

Where should I look for said file to check if it's there?

On Tue, Jan 3, 2017 at 12:52 PM, Michael Paquier 
wrote:

> On Tue, Jan 3, 2017 at 12:20 AM, Alfredo Palhares 
> wrote:
> > Well after some research, I was suggested to use the pg_resetxlog tool
> from
> > postgres-xc, but people also advise against it.
>
>
> > Can anyone show me how to properly identify the problem with my backup?
> I am
> > also worried that my backups are not being properly made.
>
> You may want to see that with the maintainers of barman, they are
> likely looking at this list. But here is a guess: the backup_label
> file has been removed.
> --
> Michael
>


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread rob stone
Hello,
On Wed, 2017-01-04 at 07:11 -0800, Adrian Klaver wrote:
> On 01/04/2017 06:54 AM, Pavel Stehule wrote:
> > Hi
> > 
> > 2017-01-04 14:00 GMT+01:00 vod vos  > >:
> > 
> > __
> > Now I am confused about I can create 1100 columns in a table in
> > postgresql, but I can't copy 1100 values into the table. And I
> > really dont want to split the csv file to pieces to avoid
> > mistakes
> > after this action.
> > 
> > 
> > The PostgreSQL limit is "Maximum Columns per Table250 - 1600
> > depending
> > on column types" - this limit is related to placing values or
> > pointers
> > to values to one page (8KB).
> > 
> > You can hit this limit not in CREATE TABLE time, but in INSERT
> > time.
> > 
> > 
> > 
> > I create a table with 1100 columns with data type of varchar,
> > and
> > hope the COPY command will auto transfer the csv data that
> > contains
> > some character and date, most of which are numeric.
> > 
> > 
> > Numeric is expensive type - try to use float instead, maybe double.
> 
> If I am following the OP correctly the table itself has all the
> columns 
> declared as varchar. The data in the CSV file is a mix of text, date
> and 
> numeric, presumably cast to text on entry into the table.
> 
> > 
> > Regards
> > 
> > Pavel
> > 
> > 
> > I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
> > DELIMITER ';' ;
> > 
> > Then it shows:
> > 
> > ERROR:  row is too big: size 11808, maximum size 8160
> > 
> > 


Assuming this is a brand new database instance and not an existing
application, could the OP not compile from source and specify the
 --with-blocksize=16384 so as to overcome the 8k default page size
limit?

My 2 cents.
Rob


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Peter J. Holzer
On 2017-01-04 06:53:31 -0800, Adrian Klaver wrote:
> On 01/04/2017 05:00 AM, vod vos wrote:
> >Now I am confused about I can create 1100 columns in a table in
> >postgresql, but I can't copy 1100 values into the table. And I really
> 
> As pointed out previously:
> 
> https://www.postgresql.org/about/
> Maximum Columns per Table 250 - 1600 depending on column types
> 
> That being dependent on both the number of columns and the actual data in
> the columns.

I think this is confusingly phrased. In my mind "column type" is static -
the type is the same, independent of the values which are stored. So
"250 - 1600 depending on column types" implies to me that there is
some type A of which I can have only 250 columns and another type B of
which I can have 1600 columns. But it doesn't imply to me that the
number of columns depends on the values which are put into those columns.

May I suggest these improvements?

In https://www.postgresql.org/about/:
Instead of
| 250 - 1600 depending on column types
write
| 250 - 1600 depending on column types and data

In https://www.postgresql.org/docs/9.6/static/ddl-basics.html:
Replace the sentence:
| Depending on the column types, it is between 250 and 1600.
with:
| For all columns in a row, some information (either the data itself or
| a pointer to the data) must be stored in a single block (8 kB).
| Because for some types this data is itself of variable length, the
| maximum number of columns depends not only on the types of the columns
| but also on the data (e.g., a NULL uses less space than a non-NULL
| value). Therefore there is no simple way to compute the maximum number
| of columns, and it is possible to declare a table with more columns
| than can be filled. Keeping all this in mind, the limit is between 250
| and 1600.

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html


signature.asc
Description: Digital signature


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Adrian Klaver

On 01/04/2017 08:00 AM, rob stone wrote:

Hello,
On Wed, 2017-01-04 at 07:11 -0800, Adrian Klaver wrote:

On 01/04/2017 06:54 AM, Pavel Stehule wrote:

Hi

2017-01-04 14:00 GMT+01:00 vod vos mailto:vod...@zoho.com>>:

__
Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I
really dont want to split the csv file to pieces to avoid
mistakes
after this action.


The PostgreSQL limit is "Maximum Columns per Table250 - 1600
depending
on column types" - this limit is related to placing values or
pointers
to values to one page (8KB).

You can hit this limit not in CREATE TABLE time, but in INSERT
time.



I create a table with 1100 columns with data type of varchar,
and
hope the COPY command will auto transfer the csv data that
contains
some character and date, most of which are numeric.


Numeric is expensive type - try to use float instead, maybe double.


If I am following the OP correctly the table itself has all the
columns
declared as varchar. The data in the CSV file is a mix of text, date
and
numeric, presumably cast to text on entry into the table.



Regards

Pavel


I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
DELIMITER ';' ;

Then it shows:

ERROR:  row is too big: size 11808, maximum size 8160





Assuming this is a brand new database instance and not an existing
application, could the OP not compile from source and specify the
 --with-blocksize=16384 so as to overcome the 8k default page size
limit?


Well I was thinking along those lines also, then I did a search on 
BLCKSZ in the docs and saw all the configuration parameters that are 
keyed off it. I know I would have to do a lot more homework to 
understand the implications to the database instance as a whole and 
whether it was worth it to accommodate a single table.





My 2 cents.
Rob




--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Steve Crawford
...

> Numeric is expensive type - try to use float instead, maybe double.
>>
>
> If I am following the OP correctly the table itself has all the columns
> declared as varchar. The data in the CSV file is a mix of text, date and
> numeric, presumably cast to text on entry into the table.
>

But a CSV *is* purely text - no casting to text is needed. Conversion is
only needed when the strings in the CSV are text representations of
*non*-text data.
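
For example (a sketch, not the OP's schema): with typed target columns, COPY
parses the text form of each field directly into the column's type.

CREATE TABLE typed_demo (d date, n numeric, s text);
COPY typed_demo FROM '/path/to/data.csv' WITH (FORMAT csv);
-- a field like 2017-01-04 becomes a date, 1.5 a numeric; bad values make COPY fail
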

I'm guessing that the OP is using all text fields to deal with possibly
flawed input data and then validating and migrating the data in subsequent
steps. In that case, an ETL solution may be a better approach. Many
options exist: open-, closed-, and hybrid-source.

Cheers,
Steve


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Adrian Klaver

On 01/04/2017 08:32 AM, Steve Crawford wrote:

...

Numeric is expensive type - try to use float instead, maybe double.


If I am following the OP correctly the table itself has all the
columns declared as varchar. The data in the CSV file is a mix of
text, date and numeric, presumably cast to text on entry into the table.


But a CSV *is* purely text - no casting to text is needed. Conversion is
only needed when the strings in the CSV are text representations of
*non*-text data.


Yeah, muddled thinking.



I'm guessing that the OP is using all text fields to deal with possibly
flawed input data and then validating and migrating the data in
subsequent steps. In that case, an ETL solution may be a better
approach. Many options, both open- closed- and hybrid-source exist.

Cheers,
Steve



--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] replication slot to be used in the future

2017-01-04 Thread Tom DalPozzo
Hi,
I've got my primary and I take a pg_basebackup -x in order to create a
standby.
I can connect the standby only later, in a few hours, so I'd like the master
to keep the new WAL in the meantime, but I don't want to use archiving or the
wal_keep_segments option. I thought I could do it through a physical replication
slot (the standby will have its replication slot name configured).
So I created a physical replication slot, but I see that the master, which
has never seen the standby connect to it, doesn't keep the WAL.

Any idea?

Regards
Pupillo


[GENERAL] Index impact on update?

2017-01-04 Thread Israel Brewster
Short version:
Do indexes impact the speed of an UPDATE, even when the indexed columns aren't changing?

Details:
I have a table containing geographical data (Latitude, longitude, and elevation)
with 406,833,705 records. The Latitude and Longitude columns are indexed. In order
to better utilize the data, I've been looking into PostGIS, and decided I wanted to
add a "Location" column with PostGIS type "GEOGRAPHY(point)". I then tried to
populate it from the existing latitude/longitude data using the following query:

UPDATE data SET location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||' '||lat::text||')');

I expected this update to take quite a while, since it has 406 million rows to
update, but at this point it's been over 19 hours since I started the query, and
it still hasn't completed. I'm wondering if the presence of the indexes could be
slowing things down even though the indexed columns aren't being updated? Would I
be better off canceling the update query, dropping the indexes, and trying again?
Or is it more likely that the update query is "almost" done, and it would be
better to just let it run its course? Or is there an even better option, such as
perhaps exporting the data, adding the additional column in a text editor, and
re-importing the data with a COPY command?

Thanks for any feedback/advice you can offer!
---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---




Re: [GENERAL] Index impact on update?

2017-01-04 Thread Paul Ramsey
You'd be better off forcing the table to write in bulk with something like

CREATE TABLE mynewtable AS
SELECT *, geography(ST_SetSRID(ST_MakePoint(lng, lat), 4326)) AS geog
FROM myoldtable;

Then index the new table, rename, etc. Bulk update will, in addition to
being slow, use 2x the amount of space on disk, as all the old tuples are
left behind from the update until you cluster or vacuum full the table.
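
Spelled out, the follow-up steps would look roughly like this (a sketch; index
and table names are illustrative):

CREATE INDEX mynewtable_geog_idx ON mynewtable USING gist (geog);
ALTER TABLE myoldtable RENAME TO myoldtable_old;
ALTER TABLE mynewtable RENAME TO myoldtable;
-- DROP TABLE myoldtable_old;  -- once the new table has been verified
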

P



On Wed, Jan 4, 2017 at 8:59 AM, Israel Brewster 
wrote:

> Short version:
> Do indexes impact the speed of an UPDATE, even when the indexed columns
> aren't changing?
>
> Details:
> I have a table containing geographical data (Latitude, longitude, and
> elevation) with 406,833,705 records. The Latitude and Longitude columns are
> indexed. In order to better utilize the data, I've been looking into
> PostGIS, and decided I wanted to add a "Location" column with PostGIS type
> "GEOGRAPHY(point)". I then tried to populate it from the existing
> latitude/longitude data using the following query:
>
> UPDATE data SET location=ST_GeographyFromText(
> 'SRID=4326;POINT('||lng::text||' '||lat::text||')');
>
> I expected this update to take quite a while, since it has 406 million
> rows to update, but at this point it's been over 19 hours since I started
> the query, and it still hasn't completed.
>
> I'm wondering if the presence of the indexes could be slowing things down
> even though the indexed columns aren't being updated? Would I be better off
> canceling the update query, dropping the indexes, and trying again? Or is
> more likely that the update query is "almost" done, and it would be better
> to just let it run it's course? Or is there an even better option, such as
> perhaps exporting the data, adding the additional column in a text editor,
> and re-importing the data with a COPY command?
>
> Thanks for any feedback/advice you can offer!
> ---
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293
> ---
>
>
>
>
>
>


Re: [GENERAL] Index impact on update?

2017-01-04 Thread Rob Sargent



On 01/04/2017 09:59 AM, Israel Brewster wrote:

Short version:
Do indexes impact the speed of an UPDATE, even when the indexed 
columns aren't changing?


Details:
I have a table containing geographical data (Latitude, longitude, and 
elevation) with 406,833,705 records. The Latitude and Longitude 
columns are indexed. In order to better utilize the data, I've been 
looking into PostGIS, and decided I wanted to add a "Location" column 
with PostGIS type "GEOGRAPHY(point)". I then tried to populate it from 
the existing latitude/longitude data using the following query:


UPDATE data SET 
location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||' 
'||lat::text||')');


I expected this update to take quite a while, since it has 406 million 
rows to update, but at this point it's been over 19 hours since I 
started the query, and it still hasn't completed.


I'm wondering if the presence of the indexes could be slowing things 
down even though the indexed columns aren't being updated? Would I be 
better off canceling the update query, dropping the indexes, and 
trying again? Or is more likely that the update query is "almost" 
done, and it would be better to just let it run it's course? Or is 
there an even better option, such as perhaps exporting the data, 
adding the additional column in a text editor, and re-importing the 
data with a COPY command?


Thanks for any feedback/advice you can offer!
I would use a "create table redo as select *, 
ST_GeographyFromText('SRID=4326;POINT('||lng::text||' '||lat::text||')') as location 
from original;" then index that and drop original.  
Or just "create table location as select 
ST_GeographyFromText('SRID=4326;POINT('||lng::text||' '||lat::text||')') as location 
from original;" along with whatever id you have for the original 
tuple (if it's not just lat+lon) and join or view as necessary after 
indexing.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Index impact on update?

2017-01-04 Thread Israel Brewster
On Jan 4, 2017, at 8:08 AM, Paul Ramsey  wrote:
> 
> You'd be better off forcing the table to write in bulk with something like
> 
> CREATE TABLE mynewtable AS
> SELECT *, geography(ST_SetSRID(ST_MakePoint(lng, lat), 4326)) AS geog
> FROM myoldtable;
> 
> Then index the new table, rename, etc. Bulk update will, in addition to being 
> slow, use 2x the amount of space on disk, as all the old tuples are left 
> behind from the update until you cluster or vacuum full the table.
> 
> P

Thanks for the suggestion, info, and MUCH nicer looking syntax (perhaps more 
efficient as well?) for populating the column. I'll give it a shot, and see how 
it goes!

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---

> 
> 
> 
> On Wed, Jan 4, 2017 at 8:59 AM, Israel Brewster  > wrote:
> Short version:
> Do indexes impact the speed of an UPDATE, even when the indexed columns 
> aren't changing?
> 
> Details:
> I have a table containing geographical data (Latitude, longitude, and 
> elevation) with 406,833,705 records. The Latitude and Longitude columns are 
> indexed. In order to better utilize the data, I've been looking into PostGIS, 
> and decided I wanted to add a "Location" column with PostGIS type 
> "GEOGRAPHY(point)". I then tried to populate it from the existing 
> latitude/longitude data using the following query:
> 
> UPDATE data SET 
> location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||' 
> '||lat::text||')');
> 
> I expected this update to take quite a while, since it has 406 million rows 
> to update, but at this point it's been over 19 hours since I started the 
> query, and it still hasn't completed. 
> 
> I'm wondering if the presence of the indexes could be slowing things down 
> even though the indexed columns aren't being updated? Would I be better off 
> canceling the update query, dropping the indexes, and trying again? Or is 
> more likely that the update query is "almost" done, and it would be better to 
> just let it run it's course? Or is there an even better option, such as 
> perhaps exporting the data, adding the additional column in a text editor, 
> and re-importing the data with a COPY command?
> 
> Thanks for any feedback/advice you can offer!
> ---
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293 
> ---
> 
> 
> 
> 
> 
> 



Re: [GENERAL] replication slot to be used in the future

2017-01-04 Thread Adrian Klaver

On 01/04/2017 08:44 AM, Tom DalPozzo wrote:

Hi,


Postgres version?

Because in 9.6:

https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-REPLICATION

Table 9-82. Replication SQL Functions

pg_create_physical_replication_slot(slot_name name [, 
immediately_reserve boolean ])


Creates a new physical replication slot named slot_name. The optional 
second parameter, when true, specifies that the LSN for this replication 
slot be reserved immediately; otherwise the LSN is reserved on first 
connection from a streaming replication client. Streaming changes from a 
physical slot is only possible with the streaming-replication protocol — 
see Section 51.3. This function corresponds to the replication protocol 
command CREATE_REPLICATION_SLOT ... PHYSICAL.
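
For example, on 9.6 or later (slot name illustrative):

SELECT pg_create_physical_replication_slot('standby1_slot', true);
-- the second argument reserves WAL immediately, before the standby ever connects
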



I've got my primary and I make a pg_basebackup -x in order to create a
standby.
I can connect my standby only later, in some hours, so I'd like the
master to keep new WALs but I don't like to use archiving nor
keep-segments option. I thought to do it through a physical replication
slot (my standby will have its replication slot name).
So I create a physical replication slot but I see that the master, which
has never seen my standby connected to him, doesn't keep WALs.

Any idea?

Regards
Pupillo







--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Re: could not load library "$libdir/sslutils": in pg_upgrade process

2017-01-04 Thread DrakoRod
Theoretically I removed sslutils from the old cluster. When I review the $libdir
it shows this:

[postgres@server ~]$ /opt/PostgreSQL/9.3/bin/pg_config --pkglibdir
/opt/PostgreSQL/9.3/lib/postgresql

[postgres@server ~]$ /opt/PostgreSQL/9.3/bin/pg_config --libs
-lpgport -lpgcommon -lxslt -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz
-ledit -lcrypt -ldl -lm

Listing /opt/PostgreSQL/9.3/lib/ and /opt/PostgreSQL/9.3/lib/postgresql with ls,
the sslutils library does not appear. In fact, after uninstalling sslutils I
moved away the contrib directory that contains the sslutils directory (with the
make and uninstall scripts) to test whether that was the cause, but I get the
same error (tested with and without the contrib directory).

Thanks for your help!



-
Dame un poco de fe, eso me bastará.
Rozvo Ware Solutions 
--
View this message in context: 
http://postgresql.nabble.com/could-not-load-library-libdir-sslutils-in-pg-upgrade-process-tp5937304p5937468.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] replication slot to be used in the future

2017-01-04 Thread Tom DalPozzo
2017-01-04 18:24 GMT+01:00 Adrian Klaver :

> On 01/04/2017 08:44 AM, Tom DalPozzo wrote:
>
>> Hi,
>>
>
> Postgres version?
>
> Because in 9.6:
>
> https://www.postgresql.org/docs/9.6/static/functions-admin.h
> tml#FUNCTIONS-REPLICATION
>
> Table 9-82. Replication SQL Functions
>
> pg_create_physical_replication_slot(slot_name name [, immediately_reserve
> boolean ])
>
> Creates a new physical replication slot named slot_name. The optional
> second parameter, when true, specifies that the LSN for this replication
> slot be reserved immediately; otherwise the LSN is reserved on first
> connection from a streaming replication client. Streaming changes from a
> physical slot is only possible with the streaming-replication protocol —
> see Section 51.3. This function corresponds to the replication protocol
> command CREATE_REPLICATION_SLOT ... PHYSICAL.
>
>
>>
Hi, it is 9.5.5!
Thank you very much!

Pupillo


Re: [GENERAL] Index impact on update?

2017-01-04 Thread Israel Brewster


> On Jan 4, 2017, at 8:08 AM, Paul Ramsey  wrote:
> 
> You'd be better off forcing the table to write in bulk with something like
> 
> CREATE TABLE mynewtable AS
> SELECT *, geography(ST_SetSRID(ST_MakePoint(lng, lat), 4326)) AS geog
> FROM myoldtable;
> 
> Then index the new table, rename, etc. Bulk update will, in addition to being 
> slow, use 2x the amount of space on disk, as all the old tuples are left 
> behind from the update until you cluster or vacuum full the table.
> 
> P
> 

Well, that was definitely way faster. Created the table with the additional 
column in about 10 minutes rather than 19+ hours. Now to see how long it takes 
to generate the indexes :-)

Thanks again!
---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---

> 
> 
> On Wed, Jan 4, 2017 at 8:59 AM, Israel Brewster  > wrote:
> Short version:
> Do indexes impact the speed of an UPDATE, even when the indexed columns 
> aren't changing?
> 
> Details:
> I have a table containing geographical data (Latitude, longitude, and 
> elevation) with 406,833,705 records. The Latitude and Longitude columns are 
> indexed. In order to better utilize the data, I've been looking into PostGIS, 
> and decided I wanted to add a "Location" column with PostGIS type 
> "GEOGRAPHY(point)". I then tried to populate it from the existing 
> latitude/longitude data using the following query:
> 
> UPDATE data SET 
> location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||' 
> '||lat::text||')');
> 
> I expected this update to take quite a while, since it has 406 million rows 
> to update, but at this point it's been over 19 hours since I started the 
> query, and it still hasn't completed. 
> 
> I'm wondering if the presence of the indexes could be slowing things down 
> even though the indexed columns aren't being updated? Would I be better off 
> canceling the update query, dropping the indexes, and trying again? Or is 
> more likely that the update query is "almost" done, and it would be better to 
> just let it run it's course? Or is there an even better option, such as 
> perhaps exporting the data, adding the additional column in a text editor, 
> and re-importing the data with a COPY command?
> 
> Thanks for any feedback/advice you can offer!
> ---
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293 
> ---
> 
> 
> 
> 
> 
> 



Re: [GENERAL] Re: could not load library "$libdir/sslutils": in pg_upgrade process

2017-01-04 Thread Adrian Klaver

On 01/04/2017 09:27 AM, DrakoRod wrote:

Teorycally, I removed the sslutils from old cluster when review the $libdir
appear this:


What where the exact steps you took to remove sslutils?



/[postgres@server ~]$ /opt/PostgreSQL/9.3/bin/pg_config --pkglibdir
/opt/PostgreSQL/9.3/lib/postgresql

[postgres@server ~]$ /opt/PostgreSQL/9.3/bin/pg_config --libs
-lpgport -lpgcommon -lxslt -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz
-ledit -lcrypt -ldl -lm
/

With ls command in /opt/PostgreSQL/9.3/lib/ and
/opt/PostgreSQL/9.3/lib/postgresql not appear sslutils library. In fact,
after uninstall ssutils I moved the contrib  directory that contain sslutils
directory (with make and uninstall scripts) to test whether that was, but
the same error (tested with and without contrib directory).


You showed a couple of different errors previously, so what was the 
exact error message and what were you doing?




Thanks for your help!



-
Dame un poco de fe, eso me bastará.
Rozvo Ware Solutions
--
View this message in context: 
http://postgresql.nabble.com/could-not-load-library-libdir-sslutils-in-pg-upgrade-process-tp5937304p5937468.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] vacuum of empty table slows down as database table count grows

2017-01-04 Thread Jerry Sievers
marcin kowalski  writes:

> I am experiencing an odd issue, i've noticed it on 9.3 , but i can reproduce 
> it on 9.6.
>
> Basically, i have a database with a lot of schemas, but not that much data. 
> Each schema is maybe 2-4 GB in size, and often much less than that.
>
> The database has ~300-500 schemas, each with ~100-300 tables. Generally a few 
> hundred thousand tables total. Entire cluster has 2 or 3 such databases.
>
> As the amount of tables grows, the time it takes to vacuum an _empty_ table 
> grows as well. The table is in public schema, and it is the only table there.

I presume that is because vacuum then has much larger catalogs to query
when it looks up the indexes and related toast tables to process along with
your table of interest.

> I made a simple testing script to make sure that these things are related. I 
> set up a blank database, create a table with one column in public and restore 
> one schema.
> Then i vacuum that table three times, measure the execution times and repeat 
> the process, adding another schema to db.
>
> At ~200 tables it takes ~100ms for psql to issue a vacuum verbose and exit. 
> At 83K tables the time is already at ~1.5second. The progress appars to be 
> directly
> proportional to table amount, and grows linearly, eventually crossing past 
> 3seconds - for blank table with no data.
>
> I think this may severely impact the entire vacuumdb run, but i have not 
> verified that yet.
>
> This is irrelevant of amount of data restored, i am seeing the same behavior 
> with just schema restore, as well as with schema+data restores.
>
> If anyone is interested i may upload the schema data + my benchmarking script 
> with collected whisper data from my test run (i've been plotting it in 
> grafana via carbon)
>
> Is this a known issue? Can i do anything to improve performance here?
>

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 312.241.7800


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] vacuum of empty table slows down as database table count grows

2017-01-04 Thread Pavel Stehule
2017-01-04 20:22 GMT+01:00 Jerry Sievers :

> marcin kowalski  writes:
>
> > I am experiencing an odd issue, i've noticed it on 9.3 , but i can
> reproduce it on 9.6.
> >
> > Basically, i have a database with a lot of schemas, but not that much
> data. Each schema is maybe 2-4 GB in size, and often much less than that.
> >
> > The database has ~300-500 schemas, each with ~100-300 tables. Generally
> a few hundred thousand tables total. Entire cluster has 2 or 3 such
> databases.
> >
> > As the amount of tables grows, the time it takes to vacuum an _empty_
> table grows as well. The table is in public schema, and it is the only
> table there.
>
> I presume since vacuum then has much larger catalogs to query as if to
> find indexes and related toast tables to process along with your table
> of interest.
>
> > I made a simple testing script to make sure that these things are
> related. I set up a blank database, create a table with one column in
> public and restore one schema.
> > Then i vacuum that table three times, measure the execution times and
> repeat the process, adding another schema to db.
> >
> > At ~200 tables it takes ~100ms for psql to issue a vacuum verbose and
> exit. At 83K tables the time is already at ~1.5second. The progress appars
> to be directly
> > proportional to table amount, and grows linearly, eventually crossing
> past 3seconds - for blank table with no data.
> >
> > I think this may severely impact the entire vacuumdb run, but i have not
> verified that yet.
> >
> > This is irrelevant of amount of data restored, i am seeing the same
> behavior with just schema restore, as well as with schema+data restores.
> >
> > If anyone is interested i may upload the schema data + my benchmarking
> script with collected whisper data from my test run (i've been plotting it
> in grafana via carbon)
> >
> > Is this a known issue? Can i do anything to improve performance here?
>

We had 10K and more tables in one database, and we had a lot of issues.

I know Tomas fixed some of them, but we still need the stats files in tmpfs.

Please read this article:
https://blog.pgaddict.com/posts/the-two-kinds-of-stats-in-postgresql

Regards

Pavel

>
>
> --
> Jerry Sievers
> Postgres DBA/Development Consulting
> e: postgres.consult...@comcast.net
> p: 312.241.7800
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] vacuum of empty table slows down as database table count grows

2017-01-04 Thread Pavel Stehule
>
>
>> >
>> > This is irrelevant of amount of data restored, i am seeing the same
>> behavior with just schema restore, as well as with schema+data restores.
>> >
>> > If anyone is interested i may upload the schema data + my benchmarking
>> script with collected whisper data from my test run (i've been plotting it
>> in grafana via carbon)
>> >
>> > Is this a known issue? Can i do anything to improve performance here?
>>
>
> we had 10K and more tables in one database - and we had lot of issues.
>
> I know so Tomas fixed some issues, but we need the stat files in tmpfs
>
> please, read this article  https://blog.pgaddict.com/
> posts/the-two-kinds-of-stats-in-postgresql
>

http://hacksoclock.blogspot.cz/2014/04/putting-statstempdirectory-on-ramdisk.html
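
The gist of that post, as a sketch (mount point and size are just examples;
adjust for your distro):

# /etc/fstab: a small ramdisk for the statistics temp files
tmpfs  /var/lib/pgsql/stats_tmp  tmpfs  size=64M,mode=0700  0 0
# after mounting: chown postgres:postgres /var/lib/pgsql/stats_tmp

# postgresql.conf (takes effect on reload)
stats_temp_directory = '/var/lib/pgsql/stats_tmp'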


>
> Regards
>
> Pavel
>
> >
>>
>> --
>> Jerry Sievers
>> Postgres DBA/Development Consulting
>> e: postgres.consult...@comcast.net
>> p: 312.241.7800
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>


Re: [GENERAL] Performance degradation when using auto_explain

2017-01-04 Thread Kisung Kim
On Wed, Jan 4, 2017 at 1:21 AM, Andreas Kretschmer <
akretsch...@spamfence.net> wrote:

> Kisung Kim  wrote:
>
> > And finally I found that auto_explain is the cause of the problem.
>
> real hardware or virtual hardware? On virtual there are sometimes
> problems with exact timings, please read:
>
> https://www.postgresql.org/docs/current/static/pgtesttiming.html
>
>
Thank you for your reply.
I use real hardware.
I am curious whether the timing functions suffer from contention when multiple
threads call them.
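
In case it helps, the check behind that link is essentially just running this
on the database server:

pg_test_timing -d 3   # sample the clock-read overhead for 3 seconds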




>
> Regards, Andreas Kretschmer
> --
> Andreas Kretschmer
> http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 




Bitnine Global Inc., Kisung Kim, Ph.D
https://sites.google.com/site/kisungresearch/
E-mail : ks...@bitnine.net
Office phone : 070-4800-5890, 408-606-8602
US Mobile phone : 408-805-2192


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2017-01-04 Thread Merlin Moncure
On Wed, Dec 28, 2016 at 3:15 AM, Tim Uckun  wrote:
> I have seen various links on the internet which indicate that PLV8 is
> significantly faster than PL-PGSQL sometimes an order of magnitude faster.
>
> Is this uniformly true or is it just in certain circumstances?
>
> Is there any benefit to choosing PL-PGSQL?
>
> Is there work going on to make PL-PGSQL more performant or has it gotten
> significantly faster in the last two years or so (some of the links are a
> from a while ago).

Pavel covered it pretty well but I'll chime in also.  In typical usage
of pl/pgsql the overhead of the language itself is minimal.  Most of
the heavy lifting is done by the SQL engine. pl/pgsql is glue that
sticks SQL statements together basically.  Heavy computation in
pl/pgsql can be problematic but is rarely necessary.  pl/v8 is
definitely a better choice for such cases, but not being in core can
be an issue for some people.

The advantage of pl/pgsql is that SQL statements, types and errors are
native to the language itself.  This makes many things much easier to
do...pl/pgsql is maybe the most efficient data processing language (in
terms of efficiency of coding) in existence.   That may smack of
overstatement but my entire career is built around being good at it so
I'm inclined to effusiveness :-D.
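
As a small illustration of that "glue" style (table and column names here are
made up, not from this thread):

CREATE OR REPLACE FUNCTION archive_old_orders(p_cutoff date)
RETURNS integer
LANGUAGE plpgsql
AS $$
DECLARE
    n integer;
BEGIN
    -- plain SQL does the heavy lifting; pl/pgsql just sequences it
    INSERT INTO orders_archive
    SELECT * FROM orders WHERE order_date < p_cutoff;

    DELETE FROM orders WHERE order_date < p_cutoff;
    GET DIAGNOSTICS n = ROW_COUNT;

    RETURN n;   -- rows moved
END;
$$;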

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Re: could not load library "$libdir/sslutils": in pg_upgrade process

2017-01-04 Thread DrakoRod
Adrian, Tom

Finally I did the upgrade, but I had to remove the pem database (Postgres
Enterprise Manager). I guess that database has some function linked to
sslutils, because pg_upgrade showed the errors below while upgrading it.

pg_restore: creating FUNCTION "public.sslutils_version()"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 608; 1255 71731 FUNCTION
sslutils_version() *pem*
pg_restore: [archiver (db)] could not execute query: ERROR:  could not find
function "sslutils_version" in file
"/opt/PostgreSQL/9.6/lib/postgresql/sslutils.so"
Command was: CREATE FUNCTION "sslutils_version"() RETURNS "text"
LANGUAGE "c" IMMUTABLE
AS '$libdir/sslutils', 'sslutils_version'...

But once I removed the pem database, pg_upgrade finished correctly.
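
For future reference, a quick way to spot such library dependencies before
running pg_upgrade (run it in each database of the old cluster):

SELECT proname, probin
FROM pg_proc
WHERE probin LIKE '%sslutils%';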

Thanks for your help!!



-
Dame un poco de fe, eso me bastará.
Rozvo Ware Solutions 
--
View this message in context: 
http://postgresql.nabble.com/could-not-load-library-libdir-sslutils-in-pg-upgrade-process-tp5937304p5937545.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Index impact on update?

2017-01-04 Thread Tomas Vondra

On 01/04/2017 05:59 PM, Israel Brewster wrote:

Short version:
Do indexes impact the speed of an UPDATE, even when the indexed columns
aren't changing?



They shouldn't, as long as the new row version fits on the same page (8kB 
chunk of data). In that case we can do a HOT update for the row, without 
updating the index(es).


But as you're updating the whole table, that would require about 50% of 
all pages to be free, which is unlikely to be true. So perhaps some 
updates can proceed without touching indexes, but most can't.
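
If you want to see how much of that is happening, pg_stat_user_tables tracks
HOT updates, and a lower fillfactor leaves room for them (the table name is
taken from the question below; the fillfactor value is just an example):

SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'data';

-- leave ~30% of each page free for future in-page (HOT) updates; this only
-- affects newly written pages unless the table is rewritten (e.g. VACUUM FULL)
ALTER TABLE data SET (fillfactor = 70);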



Details:
I have a table containing geographical data (Latitude, longitude, and
elevation) with 406,833,705 records. The Latitude and Longitude columns
are indexed. In order to better utilize the data, I've been looking into
PostGIS, and decided I wanted to add a "Location" column with PostGIS
type "GEOGRAPHY(point)". I then tried to populate it from the existing
latitude/longitude data using the following query:

UPDATE data SET
location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||'
'||lat::text||')');

I expected this update to take quite a while, since it has 406 million
rows to update, but at this point it's been over 19 hours since I
started the query, and it still hasn't completed.

I'm wondering if the presence of the indexes could be slowing things
down even though the indexed columns aren't being updated? Would I be
better off canceling the update query, dropping the indexes, and trying
again? Or is more likely that the update query is "almost" done, and it
would be better to just let it run it's course? Or is there an even
better option, such as perhaps exporting the data, adding the additional
column in a text editor, and re-importing the data with a COPY command?



As explained above, it's likely that such a full-table update has to 
modify the indexes anyway, making it much more expensive. Without 
additional information it's impossible to confirm that's what's causing 
the long update in this case, however - there may be other bits slowing 
it down, e.g. foreign key checks or triggers.


CREATE TABLE AS SELECT would not pay any of those costs, of course. 
Also, if you're running with wal_level=minimal, it would not have to 
write the changes into WAL, while the regular UPDATE has to do that.
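
A rough sketch of that route, reusing the expression from the question (other
columns and the final rename/swap omitted):

CREATE TABLE data_new AS
SELECT lat, lng,
       ST_GeographyFromText('SRID=4326;POINT(' || lng::text || ' ' || lat::text || ')') AS location
FROM data;

-- then recreate the indexes on the new table before swapping it in
CREATE INDEX ON data_new (lat);
CREATE INDEX ON data_new (lng);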



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Performance degradation when using auto_explain

2017-01-04 Thread Tomas Vondra

On 01/04/2017 08:54 PM, Kisung Kim wrote:



On Wed, Jan 4, 2017 at 1:21 AM, Andreas Kretschmer
<akretsch...@spamfence.net> wrote:

Kisung Kim <ks...@bitnine.net> wrote:

> And finally I found that auto_explain is the cause of the problem.

real hardware or virtual hardware? On virtual there are sometimes
problems with exact timings, please read:

https://www.postgresql.org/docs/current/static/pgtesttiming.html



Thank you for your reply.
I use real hardware.
I am curious whether the timing functions suffer from contention when multiple threads call them.



It's not so much about contention between threads/processes, but mostly 
about the cost of actually reading data from the clock source. So even 
if you run on physical hardware, the clock source may be slow. Actually, 
there may be multiple clock sources available, differing in precision 
and overhead.


See:

/sys/devices/system/clocksource/*/available_clocksource

On my machine I see 3 different sources "tsc hpet acpi_pm", and after 
running the pg_test_timing tool linked by Andreas I get this for the 
'tsc' clock source:


Testing timing overhead for 3 seconds.
Per loop time including overhead: 29.87 nsec
Histogram of timing durations:
< usec   % of total      count
     1     97.09159   97499400
     2      2.90085    2913031
     4      0.00193       1936
     8      0.00089        891
    16      0.00140       1405
    32      0.00335       3366
    64      0.00000          2

suggesting that 97% of calls took less than 1 usec, which is quite good. 
For comparison, using 'hpet' gives me this:


Testing timing overhead for 3 seconds.
Per loop time including overhead: 766.92 ns
Histogram of timing durations:
  < us   % of total      count
     1     27.69558    1083389
     2     71.28437    2788485
     4      0.75748      29631
     8      0.02886       1129
    16      0.06578       2573
    32      0.16755       6554
    64      0.00033         13
   128      0.00003          1
   256      0.00003          1

Which is clearly much worse (it increased the per-loop cost from 30ns to 
767ns, which is ~25x more).


So check which clock source you have selected, and test how expensive 
that is. But even with a fast clock source, the additional timing 
overhead may make EXPLAIN ANALYZE considerably slower. There's not much 
we can do about it :-(
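
For reference, on Linux the current and available sources can be inspected and
switched roughly like this (clocksource0 is the usual device name; switching
needs root):

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource

# takes effect until the next reboot; add clocksource=tsc to the kernel
# command line to make it permanent
echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource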


And to make matters worse, it may affect different plans differently 
(e.g. nested loop joins do many more gettimeofday calls than other join 
types, amplifying the timing costs).


But the question is whether you actually need the timing - the total 
duration + row counts are far more important in my experience, so by setting


auto_explain.log_analyze = on
auto_explain.log_timing = off

you may significantly reduce the impact on query performance, while 
retaining the most valuable information.
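
Put together, a minimal postgresql.conf setup along those lines might look
like this (the threshold is just an example value):

shared_preload_libraries = 'auto_explain'   # needs a server restart
auto_explain.log_min_duration = '250ms'     # only log queries slower than this
auto_explain.log_analyze = on
auto_explain.log_timing = off               # row counts without per-node timing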


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [BUGS] Postgresql query HAVING do not work

2017-01-04 Thread Vitaly Burovoy
On 1/4/17, Gwork  wrote:
> On 1/5/17 2:22 AM, Vitaly Burovoy wrote:
>> On 1/4/17, Vitaly Burovoy  wrote:
>>> On 1/4/17, Gwork  wrote:
 Version: Postgresql 9.5
 OS: Debian 8 jessie run on docker

 Following this tutorial The Nested Set Model on
 http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/


 Section: Depth of a Sub-Tree.
 SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
 FROM nested_category AS node,
 nested_category AS parent,
 nested_category AS sub_parent,
 (
 SELECT node.name, (COUNT(parent.name) - 1) AS depth
 FROM nested_category AS node,
 nested_category AS parent
 WHERE node.lft BETWEEN parent.lft AND parent.rgt
 AND node.name = 'PORTABLE ELECTRONICS'
 GROUP BY node.name, node.lft
 ORDER BY node.lft
 )AS sub_tree
 WHERE node.lft BETWEEN parent.lft AND parent.rgt
 AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
 AND sub_parent.name = sub_tree.name
 GROUP BY node.name, node.lft, sub_tree.depth
 ORDER BY node.lft;
 +--+-+
 | name |   depth |
 |--+-|
 | PORTABLE ELECTRONICS |   0 |
 | MP3 PLAYERS  |   1 |
 | FLASH|   2 |
 | CD PLAYERS   |   1 |
 | 2 WAY RADIOS |   1 |
 +--+-+


 Section: Find the Immediate Subordinates of a Node.
 SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
 FROM nested_category AS node,
 nested_category AS parent,
 nested_category AS sub_parent,
 (
 SELECT node.name, (COUNT(parent.name) - 1) AS depth
 FROM nested_category AS node,
 nested_category AS parent
 WHERE node.lft BETWEEN parent.lft AND parent.rgt
 AND node.name = 'PORTABLE ELECTRONICS'
 GROUP BY node.name, node.lft
 ORDER BY node.lft
 )AS sub_tree
 WHERE node.lft BETWEEN parent.lft AND parent.rgt
 AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
 AND sub_parent.name = sub_tree.name
 GROUP BY node.name, node.lft, sub_tree.depth
 HAVING depth <= 1
 ORDER BY node.lft;
 Adding 'HAVING depth <= 1' to the query still return the same results
 as
 above instead of this:
 +--+-+
 | name |   depth |
 |--+-|
 | PORTABLE ELECTRONICS |   0 |
 | MP3 PLAYERS  |   1 |
 | FLASH|   1 |
 | CD PLAYERS   |   1 |
 | 2 WAY RADIOS |   1 |
 +--+-+

 I don't know if I'm doing anything wrong?

 Note: Edit the post query by adding node.lft, sub_tree.depth to the
 GROUP BY.
>>> Hello, Gwork,
>>>
>>> HAVING works fine, it is just confusing because of naming. HAVING
>>> works with column names from sources (which is "sub_tree.depth" in
>>> your example), not with names of final columns (because they get
>>> aliases later).
>>>
>>> You can check it adding depth to your SELECT part:
>>> SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
>>> ,array_agg(depth)
>>> FROM nested_category AS node,
>>> ...
>>>
>>> and you can see that values there are not bigger than 1.
>>>
>>> You must use the same expression in HAVING clause as in SELECT one to
>>> get what you want:
>>> HAVING (COUNT(parent.name) - (sub_tree.depth + 1)) <= 1
>>>
>>> but the result will not have "FLASH" because it has "2" even in your
>>> example.
>>> +--+---+
>>> | name | depth |
>>> +--+---+
>>> | PORTABLE ELECTRONICS | 0 |
>>> | MP3 PLAYERS  | 1 |
>>> | CD PLAYERS   | 1 |
>>> | 2 WAY RADIOS | 1 |
>>> +--+---+
>>> (4 rows)
>> I'm sorry, forgot to mention: If you want to deal with hierarchical
>> data, Postgres has better solution - recursive query[1]. When you
>> understand principles, it will be much easier for you to write queries
>> instead of mentioned in the article.
>>
>> For example, "Retrieving a Single Path" from "Adjacency model" can be
>> written as:
>> WITH RECURSIVE
>> sel(name, parent, depth) AS (
>>  SELECT name, parent, 0 FROM category WHERE name='FLASH'
>>  UNION ALL
>>  SELECT c.name, c.parent, depth + 1 FROM category c, sel WHERE
>> c.category_id=sel.parent
>> )
>> SELECT name FROM sel
>> ORDER BY depth DESC;
>>
>> which gives the same result and not depends on "parent.lft" which
>> don't have to increase.
>>
>>
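
For what it's worth, the name-resolution behaviour discussed above can be
reproduced with a tiny self-contained query (made-up data, nothing from the
original schema):

SELECT s.depth + 1 AS depth
FROM (VALUES (0), (1), (2)) AS s(depth)
GROUP BY s.depth
HAVING depth <= 1;   -- "depth" binds to the source column s.depth, not to the
                     -- output alias, so a row with output depth = 2 still comes back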

Re: [GENERAL] The best way to deal with hierarchical data (was: Postgresql query HAVING do not work)

2017-01-04 Thread Vitaly Burovoy
On 1/4/17, Gwork  wrote:
> On 1/5/17 2:51 AM, Vitaly Burovoy wrote:
>> On 1/4/17, Gwork  wrote:
>>> On 1/5/17 2:22 AM, Vitaly Burovoy wrote:
 On 1/4/17, Vitaly Burovoy  wrote:
> On 1/4/17, Gwork  wrote:
>> Version: Postgresql 9.5
>> OS: Debian 8 jessie run on docker
>>
>> Following this tutorial The Nested Set Model on
>> http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
>>
>>
>> Section: Depth of a Sub-Tree.
>> SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS
>> depth
>> FROM nested_category AS node,
>> nested_category AS parent,
>> nested_category AS sub_parent,
>> (
>> SELECT node.name, (COUNT(parent.name) - 1) AS depth
>> FROM nested_category AS node,
>> nested_category AS parent
>> WHERE node.lft BETWEEN parent.lft AND parent.rgt
>> AND node.name = 'PORTABLE ELECTRONICS'
>> GROUP BY node.name, node.lft
>> ORDER BY node.lft
>> )AS sub_tree
>> WHERE node.lft BETWEEN parent.lft AND parent.rgt
>> AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
>> AND sub_parent.name = sub_tree.name
>> GROUP BY node.name, node.lft, sub_tree.depth
>> ORDER BY node.lft;
>> +--+-+
>> | name |   depth |
>> |--+-|
>> | PORTABLE ELECTRONICS |   0 |
>> | MP3 PLAYERS  |   1 |
>> | FLASH|   2 |
>> | CD PLAYERS   |   1 |
>> | 2 WAY RADIOS |   1 |
>> +--+-+
>>
>>
>> Section: Find the Immediate Subordinates of a Node.
>> SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS
>> depth
>> FROM nested_category AS node,
>> nested_category AS parent,
>> nested_category AS sub_parent,
>> (
>> SELECT node.name, (COUNT(parent.name) - 1) AS depth
>> FROM nested_category AS node,
>> nested_category AS parent
>> WHERE node.lft BETWEEN parent.lft AND parent.rgt
>> AND node.name = 'PORTABLE ELECTRONICS'
>> GROUP BY node.name, node.lft
>> ORDER BY node.lft
>> )AS sub_tree
>> WHERE node.lft BETWEEN parent.lft AND parent.rgt
>> AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
>> AND sub_parent.name = sub_tree.name
>> GROUP BY node.name, node.lft, sub_tree.depth
>> HAVING depth <= 1
>> ORDER BY node.lft;
>> Adding 'HAVING depth <= 1' to the query still return the same results
>> as
>> above instead of this:
>> +--+-+
>> | name |   depth |
>> |--+-|
>> | PORTABLE ELECTRONICS |   0 |
>> | MP3 PLAYERS  |   1 |
>> | FLASH|   1 |
>> | CD PLAYERS   |   1 |
>> | 2 WAY RADIOS |   1 |
>> +--+-+
>>
>> I don't know if I'm doing anything wrong?
>>
>> Note: Edit the post query by adding node.lft, sub_tree.depth to the
>> GROUP BY.
> Hello, Gwork,
>
> HAVING works fine, it is just confusing because of naming. HAVING
> works with column names from sources (which is "sub_tree.depth" in
> your example), not with names of final columns (because they get
> aliases later).
>
> You can check it adding depth to your SELECT part:
> SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
> ,array_agg(depth)
> FROM nested_category AS node,
> ...
>
> and you can see that values there are not bigger than 1.
>
> You must use the same expression in HAVING clause as in SELECT one to
> get what you want:
> HAVING (COUNT(parent.name) - (sub_tree.depth + 1)) <= 1
>
> but the result will not have "FLASH" because it has "2" even in your
> example.
> +--+---+
> | name | depth |
> +--+---+
> | PORTABLE ELECTRONICS | 0 |
> | MP3 PLAYERS  | 1 |
> | CD PLAYERS   | 1 |
> | 2 WAY RADIOS | 1 |
> +--+---+
> (4 rows)
 I'm sorry, forgot to mention: If you want to deal with hierarchical
 data, Postgres has better solution - recursive query[1]. When you
 understand principles, it will be much easier for you to write queries
 instead of mentioned in the article.

 For example, "Retrieving a Single Path" from "Adjacency model" can be
 written as:
 WITH RECURSIVE
 sel(name, parent, depth) AS (
SELECT name,

Re: [GENERAL] vacuum of empty table slows down as database table count grows

2017-01-04 Thread marcin kowalski
Thanks, I'll redo the benchmarks and report back how things look now.

2017-01-04 20:33 GMT+01:00 Pavel Stehule :

>
>>> >
>>> > This is irrelevant of amount of data restored, i am seeing the same
>>> behavior with just schema restore, as well as with schema+data restores.
>>> >
>>> > If anyone is interested i may upload the schema data + my benchmarking
>>> script with collected whisper data from my test run (i've been plotting it
>>> in grafana via carbon)
>>> >
>>> > Is this a known issue? Can i do anything to improve performance here?
>>>
>>
>> we had 10K and more tables in one database - and we had lot of issues.
>>
>> I know so Tomas fixed some issues, but we need the stat files in tmpfs
>>
>> please, read this article  https://blog.pgaddict.com/pos
>> ts/the-two-kinds-of-stats-in-postgresql
>>
>
> http://hacksoclock.blogspot.cz/2014/04/putting-
> statstempdirectory-on-ramdisk.html
>
>
>>
>> Regards
>>
>> Pavel
>>
>> >
>>>
>>> --
>>> Jerry Sievers
>>> Postgres DBA/Development Consulting
>>> e: postgres.consult...@comcast.net
>>> p: 312.241.7800
>>>
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>>
>