Re: [GENERAL] Waiting on ExclusiveLock on extension

2015-04-17 Thread Andomar

Are you able to take a 'perf top' during a high CPU spike and see
what's burning CPU there? The issue is related to blocking, but high
CPU spikes may hint at some spin-to-acquire behavior.


Will do, although hopefully the spikes were only growing pains after the 
upgrade.



If your previous relation size is bigger than after the upgrade, that's a
signal that you do have holes in the relation, so extension can sometimes
be avoided for new tuples.


The relationship between high CPU and page splits is not immediately obvious 
to me.


We run with synchronous_commit off, but there does seem to be a peak in 
I/O requests around the CPU spikes.


Is a page split by nature a synchronous I/O activity? And do the other 
connections wait in some CPU-intensive way (like a spinlock)?


Kind regards,
Andomar







Re: [GENERAL] Waiting on ExclusiveLock on extension

2015-04-17 Thread Andres Freund
On 2015-04-16 14:23:25 -0700, Qingqing Zhou wrote:
> On Thu, Apr 16, 2015 at 1:24 PM, Andomar  wrote:
> > After upgrading our database from 9.3.5 to 9.4.1 last night, the server
> > suffers from high CPU spikes. During these spikes, there are a lot of these
> > messages in the logs:
> >
> > process X still waiting for ExclusiveLock on extension of relation Y of
> > database Z after 1036.234 ms
> >
> This issue has been complained several times, and here is the most recent one:
> http://www.postgresql.org/message-id/0ddfb621-7282-4a2b-8879-a47f7cecb...@simply.name
> 
> PG 9.4.1 shall have much alleviated it by relaxing buffer pool related
> locks.

Hm. I'm not aware of related changes in 9.4? 9.5 should be a bit better,
but I don't think 9.4 will make much of a difference.

> > b) How can you find the name of the relation being extended? based on the
> > relation number.
> select ::regclass;

That's not correct. The relfilenode is only equivalent to the relation's
oid when you initially create it. But once rewritten it'll change.

Rather use

SELECT oid::regclass FROM pg_class WHERE pg_relation_filenode(oid) = 2937136;


> > Any suggestions on how to approach this issue are welcome.
> >
> There are some diagnosis in above link, see if it rings any bell. From
> PG kernel side, I think fundamentally we may want to extend many pages
> each time instead of one.

I don't really agree that that's the most important bit. See
http://archives.postgresql.org/message-id/20150329185619.GA29062%40alap3.anarazel.de

Greetings,

Andres Freund




Re: [GENERAL] fillfactor and cluster table vs ZFS copy-on-write

2015-04-17 Thread Albe Laurenz
Geoff Speicher wrote:
> On Thu, Apr 16, 2015 at 4:56 PM, Qingqing Zhou  
> wrote:
>> On Thu, Apr 16, 2015 at 5:09 AM, Geoff Speicher  
>> wrote:
>>> ZFS implements copy-on-write, so when PostgreSQL modifies a block on disk,
>>> the filesystem writes a new block rather than updating the existing block.

>>> Therefore one might posit that PostgreSQL should be configured to use 100%
>>> fillfactor and avoid clustering on ZFS. Can anyone comment on this?

>> Even with COW, I can see fillfactor < 100% still have its virtues. For
>> example, HOT update can avoid adding an extra index item on the index
>> page if it finds the new item can be inserted in the same heap page.

> That's true, the new physical location on disk is transparent to the DBMS so 
> it has no more or less
> housekeeping with or without COW, but the housekeeping still has to be done 
> somewhere, so it helps to
> understand which is more efficient. I'll see if I can produce some empirical 
> data unless anyone thinks
> it's a waste of time.

I am quite certain that fillfactor < 100% will be a win even then (for the 
right load).
Updating one (heap) block should always be cheaper than updating one heap block
as well as (at least) one index block per index involved.

I doubt that you would be able to measure any performance improvement -
after all, you have to update at least one heap block, no matter what the
fillfactor setting is.

Yours,
Laurenz Albe



Re: [GENERAL] fillfactor and cluster table vs ZFS copy-on-write

2015-04-17 Thread Geoff Speicher
On Fri, Apr 17, 2015 at 5:24 AM, Albe Laurenz 
wrote:

> >> Even with COW, I can see fillfactor < 100% still have its virtues. For
> >> example, HOT update can avoid adding an extra index item on the index
> >> page if it finds the new item can be inserted in the same heap page.
>
> > That's true, the new physical location on disk is transparent to the
> DBMS so it has no more or less
> > housekeeping with or without COW, but the housekeeping still has to be
> done somewhere, so it helps to
> > understand which is more efficient. I'll see if I can produce some
> empirical data unless anyone thinks
> > it's a waste of time.
>
> I am quite certain that fillfactor < 100% will be a win even then (for the
> right load).
> Upating one (heap) block should always be cheaper than updating one heap
> block
> as well as (at least) one index block per index involved.


Your last three words. I was ignoring the obvious (and likely) scenario of
when more than one index needs to be updated.

fillfactor<100% with COW still gets the win.

Thanks!

Geoff


Re: [GENERAL] database migration question between different ubuntus and different postgresql server versions

2015-04-17 Thread Adrian Klaver

On 04/16/2015 05:52 PM, Octavi Fors wrote:

Hi Adrian,

I didn't receive any answer from Andrews.

Yes, sorry I didn't describe completely my migration plan.
Right now the database 'db' is in NAS1 mounted via nfs with computer 1
(running ubuntu 12.04 postgresql 9.2).
I want to migrate 'db' to a faster NAS2 mounted via nfs with computer 2
(running ubuntu 14.04 postgresql 9.3).

Any commands/help to migrate 'db' from comp1+NAS1 to comp2+NAS2 would be
more than welcome.


So to be clear you only want to move the one database 'db', not the 
whole Postgres database cluster?


If so, and if it were me, I would do the following, with the proper user, host, etc. parameters:

 This is an outline only, consider it untested 

1) pg_dumpall --globals-only -f pg_globals.sql
This dumps the cluster roles and tablespaces
Open file and modify tablespace location.

Load the globals into new cluster
psql -d postgres -f pg_globals.sql

2) pg_dump -Fc db -f db.out
This gets you a compressed form of the dump.

3)
If you want to test first without the data do:

pg_restore -d postgres -C -s  db.out

This connects to the system database postgres and then CREATEs the 'db' 
database with only the schema (tables, functions, etc.) definitions. You can 
verify that the schema layout is how you want it. You can add --clean to 
the above to DROP objects (including the database) each time you run the 
command.


If it is then you could do:

pg_restore -d db -a  db.out

This restores the data only.

FYI, you can also add the -s and -a switches to the pg_dump command to 
make separate schema and data files from the start, if you want.
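
Put together, and run from the new machine, the whole sequence might look 
roughly like this (untested sketch; host names and the postgres superuser 
are assumptions, adjust to your setup):

# on comp2, pulling from the old server comp1
pg_dumpall -h comp1 -U postgres --globals-only -f pg_globals.sql
# edit pg_globals.sql so the tablespace paths point at the NAS2 mount
psql -h localhost -U postgres -d postgres -f pg_globals.sql

pg_dump -h comp1 -U postgres -Fc db -f db.out
pg_restore -h localhost -U postgres -d postgres -C -s db.out   # schema only, creates 'db'
pg_restore -h localhost -U postgres -d db -a db.out            # then load the data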





Cheers,

Octavi.




--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] On using doubles as primary keys

2015-04-17 Thread Kynn Jones
I have some data in the form of a matrix of doubles (~2 million
rows, ~400 columns) that I would like to store in a Pg table,
along with the associated table of metadata (same number of rows,
~30 columns, almost all text).  This is large enough to make
working with it from flat files unwieldy.

(The only reason for wanting to transfer this data to a Pg table
is the hope that it will be easier to work with it by using SQL
queries to extract subsets of it, than by, e.g., writing programs
that must scan the entire matrix every time they're run.)

My plan is to add one extra column to the Pg tables to serve as
the primary key joining the data and the metadata tables
together.

One consideration that is complicating the choice of primary key
is wanting to have the ability to store chunks of the data
table (not the metadata table), including the PK column, as
matrices of doubles.  In its simplest form, this would mean using
doubles as primary keys, but this seems to me a bit weird.

I'm willing to go ahead with this, but I wanted to ask for your
feedback on the whole thing.  In particular I'd like to know if
there are there standard ways for using doubles as primary keys,
and, conversely, if there are known pitfalls I should be looking
out for, but I welcome any other words of wisdom you may have on
this topic.

Thanks in advance!

kj




Re: [GENERAL] On using doubles as primary keys

2015-04-17 Thread Melvin Davidson
First, please ALWAYS include the version and O/S, even with basic questions.

I'm not sure what you mean by doubles. Do you mean bigint data type, or do
you mean use two columns for a primary key? Either way it's pretty simple.
If you mean a bigint, then probably best to use serial data type, which
will default to the next value.
If you mean use two columns for a Primary Key, then you just specify the
columns.
E.g.: CONSTRAINT table_name_pk PRIMARY KEY (col1, col2)
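
For instance (a minimal sketch; table and column names are made up):

-- single auto-incrementing surrogate key
CREATE TABLE measurements (
    id      bigserial PRIMARY KEY,
    payload double precision[]
);

-- two-column (composite) primary key
CREATE TABLE readings (
    station_id integer,
    taken_at   timestamptz,
    value      double precision,
    CONSTRAINT readings_pk PRIMARY KEY (station_id, taken_at)
);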


On Fri, Apr 17, 2015 at 11:34 AM, Kynn Jones  wrote:

> I have some data in the form of a matrix of doubles (~2 million
> rows, ~400 columns) that I would like to store in a Pg table,
> along with the associated table of metadata (same number of rows,
> ~30 columns, almost all text).  This is large enough to make
> working with it from flat files unwieldy.
>
> (The only reason for wanting to transfer this data to a Pg table
> is the hope that it will be easier to work with it by using SQL
> queries to extract subsets of it, than by, e.g., writing programs
> that must scan the entire matrix every time they're run.)
>
> My plan is to add one extra column to the Pg tables to serve as
> the primary key joining the data and the metadata tables
> together.
>
> One consideration that is complication the choice of primary key
> is wanting to have the ability to store chunks of the data
> table (not the metadata table), including the PK column, as
> matrices of doubles.  In its simplest form, this would mean using
> doubles as primary keys, but this seems to me a bit weird.
>
> I'm willing to go ahead with this, but I wanted to ask for your
> feedback on the whole thing.  In particular I'd like to know if
> there are there standard ways for using doubles as primary keys,
> and, conversely, if there are known pitfalls I should be looking
> out for, but I welcome any other words of wisdom you may have on
> this topic.
>
> Thanks in advance!
>
> kj
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Waiting on ExclusiveLock on extension

2015-04-17 Thread Jeff Janes
On Fri, Apr 17, 2015 at 1:14 AM, Andres Freund  wrote:

> On 2015-04-16 14:23:25 -0700, Qingqing Zhou wrote:
> > On Thu, Apr 16, 2015 at 1:24 PM, Andomar  wrote:
>
> > > b) How can you find the name of the relation being extended? based on
> the
> > > relation number.
> > select ::regclass;
>
> That's not correct. The relfilenode is only equivalent to the relation's
> oid when you initially create it. But once rewritten it'll change.
>
> Rather use
>
> SELECT oid::regclass FROM pg_class WHERE pg_relation_filenode(oid) =
> 2937136;
>


But the thing being reported in the lock wait log message is the relation
OID itself, not the relfilenode.

You would use your query if you get the number by watching the file names
in the file system as they grow,
but not if you get it from the log message.

Cheers,

Jeff



Re: [GENERAL] On using doubles as primary keys

2015-04-17 Thread David G. Johnston
On Fri, Apr 17, 2015 at 8:45 AM, Melvin Davidson 
wrote:

> On Fri, Apr 17, 2015 at 11:34 AM, Kynn Jones  wrote:
>
>>
>> One consideration that is complication the choice of primary key
>> is wanting to have the ability to store chunks of the data
>> table (not the metadata table), including the PK column, as
>> matrices of doubles.  In its simplest form, this would mean using
>> doubles as primary keys, but this seems to me a bit weird.
>>
>> I'm willing to go ahead with this, but I wanted to ask for your
>> feedback on the whole thing.  In particular I'd like to know if
>> there are there standard ways for using doubles as primary keys,
>> and, conversely, if there are known pitfalls I should be looking
>> out for, but I welcome any other words of wisdom you may have on
>> this topic.
>>
>
MD>> First, please ALWAYS include the version and O/S, even with basic questions.

MD>> I'm not sure what you mean by doubles. Do you mean bigint data type, or do
you mean use two columns for a primary key? Either way it's pretty simple.

MD>> If you mean a bigint, then probably best to use serial data type, which
will default to the next value.

MD>> If you mean use two columns for a Primary Key, then you just specify the
columns.

MD>> E.g.: CONSTRAINT table_name_pk PRIMARY KEY (col1, col2)

I take it the OP means "double precision".

http://www.postgresql.org/docs/9.0/static/datatype-numeric.html

I'd be suspect of choosing a PK whose type definition includes the word
"inexact".

You also say you want to store the data as a double precision but you never
actually explain what the natural key of the data is.

If you are strictly storing serial (big) integers but doing so within a
double precision typed column you may be OK - but I would advise taking
input from someone with more expertise on the properties of IEEE floating
point numbers.
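
A quick illustration of the "inexact" part (behavior I'd expect on any IEEE 754 platform):

-- false: 0.1 + 0.2 is not exactly 0.3 in binary floating point
SELECT 0.1::double precision + 0.2::double precision = 0.3::double precision;
-- true: numeric is exact decimal arithmetic
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric;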

David J.


Re: [GENERAL] Waiting on ExclusiveLock on extension

2015-04-17 Thread Qingqing Zhou
On Fri, Apr 17, 2015 at 1:14 AM, Andres Freund  wrote:
> Hm. I'm not aware of related changes in 9.4? 9.5 should be a bit better,
> but I don't think 9.4 will make much of a difference.
>
You are right. I mis-read the check-in log.

> I don't really agree that that's the most important bit. See
> http://archives.postgresql.org/message-id/20150329185619.GA29062%40alap3.anarazel.de
>
Thanks for the link. I actually think changes in a more fundamental
way - will follow on that thread.

Regards,
Qingqing




Re: [GENERAL] On using doubles as primary keys

2015-04-17 Thread Ray Cote
On Fri, Apr 17, 2015 at 11:56 AM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> MD>> I'm not sure what you mean by doubles. Do you mean bigint data type, or do
> you mean use two columns for a primary key? Either way it's pretty simple.
>
> MD>> If you mean a bigint, then probably best to use serial data type, which
> will default to the next value.
>
> MD>> If you mean use two columns for a Primary Key, then you just specify the
> columns.
>
> MD>> E.g.: CONSTRAINT table_name_pk PRIMARY KEY (col1, col2)
>
> I take it the OP means "double precision"
>
> http://www.postgresql.org/docs/9.0/static/datatype-numeric.html
>
> I'd be suspect of choosing a PK whose type definition includes the word
> "inexact".
>
> You also say you want to store the data as a double precision but you
> never actually explain what the natural key of the data is.
>
> If you are strictly storing serial (big) integers but doing so within a
> double precision typed column you may be OK - but I would advise taking
> input from someone with more expertise on the properties of IEEE floating
> point numbers.
>
> David J.
>


(Not an IEEE floating point expert, but...) I've learned the hard way to
never rely on comparing two floating point numbers for equality -- and
that's what you are doing if you join on them as primary keys. If you must
use the underlying numeric data for joining, I'd recommend you do something
like:
  * output the value to a string (rounded to a specific number of digits)
  * use the textual representation of the number as your primary key.

Alternatively, if the numbers are very large or very small, you may want to
try converting the binary data into a BINHEX value and use that textual
representation as the primary key.

Either way, I think you'd be saving yourself a lot of headaches using a
representation that is not 'inexact.'
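
Something along these lines (just a sketch; the six-digit precision is arbitrary):

-- derive a text key from the double, rounded to a fixed number of digits
SELECT to_char(3.14159265358979::double precision, 'FM999999990.000000');  -- '3.141593'
-- or keep it as an exact decimal value instead of text
SELECT round(3.14159265358979::numeric, 6);                                -- 3.141593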
--Ray

-- 
Raymond Cote, President
voice: +1.603.924.6079 email: rgac...@appropriatesolutions.com skype:
ray.cote


Re: [GENERAL] On using doubles as primary keys

2015-04-17 Thread Jim Nasby

On 4/17/15 1:10 PM, Ray Cote wrote:



(Not an IEEE floating point expert, but...) I've learned the hard way to
never rely on comparing two floating point numbers for equality -- and
that's what you are doing if you join on them as primary keys. If you
must use the underlying numeric data for joining, I'd recommend you do
something like:
   * output the value to a string (rounded to a specific number of digits)
   * use the textual representation of the number as your primary key.


numeric would probably be more efficient, if you can use that instead. 
It does suffer from some of the same issues as floating point (there's 
been recent discussion of that on -hackers), but I believe it'd be 
better than double.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] Waiting on ExclusiveLock on extension

2015-04-17 Thread Jim Nasby

On 4/16/15 4:39 PM, Andomar wrote:

Thanks for your reply.

This issue has been complained several times, and here is the most
recent one:
http://www.postgresql.org/message-id/0ddfb621-7282-4a2b-8879-a47f7cecb...@simply.name


That post is about a server with huge shared_buffers, but ours is just
8GB. Total memory is 48GB on a dedicated server. Checkpoints write
around 2% of the buffers.


Yeah, I suspect the OP in that thread was seeing something different 
than you are, but that's just a guess.



PG 9.4.1 shall have much alleviated it by relaxing buffer pool related
locks. PG 9.4.1 shall be actually better in relation extension
handling - a possible explanation is that your 9.3.5 database has been
used for a while thus there are holes in pages, so not many extensions
are required.

The 9.3.5 version went live as an upgrade from 9.1.x in the same way. So
it started from an SQL dump. The load has not changed much since the
9.3.5 upgrade.


Yes, but did you have the same workload when you upgraded to 9.3 as you 
do today?



With holes in pages, I suppose you mean the fill factor?


The fill factor plays a role in whether a page has free space, yes, 
*especially* during an initial load (like restoring from pg_dump). Keep 
in mind that as the system runs you're going to be creating free space 
in pages as data is updated or deleted. So the amount of free space per 
page on the 9.3 database you moved from wouldn't be anything like what 
it was when you moved to 9.3.



Is there a way
to see the current fillfactor of a table and its indices?


Possibly the bloat query 
(https://wiki.postgresql.org/wiki/Show_database_bloat) would. The pageinspect 
contrib module certainly would tell you. Possibly more useful would be 
to see what the pg_freespacemap contrib module shows in the free space 
map, since that's what the system will actually use to see where it can 
find a page to insert data on.
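
For example (untested; needs the contrib module installed, and the table name is a placeholder):

CREATE EXTENSION IF NOT EXISTS pg_freespacemap;
-- free space the FSM knows about, page by page, for one table
SELECT * FROM pg_freespace('my_table'::regclass) LIMIT 10;
-- or just count the pages that still have a useful amount of room
SELECT count(*) FROM pg_freespace('my_table'::regclass) WHERE avail > 4000;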


BTW, something else to be aware of: because you essentially re-loaded 
all your data in a single transaction, that means that a: you'll be 
doing a lot of hint bit updates until all data has been read a second 
time, and b: a lot of this data will come up for freezing at around the 
same time, creating a big chunk of work for autovacuum. That's caused 
problems for me in the past, though that was on a database that had a 
pretty high workload.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] How to keep pg_largeobject from growing endlessly

2015-04-17 Thread Jim Nasby

On 4/15/15 9:22 AM, Andreas Joseph Krogh wrote:

On Wednesday, 15 April 2015 at 16:05:22, Adam Hooper <a...@adamhooper.com> wrote:

On Wed, Apr 15, 2015 at 9:57 AM, Andreas Joseph Krogh
 wrote:
 >
 > On Wednesday, 15 April 2015 at 15:50:36, Adam Hooper wrote:
 >
 > On Wed, Apr 15, 2015 at 4:49 AM, Andreas Joseph Krogh
 >  wrote:
 > >
 > > In other words: Does vacuumlo cause diskspace used by
pg_largeobject to be freed to the OS (after eventually vacuumed by
autovacuum)?
 >
 > No.
 >
 > Ok. Out of curiousity; When does it get freed, when VACUUM FULL'ed?

Yes. VACUUM FULL or CLUSTER will free the space. (Of course, you need
a lot of free disk space to perform those operations.)

I'm sure there's a good reason for why VACUUM FULL needs to rewrite the
whole table and cannot "just free the unused space to the OS".


I think mostly because no one's written something to incrementally 
delete the old data as it's moved. That would be a non-trivial amount of 
work though, because none of the internal APIs are really setup the way 
you'd need them to be to allow for this.


Also, I think there's some mis-information about vacuum returning space 
to the filesystem. It definitely WILL return space to the filesystem, 
but only under a very strict set of conditions:


- There must be a sufficient amount of free space *at the end of the 
relation*

- It must be able to quickly acquire the correct lock
- It will start truncating pages off the relation until it detects 
someone else is blocked on the lock it's holding. At that point it stops 
what it's doing


So when the right set of circumstances occur, a plain vacuum will return 
free space; but on a heavily hit table it's very hard for that to happen 
in practice.


What you might want to do here is essentially re-create the large object 
interface but allow it to hit any table instead of being forced to use 
the system one. That would open up the possibility of using tools like 
pg_repack and table partitioning. You could do this in pure SQL, but the 
community might welcome a patch that adds the ability to use different 
tables to the existing large object API.
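
A pure-SQL starting point could be as simple as this (a sketch of the idea only, 
not a drop-in replacement for the lo_* API):

-- user-level "large object" storage that lives in an ordinary table
CREATE TABLE my_lob_chunks (
    lob_id   bigint  NOT NULL,
    chunk_no integer NOT NULL,
    data     bytea   NOT NULL,
    PRIMARY KEY (lob_id, chunk_no)
);
-- unlike pg_largeobject, this table can be partitioned, pg_repack'ed,
-- or moved: ALTER TABLE my_lob_chunks SET TABLESPACE some_tablespace;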

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] On using doubles as primary keys

2015-04-17 Thread John McKown
On Fri, Apr 17, 2015 at 10:34 AM, Kynn Jones  wrote:

> I have some data in the form of a matrix of doubles (~2 million
> rows, ~400 columns) that I would like to store in a Pg table,
> along with the associated table of metadata (same number of rows,
> ~30 columns, almost all text).  This is large enough to make
> working with it from flat files unwieldy.
>
> (The only reason for wanting to transfer this data to a Pg table
> is the hope that it will be easier to work with it by using SQL
> queries to extract subsets of it, than by, e.g., writing programs
> that must scan the entire matrix every time they're run.)
>
> My plan is to add one extra column to the Pg tables to serve as
> the primary key joining the data and the metadata tables
> together.
>
>
​I could be wrong, but it seems to me that the records in the two files
have a one to one correspondence with each other. That is, record #1 in
file #1 has data related to record #1 in file #2. And that there should
therefore always be the same number of records in both files. So my first
thought is why not just have one table in PostgreSQL which contains the
data merged from the corresponding records in the two files? IMO, this is a
cleaner data design than trying to ensure that two tables are kept "in
sync" as records are added and deleted. Of course, this could possible hurt
performance, due to reading a lot of data, if you consistently only need
data from one file and only occasionally read the other.​

If you insist on two for some reason, then I would have the load program
read a record from each file, incrementing a counter, and INSERT the data
from the files into the tables using the counter as the primary key. To
ease things a bit, assuming the 1:1, I would have the PK on the more active
file be a foreign key on the other file, with an ON DELETE CASCADE so that
when you DELETE records from the more active, the corresponding record in
the secondary table will automatically be deleted. You might even want some
sort of CONSTRAINT trigger to do a DELETE on the primary table if a record
is DELETEd from the secondary.
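
One way to wire that up (names invented for illustration):

CREATE TABLE matrix_data (
    row_id bigint PRIMARY KEY,
    vals   double precision[]
);

CREATE TABLE matrix_meta (
    row_id bigint PRIMARY KEY
                  REFERENCES matrix_data (row_id) ON DELETE CASCADE,
    label  text
);
-- deleting a row from matrix_data now removes its metadata row as well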

Also, if you really want to separate the data into more than one table,
then why only two? In this case, which I admit that I personally _dislike_,
you might want to do an analysis of what columns you access together
frequently and put those columns into separate tables. Again, the main plus
of this would be memory usage and I/O on the data base server side. May Joe
Celko forgive me for introducing physical DB configuration into a data
structure discussion.


​​


>
> Thanks in advance!
>
> kj
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
If you sent twitter messages while exploring, are you on a textpedition?

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown


Re: [GENERAL] On using doubles as primary keys

2015-04-17 Thread Paul A Jungwirth
On Apr 17, 2015 8:35 AM, "Kynn Jones"  wrote:
> (The only reason for wanting to transfer this data to a Pg table
> is the hope that it will be easier to work with it by using SQL

800 million 8-byte numbers doesn't seem totally unreasonable for
python/R/Matlab, if you have a lot of memory. Are you sure you want it in
Postgres? Load the file once then filter it as you like. If you don't have
the memory I can see how using Postgres to get fewer rows at a time might
help. Fewer columns at a time would help even more if that's possible.

> In its simplest form, this would mean using
> doubles as primary keys, but this seems to me a bit weird.

I'd avoid that and just include an integer PK with your data. Data frames in
the languages above support that, or just slice off the PK column before
doing your matrix math.

Also instead of 401 columns per row maybe store all 400 doubles in an array
column? Not sure if that's useful for you but maybe it's worth considering.
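
E.g. something like (sketch; names are placeholders):

CREATE TABLE matrix (
    id   integer PRIMARY KEY,
    vals double precision[] NOT NULL   -- the ~400 doubles for this row
);
-- pull out only the rows and the slice of "columns" you need
SELECT id, vals[10:20] FROM matrix WHERE id BETWEEN 1000 AND 2000;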

Also if you put the metadata in the same table as the doubles, can you
leave off the PKs altogether? Why join if you don't have to? It sounds like
the tables are 1-to-1? Even if some metadata is not, maybe you can finesse
it with hstore/arrays.

Good luck!

Paul


Re: [GENERAL] PL\pgSQL 'ERROR: invalid input syntax for type oid:' [PostgreSQL 9.3.6 and 9.4]

2015-04-17 Thread William Dunn
Thanks Adrian!  Changing the declaration row_data to be of type RECORD
(rather than pg_catalog.pg_class%ROWTYPE) resolved the error :)
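
For the archives, a condensed sketch of the working version (untested, 
simplified from the full function quoted below):

DO $$
DECLARE
    row_data RECORD;   -- RECORD adapts to whatever columns the query returns
BEGIN
    FOR row_data IN
        SELECT c.relname, n.nspname
        FROM pg_class c
        JOIN pg_namespace n ON c.relnamespace = n.oid
        WHERE c.reltablespace <> 0
          AND c.reltablespace NOT IN (SELECT oid FROM pg_tablespace
                                      WHERE spcname IN ('pg_default', 'pg_global'))
          AND c.relkind = 'r'
    LOOP
        EXECUTE 'ALTER TABLE ' || quote_ident(row_data.nspname) || '.'
                               || quote_ident(row_data.relname)
                               || ' SET TABLESPACE pg_default';
    END LOOP;
END $$;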

- Will

*Will J Dunn*
*willjdunn.com *

On Thu, Apr 16, 2015 at 4:36 PM, Adrian Klaver 
wrote:

> On 04/16/2015 07:52 AM, William Dunn wrote:
>
>> Hello list,
>>
>> I am creating a plpgsql procedure in Postgres 9.4 (also testing in
>> 9.3.6) to move all of the tables that are not in a default tablespace
>> (pg_default, pg_global, or 0) into the tablespace pg_default. However
>> when it executes I get an error 'ERROR:  invalid input syntax for type
>> oid:' which I do not know how to resolve..
>>
>> The procedure executes the following select query, which returns the
>> relname (table name, type name) and nspname (schema name, type name) of
>> each table that is not in the default tablespaces, into a variable called
>> row_data (of type pg_catalog.pg_class%ROWTYPE):
>>
>> SELECT pg_class.relname, pg_namespace.nspname
>> FROM pg_class INNER JOIN pg_namespace ON
>> pg_class.relnamespace=pg_namespace.oid
>> WHERE pg_class.reltablespace<>(SELECT oid FROM pg_tablespace WHERE
>> spcname='pg_default')
>> AND pg_class.reltablespace<>(SELECT oid FROM pg_tablespace WHERE
>> spcname='pg_global')
>> AND pg_class.reltablespace<>0
>> AND pg_class.relkind='r'
>> ORDER BY pg_class.relname;
>>
>> Using the example database EDBSTORE (example database provided by
>> Enterprise DB) the query returned the table 'inventory' which was in
>> schema 'edbstore' (which I had stored on tablespace 'edbstore', not
>> pg_default):
>>relname  | nspname
>> ---+--
>>   inventory | edbstore
>> (1 row)
>>
>>
>> The procedure loops through each returned row and executes an ALTER
>> TABLE command to move them to the tablespace pg_default:
>> EXECUTE 'ALTER TABLE '|| row_data.nspname || '.' || row_data.relname ||'
>> SET TABLESPACE pg_default';
>>
>> (so in the above edbstore example it should execute "ALTER TABLE
>> edbstore.inventory SET TABLESPACE pg_default;")
>>
>> However, when I run the procedure it is returning the following error:
>> ERROR:  invalid input syntax for type oid: "edbstore"
>> CONTEXT:  PL/pgSQL function move_table_tablespaces_to_pg_default()
>> line 18 at FOR over SELECT rows
>>
>> Does anyone understand this error?
>>
>
> pg_class has a hidden field oid:
>
> http://www.postgresql.org/docs/9.3/interactive/catalog-pg-class.html
>
> When you are doing:
>
> row_data pg_catalog.pg_class%ROWTYPE;
>
> that is saying you want the whole row type for pg_class:
>
>
> http://www.postgresql.org/docs/9.3/interactive/plpgsql-declarations.html#PLPGSQL-DECLARATION-ROWTYPES
>
> You are not supplying the oid or the columns other than relname and
> nspname so the error is expected.
>
> If it were me I would use a RECORD type:
>
>
> http://www.postgresql.org/docs/9.3/interactive/plpgsql-declarations.html#PLPGSQL-DECLARATION-RECORDS
>
> It will adapt to the columns actually returned.
>
>
>> The full plpgsql function is as follows:
>> ~~~
>> CREATE FUNCTION move_table_tablespaces_to_pg_default () RETURNS INTEGER
>> AS $$
>> -- Loops through the tables not in the tablespace pg_default, pg_global,
>> or the default tablespace and moves them to the pg_default tablespace
>> -- Returns the number of tables that were moved
>>
>> DECLARE
>>
>>-- Declare a variable to hold the counter of tables moved
>>objects_affected INTEGER = 0;
>>
>>-- Declare a variable to hold rows from the pg_class table
>>row_data pg_catalog.pg_class%ROWTYPE;
>>
>> BEGIN
>>
>>-- Iterate through the results of a query which lists all of the
>> tables not in the tablespace pg_default, pg_global, or the default
>> tablespace
>>FOR row_data IN (SELECT pg_class.relname, pg_namespace.nspname
>> FROM pg_class INNER JOIN pg_namespace ON
>> pg_class.relnamespace=pg_namespace.oid
>>
>> WHERE pg_class.reltablespace<>(SELECT oid FROM pg_tablespace WHERE
>> spcname='pg_default') AND pg_class.reltablespace<>(SELECT oid FROM
>> pg_tablespace WHERE
>>
>> spcname='pg_global') AND pg_class.reltablespace<>0 AND
>> pg_class.relkind='r' ORDER BY pg_class.relname)  LOOP
>>
>>   -- execute ALTER TABLE statement on that table to move it to
>> tablespace pg_default
>>   EXECUTE 'ALTER TABLE '|| row_data.nspname || '.' ||
>> row_data.relname ||' SET TABLESPACE pg_default';
>>
>>   -- increment count of tables moved
>>   objects_affected := objects_affected + 1;
>>END LOOP;
>>
>>-- Return count of tables moved
>>-- RETURN objects_affected;
>> END;
>> $$ LANGUAGE 'plpgsql';
>>
>> Thanks!!
>> Will
>>
>> *Will J Dunn*
>> *willjdunn.com *
>

Re: [GENERAL] Waiting on ExclusiveLock on extension

2015-04-17 Thread Andomar


Yes, but did you have the same workload when you upgraded to 9.3 as 
you do today?


The workload is very similar. We upgraded from 9.1 to 9.3 only two 
months ago, and our usage statistics have not changed much. There were 
no "remaining connection slots are reserved for non-replication 
superuser connections" messages in the weeks after the 9.3 upgrade.


BTW, something else to be aware of: because you essentially re-loaded 
all your data in a single transaction, that means that a: you'll be 
doing a lot of hint bit updates until all data has been read a second 
time, and b: a lot of this data will come up for freezing at around 
the same time, creating a big chunk of work for autovacuum. That's 
caused problems for me in the past, though that was on a database that 
had a pretty high workload.


We run "VACUUM ALL" every night and although that gave a bit more CPU 
and I/O but nothing major.


When the problem occurs, the symptoms are:
- spike to a very high load average (= CPU usage)
- messages about long waits on ExclusiveLock on extension appear
- remaining connection slots are reserved (with corresponding end user 
impact)


An example of a message with 1 second wait duration and a lot of waiting 
queries:


process 667 still waiting for ExclusiveLock on extension of relation 
1249 of database 16406 after 1005.226 ms","Process holding the lock: 
36279. Wait queue: 36725, 36405, 36511, 36721, 36280, 36048, 36566, 
36636, 36466, 36734, 36723, 36621, 36423, 36931, 36720, 36429, 35500, 
36735, 37015, 36717, 36938, 36870, 36732, 36587, 36869, 36285, 36573, 
37101, 36937, 36414, 36834, 37105, 36867, 36724, 36991, 37102, 36882, 
36802, 37163, 39964, 39723, 40044, 39821, 40150, 40218, 40203, 40054, 
40060, 40173, 40091, 40174, 40058, 40658, 40370, 40177, 40920, 41085, 
41103, 41117, 41154, 41161, 41066, 41053, 41380, 40661, 40632, 40698, 
41242, 40681, 41174, 41328, 41075, 41245, 41326, 41523, 41153, 41170, 
40543, 41314, 41526, 41490, 41157, 41353, 41472, 41730, 41546, 45087, 
41535, 41474, 41362, 41450, 41948, 41929, 41459, 41508, 42117, 42127, 
41950, 41922, 42414, 41939, 42565, 42643, 42242, 41796, 42324, 42358, 
42411, 42487, 41758, 42120, 42570, 41820, 41925, 43356, 43381, 43360, 
43351, 43364, 42336, 42871, 43007, 42455, 43363, 42287, 43336, 42652, 
42803, 43567, 43706, 43795, 43630, 43716, 43810, 43596, 43061, 43954, 
44014, 43377, 43366, 43825, 43454, 43840, 43582, 43839, 44009, 43842, 
43693, 44320, 43824, 43456, 43852, 43863, 44708, 43848, 44255, 44587, 
44936, 44915, 44759, 44700, 44948, 45051, 44808, 45189, 45137, 45037, 
45303, 45294, 45710, 45711, 45755, 45660, 45120, 45576, 46221, 46125, 
46703, 46512, 46399, 46684, 46762, 46373, 46929, 46443, 46817, 46858, 
47017, 46886, 46805, 46890, 47593, 47548, 47272, 47454, 47906, 47462, 
47801, 47939, 47994, 48098, 48363, 47815, 48393, 48250, 48213, 48470, 
48408, 48509, 48499, 48115, 48448, 48877, 461, 560, 637, 48902, 641, 
49019, 667, 877, 1306, 1070, 1265.""SQL statement ""CREATE TEMPORARY 
TABLE


An example of a message with 140 second wait duration:

process 27173 acquired ExclusiveLock on extension of relation 16787 of 
database 16406 after 138522.699 ms","SQL statement ""INSERT INTO


There are about 50 queries a second, so a 10 second lock will exhaust 
the number of available connections.


We keep sar logs, and they show an unusual amount of pgscand/s and 
pgsteal/s around the time of the problem:


07:25:01 PM  pgpgin/s pgpgout/s   fault/s  majflt/s   pgfree/s pgscank/s pgscand/s pgsteal/s  %vmeff
08:25:01 PM     29.93   1069.13 393727.33      0.00  170655.03      0.00     90.20     90.20  100.00
08:26:01 PM     20.14   1325.55 388751.51      0.00  157188.40      0.00    315.81    315.81  100.00
08:27:02 PM     59.33   1215.97 612453.98      0.00  271483.71      0.00    180.40    180.40  100.00
08:28:01 PM     43.28   1881.71 473067.50      0.00  236778.10      0.00    114.87    114.87  100.00
08:29:01 PM     23.49    665.73 406865.61      0.00  213178.17      0.00    293.11    293.11  100.00
08:30:01 PM     31.39   1317.51 448565.03      0.02  193320.76      0.00    225.88    225.83   99.98

-- Problem starts here
08:31:02 PM     21.16   1499.32 468467.44      0.10  211767.21      0.00   4311.87   4311.84  100.00
08:32:01 PM     10.19    648.87 261067.16      0.05  167231.84      0.00   1071.01   1070.98  100.00
08:33:01 PM     63.59    950.94 422101.19      0.22  242284.12      0.00   1243.37   1243.34  100.00

-- Problem ends here
08:34:01 PM     24.97   1321.61 412294.87      0.00  273734.03      0.00      0.00      0.00    0.00
08:35:01 PM     11.60   1094.22 353741.41      0.00  238541.73      0.00      0.00      0.00    0.00
08:36:01 PM     39.22    976.60 368450.80      0.10  240632.57      0.00      0.00      0.00    0.00
08:37:01 PM     19.83    967.31 320415.39      0.00  217557.42      0.00      0.00      0.00    0.00
08:38:01 PM     15.68   1884.09 301785.58      0.00  200274.51      0.00      0.00      0.00    0.00
08:39:01 PM     62.61    858.31 487099.

Re: [GENERAL] How to keep pg_largeobject from growing endlessly

2015-04-17 Thread Andreas Joseph Krogh
On Friday, 17 April 2015 at 21:11:05, Jim Nasby <jim.na...@bluetreble.com> wrote:
On 4/15/15 9:22 AM, Andreas Joseph Krogh wrote:
 > On Wednesday, 15 April 2015 at 16:05:22, Adam Hooper <a...@adamhooper.com> wrote:
 >
 >     On Wed, Apr 15, 2015 at 9:57 AM, Andreas Joseph Krogh
 >      wrote:
 >      >
 >      > On Wednesday, 15 April 2015 at 15:50:36, Adam Hooper wrote:
 >      >
 >      > On Wed, Apr 15, 2015 at 4:49 AM, Andreas Joseph Krogh
 >      >  wrote:
 >      > >
 >      > > In other words: Does vacuumlo cause diskspace used by
 >     pg_largeobject to be freed to the OS (after eventually vacuumed by
 >     autovacuum)?
 >      >
 >      > No.
 >      >
 >      > Ok. Out of curiousity; When does it get freed, when VACUUM FULL'ed?
 >
 >     Yes. VACUUM FULL or CLUSTER will free the space. (Of course, you need
 >     a lot of free disk space to perform those operations.)
 >
 > I'm sure there's a good reason for why VACUUM FULL needs to rewreite the
 > whole table and cannot "just free the unused space to the OS".

 I think mostly because no one's written something to incrementally
 delete the old data as it's moved. That would be a non-trivial amount of
 work though, because none of the internal APIs are really setup the way
 you'd need them to be to allow for this.

 Also, I think there's some mis-information about vacuum returning space
 to the filesystem. It definitely WILL return space to the filesystem,
 but only under a very strict set of conditions:

 - There must be a sufficient amount of free space *at the end of the
 relation*
 - It must be able to quickly acquire the correct lock
 - It will start truncating pages off the relation until it detects
 someone else is blocked on the lock it's holding. At that point it stops
 what it's doing

 So when the right set of circumstances occur, a plain vacuum will return
 free space; but on a heavily hit table it's very hard for that to happen
 in practice.

 What you might want to do here is essentially re-create the large object
 interface but allow it to hit any table instead of being force to use
 the system one. That would open up the possibility of using tools like
 pg_repack and table partitioning. You could do this in pure SQL, but the
 community might welcome a patch that adds the ability to use different
 tables to the existing large object API.
 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com

Thanks for the info.

There doesn't seem to be much happening with the large-object API (and
pg_largeobject's restriction of being a system catalog). Are there any plans
to improve it? I see two (for me) obvious enhancements: 1. being able to move
the LO table (for now pg_largeobject) to another tablespace without restarting
the cluster in single-user mode, and 2. improvements to freeing space back to
the OS.

Would crowd-funding help here?

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com

[GENERAL] Running pg_upgrade under Debian

2015-04-17 Thread rob stone
Hello,

I'm trying to upgrade from 9.3 to 9.4 on my laptop and encountering this
error:-


postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade
-b /usr/lib/postgresql/9.3/bin -B /usr/lib/postgresql/9.4/bin
-d /home/postgres/data93/userqueries
-D /home/postgres/data94/userqueries -U pguserqueries

check for "/home/postgres/data93/userqueries/base" failed: No such file
or directory

Failure, exiting
postgres@roblaptop:/usr/lib/postgresql/9.4/bin$

postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade --version
pg_upgrade (PostgreSQL) 9.4.1


I have two tablespaces defined for this data -- one to hold the tables
and another for the indices.

There is no "base" file or directory.

Do I just create a dummy directory named "base"?

If there is supposed to be a directory or file named "base" why wasn't
it created by initdb?


Regards,
Robert





Re: [GENERAL] Help with slow table update

2015-04-17 Thread Pawel Veselov
>
> [skipped]
>
>
>>  But remember that if you update or delete a row, removing it from an
>>> index, the data will stay in that index until vacuum comes along.
>>>
>>> Also, there's no point in doing a REINDEX after a VACUUM FULL;
>>> vacuum full rebuilds all the indexes for you.
>>>
>>>
>>> I was being desperate :)
>>>
>>> I still think there is something very wrong with this particular table.
>>> First, I have production systems that employ this function on way larger
>>> data set, and there is no problem (so far, but still). This machine is
>>> part of a test deployment, there is no constant load, the only data that
>>> is being written now is when I do these tests. Vacuuming should prune
>>> all that dead stuff, and if it's absent, it's unclear where is the time
>>> spent navigating/updating the table with 24 rows :)
>>>
>>
>> I think you definitely have a problem with dead rows, as evidenced by the
>> huge improvement VACUUM FULL made.
>>
>
> But it's not clear why (and not reasonable, IMHO, that) it wouldn't
> improve past current point.
>

What I should've done is 'VACUUM FULL VERBOSE'. Once I did, it told me
there were 800k dead rows that can't be removed. After digging around I
found some dangling prepared transactions, going back months. Once I threw
those away, and re-vacuumed, things got back to normal.
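
For anyone hitting the same thing later, the dangling transactions show up 
like this (sketch):

-- list prepared transactions that were never committed or rolled back
SELECT gid, prepared, owner, database FROM pg_prepared_xacts;
-- get rid of each one, using the gid reported above
ROLLBACK PREPARED 'some-gid-from-above';
-- then vacuum again (table name is a placeholder)
VACUUM FULL VERBOSE mytable;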

Thanks for all your help and advice.


Re: [GENERAL] Running pg_upgrade under Debian

2015-04-17 Thread Karsten Hilbert
On Sat, Apr 18, 2015 at 08:09:43AM +1000, rob stone wrote:

> I'm trying to upgrade from 9.3 to 9.4 on my laptop and encountering this
> error:-
> 
> 
> postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade
> -b /usr/lib/postgresql/9.3/bin -B /usr/lib/postgresql/9.4/bin
> -d /home/postgres/data93/userqueries
> -D /home/postgres/data94/userqueries -U pguserqueries

For what it's worth: Debian provides a

pg_upgradecluster

tailored to its specific setup of PostgreSQL clusters. That
has worked well for me across several major version bumps.

Karsten

> check for "/home/postgres/data93/userqueries/base" failed: No such file
> or directory
> 
> Failure, exiting
> postgres@roblaptop:/usr/lib/postgresql/9.4/bin$
> 
> postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade --version
> pg_upgrade (PostgreSQL) 9.4.1
> 
> 
> I have two tablespaces defined for this data -- one to hold the tables
> and another for the indices.
> 
> There is no "base" file or directory.
> 
> Do I just create a dummy directory named "base"?
> 
> If there is supposed to be a directory or file named "base" why wasn't
> it created by initdb?
> 
> 
> Regards,
> Robert
> 
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

-- 
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346




Re: [GENERAL] Running pg_upgrade under Debian

2015-04-17 Thread rob stone



On Sat, 2015-04-18 at 00:25 +0200, Karsten Hilbert wrote:
> On Sat, Apr 18, 2015 at 08:09:43AM +1000, rob stone wrote:
> 
> > I'm trying to upgrade from 9.3 to 9.4 on my laptop and encountering this
> > error:-
> > 
> > 
> > postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade
> > -b /usr/lib/postgresql/9.3/bin -B /usr/lib/postgresql/9.4/bin
> > -d /home/postgres/data93/userqueries
> > -D /home/postgres/data94/userqueries -U pguserqueries
> 
> For what it's worth: Debian provides a
> 
>   pg_upgradecluster
> 
> tailored to its specific setup of PostgreSQL clusters. That
> has worked well for me across several major version bumps.
> 
> Karsten
> 

Indeed I have that program installed in /usr/lib but the man pages state
that it cannot handle tablespaces.

So, I guess I have to move to Plan B, use pg_dump from the old version
and then import into 9.4.1.
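
Roughly, the dump-and-restore route would be (sketch; assuming the old 
cluster listens on 5432 and the new one on 5433):

# dump roles, tablespaces and databases from the 9.3 cluster
pg_dumpall -p 5432 -U postgres -f cluster93.sql
# edit the tablespace paths for the new layout, then load into 9.4
psql -p 5433 -U postgres -d postgres -f cluster93.sql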

Thanks for your help.

Robert


> > check for "/home/postgres/data93/userqueries/base" failed: No such file
> > or directory
> > 
> > Failure, exiting
> > postgres@roblaptop:/usr/lib/postgresql/9.4/bin$
> > 
> > postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade --version
> > pg_upgrade (PostgreSQL) 9.4.1
> > 
> > 
> > I have two tablespaces defined for this data -- one to hold the tables
> > and another for the indices.
> > 
> > There is no "base" file or directory.
> > 
> > Do I just create a dummy directory named "base"?
> > 
> > If there is supposed to be a directory or file named "base" why wasn't
> > it created by initdb?
> > 
> > 
> > Regards,
> > Robert
> > 
> > 
> > 
> > -- 
> > Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
> 
> -- 
> GPG key ID E4071346 @ eu.pool.sks-keyservers.net
> E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346
> 
> 





Re: [GENERAL] Running pg_upgrade under Debian

2015-04-17 Thread Adrian Klaver

On 04/17/2015 03:09 PM, rob stone wrote:

Hello,

I'm trying to upgrade from 9.3 to 9.4 on my laptop and encountering this
error:-


postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade
-b /usr/lib/postgresql/9.3/bin -B /usr/lib/postgresql/9.4/bin
-d /home/postgres/data93/userqueries
-D /home/postgres/data94/userqueries -U pguserqueries

check for "/home/postgres/data93/userqueries/base" failed: No such file
or directory

Failure, exiting
postgres@roblaptop:/usr/lib/postgresql/9.4/bin$

postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade --version
pg_upgrade (PostgreSQL) 9.4.1


I have two tablespaces defined for this data -- one to hold the tables
and another for the indices.

There is no "base" file or directory.


Guessing /home/postgres/data*/userqueries/ is the location of your 
tablespace, not the cluster directory.


The -d and -D need to point at the cluster directory, which will be 
something like :


/var/lib/postgresql/9.3/main
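
So the invocation would look more like this (paths assumed for a stock Debian 
layout, adjust as needed):

pg_upgrade -b /usr/lib/postgresql/9.3/bin -B /usr/lib/postgresql/9.4/bin \
           -d /var/lib/postgresql/9.3/main -D /var/lib/postgresql/9.4/main \
           -U pguserqueries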



Do I just create a dummy directory named "base"?


No, you need to point to the correct directory.

See here for more information:

http://www.postgresql.org/docs/9.4/interactive/pgupgrade.html

Or use pg_upgradecluster:

http://manpages.ubuntu.com/manpages/trusty/man8/pg_upgradecluster.8.html


If there is supposed to be a directory or file named "base" why wasn't
it created by initdb?





Regards,
Robert






--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] "Cast" SRF returning record to a table type?

2015-04-17 Thread Jim Nasby
I'm working on a function that will return a set of test data, for unit 
testing database stuff. It does a few things, but ultimately returns 
SETOF record that's essentially:


RETURN QUERY EXECUTE 'SELECT * FROM ' || table_name;

Because it's always going to return a real relation, I'd like to be able 
to do the equivalent of:


SELECT ... FROM my_function( 'some_table' )::some_table;

Is there any trick that would allow that to work? I know that instead of 
'SELECT * ...' I can do 'SELECT row(t.*) FROM ' || table_name || ' AS t' 
and then do


SELECT ... FROM my_function( 'some_table' ) AS data( d some_table )

but I'm hoping to avoid the extra level of indirection.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [GENERAL] "Cast" SRF returning record to a table type?

2015-04-17 Thread David G. Johnston
On Friday, April 17, 2015, Jim Nasby  wrote:

> I'm working on a function that will return a set of test data, for unit
> testing database stuff. It does a few things, but ultimately returns SETOF
> record that's essentially:
>
> RETURN QUERY EXECUTE 'SELECT * FROM ' || table_name;
>
> Because it's always going to return a real relation, I'd like to be able
> to the equivalent of:
>
> SELECT ... FROM my_function( 'some_table' )::some_table;


Unfortunately this means "cast the existing type to some_table" and
"record" is not a valid type in this context.


>
> Is there any trick that would allow that to work? I know that instead of
> 'SELECT * ...' I can do 'SELECT row(t.*) FROM ' || table_name || ' AS t'
> and then do
>
> SELECT ... FROM my_function( 'some_table' ) AS data( d some_table )
>
> but I'm hoping to avoid the extra level of indirection.
>
>
Haven't explored this specific code in depth...but which part - the
function alias or the select row(t.*)?  They seem to be independent
concerns.

David J.


Re: [GENERAL] Waiting on ExclusiveLock on extension

2015-04-17 Thread Jim Nasby

On 4/17/15 4:22 PM, Andomar wrote:



Yes, but did you have the same workload when you upgraded to 9.3 as
you do today?


The workload is very similar. We upgraded from 9.1 to 9.3 only two
months ago, and our usage statistics have not changed much. There were
no "remaining connection slots are reserved for non-replication
superuser connections" messages in the weeks after the 9.3 upgrade.


BTW, something else to be aware of: because you essentially re-loaded
all your data in a single transaction, that means that a: you'll be
doing a lot of hint bit updates until all data has been read a second
time, and b: a lot of this data will come up for freezing at around
the same time, creating a big chunk of work for autovacuum. That's
caused problems for me in the past, though that was on a database that
had a pretty high workload.


We run "VACUUM ALL" every night and although that gave a bit more CPU
and I/O but nothing major.

When the problem occurs, the symptoms are:
- spike to a very high load average (= CPU usage)
- messages about long waits on ExclusiveLock on extension appear
- remaining connection slots are reserved (with corresponding end user
impact)

An example of a message with 1 second wait duration and a lot of waiting
queries:

process 667 still waiting for ExclusiveLock on extension of relation
1249 of database 16406 after 1005.226 ms","Process holding the lock:
36279. Wait queue: 36725, 36405, 36511, 36721, 36280, 36048, 36566,

...

49019, 667, 877, 1306, 1070, 1265.""SQL statement ""CREATE TEMPORARY
TABLE


Ok, that's a MAJOR hint, because relation 1249 is a system catalog; 
namely pg_attribute. So I think what's happening here is that your 
catalog has become horrifically bloated. I'm 99% certain that VACUUM ALL 
will not vacuum the catalog tables.


Do you by chance have autovacuum turned off?

A manual VACUUM VERBOSE pg_attribute might provide some immediate relief.
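
To get a feel for how bad it is first (sketch):

-- current size of pg_attribute and what the stats collector thinks of it
SELECT pg_size_pretty(pg_total_relation_size('pg_catalog.pg_attribute'));
SELECT n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_sys_tables
WHERE relname = 'pg_attribute';
-- then:
VACUUM VERBOSE pg_attribute;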


An example of a message with 140 second wait duration:

process 27173 acquired ExclusiveLock on extension of relation 16787 of
database 16406 after 138522.699 ms","SQL statement ""INSERT INTO


This is a different case though, because 16787 is not a catalog table. 
(SELECT 16787::regclass; will tell you what table that is).



There are about 50 queries a second, so a 10 second lock will exhaust
the number of available connections.


Are you using a connection pool? Establishing 50 new database 
connections per second won't do anything to help performance...


Running out of connections in this scenario isn't surprising.


We keep sar logs, and they show an unusual amount of pgscand/s and
pgsteal/s around the time of the problem:

07:25:01 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s
pgscand/s pgsteal/s%vmeff
08:25:01 PM 29.93   1069.13 393727.33  0.00 170655.03 0.00
90.20 90.20100.00
08:26:01 PM 20.14   1325.55 388751.51  0.00 157188.40 0.00
315.81315.81100.00


...


pgscand/s = Number of pages scanned directly per second.
pgsteal/s = Number of pages the system has reclaimed from cache
(pagecache and swapcache) per second to satisfy its memory demands.

Could the pgscand and pgsteal numbers provide a hint?  They're sometimes
zero for more than half an hour, so they don't seem related to checkpoints.


I think what that means is that there was suddenly a big spike in memory 
demand at the OS level, so now the OS is frantically dumping cached 
pages. That in itself won't explain this, but it may be a clue.


In order to extend a relation we need to ask the filesystem to actually 
extend the file (which presumably means at least writing some metadata 
to disk), and then I think we create a WAL record. Creating the WAL 
record won't by itself write to disk... *unless* the wal_buffers are all 
already full. So if you also see an I/O spike when this happens you 
could well just be starved from the I/O system (though obviously it'd be 
better if we handled that situation more elegantly than this).


Another possibility given the big uptick in page scanning in the OS is 
that you're saturating the memory bus... but you'd need WAY more than 5k 
pages/s to do that.


I do suspect your pgfree/s is very high though; putting 200k pages/s on 
the free list seems like something's broken.


BTW, 
http://www.postgresql.org/message-id/466d72d2-68ec-4ff4-93b8-43b687e7b...@simply.name 
contains a list of URLs about relation extension problems.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] ORDER BY for jsonb

2015-04-17 Thread Pai-Hung Chen
Hi,

I am new to PostgreSQL and have a question about the new jsonb type in 9.4.
Suppose I have a table called "user" that has two columns: (1) "user_id" of
type text, also the primary key, (2) "setting" of type jsonb. With the
following query pattern:

SELECT *
FROM user
WHERE user_id IN [...]
ORDER BY setting->>'foo',
 setting->>'bar',
 ...

where the ORDER BY clause can contain an arbitrary list of root-level fields in
"setting". In this case, how should I create indexes for "user" to get good
query performance?

Thanks,
Pai-Hung


Re: [GENERAL] ORDER BY for jsonb

2015-04-17 Thread Jim Nasby

On 4/17/15 9:53 PM, Pai-Hung Chen wrote:

Hi,

I am new to PostgreSQL and have a question about the new jsonb type in
9.4. Suppose I have a table called "user" that has two columns: (1)
"user_id" of type text, also the primary key, (2) "setting" of type
jsonb. With the following query pattern:

SELECT *
FROM user
WHERE user_id IN [...]
ORDER BY setting->>'foo',
setting->>'bar',
  ...

where ORDER BY clause can contain an arbitrary list of root-level fields
in "setting". In this case, how should I create indexes for "user" to
get good query performance?


The performance for that query is going to come from quickly identifying 
records from the WHERE clause, which is going to use the primary key. 
For the query you're showing, indexes on the setting field aren't going 
to help.
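
For completeness: if you ever needed to sort a large fraction of the table on
one known key, an expression index would be the usual tool, e.g. (a minimal
sketch, hypothetical key name):

-- index one root-level key, extracted as text
CREATE INDEX user_setting_foo_idx ON "user" ((setting->>'foo'));

But with rows already picked out by primary key and an arbitrary list of sort
keys, the planner will simply sort the selected rows, so such indexes won't
be used for this query.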

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] "Cast" SRF returning record to a table type?

2015-04-17 Thread Jim Nasby

On 4/17/15 7:39 PM, David G. Johnston wrote:

On Friday, April 17, 2015, Jim Nasby <jim.na...@bluetreble.com> wrote:

I'm working on a function that will return a set of test data, for
unit testing database stuff. It does a few things, but ultimately
returns SETOF record that's essentially:

RETURN QUERY EXECUTE 'SELECT * FROM ' || table_name;

Because it's always going to return a real relation, I'd like to be
able to do the equivalent of:

SELECT ... FROM my_function( 'some_table' )::some_table;


Unfortunately this means "cast the existing type to some_table" and
"record" is not a valid type in this context.


Is there any trick that would allow that to work? I know that
instead of 'SELECT * ...' I can do 'SELECT row(t.*) FROM ' ||
table_name || ' AS t' and then do

SELECT ... FROM my_function( 'some_table' ) AS data( d some_table )

but I'm hoping to avoid the extra level of indirection.

Haven't explored this specific code in depth...but which part - the
function alias or the select row(t.*)?  They seem to be independent
concerns.


I'm saying that I know I can use the row construct as a poor 
work-around. What I actually want though is a way to tell this query:


SELECT ... FROM my_function( 'some_table' )

that my_function is returning a record that exactly matches "my_table". 
I suspect there's not actually any way to do that :(
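
For anyone following along, this is the workaround I mean (a minimal,
self-contained sketch; all names are made up):

-- hypothetical table and record-returning function
CREATE TABLE some_table (id int, val text);

CREATE FUNCTION my_function(table_name text)
RETURNS SETOF record
LANGUAGE plpgsql AS $$
BEGIN
    RETURN QUERY EXECUTE 'SELECT * FROM ' || quote_ident(table_name);
END
$$;

-- the caller has to spell out the columns; there's no way to say
-- "this is shaped exactly like some_table" without the definition list
SELECT * FROM my_function('some_table') AS t(id int, val text);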

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] function to send email with query results

2015-04-17 Thread Suresh Raja
Hi all:

I'm looking to write a function to send email with the result of a query. Is
it possible to send email from within a function?  Any help is appreciated.

Thanks,
-Suresh Raja


Re: [GENERAL] "Cast" SRF returning record to a table type?

2015-04-17 Thread David G. Johnston
On Friday, April 17, 2015, Jim Nasby  wrote:

> On 4/17/15 7:39 PM, David G. Johnston wrote:
>
>> On Friday, April 17, 2015, Jim Nasby wrote:
>>
>> I'm working on a function that will return a set of test data, for
>> unit testing database stuff. It does a few things, but ultimately
>> returns SETOF record that's essentially:
>>
>> RETURN QUERY EXECUTE 'SELECT * FROM ' || table_name;
>>
>> Because it's always going to return a real relation, I'd like to be
>> able to the equivalent of:
>>
>> SELECT ... FROM my_function( 'some_table' )::some_table;
>>
>>
>> Unfortunately this means "cast the existing type to some_table" and
>> "record" is not a valid type in this context.
>>
>>
>> Is there any trick that would allow that to work? I know that
>> instead of 'SELECT * ...' I can do 'SELECT row(t.*) FROM ' ||
>> table_name || ' AS t' and then do
>>
>> SELECT ... FROM my_function( 'some_table' ) AS data( d some_table )
>>
>> but I'm hoping to avoid the extra level of indirection.
>>
>> Haven't explored this specific code in depth...but which part - the
>> function alias or the select row(t.*)?  They seem to be independent
>> concerns.
>>
>
> I'm saying that I know I can use the row construct as a poor work-around.
> What I actually want though is a way to tell this query:
>
> SELECT ... FROM my_function( 'some_table' )
>
> that my_function is returning a record that exactly matches "my_table". I
> suspect there's not actually any way to do that :(
>
>
No matter what you do inside the function you have to write that last query
as "from my_function('some_table') AS (rel some_table)" otherwise the
planner is clueless.  You cannot defer the type until runtime.  Your cast
form is slightly more succinct but I cannot see making it work when the
current method is serviceable.

Inside the function I would have thought that select * should work - no need
to use the row(t.*) construct - but the latter seems reasonably direct...

If you could find a way to pass a value of type some_table into the
function - instead of the name/text 'some_table' - you could possibly use
polymorphic pseudotypes...just imagining here...

Select ... From my_func(null::some_table)
Create function my_func(tbl any) returns setof any 
Use typeof to get a text string of the tbl arg's type.

You could maybe also return a refcursor...
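
A minimal sketch of that polymorphic route, assuming plpgsql, pg_typeof(), and
the anyelement pseudotype (the polymorphic type usable from plpgsql; names are
hypothetical again):

CREATE FUNCTION my_func(tbl anyelement)
RETURNS SETOF anyelement
LANGUAGE plpgsql AS $$
BEGIN
    -- pg_typeof(tbl) resolves to the table's composite type, so the rows
    -- come back typed as that table and no column definition list is needed
    RETURN QUERY EXECUTE 'SELECT * FROM ' || pg_typeof(tbl)::text;
END
$$;

SELECT * FROM my_func(NULL::some_table);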

David J.


Re: [GENERAL] [SQL] function to send email with query results

2015-04-17 Thread David G. Johnston
On Friday, April 17, 2015, Suresh Raja  wrote:

> Hi all:
>
> I'm looking to write a function to send email with result of a query.
>  Is it possible to send email with in a function.  Any help is appreciated.
>
>
Yes...though neither the sql nor the plpgsql languages have the necessary
language features to do so - you will need to use something like plperlu.

An (not mutually exclusive) alternative is to create an email queue and
write an email sending client application to process that queue.  The main
advantages being loose coupling and the ability to send the emails
asynchronously.
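
A minimal sketch of the queue variant (hypothetical names; the sending client
lives outside the database and polls, or LISTENs, for new rows):

CREATE TABLE email_queue (
    id        bigserial PRIMARY KEY,
    recipient text        NOT NULL,
    subject   text        NOT NULL,
    body      text        NOT NULL,
    queued_at timestamptz NOT NULL DEFAULT now(),
    sent_at   timestamptz
);

CREATE FUNCTION queue_email(p_recipient text, p_subject text, p_body text)
RETURNS bigint
LANGUAGE sql AS $$
    INSERT INTO email_queue (recipient, subject, body)
    VALUES (p_recipient, p_subject, p_body)
    RETURNING id;
$$;

-- e.g. queue the result of a query for the mailer to pick up
SELECT queue_email('ops@example.com', 'nightly row count',
                   (SELECT count(*)::text FROM pg_class));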

David J.


Re: [GENERAL] ORDER BY for jsonb

2015-04-17 Thread Pai-Hung Chen
Thanks for the help. So in this case, the performance of ORDER BY will not be 
affected at all by whether an index is created on the jsonb "setting" field?

Pai-Hung

-Original Message-
From: "Jim Nasby" 
Sent: 4/17/2015 8:59 PM
To: "Pai-Hung Chen" ; "pgsql-general@postgresql.org" 

Subject: Re: [GENERAL] ORDER BY for jsonb

On 4/17/15 9:53 PM, Pai-Hung Chen wrote:
> Hi,
>
> I am new to PostgreSQL and have a question about the new jsonb type in
> 9.4. Suppose I have a table called "user" that has two columns: (1)
> "user_id" of type text, also the primary key, (2) "setting" of type
> jsonb. With the following query pattern:
>
> SELECT *
> FROM user
> WHERE user_id IN [...]
> ORDER BY setting->>'foo',
> setting->>'bar',
>   ...
>
> where ORDER BY clause can contain an arbitrary list of root-level fields
> in "setting". In this case, how should I create indexes for "user" to
> get good query performance?

The performance for that query is going to come from quickly identifying 
records from the WHERE clause, which is going to use the primary key. 
For the query you're showing, indexes on the setting field aren't going 
to help.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: [GENERAL] function to send email with query results

2015-04-17 Thread John R Pierce

On 4/17/2015 10:30 PM, Suresh Raja wrote:


I'm looking to write a function to send email with result of a query. 
   Is it possible to send email with in a function. Any help is 
appreciated.


I would do that in an application, not a PL/pgSQL function.  Make a
query, fetch the results, format them as you see fit for the email, and
toss it at your language-of-choice's email facility.




--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general