[ADMIN] 8.3 to 8.4 - Can't load dynamic shared library

2010-07-15 Thread Craig James

I'm migrating from Postgres 8.3.10 to 8.4.4, and also from Fedora 9 to Ubuntu 
10.04.

I have a C extension that worked on 8.3.10, but it now refuses to load on 8.4:

CREATE OR REPLACE FUNCTION chmoogle_session_id() RETURNS integer
AS '/usr/local/pgsql/lib/libchmoogle.so', 'chmoogle_session_id'
LANGUAGE 'C' VOLATILE;
ERROR:  could not load library "/usr/local/pgsql/lib/libchmoogle.so": 
/usr/local/pgsql/lib/libchmoogle.so: undefined symbol: _ZN9OpenBabel4etabE

So it finds the first shared library, libchmoogle.so.  But the symbol 
_ZN9OpenBabel4etabE is from a second library, the OpenBabel (chemistry) shared 
library, which is also located in

  /usr/local/pgsql/lib/libopenbabel.so

On the Postgres 8.3 system, I simply put libopenbabel.so into the 
/usr/local/pgsql/lib directory, and everything worked well.  On 8.4, I can't 
seem to get it to load libopenbabel.so.  Other programs that use this library 
have no trouble finding it.

What (if anything) has changed from 8.3 to 8.4 regarding how it finds and links 
dynamic libraries?  What else should I look for?  I'm stumped on this one.

Thanks,
Craig



Re: [ADMIN] 8.3 to 8.4 - Can't load dynamic shared library

2010-07-15 Thread Craig James

On 7/15/10 12:52 PM, Tom Lane wrote:

Craig James  writes:

On the Postgres 8.3 system, I simply put libopenbabel.so into the 
/usr/local/pgsql/lib directory, and everything worked well.  On 8.4, I can't 
seem to get it to load libopenbabel.so.  Other programs that use this library 
have no trouble finding it.



What (if anything) has changed from 8.3 to 8.4 regarding how it finds and links dynamic libraries?

Nothing.  I think this more likely is a consequence of your platform
switch from Fedora to Ubuntu.  I'm not sure how the dynamic linker's
search path is defined on Ubuntu, but it might be different.  Or
possibly it's the same but you forgot about making an /etc/ld.so.conf.d
entry on the new machine, or didn't run ldconfig afterwards.


We don't use /etc/ld.so.conf.d, and never have.

On 8.3.10 and earlier, Postgres would ONLY load this library if we put it in 
/usr/local/pgsql/lib, so I figured it was a security restriction, that Postgres 
altered the LD_LIBRARY_PATH to ensure that only legitimate, admin-approved 
libraries could be loaded.  The Postgres documentation sort of hints at this 
but doesn't give details.

Is this true, that Postgres restricts the dynamic linker in some way?  And if 
so, is there documentation?

Craig



Re: [ADMIN] 8.3 to 8.4 - Can't load dynamic shared library

2010-07-15 Thread Craig James

On 7/15/10 2:14 PM, Tom Lane wrote:

Craig James  writes:

On 8.3.10 and earlier, Postgres would ONLY load this library if we put it in 
/usr/local/pgsql/lib, so I figured it was a security restriction, that Postgres 
altered the LD_LIBRARY_PATH to ensure that only legitimate, admin-approved 
libraries could be loaded.  The Postgres documentation sort of hints at this 
but doesn't give details.



Is this true, that Postgres restricts the dynamic linker in some way?


No, it is not.  Library searching behavior is whatever the platform's
dynamic linker defines it to be.  On some platforms we try to build the
executables with an "rpath" pointing at the PG lib directory, so that
you don't need to explicitly configure anything to load our libraries
--- but I don't believe that that causes the linker to *only* search
that directory on any of those platforms.  (If it did, you'd have big
problems with libc.so for instance.)

As for LD_LIBRARY_PATH, the only part of our code that touches that is
the regression test wrapper pg_regress.  You might or might not find
this relevant:
http://archives.postgresql.org/pgsql-hackers/2010-05/msg00248.php

My guess is that the reason you had to put libopenbabel.so into
/usr/local/pgsql/lib before is precisely that you hadn't modified
/etc/ld.so.conf, and so only /usr/local/pgsql/lib would be searched
(as a result of the postgres executable's rpath property) plus the
standard system libraries (but unless you ran ldconfig, the linker
wouldn't know about any additions there).  The Ubuntu situation may
or may not be similar --- in particular, I don't know whether their
packages even set the rpath property at all.  I think some distros
deprecate using rpath and instead expect the PG lib directory to
be part of the ldconfig configuration, in which case *no* nonstandard
library is gonna get found unless you ask ldconfig to pretty please
find it.  Or maybe Ubuntu has yet another way.  You really ought to
ask for help about this in an Ubuntu-specific forum.


Thanks, your answer helped, but in an unexpected way.  Knowing it had nothing to 
do with dynamic loading made me look elsewhere, so I dug around with nm(1) and 
realized that the missing symbol really was missing.

The reason is changes in pgxs.mk, the makefile that's supposed to help build 
extensions.  In 8.3, I could override the link step and use g++ rather than gcc, 
but in 8.4 pgxs.mk ignored my link step and used gcc for the linking.  Since the 
OpenBabel chemistry library is written in C++, you have to use g++ for the link 
step of the Postgres extension.

I ended up hand crafting a Makefile and not using pgxs.mk at all, and now 
everything works.

Craig



[ADMIN] secret key for encryption

2010-07-16 Thread Craig James

This isn't exactly a Postgres question, but I hope someone in the community has 
solved it.

I want to encrypt some data in Postgres that arrives from Apache.  How do you 
store an encryption key in such a way that Apache CGIs can get it, but a hacker 
or rogue employee who manages to access the machine can't find out the 
encryption key?

Thanks,
Craig



Re: [ADMIN] secret key for encryption

2010-07-16 Thread Craig James

Kris,

[Replying to list, too.]

On 7/16/10 10:14 AM, Kris Deugau wrote:

Craig James wrote:

This isn't exactly a Postgres question, but I hope someone in the
community has solved it.

I want to encrypt some data in Postgres that arrives from Apache. How
do you store an encryption key in such a way that Apache CGIs can get
it, but a hacker or rogue employee who manages to access the machine
can't find out the encryption key?


Short answer: You don't.

Longer answer: You can tie things up with public-key encryption so that
a different system can retrieve the data, but the system that put it in
can't because it only has the public (encryption) key, not the private
(decryption) key.

Even that isn't safe from a rogue employee - what if that rogue is your
senior sysadmin with full root access on all your systems?
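
In Postgres terms, that longer answer is roughly what the pgcrypto functions 
pgp_pub_encrypt/pgp_pub_decrypt provide.  A minimal sketch, with a hypothetical 
table name and elided key blocks (not anyone's actual setup):

  -- the web tier only ever holds the PUBLIC key, so it can encrypt but never decrypt
  INSERT INTO payments (card_cipher)
    VALUES (pgp_pub_encrypt('4111111111111111', dearmor('...public key block...')));

  -- decryption happens on a separate, locked-down machine that holds the private key
  SELECT pgp_pub_decrypt(card_cipher, dearmor('...private key block...'), 'key passphrase')
    FROM payments WHERE id = 42;

The machine that only has the public key can insert data but can never read it 
back; whoever holds the private key can.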


If we assume no escalation of privileges, that is, Apache stays running as the 
apache user and users can't escalate to root, what then?

This must be a solved problem.  Credit-card numbers are required by law to be 
encrypted.  It wouldn't make sense to encrypt them and then leave the key sitting 
around where anyone can find it.  There must be any number of Postgres users who 
store encrypted credit card numbers and other personal data.  How do they solve 
this problem?

Craig



Re: [ADMIN] Stumped by a version conflict.

2010-07-16 Thread Craig James

On 7/16/10 3:31 PM, Patric Michael wrote:

Hi all...

I joined this list in an effort to solve a puzzle I fail to understand.
Thank you in advance for taking the time to read.

I've been managing a domain for the last eight years as an erstwhile
administrator.  Erstwhile meaning it is largely a matter of setting up
and automating the routine functions of the machine. I am not by any
stretch of the imagination a guru. And please keep in mind that aside
from minor tweaks and upgrades, I pretty much leave the machine alone.

In that time, I've upgraded postgres 7.1 to 7.4 and now to 8.1 and have
rarely needed to reboot the machine. (Yay CentOS!)  Manual installs from
source.  Not RPM's

A few days ago however, the co-lo appears to have taken the system down
for a short time and brought it back up, essentially rebooting the
system.  No big deal, except that until then, it had been running for
almost two years.

Everything came back up as expected, with the exception of postgres.
When I went to start it manually, I got an error warning of an
unrecognized tcpip_socket parameter in postgresql.conf.  Searching the
8.1 manual I discovered it had been deprecated in favor of
listen_addresses.  Odd, but okay.

I commented out the parameter (irrelevant since psql was configured with
defaults anyway) and tried starting it again.  The next error I got was
that the database was initialized with version 7.4 and this version was
8.1.  What?

I checked the time stamps of the database I knew to be active, and yes,
there are several timestamps from two days ago so I know that's the most 
recent.  PG_VERSION does indeed show 7.4 as the current version, and
yet, the binaries return 8.1.11 as the current version.


Is it possible that you built Postgres 8.1, then accidentally started the old 7.4 
version, and then deleted the 7.4 install tree?  The database may have still been 
running on a Postgres 7.4 server for the last two years.

Craig




It should be noted at this point that medical problems have affected my
memory, so I may well be forgetting something obvious to the rest of
you.   :)

So my questions are these:  Since 8.1 binaries are the only ones that
exist on the machine, (in /usr/bin) how is it possible that they could
have utilized a database initialized with 7.4 in the first place?  I
thought perhaps I started the server back then with an old binary,
rather than what's in the path, but I find no other instances of psql,
pg_ctl, etc. except what's in /bin/.

Without an answer to that, and since I can't start the server as it is
(apparently) currently compiled, I'd have to recompile 7.4, open the
database, dumpall with those binaries, initialize a new database with
the 8.1 binaries, and restore the backed up db into the 8.1 structure.
But if the old 7.4 (current) db was running under 8.1 (and I still can't 
see how that was possible) will it damage the db structure when it is
restored?

Maybe the better question, though probably harder to answer is, "What am
I forgetting?"

Any thoughts or suggestions on how to restart postgres with the database
I know to be current will be especially helpful.

Thank you.







Re: [ADMIN] postgresql basic info

2010-07-27 Thread Craig James

On 7/27/10 3:37 AM, Lazarin Toni wrote:

I am an Oracle DBA and may be working on a project where a PostgreSQL database
will be used.

So I would be grateful if you could please advise me on the following:

· Estimated time for installation and configuration of the latest PostgreSQL
RDBMS software (does it take longer on Windows or on Linux)?


Most of your time will be spent wondering what you've overlooked, thinking it 
couldn't possibly be that simple.

Seriously, if you know Linux or BSD administration, you can download the source, 
compile it, and have a working database in about 30 minutes.  If your distro 
packages Postgres, then your installation time is essentially zero.

You'll spend some time tuning your configuration file for your particular 
needs.  The postgres-performance mailing list is a good place to ask this question.

The most important considerations are common to both Postgres and Oracle.  That 
is, what kind of hardware do you have, and is it sufficient for your workload?  
CPUs, disks, memory, RAID config, and so on.  That's also a question for the 
postgres-performance mailing list.


· Is there any preferred OS, like Windows or Linux, etc.?


BSD or Linux.  Windows is considered a second-tier platform for Postgres.


· Estimated time for import of 50 GB of data from an older Postgres to the
newly installed Postgres?


Depends a lot on your hardware, and also on the nature of the data (lots of 
indexes will add time).  It also depends on how you do it.  pg_dump/pg_restore are 
pretty fast.  This would probably take a few hours on a reasonable mid-sized 
server, for example, with an 8-disk RAID 10 array with a battery-backed RAID 
controller.

Craig



[ADMIN] Connection pooling for a mixture of lightweight and heavyweight jobs?

2010-07-28 Thread Craig James

I have a question that may be related to connection pooling.

We create a bunch of high-performance lightweight Postgres clients that serve 
up images (via mod_perl and Apache::DBI).  We have roughly ten web sites, with 
ten mod_perl instances each, so we always have around 100 Postgres backends 
sitting around all the time waiting.  When a lightweight request comes in, it's 
a single query on a primary key with no joins, so it's very fast.

We also have a very heavyweight process (our primary search technology) that 
can take many seconds, even minutes, to do a search and generate a web page.

The lightweight backends are mostly idle, but when a heavyweight search 
finishes, it causes a burst on the lightweight backends, which must be very 
fast. (They provide all of the images in the results page.)

This mixture seems to make it hard to configure Postgres with the right amount 
of memory and such.  The primary query needs some elbow room to do its work, 
but the lightweight queries all get the same resources.

I figured that having these lightweight Postgres backends sitting around was 
harmless -- they allocate shared memory and other resources, but they never use 
them, so what's the harm?  But recent discussions about connection pooling seem 
to suggest otherwise, that merely having 100 backends sitting around might be a 
problem.

Craig



Re: [ADMIN] Connection pooling for a mixture of lightweight and heavyweight jobs?

2010-07-30 Thread Craig James

On 7/30/10 8:57 AM, Kevin Grittner wrote:

Craig James  wrote:


We create a bunch of high-performance lightweight Postgres clients
that serve up images (via mod_perl and Apache::DBI).  We have
roughly ten web sites, with ten mod_perl instances each, so we
always have around 100 Postgres backends sitting around all the
time waiting.  When a lightweight request comes in, it's a single
query on a primary key with no joins, so it's very fast.

We also have a very heavyweight process (our primary search
technology) that can take many seconds, even minutes, to do a
search and generate a web page.

The lightweight backends are mostly idle, but when a heavyweight
search finishes, it causes a burst on the lightweight backends,
which must be very fast. (They provide all of the images in the
results page.)

This mixture seems to make it hard to configure Postgres with the
right amount of memory and such.  The primary query needs some
elbow room to do its work, but the lightweight queries all get the
same resources.

I figured that having these lightweight Postgres backends sitting
around was harmless -- they allocate shared memory and other
resources, but they never use them, so what's the harm?  But
recent discussions about connection pooling seem to suggest
otherwise, that merely having 100 backends sitting around might be
a problem.


Well, the "if it ain't broke, don't fix it" rule might come into
play here.


I should have given one more detail here: We've been the victim of persistent "CPU 
spikes" that were discussed extensively in postgres-performance.  Tom suggested 
upgrading to 8.4.4, but that can't happen for a couple more months (we're working on it).

  http://archives.postgresql.org/pgsql-performance/2010-04/msg00071.php

Craig


The current configuration might leave you vulnerable to
occasional less-than-optimal performance, if two or more heavyweight
processes finish at the same time, and cause a "thundering herd" of
lightweight processes.  Having the lightweight requests go through a
connection pool could mitigate that problem, but they introduce
their own overhead on every request.  So, I would advise keeping an
eye out for problems which might match the above, but not to take
hasty action in the absence of evidence.  You might buy back 400MB
of RAM for caching (which you may or may not need) at the cost of
extra latency and CPU per request.

-Kevin






Re: [ADMIN] Connection pooling for a mixture of lightweight and heavyweight jobs?

2010-07-30 Thread Craig James

On 7/30/10 10:37 AM, Kevin Grittner wrote:

Craig James  wrote:


Well, the "if it ain't broke, don't fix it" rule might come into
play here.


I should have given one more detail here: We've been the victim
of persistent "CPU spikes" that were discussed extensively in
postgres-performance.  Tom suggested upgrading to 8.4.4, but that
can't happen for a couple more months (we're working on it).



http://archives.postgresql.org/pgsql-performance/2010-04/msg00071.php

Ah, I hadn't connected that thread with this.  After rereading that
thread with the information from this thread in mind, I think what
you describe on the other thread could well be the "thundering herd"
problem.  Some form of connection pooling could well help.

BTW, I hope you've updated to the latest 8.3.x by now.  If not, you
should expedite that.


Yes, I updated to 8.3.10, partly to see if it would solve this problem.

I'm not clear on how connection pooling would help this problem.  I would have 
100 lightweight backends, whether they were pooled or not, always sitting 
around.  Or are you suggesting that I not use Apache::DBI to maintain 
persistent connections, and instead rely on the connection pooler to provide 
fast connect/disconnect from Postgres?

Thanks,
Craig




Re: [ADMIN] password administration

2010-08-05 Thread Craig James

On 8/5/10 12:58 PM, Mark Steben wrote:

I would like to set up a facility that enforces password changes for roles
after a predefined period (30 days, for instance) when logging into psql,
or, at the very least, sends an email out to notify that your current
password period is about to expire.  Preferably, I'd like to use
the 'rolvaliduntil' column in pg_roles.

I'm wondering if there is an app inside or outside of Postgres
that I can use, or do I have to design one from scratch.


This is an off-topic response, but security experts have said that this is a 
REALLY bad idea.  It forces people to keep choosing new passwords, which means 
they can't remember them.  So what do they do?  They write them down.  Anyone 
snooping around their office can find them.

Besides, when a password is stolen, it's usually used within minutes.  Making 
everyone change passwords every month does no good at all.

A better solution is to implement a password-strength algorithm and require 
people to select decent passwords to begin with.
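
As for the notification half of the original question, rolvaliduntil can be 
queried directly.  A minimal sketch that a cron job could wrap in an email (the 
seven-day window is only an example):

  SELECT rolname, rolvaliduntil
    FROM pg_roles
   WHERE rolvaliduntil IS NOT NULL
     AND rolvaliduntil < now() + interval '7 days';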

Craig



Re: [ADMIN] How to move a database from HP server to Linux Server that had already one database.

2010-08-10 Thread Craig James

On 8/9/10 5:14 PM, Bruce Momjian wrote:

ENGEMANN, DAYSE wrote:

Hi Kevin,
Sorry to disturb you, but I am really new to this...
Let me see if I understood...

  pg_dump -h sourcemachine -U sourceuser source_dbname | psql target_dbname


Has anyone done any measurement of whether it is faster to do the dump
on the local machine with psql remote or from a remote machine (where
psql would be local)?


It probably depends more on your network speed.  If you have a slow network, 
then run pg_dump on the machine where the database lives and use compression 
(--format=c) and restore using pg_restore.  It will cut WAY down on the amount 
of data that has to move across the net.  If you have a fast network, then it 
doesn't matter very much -- you'll be limited by disk speed.

Craig



Re: [ADMIN] Tuple changes from relfilenodes

2010-08-27 Thread Craig James

On 8/27/10 9:48 AM, Łukasz Brodziak wrote:

I'm looking for actual versions of row data. What I want to achieve as a final 
result is a kind of data change history.


What about a before-update trigger and a history table?
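
A minimal sketch of that idea, with hypothetical names ("mytable" stands in for 
whatever table needs the history):

  CREATE TABLE mytable_history (LIKE mytable);
  ALTER TABLE mytable_history ADD COLUMN changed_at timestamptz DEFAULT now();

  CREATE OR REPLACE FUNCTION mytable_save_old_row() RETURNS trigger AS $$
  BEGIN
      -- save the row as it looked before the update
      INSERT INTO mytable_history SELECT OLD.*, now();
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER mytable_history_trig
      BEFORE UPDATE ON mytable
      FOR EACH ROW EXECUTE PROCEDURE mytable_save_old_row();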

Craig


 > Date: Fri, 27 Aug 2010 08:25:26 -0500
 > From: kevin.gritt...@wicourts.gov
 > To: lukasz.brodz...@hotmail.com; pgsql-admin@postgresql.org
 > Subject: Re: [ADMIN] Tuple changes from relfilenodes
 > Are you looking for statistics or the actual prior versions of rows?
 >
 > -Kevin
 >





Re: [ADMIN] Tuple changes from relfilenodes

2010-08-27 Thread Craig James

On 8/27/10 10:09 AM, Lukasz Brodziak wrote:

What do you mean by a history table? Creating such a table is out of the question.


You asked for "actual versions of row data" and a "data change history."  How 
can you expect to get a history without the history being stored somewhere?  The previous answers 
to your question already said that the old versions of data can be erased almost immediately, so if 
you really need the history, you have to store it yourself.

Why is that out of the question?  Perhaps a more detailed explanation of what 
you are trying to achieve would help.

Craig



 > Date: Fri, 27 Aug 2010 10:04:39 -0700
 > From: craig_ja...@emolecules.com
 > To:
 > Subject: Re: [ADMIN] Tuple changes from relfilenodes
 >
 > On 8/27/10 9:48 AM, Łukasz Brodziak wrote:
 > > I'm looking for actual versions of row data. What I want to achieve as a 
final result is a kind of data change history.
 >
 > What about a before-update trigger and a history table?
 >
 > Craig





Re: [ADMIN] [NOVICE] - SAN/NAS/DAS - Need advises

2010-09-07 Thread Craig James

On 9/7/10 12:06 PM, Jesper Krogh wrote:

On 2010-09-07 20:42, Scott Marlowe wrote:

With the right supplier, you can plug in literally 100 hard drives to
a regular server with DAS and for a fraction of the cost of a SAN.

Ok, recently I have compared prices a NexSan SASBeast with 42 15K SAS drives
with a HP MDS600 with 15K SAS drives.

The first is 8gbit Fibre Channel, the last is 3Gbit DAS SAS. The
fibre channel version is about 20% more expensive per TB.

So of course it is a "fraction of the cost of a SAN", but it is a
fairly small one.


Are you really comparing equal systems?  "8gbit Fibre Channel" means a single Fibre 
Channel shared by 42 disks, whereas "3GBit DAS SAS" means 42 3gbit channels running in 
parallel.  It seems like you'd really need some realistic benchmarks that emulate your actual 
server load before you'd know how these two systems compare.

Craig




Re: [ADMIN] incrementing updates and locks

2010-09-16 Thread Craig James

On 9/16/10 3:54 PM, Aras Angelo wrote:

Hello All

I have a column in my table which is incrementally updated.


Try to give us more details...

Does the column need to have contiguous values, or are "gaps" ok?  That is, does it 
have to be 1,2,3,4,...,N-1,N or is it ok to have something like 1,3,4,7,...,M (where M>N) 
for N rows?

Is the value updated every time the row is changed, or is it set once only?

If gaps are OK, then a sequence is a simple answer.

Craig



Re: [ADMIN] incrementing updates and locks

2010-09-16 Thread Craig James

On 9/16/10 4:17 PM, Aras Angelo wrote:

Daniel, Craig

The gaps are not really expected. It is set once only.
It's about printing packing slips for ecommerce orders. We have the ORDER ID 
sequence, but so many different stations are accessing these orders that if my 
station prints the next 100 orders from the queue, I'd like to give them values 
starting from MAX(print_number_sequence) so far, plus 1, 2, 3, ... 100.

I hope this makes it clearer. I think a sequence can work. My concern was 
performance, as in the actual programming LOOP: querying the max field, 
assigning the row number, reissuing the max field. A sequence, I guess, would 
perform better than a regular table index?


Sequences are very fast. Just do something like this:

  update mytable set order_id = nextval('order_id_seq') where ...

Using this technique, you will only get gaps in the sequence if something goes 
wrong and the transaction is rolled back.  But given what you have told us 
about your application, it is probably not very important if a few ORDER ID 
numbers are missing.
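
For the batch-of-100 case you describe, a minimal sketch with invented names 
(this only guarantees the numbers are distinct; it doesn't stop two stations 
from grabbing the same orders):

  CREATE SEQUENCE print_number_seq;

  -- stamp the next 100 unprinted orders; nextval() hands each row a distinct value
  UPDATE orders
     SET print_number = nextval('print_number_seq')
   WHERE order_id IN (SELECT order_id FROM orders
                       WHERE print_number IS NULL
                       ORDER BY order_id LIMIT 100);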

Craig


On Thu, Sep 16, 2010 at 4:06 PM, Craig James <craig_ja...@emolecules.com> wrote:

On 9/16/10 3:54 PM, Aras Angelo wrote:

Hello All

I have a column in my table which is incrementally updated.


Try to give us more details...

Does the column need to have contiguous values, or are "gaps" ok?  That is, does 
it have to be 1,2,3,4,...,N-1,N or is it ok to have something like 1,3,4,7,...,M (where 
M>N) for N rows?

Is the value updated every time the row is changed, or is it set once only?

If gaps are OK, then a sequence is a simple answer.

Craig









Re: [ADMIN] Default session timeout in PG?

2010-10-04 Thread Craig James

On 10/4/10 10:36 AM, Lou Picciano wrote:

(Think I've seen the answer to this already, but:) is there any 'innate', or
 'default' session timeout function built in to PG?

We have certain clients who seem to timeout pretty frequently, while
 others seem to never time out at all; before we start diagnosing networks, I 
want to be sure there's nothing we can do server-side to ameliorate/control the 
problem.


We discovered the other day that a whole class of Netgear routers (designed for 
small businesses) contain a bug: they cut off all idle TCP/IP connections after 
5 minutes, and there is no way to change this behavior.  We threw them away.

Craig



Re: [ADMIN] restore

2010-10-05 Thread Craig James

On 10/5/10 11:08 AM, Dinesh Bhandary wrote:

Hi All,

Is there an easy way to restore to a new table where the column name has been 
changed but the data remains the same?
For example, I am trying to restore from an existing system, table1(col1) to 
table1(col2), and it is erroring out on the new column name even though it is a 
data-only restore. I was just wondering if there is a quick way to bypass this. 
Let me know.


If your data file is plain text with COPY or INSERT commands, do this using 
psql:

begin;
alter table table1 rename col2 to col1;
\i datafile.sql
alter table table1 rename col1 to col2;
commit;

Craig



Re: [ADMIN] Connect to a server with SSL encrypted connection?

2010-10-14 Thread Craig James

On 10/14/10 3:59 AM, Sebastien wrote:

Hello!

I'm writing to get some clues about PostgreSQL administration, and more 
precisely SSL connections from one server to another with certificate 
identification. I must underline that I'm new to PostgreSQL and server 
administration.

Here is the story:

My main server has an IP XXX.XXX.XXX.XXX. I installed on it PostgreSQL 9.0.1 and OpenSSL. 
I did put a line in pg_hba.conf to allow connection from users on the local network 
(connection to the local server "MyMainServer").

I have a secondary server located elsewhere ("MySecondServer"), on which I want to 
maintain a copy of a database of my main server "MyDatabase" (through dumps), and to have 
access from this place (with the IP YYY.YYY.YYY.YYY) to my PostgreSQL on MyMainServer. I installed 
PostgreSQL 9.0.1 and OpenSSL on this server as well. Both servers run on windows.

I need the connection to be as secure as possible, allowing connections only from 
"MySecondServer" to "MyMainServer".

 From what I found in the documentation, it seems that a SSL encrypted 
connection with identification of client and server with certificates would fit 
my needs.

Now, beginning of the questions... Please tell me if anything is wrong or could 
be improved...


1) The line to add to my pg_hba.conf on "MyMainServer" :

hostssl MyDatabase MyUserName YYY.YYY.YYY.YYY/32 cert

This is to only allow SSL connections from YYY.YYY.YYY.YYY for user MyUserName 
with certificate.

Question n° 1 : on page http://developer.postgresql.org/pgdocs/postgres/ssl-tcp.html they state "The clientcert 
option in pg_hba.conf is available for all authentication methods (...)", and also "place certificates of the 
certificate authorities (CAs) you trust in the file root.crt in the data directory, and set the clientcert parameter to 
1 on the appropriate hostssl line(s) in pg_hba.conf". Is it something different from the "cert" option 
that I added to the end of the line? I've seen no option called "clientcert" for pg_hba.conf elsewhere.

Now comes the big trouble... Generating the certificates. How should I sign 
them in this case?


2) Creation of my server certificate:

 > openssl req -new -text -out server.req

I enter for "Common Name": "MyMainServer", right? Then,

 > openssl rsa -in privkey.pem -out server.key

"A self-signed certificate can be used for testing (...) If all the clients are 
local to the organization, using a local CA is recommended.".

Okay, from here it becomes more difficult. Does it mean that I should set up a local CA on 
"MySecondServer" to sign the certificate of "MyMainServer"?

So when I connect to "MyMainServer", (from YYY.YYY.YYY.YYY), it sends me its certificate, 
and this one is checked from my local CA on "MyMainServer"?

If it's true, could you tell me how to do that? What are the steps to create my 
local CA, then sign my server certificate with it?


3) Creation of my client certificate... I guess it's the same thing,

 > openssl req -new -text -out client.req

I enter for "Common Name": "MyUserName", right? "The cn (Common Name) attribute of 
the certificate will be compared to the requested database user name, and if they match the login will be 
allowed. User name mapping can be used to allow cn to be different from the database user name.".

Then, should I also remove the passphrase with "openssl rsa -in privkey.pem -out 
client.key"?  I guess not, so that this passphrase will protect the client key, but 
it will be asked for every time I connect, is that true?

So, now can I also sign the client certificate with the same local CA that you 
helped me set up just before? How, please?


4) If I understand well, I now have one certificate for the server, and one 
for the client, both signed with my local CA.

Then, I should add the certificate of my local CA in "root.crt" of "MyMainServer", true? 
What does it mean, copying the content of the CA certificate file into "root.crt"?

"server.crt" and "server.key" previously created on step 2 must also be put on the "data" 
folder on "MyMainServer", true?

Ok, I guess that at this point "MyMainServer" is well configured, true?


5) Now, for the client... How does it work on this part? From what I understand:

On "MySecondServer", I must put my local CA certificate in "root.crt", what 
does it mean, copying the content in this file?

Then, in "postgresql.crt", put my client certificate created before ("client.crt"). I guess that's 
all I have to do, true? "MySecondServer" is ready to connect to "MyMainServer"?

Now, how can I set sslmode to "verify-full" as indicated on 
http://developer.postgresql.org/pgdocs/postgres/libpq-ssl.html ? I couldn't find this 
option anywhere...


6) Okay, almost done...

Now I want to make a dump of "MyDatabase" on "MyMainServer" from "MySecondServer". How can I 
make this connection? What is the host that I should write to connect to "MyMainServer", IP 
XXX.XXX.XXX.XXX, with user "MyUserName"?

 > pg_dump -c -h  -p 5432 -U "MyUserNam

Re: [ADMIN] Pre-Allocate tablespace on disk

2010-11-15 Thread Craig James

On 11/15/10 11:42 AM, Chris Ruprecht wrote:

I was wondering if there is a way to pre-allocate tablespace on disk before
 adding data and indexes.
My understanding is:
PG writes data into files sequentially. If more space is needed, disk space
is requested from the OS and if there is space, the OS will give PG a file
 system block and PG will write data/indexes until that block is full and so on.
Other databases that I have worked with before, and that I'm still working with,
allow you to pre-allocate disk space so you get huge chunks of contiguous
space at once, which has a major impact on database performance.


This shouldn't really be a performance issue on modern Unix-type systems.  Most 
file systems have fairly smart block-allocation algorithms that tend to 
allocate large blocks of contiguous space to files even when lots of processes 
are asking for little chunks in interleaved requests.  It doesn't just allocate 
a block to the next guy who happens to ask.  It's a lot smarter than that.  In 
most cases, you'll barely be able to tell the difference between a file that 
was allocated all at once and one that was allocated on an as-needed basis.  
The only time you get bad fragmentation is when your disk gets nearly full, 
which takes away the file system's ability to be clever.

If you really want to preallocate a file, build a separate partition on your 
disk and only put one database (or even just one table) on that file system.



Re: [ADMIN] binary logs: a location other than pg_xlog??

2010-11-22 Thread Craig James

On 11/22/10 10:42 AM, James Cloos wrote:

"TL" == Tom Lane  writes:


TL>  It *is* configurable: make pg_xlog a symlink.

I notice that everyone suggests using a symlink, but I never see anyone
suggest just mounting a filesystem there.

Is there a (technical) reason for that?  Or is a symlink just thought to
be easier to explain?


I think it's a case of, "If you know how to do this, you don't have to ask."

Craig



Re: [ADMIN] Best Linux filesystem for Postgres data store ?

2010-12-01 Thread Craig James

On 12/1/10 7:54 AM, Kevin Grittner wrote:

"Lello, Nick"  wrote:


What is considered the best filesystem to use for postgres data
stores ?


I held off for a bit to see if someone else would jump in with a
comparison of filesystems, but so far nobody has taken the bait.
The reason I held off is that the only reasonable answer I can think
of is, "It depends."

Personally, I would stay away from any filesystem without journaling
for allocations; which, for example, leaves out ext2.  For various
reasons I would also avoid ReiserFS.  We've had good luck with xfs
in our shop.


We used xfs with good results for a couple years.  We recently reinstalled the 
latest Linux and Postgres on the same hardware, and our tests showed ext4 gave 
slightly better performance than xfs so we switched to ext4.  This is on a BBU 
8-disk RAID10 array with 10K SATA disks.  Just one data point...

Craig



Re: [ADMIN] FOR SHARE permissions

2010-12-13 Thread Craig James

On 12/11/10 7:47 PM, David Underhill wrote:

I have two tables.  One has a foreign key referencing a serial
 field in the other table.  I've given INSERT privilege to a role
 other than the owner, but I still can't insert into the table
 containing the foreign key unless I grant the /owner/ of the table
 UPDATE privilege on the table containing the referenced field.

I don't quite understand why the /owner/ needs to have UPDATE
 permission in order for another distinct role (with INSERT
 privilege) to be able to insert a row in this case.


I don't know about the specifics of the Postgres implementation, but this makes 
sense from a security point of view.

When you insert into the second table, you're effectively "locking" the referenced 
row in the referenced (first) table, making it so that the owner of that table can no 
longer delete that row.  You ARE updating that table.  You're not inserting or deleting 
data from it, but you are changing what the owner can do to it.  In other words, you're 
updating the owner's ability to delete from and update the referenced table.
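
A hypothetical reconstruction of the situation (table and role names invented), 
for anyone who wants to reproduce it:

  CREATE TABLE parent (id serial PRIMARY KEY);
  CREATE TABLE child  (parent_id integer REFERENCES parent(id));
  CREATE ROLE inserter LOGIN;
  GRANT INSERT ON child TO inserter;
  -- per the report above, INSERTs into child as "inserter" failed until the
  -- referenced table was made writable, because the FK check locks the
  -- matching parent row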

Craig



Re: [ADMIN] Securing a remotely accessible PostgreSQL server

2010-12-22 Thread Craig James

Josh  wrote:

I am looking for suggestions on how best to secure a server that
is accessible via the internet. Even account creation for the
database is open to the world. Does anybody have any extra changes
they would make to postgresql.conf or OS changes they would
suggest? Perhaps some default permissions that would be best
revoked?

The system setup is currently a Linux box running PostgreSQL 8.4
My pg_hba.conf already limits remote connections to one database
and one particular role.


You don't give any details about your users or how/why they need this access so 
it's hard to give good advice.  But one possibility is to use SSH tunneling, so 
that your users have to log in to your server first using a protocol that's 
pretty secure.

   ssh -L5432:localhost:5432 u...@host.com

Then the user connects locally instead of directly.  On the user's computer:

   psql -h localhost dbname

We've used this technique when a developer had to work from a remote location.  
There is no direct access to Postgres at all, yet you can work remotely and 
securely.

Craig



Re: [ADMIN] Postgres Backup Utility

2011-01-20 Thread Craig James

On Thu, Jan 20, 2011 at 10:42 AM, Bradley Holbrook
  wrote:

Thanks Scott... a couple comments.

Our developers never decide what goes to where... they just happily
plumb away on the development db until we're ready to take our product
to testing (at regular intervals); once QA is passed, we wish to apply
these to live.

We have several diff tools and sync tools, but they take forever
(especially the ones that only go one schema at a time).

The DDL logging sounds like a sufficient solution; can it be
configured to only record create and alter commands (or create-or-replace
commands on functions, or updates on sequences, etc.)? I'd likely write
a script to have this emailed to me at the end of every day. I'm going
to google DDL logging (never heard of it), but any good resources off
the top of your head?

It's basically logging anything that changes the structure of
the database.  It would be easy enough to grep out what you
do and don't want later.


Martin French is right though: ask your developers to write down all
their SQL struct changes and they look at you funny... and being a
developer myself I'd look at me funny. If you forget just once you're
screwed into a day of sifting through tables and code.


I've worked in three different shops now as a dev-dba and
sysadmin, and in all three, all DDL changes had to be
committed and/or handed over to the DBAs.  Period.  Look
funny all they want, they either give up the DDL or their
code doesn't get pushed off dev servers onto anything else.
At the very least they should be able to tell you which
tables changed to go with which code changes, or you're not
sure what code you can and can't push.  I get both of your
points on this, but it's a discipline issue that needs sorting
out with the developers if you want to have reproducible DDL
changes in all your systems that match the code changes.



Completely agree with Scott.


Me too. The idea that a developer can change the schema without telling anyone 
is laughable. The idea that someone else has to reverse engineer schema changes 
is ludicrous.

We have a strict procedure that every schema change has to be accompanied by a 
script that applies the changes. It has to be checked into Subversion along 
with all the other code.

Make a new rule: Developers have to document schema changes. Institute a 
zero-tolerance policy for omissions.  If a developer can create the DDL to make 
the change, surely it's not too much trouble to spend another minute adding it 
to a file and checking that file in.

Craig



Re: [ADMIN] mls selinux and postgres

2011-04-09 Thread Craig James

On 4/9/11 9:24 AM, H S wrote:

Dear Sirs,

I would like to add SELinux MLS to my Postgres DBMS.
I have just installed the mls package and mcstrans on my Fedora 14, but after 
relabelling my whole file system I do not have any graphical user interface, and 
must log in to my system in text mode (runlevel 3).
Can anyone help me: is it possible to log in to the system after relabelling the 
file system with the MLS policy, at the previous runlevel, for example runlevel 5?

I connected my Postgres to SELinux with the targeted policy instead of mls, but 
now I would like to know how it is possible in MLS mode for SELinux.

I would like to hear any from you.

SARA


If you can't even log in to your computer, it's not a question for a Postgres 
discussion group.  You need to find a Linux help forum and ask your question 
there.

Craig


[ADMIN] select for update

2011-04-22 Thread Craig James

I thought I understood "select ... for update," but maybe not.

We have a number of separate databases and a unique integer identifier that's supposed to 
be global across all databases.  A single "archive" database is used to issue 
the next available ID when a process wants to create a new object.  The sequence of 
operations goes like this (pseudo-code):

  /* start a transaction */
  begin;
 
  /* see if an objectid has been returned for re-use */

  select objectid from archive where db_id is null limit 1 for update

  /* no ID available?  issue a new one */
  if (objectid is null)
new_id = select nextval('object_id_sequence')
insert into archive(objectid, db_id) values('new_id', 'new_id')

  /* ID available? claim it */
  else
    update archive set db_id = this_db_id where objectid = (the objectid selected above)

  commit

The problem is that very occasionally the same ID will be issued twice.  I don't see how 
this can be.  Doesn't the "for update" guarantee that no other process can 
claim that same row?

Thanks,
Craig



Re: [ADMIN] select for update

2011-04-22 Thread Craig James

On 4/22/11 1:58 PM, Tom Lane wrote:

Craig James  writes:

select objectid from archive where db_id is null limit 1 for update

The interaction between LIMIT and FOR UPDATE changed in 9.0 ... what
PG version are you using?

8.4.4

thanks,
Craig

regards, tom lane






Re: [ADMIN] select for update

2011-04-23 Thread Craig James

On 4/22/11 8:17 PM, Tom Lane wrote:

Craig James  writes:

On 4/22/11 1:58 PM, Tom Lane wrote:

Craig James   writes:

select objectid from archive where db_id is null limit 1 for update

The interaction between LIMIT and FOR UPDATE changed in 9.0 ... what
PG version are you using?

8.4.4

Well, note what it says in the 8.4 SELECT reference page:

Caution

It is possible for a SELECT command using both LIMIT and FOR
UPDATE/SHARE clauses to return fewer rows than specified by
LIMIT. This is because LIMIT is applied first. The command
selects the specified number of rows, but might then block
trying to obtain a lock on one or more of them. Once the SELECT
unblocks, the row might have been deleted or updated so that it
does not meet the query WHERE condition anymore, in which case
it will not be returned.

I think what's probably happening to you is you're getting a NULL not
because there isn't a matching row, but because somebody is updating the
first matching row underneath you and then the LIMIT prevents finding
any other matches.  However, that pseudo-code is too pseudo to tell
whether this theory is correct.

Thanks, it sounds like this is exactly what's happening.  It happens very 
rarely (a few times per month), so this makes sense.

I think I just need a two-step approach:

  $object_id = $dbh->selectrow_array("select min(objectid) from archive where db_id is null");
  if ($object_id) {
    $db_id = $dbh->selectrow_array("select db_id from archive where objectid = $object_id for update");
    ... double check that db_id is still NULL, repeat if someone else grabbed it ...
  }
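
At the SQL level the same two-step idea looks roughly like this; a sketch only, 
with :candidate and :this_db_id standing for values the application holds:

  begin;
  -- step 1: find a candidate without locking anything
  select min(objectid) from archive where db_id is null;
  -- step 2: lock just that row and re-check it before claiming it
  select db_id from archive where objectid = :candidate for update;
  -- if db_id is no longer null here, someone else won the race: rollback and retry
  update archive set db_id = :this_db_id where objectid = :candidate;
  commit;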

(9.0 handles these situations in a less unintuitive fashion, btw.)

We'll be migrating soon, thanks.

Craig



[ADMIN] Who is causing all this i/o?

2011-05-20 Thread Craig James

Our development server (PG 8.4.4 on Ubuntu server) is constantly doing 
something, and I can't figure out what.  The two production servers, which are 
essentially identical, don't show these symptoms.  In a nutshell, it's showing 
10K blocks per second of data going out, all the time, and essentially zero 
blocks per second of input.

To start with, if I turn off Postgres, everything is quiet:

$ /etc/init.d/postgresql stop
Stopping PostgreSQL: server stopped
ok
[radon:root] /emi/logs/londiste # vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 126140 1115944 112692 1046645200   132   82913  2  1 97  1
 0  0 126140 1116068 112692 1046645200 0 0  150  175  0  0 100  0
 0  0 126140 1116184 112708 1046645600 094  123  153  0  0 100  0
 0  0 126140 1116432 112708 1046645600 0 0   96  126  0  0 100  0
 0  0 126140 1116928 112708 1046645600 0 0   94  111  0  0 100  0
 0  0 126140 1116688 112708 1046645600 0 0  114  144  0  0 100  0
 0  0 126140 1116364 112708 10466456001874  229  266  1  0 99  0
 0  0 126140 1116488 112708 1046649200 0 0  111  138  0  0 100  0
 1  0 126140 1116868 112716 1046649200 020   96  121  0  0 100  0

But when I restart Postgres, it looks like this, and runs this way 24/7.  Notice the 
"bo" column:

$ /etc/init.d/postgresql start
Starting PostgreSQL: ok
$ vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 126140 1078020 112744 1048065200   132   82913  2  1 97  1
 0  1 126140 1054320 112744 104877920052 10714  900  976  2  1 97  0
 0  1 126140 1015988 112760 1051106000  1828  9620 1491 3527  5  2 87  6
 0  1 126140 1008316 112768 1051793600   672 10994  982 1209  1  2 86 11
 0  0 126140 1017780 112776 1052882400  1466 10746 1003 1428  2  1 89  8
 0  0 126140 1013324 112784 1053368000 0  9718  766  858  1  0 99  0
 1  0 126140 991600 112784 1053922000 0 10714  962 1127  2  1 97  0
 0  0 126140 24 112788 1054438000 6 10726  918 1227  2  1 97  0
 0  0 126140 995700 112800 1054881600 0  9744  823  889  1  1 99  0
 2  0 126140 964560 112804 1055404800 2 10718  903 1196  3  1 96  0

That seems to be a lot of output data going somewhere, but there doesn't seem 
to be anything going on that should cause it:

postgres=# select datname, procpid, usename, substr(current_query,0,25) 
current_query, waiting from pg_stat_activity;
  datname   | procpid | usename  |  current_query   | waiting
------------+---------+----------+--------------------------+---------
 postgres   |   12927 | postgres | select datname, procpid, | f
 global |   12185 | postgres || f
 accounting |   12367 | postgres || f
 orders |   12225 | orders   || f
 archive|   12267 | postgres || f
(5 rows)

After a lot of digging around, I found this in the /postgres/pg_stat_tmp 
directory.  If I list the directory including the i-nodes once every second, I 
find that a new 2MB file is being created roughly once every two seconds:

# ls -li
total 2384
69731459 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731258 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731459 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731459 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 3332
69731258 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
69731459 -rw--- 1 postgres postgres  970752 2011-05-20 14:07 pgstat.tmp
# ls -li
total 2384
69731459 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731258 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731258 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat
# ls -li
total 2384
69731258 -rw--- 1 postgres postgres 2437957 2011-05-20 14:07 pgstat.stat

This is a development server, so there are no queries coming in from the web or 
other random activities.  Nothing is going on.

The other two servers are configured identically.  If I diff the configuration files, the 
only difference is the IP addresses for the "listen" section.

Can anyone tell me what's going on?  Why is pgstat.stat being rewritten on this 
server constantly and not on the other two servers?

Thanks,
Craig




Re: [ADMIN] Who is causing all this i/o?

2011-05-20 Thread Craig James

On 5/20/11 3:38 PM, Joshua D. Drake wrote:



The other two servers are configured identically. If I diff the
configuration files, the only difference is the IP addresses for the
"listen" section.

Can anyone tell me what's going on? Why is pgstat.stat being rewritten
on this server constantly and not on the other two servers?


Do you have something monitoring the stats of the system aggressively? (like 
pg_stat_user_tables?)

No, there is nothing like that running on this server.  Just Apache, and Apache 
can only be accessed from inside the company, and its logs showed no activity 
at all.

Thanks,
Craig


JD




Thanks,
Craig










Re: [ADMIN] Who is causing all this i/o?

2011-05-20 Thread Craig James

On 5/20/11 4:25 PM, Scott Marlowe wrote:

On Fri, May 20, 2011 at 3:14 PM, Craig James  wrote:

Our development server (PG 8.4.4 on Ubuntu server) is constantly doing
something, and I can't figure out what.  The two production servers, which
are essentially identical, don't show these symptoms.  In a nutshell, it's
showing 10K blocks per second of data going out, all the time, and
essentially zero blocks per second of input.
After a lot of digging around, I found this in the /postgres/pg_stat_tmp
directory.  If I list the directory including the i-nodes once every second,
I find that a new 2MB file is being created roughly once every two seconds:

Have you got a lot of databases in your development environment?  I
think that can sometimes cause a lot of pg_stat writes.

Yes.  The production servers have a dozen or so databases, but the development server has 
a couple hundred databases. Does that count as "a lot of databases"?

We've had even more databases in the past (>500) and didn't see this sort of 
I/O activity.

The odd thing is that this activity is gradually growing, so slowly that you don't notice 
it right away.  A month or two ago, nothing.  A couple weeks ago, a constant stream of 
"grass" in xload's graph.  Today, xload is showing a 50% all the time.  I can 
only guess that the load will continue to increase.

Is there a way to tell Postgres not to do this, or to do it less?  It's not a 
big disaster (yet), but it's annoying to have the system spewing a couple 
megabytes of pg_stat_tmp data every few seconds.

Thanks,
Craig



Re: [ADMIN] Who is causing all this i/o?

2011-05-21 Thread Craig James

On 5/21/11 8:11 AM, Tom Lane wrote:

Craig James  writes:

On 5/20/11 4:25 PM, Scott Marlowe wrote:

On Fri, May 20, 2011 at 3:14 PM, Craig James   
wrote:

Our development server (PG 8.4.4 on Ubuntu server) is constantly doing
something, and I can't figure out what.  The two production servers, which
are essentially identical, don't show these symptoms.  In a nutshell, it's
showing 10K blocks per second of data going out, all the time, and
essentially zero blocks per second of input.
After a lot of digging around, I found this in the /postgres/pg_stat_tmp
directory.  If I list the directory including the i-nodes once every second,
I find that a new 2MB file is being created roughly once every two seconds:

Have you got a lot of databases in your development environment?  I
think that can sometimes cause a lot of pg_stat writes.

Yes.  The production servers have a dozen or so databases, but the development server has 
a couple hundred databases. Does that count as "a lot of databases"?

Yeah.  I think what is happening is that the autovacuum launcher is
visiting every database, doing accesses to the system catalogs (and not
much more than that), which results in access-count updates in the stats
collector, which have to get written to disk.

What's not apparent however is why the stats collector is writing disk
so much.  8.4 does have the logic change to not write stats out unless
something is asking to see them.  So either it's really pre-8.4,

$ ./postgres --version
postgres (PostgreSQL) 8.4.4


  or you
have a monitoring task that is constantly asking to see stats.

Nothing like that. There's virtually nothing on this server but Postgres and 
Apache.  There are some fast-cgi perl programs that stay connected to Postgres 
all the time, but the constant I/O starts up even if I kill Apache.

One possible band-aid solution is to increase autovacuum_naptime.  This
is defined as the AV cycle time *for each database*, so AV wakes up and
touches another database every naptime/#databases seconds.

I'll try this.  Since it's a development system, it tends to get large bursts 
of loading followed by a bit of searching.  Cutting down on vacuums should be 
fine.

   If your
number of databases has been growing over time, this would probably
explain why the problem is getting worse.

For the last few months the number of databases has been almost static.  Maybe 
five added (of 200) in the last three months, and probably a few others dropped.

Thanks,
Craig

regards, tom lane




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Postgresql 8.4 replication using Londiste

2011-06-08 Thread Craig James

On 6/8/11 12:39 PM, Maria L. Wilson wrote:

Interested in using Londiste to set up replication of only one column in a 
particular database to another database on a remote Postgres server.  I've 
installed and configured londiste on the servers - that was no problem.  I just 
don't see anything in the docs that specify how to replicate only one column - 
not the entire table.  Anybody out there have any expertise in this that can 
point me in the right direction?

Londiste is a table-level replicator.

You might be able to create one table with the column you want replicated (plus 
a primary key), another table with the rest of the columns, and a view to give 
your app the original view.  Even so, you have to replicate the one column plus 
the PK, unless the data you're replicating is unique and can be its own PK.

On the other hand, I've noticed that Londiste doesn't know when you change a 
table.  You might be able to create your one-column table (plus PK), start 
replication with Londiste, then do ALTER TABLE to add the rest of the columns.  
A hack for sure.

Craig

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] tsvector limitations

2011-06-14 Thread Craig James

On 6/14/11 1:42 PM, Tim wrote:

So I ran this test:
unzip -p text.docx word/document.xml | perl -p -e 's/<.+?>/\n/g;s/[^a-z0-9\n]/\n/ig;'|grep 
".." > text.txt
ls -hal ./text.*
#-rwxrwxrwx 1 postgres postgres 15M 2011-06-14 15:12 ./text.docx
#-rwxrwxrwx 1 postgres postgres 29M 2011-06-14 15:17 ./text.txt
mv /tmp/text.* /var/lib/postgresql/9.0/main/
cd ~/;psql -d postgres
#psql (9.0.4)
CREATE DATABASE test;
\q
cd ~/;psql -d test
CREATE TABLE test(title VARCHAR(256), data OID, words TSVECTOR);
INSERT INTO test VALUES (  'text.docx',  LO_IMPORT('text.docx'),  
TO_TSVECTOR(pg_read_file('text.txt' ,0, 1))  );

and I got this:
#ERROR:  string is too long for tsvector (30990860 bytes, max 1048575 bytes)

doing the math
echo "scale=3;29/(30990860/1048575)"|bc #==0.981

Indicates a (worst case) limit of searching only the first 1MB of a text file 
before you need to start building your own search maybe on top of tsvector.
The year is 2011; I don't think searching a 2MB text file is too much to expect.

The novel "Hawaii" at 960 pages is roughly 1MB.  tsvector was intended for 
documents (web pages, news articles, corporate memos, ...), not for books.  What you're 
asking for is interesting, but you can't complain that an open-source project that was 
designed for a different purpose doesn't meet your needs.

So how am I to use the PGSQL FTS as a "full text search" when the above example can only 
handle a "small or partial text search"?
If I'm not missing anything maybe the documentation should be adjusted 
accordingly.

Maybe a better question is, "So how am I to use PGSQL FTS as a "massively huge text 
search" when it was designed for nothing bigger than "huge text search"?

Any thoughts or alternatives are most welcome.

I'm curious how tsvector could be useful on a 29 MB document.  That's roughly 
one whole encyclopedia set.  A document that size should have a huge 
vocabulary, and tsvector's index would be saturated.

However, if the vocabulary in this 29 MB document isn't that big, then you might consider creating 
a smaller "document."  You could write a Perl script that scans the document and writes out a 
secondary "vocabulary" file: a list of the unique words in your document.  Create an auxiliary 
column in your database to hold this vocabulary for each document, and use tsvector to index that.  
The Perl script would be trivial, and tsvector would be happy.
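
Roughly, and untested -- the column and index names are invented, and
'vocabulary.txt' is the word list the Perl script would write into the data
directory:

  ALTER TABLE test ADD COLUMN vocab tsvector;
  UPDATE test
     SET vocab = to_tsvector(pg_read_file('vocabulary.txt', 0, 100000000))
   WHERE title = 'text.docx';
  CREATE INDEX test_vocab_idx ON test USING gin(vocab);
  SELECT title FROM test WHERE vocab @@ to_tsquery('hawaii');

Since the vocabulary file contains each word only once, the tsvector stays
small as long as the document's vocabulary does.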

Craig



Re: [ADMIN] Who is causing all this i/o?

2011-06-19 Thread Craig James

On 6/17/11 11:51 AM, Shianmiin wrote:

Tom Lane-2 wrote:

What's not apparent however is why the stats collector is writing disk
so much.  8.4 does have the logic change to not write stats out unless
something is asking to see them.  So either it's really pre-8.4, or you
have a monitoring task that is constantly asking to see stats.


We have a PostgreSQL 9.0.4 on CentOS for performance testing and we are
seeing a similar issue.
We have a "crazy" setup: 1 database with 1000 identical schemas. There are
occasional I/O write storms of over 100 MB/sec without any disk reads, and
they can last for a couple of minutes while the schemas/data are aggressively
populated by pg_restore. All the I/O writes seem to be on pgstat.tmp.

The I/O write storm seemed to be trigger by Vacuum.

Based on the advice I got from my original question, I changed autovacuum_naptime to 
"5min", and the problem completely disappeared.  (I know that's a long 
interval, but this particular server gets maybe 5-10 heavy updates per week and is idle 
the rest of the time.)
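
For reference, the change is one line in postgresql.conf followed by a
reload -- no restart needed:

  -- postgresql.conf:  autovacuum_naptime = 5min
  SELECT pg_reload_conf();
  SHOW autovacuum_naptime;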

select count(1) from pg_database ;
 count
---
   267

It seems like there's a problem somewhere.  Autovacuum has improved enormously 
in the last couple of years, but some change to its algorithm causes a lot of 
I/O thrashing when there are more than a few hundred separate databases.

Craig

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] All the functionality I need is pgsql2shp.exe-- isolated installation

2011-08-07 Thread Craig James

On 8/7/11 10:30 AM, antismarmy wrote:

Hello community,

I need to execute pgsql2shp from the command line, but I am not interested
in the other functionalities offered by postgresql on this machine. I have
been told that installing an instance of a spatial database on the machine
in question just to serve as a workaround for one utility executable is not
something that IT will approve of, due to the extra overhead of software&
DB maintenance, security configuration and server system resources...

Is there a way to isolate all of the files necessary for a specific
executable (in my case pgsql2shp.exe) from the postgresql 8.3 binaries
without using the installer program.

My colleague has already tried the following:
"I did download a copy of the postgresql 8.3 binaries without the installer
program and extracted a copy of psql.exe and put it on the application
server in d:\database_apps, along with a dll file dependency "ssleay32.dll".
I added "d:\database_apps" to the system PATH in hopes that psql.exe could
be invoked at a command prompt without specifying where it was.
Unfortunately this did not work. The executable needs something else that
isn't there, and I am afraid it might be other Windows system 32-bit dlls
that don't exist on a 64-bit version of Windows Server 2008."

Thanks,
Stacy

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/All-the-functionality-I-need-is-pgsql2shp-exe-isolated-installation-tp4675350p4675350.html
Sent from the PostgreSQL - admin mailing list archive at Nabble.com.




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] All the functionality I need is pgsql2shp.exe-- isolated installation

2011-08-07 Thread Craig James

On 8/7/11 10:30 AM, antismarmy wrote:

Hello community,

I need to execute pgsql2shp from the command line, but I am not interested
in the other functionalities offered by postgresql on this machine. I have
been told that installing an instance of a spatial database on the machine
in question just to serve as a workaround for one utility executable is not
something that IT will approve of, due to the extra overhead of software&
DB maintenance, security configuration and server system resources...

Is there a way to isolate all of the files necessary for a specific
executable (in my case pgsql2shp.exe) from the postgresql 8.3 binaries
without using the installer program.

(sorry for the last post ... here's what I meant to send.)

Get the Postgres source code, and configure it to install into your home 
directory.  The configure command has an option for this:

  ./configure --prefix=$HOME/postgres

Then follow the make and install directives.  You'll end up with a complete 
running Postgres system in your $HOME/postgres directory.  At that point, you 
can delete the source code plus everything in the $HOME/postgres/bin directory 
except the one program you want.  Set the environment variable LD_LIBRARY_PATH 
to include $HOME/postgres/lib, and you should be able to run the binaries.  You 
can probably delete some of the libraries in $HOME/postgres/lib too that aren't 
used by the program you want.

Craig

My colleague has already tried the following:
"I did download a copy of the postgresql 8.3 binaries without the installer
program and extracted a copy of psql.exe and put it on the application
server in d:\database_apps, along with a dll file dependency "ssleay32.dll".
I added "d:\database_apps" to the system PATH in hopes that psql.exe could
be invoked at a command prompt without specifying where it was.
Unfortunately this did not work. The executable needs something else that
isn't there, and I am afraid it might be other Windows system 32-bit dlls
that don't exist on a 64-bit version of Windows Server 2008."

Thanks,
Stacy

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/All-the-functionality-I-need-is-pgsql2shp-exe-isolated-installation-tp4675350p4675350.html
Sent from the PostgreSQL - admin mailing list archive at Nabble.com.




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [GENERAL] [ADMIN] Using Postgresql as application server

2011-08-16 Thread Craig James

On 8/16/11 10:24 AM, Chris Travers wrote:

I can't let this slide :-D

On Tue, Aug 16, 2011 at 9:27 AM, Evan Rempel  wrote:

Technically it can be done, but just because we can do something does not
mean we should do something. Having said that...

We have been using a middleware product that shall remain nameless,
that goes against a large commercial database that shall also remain
nameless.
The middleware has been migrating to a more and more database based code
set, and as an administrator of such a system I can state that this is
awful.

> From your description below, it truly sounds awful.  However, this
strikes me as being an issue of *which* logic is moved into the
database instead of *that* logic is being moved into the database.

In other words, it's just a typical engineering problem: pick the right tools 
for the job.  Just because you're a good mason doesn't mean you should build a 
boat out of cement.  You can do it, but aluminum, steel, fiberglass and wood 
are better choices.  Just because you are really good at database programming 
doesn't mean you should turn a database server into a web server.  You can do 
it, but it will be an expensive, suboptimal, and unmaintainable system.

There are many opinions about the exact right balance of where functionality 
should be implemented, but extreme solutions are almost never optimal.

Craig


Getting appropriate logging out of the application logic for both auditing
purposes
and trouble shooting is near impossible. Performance is nearly impossible to
tune as
everything runs inside the database. One giant process chewing up cores of
CPU power.

LedgerSMB has been moving in the direction of more logic in the
database because we have found the opposite.  Troubleshooting is
easier, code maintenance is easier, performance is easier to
troubleshoot and manage, and security is far more manageable.  Now,
granted we are retrofitting security onto a codebase which had none
when we forked, so that is a difference.  We have eliminated a
much larger number of bottlenecks by going this way than we have run
into.  Again the issue is *which* logic goes into the db, and that's
an absolutely key question when running middle tiers in the dbms.


Security is near impossible to manage as well. Again, almost everything
needs to run as
the same user. The database is now making calls to generate pdf objects and
make
printing calls.

Ouch I guess I could kinda see the PDF generation calls (I'd still
prefer a queue and transform approach), but not the printing calls.
And if you can't move security into the database, then you have a kind
of major problem:  you aren't *really* generating a consistent and
consistently enforced API in this way, and so you can't get to the
roles a middleware solution gives you.

As long as you still need the middleware, then the question really
becomes, what logic needs to be centralized in the db and what logic
is specific to each middleware application?  What do you get from
putting each thing in the database?  My recommendation is to review
that.  If it is a single app db, then use stored procs as essentially
named queries.  If it is possible, move the printing calls into a
separate process and have it signalled by the db app on database
commit.

But it really sounds like an unmaintainable mess.  IME, however, that
is avoidable while still placing the RDBMS in the center of the
logic-complete application server environment.

None of the traditional tools can be used to integrate the application into
the enterprise.
The load balancer needs to add x-forwarded headers to http requests, but the
custom http code can't handle that, so all web access appears to come from
the load
balancer. This violates regulatory requirements. Log file formats are not
standard
since none of the code is standard, this means that none of the event
correlation
tools can be used for intrusion detection etc.

Ouch.  I second the suggestion that the architecture here lacks
the separation of concerns approach necessary to make this work, and
that either the software you are using is the problem or the architect
is.  However, it doesn't tell me that the approach of using the RDBMS
as the entry point into an application server environment is
necessarily a bad thing.

Best Wishes,
Chris Travers




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


[ADMIN] Londiste won't start with dual IP addresses

2011-08-18 Thread Craig James

We had to temporarily assign two IP addresses to our servers.  After doing so 
and rebooting, Londiste will start, but it just sits there doing nothing.  The 
logfile has zero bytes, and it doesn't seem to connect to either the master or 
the slave database.

We reconfigured Postgres to listen on both addresses, but the /etc/hosts tables 
still point to the original addresses (that is, the second IP address shouldn't 
matter).  But I can connect to Postgres on either IP address on all servers.

All other Postgres applications work.  I can use psql, and our Perl DBI and PHP 
applications all work with no problems.

When I invoke "londiste.py --verbose ...", the log file says:

  2011-08018 16:35:47,250  22074 DEBUG Attaching

And that's all.  Nothing more ever happens.

Help?  Thanks!

Craig

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Londiste won't start with dual IP addresses

2011-08-19 Thread Craig James

On 8/19/11 1:40 AM, Marko Kreen wrote:

On Fri, Aug 19, 2011 at 2:44 AM, Craig James  wrote:

We had to temporarily assign two IP addresses to our servers.  After doing
so and rebooting, Londiste will start, but it just sits there doing nothing.
  The logfile has zero bytes, and it doesn't seem to connect to either the
master or the slave database.

We reconfigured Postgres to listen on both addresses, but the /etc/hosts
tables still point to the original addresses (that is, the second IP address
shouldn't matter).  But I can connect to Postgres on either IP address on
all servers.

All other Postgres applications work.  I can use psql, and our Perl DBI and
PHP applications all work with no problems.

When I invoke "londiste.py --verbose ...", the log file says:

  2011-08018 16:35:47,250  22074 DEBUG Attaching

And that's all.  Nothing more ever happens.

Help?  Thanks!

Does the exact connect string you use in londiste config work
with psql? Like so:

   $ psql -d 'connstr'

Yes.

  psql -d 'dbname=archive host=iridium user=postgres'

connects with no problem.

Thanks,
Craig





--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Londiste won't start with dual IP addresses

2011-08-19 Thread Craig James

On 8/19/11 1:40 AM, Marko Kreen wrote:

On Fri, Aug 19, 2011 at 2:44 AM, Craig James  wrote:

We had to temporarily assign two IP addresses to our servers.  After doing
so and rebooting, Londiste will start, but it just sits there doing nothing.
  The logfile has zero bytes, and it doesn't seem to connect to either the
master or the slave database.

We reconfigured Postgres to listen on both addresses, but the /etc/hosts
tables still point to the original addresses (that is, the second IP address
shouldn't matter).  But I can connect to Postgres on either IP address on
all servers.

All other Postgres applications work.  I can use psql, and our Perl DBI and
PHP applications all work with no problems.

When I invoke "londiste.py --verbose ...", the log file says:

  2011-08018 16:35:47,250  22074 DEBUG Attaching

And that's all.  Nothing more ever happens.

Help?  Thanks!

Does the exact connect string you use in londiste config work
with psql? Like so:

   $ psql -d 'connstr'

As I mentioned earlier, yes it does work.

I also discovered by looking at all the postgres processes that it's making the 
first connection (to the master), but it doesn't seem to even try to make the 
second connection.

Could there be something else it's waiting for?  The dual-network issue may be 
completely irrelevant.  I know that while the IT guy was reconfiguring the 
network, he had to do a forced-reboot at least once without shutting down 
Postgres.  Is it possible that the Londiste tables are in some invalid state, 
and the daemon is waiting for something that's never going to happen?

Thanks,
Craig


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Dumping data using pg_dump after chrooting to a different partition

2011-10-24 Thread Craig James

On 10/24/11 3:10 PM, Krishnamurthy Radhakrishnan wrote:

Hi,

I am new to PostgreSQL. We are using PostgreSQL 9.0.2 on our linux server. We 
have an instance of PostgreSQL 9.0 running using the primary partition on the 
server.

We want to use the pg_dump and psql programs to migrate the data during our 
software upgrade process. For upgrade, we plan to do the following:

  * chroot to a secondary partition on the server.
  * install the software RPMs including PostgreSQL RPMs
  * start a secondary instance of PostgreSQL DB server using a different port 
and data directory.
  * run pg_dump to dump the data from the primary instance to a file.
  * run psql to import the data from the file into the secondary instance.

However when I tried to perform the pg_dump as mentioned above, I get the 
following error:
pg_dump: [archiver (db)] connection to database "TestDB" failed: could not connect to
server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

I suspect the problem is that localhost connections on Unix/Linux are actually implemented 
as file-system sockets rather than TCP/IP sockets.  Once you chroot, that socket file 
won't exist.  Try connecting with "-h hostname" (e.g. -h myserver.domain.com) 
rather than the default localhost.  You may have to reconfigure your server 
(listen_addresses in postgresql.conf) to accept TCP/IP connections.
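
A couple of quick checks from any connection that does work will show what
the chrooted pg_dump is looking for (these are the 9.0-era setting names):

  SHOW port;
  SHOW listen_addresses;        -- must allow TCP/IP for -h to work
  SHOW unix_socket_directory;   -- empty means the compiled-in default, usually /tmp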

Craig


[ADMIN] Deadlock on "select ... for update"?

2011-11-29 Thread Craig James
Several times recently one of our databases has gotten stuck with the
following situation:

postgres=# select datname, procpid, usename, current_query from
pg_stat_activity where current_query != '';
  datname   | procpid |  usename   |                          current_query
------------+---------+------------+---------------------------------------------------------------
 emolecules |   13503 | customerdb | select tableid from hitlist_table_pool
where hitlistid <= 0 for update
 emolecules |   32082 | customerdb | select tableid from hitlist_table_pool
where hitlistid <= 0 for update
 emolecules |   17974 | customerdb | select tableid from hitlist_table_pool
where hitlistid <= 0 for update
 emolecules |   31299 | customerdb | select tableid from hitlist_table_pool
where hitlistid = 0 limit 1 for update
 emolecules |   30247 | customerdb | select tableid from hitlist_table_pool
where hitlistid = 0 limit 1 for update
 postgres   |1705 | postgres | select datname, procpid, usename,
current_query from pg_stat_activity where current_query != '';
 emolecules |   28866 | customerdb | <IDLE> in transaction
 emolecules |   21394 | customerdb | select tableid from hitlist_table_pool
where hitlistid = 0 limit 1 for update
 emolecules |   22237 | customerdb | select tableid from hitlist_table_pool
where hitlistid = 0 limit 1 for update
(9 rows)

It's obvious that they're all waiting ... but for what?  The "<IDLE> in transaction"
process looks like the culprit, but how do I figure out what it's doing?

The next time this happens, what queries can I run to help diagnose what's
going on?

This is PG 8.4.4 on Ubuntu 10.

Thanks,
Craig


Re: [ADMIN] Deadlock on "select ... for update"?

2011-11-30 Thread Craig James

On 11/29/11 10:36 AM, Scott Marlowe wrote:

On Tue, Nov 29, 2011 at 11:15 AM, Craig James
  wrote:

Several times recently one of our databases has gotten stuck with the
following situation:

postgres=# select datname, procpid, usename, current_query from
pg_stat_activity where current_query != '';
  datname   | procpid |  usename   |                          current_query
------------+---------+------------+---------------------------------------------------------------
  emolecules |   13503 | customerdb | select tableid from hitlist_table_pool
where hitlistid<= 0 for update
  emolecules |   32082 | customerdb | select tableid from hitlist_table_pool
where hitlistid<= 0 for update
  emolecules |   17974 | customerdb | select tableid from hitlist_table_pool
where hitlistid<= 0 for update
  emolecules |   31299 | customerdb | select tableid from hitlist_table_pool
where hitlistid = 0 limit 1 for update
  emolecules |   30247 | customerdb | select tableid from hitlist_table_pool
where hitlistid = 0 limit 1 for update
  postgres   |1705 | postgres | select datname, procpid, usename,
current_query from pg_stat_activity where current_query != '';
  emolecules |   28866 | customerdb | <IDLE> in transaction
  emolecules |   21394 | customerdb | select tableid from hitlist_table_pool
where hitlistid = 0 limit 1 for update
  emolecules |   22237 | customerdb | select tableid from hitlist_table_pool
where hitlistid = 0 limit 1 for update
(9 rows)

It's obvious that they're all waiting ... but for what?  The "<IDLE> in transaction"
process looks like the culprit, but how do I figure out what it's doing?
The next time this happens, what queries can I run to help diagnose what's
going on?
This is PG 8.4.4 on Ubuntu 10.

Does this help?

http://wiki.postgresql.org/wiki/Lock_Monitoring


Yes, thanks! That's exactly what I needed.

Craig


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Giving postgres roles 'sudo'-like access

2011-12-19 Thread Craig James

On 12/19/11 10:04 AM, Mario Splivalo wrote:

I need to have postgres role to be able to cancel queries run by that
same role.

I know that I can kill the client connection that started the query, but
I also need to have that role connect to postgres and kill some of it's
running queries.

It's on postgres 9.1.1, running on Debian Squeeze.

One idea that comes to mind is to have a special user account on the box
which would have sudo privileges to the script which will issue 'sh
postgres -c 'psql -c "select pg_cancel_backend()"', but before that
run checks that  is actually owned by correct role, etc, etc. But,
all that seems very fishy.

Has anyone had similar requests, and maybe solved them somehow?

Yes.

First of all, you really don't want to do this.  If there's any other way, find 
it.

In our case, we use a third-party library that sometimes gets stuck on problems 
that would take a virtually infinite time to solve, and for reasons having to 
do with NP-complete theory, you can't easily detect them ahead of time.  So we kill the stuck query from another connection.

Backend Postgres processes run as the Postgres user, so they have permission to 
kill each other.  You write an add-on function that just kills a process:

  select my_kill_backend(pid);

Naturally, this is very dangerous. There are all sorts of security 
implications.  You want to use signal 2 or 15, not 9.  And so forth.
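
If you'd rather not write C, a SECURITY DEFINER plpgsql wrapper around the
built-in pg_cancel_backend() (which sends signal 2) gets the same effect on
9.1.  This is only a sketch -- the function name is made up, and the column
names are the 9.1 pg_stat_activity ones -- but the ownership check is the
important part:

  CREATE OR REPLACE FUNCTION cancel_my_backend(target_pid integer)
  RETURNS boolean AS $$
  DECLARE
      owner name;
  BEGIN
      SELECT usename INTO owner
        FROM pg_stat_activity
       WHERE procpid = target_pid;
      IF owner IS NULL OR owner <> session_user THEN
          RAISE EXCEPTION 'process % is not one of your backends', target_pid;
      END IF;
      RETURN pg_cancel_backend(target_pid);
  END;
  $$ LANGUAGE plpgsql SECURITY DEFINER;

Create it as a superuser, revoke EXECUTE from PUBLIC, and grant it only to
the role that needs it.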

Craig

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Can't Insert from Staging Table to Production Table

2011-12-21 Thread Craig James

On 12/21/11 12:34 PM, Elliot Voris wrote:


Hi, everyone

I've got 32,404 rows of data in a staging table (marcxml_import) that I'm 
trying to get into my production table (biblio.record_entry) in my database 
(evergreen). When trying to do so, I'm getting the following error:

evergreen=# INSERT INTO biblio.record_entry (marc, last_xact_id) SELECT marc, 
'IMPORT' FROM marcxml_import;

WARNING:  could not find ParserDetails.ini in 
/usr/local/share/perl/5.10.1/XML/SAX

CONTEXT:  compilation of PL/Perl function "maintain_control_numbers"

ERROR:  Unable to provide required features

CONTEXT:  PL/Perl function "maintain_control_numbers"

evergreen=#

Any ideas what this means? Any idea how to get around it? I was asked in IRC 
what language I was working with, and it's plperlu.



Google is your friend.  Look for "ParserDetails.ini" and you find the exact 
problem you're describing:

http://perl-xml.sourceforge.net/faq/#parserdetails.ini

Craig


Thank you!

Elliot J Voris

IT Director

*Saint Louis Christian College*

evo...@slcconline.edu 

314.837.6777 x 1250





Re: [ADMIN] Establishing remote connections is slow

2012-01-17 Thread Craig James
2012/1/17 Mindaugas Žakšauskas 

> On Tue, Jan 17, 2012 at 7:23 PM, Tom Lane  wrote:
> > <..>  Mindaugas, are you using SSL,
> > and if so can you turn it off and see whether things change?
> > (It should be safe to do so at least on the "localhost" connection,
> > even if you feel your network is insecure.)
>
> No, I am not using SSL; it is either disabled or the default setting
> is off anyway. This was one of the first things I have checked.
> Moreover, this would probably make it hard to explain why does it take
> no time to establish connections immediately after PostgreSQL restart
> and why it does it degrade later.
>
> To respond to previous emails - we tried doing DNS lookups against the
> client host and they took no time.
>

Try putting the hostnames and IP addresses in /etc/hosts ... first on the
server (for the client) and then on the client (for the server).

Craig


>
> Thanks for your ideas.
>
> Regards,
> Mindaugas
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>


Re: [ADMIN] overcoming a recursive relationship in a sql statement

2012-02-01 Thread Craig James
On Wed, Feb 1, 2012 at 3:27 PM,  wrote:

> Hello,
>
> So I am not very advanced in SQL, and after a week of wild SQL
> experimenting have finally reduced my thought process to a complete  blank
> on this query...
>
> I have a list of organizations in table:organization and a one to many
> list of organizational partnerships in a second table (table:partners).
>  The 'partners' table lists each organization id that belongs to each
> partnership.
>
> The partnership relationships are recursive in that each partner is an
> 'ego' to one or more 'alter' partners, and an alter to other 'ego' partners.
>
> So I havent been able to figure out a way to select business partners
> 'alter' organizations that are distinctly related to each 'ego'
> organization.
>

I'm not positive from your description, but it seems like the concept you
are missing is that a single table can have several aliases in one SQL
query, which makes it appear to be several different tables.  Take a simple
example:  a "personnel" table with employees and supervisors.  Find
everyone with supervisor "Jones":

  select emp.name
  from personnel emp
  join personnel super on (emp.supervisor = super.name)
  where super.name = 'Jones';

Even though it's one table, you can make it look like two tables.
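
Applied to your tables it would look something like this -- I'm guessing at
the column names from your message, so adjust as needed:

  select ego.member_org    as "self",
         ego.partnerid     as "partner id",
         other.member_org  as "business partner"
  from partners ego
  join partners other on (other.partnerid = ego.partnerid
                      and other.member_org <> ego.member_org)
  order by ego.member_org, ego.partnerid;

The join condition "other.member_org <> ego.member_org" is what excludes the
ego organization from its own list of partners.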

Craig


>
>  I would like to create  SQL output that looks like this:
>
> self    partner id    business partner
> Org1    100           Org2
> Org1    100           Org3
> Org2    100           Org1
> Org2    100           Org3
> Org3    100           Org1
> Org3    100           Org2
>
>
> (The futher  complexities are that every combination of partnerships and
> the life cycle of each partnership can be a many to many relation and  each
> org may be partnered or opposed to other orgs, or  a partnership may
> eventually turn sour resulting in the partnership  being dissolved and the
> participants becoming  competitors - its endless in this regard :<)
>
> I tried a many to many tables querying ego_org and alter_org, however each
> alter has the same ego elsewhere in the table, so the normanization is
> screwed up...
>
> so i finaly went back to a one to many model shown below:
>
>
> table:org
> 
> orgid org name
> org1 xyz
> org2 abc
> org3 blah blah
> orgx etc
>
>   Table: partners
> ---
> partnerid   member_org
> 100 org1
> 100 org2
> 101 org1
> 101 org25
> 102 org2
> 102 org3
>
> table: affiliation unique constraints = (partner, competitor, ally)
> -
> affiliation_id affiliation
> 100 partner
> 101 competitor
> 102 ally
> 103 competitor
> 1xx   etc
>
>
> Each organization is connected to other org(s) as a business partner or
>  competitor. (which I ignore for the moment).
>
> I have tried a variety of self joins, and many to many joins, to no avail.
>
> I have no idea how to select "business partners" for each organization
> that do not include the ego partner.
>
> all I have so far is this ( less all the many to many joins and self
> joins attempts etc.)
>
> select p.partnum as "partner id", o.org as "self",  p.member_id as
> "business partner",a.affiliation
> from testorg o
> join partners p on o.org = p.member_id
> join
> order by p.partnum asc, o.org
>
> the sql returns a duplicate list for self (ego) and business partner
> (alter orgs):
>
> partner id    self     business partner
> 100           Org1     Org1
> 100           Org2     Org2
> 100           Org3     Org3
> 101           Org1     Org1
> 101           Org25    Org25
> 102           Org2     Org2
> 102           Org3     Org3
> 103           Org4     Org4
> 104           Org1     Org1
> 104           Org16    Org16
>
>
> I have gotten various Cartesian joins showing every org related to every
> other, but have not been able to properly filter the 'partner
> organizations' column to exclude 'ego', e.g. the organization that "others"
> are related to.
>
> Can anybody offer a solution that  selects only the partners that are
> linked to each organization?
>
> thanks very much,
>
> glenn B
>
>
>
>
>
>
>


[ADMIN] C/C++ Linker Problem in Postgres extension functions

2012-05-04 Thread Craig James
I recently recompiled our Postgres extension functions and re-encountered a
problem that I thought was solved.  When I call our function, the code just
hangs ... it sit there not executing, not using any CPU cycles.  Forever.
The process is still running, but seems to be waiting for ... what?

I originally fixed this problem by using g++ instead of gcc to invoke the
linking step for our .so shared object file.  It seems that g++ knows how
to invoke the C++ initialization code, whereas gcc doesn't.  Since Postgres
is written in C, linking a C++ library didn't work until I switched to g++
to do the linking.

However ... now it's not working any more, for no apparent reason.  The
stack trace below shows what I find when I attach to the running Postgres
process using gdb.

I ran valgrind on a test program that calls this exact same function and it
says the function is clean -- no memory errors or leaks.

Has anyone seen a problem like this when building Postgres extensions?
This is Postgres 8.4.10.

Thanks,
Craig

0x7fdfd755816e in ?? () from /lib/libc.so.6
(gdb) bt
#0  0x7fdfd755816e in ?? () from /lib/libc.so.6
#1  0x7fdfd74e13dc in ?? () from /lib/libc.so.6
#2  0x7fdfd74dfe78 in free () from /lib/libc.so.6
#3  0x7fdfd7c7e722 in ?? () from /lib64/ld-linux-x86-64.so.2
#4  0x7fdfd7c795e5 in ?? () from /lib64/ld-linux-x86-64.so.2
#5  0x7fdfd7c7f5f5 in ?? () from /lib64/ld-linux-x86-64.so.2
#6  0x7fdfd7c7a986 in ?? () from /lib64/ld-linux-x86-64.so.2
#7  0x7fdfd7c7efba in ?? () from /lib64/ld-linux-x86-64.so.2
#8  0x7fdfd7586be0 in ?? () from /lib/libc.so.6
#9  0x7fdfd7c7a986 in ?? () from /lib64/ld-linux-x86-64.so.2
#10 0x7fdfd7586d37 in __libc_dlopen_mode () from /lib/libc.so.6
#11 0x7fdfd75613f1 in backtrace () from /lib/libc.so.6
#12 0x7fdfd74cf50f in ?? () from /lib/libc.so.6
#13 0x7fdfd74d95b6 in ?? () from /lib/libc.so.6
#14 0x7fdfd74dfe83 in free () from /lib/libc.so.6
#15 0x7fdf95de56ac in
__gnu_cxx::new_allocator::deallocate (this=0xfc2b88,
__p=0xff8ab0)
at /usr/include/c++/4.4/ext/new_allocator.h:95
#16 0x7fdf95de413c in std::_Vector_base >::_M_deallocate (
this=0xfc2b88, __p=0xff8ab0, __n=2) at
/usr/include/c++/4.4/bits/stl_vector.h:146
#17 0x7fdf95de3e37 in ~_Vector_base (this=0xfc2b88, __in_chrg=)
at /usr/include/c++/4.4/bits/stl_vector.h:132
#18 0x7fdf95de2e82 in ~vector (this=0xfc2b88, __in_chrg=)
at /usr/include/c++/4.4/bits/stl_vector.h:313
#19 0x7fdf959977da in ~OBRingData (this=0xfc2b70, __in_chrg=)
at /home/cjames/openbabel-2.3.x-r4744/source/src/generic.cpp:784
#20 0x7fdf95907d83 in ~OBBase (this=0xfc1c40, __in_chrg=)
at
/home/cjames/openbabel-2.3.x-r4744/source/include/openbabel/base.h:249
#21 0x7fdf95a13c1b in ~OBMol (this=0xfc1c40, __in_chrg=)
at /home/cjames/openbabel-2.3.x-r4744/source/src/mol.cpp:3225
#22 0x7fdf95decf5f in emol_ichem_mol_dealloc (mol_ob=0xfc1c40) at
openbabel2.3/postgres_mol_openbabel.cpp:190
#23 0x7fdf95ded864 in emol_ichem_normalize (length=25,
molecule_text=0xf90bcc "Br.Br.Oc1ccc(cc1)N1CCNCC1",
intype_len=3, intype=0x7fdf95df2fa9 "SDF", outtype_len=3,
outtype=0x7fdf95df2fa9 "SDF")
at openbabel2.3/postgres_mol_openbabel.cpp:461
#24 0x7fdf95df1ef5 in emol_normalize_sdf (fcinfo=0x7fff2369dcc0) at
postgres_mol.c:553
#25 0x0054c815 in ExecMakeFunctionResult ()
#26 0x0054795c in ExecEvalExprSwitchContext ()
#27 0x005ac003 in evaluate_expr ()
#28 0x005ad932 in simplify_function ()
#29 0x005ae339 in eval_const_expressions_mutator ()
#30 0x00570129 in expression_tree_mutator ()
#31 0x005ae03c in eval_const_expressions_mutator ()
#32 0x0057068b in expression_tree_mutator ()
#33 0x005ae03c in eval_const_expressions_mutator ()
#34 0x005af3ef in eval_const_expressions ()
#35 0x0059ffcd in preprocess_expression ()
#36 0x005a28bd in subquery_planner ()
#37 0x005a2c5f in standard_planner ()
#38 0x005ebb5a in pg_plan_query ()
#39 0x005ebc44 in pg_plan_queries ()
#40 0x005ec0d7 in exec_simple_query ()
#41 0x005ed677 in PostgresMain ()
#42 0x005c15ab in ServerLoop ()
#43 0x005c3a9c in PostmasterMain ()
#44 0x0056e530 in main ()


Re: [ADMIN] Data split -- Creating a copy of database without outage

2012-05-30 Thread Craig James
On Wed, May 30, 2012 at 8:14 AM, Igor Shmain  wrote:

> Thank you, Liu, for your suggestion. 
>
>
> I might be missing something (I am new to postgres), but it seems that
> your suggestion will not help much in my case.  Since the number of db
> requests will grow with time (more users will come into the system), the
> current server will become incapable of serving all the requests quickly
> enough.
>
>
> The idea is to increase overall calculation capacity of the system by
> adding more servers to it. Throwing more hardware to a single main server
> is not an option in my case. Creating multiple replicas (slaves) is also
> not an good option – it would be way more efficient to have a group of db
> servers, each serving only some subset of users and hosting data for those
> users only. Buying new servers in advance is not an option too.
>
>
> What I am looking for is switching some of the users to another db server
> when the capacity of the existing server(s) is not enough. The point is to
> do it without interrupting the users’ work (so they do not see that
> horrible “Sorry! This site is under maintenance…”).
>

Londiste is very good at replicating a database.  It's normally used for
real-time backup or to load-balance read-only applications, but it can also
be used for the situation you describe.  At some point when you decide it's
necessary to split your database, you would create a new database on a
second server, create an empty duplicate schema (using pg_dump's
schema-only feature) then install Londiste.  Londiste would replicate your
database in real time up to the point where you were ready to make the
actual switch.

To switch, you'd simply stop Londiste and simultaneously reconfigure your
load-balancing system so that a subset of your users were directed to the
new database.  Then you could uninstall Londiste, and clean out each
database by removing the user data that is for users on the other server.

You might also consider partitioning.  If you know in advance that you're
going to be discarding large subsets of the data, it will be much more
efficient if you partition it at the outset.  When you split  your system
in two, the cleanup process will be nothing more than dropping partitions.
You won't be plagued by bloated indexes and files.
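
Since every record already carries a shard number, a bare-bones sketch of
that with inheritance-style partitioning looks like this (table and column
names are made up):

  CREATE TABLE account (
      shard    integer NOT NULL,
      user_id  integer NOT NULL,
      name     text
  );
  CREATE TABLE account_shard0 (CHECK (shard = 0)) INHERITS (account);
  CREATE TABLE account_shard1 (CHECK (shard = 1)) INHERITS (account);

  CREATE OR REPLACE FUNCTION account_insert() RETURNS trigger AS $$
  BEGIN
      IF NEW.shard = 0 THEN
          INSERT INTO account_shard0 VALUES (NEW.*);
      ELSE
          INSERT INTO account_shard1 VALUES (NEW.*);
      END IF;
      RETURN NULL;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER account_insert_trg
      BEFORE INSERT ON account
      FOR EACH ROW EXECUTE PROCEDURE account_insert();

After the split, discarding the shards that moved to the other server is
just "DROP TABLE account_shard1" -- no bloat and no long-running DELETE.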

Craig



> 
>
>
> If I missed something it would be very kind of you to point this out.
>
>
> Thank you once again,
>
> -igorS
>
>
> From: Haifeng Liu [mailto:liuhaif...@live.com]
> Sent: May-29-12 9:13 PM
> To: Igor Shmain
> Cc: pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] Data split -- Creating a copy of database without
> outage
>
>
> Why not use a partitioned table? You can write a trigger to control which
> partition the coming data should be inserted.
>
>
> Regards,
>
> 
>
> Liu Haifeng
>
> Home: http://liuhaifeng.com
>
>
>
>
> 
>
>
> On May 30, 2012, at 4:13 AM, Igor Shmain wrote:
>
>
>
> 
>
> Can you please help with advice?
>
> I need to design a solution for a database which will grow and will
> require a horizontal split at some moment.
>
> Here is how I am planning to do it: Every record in every table has a
> shard number. After the database becomes too large (gets too many
> requests), the tables need to be horizontally split. It means that for
> every table all records with some specific shard numbers need to be moved
> to a new database.
>
> My plan is to (1) create a copy of the database on a new server
> (backup/restore?), (2) synchronize the databases (how?), and then (3)
> forward all requests for the moved shards to the new database. (I will get
> rid of the unneeded shards later). An important requirement: there should
> be no outage for the users.
>
> What are the ways to achieve it?
>
>  
>
> Thank you in advance,
> -igorS
>
>  
>
>


Re: [ADMIN] What happens when PostgreSQL fails to log to SYSLOG

2012-07-11 Thread Craig James
On Wed, Jul 11, 2012 at 10:47 AM, Arnold, Sandra  wrote:

> Tablelog would be ok for keeping up with transactions for tables.
>  However, we also need to audit who connects successfully and
> unsuccessfully.  As far as I am aware, if a user fails to log in
> successfully, say three times, PostgreSQL is not able to lock the account
> for 9.0.  Is this the case for 9.1 as well?


This is off topic for your question, but locking an account after a small
number of failures is a bad security practice.  It's an invitation for a
denial-of-service attack.  Anyone who knows anyone else's account name can
lock them out.  Anyone who gets a list of accounts can lock up the whole
system.

Craig


>  This is the version that I will be installing that I am writing the
> Security controls for.  There are several events that I have to Audit and
> table transactions are just a small part of it.  The events I need to audit
> are listed below:
>
> - Unauthorized User Access
> - Changes to User Privileges
> - Changes to Audit Policy
> - Reset User Password
> - New User created in Database
> - Users dropped from Database
> - Invalid Login Attempts
>
> Sandra Arnold
> Senior Database Administrator
> Contractor to DOE/OSTI
> Information International Associates (IIA)
>
>
>
> -Original Message-
> From: Joshua D. Drake [mailto:j...@commandprompt.com]
> Sent: Tuesday, July 10, 2012 4:31 PM
> To: Arnold, Sandra
> Cc: pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] What happens when PostgreSQL fails to log to SYSLOG
>
>
> On 07/10/2012 01:08 PM, Arnold, Sandra wrote:
> > I am trying to find out what PostgreSQL does when it cannot write to
> > its SYSLOG file, whether it is permissions or the file system where
> > the log resides is full is the problem.
>
> PostgreSQL doesn't write to a SYSLOG file. It sends it to the syslog
> daemon. (if you are indeed using syslog)
>
> > Does PostgreSQL stall, does it rollback the transaction it cannot log
> > to the SYSLOG, or does it continue on as if there is not an issue?
>
> This is a non-issue in terms of transactions and operations.
>
>
> > I am writing Security controls and since I am using the SYSLOG for
> > auditing purposes and I need to document what happens in case there
> > was a failure in writing to the SYSLOG.  For instance, Oracle
> > rollbacks any transactions that are being audited it cannot write to
> > its audit logs.  Just want to know what PostgreSQL does.
> >
>
> You should probably look at tablelog for auditing. It automates it.
> Syslog is not really a good way to handle that.
>
> Sincerely,
>
> Joshua D. Drake
>
>
> --
> Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support,
> Training, Professional Services and Development The PostgreSQL Conference -
> http://www.postgresqlconference.org/
> @cmdpromptinc - @postgresconf - 509-416-6579
>
>
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>


Re: [ADMIN] Sometime Update is not modifying data inside database.

2012-07-30 Thread Craig James
On Mon, Jul 30, 2012 at 4:06 AM, Amit Kumar  wrote:

> All,
>
> I am facing a weird issue in PG 9.0.3. Sometimes via the application UPDATES
> are not changing data in the database but in postgres logs its also not
> raising any error.  It is happening 2-3 times in a week. I have tried but
> not able to reproduce this error. fsync is ON.  We are not using hibernate
> and no pg pool or any other tool inside database.   The  application
> running from last 4-5 year but from last few days we are experiencing some
> times UPDATE is not updating records in database, without giving any
> error.  No changes has been made in database or server recently.
>
> *UPDATE dbpush_camp_main SET status=7 WHERE camp_id IN(137721)*
>
> Below is the logs of database.
>
> 
> Jul 30 12:40:36 dell16 postgres[11088]: [11-1] 2012-07-30 12:40:36 IST
> [11088]: [10-1] user=postgres,db=dbpush xx(43225)LOG:
> statement: START TRANSACTION
> Jul 30 12:40:36 dell16 postgres[11088]: [12-1] 2012-07-30 12:40:36 IST
> [11088]: [11-1] user=postgres,db=dbpush xx(43225)LOG:
> statement: *UPDATE dbpush_camp_main SET status=7 WHERE camp_id IN(137721)*
> Jul 30 12:40:36 dell16 postgres[11088]: [13-1] 2012-07-30 12:40:36 IST
> [11088]: [12-1] user=postgres,db=dbpush xx(43225)LOG:
> statement: COMMIT
> Jul 30 12:40:36 dell16 postgres[9368]: [5-1] 2012-07-30 12:40:36 IST
> [9368]: [4-1] user=postgres,db=dbpush xx(43187)LOG:
> statement: SELECT status FROM  dbpush_camp_main  WHERE camp_id=137721
> 
>
> The size of database is 237GB, AND dbpush_camp_main  table size is 37MB
> with 36888 records and one primary key. status is a integer field.
>
> Please share your thoughts.  Thanks a lot in Advance !!
>

You've almost certainly got problems with overlapping transactions, where
the process making the change and the process testing the change are
looking at different "snapshots" of the data.  Each snapshot is taken when
its transaction (or, in READ COMMITTED mode, each statement) starts, so a
reader inside an older transaction won't see a write that committed later.
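
You can see the effect by hand in two psql sessions (using your table; the
explicit isolation level just makes the stale snapshot easy to reproduce):

  -- session B (the reader), started first:
  BEGIN ISOLATION LEVEL REPEATABLE READ;
  SELECT status FROM dbpush_camp_main WHERE camp_id = 137721;  -- snapshot taken

  -- session A (the writer), commits normally:
  UPDATE dbpush_camp_main SET status = 7 WHERE camp_id IN (137721);

  -- back in session B, still inside its transaction:
  SELECT status FROM dbpush_camp_main WHERE camp_id = 137721;  -- still the old value
  COMMIT;
  SELECT status FROM dbpush_camp_main WHERE camp_id = 137721;  -- now shows 7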

Craig


Re: [ADMIN] JDBC keep alive issue

2012-08-11 Thread Craig James
On Fri, Aug 10, 2012 at 7:19 AM, Tom Lane  wrote:

> Haifeng Liu  writes:
> > I have a program running like a daemon, which analyze data and write to
> postgresql 9.1 on centos 5.8. There is only one connection between my
> program and the postgresql database, and I hope the connection may keep
> alive all the time. But I failed, the connection will be reset after idle
> for about 2 hours.
>
> > jdbc driver: 9.1-901, connection url has parameter tcpKeepAlive=true;
> > postgresql:9.1, keep alive related settings use default
> values(commented);
> > centos 5.8 64bit, net.ipv4.tcp_keepalive_intvl = 75, probes = 9, time =
> 7200.
>
> IIRC, time = 7200 (seconds) means to start sending keepalive packets
> after 2 hours of idle time.  So if you have something in the way that is
> dropping the connection after 2 hours, these settings will not activate
> keepalive soon enough to save it.  I'd try setting that to 3600.
>

I use a much shorter time (300).  Why not?  The keepalive pings are a
trivial amount of data.  I've found over the years that router NAT tables
are a big offender (probably doesn't apply in this case) ... some drop your
internal-to-external IP address mapping after as little as five minutes.
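
For what it's worth, you can also shorten the keepalives on the Postgres
side, per connection or in postgresql.conf, instead of touching the kernel
defaults.  The values here are just the ones I'd start with:

  SET tcp_keepalives_idle = 300;     -- start probing after 5 minutes idle
  SET tcp_keepalives_interval = 30;  -- probe every 30 seconds
  SET tcp_keepalives_count = 3;      -- give up after 3 unanswered probes

These only affect TCP connections, not Unix-socket ones.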

Craig


>
> > There is no firewall or any other device which behaves force idle
> connection cleanup.
>
> Seems pretty darn unlikely given these symptoms.  Look harder...
> maybe something you thought was just a bridge has got routing behavior.
>
> BTW, in the real world connections drop for all sorts of reasons, and
> kernel keepalive configurations can't prevent them all.  You might be
> better advised to build some reconnect-after-connection-loss logic into
> your application, rather than spending time on making this work.
>
> regards, tom lane
>
> -
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>


Re: [ADMIN] Canot access PostgreSQL via psql -h (Vmware Fusion)

2012-08-24 Thread Craig James
On Fri, Aug 24, 2012 at 7:08 AM, Kevin Grittner
 wrote:
> CS DBA  wrote:
>
>> I've fired up 2 CentOS 6.2 VM's via vmware fusion 5 (on a mac).
>
>> psql -h 192.168.91.145
>> psql: could not connect to server: No route to host
>
> That problem has nothing to do with PostgreSQL; you might have
> better luck on a list related to the other technologies.
>
> FWIW, on Linux I would start with `netstat -plnt` to see if the
> process was listening on the expected port and host address.

Or even simpler, do

  ifconfig

and make sure that you have the right IP address, followed by "ping
192.168.91.145" (assuming that's the right address) to see if your
network is even running. Another thing to try:

  psql -h localhost

Craig


-- 
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


[ADMIN] Schema diagramming tool?

2012-09-04 Thread Craig James
Can anyone recommend a good tool for producing a good drawing of an
existing database schema?  I don't need a design tool, but rather one that
can take an existing schema and produce a nice diagram that can be further
edited and "beautified."  I want something I can print and hang on the wall
as a reference document for developers writing code and SQL.

Thanks,
Craig


Re: [ADMIN] Log-Shipping Standby Server: USE_FLOAT8_BYVAL compatibility error

2012-09-05 Thread Craig James
On Wed, Sep 5, 2012 at 2:21 AM, Mathias Breuninger
> I wasn't aware of the binary problem with replication.
> Maybe the PostgreSQL docs should emphasize the architecture restriction.

That's sort of what "binary file" means -- a raw,
architecture-specific representation of data that's optimized for
performance and/or space, not portability.

Craig


-- 
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Schema diagramming tool?

2012-09-05 Thread Craig James
On Tue, Sep 4, 2012 at 9:48 PM, Sergey Konoplev  wrote:
> On Tue, Sep 4, 2012 at 8:35 PM, Craig James  wrote:
>> Can anyone recommend a good tool for producing a good drawing of an existing
>> database schema?  I don't need a design tool, but rather one that can take
>> an existing schema and produce a nice diagram that can be further edited and
>> "beautified."  I want something I can print and hang on the wall as a
>> reference document for developers writing code and SQL.
>
> I used to use PostgreSQL Maestro for this purpose. Its Designer tool
> is quite nice.

They don't say anywhere on their web site which platforms they
support.  Is this a Windows-only application?

Thanks,
Craig


-- 
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Upgrading from 9.1.2 to 9.1.5

2012-09-08 Thread Craig James
On Sat, Sep 8, 2012 at 1:26 PM, Sergey Konoplev  wrote:
> On Fri, Sep 7, 2012 at 3:20 AM, Bruce Momjian  wrote:
>> On Thu, Sep  6, 2012 at 05:55:05PM -0500, Antoine Guidi wrote:
>>> Is it possible to do a pg_upgrade from 9.1.2 to 9.1.5 just using pg_upgrade?
>>> For what I could read, the only exception would be if I was using a
>>> citext column (which I am not).
>>
>> You cannot use pg_upgrade for this.  You just need to stop the server,
>> install the binaries, and restart the server.
>
> AFAIU it is not necessary to stop the server when updating binaries if
> one is not going to create extensions, PLs or anything else that will
> be dynamically linked after the binaries update and before the server
> restart.
>
> So with the process
>
> 1. update binaries
> 2. postgres restart
>
> the downtime will be shorter.

I'm just guessing, but this is probably a bad idea.  This could happen...

1. Postgres master and a bunch of clients are running

2. You start updating binaries

3. In the middle of your update, an new client connects and a new
backend process starts.

4. The 9.1.2 executable links to the 9.1.5 libraries.  Or a 9.1.5
executable links to the 9.1.2 libraries.  Or a 9.1.5 executable starts
with the right libraries, but is talking to a 9.1.2 postmaster process,
which might not have the same shared-memory map.  Or ...

... and so forth.


-- 
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Upgrading from 9.1.2 to 9.1.5

2012-09-10 Thread Craig James
On Mon, Sep 10, 2012 at 10:17 AM, Antoine Guidi wrote:

> Another question, when I get a reply from the list, to which email
> should I then reply?
> To all? the User posting, or pgsql-admin@?
>

Either reply-to-all or reply to the pgsql-admin address.  Either way,
everyone gets it who participated.

Craig


Re: [ADMIN] how to allow integer overflow for calculating hash code of a string?

2012-09-20 Thread Craig James
On Thu, Sep 20, 2012 at 1:55 AM, Haifeng Liu  wrote:

> I want to write a hash function which acts as String.hashCode() in java:
> hash = hash * 31 + s.charAt(i)... but I got integer out of range error. How
> can I avoid this? I saw java do not care overflow of int, it just make the
> result negative.
>
>
Use the bitwise AND operator to mask the hash value with 0x3FFFFFF
(67108863) before each iteration, so the running total can never overflow
a signed 32-bit integer:

  hash = (hash & 67108863) * 31 + s.charAt(i);
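
If you want the same thing inside the database, a plpgsql version is only a
few lines.  The function name is mine, and note that because of the mask it
will NOT match Java's hashCode() bit-for-bit -- it just never overflows:

  CREATE OR REPLACE FUNCTION masked_hashcode(s text) RETURNS integer AS $$
  DECLARE
      h integer := 0;
      i integer;
  BEGIN
      FOR i IN 1 .. length(s) LOOP
          h := (h & 67108863) * 31 + ascii(substr(s, i, 1));
      END LOOP;
      RETURN h;
  END;
  $$ LANGUAGE plpgsql IMMUTABLE;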

Craig


Re: [ADMIN] how to allow integer overflow for calculating hash code of a string?

2012-09-21 Thread Craig James
On Thu, Sep 20, 2012 at 7:56 PM, Haifeng Liu  wrote:

>
> On Sep 20, 2012, at 10:34 PM, Craig James  wrote:
>
>
>
> On Thu, Sep 20, 2012 at 1:55 AM, Haifeng Liu  wrote:
>
>> I want to write a hash function which acts as String.hashCode() in java:
>> hash = hash * 31 + s.charAt(i)... but I got integer out of range error. How
>> can I avoid this? I saw java do not care overflow of int, it just make the
>> result negative.
>>
>>
> Use the bitwise AND operator to mask the hash value with 0x3FF before
> each iteration:
>
>   hash = (hash & 67108863) * 31 + s.charAt(i);
>
> Craig
>
>
> Thank you, I believe your solution is OK for a hash function, but I am
> aiming to create a hash function that is consistent with the one
> applications use. I know postgresql 9.1 has a hash function called
> hashtext, but I don't know what algorithm it uses, and I also see that it's
> not recommended to rely on it. So I am trying to create a hash function
> which behaves exactly the same as java.lang.String.hashCode().  The latter
> may generate negative hash values. I guess when the number is
> overflowing, the part out of range will be ignored, and if the highest bit
> get 1, the hash value turn to negative value.
>

You are probably doing something where you want the application and the
database to implement the exact same function, but if you stick to the Java
built-in function, you will only have control over one implementation of
that function.  What happens if someone working on Java changes how the
Java internals work?

A better solution would be to implement your own hash function in Postgres,
and then once you know exactly how it will work, re-implement it in Java
with your own code.  That's the only way you can ensure consistency between
the two.

Craig


Re: [ADMIN] alter table alter column to resize a varchar

2012-09-27 Thread Craig James
On Thu, Sep 27, 2012 at 2:23 AM, Gary Stainburn
 wrote:
> Hi folks.
>
> I'm planning on extending a field in one of my main tables using:
>
> alter table stock alter column type varchar(255);

Why not just do

   alter table stock alter column  type text;

That is, do you really need a limit?  The VARCHAR type in Postgres is
just a TEXT type with an additional check that limits data to the declared
length.  It doesn't save space to limit the length.

> to extend it from the current varchar(80).  My concern is that this table and
> this field is used in a number of views, and views of views.
>
> Are then any gotchas that I need to worry about or things I need to check, or
> will it just work?

You can't alter a table that has dependent views.

  => alter table foo alter t type varchar(255);
  ERROR:  cannot alter type of a column used by a view or rule
  DETAIL:  rule _RETURN on view fooview depends on column "t"

I keep scripts around to rebuild all of my views when necessary. If I
keep the scripts up to date, it makes me much more willing to make a
schema change instead of putting it off until things are really messy.
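
The rebuild scripts don't have to be fancy.  Using the names from the
example above, the whole operation is just:

  BEGIN;
  DROP VIEW fooview;                          -- plus any views built on top of it
  ALTER TABLE foo ALTER COLUMN t TYPE text;
  CREATE VIEW fooview AS SELECT t FROM foo;   -- recreate from the saved definition
  COMMIT;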

Craig


-- 
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin


Re: [ADMIN] Creating schema best practices

2012-10-03 Thread Craig James
On Tue, Oct 2, 2012 at 11:54 AM, Babay Adi, Hava  wrote:

>  Dear list,
>
> ** **
>
> I’m new to PostgreSQL, planning now a migration to PostgreSQL and would
> appreciate your help. 
>
> ** **
>
> One aspect of the migration is re-thinking our DB structure.
>
> ** **
>
> The application considered contains several modules (let’s say ten), each
> one uses and manages a small number of tables (maximum 10 tables per
> module). Today all tables are located on the same DB, which makes
> management a bit uncomfortable. What comes to mind is grouping each
> module’s tables on a separate schema. From you experience, is there any
> performance impact for grouping tables into schemas? In general, what is
> the best practice for grouping tables in schemas vs. locating several
> tables (that might be logically separated) into the same schema? Is there
> any advantage \ disadvantage of using schemas vs naming standards that
> includes prefix for each module’s tables?
>
>
> In the considered application there are no name duplications among tables.
> In addition, there are there are no queries that involve tables managed by
> different modules. In addition, since all modules are owned by the same
> application, currently there is no interest in limiting access for tables
> (it is all or nothing).
>

Since you have no security needs and no naming conflicts, the answer to
your question is that it's purely for your convenience.  Schemas are
primarily useful for separating namespaces and for security.  Separating
your modules into schemas will have no impact on performance.

pg_dump and pg_restore have support for single-schema dumps, which might
make your backup procedures easier.
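For example, something like this (the schema and database names are
placeholders):

  pg_dump -n module_a -Fc -f module_a.dump mydb
  pg_restore -n module_a -d mydb module_a.dump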

If I were in your shoes, I'd use schemas just to make management easier.
Renaming all of your tables with a module prefix is a waste of time --
that's what schemas are for.

Craig


Re: [ADMIN] Creating schema best practices

2012-10-03 Thread Craig James
On Wed, Oct 3, 2012 at 10:58 AM, Babay Adi, Hava  wrote:

>  Thanks Craig for the useful information.
>
>
> On the same regard – Some of the mentioned modules in the mentioned
> application use a set of tables which is logically separate (there are no
> join statements with tables of other modules). What are the pros\cons of
> using a separate database instead of a separate schema for maintaining such
> tables?
>
>
> I understand that resources are shared among multiple databases on the
> same cluster, so in terms of performance, are there resources that are
> dedicated for each database and would benefit performance?
>
>
> I’d appreciate a best practice also regarding to using database vs schema.
>

Best practice is more about opinion than anything else.

Regarding multiple databases: it depends entirely on your needs. If you
separate your table into two databases, then your application will have to
make two connections rather than one.  That might be a performance issue
depending on how many connections per second you get.

When you do backups, you'll have to do two instead of one.  It's hard to
see why two databases would be better than one in your case.

Everything (databases, schemas, tables, metadata, etc.) is managed by the
same database cluster, so there's no performance advantage to building
separate databases.  If you have several file systems on separate disks, you
can improve performance by using them, but you don't need separate databases
for that. You can create tablespaces and use them to assign tables or
schemas to a particular file system.
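A rough sketch of what that looks like (the path and names are illustrative
only, and the directory must already exist and be owned by the postgres
user):

  create tablespace fastdisk location '/mnt/disk2/pgdata';
  create table big_log (id bigint, msg text) tablespace fastdisk;
  alter table some_existing_table set tablespace fastdisk;  -- moving rewrites the table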

Craig


[ADMIN] Mixing 8.4.x and 9.2.x clients and servers

2012-10-09 Thread Craig James
Is it OK to mix 9.2.x and 8.4.x clients and servers?  Can they talk to each
other?  In other words, if I install 9.2 on a new server alongside older
servers running 8.4.x, can clients on the 8.4 systems talk to 9.2 servers,
and can the 9.2 clients talk to 8.4 servers?

Thanks,
Craig


[ADMIN] 9.2 won't load C-language function

2012-10-10 Thread Craig James
I have a C-language function I've been using on 8.4 for a long time.  On
9.2 it won't load:

test=# set search_path = public;
SET
test=# CREATE OR REPLACE FUNCTION chmoogle_session_id() RETURNS integer
AS '/usr/local/pgsql/lib/libchmoogle.so', 'chmoogle_session_id'
LANGUAGE c VOLATILE;
ERROR:  could not load library "/usr/local/pgsql/lib/libchmoogle.so":
libopenbabel.so.4: cannot open shared object file: No such file or directory

But libopenbabel.so.4 does exist:

# ls -l /usr/local/pgsql/lib/libopenbabel.so.4
-rwxr-xr-x 1 root root 18250709 2012-10-09 18:05
/usr/local/pgsql/lib/libopenbabel.so.4*

There seems to be some difference in the library search path from 8.4 to
9.2.  For administrative simplicity, I don't use ldconfig, besides which it
wasn't needed for 8.4.

What changed in how libraries are loaded between 8.4 and 9.2?

Thanks,
Craig


Re: [ADMIN] 9.2 won't load C-language function

2012-10-11 Thread Craig James
On Wed, Oct 10, 2012 at 10:08 PM, Craig Ringer wrote:

> On 10/11/2012 02:22 AM, Craig James wrote:
>
>> I have a C-language function I've been using on 8.4 for a long time.  On
>> 9.2 it won't load:
>>
>> test=# set search_path = public;
>> SET
>> test=# CREATE OR REPLACE FUNCTION chmoogle_session_id() RETURNS integer
>> AS '/usr/local/pgsql/lib/libchmoogle.so', 'chmoogle_session_id'
>> LANGUAGE c VOLATILE;
>> ERROR:  could not load library "/usr/local/pgsql/lib/libchmoogle.so":
>> libopenbabel.so.4: cannot open shared object file: No such file or
>> directory
>>
>
> What does `ldd /usr/local/pgsql/lib/libchmoogle.so` say? What about
> when run under the PostgreSQL user account?
>

Too late ... after reading up on LD_LIBRARY_PATH and why ldconfig is
better, I just ran ldconfig on the /usr/local/pgsql/lib and
/usr/local/openbabel/lib directories.  Problem fixed.
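For the archives, the persistent version of that fix looks roughly like this
(the .conf file name is arbitrary):

  # echo /usr/local/pgsql/lib      >  /etc/ld.so.conf.d/chmoogle.conf
  # echo /usr/local/openbabel/lib  >> /etc/ld.so.conf.d/chmoogle.conf
  # ldconfig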


> You may need to set LD_LIBRARY_PATH in the Pg startup script, or modify
> ld.so.conf, or re-build your extension with rpath linking enabled.
>
> As for why it worked with 8.4 - I expect you either altered
> LD_LIBRARY_PATH in the 8.4 startup scripts, or maybe the 8.4 version of the
> extension was built with -Wl,-rpath so the paths to required libraries were
> embedded into the extension library.
>

It is a bit odd ... 8.4 was working, and I just compiled and installed 9.2
on top of it (and created a fresh database).  Oh well, I guess it will
remain a mystery.

Craig James


>
> --
> Craig Ringer
>


[ADMIN] Odd DEALLOCATE statements never finish in 9.2?

2012-10-18 Thread Craig James
I installed 9.2 on our new server and am seeing something odd that doesn't
happen in 8.4:

postgres=# select datname, pid, usename, query from pg_stat_activity where
query != '';
  datname   |  pid  | usename  |
query
+---+--+
 emolecules | 19249 | test | DEALLOCATE dbdpg_p18629_26
 emolecules | 19250 | test | DEALLOCATE dbdpg_p18625_20
 emolecules | 19251 | test | DEALLOCATE dbdpg_p18632_23
 postgres   | 21583 | postgres | select datname, pid, usename, query from
pg_stat_activity where query != '';
 emolecules | 19253 | test | DEALLOCATE dbdpg_p18627_18
 emolecules | 19320 | test | DEALLOCATE dbdpg_p18640_7
 emolecules | 19271 | test | DEALLOCATE dbdpg_p18628_17
 emolecules | 19779 | test | DEALLOCATE dbdpg_p19319_2
 emolecules | 19282 | test | DEALLOCATE dbdpg_p18626_12
 emolecules | 19321 | test | DEALLOCATE dbdpg_p19283_9

There are indeed nine Postgres backends running.  They are connected to
nine "fastcgi" Apache processes that sit there and wait for low-overhead
requests.

What are these DEALLOCATE requests, and why do they just sit there and
never finish?

Thanks,
Craig


Re: [ADMIN] Odd DEALLOCATE statements never finish in 9.2?

2012-10-18 Thread Craig James
On Thu, Oct 18, 2012 at 2:54 PM, Tom Lane  wrote:

> Craig James  writes:
> > I installed 9.2 on our new server and am seeing something odd that
> doesn't
> > happen in 8.4:
>
> > postgres=# select datname, pid, usename, query from pg_stat_activity
> where
> > query != '';
>
> The query column now means "current or last query" --- there's a
> separate column to check for the idle status.  Those DEALLOCATEs
> are evidently what your app does before going idle.
>

Nice feature.  Thanks.
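For anyone else making the same jump, the 9.2 way to see only the busy
backends is to filter on the new state column instead of the query text
(a sketch):

  select datname, pid, usename, state, query
    from pg_stat_activity
   where state <> 'idle';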

Craig


>
> regards, tom lane
>


Re: [ADMIN] Failed Login Attempts parameter

2012-11-15 Thread Craig James
On Thu, Nov 15, 2012 at 1:32 AM, Lukasz Brodziak
wrote:

> 2012/11/15 Craig Ringer 
> > Another option would be to monitor syslog or the csvlog and lock the
> > user out by changing their password or revoking CONNECT rights if they
> > trip the threshold. It wouldn't be as responsive to high-rate brute
> > forcing attempts but your IDS should be handing those already.
> >
> > --
> >  Craig Ringer   http://www.2ndQuadrant.com/
> >  PostgreSQL Development, 24x7 Support, Training & Services
> >
>
> I wouldn't go with password change approach, at least not
> automatically...
>

Or never.  Locking users out invites denial-of-service attacks.  All you
have to do is figure out someone's username and you can lock them out of
the system by deliberately submitting failed logins.

A far better approach is an escalating delay. Check the number of failed
login attempts N and delay (for example) N^2 seconds before responding
again.  Legitimate users are mildly inconvenienced, and hackers are
severely hampered.
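A minimal sketch of the bookkeeping that approach needs -- the table and
column names are made up, and the delay is enforced by whatever layer handles
the login, not by Postgres itself:

  create table failed_logins (
    username   text primary key,
    failures   integer not null default 0,
    last_fail  timestamptz
  );

  -- on a bad password, bump the counter ...
  update failed_logins
     set failures = failures + 1, last_fail = now()
   where username = 'someuser';

  -- ... and before answering the next attempt, wait failures^2 seconds
  select failures * failures as delay_seconds
    from failed_logins
   where username = 'someuser';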

Craig


Re: [ADMIN] Database archive solutions

2012-11-21 Thread Craig James
On Wed, Nov 21, 2012 at 1:21 AM, Gnanakumar  wrote:

> Hi,
>
> Our application requirement demands archiving of records, so that
> performance/speed of the application is not compromised.  So, am looking
> out/evaluating on various techniques/solutions available for database
> archiving, in general, for PostgreSQL.
>

Your question isn't specific enough.  What does "not compromised" mean?
Every technique for archiving records has some impact on some part of your
system.  Is a 0.1% impact too much?  A 1% impact?  A 10% impact?  A factor
of 2 slowdown?

Do you need real-time archiving, a few minutes delay, or once a month
backup?

Do you need a hot standby or records that can be retrieved in a few hours?

You have to describe your needs completely before you can get a meaningful
answer.

Craig


>
> Few solutions that comes to my mind are:
>
> 1) Table partitioning
> 2) Separate tablespace and/or schema
> 3) Moving archived records/tables to a different harddisk
>
> Any other suggestions/pointers/solutions are really welcome and
> appreciated.
>
> NOTE: We're running PostgreSQL v9.1.3 on CentOS5.2
>
> Regards,
> Gnanam
>
>
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>


Re: [ADMIN] Massive table bloat

2012-12-11 Thread Craig James
On Tue, Dec 11, 2012 at 8:11 AM, Michael Sawyers  wrote:

> Our dba quit last week leaving me with an interesting problem.
>
> We have a table currently using 33gb worth of space for only 152mb worth of
> data because of bad processes or autovacuum not being aggressive enough. I
> was able to confirm the size difference by doing a create table as select
> along with a second test of restoring the table from the dump file to a dev
> machine.
>
> There is a very large list of foreign key relationships that I'm not
> including for the sake of brevity.
>
> The database version is 8.4.1
>
> The old DBA had said that vacuum full would take days to complete, and we
> don't have that much of a window.  So I was considering using the  to force
> a full table rewrite.  In testing on a dev machine it only took about five
> minutes.
>
> I do not have as much hands on experience with postgres so I wanted to get
> thoughts on what is considered the proper way to deal with this kind of
> situation.
>
> Any comments would be welcome.
>
>
With only 33 GB of data, you might consider a full
dump/drop-db/create-db/restore. You didn't say what sort of hardware you're
using, or what sort of down time you could afford, but generally pg_dump
and pg_restore are remarkably fast.  If you have spare/backup machines, you
could even do trial runs to see how long it takes.

That would also give you the opportunity to upgrade to a new version of
Postgres, although that might require changes to your app that you're not
willing to make.  Even if you don't upgrade the version (you should at
least upgrade to the latest 8.4.x release), dump/restore will fix your
problem.
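If you go that route, the whole exercise is roughly this (the database name
is a placeholder; -j parallel restore needs the 8.4-or-later pg_restore):

  pg_dump -Fc -f mydb.dump mydb
  dropdb mydb
  createdb mydb
  pg_restore -j 4 -d mydb mydb.dump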

Craig James


Re: [ADMIN] Massive table bloat

2012-12-11 Thread Craig James
On Tue, Dec 11, 2012 at 12:48 PM, Steve Crawford <
scrawf...@pinpointresearch.com> wrote:

> On 12/11/2012 11:41 AM, Michael Sawyers wrote:
>
>> Political reasons have ruled out the dump and reload options, but
>> restoring
>> the entire database took several hours.  I'm also restricted on version
>> because newer versions of postgres are not supported with that specific
>> product, including maintenance updates.
>>
>
> You may want to gently suggest that the powers-that-be start at
> http://www.postgresql.org/docs/8.4/static/release-8-4-15.html and work
> their way back to
> http://www.postgresql.org/docs/8.4/static/release-8-4-2.html
> compiling a complete list of data-corrupting, query-wrecking or
> security-damaging issues that have been corrected in the over three years
> since 8.4.1 was released.
>
> They may also want to look at http://www.postgresql.org/support/versioning/
> - particularly the statement, "The community considers not upgrading to be
> riskier than upgrading."
>
> Ultimately it is their responsibility to make the decision whether or not
> to upgrade. It is your responsibility to provide the data necessary to help
> them make an informed choice.
>

Often the reason for these technologically stupid policies is third-party
software, where vendor X says, "we only support Postgres 8.3.2, using
anything else voids our service agreement."  If that's the problem here,
then ask vendor X if they're willing to take responsibility for the
potential data corruption that their policy will cause.  Vendors that work
with Postgres need to understand that "minor releases" won't break their
applications, and that they should encourage their customers to keep their
Postgres software up to date.

If, on the other hand, this is your company's internal policies, then pass
Steve's message along in with a strongly worded admonition that they pay
attention.

Craig James


>
> Cheers,
> Steve
>
>
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
>


Re: [ADMIN] Notes or comments on columns in a table

2013-02-14 Thread Craig James
On Thu, Feb 14, 2013 at 11:09 AM, Campbell, Lance wrote:

>  PostgreSQL 9.2
>
> When creating a table or altering a column of a table, is there a way to
> add notes on a column?
>
>
> If not:
>
> I have a new enhancement request.  When creating a table or altering a
> column, it would be nice if a user could add “comments” or “notes” to a
> column.  This note would only appear when doing “\d table_name”.  
>
>
> Why?  I was adding a new column to a table called “type”.  Normally I only
> add types as characters even though integers are much better for indexing.
> The problem with integers is I never remember what value means what.  Does
> a 1 mean directory or file.  So I end up creating “type” as a character so
> I can do a quick SELECT DISTINCT to figure out the options.  I would use
> integers if I could do \d table_name to see in the notes section what
> values meant.  I would think other people run into this issue all the time.
>

You should use an enumerated type definition.

create type my_file_type as enum (
  'File',
  'Directory',
  'SymLink');

create table foo (
   id integer primary key,
   filetype my_file_type
);

insert into foo values (1, 'File');

select * from foo;
  1  File

... and so forth.

Craig

> 
>
>
>
> Thanks,
>
>
> Lance Campbell
>
> Software Architect
>
> Web Services at Public Affairs
>
> 217-333-0382
>
> [image: University of Illinois at Urbana-Champaign logo]
> 
>
>
>

Re: [ADMIN] select exact term

2013-03-28 Thread Craig James
On Thu, Mar 28, 2013 at 10:51 AM, Marc Fromm  wrote:

>  Is there a way to create a select statement that will select a record if
> the exact term is found in a field that contains the text to describe
> something?
>
> ** **
>
> If I create a select statement using WHERE description LIKE ‘art’ I get
> every record that has words like depart, start and so on.
>
> If I create a select statement using WHERE description = ‘art’ I get no
> results even though the word art is in some records description field.
>

Use a regular expression instead of LIKE, and the left- and
right-word-boundary expressions (see section 9.7 of the Postgres manual):

db=> select 'the quick brown fox' ~ '[[:<:]]brown[[:>:]]';
 ?column?
--
 t

=> select 'the quick brown fox' ~ '[[:<:]]own[[:>:]]';
 ?column?
--
 f


Craig


Re: [ADMIN] Grant tables cascade to sequence?

2013-03-29 Thread Craig James
On Thu, Mar 28, 2013 at 10:56 PM, Rural Hunter wrote:

> Hi,
>
> I encounter the same issue often: Granted update/insert to an user but
> forgot to grant it on the related sequence. It's hard to understand that an
> user has write access on table but not on necessary sequences. I think the
> grant on tables should cascade to related sequences. What do you think?
>

Wouldn't it make more sense for the grant on the table to fail with an
appropriate error message?  That would solve your problem, and it wouldn't
be making security assumptions.  Cascading permissions seems like a recipe
for trouble.

Craig


Re: [ADMIN] regexp_replace grief

2013-04-10 Thread Craig James
On Wed, Apr 10, 2013 at 4:59 PM, Armin Resch  wrote:

> Not sure this is the right list to vent about this but here you go:
>
> I) select regexp_replace('BEFORE.AFTER','(.*)\..*','\1','g') "Substring"
> II) select regexp_replace('BEFORE.AFTER','(.*)\\..*','\\1','g') "Substring"
>
> Executing (II) against pg 8.4.4 or 9.0.4 yields 'BEFORE', but in order for
> 9.1.7 to yield the same one has to execute (I) .. bummer
>

This has nothing to do with regexps.  It's a change in how '\' is
interpreted in ordinary quoted strings.  The change came in PostgreSQL 9.1,
when standard_conforming_strings started defaulting to on, and it is
documented in the release notes.  It brings Postgres into compliance with
the SQL standard.

In Perl, I do something like this:

use DBI::Const::GetInfoType;   # exports %GetInfoType

my $pg_bs_char;
if ($dbh->get_info($GetInfoType{SQL_DBMS_VER}) gt "09.00.00") {
    $pg_bs_char = "\\";        # a single '\' for PG 9.1 and higher
} else {
    $pg_bs_char = "\\\\";      # a double '\\' for PG up to 9.0
}

You can also revert to the old 8.x interpretation; see

  http://www.postgresql.org/docs/9.1/static/runtime-config-compatible.html
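Concretely, either of these sidesteps the version dependence (a sketch):

  -- per session (or in postgresql.conf), to get the old 8.x behavior back:
  set standard_conforming_strings = off;

  -- or write the pattern as an escape string, which reads the same on all versions:
  select regexp_replace('BEFORE.AFTER', E'(.*)\\..*', E'\\1', 'g');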

Craig


>
> -ar
>


[ADMIN] Invalid SQL not rejected?

2013-04-11 Thread Craig James
Hmmm ... the subselect is invalid, but not rejected.  The outer select
returns every row in the customer_order_matches table.  This seems pretty
wrong. This is PG 9.2.1 running on Ubuntu.


db=> select count(1) from customer_order_matches where
customer_order_item_id in (select customer_order_item_id from
customer_order where customer_order_id = 105862140);
 count
---
 36180
(1 row)

db=> select customer_order_item_id from customer_order where
customer_order_id = 105862140;
ERROR:  column "customer_order_item_id" does not exist
LINE 1: select customer_order_item_id from customer_order where cust...
   ^

Craig


Re: [ADMIN] Database encoding and collation

2013-04-20 Thread Craig James
On Fri, Apr 19, 2013 at 5:12 PM, Rodrigo Barboza wrote:

> Hi guys.
> I created a database with default encoding (SQL_ASCII) and default collate
> (C).
> I created a table test like this:
> create table test (a varchar (10));
> Then i executed "insert into teste (a) values ('áéç&ã','Æ','ß');
>
> After that:
> select * from test;
> a
> -
>  áéç&ã
>  Æ
>  ß
>
> Why did it store correctly if those values are not ASCII?
>

Characters are interpreted and displayed by your terminal, not by the Postgres
system.  I suspect that you have language settings on whatever windowing
system you use.  With SQL_ASCII, Postgres merely stores the characters you
send as a series of 8-bit bytes.  It's up to your display system to interpret
them.
If you change your display terminal's language, I expect you'll see
something different.

The language settings of Postgres don't change what is stored, only how it
is interpreted (such as sorting).

Craig


[ADMIN] top posting?

2013-05-06 Thread Craig James
Just out of curiosity, I see comments like this all the time:

> (*please* stop top-posting).

I've been participating in newsgroups since UUCP days, and I've never
encountered a group before that encouraged bottom posting.  Bottom posting
has traditionally been considered rude -- it forces readers to scroll,
often through pages and pages of text, to see a few lines of original
material.

The most efficient strategy, one that respects other members' time, is to
briefly summarize your point at the TOP of a posting, then to *briefly*
quote only the relevant parts of the post to which you are replying, and
bottom-post after the quoted text.  That lets your reader quickly see if
it's relevant or not, and move on to the next post.

Contributors in these newsgroups seem to think it's OK to quote five pages
of someone else's response, then add one or two sentences at the bottom ...
it's just laziness that forces readers to wade through the same stuff over
and over in each thread.

How did the Postgres newsgroups get started with this "only bottom post"
idea?

(I'm not trying to start a flame war, just genuinely curious.)

Craig


Re: [ADMIN] top posting?

2013-05-06 Thread Craig James
On Mon, May 6, 2013 at 11:25 AM, Szymon Guz  wrote:

> On 6 May 2013 20:15, Craig James  wrote:
>
>> Just out of curiosity, I see comments like this all the time:
>>
>> > (*please* stop top-posting).
>>
>
> We read from top to bottom. That's why people should answer below the
> cited text, so we can read it in order. I mean that I should first read
> the part of the email you are answering, and then, below it, your answer.
>

Exactly.  That's why the very FIRST thing you read should be relevant
instead of a bunch of quoted text that the poster was too lazy to edit.

My real gripe is with laziness, not bottom posting.  Anyone who can't spend
a minute to edit quoted material to the relevant part is forcing thousands
of readers to wade through irrelevant crap.  It's inconsiderate.

Claims that "top post " or "bottom post" is better miss the point.  The
real issue has nothing to do with top- or bottom-posting.  It's about
relevance, effective editing, and paraphrasing.

Notice that I manage to make my point here, and to quote only the relevant
bits of your email, within the first twenty lines.  That means you can read
it without scrolling, and decide whether to click "delete", or continue on
to read this paragraph, which may be more wordy and reiterate points made
in the first part.  You didn't have to wade through stuff that you'd
already read before deciding that this long and wordy paragraph, which is
mostly fluff and irrelevant reiteration of points already made, was of any
interest to you.  You already got the important bits. ;-).  Now you can
click "delete."

Craig


Re: [ADMIN] top posting?

2013-05-06 Thread Craig James
Regarding top posting versus bottom posting, pretty much everyone who has
commented agrees that top versus bottom posting isn't the problem.  It's
laziness about editing, perhaps exacerbated by certain email systems that
encourage that laziness.

So how about this: instead of demanding "PLEASE DON'T TOP POST," why not
suggest, "PLEASE EDIT FOR BREVITY (and keep your comments in context by
bottom posting)."

It really would improve these Postgres conversations.

Cheers,
Craig

(... and this illustrates the art of paraphrasing.  In many instances, a
quick paraphrase summarizing the issue is far better than quoting.)


Re: [ADMIN] Sr. Postgres DBA

2013-05-10 Thread Craig James
On Thu, May 9, 2013 at 5:01 PM, Miu, Monica  wrote:

>  Hi all,
>
>
> My name is Monica Miu and I work in Talent Acquisition for Asurion Mobile
> Applications
>
>
>
Basically you just pissed everyone off that you're trying to recruit.  This
is a highly technical discussion group, and posting job listings here is
nothing less than spamming.  Try the postgres-jobs newsgroup instead.


Re: [ADMIN] Migration of server

2013-05-16 Thread Craig James
On Thu, May 16, 2013 at 11:04 AM, Oscar Calderon <
ocalde...@solucionesaplicativas.com> wrote:

> Hi everybody, this is my first message in this list. The company where i
> work is bringing maintenance service of PostgreSQL to another company,
> and currently they have installed PostgreSQL 9.1.1, and they want to move
> to 9.3 version when it will come out. So, because the difference of
> versions, and because it was installed by compiling it (using source code),
> and because the 9.1.1 installation is in a different directory than the
> default, they decided to replace 9.1.1 version with 9.3 (no upgrade, but
> replace it).
>
> Currently, they only have one database in production of 2.2 GB with some
> procedures and triggers. So, my plan to execute this database installation
> is the next:
>
>
>1. Install PostgreSQL 9.3 from postgresql repository (
>yum.postgresql.org) with a different port to avoid interrupt the
>production PostgreSQL instance operation
>2. Tune the database parameters in postgresql.conf, also create the
>same rules in pg_hba as the production instance, configure log and so on.
>3. At the end of the operations day, create a backup of the production
>database and then restore it into the new instance
>4. Test the new instance with the PHP applications that use it and
>verify that all is in order
>5. Stop the old instance and change the port to another port, then
>change the port of the new instance to 5432 in order to avoid change the
>network configuration, permissions and so on.
>
> But really is the first time that i do that, so i don't know if i'm
> missing something or there's something wrong about i'm planning to do, so i
> will appreciate very much if you can guide me about what steps i have to do
> exactly and considerations during this process.
>

I would expand step 4 into a much longer period.  Say, do steps 1..3 (you
don't even have to stop your services ... do it during a low-traffic
period), then spend a few days on step 4 to ensure that all of your
applications work and that you don't have any queries that have problems.
Unless your application is really simple, it will take more than an hour or
two to ensure that the migration will go well.

Once you're convinced that everything will work, discard the new 9.3
database and start over again at step 1, and this time complete through
step 5.

Craig

>
> Regards.
> ***
> Oscar Calderon
> Analista de Sistemas
> Soluciones Aplicativas S.A. de C.V.
> www.solucionesaplicativas.com
> Cel. (503) 7741 7850
>


Re: [ADMIN] pg_restore

2013-05-20 Thread Craig James
On Mon, May 20, 2013 at 3:57 PM, Kasia Tuszynska wrote:

>   Hi Everybody,
>
> Has anyone run into issues running pg_restore?
>
> It seems that between 8.3.8 and 9.0.5, 9.1.3 the behavior of pg_restore
> has changed. 
>
>
> Previously I was able to have several data owners with their own schemas
> and running a pg_restore as one superuser was able to restore the objects
> in those schemas without an issue. 
>
>
> At 9.0.5, I found that I had to restore each schema of a data owner
> separately.
>
> At 9.1.3,  I found that in addition to that I need to make each of those
> data owners superusers. 
>
>
> I am fully aware that the dev work for core replication was occurring at
> this time, but I have been unable to find any documentation about the
> potential changes to the basic pg_restore functionality.
>

I recently restored a database using 9.2.1 that has a number of schemas,
each with a separate owner, and had no problems at all.

Craig


> 
>
>
> Anyone else noticed or had issues with this?
>
>
> Sincerely,
>
> Kasia 
>


Re: [ADMIN] Why sequence grant is separated from table?

2013-06-19 Thread Craig James
On Wed, Jun 19, 2013 at 2:35 AM, Rural Hunter  wrote:

> I really hate the error "permission denied for sequence x" when I
> grant on a table but forget to grant additionally on the related sequence
> to users. Can the permission of table and related sequences be merged?
>

You asked this question back in March; here's what I suggested at the time:

On Thu, Mar 28, 2013 at 10:56 PM, Rural Hunter wrote:
> Hi,
>
> I encounter the same issue often: Granted update/insert to an user but
> forgot to grant it on the related sequence. It's hard to understand that an
> user has write access on table but not on necessary sequences. I think the
> grant on tables should cascade to related sequences. What do you think?
>

Wouldn't it make more sense for the grant on the table to fail with an
appropriate error message?  That would solve your problem, and it wouldn't
be making security assumptions.  Cascading permissions seems like a recipe
for trouble.

Craig

What I suggest is that having the "grant ... on tablename" fail would serve
your purpose.  What you want is for it to let you know that you've made a
security change that is bound to fail.  I think it would actually be better
to have the GRANT fail, since it would notify you that the script or
procedure you are using is incorrect.

Craig


>
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/**mailpref/pgsql-admin
>


Re: [ADMIN] Opinions on SSDs

2013-08-12 Thread Craig James
On Mon, Aug 12, 2013 at 8:28 AM, David F. Skoll wrote:

>
> 3) Our current workload peaks at about 5000 transactions per second;
> you can assume about one-third to one-half of those are writes.  Do
> you think we can get away with 16 10Krpm SATA drives instead of the
> SSDs?
>

pgbench peaks out at 5K-7K transactions per second on my server, which uses
just ten 7Krpm SATA drives:

  WAL: RAID1 (2 disks)
  Data: RAID10 (8 disks)
  3Ware RAID controller with BBU
  2x4 core Intel CPUs
  12 GB memory

I don't know how pgbench compares to your workload.  But I suspect 16
10Krpm SATA drives would be pretty fast if you combine them with a BBU RAID
controller.
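If you want to run the same kind of comparison on your own hardware, it's
just a couple of commands (the scale and client counts here are illustrative,
not what I used):

  pgbench -i -s 100 testdb           # initialize a scale-100 test database
  pgbench -c 32 -j 8 -T 300 testdb   # 32 clients, 8 threads (-j needs pgbench 9.0+), 5 minutes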

On the other hand, I swore this would be the last server I buy with
spinning storage.

Craig