Re: [GENERAL] sslv3 alert illegal parameter

2009-07-09 Thread Albe Laurenz
Pedro Doria Meunier wrote:
 My server started spitting this out every second!
 
 LOG:  could not accept SSL connection: sslv3 alert illegal parameter
 
 PostgreSQL 8.3.7 on x86_64-redhat-linux-gnu, compiled by GCC gcc
 (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)
 
 Server key in place..
 
 What can I do about it?!

Several ideas:

You say it "started spitting": that suggests that this is a new
phenomenon. What changed shortly before it started?

"Every second": try to find out what tries to connect every second.
Set log_connections to on and find out the IP address.

Once you have found the culprit, figure out how it tries to connect.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Performance problem with low correlation data

2009-07-09 Thread m_lists

Since no one replied to
http://www.mail-archive.com/pgsql-general@postgresql.org/msg133360.html, I
tried another approach:

I can't cluster the whole table every day; it would take too long (as I said,
the table has 60M rows, and I have hundreds of such tables).
Plus, it wouldn't really make much sense: the only portion of the table that
needs clustering is the part written after the last CLUSTER command (since no
rows are deleted or updated; rows are only inserted, every 15 minutes).
So I thought: I'll cluster only the part that has been written every day:

begin; 
lock table testinsert in ACCESS EXCLUSIVE MODE;
insert into testinsert select ne_id+10, t, v from testinsert where t 
between '2009-08-01 00:00:00' and '2009-08-02 00:00:00'  order by ne_id,t;
DELETE from testinsert where t between '2009-08-01 00:00:00' and '2009-08-02 
00:00:00' and ne_id <= 10;
update testinsert set ne_id = ne_id - 10 where t between '2009-08-01 
00:00:00' and '2009-08-02 00:00:00';
commit;


This would run just after midnight on 2009-08-02; the next day it would run
with the next day's time range.
What I'm trying to do here is cluster, on (ne_id, t), the portion of the table
written each day.
Well, I guess the table is laid out as expected, but in pg_stats the
correlation for the ne_id column is still VERY low:

select attname, n_distinct, correlation from pg_stats where
tablename='testinsert3';

 attname | n_distinct | correlation
---------+------------+-------------
 ne_id   |          2 |    0.111041   <-- low value
 t       |        864 |    0.987778
 v       |          1 |           1


This leads the planner to a sequential scan of the table as soon as 10% of the
table has to be read:
explain  select * FROM idtable as g      inner  join testinsert on id=ne_id 
where  groupid between 1 and 4 and  t between '2009-08-01 00:00:00' and 
'2009-08-09 00:00:00' 
Hash Join  (cost=134.45..2127071.28 rows=614331 width=244)
   Hash Cond: (testinsert3.ne_id = g.id)
   -  Seq Scan on testinsert  (cost=0.00..2063200.00 rows=15358272 width=236)
               Filter: ((t >= '2009-08-01 00:00:00'::timestamp without time 
zone) AND (t <= '2009-08-09 00:00:00'::timestamp without time zone))
   -  Hash  (cost=124.45..124.45 rows=800 width=8)
               -  Bitmap Heap Scan on idtable g  (cost=24.45..124.45 rows=800 
width=8)
               Recheck Cond: ((groupid >= 1) AND (groupid <= 4))
                               -  Bitmap Index Scan on idtable_pk  
(cost=0.00..24.25 rows=800 width=0)
                                             Index Cond: ((groupid = 1) AND 
(groupid = 4))

Which is a terrible plan! 
testinsert contains t values between '2009-08-01' and '2009-08-09', and ne_id 
from 1 to 2. But only 800 out of 2 ne_id have to be read; there's no 
need for a table scan!
I guess this is a reflection of the poor correlation on ne_id; but, as I
said, I don't really think ne_id is so badly correlated.
In fact, doing a select ne_id, t from testinsert limit 10 I can see that the
data is laid out pretty much by ne_id, t, grouped by day (that is, same ne_id
for one day, then the next ne_id, and so on until the next day).
How is the correlation calculated? Can someone explain why, after the
procedure above, the correlation is so low?
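[Editorial sketch, not from the thread.] Conceptually, pg_stats.correlation is the statistical correlation between a column's values and the physical order of the rows holding them: +1/-1 means perfectly sorted, near 0 means scattered. The day-by-day layout described above repeats the whole ne_id range once per day, which is exactly the kind of locally-sorted-but-globally-cyclic layout that scores low. A rough illustration (table shape and numbers are made up; PostgreSQL actually works from a row sample):

```python
def correlation(values):
    """Pearson correlation between values and their positions 0..n-1."""
    n = len(values)
    mean_v = sum(values) / n
    mean_p = (n - 1) / 2.0
    cov = sum((v - mean_v) * (p - mean_p) for p, v in enumerate(values))
    var_v = sum((v - mean_v) ** 2 for v in values)
    var_p = sum((p - mean_p) ** 2 for p in range(n))
    return cov / (var_v * var_p) ** 0.5

# Layout produced by the daily procedure: ne_id cycles through its whole
# range once per day, so over several days the column is only locally sorted.
one_day = [1, 2, 3, 4]
by_day = one_day * 3            # three days' worth of rows
clustered = sorted(by_day)      # what a full CLUSTER on ne_id would produce

print(correlation(by_day))      # low, even though each day is sorted
print(correlation(clustered))   # close to 1
```

This is why the per-day clustering above keeps the reported correlation low even though each day's slice is perfectly ordered.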







Re: [GENERAL] c++ program to connect to postgre database

2009-07-09 Thread John R Pierce

ramon rhey serrano wrote:

Hi Sir John,

Thanks for the links and reply.
I downloaded this libpqxx-3.0 but i really don't know what to do 
with the file and where to put them. I'm still having hard time how to 
start the C++ program using Dev C++ as my IDE, what files do i need to 
install, what headers to use, how the basic flow of program should 
look like (maybe for a simple database connection). I already 
installed PostgreSQL version 1.8.4 in my PC (windows). I have basic 
understanding and background in C++ programming, but totally new to 
PostgreSQL database system.


as I said earlier, I'd probably just use C API calls to libpq and do my 
own C++ wrappings.  C++ libraries pretty much have to be compiled for 
the specific compiler environment, so I'd have to assume you'd load the 
libpqxx project source into your IDE and build it.

with libpq, the OS should already have the .so (or .dll if you're on 
windows) and .h files, so you'd simply #include libpq-fe.h and call 
the various functions, optionally using :: in front to indicate they are 
not part of a class.  you might have to tell the IDE where the libpq .h 
and .so files are if they aren't in a standard system library location 
like /usr/include and /usr/lib


I've never used that IDE so I have no idea of the specifics of how you 
configure it with new libraries or include files.
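[Editorial sketch, not part of the thread.] A minimal libpq connection in plain C, along the lines John describes: the connection string is a placeholder you would adjust for your own server, and the program needs the libpq headers and `-lpq` at link time.

```c
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>   /* libpq's C interface header */

int main(void)
{
    /* Placeholder connection string -- adjust host/dbname/user for your setup. */
    PGconn *conn = PQconnectdb("host=localhost dbname=postgres user=postgres");

    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return EXIT_FAILURE;
    }

    PGresult *res = PQexec(conn, "SELECT version()");
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        printf("server: %s\n", PQgetvalue(res, 0, 0));  /* first row, first column */

    PQclear(res);   /* free the result */
    PQfinish(conn); /* close the connection */
    return EXIT_SUCCESS;
}
```

This only sketches the connect/query/cleanup cycle; a real program would check every result status.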




[GENERAL] pg_dump PostgreSQL 8.4

2009-07-09 Thread serafin g.segador
hi!.

i am trying to migrate my database from 8.3.7 to 8.4 (both versions
running on the same server windows 2003, hp ml350 quad core xeon) using
pg_dump and restore using pgAdmin III R 1.10.0, this i was able to do.  but
i cannot seem to make a backup on 8.4.  i keep getting the following error
message:

pg_dump: reading user-defined tables
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  column reltriggers does not
exist
LINE 1: ...oles WHERE oid = relowner) as rolname, relchecks, reltrigger...
 ^
pg_dump: The command was: SELECT c.tableoid, c.oid, relname, relacl,
relkind, relnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid =
relowner) as rolname, relchecks, reltriggers, relhasindex, relhasrules,
relhasoids, d.refobjid as owning_tab, d.refobjsubid as owning_col, (SELECT
spcname FROM pg_tablespace t WHERE t.oid = c.reltablespace) AS
reltablespace, array_to_string(c.reloptions, ', ') as reloptions from
pg_class c left join pg_depend d on (c.relkind = 'S' and d.classid =
c.tableoid and d.objid = c.oid and d.objsubid = 0 and d.refclassid =
c.tableoid and d.deptype = 'a') where relkind in ('r', 'S', 'v', 'c') order
by c.oid
pg_dump: *** aborted because of error
Process returned exit code 1.
is there any configuration setting i need to change?  any help would be
appreciated.

regards

raffy segador
destiny cable inc
ncr, philipines


Re: [GENERAL] pg_dump PostgreSQL 8.4

2009-07-09 Thread Frank Heikens

What version of pg_dump and pgAdmin are you using?

In PostgreSQL version 8.3 the column is called reltriggers; in version 8.4  
it's relhastriggers. Just run the following query to see the difference:


SELECT * FROM pg_class;

Using pgAdmin 1.10, I don't have any problems with either 8.3 or 8.4.

Regards,
Frank

Op 9 jul 2009, om 10:24 heeft serafin g.segador het volgende geschreven:


hi!.

i am trying to migrate my database from 8.3.7 to 8.4 (both versions  
running on the same server windows 2003, hp ml350 quad core xeon)  
using pg_dump and restore using pgAdmin III R 1.10.0, this i was  
able to do.  but i cannot seem to make a backup on 8.4.  i keep  
getting the following error message:


pg_dump: reading user-defined tables
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  column reltriggers  
does not exist
LINE 1: ...oles WHERE oid = relowner) as rolname, relchecks,  
reltrigger...

 ^
pg_dump: The command was: SELECT c.tableoid, c.oid, relname, relacl,  
relkind, relnamespace, (SELECT rolname FROM pg_catalog.pg_roles  
WHERE oid = relowner) as rolname, relchecks, reltriggers,  
relhasindex, relhasrules, relhasoids, d.refobjid as owning_tab,  
d.refobjsubid as owning_col, (SELECT spcname FROM pg_tablespace t  
WHERE t.oid = c.reltablespace) AS reltablespace,  
array_to_string(c.reloptions, ', ') as reloptions from pg_class c  
left join pg_depend d on (c.relkind = 'S' and d.classid = c.tableoid  
and d.objid = c.oid and d.objsubid = 0 and d.refclassid = c.tableoid  
and d.deptype = 'a') where relkind in ('r', 'S', 'v', 'c') order by  
c.oid

pg_dump: *** aborted because of error
Process returned exit code 1.
is there any configuration setting i need to do.  any help would be  
appreciated.


regards

raffy segador
destiny cable inc
ncr, philipines


Frank Heikens
frankheik...@mac.com






[GENERAL] Table replication

2009-07-09 Thread S Arvind
Hi Members,
  Is there any way to sync a single table across DBs? We need a single
table to be identical in all DBs within a single postgres instance. Presently
we are doing this with the help of update, insert, and delete triggers.
Is there a better way to do that?

Arvind S
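[Editorial sketch, not from the thread.] The trigger-based approach Arvind describes can be illustrated end to end with SQLite from Python's standard library (table and column names here are made up); in PostgreSQL the trigger bodies would be PL/pgSQL functions, but the idea is the same: mirror every insert, update, and delete into the replica table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master  (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE replica (id INTEGER PRIMARY KEY, val TEXT);

    -- Mirror every change on master into replica.
    CREATE TRIGGER master_ins AFTER INSERT ON master BEGIN
        INSERT INTO replica VALUES (NEW.id, NEW.val);
    END;
    CREATE TRIGGER master_upd AFTER UPDATE ON master BEGIN
        UPDATE replica SET val = NEW.val WHERE id = NEW.id;
    END;
    CREATE TRIGGER master_del AFTER DELETE ON master BEGIN
        DELETE FROM replica WHERE id = OLD.id;
    END;
""")

conn.execute("INSERT INTO master VALUES (1, 'a')")
conn.execute("INSERT INTO master VALUES (2, 'b')")
conn.execute("UPDATE master SET val = 'c' WHERE id = 1")
conn.execute("DELETE FROM master WHERE id = 2")

print(conn.execute("SELECT * FROM replica").fetchall())   # [(1, 'c')]
```

This works within one database; to replicate across databases, the trigger would instead queue the change for a replication daemon, which is essentially what Slony and Londiste do.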


Re: [GENERAL] c++ program to connect to postgre database

2009-07-09 Thread Pedro Doria Meunier

Hi Rhey,

If you give Qt4 a try I might be able to help you there ;-)

Regards

Pedro Doria Meunier
GSM: +351 96 17 20 188
Skype: pdoriam
 



ramon rhey serrano wrote:
 Hello,

 I'm thinking of making a program in C++ (API) that will connect,
 read, and write to a PostgreSQL database. I honestly have no idea
 where to start and how to implement it in C++ using Dev C++ IDE.
 Any help would be greatly appreciated.
 Thanks.

 Sincerely,
 Rhey
 



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] sslv3 alert illegal parameter [SOLVED]

2009-07-09 Thread Pedro Doria Meunier

Hi Albe,

Txs for replying.

I did all the necessary mods to postgresql.conf and put as much
info as I could in the logs.

The 'culprit' was localhost, with no discernible reason whatsoever
... :O

Anyway, after exhausting all possibilities and learning that openssl
0.9.8b has memory leak issues :-( the only thing left for me was
restarting the server and, go figure, it went away on its own! :O

So as much as I'd love to know the reason, I guess I'll just be happy
that the thing resolved itself, albeit with a sort of
sword-dangling-over-my-head feeling ...

Best regards,

Pedro Doria Meunier
GSM: +351 96 17 20 188
Skype: pdoriam
 



Albe Laurenz wrote:
 Pedro Doria Meunier wrote:
 My server started spitting this out every second!

 LOG:  could not accept SSL connection: sslv3 alert illegal
 parameter

 PostgreSQL 8.3.7 on x86_64-redhat-linux-gnu, compiled by GCC gcc
  (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)

 Server key in place..

 What can I do about it?!

 Several ideas:

 You say started spitting: that suggests that this is a new
 phenomenon. What changes have there been since before it started?

 Every second: Try to find out what tries to connect every second.
 Set log_connections to on and find out the IP address.

 Once you have found the culprit, figure out how it tries to
 connect.

 Yours, Laurenz Albe





Re: [GENERAL] pg_dump PostgreSQL 8.4

2009-07-09 Thread serafin segador

i found where the error is.  thanks for the tip.

i have versions 8.3.7 and 8.4 of PostgreSQL running on the same server, as  
well as versions 1.8 and 1.10 of pgAdmin.  although i run the backup  
routine for pg8.4 from pgAdmin 1.10, pgAdmin uses  
PostgreSQL\8.3\bin\pg_dump by default.  i now run the backup routine from  
the command line utility, pointing to the correct version.


thank you.

regards,
raffy

On Thu, 09 Jul 2009 17:03:04 +0800, Frank Heikens frankheik...@mac.com  
wrote:



What version of pg_dump and pgAdmin are you using?

In PostgreSQL version 8.3 it's reltriggers, in version 8.4 it's  
relhastriggers. Just run the next query to see the difference:


SELECT * FROM pg_class;

Using pgAdmin 1.10, I don't have any problemes with 8.3 nor 8.4

Regards,
Frank

Op 9 jul 2009, om 10:24 heeft serafin g.segador het volgende geschreven:


hi!.

i am trying to migrate my database from 8.3.7 to 8.4 (both versions  
running on the same server windows 2003, hp ml350 quad core xeon) using  
pg_dump and restore using pgAdmin III R 1.10.0, this i was able to do.   
but i cannot seem to make a backup on 8.4.  i keep getting the  
following error message:


pg_dump: reading user-defined tables
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  column reltriggers does  
not exist
LINE 1: ...oles WHERE oid = relowner) as rolname, relchecks,  
reltrigger...

 ^
pg_dump: The command was: SELECT c.tableoid, c.oid, relname, relacl,  
relkind, relnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE  
oid = relowner) as rolname, relchecks, reltriggers, relhasindex,  
relhasrules, relhasoids, d.refobjid as owning_tab, d.refobjsubid as  
owning_col, (SELECT spcname FROM pg_tablespace t WHERE t.oid =  
c.reltablespace) AS reltablespace, array_to_string(c.reloptions, ', ')  
as reloptions from pg_class c left join pg_depend d on (c.relkind = 'S'  
and d.classid = c.tableoid and d.objid = c.oid and d.objsubid = 0 and  
d.refclassid = c.tableoid and d.deptype = 'a') where relkind in ('r',  
'S', 'v', 'c') order by c.oid

pg_dump: *** aborted because of error
Process returned exit code 1.
is there any configuration setting i need to do.  any help would be  
appreciated.


regards

raffy segador
destiny cable inc
ncr, philipines


Frank Heikens
frankheik...@mac.com










Re: [GENERAL] pg_dump PostgreSQL 8.4

2009-07-09 Thread Dave Page
On Thu, Jul 9, 2009 at 11:06 AM, serafin segadorrsega...@mydestiny.net wrote:
 i found where the error is.  thanks for the tip.

 i have versions 8.3.7 and 8.4 of PostgreSQL running on the same server, as
 well as versions 1.8 and 1.10 of pgAdmin.  although i run the backup routine
 for pg8.4 from pgAdmin 1.10, the pdAdmin uses PostgreSQL\8.3\bin\pg_dump as
 default.  i now run the backup routine from the command line utility
 pointing to the correct version.

Check the "PG bin path" option in File -> Options.

It also sounds like you disabled the version match check on that
dialogue, otherwise pg_dump 8.3 should have complained about being used
with 8.4.

-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com



Re: [GENERAL] Table replication

2009-07-09 Thread Richard Huxton

S Arvind wrote:

Hi Members,
  Is there any way to sync a single table across the DBs. We need a single
table alone to be equal in all DBs in the single postgres. Presenly we are
doing this with the help of Update, insert, delete trigger.
Is there any other best way for that.


Most trigger-based replication systems should work fine with one or more 
tables: slony, londiste etc. That's assuming a single master DB that 
receives all updates.


--
  Richard Huxton
  Archonet Ltd



Re: [GENERAL] c++ program to connect to postgre database

2009-07-09 Thread Rainer Bauer
John R Pierce schrieb:

ramon rhey serrano wrote:
 Hi Sir John,

 Thanks for the links and reply.
 I downloaded this libpqxx-3.0 but i really don't know what to do 
 with the file and where to put them. I'm still having hard time how to 
 start the C++ program using Dev C++ as my IDE, what files do i need to 
 install, what headers to use, how the basic flow of program should 
 look like (maybe for a simple database connection). I already 
 installed PostgreSQL version 1.8.4 in my PC (windows). I have basic 
 understanding and background in C++ programming, but totally new to 
 PostgreSQL database system.

as I said earlier, I'd probably just use C API calls to libpq, and do my 
own C++ wrappings.C++ libraries pretty much have to be compiled for 
the speciifc compiler environment, so I'd have to assume youd load the 
libpqxx project source into your IDE, and build it.

The best C++ library I came across so far is SOCI:
http://soci.sourceforge.net/

Maybe this could be a good starting point for Ramon since he is familiar with
C++.

Rainer



Re: [GENERAL] c++ program to connect to postgre database

2009-07-09 Thread Peter Geoghegan
I'm a big libpqxx user, and can confirm that it's a great wrapper, but
good luck getting it to build with Dev-C++, which hasn't had a new
release in 4 years and isn't supported by pqxx. I'm not sure of the
specifics, but Dev-C++ uses a MinGW port of GCC. libpqxx does support
MinGW, but probably not the Dev-C++ flavour.

I suggest you download Visual Studio 2008 Express and use that to
build libpqxx. No one uses Dev-C++ anymore anyway.

Regards,
Peter Geoghegan



Re: [GENERAL] ZFS prefetch considered evil?

2009-07-09 Thread Alban Hertroys

On Jul 9, 2009, at 3:53 AM, Yaroslav Tykhiy wrote:


On 08/07/2009, at 8:39 PM, Alban Hertroys wrote:


On Jul 8, 2009, at 2:50 AM, Yaroslav Tykhiy wrote:
IIRC prefetch tries to keep data (disk blocks?) in memory that it  
fetched recently.


What you described is just a disk cache.  A trivial
implementation of prefetch would work as follows: an application or
other file/disk consumer asks the provider (driver, kernel,
whatever) to read, say, 2 disk blocks worth of data.  The provider
thinks, "I know you are short-sighted; I bet you are going to ask
for more contiguous blocks very soon", so it schedules a disk read
for many more contiguous blocks than requested and caches them in
RAM.  For bulk data applications such as file serving this trick
works like a charm.  But other applications do truly random access and
never come back for the prefetched blocks; in this case both
disk bandwidth and cache space are wasted.  An advanced
implementation can try to distinguish sequential and random access
patterns, but in reality that appears to be a challenging task.
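[Editorial sketch, not from the thread.] The trade-off described above can be made concrete with a toy model: count the physical reads a naive readahead scheme issues for a sequential trace versus a random one (block numbers and the readahead size are made up).

```python
import random

def device_reads(requests, readahead=8):
    """Count physical reads for a block-access trace, with naive readahead:
    every cache miss fetches `readahead` contiguous blocks into the cache."""
    cache = set()
    reads = 0
    for block in requests:
        if block not in cache:
            reads += 1
            cache.update(range(block, block + readahead))
    return reads

sequential = list(range(64))                               # file-serving style
rng = random.Random(1)
scattered = [rng.randrange(1_000_000) for _ in range(64)]  # database style

print(device_reads(sequential))   # 8  -- every prefetched block gets used
print(device_reads(scattered))    # ~64 -- prefetched blocks are wasted
```

The sequential consumer needs one physical read per readahead window, while the random consumer pays one read per request and fills the cache with blocks it never touches, which is the cache-eviction pressure mentioned later in the thread.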


Ah yes, thanks for the correction, I now remember reading about that
before. Makes the name 'prefetch' all the more fitting, doesn't it?


And as you say, it's not that useful a feature with random access
(I hadn't thought about that); in fact, I can imagine that it might
delay moving the disk heads to the next desired (random) position, as
the FS is still requesting data that it isn't going to need
(except in some lucky cases) - unless it manages to detect the
randomness of the access pattern. You can't predict randomness from
the read requests alone, of course; you don't know about the requests
that are still to come. You can, however, assume something like that is
the case if historic requests turned out to be random by nature, but then
you'd want to know for which area of the FS this holds.


I don't know how you partitioned your zpools, but to me it seems like  
it'd be preferable to have the PostgreSQL tablespaces (and possibly  
other data that's likely to be accessed randomly) in a separate zpool  
from the rest of the system so you can restrict disabling prefetch to  
just that file-system. You probably already did that...


It could be interesting to see how clustering the relevant tables  
would affect the prefetch performance, I'd expect disk access to be  
less random that way. It's probably still better to disable prefetch  
though.


ZFS uses quite a bit of memory, so if you distributed all your  
memory to be used by just postgres and disk cache then you didn't  
leave enough space for the prefetch data and _something_ will be  
moved to swap.


I hope you know that FreeBSD is exceptionally good at distributing  
available memory between its consumers.  That said, useless prefetch  
indeed puts extra pressure on disk cache and results in unnecessary  
cache evictions, thus making things even worse.  It is true that ZFS  
is memory hungry and so rather sensitive to non-optimal memory use  
patterns.  Useless prefetch wastes memory that could be used to  
speed up other ZFS operations.


Yes, I do know that, it's one of the reasons I prefer it over other  
OSs. The keyword here was 'available memory' though, under the  
assumption that something was hitting swap. But apparently that wasn't  
the case.


You'll probably want to ask about this on the FreeBSD mailing lists  
as well, they'll know much better than I do ;)


Are you a local FreeBSD expert? ;-)  Jokes aside, I don't think this
topic has to do with FreeBSD as such; it is mostly about making the
advanced technologies of PostgreSQL and ZFS work well together.  Even
ZFS developers admit that in database-related applications,
exceptions from general ZFS practices and rules may be called for.


I wouldn't call myself an expert, I just use it on a few systems at  
home and am more a user than an administrator. I do read the stable/ 
current mailing lists though (since 2004 according to my mail client)  
and keep an eye on (among others) the ZFS discussions as I feel  
tempted to change my gmirrors into zpools some day. It certainly looks  
like an interesting FS, very flexible and reliable.


Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.







Re: [GENERAL] Bug in ecpg lib ?

2009-07-09 Thread leif
   Hello Laurenz,

   Thank you for your very thorough walk through the 'ecpg use' of threads with
respect to the sqlca. It was very clear and specific. I reproduced what you did
almost exactly, and I could then also play around with things to see what
happens 'if'... I have learned a lot about threads and ecpg, which I'm sure
will be very helpful. Also, I'm afraid I have to agree with you that it must be
a mudflap flop ;-)   ...   unfortunately, because now I'm back to the real
problem in the larger program and how to track down that error.

   I'm pleased that it wasn't an ecpg bug, and I know now not to use mudflap
for tracking my problem.

   Thanks for your big effort on this,

 Leif


- Albe Laurenz laurenz.a...@wien.gv.at wrote:

 l...@crysberg.dk wrote:
  I have been trying to figure this thing out myself too, 
  breakpointing and single stepping my way through some of the 
  ecpg code, but without much clarification. (More that I 
  learned new things about pthread). I have been trying to 
  figure out whether this is a real thing or more a mudflapth 
  mis-judgement. Also on most (the faster ones) machines 
  mudflap complains either about invalid pointer in free() or 
  double free() or corruption. I haven't been able to verify 
  this yet. Specifically on one (slower) machine, I have only 
  seen this mudflapth complaint once, though I have been both 
  running and debugging it on that many times.
  
  Are you sure what you suggest is nonsense ? In the light 
  of the sqlca struct being local to each thread ? I tried to 
  put the open and close connection within the thread, but I 
  was still able to get the mudflap complaint. Theoretically, I 
  guess one could use just 1 connection for all db access in 
  all threads just having them enclosed within 
  pthread_mutex_[un]lock()s !? (Not what I do, though.)
 
 The sqlca is local to each thread, but that should not be a problem.
 On closer scrutiny of the source, it works like this:
 
 Whenever a thread performs an SQL operation, it will allocate
 an sqlca in its thread-specific data area (TSD) in the ECPG function
 ECPGget_sqlca(). When the thread exits or is cancelled, the
 sqlca is freed by pthread by calling the ECPG function
 ecpg_sqlca_key_destructor(). pthread makes sure that each
 destructor function is only called once per thread.
 
 So when several threads use a connection, there will be
 several sqlca's around, but that should not matter as they get
 freed when the thread exits.
 
 After some experiments, I would say that mudflap's complaint
 is a mistake.
 
 I've compiled your program against a debug-enabled PostgreSQL 8.4.0
 with
 
 $ ecpg crashex
 
 $ gcc -Wall -O0 -g -o crashex crashex.c -I
 /magwien/postgres-8.4.0/include \
 -L/magwien/postgres-8.4.0/lib -lecpg
 -Wl,-rpath,/magwien/postgres-8.4.0/lib
 
 and run a gdb session:
 
 $ gdb
 GNU gdb Red Hat Linux (6.3.0.0-1.138.el3rh)
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and
 you are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for
 details.
 This GDB was configured as i386-redhat-linux-gnu.
 
Set the program to be debugged:
 
 (gdb) file crashex
 Reading symbols from /home/laurenz/ecpg/crashex...done.
 Using host libthread_db library /lib/tls/libthread_db.so.1.
 
This is where the source of libecpg is:
 
 (gdb) dir
 /home/laurenz/rpmbuild/BUILD/postgresql-8.4.0/src/interfaces/ecpg/ecpglib
 Source directories searched:
 /home/laurenz/rpmbuild/BUILD/postgresql-8.4.0/src/interfaces/ecpg/ecpglib:$cdir:$cwd
 
Start the program (main thread):
 
 (gdb) break main
 Breakpoint 1 at 0x804892c: file crashex.pgc, line 54.
 (gdb) run
 Starting program: /home/laurenz/ecpg/crashex 
 [Thread debugging using libthread_db enabled]
 [New Thread -1218572160 (LWP 29290)]
 [Switching to Thread -1218572160 (LWP 29290)]
 
 Breakpoint 1, main (argc=1, argv=0xbfffce44) at crashex.pgc:54
 54  PerformTask( 25 );
 (gdb) delete
 Delete all breakpoints? (y or n) y
 
Set breakpoint #2 in the function where sqlca is freed:
 
 (gdb) break ecpg_sqlca_key_destructor
 Breakpoint 2 at 0x457a27: file misc.c, line 124.
 (gdb) list misc.c:124
 119   
 120   #ifdef ENABLE_THREAD_SAFETY
 121   static void
 122   ecpg_sqlca_key_destructor(void *arg)
 123   {
 124   free(arg);  /* sqlca 
 structure allocated in ECPGget_sqlca */
 125   }
 126   
 127   static void
 128   ecpg_sqlca_key_init(void)
 
Set breakpoint #3 where a new sqlca is allocated in
 ECPGget_sqlca():
 
 (gdb) break misc.c:147
 Breakpoint 3 at 0x457ad2: file misc.c, line 147.
 (gdb) list misc.c:134,misc.c:149
 134   struct sqlca_t *
 135   ECPGget_sqlca(void)
 136   {
 137   #ifdef ENABLE_THREAD_SAFETY
 138   struct sqlca_t *sqlca;
 139   
 140 

[GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Ben Harper
Hi,
Can anybody explain this:

Records: 600,000
Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25)
Field is Indexed.

SELECT DISTINCT field FROM table;

Takes about 6 seconds. There are 111 distinct items.

On Sqlite, and another place where I have a B+Tree, this query is
faster than my eye can measure.

Is this a well known issue?

Thanks,
Ben
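[Editorial note, not part of the message.] A common workaround for a slow DISTINCT over a column with few distinct values is a "loose index scan": a recursive query that jumps from one distinct value straight to the next via the index instead of reading all rows (WITH RECURSIVE is available from PostgreSQL 8.4 on). A runnable sketch of the pattern using Python's stdlib sqlite3, with made-up data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field TEXT)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [(v,) for v in ["b", "a", "c", "a", "b", "c", "a"]])
conn.execute("CREATE INDEX t_field_idx ON t (field)")

# Each recursive step finds the smallest value greater than the previous
# one, so only one index probe per distinct value is needed.
distinct = [r[0] for r in conn.execute("""
    WITH RECURSIVE d(f) AS (
        SELECT min(field) FROM t
        UNION ALL
        SELECT (SELECT min(field) FROM t WHERE field > d.f)
        FROM d WHERE d.f IS NOT NULL
    )
    SELECT f FROM d WHERE f IS NOT NULL
""")]
print(distinct)   # ['a', 'b', 'c']
```

With 111 distinct values over 600,000 rows this does ~111 index probes instead of a full scan, which is essentially what the B+Tree engines Ben mentions do natively.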



[GENERAL] Asking for assistance in determining storage requirements

2009-07-09 Thread Chris Barnes

Your assistance is appreciated.

I have a question regarding disk storage for postgres servers.

We are thinking long term about scalable storage and performance and would
like some advice or feedback about what other people are using.

We would like to get as much performance from our file systems as possible.

We use an IBM 3650 quad processor with an onboard SAS controller (3Gb/sec)
with 15,000rpm drives.
We use RAID 1 for the CentOS operating system and the WAL archive logs.
The postgres database is on 5 drives configured as RAID 5 with a global hot
spare.

We are curious about using a SAN with a fibre channel HBA, and whether anyone
else uses this technology.
We would also like to know if people have a preference for the level of RAID,
with/without striping.


Sincerely,

Chris Barnes
Recognia Inc.
Senior DBA


[GENERAL] PostgreSQL 8.4 Window functions

2009-07-09 Thread Paolo Saudin
Hi all,

I am trying to calculate an 8-hour moving average using the new
window functions, without success. Here is what I am trying to do:

-- create test table
CREATE TABLE temperatures
(
  fulldate timestamp NOT NULL PRIMARY KEY,
  value numeric
);

-- inserts
INSERT INTO temperatures 
select '2009-07-01 00:00:00'::timestamp + interval '1 hour' * s.a as
fulldate, 
round(cast(random() as numeric), 1) as value from 
generate_series(0,23) as s(a)
;

-- selects
select * from temperatures order by fulldate;

-- window function
SELECT fulldate, value, avg(value) OVER () FROM temperatures ORDER BY fulldate;

SELECT fulldate, value, avg(value) OVER (ORDER BY fulldate) FROM temperatures
ORDER BY fulldate;

SELECT fulldate, value, round(avg(value) OVER (ORDER BY fulldate RANGE
UNBOUNDED PRECEDING), 2) as value FROM temperatures ORDER BY fulldate;
-- not supported
SELECT fulldate, value, round(avg(value) OVER (ORDER BY fulldate RANGE -8
PRECEDING), 2) as value FROM temperatures ORDER BY fulldate;

Is there any way to PARTITION on a subset of rows (in this case 8) ?

-- expected result
-- date/time            value   moving-average
2009-07-01 00:00:00 0,3 
2009-07-01 01:00:00 0,1 
2009-07-01 02:00:00 0,5 
2009-07-01 03:00:00 0,1 
2009-07-01 04:00:00 0,2 
2009-07-01 05:00:00 0,7 
2009-07-01 06:00:00 0,9 
2009-07-01 07:00:00 0,7 0,44
2009-07-01 08:00:00 0   0,4
2009-07-01 09:00:00 0,9 0,5
2009-07-01 10:00:00 0,8 0,54
2009-07-01 11:00:00 0,4 0,58
2009-07-01 12:00:00 0,6 0,63
2009-07-01 13:00:00 0,4 0,59
2009-07-01 14:00:00 0,7 0,56
2009-07-01 15:00:00 0,2 0,5
2009-07-01 16:00:00 0,2 0,53
2009-07-01 17:00:00 0,5 0,48
2009-07-01 18:00:00 0,7 0,46
2009-07-01 19:00:00 0   0,41
2009-07-01 20:00:00 0,4 0,39
2009-07-01 21:00:00 0,9 0,45
2009-07-01 22:00:00 0,4 0,41
2009-07-01 23:00:00 0,7 0,48
0,51
0,52
0,48
0,6
0,67
0,55
0,7
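For what it's worth, one possible workaround on 8.4 (an editor's sketch, not from the original thread): while `RANGE` with a numeric offset is not supported, `ROWS` frames with numeric offsets are, which is enough for a fixed 8-row window:

```sql
-- Sketch: 8-row moving average using a ROWS frame (supported in 8.4),
-- instead of the unsupported "RANGE 8 PRECEDING".
SELECT fulldate, value,
       round(avg(value) OVER (ORDER BY fulldate
                              ROWS BETWEEN 7 PRECEDING AND CURRENT ROW),
             2) AS moving_average
FROM temperatures
ORDER BY fulldate;
```

Note one difference from the expected output above: the first 7 rows get an average over the shorter partial window rather than a blank.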

Thanks in advance
Paolo Saudin



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [Password?]

2009-07-09 Thread Ms swati chande
Hi,
 
Thank you all for your kind responses.
 
Things however aren't falling in place. 
 
Will take a short break, rework, and get back.
Probably, with a new problem!
 
Thanks again,
Regards
Swati


  

Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Bill Moran
In response to Ben Harper rogo...@gmail.com:
 Hi,
 Can anybody explain this:
 
 Records: 600,000
 Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25)
 Field is Indexed.
 
 SELECT DISTINCT field FROM table;
 
 Takes about 6 seconds. There are 111 distinct items.

What's the output of EXPLAIN ANALYZE SELECT DISTINCT field FROM table;?

Does a VACUUM ANALYZE of the table help?  Is the query significantly
faster the second time you run it?

 Is this a well known issue?

Not that I'm aware of.

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Andres Freund
On Thursday 09 July 2009 17:09:13 Ben Harper wrote:
 Hi,
 Can anybody explain this:

 Records: 600,000
 Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25)
 Field is Indexed.

 SELECT DISTINCT field FROM table;

 Takes about 6 seconds. There are 111 distinct items.

 On Sqlite, and another place where I have a B+Tree, this query is
 faster than my eye can measure.

 Is this a well known issue?
Yes, I think so.

AFAIK the primary cause is that indexes in pg do not store visibility 
information. That means you need to check for existence of the tuple on the 
heap.
Probably because of that, PG has no special-case code for DISTINCT to optimize such a
query using mostly the index. In principle, for each possible value of 'field', the
index could be checked only long enough to prove that there is at least one such
entry.
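One common workaround (an editor's sketch, not part of this reply; assumes a table `t` with a b-tree index on `field`) is to emulate that "loose index scan" with a recursive CTE, available since 8.4:

```sql
-- Sketch of a loose-index-scan emulation: each recursion step jumps
-- to the next distinct value via the index, instead of scanning and
-- de-duplicating all 600,000 heap rows.
WITH RECURSIVE distinct_vals AS (
    (SELECT field FROM t ORDER BY field LIMIT 1)
  UNION ALL
    SELECT (SELECT field FROM t
            WHERE field > d.field
            ORDER BY field LIMIT 1)
    FROM distinct_vals d
    WHERE d.field IS NOT NULL
)
SELECT field FROM distinct_vals WHERE field IS NOT NULL;
```

With only 111 distinct values this does ~111 index probes rather than one full scan.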

Taking that single field into its own table is not possible?

Andres

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Pavel Stehule
Hello

if you use a pg older than 8.3, please use GROUP BY:

SELECT field FROM table GROUP BY field;

Regards
Pavel Stehule

2009/7/9 Ben Harper rogo...@gmail.com:
 Hi,
 Can anybody explain this:

 Records: 600,000
 Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25)
 Field is Indexed.

 SELECT DISTINCT field FROM table;

 Takes about 6 seconds. There are 111 distinct items.

 On Sqlite, and another place where I have a B+Tree, this query is
 faster than my eye can measure.

 Is this a well known issue?

 Thanks,
 Ben

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Checkpoint Tuning Question

2009-07-09 Thread Dan Armbrust
 As Greg commented upthread, we seem to be getting forced to the
 conclusion that the initial buffer scan in BufferSync() is somehow
 causing this.  There are a couple of things it'd be useful to try
 here:

 * see how the size of the hiccup varies with shared_buffers;

I tried decreasing shared buffers - both 25MB and 50MB were too small
for my load - I had slow queries at all times.

So then I increased it from what I was using - 100MB - to 500MB - and
the hiccup roughly doubles in length.

At 100MB, the hiccup is about 2-3 seconds long.
At 500MB, the hiccup is about 6 seconds long.


 * try inserting a delay into that scan loop, as per attached
  quick-and-dirty patch.  (Numbers pulled from the air, but
  we can worry about tuning after we see if this is really
  where the problem is.)


After finally getting this particular system into a state where I
could build postgres (I was using the binary install), I built 8.3.4
with your patch - but I didn't see any change in the behaviour.  I
see hiccups that appear to be the same length as I saw on the binary
build of 8.3.4.

Thanks,

Dan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table replication

2009-07-09 Thread Arndt Lehmann
On Jul 9, 7:53 pm, d...@archonet.com (Richard Huxton) wrote:
 S Arvind wrote:
  Hi Members,
    Is there any way to sync a single table across the DBs? We need a single
   table alone to be equal in all DBs in the single postgres. Presently we are
   doing this with the help of update, insert, delete triggers.
   Is there any other better way to do that?

 Most trigger-based replication systems should work fine with one or more
 tables: slony, londiste etc. That's assuming a single master DB that
 receives all updates.

 --
    Richard Huxton
    Archonet Ltd

 --
 Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org)
 To make changes to your 
 subscription:http://www.postgresql.org/mailpref/pgsql-general

And tools like bucardo or rubyrep can do this in a master-master
replication.

Regards,
  Arndt Lehmann

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] is autovacuum recommended?

2009-07-09 Thread Willy-Bas Loos
Hi,

Whenever I start a big action, like inserting millions of records or doing a
large update, autovacuum fires on top of that.
It has some adverse effects on performance when i need it most. More than
once a postgres service crashed on me because of it.
Sure, it had too little memory, but it wouldn't have happened if it wasn't
for the autovacuum.

Should I keep autovacuum on, or is it better to run a script like:
vacuumdb -az (daily except one day)
vacuumdb -azf (once a week)

My database is growing, so really freeing up space isn't such an issue. What
I want is optimal performance.
Of course autovacuum is more generic, but for me I think the daily scheme
will be sufficient.
A connection told me it would be better to enable the autovacuum, because it
does more than the above script. Can anyone verify that?

Cheers,

WBL
-- 
Patriotism is the conviction that your country is superior to all others
because you were born in it. -- George Bernard Shaw


Re: [GENERAL] is autovacuum recommended?

2009-07-09 Thread Alvaro Herrera
Willy-Bas Loos wrote:
 Hi,
 
 Whenever i start a big action, like inserting millions of recs or doing a
 large update, the autovacuum fires on top of that.
 It has some adverse effects on performance when i need it most. More than
 once a postgres service crashed on me because of it.
 Sure, it had too little memory, but it wouldn't have happened if it wasn't
 for the autovacuum.

1. the server should never crash.  If it does you have another problem
that perhaps is triggered by the high load.  Investigate that.

2. what PG version are you running?  What operating system?

3. You can turn autovacuum off for particular tables.  This would allow
you to have it process most tables, and manually vacuum only the
specific tables that are causing you grief.
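As a sketch of point 3 (8.4 syntax, where per-table autovacuum settings became storage parameters; the table name is hypothetical):

```sql
-- Disable autovacuum for one heavily-loaded table only; it can then
-- be vacuumed manually at a convenient time.
ALTER TABLE busy_table SET (autovacuum_enabled = false);

-- Revert to the global autovacuum behaviour later:
ALTER TABLE busy_table RESET (autovacuum_enabled);
```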

 Should I keep autovacuum on, or is it better to run a script like:
 vacuumdb -az (daily except one day)
 vacuumdb -azf (once a week)

4. VACUUM FULL is (almost) never needed unless your server is not
properly tuned.  If you find a need for it, it's a symptom that you need
to tweak something somewhere.  Need more details about the problem to
give you more advice.

2a. Upgrading to 8.4 may give you several benefits in this area.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] is autovacuum recommended?

2009-07-09 Thread Andres Freund
Hi,

On Thursday 09 July 2009 19:25:15 Willy-Bas Loos wrote:
 Whenever i start a big action, like inserting millions of recs or doing a
 large update, the autovacuum fires on top of that.
You can configure autovacuum to use less resources.
http://www.postgresql.org/docs/current/static/runtime-config-autovacuum.html
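For example (a hedged postgresql.conf sketch; the values are illustrative, not recommendations), the cost-based delay settings let autovacuum yield I/O to foreground work instead of being disabled outright:

```ini
# Throttle autovacuum rather than turning it off:
autovacuum_vacuum_cost_delay = 40ms   # sleep longer between work batches
autovacuum_vacuum_cost_limit = 100    # do less work before each sleep
```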

 It has some adverse effects on performance when i need it most. More than
 once a postgres service crashed on me because of it.
 Sure, it had too little memory, but it wouldn't have happened if it wasn't
 for the autovacuum.
That sounds like misconfiguration. Did PG get OOM-killed?

It is also possible that your max_fsm_pages was too small, so autovacuum had
to start all over.

 My database is growing, so really freeing up space isn't such an issue.
 What I want is optimal performance.
Well, not doing vacuum regularly can lead to significant bloat, which makes
lookups (and thus updates/deletes) much slower.

 Of course autovacuum is more generic, but for me i think the daily scheme
 will be sufficient.
That may be. Hard to say without knowing more, though. How fast are your tables
changing? Inserts only, or also updates/deletes?

 A connection told me it would be better to enable the autovacuum, because
 it does more than the above script. Can anyone verify that?
I don't think it really does more.


 vacuumdb -azf (once a week)
Using -f is normally not a good idea. It takes an exclusive lock on the tables 
- so you cannot do anything in those tables.

Which version of PG are you using?

Another possibility would be to upgrade to 8.4 - keeping track of free space 
works quite different there.

Andres

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Performance problem with low correlation data

2009-07-09 Thread Alvaro Herrera
m_li...@yahoo.it wrote:

 testinsert contains t values between '2009-08-01' and '2009-08-09', and ne_id 
 from 1 to 2. But only 800 out of 2 ne_id have to be read; there's no 
 need for a table scan!
 I guess this is a reflection of the poor correlation on ne_id; but, as I 
 said, I don't really think ne_id is so bad correlated.
 In fact, doing a select ne_id, t from testinsert limit 10  I can see 
 that data is laid out pretty much by ne_id, t, grouped by day (that is, 
 same ne_id for one day, then next ne_id and so on until next day).
 How is the correlation calculated? Can someone explain to me why, after the 
 procedure above,correlation is so low???

Did you run ANALYZE after the procedure above?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] is autovacuum recommended?

2009-07-09 Thread Bill Moran
In response to Willy-Bas Loos willy...@gmail.com:
 
 Whenever i start a big action, like inserting millions of recs or doing a
 large update, the autovacuum fires on top of that.
 It has some adverse effects on performance when i need it most. More than
 once a postgres service crashed on me because of it.
 Sure, it had too little memory, but it wouldn't have happened if it wasn't
 for the autovacuum.

If autovacuum is interfering with performance, you have two basic choices:
disable autovac or add hardware (assuming your system is already optimally
tuned).

 Should I keep autovacuum on, or is it better to run a script like:
 vacuumdb -az (daily except one day)
 vacuumdb -azf (once a week)

VACUUM FULL is seldom a good choice for a scheduled activity.  That being
said, there are some corner use cases where a regular VACUUM FULL is a
good idea.

Also, it's likely that a daily vacuum won't be enough.  If daily vacuuming
were sufficient, autovac wouldn't be kicking off as often as it does.  Once you
start vacuuming multiple times per day, you're back to interfering with
performance again.  However, if you're doing it on a controlled schedule,
it's possible that you can schedule it in such a way that it's less
intrusive.

Overall, however, you're probably going to have to solve the problem by
adding hardware, or better tuning your system.

 My database is growing, so really freeing up space isn't such an issue. What
 I want is optimal performance.

If you _only_ do inserts, then vacuum isn't required at all.  If you do _ANY_
UPDATE or DELETE operations, then you still need vacuum or your filesystem
will fill up with data that's no longer used.  If autovacuum is kicking off,
then you need to do vacuums, as autovac isn't random, it uses statistics on
the usage of tables to determine if it's needed.

You can also manually vacuum as part of your application.  For example,
vacuum a table manually after a bunch of UPDATEs.  If you do this
intelligently, you can leave autovacuum enabled, and it will only pick
vacuum tables that aren't done manually.

 Of course autovacuum is more generic, but for me i think the daily scheme
 will be sufficient.

Unlikely.  Unless your database sees very few updates, daily vacuum probably
isn't often enough.  However, your usage pattern will dictate that.

 A connection told me it would be better to enable the autovacuum, because it
 does more than the above script. Can anyone verify that?

Autovacuum is smarter -- it won't vacuum tables that don't need it, whereas
the above script vacuums everything, even if it doesn't need it.

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] constraint checking on partitions

2009-07-09 Thread Chris Spotts
I have several partitions on a history table that are partitioned by a date
range (monthly).  However, it's possible for an unexpected but valid date
(either very far in the future or very far in the past) to come in the data
set and so there is an overflow table.

Say table A is parent, B  is April data, C is June data, D is July data, and
O is overflow data.

I set up several stored procedures to facilitate adding triggers,
constraints, etc. for partitions.  These procs, in addition to adding the
constraint to the normal partitions, also add a NOT constraint to the
overflow table.  I.e., when the July partition is created with

 

alter table D add constraint onlyjuly check (date1 >= '2009-07-01' and date1
< '2009-07-01')

Then this is also run

alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and
date1 < '2009-07-01'))

 

The planner excludes correctly except that it always checks O.

It doesn't seem to be able to use the multiple constraints on O.

Are multiple NOT constraints too much for the planner for excluding
partitions?

 

 

 

 

 

postgres=# select version();

   version


--

 PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
3.4.6 20060404 (Red Hat 3.4.6-8.0.1), 64-bit

 

 

 

Chris Spotts

 



Re: [GENERAL] ZFS prefetch considered evil?

2009-07-09 Thread John R Pierce

Alban Hertroys wrote:
I don't know how you partitioned your zpools, but to me it seems like 
it'd be preferable to have the PostgreSQL tablespaces (and possibly 
other data that's likely to be accessed randomly) in a separate zpool 
from the rest of the system so you can restrict disabling prefetch to 
just that file-system. You probably already did that...


It could be interesting to see how clustering the relevant tables 
would affect the prefetch performance, I'd expect disk access to be 
less random that way. It's probably still better to disable prefetch 
though.


in fact, somewhere in Sun.com land there's an app-note that suggests 
creating TWO ZFS mount-points for Postgres: one for the $PGDATA 
directory, which uses 128k blocks, and another for a tablespace that you 
put all your regular databases in, which uses 8k blocks.  The idea is that 
WAL logging is relatively sequential and takes place in the 128k-block 
ZFS, while access to the actual database table files is far more random.


These two ZFS file systems can be made in the same zpool; the normal 
recommendation is to have one large non-root zpool mirror for all your data 
(and another smaller zpool mirror for your OS, at least assuming you have 
more than two physical disk drives).



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Asking for assistance in determining storage requirements

2009-07-09 Thread Vick Khera
On Thu, Jul 9, 2009 at 11:15 AM, Chris
Barnescompuguruchrisbar...@hotmail.com wrote:
 We are curious about using SAN with fiber channel hba and if anyone else
 uses this technology.

 We would also like to know if people have preference to the level of raid
 with/out striping.

I use SurfRAID Triton external RAID units connected to Sun X4100
boxes via LSI Fibre Channel cards.  I run them as RAID6 plus hot spare
with a total of 16 drives.  This is extremely fast and tolerates up
to 2 disk failures.  The key is to have 1 or 2 GB of cache on the
RAID units.  I also crank up the RAM on the servers to at least 20 GB.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] constraint checking on partitions

2009-07-09 Thread Tom Lane
Chris Spotts rfu...@gmail.com writes:
 Then this is also run

 alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and
 date1 < '2009-07-01'))

 The planner excludes correctly except that it always checks O.

What are the actual queries you're hoping it will exclude for?

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] constraint checking on partitions

2009-07-09 Thread Chris Spotts

 -Original Message-
 From: Tom Lane [mailto:t...@sss.pgh.pa.us]
 Sent: Thursday, July 09, 2009 1:52 PM
 To: Chris Spotts
 Cc: 'postgres list'
 Subject: Re: [GENERAL] constraint checking on partitions
 
 Chris Spotts rfu...@gmail.com writes:
  Then this is also run
 
  alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01'
 and
  date1 < '2009-07-01'))
 
  The planner excludes correctly except that it always checks O.
 
 What are the actual queries you're hoping it will exclude for?
 
   regards, tom lane
[Spotts, Christopher] 

I mistyped, that should be 

alter table D add constraint onlyjuly check (date1 >= '2009-07-01' and date1
< '2009-08-01')
Then this is also run
alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and
date1 < '2009-08-01'))

If I ran a select * from A where date1 >= '2009-07-02' and date1 <
'2009-07-15' then I would think it wouldn't check O. 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] constraint checking on partitions

2009-07-09 Thread Tom Lane
Chris Spotts rfu...@gmail.com writes:
 I mistyped, that should be 

 alter table D add constraint onlyjuly check (date1 >= '2009-07-01' and date1
 < '2009-08-01')
 Then this is also run
 alter table O add constraint notjuly check (NOT(date1 >= '2009-07-01' and
 date1 < '2009-08-01'))

 If I ran a select * from A where date1 >= '2009-07-02' and date1 <
 '2009-07-15' then I would think it wouldn't check O. 

Works for me ...

regression=# create table a (date1 date);
CREATE TABLE
regression=# create table july() inherits(a);
CREATE TABLE
regression=# create table other() inherits(a);
CREATE TABLE
regression=# alter table other add constraint notjuly check (NOT(date1 >= 
'2009-07-01' and date1 < '2009-08-01'));
ALTER TABLE
regression=# explain select * from a where date1 >= '2009-07-02' and date1 < 
'2009-07-15';
   QUERY PLAN   


 Result  (cost=0.00..92.00 rows=24 width=4)
   ->  Append  (cost=0.00..92.00 rows=24 width=4)
         ->  Seq Scan on a  (cost=0.00..46.00 rows=12 width=4)
               Filter: ((date1 >= '2009-07-02'::date) AND (date1 < '2009-07-15'::date))
         ->  Seq Scan on july a  (cost=0.00..46.00 rows=12 width=4)
               Filter: ((date1 >= '2009-07-02'::date) AND (date1 < '2009-07-15'::date))
(6 rows)

regression=# 

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] constraint checking on partitions

2009-07-09 Thread Chris Spotts
 
  If I ran a select * from A where date1 >= '2009-07-02' and date1 <
  '2009-07-15' then I would think it wouldn't check O.
 
[Spotts, Christopher] 
I oversimplified this too much - but I figured out what was happening. 
If you add the June table as well, with a separate NOT constraint for
June, and then write the query 
SELECT * from A where date1 >= '2009-06-01' and date1 < '2009-07-05', the
planner can't match the predicate to the individual constraints, so it doesn't
exclude. Theoretically the planner could logically AND the NOT constraints
together to get better exclusion, but it evidently doesn't.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Trying to find a low-cost program for Data migration and ETL

2009-07-09 Thread Tguru


How about using an open source ETL tool? You could go with Talend Open
Studio, it is an open source program to do data migration, ETL and data
synchronization.

Check the site: http://www.talend.com/





Scott Mead-3 wrote:
 
 On Tue, Jul 7, 2009 at 11:48 AM, Rstat tom.rus...@gmail.com wrote:
 


 Hi, I'm building a database for my company. We are a rather small-size book
 company with a lot of references and still growing.

 We have a MySQL database here and are trying to find some good tools to use
 it at its best. Basically we are just starting up the database after dealing
 with Excel: we had a size problem… So I'm trying to find a program that will
 allow us to do two different things: the migration of our data from the old
 system to the new one, and specialized software to perform ETL (Extract,
 transform and load) on our database.

 About the price of the tools, if we were one year ago, the accounting
 department would have been pretty relaxed about this. But today, we have
 some budget restrictions and therefore need a low cost tool. So could you
 give me some advice on a good data migration and etl tool for a low cost?

 Thanks for your help.
 
 
   You may have some luck by viewing a similar thread on another mailing
 list:
 
 http://www.theserverside.net/discussions/thread.tss?thread_id=54755
 
 
 -- Scott
 
 

-- 
View this message in context: 
http://www.nabble.com/Trying-to-find-a-low-cost-program-for-Data-migration-and-ETL-tp24375920p24407304.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Postgres 8.4 literal escaping

2009-07-09 Thread Niederland
with Postgres 8.4:

SQL 1: update customer set note = ('A' || '\r\n' || 'B') where 1=0;

generates:
WARNING:  nonstandard use of escape in a string literal
LINE 1: update customer set note = ('A' || '\r\n' || 'B') where 1=0;
   ^
HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
(Note is of datatype TEXT)

SQL 2: select ('A' || '\r\n' || 'B') as tt from customer limit 1
Functions without escaping literal

On postgres 8.3.7 both of the above SQLs function without requiring
any literals to be escaped.

This seems inconsistent to me in that I would have thought that in
Postgres 8.4 either both SQLs would have required escaping or neither
of them would have required escaping.
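For reference, the warning-free spelling of the first statement (a sketch; exact behavior depends on the standard_conforming_strings and escape_string_warning settings):

```sql
-- Explicit escape-string syntax, no warning on either version:
UPDATE customer SET note = ('A' || E'\r\n' || 'B') WHERE 1=0;

-- Note: with standard_conforming_strings = on, a plain '\r\n' is four
-- literal characters (backslash, r, backslash, n), not CR/LF.
```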


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Help me please...

2009-07-09 Thread Roseller A. Romanos
Please help me with this. I really need your advice as to how to retrieve the 
data in my postgresql database.

I installed postgresql on a Windows XP platform five months ago. Just 
yesterday my OS broke down, saying NTLDR is missing.  What I did was 
re-install my OS. When I finished the installation I found out that I can no 
longer access my postgresql using PGAdmin III.

What should I do to access my database and retrieve the important records in 
it?   

Unfortunately, I have not created a backup of it.  Please help me.

Thanks in advance and God bless...


 Roseller Romanos
STI-Pagadian
Gillamac Building, Pagadian City
Office Nos: (062) 2144-785
Home Nos: (062) 2144-695
Mobile Nos: 09203502636



  

[GENERAL] Overhead of union versus union all

2009-07-09 Thread Tim Keitt
I am combining query results that I know are disjoint. I'm wondering
how much overhead there is in calling union versus union all. (Just
curious really; I can't see a reason not to use union all.) (cc me
please; not subscribed...)

THK

-- 
Timothy H. Keitt
http://www.keittlab.org/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Overhead of union versus union all

2009-07-09 Thread Alvaro Herrera
Tim Keitt wrote:
 I am combining query results that I know are disjoint. I'm wondering
 how much overhead there is in calling union versus union all. (Just
 curious really; I can't see a reason not to use union all.)

UNION needs to uniquify the output, for which it plasters an additional
sort step, whereas UNION ALL does not need to uniquify its output and
thus it can avoid the sort step.  Using UNION ALL is recommended
wherever possible.
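The extra step is easy to see in EXPLAIN (a sketch with hypothetical tables a and b):

```sql
-- Append over the two scans, nothing more:
EXPLAIN SELECT x FROM a UNION ALL SELECT x FROM b;

-- Same Append, plus a HashAggregate (or Sort + Unique) on top
-- to remove duplicates:
EXPLAIN SELECT x FROM a UNION SELECT x FROM b;
```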

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Overhead of union versus union all

2009-07-09 Thread Adam Rich

Tim Keitt wrote:

I am combining query results that I know are disjoint. I'm wondering
how much overhead there is in calling union versus union all. (Just
curious really; I can't see a reason not to use union all.) (cc me
please; not subscribed...)

THK




I think you can test this one yourself pretty easily.  Just run the two 
queries with EXPLAIN ANALYZE.  UNION ALL should run in about the sum of 
the times of the separate queries.  Plain UNION will always be slower, because 
it takes the same results from UNION ALL and runs them through an extra 
sort/distinct or hash step.  In my tests, on a query with 600,000 rows, 
the plain UNION took about 3x as long to complete.




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Overhead of union versus union all

2009-07-09 Thread Bruce Momjian
Alvaro Herrera wrote:
 Tim Keitt wrote:
  I am combining query results that I know are disjoint. I'm wondering
  how much overhead there is in calling union versus union all. (Just
  curious really; I can't see a reason not to use union all.)
 
 UNION needs to uniquify the output, for which it plasters an additional
 sort step, whereas UNION ALL does not need to uniquify its output and
 thus it can avoid the sort step.  Using UNION ALL is recommended
 wherever possible.

Yep, ideally UNION ALL would be the default behavior, but the standard
requires otherwise.  Many people don't know that UNION has an extra
SORT/UNIQUE step.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help me please...

2009-07-09 Thread jacob
Do the DB folders still exist? If so, back them up, reinstall Postgres
(reinstalling XP probably wiped out either DLLs or registry entries)
and relaunch it. Don't have it initialize a DB on install.

 

From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Roseller A.
Romanos
Sent: Wednesday, July 08, 2009 9:37 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Help me please...

 

Please help me with this. I really need your advice as to how to
retrieve the data in my postgresql database.

I have postgresql installed in Windows XP platform five months ago. Just
yesterday my OS bugged down and saying NTDLR is missing.  What I did was
I re-installed my OS. When I finished my installation I found out that I
cannot access anymore my postgresql using PGAdmin III. 

What should I do to access my database and retrieve the important
records in it?   

Unfortunately, I have not created a backup of it.  Please help me.

Thanks in advance and God bless...

 

Roseller Romanos
STI-Pagadian
Gillamac Building, Pagadian City
Office Nos: (062) 2144-785
Home Nos: (062) 2144-695
Mobile Nos: 09203502636

 

 



Re: [GENERAL] Help me please...

2009-07-09 Thread jacob
if you used default locations I believe it should be (this is from
memory, mind) under c:\program files\postgres\version\data\

 

data is the folder you want.

 

From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of ja...@aers.ca
Sent: Thursday, July 09, 2009 1:18 PM
To: don2_...@yahoo.com; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Help me please...

 

do the DB folders still exist? if so back them up, reinstall Postgres
(reinstalling XP probably wiped out either DLL's or registry entries)
and relaunch it. don't have it initialize a DB on install

 

From: pgsql-general-ow...@postgresql.org
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Roseller A.
Romanos
Sent: Wednesday, July 08, 2009 9:37 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Help me please...

 

Please help me with this. I really need your advice as to how to
retrieve the data in my postgresql database.

I have postgresql installed in Windows XP platform five months ago. Just
yesterday my OS bugged down and saying NTDLR is missing.  What I did was
I re-installed my OS. When I finished my installation I found out that I
cannot access anymore my postgresql using PGAdmin III. 

What should I do to access my database and retrieve the important
records in it?   

Unfortunately, I have not created a backup of it.  Please help me.

Thanks in advance and God bless...

 

Roseller Romanos
STI-Pagadian
Gillamac Building, Pagadian City
Office Nos: (062) 2144-785
Home Nos: (062) 2144-695
Mobile Nos: 09203502636

 

 



Re: [GENERAL] Help me please...

2009-07-09 Thread John R Pierce

ja...@aers.ca wrote:


if you used default locations I believe it should be (this is from 
memory mind) under c:\program files\postgres\version\data\


 


data is the folder you want.

 



First, verify the location of pgdata...

   sc qc pgsql-8.3

(I'm assuming this is 8.3, modify for other versions)
note the value of the -D parameter on the BINARY_PATH_NAME, like...

   BINARY_PATH_NAME   : D:\postgres\8.3\bin\pg_ctl.exe runservice 
-w -N pgsql-8.3 -D D:\postgres\8.3\data\


hence, mine is D:\postgres\8.3\data\

If you've reinstalled postgres from scratch, you'll likely have to do a 
bit of dinking around.


First,

   NET STOP pgsql-8.3

Now, MOVE the current data dir somewhere safe, and XCOPY /S your backup of
the DATA directory to the active location.   Then, use the 'security'
dialog in file explorer, or the CACLS command line, to grant the
'postgres' user full control over the data directory and all files in it.


command line version:

   cacls \path\to\data /t /e /c /g postgres:f

if this postgres user already existed from before, but the reinstalled
postgres service is having problems starting, you may need to reset the
service account password.  Pick a random, forgettable password.   I'm
going to use shattUp373treHodhu (random generator output)...


   NET USER postgres shattUp373treHodhu
   SC CONFIG pgsql-8.3 password= shattUp373treHodhu

Upper case doesn't matter in the commands except for the password itself,
but the spacing around the password=  is critical (none before the =,
one space after)


then try

   net start pgsql-8.3

and with any luck, your data is all intact.

It's absolutely critical, if you've reinstalled Postgres, that you install
the same version as you used before.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres 8.4 literal escaping

2009-07-09 Thread Tom Lane
Niederland niederl...@gmail.com writes:
 SQL 2: select ('A' || '\r\n' || 'B') as tt from customer limit 1
 Functions without escaping literal

Really?  I get

regression=# select ('A' || '\r\n' || 'B') as tt from customer limit 1;
WARNING:  nonstandard use of escape in a string literal
LINE 1: select ('A' || '\r\n' || 'B') as tt from customer limit 1;
   ^
HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
 tt 

(0 rows)

regards, tom lane
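For anyone hitting the same warning, here is a short sketch of the two literal syntaxes. It needs no tables, so it runs anywhere; the settings mentioned are standard PostgreSQL GUCs:

```sql
-- In 8.4, a backslash in a plain literal is still treated as an escape,
-- but it draws the WARNING quoted in this thread. The explicit
-- escape-string syntax avoids the warning:
SELECT 'A' || E'\r\n' || 'B' AS tt;

-- With standard_conforming_strings = on (off by default in 8.4, on by
-- default from 9.1), a plain literal keeps backslashes as ordinary text:
SET standard_conforming_strings = on;
SELECT 'A' || '\r\n' || 'B' AS tt;  -- four literal characters: \ r \ n
```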



[GENERAL] Database storage

2009-07-09 Thread nabble . 30 . miller_2555
Hi -

I'm running Postgres v8.3 on Fedora 10 (home machine - x86_64
architecture). After realizing that the storage requirements of one of
my databases will exceed 4Tb, I wanted to see if anyone had any
suggestions as to hardware setup that works well with Postgres running
on Linux. I have partitioned most of the database, so older data can
go on slower devices with only 400-500Gb needed on faster devices.
Redundancy is a requirement. Database usage is generally low,
with burst input. For the curious, here's the general profile:
a) ~19,000,000 INSERT statements/ day (190 INSERTs/ commit blocks)
for a total of ~10 Gb INSERT/day
b) INSERTs occur sequentially (daily batch)
c) Handful of SELECT statements run per day to generate a few
dozen reports, but these are not resource-intensive.
d) Only 3-5 database users configured.

Initial thought: use an external multi-bay eSATA case and insert
some 7200 RPM+, 32 MB cache, 1 TB+ SATA drives running RAID 5.

A few specific questions:
1) Which components generally lead to bottlenecks (controller
card, individual HD cache, HD rotational speed, other???)
2) Are there better solutions (broad question in terms of price/
data availability)? Budget: $500 - $750 for the storage medium. Better
question: Which setup has worked well with a Postgres implementation
running on a similar budget?
3) If using a similar setup, are there any particular
implementation issues that are not relatively obvious?

Thanks for the input (even if not Postgres-specific, I thought this
might be of interest to hobby database admins like myself running
Postgres implementations).
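As a side note on the insert profile above, batching many rows per transaction (or using COPY) is what keeps a load like this cheap. A hedged sketch, with an illustrative table since the real schema is not shown in the thread:

```sql
-- Illustrative schema; the real tables are not shown in the thread.
CREATE TABLE daily_batch (id bigint, recorded_at timestamp, payload text);

-- Grouping many rows per commit, as the OP describes (190 INSERTs per
-- commit block), amortizes the per-transaction fsync cost:
BEGIN;
INSERT INTO daily_batch VALUES
    (1, '2009-07-09 00:00:00', 'a'),
    (2, '2009-07-09 00:00:00', 'b'),
    (3, '2009-07-09 00:00:00', 'c');
COMMIT;

-- For the full daily batch, COPY from a file is usually faster still:
-- COPY daily_batch FROM '/path/to/batch.csv' WITH CSV;
```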




Re: [GENERAL] Database storage

2009-07-09 Thread Scott Marlowe
On Thu, Jul 9, 2009 at 5:40 PM, nabble.30.miller_2...@spamgourmet.com wrote:
 Hi -

 I'm running Postgres v8.3 on Fedora 10 (home machine - x86_64
 architecture). After realizing that the storage requirements of one of
 my databases will exceed 4Tb, I wanted to see if anyone had any
 suggestions as to hardware setup that works well with Postgres running
 on Linux. I have partitioned most of the database, so older data can
 go on slower devices with only 400-500Gb needed on faster devices.
 Redundancy is a requirement. General database usage is generally low,
 with burst input. For the curious, here's the general profile:
    a) ~19,000,000 INSERT statements/ day (190 INSERTs/ commit blocks)
 for a total of ~10 Gb INSERT/day
    b) INSERTs occur sequentially (daily batch)
    c) Handful of SELECT statements run per day to generate a few
 dozen reports, but these are not resource-intensive.
    d) Only 3-5 database users configured.

 Initial thought: Use an external multi-bay eSATA case and inserting
 some 7200RPM+, 32Mb cache 1T+ SATA drives running RAID 5.

While RAID-5 is generally not the best database setup, considering
that your inputs will be bulk you're probably ok.  Also, given that
2TB server class drives (5400rpm) are now out, and that they have
about the same throughput as 1TB 7200 RPM server class drives, you
might want to look into them as you can get by on fewer drives for the
same storage.

 A few specific questions:
    1) Which components generally lead to bottlenecks (controller
 card, individual HD cache, HD rotational speed, other???)

IO is almost always your bottleneck on queries over large data sets.

    2) Are there better solutions (broad question in terms of price/
 data availability)? Budget: $500 - $750 for the storage medium. Better
 question: Which setup was worked well with a Postgres implementation
 running on a similar budget?

$750 is about what a decent RAID controller would cost you, but again,
given your bulk import scenario, you're probably ok without one.  In
this instance, you're probably better off with software RAID than with
a cheap RAID card, which will cost extra and probably be slower than
Linux software RAID.



Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Greg Stark
On Thu, Jul 9, 2009 at 4:47 PM, Andres Freund and...@anarazel.de wrote:
 AFAIK the primary cause is that indexes in pg do not store visibility
 information.

Not really. The OP doesn't say how wide the record rows are but unless
they're very wide it wouldn't pay to use an index for this even if you
didn't have to access the heap also. It's going to be faster to scan
the whole heap and either sort or use a hash. Currently there aren't
many cases where a btree with 6,000 copies of 111 distinct keys is
going to be useful.

Arguably the missing feature here is skip-scans where we scan the
index but only pull out one record for each distinct value. I'm not
sure there's anything particularly stopping Postgres from being able
to do them, but it might be a lot of code for a narrow use case.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 Not really. The OP doesn't say how wide the record rows are but unless
 they're very wide it wouldn't pay to use an index for this even if you
 didn't have to access the heap also. It's going to be faster to scan
 the whole heap and either sort or use a hash. Currently there aren't
 many cases where a btree with 6,000 copies of 111 distinct keys is
 going to be useful.

It was 600,000 not 6,000 ... so a skip-scan might be worth the trouble,
but as you say we haven't done it.

In any case I think the real issue is that the OP is probably using a
pre-8.4 release which will always do SELECT DISTINCT via sort-and-unique.
Hash aggregation would be a whole lot faster for these numbers, even
if not exactly instantaneous.  He could update to 8.4, or go over to
using GROUP BY as was recommended upthread.

regards, tom lane
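The GROUP BY rewrite recommended upthread looks like this; table and column names are illustrative, since the OP's schema isn't quoted here:

```sql
-- Pre-8.4, SELECT DISTINCT always sorts and then de-duplicates:
SELECT DISTINCT key FROM big_table;

-- The same result written with GROUP BY is eligible for hash
-- aggregation, which is much faster for ~600,000 rows spread over
-- 111 distinct keys:
SELECT key FROM big_table GROUP BY key;
```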



Re: [GENERAL] pg_dump PostgreSQL 8.4

2009-07-09 Thread serafin segador
Thank you all for your assistance.  I try not to be adventurous tweaking
the configuration of a production system, except for system upgrades, which
had not failed on me before.  Burned my fingers once.  I need to visit the
documentation and study the admin side of the system more.


thanks all.

raffy segador

On Thu, 09 Jul 2009 18:27:35 +0800, Dave Page dp...@pgadmin.org wrote:

On Thu, Jul 9, 2009 at 11:06 AM, serafin segador rsega...@mydestiny.net
wrote:

I found where the error is.  Thanks for the tip.

I have versions 8.3.7 and 8.4 of PostgreSQL running on the same server,
as well as versions 1.8 and 1.10 of pgAdmin.  Although I run the backup
routine for pg 8.4 from pgAdmin 1.10, pgAdmin uses
PostgreSQL\8.3\bin\pg_dump as the default.  I now run the backup routine
from the command line utility pointing to the correct version.


Check the PG bin path option in File -> Options.

It also sounds like you disabled the version match check on that
dialogue; otherwise pg_dump 8.3 should have complained about being used
with 8.4.





--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/



[GENERAL] ubuntu packages for 8.4

2009-07-09 Thread Tim Uckun
I don't see any ubuntu packages for 8.4 in the default repositories.

Does anybody know if they will be upgrading the postgresql package to
8.4 or creating a new package for it.

I'd rather use the packages than to compile it myself. If anybody has
an argument as to why I should compile it I am all ears.

TIA.



Re: [GENERAL] Database storage

2009-07-09 Thread Greg Stark
On Fri, Jul 10, 2009 at 1:28 AM, Scott Marlowe scott.marl...@gmail.com wrote:

 $750 is about what a decent RAID controller would cost you, but again
 it's likely that given your bulk import scenario,  you're probably ok
 without one.  In this instance, you're probably best off with software
 RAID than a cheap RAID card which will cost extra and probably be
 slower than linux software RAID.


Fwiw the main disadvantage of software raid is NOT speed -- Linux
software RAID is very fast. Aside from raid-5 where it lets you
offload the parity calculation there really isn't much speed benefit
to hardware raid.

The main advantage of hardware RAID is the error handling. When you
get low-level errors or pull a drive, a lot of consumer-level
controllers and their drivers don't respond very well, and have long
timeouts or keep retrying, tragically unaware that the software RAID
would be able to handle recovering. A good server-class RAID
controller should handle those situations without breaking a sweat.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [GENERAL] Database storage

2009-07-09 Thread Scott Marlowe
On Thu, Jul 9, 2009 at 7:29 PM, Greg Stark gsst...@mit.edu wrote:
 On Fri, Jul 10, 2009 at 1:28 AM, Scott Marlowe scott.marl...@gmail.com wrote:

 $750 is about what a decent RAID controller would cost you, but again
 it's likely that given your bulk import scenario,  you're probably ok
 without one.  In this instance, you're probably best off with software
 RAID than a cheap RAID card which will cost extra and probably be
 slower than linux software RAID.


 Fwiw the main disadvantage of software raid is NOT speed -- Linux
 software RAID is very fast. Aside from raid-5 where it lets you
 offload the parity calculation there really isn't much speed benefit
 to hardware raid.

 The main advantage of hardware RAID is the error handling. When you
 get low-level errors or pull a drive, a lot of consumer-level
 controllers and their drivers don't respond very well, and have long
 timeouts or keep retrying, tragically unaware that the software RAID
 would be able to handle recovering. A good server-class RAID
 controller should handle those situations without breaking a sweat.

Definitely a big plus of a quality HW controller, and one of the
reasons I don't scrimp on the HW controllers I put in our 24/7
servers.  OTOH, if you can afford a bit of downtime to handle
failures, linux software RAID works pretty well, and since quad core
CPUs are now pretty much the standard, it's ok if parity calculation
uses up a bit of one core for lower performing servers like the
reporting server the OP was talking about.



Re: [GENERAL] ubuntu packages for 8.4

2009-07-09 Thread Bret Fledderjohn
2009/7/9 Tim Uckun timuc...@gmail.com

 I don't see any ubuntu packages for 8.4 in the default repositories.

 Does anybody know if they will be upgrading the postgresql package to
 8.4 or creating a new package for it.

 I'd rather use the packages than to compile it myself. If anybody has
 an argument as to why I should compile it I am all ears.


I just forwarded your message to Martin Pitt; he's the package maintainer for
Postgres for Ubuntu (and Debian, I believe).  I don't know if this is
planned for Karmic Koala (to be released in October).

-- 
- Bret


Re: [GENERAL] Overhead of union versus union all

2009-07-09 Thread Scott Bailey

Alvaro Herrera wrote:

Tim Keitt wrote:

I am combining query results that I know are disjoint. I'm wondering
how much overhead there is in calling union versus union all. (Just
curious really; I can't see a reason not to use union all.)


UNION needs to uniquify the output, for which it plasters an additional
sort step, whereas UNION ALL does not need to uniquify its output and
thus it can avoid the sort step.  Using UNION ALL is recommended
wherever possible.



I think I read somewhere that as of 8.4 it no longer requires the sort
step, due to the improvements in hashing. Here it is:


http://wiki.postgresql.org/wiki/WhatsNew84#Performance
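A minimal illustration of the difference (table names invented for the example); EXPLAIN makes the extra de-duplication step visible:

```sql
-- UNION must de-duplicate across both inputs: the plan grows a Unique
-- node (sort-based) or, in 8.4, a HashAggregate node.
EXPLAIN SELECT id FROM orders_2008
UNION   SELECT id FROM orders_2009;

-- UNION ALL simply concatenates the inputs: the plan is a plain Append,
-- so it is the cheaper choice when the inputs are known to be disjoint.
EXPLAIN SELECT id FROM orders_2008
UNION ALL SELECT id FROM orders_2009;
```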



Re: [GENERAL] Overhead of union versus union all

2009-07-09 Thread Bruce Momjian
Scott Bailey wrote:
 Alvaro Herrera wrote:
  Tim Keitt wrote:
  I am combining query results that I know are disjoint. I'm wondering
  how much overhead there is in calling union versus union all. (Just
  curious really; I can't see a reason not to use union all.)
  
  UNION needs to uniquify the output, for which it plasters an additional
  sort step, whereas UNION ALL does not need to uniquify its output and
  thus it can avoid the sort step.  Using UNION ALL is recommended
  wherever possible.
  
 
 I think I read somewhere that as of 8.4 it no longer required the sort 
 step, due to the improvements in hashing. Here it is
 
 http://wiki.postgresql.org/wiki/WhatsNew84#Performance

Oh, yeah, hashing is used in some cases rather than a sort.  I assume a
sort is still used if the hash exceeds work_mem size.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



[GENERAL] psql language

2009-07-09 Thread Stuart McGraw

Hello,

I just installed pg-8.4 on Windows XP but ran into
some unexpected problems.

I am working on some tools to aid English-speaking
learners of Japanese.  This of course requires me
to regularly display and enter Japanese text on my
machine, so I have the Regional setting Language
for non-Unicode programs set to Japanese, although
the locale language is English.  This allows me
to work with both English and Japanese text in the
Windows console (cmd.exe) just fine.  Psql also
worked fine until 8.4.


With 8.4 though, psql presents messages in Japanese.
Since I am still a beginner at Japanese myself, this
is a problem.

FWIW, I tried doing SET LANGUAGE=en_US, and the same
with every other locale-related variable (LC_ALL,
LANG, LC_MESSAGES, etc.) I could think of,
before running psql, but with no effect.

How can I tell psql (and any other command-line
tools) to use English messages?





Re: [GENERAL] ubuntu packages for 8.4

2009-07-09 Thread Scott Bailey

Tim Uckun wrote:

I don't see any ubuntu packages for 8.4 in the default repositories.

Does anybody know if they will be upgrading the postgresql package to
8.4 or creating a new package for it.

I'd rather use the packages than to compile it myself. If anybody has
an argument as to why I should compile it I am all ears.

TIA.



I was looking for this the other day. I found it in lenny backports, but 
I haven't tried it yet.


http://packages.debian.org/lenny-backports/postgresql-8.4




Re: [GENERAL] Overhead of union versus union all

2009-07-09 Thread Scott Marlowe
On Thu, Jul 9, 2009 at 7:58 PM, Bruce Momjian br...@momjian.us wrote:
 Scott Bailey wrote:
 Alvaro Herrera wrote:
  Tim Keitt wrote:
  I am combining query results that I know are disjoint. I'm wondering
  how much overhead there is in calling union versus union all. (Just
  curious really; I can't see a reason not to use union all.)
 
  UNION needs to uniquify the output, for which it plasters an additional
  sort step, whereas UNION ALL does not need to uniquify its output and
  thus it can avoid the sort step.  Using UNION ALL is recommended
  wherever possible.
 

 I think I read somewhere that as of 8.4 it no longer required the sort
 step, due to the improvements in hashing. Here it is

 http://wiki.postgresql.org/wiki/WhatsNew84#Performance

 Oh, yea, hashing is used in some cases rather than sort.  I assume sort
 is still used if the hash exceeds workmem size.

The important point being that it's still more expensive than a plain
UNION ALL though, right?



Re: [GENERAL] Overhead of union versus union all

2009-07-09 Thread Bruce Momjian
Scott Marlowe wrote:
 On Thu, Jul 9, 2009 at 7:58 PM, Bruce Momjian br...@momjian.us wrote:
  Scott Bailey wrote:
  Alvaro Herrera wrote:
   Tim Keitt wrote:
   I am combining query results that I know are disjoint. I'm wondering
   how much overhead there is in calling union versus union all. (Just
   curious really; I can't see a reason not to use union all.)
  
   UNION needs to uniquify the output, for which it plasters an additional
   sort step, whereas UNION ALL does not need to uniquify its output and
   thus it can avoid the sort step.  Using UNION ALL is recommended
   wherever possible.
  
 
  I think I read somewhere that as of 8.4 it no longer required the sort
  step, due to the improvements in hashing. Here it is
 
  http://wiki.postgresql.org/wiki/WhatsNew84#Performance
 
  Oh, yeah, hashing is used in some cases rather than a sort.  I assume a
  sort is still used if the hash exceeds work_mem size.
 
 The important point being that it's still more expensive than a plain
 UNION ALL though, right?

Yep.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [GENERAL] Multi - table statistics

2009-07-09 Thread decibel

On Jul 1, 2009, at 4:15 AM, Scara Maccai wrote:
My query is using an Index Scan instead of a Bitmap Index Scan because
the planner thinks that the number of rows returned will be low (12).
In fact, the number of rows returned is 120.

This happens because there are some WHERE conditions that check
cross-table values (something like taba.col1  tabb.col2). Since
PostgreSQL doesn't know (I guess) anything about cross-table value
statistics, how are the row estimates calculated?


Are you doing this as a JOIN or a subquery? That could make a  
difference. Also, non-equality predicates (ie: ) are hard to  
estimate for.



Is there any way I can force the planner to use bitmap index scan?



I would try pulling the explicit values you need into a temp table,  
analyze that, and then join. The planner will then know exactly how  
many rows it's dealing with. But keep in mind that it doesn't  
actually look at the values it will be getting, so if you have a  
skewed distribution of values in the join column in the larger table  
you might be stuck...
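The temp-table approach described above might be sketched like this. All names are illustrative, and the comparison operator is assumed, since the original predicate was lost in the archive:

```sql
-- Materialize just the rows of interest, so ANALYZE can give the planner
-- an exact row count and value distribution for them:
CREATE TEMP TABLE taba_subset AS
    SELECT col1, col2
    FROM taba
    WHERE some_filter = true;

ANALYZE taba_subset;

-- Join against the subset; estimates now come from real statistics:
SELECT a.col1, b.col2
FROM taba_subset a
JOIN tabb b ON a.col1 > b.col2;   -- assumed cross-table comparison
```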

--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828





Re: [GENERAL] cache reads vs. disk reads

2009-07-09 Thread decibel

On Jul 1, 2009, at 4:20 AM, Gerd König wrote:

b) pg_statio_user_tables=

heap_blks_read is the number of disk blocks read for that table
(excluding index access). Does heap_blks_hit mean the number of
accesses to the cache for that data? And is the number of
heap_blks_read included in heap_blks_hit, or is it the additional
accesses after reading the data from disk into the buffer?



Take a look in the manual; there's a pretty clear explanation in there.
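For reference, the manual's definitions: heap_blks_read counts blocks read from disk and heap_blks_hit counts blocks found already in the buffer cache, so the hit count does not include the reads. A common ratio query built on that:

```sql
-- Buffer-cache hit ratio per table; hits and reads are disjoint
-- counters, so the denominator is their sum.
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS hit_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC;
```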
--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828





Re: [GENERAL] REINDEX is not a btree

2009-07-09 Thread decibel

On Jul 4, 2009, at 8:06 AM, Craig Ringer wrote:

On Fri, 2009-07-03 at 15:00 +0100, Vanessa Lopez wrote:

I don't know much about Postgres, and I have no clue what else I can do.
Please, please, any help is very much appreciated. I have lots of
databases and months of work in Postgres (also lots of backups for the
data in /data).


When you say in /data, do you mean the directory that contains the
directories pg_xlog, base, global, pg_clog, etc ?

Did you back up and restore the WHOLE data directory at once? Or did
you restore only parts of it?



And how exactly did you make the backups? You can't simply take a  
filesystem copy of a running database; that won't work.
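For context: a filesystem copy is only consistent if the server is stopped first, or if the copy is bracketed by the online-backup functions. The 8.x syntax is sketched below; this assumes WAL archiving is already configured:

```sql
-- Tell the server an online base backup is starting; it forces a
-- checkpoint and marks the starting WAL location:
SELECT pg_start_backup('nightly');

-- ...now copy the whole data directory with an external tool...

-- Finish the backup; the WAL generated in between is what makes the
-- copied directory consistent on restore:
SELECT pg_stop_backup();
```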

--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828


