Re: [GENERAL] Sum of multiplied deltas

2009-09-27 Thread Gerhard Wiesinger

Hello,

Finally I used the function below, which works well. Only one problem is 
left: it pollutes the buffer cache because of the cursor. Any idea how to 
get rid of this behavior?


BTW: the WINDOWING FUNCTION support in 8.4 should help, but no one could 
provide an example of how this could work. Any further comments on how to implement it?


Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/

CREATE TYPE PS AS
(
  sum_m1 double precision,
  sum_m2 double precision
);

DROP FUNCTION getSum(timestamp with time zone, timestamp with time zone);
CREATE OR REPLACE FUNCTION getSum(IN start_ts timestamp with time 
zone, IN stop_ts timestamp with time zone) RETURNS PS AS $$

DECLARE
  curs CURSOR FOR
  SELECT
*
  FROM
log_entries
  WHERE
    datetime >= start_ts
    AND datetime <= stop_ts
  ORDER BY
datetime
  ;
  row log_entries%ROWTYPE;
  i bigint = 0;
  datetime_old timestamp with time zone;
  old double precision;
  sum_m1 double precision = 0;
  sum_m2 double precision = 0;
  psum PS;
BEGIN
  OPEN curs;
  LOOP
FETCH curs INTO row;
EXIT WHEN NOT FOUND;
IF row.col IS NOT NULL THEN
    IF i > 0 THEN
sum_m1 = sum_m1 + (row.col - old) * 0.01 * row.col2;
sum_m2 = sum_m2 + EXTRACT('epoch' FROM row.datetime - datetime_old) * 
row.col3;
  END IF;
  i = i + 1;
    old = row.col;  -- remember current value of col for the next delta
  datetime_old = row.datetime;
END IF;
  END LOOP;
  CLOSE curs;
  psum.sum_m1 = sum_m1;
  psum.sum_m2 = sum_m2;
  RETURN psum;
END;
$$ LANGUAGE plpgsql;
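For the windowing-function question above, here is a sketch (not from the original thread) of how 8.4's lag() could compute the same two sums without a cursor; it assumes the same log_entries columns the function uses, with start_ts and stop_ts supplied as parameters or literals:

```sql
-- Sketch only: window-function equivalent of the cursor loop above.
SELECT sum((col - prev_col) * 0.01 * col2)                      AS sum_m1,
       sum(extract(epoch FROM datetime - prev_datetime) * col3) AS sum_m2
FROM (
    SELECT datetime, col, col2, col3,
           lag(col)      OVER (ORDER BY datetime) AS prev_col,
           lag(datetime) OVER (ORDER BY datetime) AS prev_datetime
    FROM log_entries
    WHERE datetime >= start_ts
      AND datetime <= stop_ts
      AND col IS NOT NULL
) AS deltas;
```

The first row's lag() values are NULL, so it drops out of the sums automatically, matching the first-iteration guard in the loop.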


On Mon, 8 Jun 2009, Gerhard Wiesinger wrote:


Hello!

I've the following data:
datetime | val1 | val2
time1    |    4 | 40%
time2    |    7 | 30%
time3    |   12 | 20%
...

I'd like to sum up the following:

(7-4)*30% + (12-7)*20% + ...

datetime is ordered (and unique and has also an id).

Rows are in the area of millions.

How is it done best?
1.) Self join with one row shift?
2.) function?
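Option 1 (the self join with a one-row shift) might be sketched like this, assuming the id mentioned below is consecutive, val2 is stored as a fraction, and a hypothetical table name t; with gaps in id, a window function or correlated subquery would be needed instead:

```sql
-- Pair each row with its successor via the ordered, unique id, then
-- sum the delta of val1 weighted by the successor's val2.
SELECT sum((b.val1 - a.val1) * b.val2) AS total
FROM t AS a
JOIN t AS b ON b.id = a.id + 1;
```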

Any hint?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





Re: [GENERAL] Problem with data corruption and psql memory usage

2009-09-27 Thread Gerhard Wiesinger

Hello Tom,

Late answer, but an answer :-) :
Finally, it was a very strange hardware problem, where a very small part 
of RAM was defective but the kernel never crashed.


I also saw very strange behavior when verifying rpm packages with rpm 
-V. First I suspected the hard disk. But then I flushed the OS caches:

echo 3 > /proc/sys/vm/drop_caches
and rpm -V was correct. => RAM issue.

A memtest86+ run very quickly showed a RAM defect.

So PostgreSQL didn't have any issue :-)

Ciao,
Gerhard

--
http://www.wiesinger.com/


On Wed, 9 May 2007, Gerhard Wiesinger wrote:


Hello Tom!

I don't think this is a hardware problem. Machine runs 24/7 for around 4 
years without any problems, daily backup with GBs of data to it, uptimes to 
the next kernel security patch, etc.


The only problem I could believe is:
I'm running the FC7 test packages of postgresql on FC6, and maybe there is a 
slight glibc library conflict or some other incompatibility.


Ciao,
Gerhard

--
http://www.wiesinger.com/


On Wed, 9 May 2007, Tom Lane wrote:


Gerhard Wiesinger li...@wiesinger.com writes:

LOG:  could not fsync segment 0 of relation 1663/16386/42726: Input/output
error


[ raised eyebrow... ]  I think your machine is flakier than you believe.
This error is particularly damning, but the general pattern of weird
failures all over the place seems to me to fit the idea of hardware
problems much better than any other explanation.  FC6 and PG 8.2.3 are
both pretty darn stable for most people, so there's *something* wrong
with your installation, and unstable hardware is the first thing that
comes to mind.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq








[GENERAL] Understanding sort's memory/disk usage

2009-09-27 Thread Adam Rich

Hello,
Please reference these explain plans.  This is Pg 8.4.1

http://explain-analyze.info/query_plans/4032-query-plan-2745
http://explain-analyze.info/query_plans/4033-query-plan-2746

First, could somebody explain what is leading the first query to choose 
a different plan that's much slower?  In the first plan only, this 
expression appears in the SELECT and GROUP BY:


s.store_num || ' - ' || s.title

These are both non-null varchar fields.  Both have a unique index.

Second, why would it choose to sort on disk for what appears to be ~32MB 
of data, when my work_mem and temp_buffers are each 64 MB?


If I increase work_mem and temp_buffers to 128 MB, I get a faster plan:

http://explain-analyze.info/query_plans/4034-query-plan-2747

But it's only reporting 92kb of memory used? Why don't I see numbers 
between 64 MB and 128 MB for both the on-disk and in-memory plans?
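One way to watch the executor's choice directly is the Sort Method line in EXPLAIN ANALYZE output (a sketch with hypothetical table and column names):

```sql
-- Observe in-memory vs. on-disk sorts under different work_mem settings.
SET work_mem = '64MB';
EXPLAIN ANALYZE SELECT * FROM big_table ORDER BY some_col;
-- look for: "Sort Method:  external merge  Disk: ...kB"

SET work_mem = '128MB';
EXPLAIN ANALYZE SELECT * FROM big_table ORDER BY some_col;
-- look for: "Sort Method:  quicksort  Memory: ...kB"
```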


Thanks,
Adam











Re: [GENERAL] How should the first step of PostgreSQL implementation should be?

2009-09-27 Thread Ricky Tompu Breaky
On Sat, 26 Sep 2009 09:35:27 -0400
Bill Moran wmo...@potentialtech.com wrote:

 Ricky Tompu Breaky ricky.bre...@uni.de wrote:
 
  Dear my friends...
  
  I've installed postgresql-server on OpenSuSE11.1 successfully but I
  can't connect to it from psql.
  
  I did these steps:
  
  1. I created a new opensuse11.1-linux login account + its password
  (username: ivia) with YaST2;
  2. i...@sussy:~ su postgres -c psql postgres
  Passwort: 
  Dies ist psql 8.3.7, das interaktive PostgreSQL-Terminal.
  
  Geben Sie ein:  \copyright für Urheberrechtsinformationen
  \h für Hilfe über SQL-Anweisungen
  \? für Hilfe über interne Anweisungen
  \g oder Semikolon, um eine Anfrage auszuführen
  \q um zu beenden
  3. postgres'# ALTER USER postgres WITH PASSWORD 'mypassword';
  ALTER ROLE
  postgres=# create user ivia with password 'mypassword';
  CREATE ROLE
  postgres'# 
  4. sussy:/etc # cat /var/lib/pgsql/data/pg_hba.conf
  # TYPE  DATABASEUSERCIDR-ADDRESS  METHOD
  
  # local is for Unix domain socket connections only
  local   all all   ident sameuser
  # IPv4 local connections:
  hostall all 127.0.0.1/32  ident sameuser
  # IPv6 local connections:
  hostall all ::1/128   ident sameuser
  #local all all md5
  #host all all 127.0.0.1/32 md5
  #host all all ::1/128 md5
  #host all all 0.0.0.0/0 md5
  5. sussy:/etc # rcpostgresql restart
  Shutting down PostgreSQLServer angehalten
  done
  Starting PostgreSQL done 
  sussy:/etc # 
 
 Step 5 was unnecessary.  There's no need to restart the server after
 altering/adding/removing user accounts.
 
  6. sussy:/etc # cat /etc/sysconfig/postgresql
  POSTGRES_DATADIR=~postgres/data
  POSTGRES_OPTIONS=
  POSTGRES_LANG=
  sussy:/etc # 
  7. sussy:/etc # psql -h 127.0.0.1 -U ivia -W
  Password for user ivia: 
  psql: FATAL:  Passwort-Authentifizierung für Benutzer »ivia«
  fehlgeschlagen (my translation: Password-Authentication for user
  »ivia« failed)
  sussy:/etc # 
 
 You did not create the role with the LOGIN priv.  Do:
 ALTER ROLE ivia WITH LOGIN;
 
  Look, the change to the table of user previously just dissapear and
  I even don't need to supply the password of 'postgres' user
  although I've created its password as I mention above:
 
 Your pg_hba.conf is configured for ident authentication, so PG isn't
 even looking at the password.  Based on what you're doing in these
 steps, I would guess that you want to use password authentication
 in pg_hba.
 
 Note that you _do_ need to reload the PG server after changing the
 pg_hba.conf
 
  
  sussy:/var/lib/pgsql/data # su postgres -c psql postgres
  Welcome to psql 8.3.7, the PostgreSQL interactive terminal.
  
  Type:  \copyright for distribution terms
 \h for help with SQL commands
 \? for help with psql commands
 \g or terminate with semicolon to execute query
 \q to quit
  
  postgres=# select * from user;
   current_user 
  --
   postgres
  (1 row)
 
 I don't think that query does what you think it does.  Try issuing
 \du
 at the postgresql prompt to get a list of configured roles.
 
  Please help me with my first steps using PostgreSQL. This RDBMS
  server is far more complicated than MySQL.
 
 I assure you it's not.  Once you've got a grasp of the role system in
 PostgreSQL, I'm willing to bet that you'll understand that it's far
 simpler and more elegant than MySQL's insane grant tables.  Of course,
 being new to something always introduces a learning curve, and
 learning curves are frustrating.
 


RB> Dear Bill Moran...

You're absolutely correct, and thank you for your advice in the last
lines of your previous email. A precious piece of advice that encourages
me to keep learning PostgreSQL.

RB> After editing some configuration files, my PostgreSQL now always
RB> requires a password to let me log in as 'postgres', and I cannot
RB> log in with a wrong password. But I still cannot log in as the 'ivia'
RB> user account. The error message says: Database ivia does not exist.
RB> What kind of database does it actually mean? I believe it's not a
RB> normal RDBMS database (a collection of tables), but something different.
sussy:~ # psql -h 127.0.0.1 -U ivia -W
Password for user ivia: 
psql: FATAL:  Datenbank »ivia« existiert nicht (my translation:
Database does not exist)
sussy:~ # su postgres -c psql postgres
could not change directory to /root (Why does PostgreSQL look
for '/root'?)
Password: 
Welcome to psql 8.3.7, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

postgres=# createdb ivia
postgres-# \du
   List of roles
Role name | Superuser | Create role | Create DB 

Re: [GENERAL] How should the first step of PostgreSQL implementation should be? (revised)

2009-09-27 Thread Ricky Tompu Breaky
On Sun, 27 Sep 2009 15:36:33 +0700
Ricky Tompu Breaky ricky.bre...@uni.de wrote:

 On Sat, 26 Sep 2009 09:35:27 -0400
 Bill Moran wmo...@potentialtech.com wrote:
 
  Ricky Tompu Breaky ricky.bre...@uni.de wrote:
  
   Dear my friends...
   
   I've installed postgresql-server on OpenSuSE11.1 successfully but
   I can't connect to it from psql.
   
   I did these steps:
   
   1. I created a new opensuse11.1-linux login account + its password
   (username: ivia) with YaST2;
   2. i...@sussy:~ su postgres -c psql postgres
   Passwort: 
   Dies ist psql 8.3.7, das interaktive PostgreSQL-Terminal.
   
   Geben Sie ein:  \copyright für Urheberrechtsinformationen
   \h für Hilfe über SQL-Anweisungen
   \? für Hilfe über interne Anweisungen
   \g oder Semikolon, um eine Anfrage auszuführen
   \q um zu beenden
   3. postgres'# ALTER USER postgres WITH PASSWORD 'mypassword';
   ALTER ROLE
   postgres=# create user ivia with password 'mypassword';
   CREATE ROLE
   postgres'# 
   4. sussy:/etc # cat /var/lib/pgsql/data/pg_hba.conf
   # TYPE  DATABASEUSERCIDR-ADDRESS  METHOD
   
   # local is for Unix domain socket connections only
   local   all all   ident
   sameuser # IPv4 local connections:
   hostall all 127.0.0.1/32  ident
   sameuser # IPv6 local connections:
   hostall all ::1/128   ident
   sameuser #local all all md5
   #host all all 127.0.0.1/32 md5
   #host all all ::1/128 md5
   #host all all 0.0.0.0/0 md5
   5. sussy:/etc # rcpostgresql restart
   Shutting down PostgreSQLServer angehalten
   done
   Starting PostgreSQL   done 
   sussy:/etc # 
  
  Step 5 was unnecessary.  There's no need to restart the server after
  altering/adding/removing user accounts.
  
   6. sussy:/etc # cat /etc/sysconfig/postgresql
   POSTGRES_DATADIR=~postgres/data
   POSTGRES_OPTIONS=
   POSTGRES_LANG=
   sussy:/etc # 
   7. sussy:/etc # psql -h 127.0.0.1 -U ivia -W
   Password for user ivia: 
   psql: FATAL:  Passwort-Authentifizierung für Benutzer »ivia«
   fehlgeschlagen (my translation: Password-Authentication for user
   »ivia« failed)
   sussy:/etc # 
  
  You did not create the role with the LOGIN priv.  Do:
  ALTER ROLE ivia WITH LOGIN;
  
   Look, the change to the table of user previously just dissapear
   and I even don't need to supply the password of 'postgres' user
   although I've created its password as I mention above:
  
  Your pg_hba.conf is configured for ident authentication, so PG isn't
  even looking at the password.  Based on what you're doing in these
  steps, I would guess that you want to use password authentication
  in pg_hba.
  
  Note that you _do_ need to reload the PG server after changing the
  pg_hba.conf
  
   
   sussy:/var/lib/pgsql/data # su postgres -c psql postgres
   Welcome to psql 8.3.7, the PostgreSQL interactive terminal.
   
   Type:  \copyright for distribution terms
  \h for help with SQL commands
  \? for help with psql commands
  \g or terminate with semicolon to execute query
  \q to quit
   
   postgres=# select * from user;
current_user 
   --
postgres
   (1 row)
  
  I don't think that query does what you think it does.  Try issuing
  \du
  at the postgresql prompt to get a list of configured roles.
  
   Please help me with my first steps using PostgreSQL. This
   RDBMS server is far more complicated than MySQL.
  
  I assure you it's not.  Once you've got a grasp of the role system
  in PostgreSQL, I'm willing to bet that you'll understand that it's
  far simpler and more elegant than MySQL's insane grant tables.  Of
  course, being new to something always introduces a learning curve,
  and learning curves are frustrating.
  
 
 
 RB> Dear Bill Moran...
 
 You're absolutely correct, and thank you for your advice in the last
 lines of your previous email. A precious piece of advice that encourages
 me to keep learning PostgreSQL.
 
 RB> After editing some configuration files, my PostgreSQL now always
 RB> requires a password to let me log in as 'postgres', and I cannot
 RB> log in with a wrong password. But I still cannot log in as the 'ivia'
 RB> user account. The error message says: Database ivia does not
 RB> exist. What kind of database does it actually mean? I believe
 RB> it's not a normal RDBMS database (a collection of tables), but
 RB> something different.
 sussy:~ # psql -h 127.0.0.1 -U ivia -W
 Password for user ivia: 
 psql: FATAL:  Datenbank »ivia« existiert nicht (my translation:
 Database does not exist)
 sussy:~ # su postgres -c psql postgres
 could not change directory to /root (Why does the postgresql look
 for '/root'?)
 Password: 
 Welcome to psql 8.3.7, the PostgreSQL interactive terminal.
 
 Type:  \copyright for distribution terms
\h for help with SQL commands
\? for help 

[GENERAL] Limit of bgwriter_lru_maxpages of max. 1000?

2009-09-27 Thread Gerhard Wiesinger

Hello,

I think the limit of max. 1000 on bgwriter_lru_maxpages doesn't make any 
sense, because in practice it limits the performance of the database (version 8.3).


Under heavy write load, buffer-cache pages must be freed. With the default 
config this is practically limited to:

bgwriter_delay=200ms
bgwriter_lru_maxpages=100
8k*bgwriter_lru_maxpages*1000/bgwriter_delay=
=8k*100*1000/200=4000k=4MB/s
Isn't that a major performance bottleneck in the default config?

bgwriter_delay=200ms
bgwriter_lru_maxpages=1000
8k*bgwriter_lru_maxpages*1000/bgwriter_delay=
=8k*1000*1000/200=40000k=40MB/s
Still not a very high number for current I/O loads.

Lowering bgwriter_delay is possible, but I think overhead is too much and 
still there is a limit of 800MB/s involved:

bgwriter_delay=10ms
bgwriter_lru_maxpages=1000
8k*bgwriter_lru_maxpages*1000/bgwriter_delay=
=8k*1000*1000/10=800000k=800MB/s

So I think it would be better to allow a configuration such as:
bgwriter_delay=50ms
bgwriter_lru_maxpages=100000
8k*bgwriter_lru_maxpages*1000/bgwriter_delay=
=8k*100000*1000/50=16000000k=16000MB/s

So in fact I think bgwriter_lru_maxpages should be limited to 100000, if 
limited at all.
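The bound used in these calculations is just block size × pages per round × rounds per second; expressed as a query, it can be checked directly (a sketch; values in kB/s):

```sql
-- Upper bound on bgwriter writes: 8 kB * lru_maxpages * (1000 / delay_ms)
SELECT 8 * 100  * (1000 / 200) AS default_kb_per_s,   -- 4000 kB/s   ~ 4 MB/s
       8 * 1000 * (1000 / 200) AS capped_kb_per_s,    -- 40000 kB/s  ~ 40 MB/s
       8 * 1000 * (1000 / 10)  AS min_delay_kb_per_s; -- 800000 kB/s ~ 800 MB/s
```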


Is my reasoning correct?
Any comments?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/



Re: [GENERAL] How should the first step of PostgreSQL implementation should be? (revised)

2009-09-27 Thread Alban Hertroys

On 27 Sep 2009, at 10:44, Ricky Tompu Breaky wrote:


RB> I forgot to show you that I've done these steps too:
postgres=# ALTER ROLE ivia WITH LOGIN;
ALTER ROLE
postgres=# alter user ivia with password 'my password';
ALTER ROLE
postgres=# alter user ivia with login;
ALTER ROLE
postgres=# commit;
WARNUNG:  keine Transaktion offen (translation: there is no transaction in progress)
COMMIT
postgres=# \q
sussy:~ # psql -h 127.0.0.1 -U ivia -W
Password for user ivia:
psql: FATAL:  Datenbank »ivia« existiert nicht
sussy:~ #



You didn't specify a database to connect to. By default psql tries to  
connect to a database named after the login user, in this case ivia.  
You probably want to connect to the database named postgres that's  
created by default (at the initdb step).


Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.







Re: [GENERAL] How should the first step of PostgreSQL implementation should be?

2009-09-27 Thread Scott Marlowe
On Sun, Sep 27, 2009 at 2:36 AM, Ricky Tompu Breaky ricky.bre...@uni.de wrote:

 Welcome to psql 8.3.7, the PostgreSQL interactive terminal.

 Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

 postgres=# createdb ivia

1: This is not the SQL command, it's the command line command.  SQL is:
create database dbname;

Note the semicolon (or you can use \g to execute the buffer)

 postgres-# \du

Note that your prompt has postgres-# not postgres=#

the - tells you there's already something in the buffer.
Also, \du shows you users.  \l shows you databases.

                               List of roles
  Role name | Superuser | Create role | Create DB | Connections | Member of
 -----------+-----------+-------------+-----------+-------------+-----------
  hidden    | yes       | yes         | yes       | no limit    | {}
  ivia      | no        | no          | no        | no limit    | {}
  postgres  | yes       | yes         | yes       | no limit    | {}
  ricky     | yes       | yes         | yes       | no limit    | {}
 (4 rows)

 postgres-# commit

Again, no ;.  Note that PostgreSQL doesn't start an open transaction
automagically like Oracle.  You need an explicit begin; to open a
transaction block, or all your commands will be individual transactions
executed immediately when \g or a semicolon is used.
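The autocommit behaviour described here can be seen with an explicit block (hypothetical table t):

```sql
-- Without BEGIN, each statement below would commit on its own.
BEGIN;                      -- open an explicit transaction block
INSERT INTO t VALUES (1);   -- not yet visible to other sessions
COMMIT;                     -- now it is
```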

 postgres-# \q
 sussy:~ # psql -h 127.0.0.1 -U ivia -W
 Password for user ivia:
 psql: FATAL:  Datenbank »ivia« existiert nicht (my translation:
 Database does not exist. Look!!! It does not make difference although
 I created a database named 'ivia')

Nope, you only thought you did.  Go back and try again.   :)
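Putting the corrections together, a working sequence in psql (as the postgres superuser) would look roughly like this sketch:

```sql
CREATE DATABASE ivia OWNER ivia;  -- SQL command, terminated with a semicolon
\l                                -- psql meta-command: list databases
\q                                -- quit psql
```

After that, `psql -h 127.0.0.1 -U ivia -W ivia` from the shell should connect.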



Re: [GENERAL] Newbie's question: How can I connect to my postgresql-server?

2009-09-27 Thread Sam Mason
On Sat, Sep 26, 2009 at 03:00:41PM +0700, Ricky Tompu Breaky wrote:
 You've solved several of my problems. But now I'm stuck on another
 problem which I know is easy for anybody else but difficult for a
 postgres starter like me.

A general observation; you seem to be making things awfully complicated.
It may be worth initially going with the defaults and assuming that PG
usernames and Unix usernames are the same.  Once you've got a bit more
experience it may not all seem so complicated.  The main reason I'm saying
this is that I *very* rarely have to fiddle with these sorts of things and
yet they always do the right thing when I try.

Maybe there are other things going on that are getting in the way of
solving the real issue that will go away with more experience.

 I did these steps:
 
 1. I created a new opensuse11.1-linux login account + its password
 (username: ivia) with YaST2;

PG users and Unix users/accounts are unrelated so this step is redundant
(think of large sites with thousands of database users).  If you've
always got Unix user accounts why not just use the ident auth as
default?

 2. i...@sussy:~ su postgres -c psql postgres
 3. postgres'# ALTER USER postgres WITH PASSWORD 'mypassword';
 ALTER ROLE
 postgres=# ALTER USER ivia WITH PASSWORD 'mypassword';
 postgres'# 
 4. sussy:/etc # cat /var/lib/pgsql/data/pg_hba.conf
 host all all 0.0.0.0/0 md5

Were these setting in place when you ran the initial connection with the
postgres account to change people's passwords?  If so, I can't see how
it would work.  Maybe you are changing the wrong config file.

Try putting something invalid into the config file and check to see if
it starts up or gives an error.

 7. sussy:/etc # psql -h 127.0.0.1 -U ivia -W
 Password for user ivia: 
 psql: FATAL:  Passwort-Authentifizierung für Benutzer »ivia«
 fehlgeschlagen (my translation: Password-Authentication for user »ivia«
 failed)

It's obviously using password auth, are you sure you got the passwords
right?  You haven't got a backslash in the password have you? it'll need
to be escaped in the initial creation bit if you do.  A simple ASCII
password such as 'test' may be a good place to start.

 Why can I not log in as 'iVia' to my PostgreSQL? Is it because I use
 'md5()' and not blowfish? As I remember, OpenSuSE11.1 uses 'blowfish()'
 as its default password encryption. But AFAIK, the RDBMS encryption
 (PostgreSQL in my case) has nothing to do with the host OS
 password encryption method.

Yup, as far as I can tell this shouldn't have any effect.  Then again,
I don't use SuSE and it's possible (though very unlikely) that they
changed this.


  Sam

-- 
  Sam  http://samason.me.uk/



[GENERAL] PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-09-27 Thread Gerhard Wiesinger

Hello,

As block sizes, random I/O and linear I/O are critical I/O performance 
parameters, I had a look at PostgreSQL and at a commercial database vendor's product.


Therefore I enhanced the SystemTap script: 
http://www.wiesinger.com/opensource/systemtap/disktop_gw.stp


Output per 5 seconds on a sequential scan:
UID   PID   PPID  CMD         DEVICE  T  BYTES      REQUESTS  BYTES/REQ
 26   4263  4166  postmaster  dm-1    R  168542208  20574     8192
=> 32MB/s

So I saw that even on sequential reads (and also on bitmap heap scan access) 
PostgreSQL issues only 8k requests. I think that's a major I/O bottleneck.


A commercial database vendor solved the problem by coalescing multiple 
contiguous 8k blocks into larger reads, up to a maximum threshold. Output per 5 
seconds on an equivalent sequential scan:

UID    PID   PPID  CMD      DEVICE  T  BYTES      REQUESTS  BYTES/REQ
 1001  5381  1     process  dm-1    R  277754638  2338      118800
=> 53 MB/s

Some Google research showed that Gregory Stark has already worked on this issue 
(see references below), but as far as I can see only for bitmap heap scans.


I think this is one of the most critical performance showstoppers for PostgreSQL 
on the I/O side.


What's the current status of Gregory Stark's patch? Any timeframe for 
integration?
Does it also work for sequential scans? Any plans for a generic multi-block 
read count solution?


Any comments?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/

http://wiki.postgresql.org/wiki/Todo#Concurrent_Use_of_Resources
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00027.php
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00395.php
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00088.php
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00092.php
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00098.php

http://archives.postgresql.org/pgsql-hackers/2006-10/msg00820.php

http://markmail.org/message/a5osy4qptxk2jgu3#query:+page:1+mid:hz7uzhwxtkbzncy2+state:results
http://markmail.org/message/a5osy4qptxk2jgu3#query:+page:1+mid:a5osy4qptxk2jgu3+state:results



Re: [GENERAL] Understanding sort's memory/disk usage

2009-09-27 Thread Tom Lane
Adam Rich ada...@sbcglobal.net writes:
 Please reference these explain plans.  This is Pg 8.4.1

 http://explain-analyze.info/query_plans/4032-query-plan-2745
 http://explain-analyze.info/query_plans/4033-query-plan-2746

 First, could somebody explain what is leading the first query to choose 
 a different plan that's much slower?

I think it's rejecting the HashAggregate plan because, with the
estimated-wider rows, the hash table is estimated to exceed work_mem.

 Second, why would it choose to sort on disk for what appears to be ~32MB 
 of data, when my work_mem and temp_buffers are both 64 MB each?

The on-disk representation is more compact for various reasons.

 But it's only reporting 92kb of memory used? Why don't I see numbers 
 between 64 MB and 128 MB for both the on-disk and in-memory plans?

You're not taking into account whether the sort is on pre-aggregation or
post-aggregation data.

regards, tom lane





Re: [GENERAL] PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-09-27 Thread Sam Mason
On Sun, Sep 27, 2009 at 06:05:51PM +0200, Gerhard Wiesinger wrote:
 A google research has shown that Gregory Stark already worked on that issue 
 (see references below) but as far as I saw only on bitmap heap scans.

Greg Stark's patches are about giving the IO subsystem enough
information about where the random accesses will be ending up next.
This is important, but almost completely independent from the case
where you know you're doing sequential IO, which is what you seem to be
talking about.

 I think this is one of the most critical performance showstopper of 
 PostgreSQL on the I/O side.

PG's been able to handle data as fast as it can come back from the disk
in my tests.  When you start doing calculations then it will obviously
slow down, but what you were talking about wouldn't help here either.

Then again, I don't have a particularly amazing IO subsystem.  What
sort of performance do your disks give you and at what rate is PG doing
sequential scans for you?

-- 
  Sam  http://samason.me.uk/



[GENERAL] CREATE LANGUAGE workaround

2009-09-27 Thread Martin Gainty

I am slowly transferring my procedures over and came upon this workaround, 
implemented by a script which maps the plpgsql language to the call handler 
in plpgsql.dll (pg_finfo_plpgsql_call_handler):

# Create a plpgsql handler for plpgsql type
CREATE FUNCTION pg_finfo_plpgsql_call_handler() RETURNS plpgsql AS 
'/postgres/pgsql/bin/plpgsql.dll' LANGUAGE C;
CREATE TRUSTED PROCEDURAL LANGUAGE plpgsql HANDLER 
pg_finfo_plpgsql_call_handler;

Is there a system-defined script I can run which would map the plpgsql handler 
to the plpgsql type automatically?
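For what it's worth, on 8.x the handler mapping normally doesn't need a hand-written script; the server can fill in the handler itself from its pg_pltemplate catalog (a sketch of the standard approach):

```sql
-- Installs PL/pgSQL into the current database; the call handler is
-- looked up automatically via pg_pltemplate (PostgreSQL 8.1+).
CREATE LANGUAGE plpgsql;
```

Equivalently, `createlang plpgsql mydb` from the shell. The explicit CREATE FUNCTION ... HANDLER form is only needed for languages that have no pg_pltemplate entry.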

thanks!
Martin Gainty 

Re: [GENERAL] generic modelling of data models; enforcing constraints dynamically...

2009-09-27 Thread InterRob
Dear David, dear all,
I understand very well what you are saying... However, the solution won't be
found in the direction you are suggesting: the system I am designing will be
used by archaeologists, involved in archaeological research (fieldwork).
Their research strategy (and with it their methodology and techniques) may
vary during research, depending on their findings and their understanding of
the past that is being reconstructed on-site... Along with these methodologies
and techniques, differing data models may need to be introduced...

Relationships between these models may be formalised (that's what I (will)
put stakeholders together for); in fact this is what I try to model --
trying to develop a system that will centralize and version the data
gathered. On a supra-project level, in fact.

Meanwhile, I made some progress; in another mail I sent to this same list, I
described the technological challenge at hand as follows:
-
What I am trying to do is: building views on a base table, extended by one
or more columns, extracted (hence the naming of the function
deserialize()) from a SINGLE column (XML) **that is in this same base
table** (see below). Instructions for deserialization (that is: which
'fields' to look for) reside in some other table. There are MULTIPLE base
tables, they basically look like this:

[table definition:]
BASETABLE(ID INT, model TEXT, field1 some_type, field2 some_type, ... fieldN
some_type, serialized_data XML)

So, I wish to define multiple VIEWs based on a BASETABLE; one for each
model (as stated in the above table definition: model is a property of
each row). This QUERY would look like this (producing a VIEW for MODEL1;
the query below is invalid, unfortunately):

 SELECT base_t.*, deserialized.* FROM BASETABLE base_t,
deserialize('MODEL1', base_t) as deserialized(fieldX some_type, fieldY
some_type) WHERE base_t.model = 'MODEL1';

I have no problem with the requirement to supply the table type in the
query; in fact this is logical. Still, this query is impossible, obviously,
because base_t as a target is not known in the context of the FROM-clause,
where I wish to use it in calling deserialize(...). Of course, I could
write a deserialize() function for each base table (e.g.
deserialize_base1(...)) but I wish it to perform its action on only rows
that will actually be part of the result set; thus I want the WHERE-clause
to apply to the function's seq scan also. When provided, I wish to
incorporate the user's WHERE-clause as well; this is done by the PostgreSQL
RULE system...

Alternatively, the VIEW could be defined by the following query:
 SELECT base_t.*, (deserialize('MODEL1', base_t) as temp(fieldX some_type,
field_Y some_type)).* FROM BASETABLE base_t WHERE base_t.model = 'MODEL1';

This approach does not work either: deserialize(...) will return its set of
fields as ONE field (comma-separated, surrounded by brackets); expressions
within a SELECT-list seem to be allowed to result in only ONE column, except
for the * shorthand...

** So, the question is: how can I feed my deserialize() function with a
record subset (e.g. filtered by a WHERE-clause) of the BASETABLE, while still
returning a record?!? **
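For the record: a set-returning function in FROM that needs to reference another FROM item is exactly what the SQL LATERAL clause addresses; PostgreSQL gained it in version 9.3, well after this thread. A sketch only, reusing the hypothetical deserialize() function and placeholder field names from the message above:

```sql
-- Sketch: LATERAL (PostgreSQL 9.3+) lets deserialize() see base_t,
-- and the WHERE-clause restricts which rows the function is called for.
CREATE VIEW model1_view AS
SELECT base_t.*, d.*
FROM basetable AS base_t,
     LATERAL deserialize('MODEL1', base_t)
         AS d(fieldX integer, fieldY text)   -- types are placeholders
WHERE base_t.model = 'MODEL1';
```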

I tried the following approach also:
 SELECT (SELECT fieldX FROM deserialize(base_t) deserialized(fieldX
some_type, fieldY some_type)) fieldX, (SELECT fieldY FROM
deserialize(base_t) deserialized(fieldX some_type, fieldY some_type))
fieldY FROM BASETABLE base_t WHERE base_t.model = 'MODEL1';

Which in fact worked, but caused the function to get invoked TWICE FOR EACH
ROW (even when marked IMMUTABLE STRICT; or did I have to flush cached query
plans in psql?).

Another approach would be to put all key/value pairs into a separate table
(as one would do when implementing an EAV model within an RDBMS) which is then
to be joined (and joined again... and possibly again (!); in case of
MULTIPLE additional rows -- depending on the definition of the VIEW) onto
the BASETABLE, rather than to deserialize from XML which is stored within
the same record... How does this approach then actually translate in terms
of table scans? Will they be limited by the filter on the BASETABLE, as the
available values to join on will be limited? At any rate: this approach will
be more difficult to implement / maintain in case of EDITABLE VIEWS
(inserts, update, delete)...
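To make the join-based alternative concrete, here is a rough sketch (the side table and all names are invented for illustration): each virtual field becomes one LEFT JOIN against the key/value table, and with an index on (base_id, key) the planner should be able to apply the BASETABLE filter before probing the joins.

```sql
-- Hypothetical EAV side-table and a view that pivots it back into columns.
CREATE TABLE extra_values (
    base_id integer NOT NULL,
    key     text    NOT NULL,
    value   text,
    PRIMARY KEY (base_id, key)   -- one LEFT JOIN per virtual field stays cheap
);

CREATE VIEW model1_view AS
SELECT b.*, fx.value AS fieldX, fy.value AS fieldY
FROM basetable b
LEFT JOIN extra_values fx ON fx.base_id = b.id AND fx.key = 'fieldX'
LEFT JOIN extra_values fy ON fy.base_id = b.id AND fy.key = 'fieldY'
WHERE b.model = 'MODEL1';
```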

Hope any of you has some useful thoughts on this... It appears to me that
updating the additional (virtual) fields in the BASETABLE is much easier:
the serialize() function can be fed a list of key/value pairs,
producing some XML that can be stored in the serialized_data XML field,
part of this same base table...
All this needs to be implemented fully in the database back-end; client
applications will not know they are talking to VIEWs rather than tables...
Thus: the hosted database must appear to provide various tables, whereas
these are in fact stored in a limited number of base tables.



 Thanks in advance, you guys out there!


Rob


[GENERAL] Questions On Tablespace

2009-09-27 Thread Carlo Camerino
Hi Everyone,

I have questions regarding tablespaces. What happens when the disk
that my tablespace is on fills up?
How do I expand my tablespace? In Oracle there is a concept of
datafiles; in PostgreSQL I specify a directory instead of a single
file...

For example I have two tables and they both use a single tablespace.
After some time the tablespace fills up. How do I point one table to
use a new tablespace?
Or is there a way in which I can get my tablespace to increase size by
using another disk for this purpose?
What's the best approach to this situation?

Thanks A Lot
Carlo

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] CREATE LANGUAGE workaround

2009-09-27 Thread Scott Marlowe
2009/9/27 Martin Gainty mgai...@hotmail.com:
 I am slowly transferring my Procedures over and came upon this workaround
 implemented by a script which maps the plpgsql (type) to call plpgsql.dll
 pg_finfo_plpgsql_call_handler

 # Create a plpgsql handler for plpgsql type
 CREATE FUNCTION pg_finfo_plpgsql_call_handler() RETURNS plpgsql AS
 '/postgres/pgsql/bin/plpgsql.dll' LANGUAGE C;
 CREATE TRUSTED PROCEDURAL LANGUAGE plpgsql HANDLER
 pg_finfo_plpgsql_call_handler;

 is there a system defined script I can run which would map the plpgsql
 handler to the plpgsql type automatically?

On Unix there's a createlang command (run from the CLI, not from psql)
that does this.

createlang plpgsql dbname



Re: [GENERAL] generic modelling of data models; enforcing constraints dynamically...

2009-09-27 Thread InterRob
In fact, I considered doing so, yes... But no luck: to complicate things, I
will need support for spatial datatypes, as implemented by the contrib
module PostGIS... Moreover: various applications that will make up the
front-end will only be able to talk with mainstream or ODBC-compatible
databases  :((

Rob

2009/9/26 Erik Jones ejo...@engineyard.com


 On Sep 24, 2009, at 2:07 PM, InterRob wrote:

  I guess it IS quite overengineered indeed...

 What I'm trying to do is to facilitate different fieldwork methodologies
 for archaeological research (on project basis); there is no final agreement
 on data structure and semantics; however, on a meta-level all choices are
 rational and can be modelled... In fact, all models can be related to each
 other: that's where the hybrid part comes in: I wish to implement the
 common denominator (90%) and then further extend this, enabling specific data
 model implementations -- including checks for data integrity.


 Have you considered a non-relational, schema-less database such as
 MongoDB or Cassandra?  You're pretty much throwing out the relational
 features of this database anyways so it seems that it would make sense to
 use something more geared to that kind of work.

 Erik Jones, Database Administrator
 Engine Yard
 Support, Scalability, Reliability
 866.518.9273 x 260
 Location: US/Pacific
 IRC: mage2k









Re: [GENERAL] PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-09-27 Thread Gerhard Wiesinger


On Sun, 27 Sep 2009, Sam Mason wrote:


On Sun, Sep 27, 2009 at 06:05:51PM +0200, Gerhard Wiesinger wrote:

A Google search has shown that Gregory Stark already worked on that issue
(see references below) but as far as I saw only on bitmap heap scans.


Greg Stark's patches are about giving the IO subsystem enough
information about where the random accesses will be ending up next.
This is important, but almost completely independent from the case
where you know you're doing sequential IO, which is what you seem to be
talking about.



I'm talking about 2 cases
1.) Sequential scans
2.) Bitmap index scans
which both hopefully end up in physically consecutive blocks, where
larger block sizes can benefit.



I think this is one of the most critical performance showstoppers of
PostgreSQL on the I/O side.


PG's been able to handle data as fast as it can come back from the disk
in my tests.  When you start doing calculations then it will obviously
slow down, but what you were talking about wouldn't help here either.

Then again, I don't have a particularly amazing IO subsystem.  What
sort of performance do your disks give you and at what rate is PG doing
sequential scans for you?



Hello Sam,

Detailed benchmarks are below; the original ones from PostgreSQL have
already been posted. So I would expect at least about 60-80 MB/s in reading
for PostgreSQL (when larger block sizes are read) in practice on
sequential scans, but they are at about 30 MB/s. See also pgiosim below.


Setup is:
Disk Setup: SW RAID 5 with 3x1TB SATA 7200 RPM disks
Linux Kernel: 2.6.30.5-43.fc11.x86_64
CPU: Quad Core: AMD Phenom(tm) II X4 940 Processor, 3GHz
RAM: 8GB

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/


http://pgfoundry.org/projects/pgiosim/

#
# Performance benchmarks:
#
dd if=/dev/zero of=test.txt bs=8192 count=1310720 conv=fdatasync
1310720+0 records in
1310720+0 records out
10737418240 bytes (11 GB) copied, 169.482 s, 63.4 MB/s

dd if=test.txt of=/dev/null bs=8192
1310720+0 records in
1310720+0 records out
10737418240 bytes (11 GB) copied, 86.4457 s, 124 MB/s

#
# RANDOM
#

# Random 8k block reads
echo 3 > /proc/sys/vm/drop_caches; ./pgiosim -b 10000 test.txt
Arg: 10000
Added test.txt
Elapsed: 148.22
Read 10000 blocks Wrote 0 blocks
67.47 op/sec, 539.75kB/sec

# Random 8k block reads & writes
echo 3 > /proc/sys/vm/drop_caches; ./pgiosim -b 10000 -w 100 test.txt
Write Mode: 100%
Arg: 10000
Added test.txt
Elapsed: 201.44
Read 10000 blocks Wrote 10000 blocks
49.64 op/sec, 397.14kB/sec

# Random 8k block reads & writes, sync after each block
echo 3 > /proc/sys/vm/drop_caches; ./pgiosim -b 10000 -w 100 -y test.txt
Write Mode: 100%
fsync after each write
Arg: 10000
Added test.txt
Elapsed: 282.30
Read 10000 blocks Wrote 10000 blocks
35.42 op/sec, 283.39kB/sec

#
# SEQUENTIAL
#

# Sequential 8k block reads
echo 3 > /proc/sys/vm/drop_caches; ./pgiosim -s -b 1000000 test.txt
Seq Scan
Arg: 1000000
Added test.txt
Elapsed: 71.88
Read 1000000 blocks Wrote 0 blocks
13911.40 op/sec, 111291.17kB/sec

# Sequential 8k block reads & writes
echo 3 > /proc/sys/vm/drop_caches; ./pgiosim -s -b 1000000 -w 100 test.txt
Seq Scan
Write Mode: 100%
Arg: 1000000
Added test.txt
Elapsed: 261.24
Read 1000000 blocks Wrote 1000000 blocks
3827.90 op/sec, 30623.18kB/sec

# Sequential 8k block reads & writes, sync after each block
echo 3 > /proc/sys/vm/drop_caches; ./pgiosim -s -b 10000 -w 100 -y test.txt
Seq Scan
Write Mode: 100%
fsync after each write
Arg: 10000
Added test.txt
Elapsed: 27.03
Read 10000 blocks Wrote 10000 blocks
369.96 op/sec, 2959.68kB/sec

#




Re: [GENERAL] CREATE LANGUAGE workaround

2009-09-27 Thread Tom Lane
Martin Gainty mgai...@hotmail.com writes:
 I am slowly transferring my Procedures over and came upon this workaround
 implemented by a script which maps the plpgsql (type) to call plpgsql.dll 
 pg_finfo_plpgsql_call_handler 

 # Create a plpgsql handler for plpgsql type
 CREATE FUNCTION pg_finfo_plpgsql_call_handler() RETURNS plpgsql AS 
 '/postgres/pgsql/bin/plpgsql.dll' LANGUAGE C;
 CREATE TRUSTED PROCEDURAL LANGUAGE plpgsql HANDLER 
 pg_finfo_plpgsql_call_handler;

 is there a system defined script I can run which would map the plpgsql 
 handler to the plpgsql type automatically?

I'm not sure what you are trying to accomplish, but the above looks like
it should all just be replaced by
CREATE LANGUAGE plpgsql;
in reasonably modern versions of Postgres.

There is no such thing as a plpgsql type.

regards, tom lane



Re: [GENERAL] Questions On Tablespace

2009-09-27 Thread Tom Lane
Carlo Camerino carlo.camer...@gmail.com writes:
 I have questions regarding tablespaces, What happens when the disk on
 which my tablespace is in fills up?

You start getting errors.

 How do I expand my tablespace, in oracle there is a concept of
 datafiles? In postgresql I specify a directory instead of a single
 file...

If you expect to need to expand the filesystem, you should be using
LVM or local equivalent so that you can add or remove disks from
the filesystem as needed.
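For the narrower question of pointing an existing table at a new tablespace, plain SQL suffices (paths and names below are examples only):

```sql
-- Create a tablespace on the new disk and move one table onto it.
-- ALTER TABLE ... SET TABLESPACE rewrites the table's files and holds
-- an exclusive lock on the table for the duration.
CREATE TABLESPACE big_ts LOCATION '/mnt/newdisk/pgdata';
ALTER TABLE my_big_table SET TABLESPACE big_ts;
-- Newly created objects can also default there:
-- SET default_tablespace = 'big_ts';
```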

Oracle's design dates from a time when filesystems tended to suck and so
Oracle felt it should reimplement all the filesystem-level functionality
for itself.  Postgres is not interested in reinventing the wheel, so we
don't do that.  You won't find any raw disk access functions in
Postgres either.

regards, tom lane



Re: [GENERAL] generic modelling of data models; enforcing constraints dynamically...

2009-09-27 Thread David Fetter
On Sun, Sep 27, 2009 at 08:26:27PM +0200, InterRob wrote:
 Dear David, dear all,
 I very well understand what you are saying...

Clearly you do not.  What you are proposing has been tried many, many
times before, and universally fails.

That your people are failing to get together and agree to a data model
is not a reason for you to prop up their failure with a technological
fix that you know from the outset can't be made to work.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [GENERAL] generic modelling of data models; enforcing constraints dynamically...

2009-09-27 Thread Peter Hunsberger
On Sun, Sep 27, 2009 at 2:22 PM, David Fetter da...@fetter.org wrote:
 On Sun, Sep 27, 2009 at 08:26:27PM +0200, InterRob wrote:
 Dear David, dear all,
 I very well understand what you are saying...

 Clearly you do not.  What you are proposing has been tried many, many
 times before, and universally fails.

I've been refraining from jumping on this due to time constraints, but
this statement is silly.  We have a system that does almost exactly
what the OP wants although the implementation is slightly different:
we use an EAV like model with strong typing and build set / subset
forests to maintain arbitrary hierarchies of relationships.  Our
reasons for doing this are similar to the OP's; it's for research (in
our case medical research).  We maintain over 200,000 pieces of end
user generated metadata, describing what would be in a conventional
relational model over 20,000 columns and some 1,000s of tables but the
actual physical model is some 40 tables.   Yes, the flip side is, such
a system won't support more than 1,000,000s of transactions per day,
but that's not why you build them.
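A minimal sketch of what "EAV with strong typing" can look like in SQL (all names invented; the actual system described here is certainly richer): an attribute catalogue pins each attribute to a type, and a CHECK constraint forces exactly one typed value column per row.

```sql
-- Hypothetical illustration, not the poster's schema.
CREATE TABLE attribute_def (
    attr_id   serial PRIMARY KEY,
    name      text NOT NULL UNIQUE,
    data_type text NOT NULL CHECK (data_type IN ('int', 'text', 'date'))
);

CREATE TABLE attribute_value (
    entity_id integer NOT NULL,
    attr_id   integer NOT NULL REFERENCES attribute_def,
    int_val   integer,
    text_val  text,
    date_val  date,
    -- exactly one of the typed columns may be non-NULL
    CHECK ( (int_val  IS NOT NULL)::int
          + (text_val IS NOT NULL)::int
          + (date_val IS NOT NULL)::int = 1 )
);
```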


 That your people are failing to get together and agree to a data model
 is not a reason for you to prop up their failure with a technological
 fix that you know from the outset can't be made to work.


Spoken like someone who has always had the luxury of working in areas
with well defined problem domains...   I can't tell you the number of
people that told us exactly the same thing when we started on it.
That was 8 years ago.  Not only can such systems be built, they can be
made to scale reasonably well.  You do need to understand what you are
doing and why: the costs can be high, but when it comes to research,
the benefits can far outweigh the costs.

-- 
Peter Hunsberger



Re: [GENERAL] generic modelling of data models; enforcing constraints dynamically...

2009-09-27 Thread InterRob
Dear David, dear Peter, dear all,
Peter, I was happy reading your reply right after I opened and read David's.
I do think I am on the right track; it is not a matter of building the
one-and-only right schema, not in this case. Archaeology has the same twist
as ethnography, anthropology and the like: they work with (what I would
call) narratives (in fact, in the case of archaeology this seems to me to
be an archaeologist's monologue...). They try to support their findings with
statistics and other means of quantification -- as this modern,
rationalist world requires them to do, to be taken seriously as science... I
seek to implement all this in a hybrid form; a fusion between the relational
and EAV concepts.

Peter, may I invite you to privately share some more details on the system
you are using and the design of it? Did you implement it using PostgreSQL?
Looking forward to your reply.
(And with respect to your previous message: whom are you actually referring
to by the acronym OPs?)

Cheerz,


Rob

2009/9/27 Peter Hunsberger peter.hunsber...@gmail.com

 On Sun, Sep 27, 2009 at 2:22 PM, David Fetter da...@fetter.org wrote:
  On Sun, Sep 27, 2009 at 08:26:27PM +0200, InterRob wrote:
  Dear David, dear all,
  I very well understand what you are saying...
 
  Clearly you do not.  What you are proposing has been tried many, many
  times before, and universally fails.

 I've been refraining from jumping on this due to time constraints, but
 this statement is silly.  We have a system that does almost exactly
 what the OP wants although the implementation is slightly different:
 we use an EAV like model with strong typing and build set / subset
 forests to maintain arbitrary hierarchies of relationships.  Our
 reasons for doing this are similar to the OPs; it's for research (in
 our case medical research).  We maintain over 200,000 pieces of end
 user generated metadata, describing what would be in a conventional
 relational model over 20,000 columns and some 1,000s of tables but the
 actual physical model is some 40 tables.   Yes, the flip side is, such
 a system won't support more than 1,000,000s of transactions per day,
 but that's not why you build them.

 
  That your people are failing to get together and agree to a data model
  is not a reason for you to prop up their failure with a technological
  fix that you know from the outset can't be made to work.
 

 Spoken like someone who has always had the luxury of working in areas
 with well defined problem domains...   I can't tell you the number of
 people that told us exactly the same thing when we started on it.
 That was 8 years ago.  Not only can such systems be built, they can be
 made to scale reasonably well.  You do need to understand what you are
 doing and why: the costs can be high, but when it comes to research,
 the benefits can far outweigh the costs.

 --
 Peter Hunsberger




Re: [GENERAL] PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-09-27 Thread Sam Mason
On Sun, Sep 27, 2009 at 09:04:31PM +0200, Gerhard Wiesinger wrote:
 I'm talking about 2 cases
 1.) Sequential scans
 2.) Bitmap index scans
 which both hopefully end up in physically consecutive blocks, where
 larger block sizes can benefit.

Unfortunately it's all a bit more complicated than you hope :(
Sequential scans *may* benefit from larger block sizes, but not much.
Your testing below doesn't seem to test this at all though.

Bitmap index scan will still be accessing blocks in a somewhat random
order (depending on how much correlation there is between the index and
physical rows, and what the selectivity is like).  The result of any
index scan (bitmap or otherwise) must come back in the correct order
(PG is designed around this) and the the best idea to speed this up has
been Greg's read ahead patch.  This pushes more information down into
the kernel so it can start reading the blocks back before PG actually
gets to them.  They are still going to be somewhat out of order so, in
the general case, you're going to be limited by the seek speed of your
disks.
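The kernel-hinting idea behind that patch can be sketched outside PostgreSQL with posix_fadvise(2), shown here via Python's os module purely for illustration: announce the blocks a scan will need, then issue the actual reads later.

```python
import os
import tempfile

BLOCK = 8192  # PostgreSQL's default page size

# Build a small 200-block "relation" to play with.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * BLOCK * 200)
    path = f.name

fd = os.open(path, os.O_RDONLY)

# Hint phase: tell the kernel which (out-of-order) blocks we'll want,
# so it can schedule the seeks before we block waiting on each read.
for blkno in (17, 99, 42):
    os.posix_fadvise(fd, blkno * BLOCK, BLOCK, os.POSIX_FADV_WILLNEED)

# Read phase: by now the kernel may already have the pages in cache.
data = os.pread(fd, BLOCK, 42 * BLOCK)

os.close(fd)
os.unlink(path)
print(len(data))
```

Note this is only the WILLNEED half of the story; the patch itself also has to decide how far ahead of the scan to issue hints.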


 Detailed benchmarks are below, the original one from PostgreSQL have 
 already been posted.

Which was saying what?  You were getting 32MB/s and 53MB/s from what?

As a quick test, maybe:

  create table benchmark ( i integer, j text, k text );
  begin; truncate benchmark; insert into benchmark select 
generate_series(1,1024*1024*10), '0123456789abcdef','0123456789abcdef'; commit;

The first run of:

  select count(*) from benchmark;

Will cause the hint bits to get set and will cause a lot of writing to
happen.  Subsequent runs will be testing read performance.  My simple
SATA disk at home gets ~90MB/s when tested with hdparm, which I'm taking as
the upper performance limit.  When I perform the above query, I see the
disk pulling data back at 89.60MB/s (stddev of 2.27) which is actually
above what I was expecting (there's a filesystem in the way). CPU usage
is around 10%.  Tested by turning on \timing mode in psql, dropping
caches and running:

  SELECT 715833344 / 7597.216 / 1024;

Where 715833344 is the size of the file backing the benchmark table
above and 7597.216 is the time taken in ms.
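For clarity, the arithmetic (with the unit fix Sam posts in a follow-up later in this digest) works out as follows:

```python
# Bytes read divided by elapsed time, expressed in MB/s.
size_bytes = 715833344        # size of the file backing the benchmark table
elapsed_ms = 7597.216         # \timing result from psql

mb_per_s = size_bytes / (elapsed_ms / 1000.0) / (1024 * 1024)
print(round(mb_per_s, 2))     # just under the ~90 MB/s hdparm figure
```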

 http://pgfoundry.org/projects/pgiosim/

This seems to just be testing seek performance, not sequential
performance.

 dd if=/dev/zero of=test.txt bs=8192 count=1310720 conv=fdatasync
 10737418240 bytes (11 GB) copied, 169.482 s, 63.4 MB/s
 
 dd if=test.txt of=/dev/null bs=8192
 10737418240 bytes (11 GB) copied, 86.4457 s, 124 MB/s

These look slow.  RAID5 isn't going to be amazing, but it should be
better than this.  I'd spend some more time optimizing your system
config before worrying about PG.  If I can read at 90MB/s from a single
stock SATA drive you should be almost hitting 200MB/s with this, or
300MB/s in a RAID1 across three drives.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] generic modelling of data models; enforcing constraints dynamically...

2009-09-27 Thread Oliver Kohll - Mailing Lists


On 27 Sep 2009, at 21:10, InterRob rob.mar...@gmail.com wrote:

Peter, may I invite you to privately share some more details on the  
system you are using and the design of it? Did you implement it  
using PostgreSQL? Looking forward to your reply.
(And with respect to your previous message: whom are you actually  
referring to by the acronym OPs?)




Or publicly? I for one would be interested in hearing more. From  
situations I've come across, EAV seems to be proposed when either

1) attributes are very numerous and values very sparse
2) people want to be able to quickly add (and remove?) attributes
My feeling is it's probably valid for 1, at least I haven't come  
across anything better, but not for 2.


Regards
Oliver

www.gtwm.co.uk - company
www.gtportalbase.com - product



Re: [GENERAL] PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-09-27 Thread Sam Mason
On Sun, Sep 27, 2009 at 10:01:27PM +0100, Sam Mason wrote:
 Tested by turning on \timing mode in psql, dropping
 caches and running:
 
   SELECT 715833344 / 7597.216 / 1024;

Help, I can't do maths!  This is overestimating the performance and
should be:

  SELECT 715833344 / 7597.216 / 1024 / 1024 * 1000;

After a few more runs to increase confidence, the read performance is
87.17 and a stddev of 2.8.  Which seems more reasonable, it should *not*
be going above 90MB/s as often as it was.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] generic modelling of data models; enforcing constraints dynamically...

2009-09-27 Thread InterRob
Oliver,
Would you say it is not valid for proposition 2 (people wanting to be able
to quickly add (and remove?) attributes) because within the relational model
this can be done reasonably well?

If you think so, then we do in fact agree on that... Still, however,
implementing this transparently (that is: back-end/server side; using VIEWs
is the only way I can think of) is a major challenge. Implementing the use
of USER DEFINED additional fields within a certain application (front-end /
client side) is much easier...


Rob

2009/9/27 Oliver Kohll - Mailing Lists oliver.li...@gtwm.co.uk


 On 27 Sep 2009, at 21:10, InterRob rob.mar...@gmail.com wrote:

 Peter, may I invite you to privately share some more details on the system
 you are using and the design of it? Did you implement it using PostgreSQL?
 Looking forward to your reply.
 (And with respect to your previous message: whom are you actually referring
 to by the acronym OPs?)


 Or publicly? I for one would be interested hearing more. From situations
 I've come across, EAV seems to be proposed when either
 1) attributes are very numerous and values very sparse
 2) people want to be able to quickly add (and remove?) attributes
 My feeling is it's probably valid for 1, at least I haven't come across
 anything better, but not for 2.

 Regards
 Oliver

 www.gtwm.co.uk - company
 www.gtportalbase.com - product




[GENERAL] dump time increase by 1h with new kernel

2009-09-27 Thread Justin Pryzby
When we upgraded from linux-2.6.24 to linux-2.6.27, our pg_dump
duration increased by 20%.  My first attempt at resolution was to boot
with elevator=deadline.  However that's actually the default IO
scheduler in both kernels.

The two dmesg's are at:
https://www.norchemlab.com/tmp/linux-2.6.24-22.45-server
https://www.norchemlab.com/tmp/linux-2.6.27-14.41-server

The database partition is: xfs / lvm / aic79xx / scsi.

Booting back into the .24 kernel brings the pg_dump down to 5 hours
(rather than 6, for daily 20GB output compressed by pg_dump -Fc).

Does anyone know what might be different which could cause such a
drastic change?

Thanks,
Justin



Re: [GENERAL] generic modelling of data models; enforcing constraints dynamically...

2009-09-27 Thread Scott Marlowe
On Sun, Sep 27, 2009 at 5:44 PM, InterRob rob.mar...@gmail.com wrote:
 Oliver,
 Would you say it is not valid for proposition 2 (people wanting to be able
 to quickly add (and remove?) attributes) because within the relational model
 this can be done reasonably well?

Actually that's what I think it's best at, as long as you aren't
trying to get fancy.  We have a part of an intranet type app that lets
users upload table formatted data that's like a freeform spreadsheet
and we use EAV to store the data for that.  There's no FK or other
relational stuff.

The problems start to pile up when you try to do something exciting,
interesting, fascinating or other 'ings...



Re: [GENERAL] PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans

2009-09-27 Thread Scott Marlowe

 dd if=/dev/zero of=test.txt bs=8192 count=1310720 conv=fdatasync
 10737418240 bytes (11 GB) copied, 169.482 s, 63.4 MB/s

 dd if=test.txt of=/dev/null bs=8192
 10737418240 bytes (11 GB) copied, 86.4457 s, 124 MB/s

 These look slow.  RAID5 isn't going to be amazing, but it should be
 better than this.  I'd spend some more time optimizing your system
 config before worrying about PG.  If I can read at 90MB/s from a single
 stock SATA drive you should be almost hitting 200MB/s with this, or
 300MB/s in a RAID1 across three drives.

They are slow, but they are not atypical for RAID-5; especially the slow
writes with SW RAID-5 are typical.

I'd try a simple test on a 2 or 3 disk RAID-0 for testing purposes
only to see how much faster a RAID-10 array of n*2 disks could be.
The increase in random write performance for RAID-10 will be even more
noticeable.



[GENERAL] problem with array query

2009-09-27 Thread Grant Maxwell

Hi Folks

According to the 8.3 docs I should be able to write:
select * from tblretrain where   'ms-ap-t2-02c9' NOT IN   (owners);

where owners is an array per the following definition

CREATE TABLE tblretrain
(
  pkretrainid integer NOT NULL,
  mailid integer NOT NULL,
  train_to smallint NOT NULL,
  owners character varying(1024)[],
  bayes_trained boolean DEFAULT false,
  contents text NOT NULL,
  CONSTRAINT tblretrain_pk PRIMARY KEY (pkretrainid)
)

The problem is that it generates an error:

ERROR:  array value must start with "{" or dimension information
** Error **
ERROR: array value must start with "{" or dimension information
SQL state: 22P02

It seems as though postgres is not recognising owners as an array.

Any suggestions please ?
regards
Grant





Re: [GENERAL] problem with array query

2009-09-27 Thread Tom Lane
Grant Maxwell grant.maxw...@maxan.com.au writes:
 According to the 8.3 docs I should be able to write:
 select * from tblretrain where   'ms-ap-t2-02c9' NOT IN   (owners);
 where owners is an array per the following definition
owners character varying(1024)[],

No, what you can write is <> ALL, not NOT IN.

 It seems as though postgres is not recognising owners as an array.

It's trying to parse the literal as an array so that it can do a
plain equality comparison against the owners column.

You probably read the part of the docs where it says that
x NOT IN (SELECT ...) is equivalent to x <> ALL (SELECT ...).
Which is true, but it has nothing to do with the non-sub-SELECT syntax.
Without a sub-SELECT, we have two cases:
x NOT IN (y,z,...) expects x,y,z to all be the same type.
x <> ALL (y) expects y to be an array of x's type.
Got it?
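Spelled out with throwaway literals, the two cases look like this:

```sql
-- Scalar list: every element must be the same type as the left side.
SELECT 'b' NOT IN ('a', 'c');            -- true

-- Array form: the right-hand side is a single array value.
SELECT 'b' <> ALL (ARRAY['a', 'c']);     -- true

-- A NULL element makes the comparison NULL, so the row is filtered
-- out -- a likely cause of unexpectedly empty result sets.
SELECT 'b' <> ALL (ARRAY['a', NULL]);    -- NULL, not true
```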

regards, tom lane



Re: [GENERAL] problem with array query

2009-09-27 Thread Grant Maxwell

Hi Tom

The bit I was reading is

http://www.postgresql.org/docs/8.3/interactive/arrays.html#AEN6019
__ EXTRACT 
 However, this quickly becomes tedious for large arrays, and is not  
helpful if the size of the array is uncertain. An alternative method  
is described in Section 9.20. The above query could be replaced by:


SELECT * FROM sal_emp WHERE 1 = ANY (pay_by_quarter);
In addition, you could find rows where the array had all values equal  
to 1 with:


SELECT * FROM sal_emp WHERE 1 = ALL (pay_by_quarter);
 END EXTRACT __
(section 9.20 is the bit that suggests the syntax I was trying)
<> ALL is not working. I thought it would fail if the LHS does not
differ from every array member of the RHS.
What I'm trying to do is find every record where my name is not in  
the array.

So I tried <> ANY and also <> ALL and both returned an empty row set.
regards
Grant


On 28/09/2009, at 11:42 AM, Tom Lane wrote:


Grant Maxwell grant.maxw...@maxan.com.au writes:

According to the 8.3 docs I should be able to write:
select * from tblretrain where   'ms-ap-t2-02c9' NOT IN   (owners);
where owners is an array per the following definition
  owners character varying(1024)[],


No, what you can write is <> ALL, not NOT IN.


It seems as though postgres is not recognising owners as an array.


It's trying to parse the literal as an array so that it can do a
plain equality comparison against the owners column.

You probably read the part of the docs where it says that
x NOT IN (SELECT ...) is equivalent to x <> ALL (SELECT ...).
Which is true, but it has nothing to do with the non-sub-SELECT  
syntax.

Without a sub-SELECT, we have two cases:
x NOT IN (y,z,...) expects x,y,z to all be the same type.
x <> ALL (y) expects y to be an array of x's type.
Got it?

regards, tom lane





Re: [GENERAL] problem with array query

2009-09-27 Thread Tom Lane
Grant Maxwell grant.maxw...@maxan.com.au writes:
 What I'm trying to do is find every record where my name is not in  
 the array.
 So I tried <> ANY and also <> ALL and both returned an empty row set.

Maybe you have some nulls in the arrays?  <> ALL works for me.
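[The NULL behaviour is easy to reproduce; a sketch with illustrative values:]

```sql
SELECT 'me' <> ALL (ARRAY['you', 'them']);  -- true: the row would be returned
SELECT 'me' <> ALL (ARRAY['you', NULL]);    -- null: the row is silently filtered out
-- One workaround that works on 8.3: treat the unknown result as a match.
SELECT NOT coalesce('me' = ANY (ARRAY['you', NULL]), false);  -- true
```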

regards, tom lane



Re: [GENERAL] getting PostgreSQL to run on superH-based machines

2009-09-27 Thread Nobuhiro Iwamatsu
Hi, all.

Sorry, I did not check these mails.

2009/7/27 Tom Lane t...@sss.pgh.pa.us:
 sibu xolo sib...@btconnect.com writes:
+                    tas.b @%1\n\t  \n
 +                    movt   %0\n\t   \n
 +                    xor    #1,%0    \n

 Hmm, what is the point of introducing extra blank lines into the asm
 output?  I would hope those are unnecessary, but one never knows ...


I rewrite patch and test on git/HEAD.
Please wait...

Best regards,
  Nobuhiro

-- 
Nobuhiro Iwamatsu / Debian Developer
   iwamatsu at {nigauri.org / debian.org}



Re: [GENERAL] How should the first step of PostgreSQL implementation should be? (revised)

2009-09-27 Thread Ricky Tompu Breaky
Dear Alban.

You've solved my problem!

Thank you 1000x.

On Sun, 27 Sep 2009 11:36:09 +0200
Alban Hertroys dal...@solfertje.student.utwente.nl wrote:

 On 27 Sep 2009, at 10:44, Ricky Tompu Breaky wrote:
 
  I forgot to show you that I've done these steps too:
  postgres=# ALTER ROLE ivia WITH LOGIN;
  ALTER ROLE
  postgres=# alter user ivia with password 'my password';
  ALTER ROLE
  postgres=# alter user ivia with login;
  ALTER ROLE
  postgres=# commit;
  WARNUNG:  keine Transaktion offen (warning: no transaction is open)
  COMMIT
  postgres=# \q
  sussy:~ # psql -h 127.0.0.1 -U ivia -W
  Password for user ivia:
  psql: FATAL:  Datenbank »ivia« existiert nicht (the database »ivia« does not exist)
  sussy:~ #
 
 
 You didn't specify a database to connect to. By default psql tries
 to connect to a database named after the login user, in this case
 ivia. You probably want to connect to the database named postgres
 that's created by default (at the initdb step).
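 In other words, naming the database explicitly should work (same
 host and user as in the transcript above):

```
psql -h 127.0.0.1 -U ivia -d postgres -W
```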
 
 Alban Hertroys
 
 --
 Screwing up is the best way to attach something to the ceiling.
 
 
 
 
 




Re: [GENERAL] Newbie's question: How can I connect to my postgresql-server?

2009-09-27 Thread Ricky Tompu Breaky
Dear Sam...

After I found the solution to my problem and read the postgres
manual again, I understood you were correct that I tried too much as
an initial step for a newbie like me.

But anyway, thank you so many times, because you've taught me a
lot about postgres.

I really appreciate your help. You've taken me so many steps further.

On Sun, 27 Sep 2009 14:47:06 +0100
Sam Mason s...@samason.me.uk wrote:

 On Sat, Sep 26, 2009 at 03:00:41PM +0700, Ricky Tompu Breaky wrote:
  You've solved several of my problems. But now I'm stuck on another
  problem which I know is easy for anybody else but difficult for a
  postgres starter like me.
 
 A general observation; you seem to be making things awfully
 complicated initially.  It may be worth going with the defaults of
 assuming that PG usernames and Unix usernames are the same
 initially.  Once you've got a bit more experience then it may not
 seem so complicated.  The main reason I'm saying this is that I
 *very* rarely have to fiddle with these sorts of things and yet they
 always do the right thing when I try.
 
 Maybe there are other things going on that are getting in the way of
 solving the real issue that will go away with more experience.
 
  I did these steps:
  
  1. I created a new opensuse11.1-linux login account + its password
  (username: ivia) with YaST2;
 
 PG users and Unix users/accounts are unrelated so this step is
 redundant (think of large sites with thousands of database users).
 If you've always got Unix user accounts why not just use the ident
 auth as default?
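 A sketch of what that could look like in pg_hba.conf (8.3 spells the
 local ident match as `ident sameuser`; from 8.4 it is plain `ident`,
 and 9.1 renames the local case to `peer`):

```
# TYPE  DATABASE  USER  METHOD
local   all       all   ident sameuser
# TCP connections can keep password auth:
host    all       all   127.0.0.1/32   md5
```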
 
  2. i...@sussy:~ su postgres -c psql postgres
  3. postgres'# ALTER USER postgres WITH PASSWORD 'mypassword';
  ALTER ROLE
  postgres=# ALTER USER ivia WITH PASSWORD 'mypassword';
  postgres'# 
  4. sussy:/etc # cat /var/lib/pgsql/data/pg_hba.conf
  host all all 0.0.0.0/0 md5
 
 Were these setting in place when you ran the initial connection with
 the postgres account to change people's passwords?  If so, I can't
 see how it would work.  Maybe you are changing the wrong config
 file.
 
 Try putting something invalid into the config file and check to see if
 it starts up or gives an error.
 
  7. sussy:/etc # psql -h 127.0.0.1 -U ivia -W
  Password for user ivia: 
  psql: FATAL:  Passwort-Authentifizierung für Benutzer »ivia«
  fehlgeschlagen (my translation: Password-Authentication for user
  »ivia« failed)
 
 It's obviously using password auth, are you sure you got the passwords
 right?  You haven't got a backslash in the password have you? it'll
 need to be escaped in the initial creation bit if you do.  A simple
 ASCII password such as 'test' may be a good place to start.
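 For example, to rule out quoting problems (with 8.3's default
 `standard_conforming_strings = off`, a backslash in a '...' literal
 acts as an escape):

```sql
-- Start from a plain ASCII password:
ALTER USER ivia WITH PASSWORD 'test';
-- A single quote in the password must be doubled inside the literal:
ALTER USER ivia WITH PASSWORD 'pa''ss';  -- the password is: pa'ss
```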
 
  Why can I not login with 'iVia' to my postgresql? Is it because I
  use 'md5()' but not blowfish as I remember OpenSuSE11.1 use
  'blowfish()' as its default password encryption. But AFAIK, there's
  nothing to do with the RDBMS Encryption (PostgreSQL in my case) and
  the Host OS password encryption method.
 
 Yup, as far as I can tell this shouldn't have any effect.  Then again,
 I don't use SuSE and it's possible (though very unlikely) that they
 changed this.
 
 
   Sam
 

