[GENERAL] left outer join without rows from left table

2009-02-16 Thread Scara Maccai
Hi all,

I want to get data from these tables:
TABID
integer id,
name varchar
example values:
1  'id1'
2  'id2'
[...]
TABA
integer id,
timestamp t,
integer a
example values:
1 '2009-02-13 00:00:00' 10
1 '2009-02-13 02:00:00' 19
TABB
integer id,
timestamp t,
integer b
example values:
1 '2009-02-13 00:00:00' 90
1 '2009-02-13 01:00:00'  109
in this form:
select * from TABID left outer join TABA on (id) left outer join TABB on 
TABB.id = TABID.id and TABA.t = TABB.t
So, basically, all the rows from table TABID joined with both table TABA and 
TABB. The problem is that some times TABB doesn't have the t values TABA has, 
and some other times TABA doesn't have the t values TABB has.
So I would like an output like:
id, TABA.t, a, TABB.t, b
1 '2009-02-13 00:00:00' 10'2009-02-13 00:00:00' 90
1 NULL NULL '2009-02-13 01:00:00'  109
1 '2009-02-13 02:00:00'  19 NULL NULL 
How can I do it? 


  Passa a Yahoo! Mail.

La webmail che ti offre GRATIS spazio illimitato, 
antispam e messenger integrato.
http://it.mail.yahoo.com/  


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] ask: select right(column) ???

2009-02-16 Thread hendra kusuma
Dear all,

I have simple question
I tried following code

 select right(column, number_of_character) from table

but it didn't work, saying that pg doesn't have the function
is there any way to achieve such output?

honestly I have no idea that such simple feature doesn't exist in postgresql
or am I wrong? since I look at SQL Key Words table and it's written as
reserved

Thank you
Regards
Hendra


Re: [GENERAL] ask: select right(column) ???

2009-02-16 Thread Ludwig Kniprath
Hello Hendra,
there is no function right(column, n-Chars), but you can use 
substring(column-name from offset for num_chars) in combination with 
char_length for getting the right-n-characters as f. e.:

select substring(column from (char_length(column) - 3) for 4) from table

Ludwig



Dear all,

I have simple question
I tried following code

 select right(column, number_of_character) from table

but it didn't work, saying that pg doesn't have the function
is there any way to achieve such output?

honestly I have no idea that such simple feature doesn't exist in postgresql
or am I wrong? since I look at SQL Key Words table and it's written as
reserved

Thank you
Regards
Hendra

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] idle DB and resources

2009-02-16 Thread Ivan Sergio Borgonovo
We've one site already in production and another one that will take
its birth from the previous one.
Since editors have to prune and adapt content from the previous one
I'm going to duplicate the DB and put it online on the same box and
let editors do their job.

Later we may decide to move it on another box or leave it there
since once the editors job will be finished the DB will be much
smaller.

Meanwhile we will have 2 large DB, one of them being nearly idle.
Is the idle DB going to have any impact on performance?

thanks

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] left outer join without rows from left table

2009-02-16 Thread Sam Mason
On Mon, Feb 16, 2009 at 12:15:47AM -0800, Scara Maccai wrote:
 select * from TABID left outer join TABA on (id) left outer join TABB on 
 TABB.id = TABID.id and TABA.t = TABB.t
 So, basically, all the rows from table TABID joined with both table
 TABA and TABB. The problem is that some times TABB doesn't have the
 t values TABA has, and some other times TABA doesn't have the t
 values TABB has.

I think you want to use a full outer join with slightly unusual
bracketing:

  SELECT t.id, COALESCE(a.t,b.t) AS t, a.a, b.b
  FROM tabid t LEFT JOIN (
  taba a FULL OUTER JOIN tabb b ON (a.id,a.t) = (b.id,b.t))
ON t.id = COALESCE(a.id,b.id);

This will start by doing the inner most thing first, join taba to tabb
matching rows where the id and timestamp is the same, then go on to
match these to tabid.  The COALESCEs are needed because when either side
is missing their values will be NULL and hence we need to look at both.


-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ask: select right(column) ???

2009-02-16 Thread Sam Mason
On Mon, Feb 16, 2009 at 03:21:20PM +0700, hendra kusuma wrote:
  select right(column, number_of_character) from table
[..]
 honestly I have no idea that such simple feature doesn't exist in postgresql
 or am I wrong? since I look at SQL Key Words table and it's written as
 reserved

AFAIK, it's reserved because right is used in outer join syntax; i.e.
table RIGHT OUTER JOIN table, and not because of the function name.

If you want this in Postgres, you could always do:

  CREATE FUNCTION right(TEXT,INTEGER) RETURNS TEXT
LANGUAGE sql IMMUTABLE STRICT AS $$
  SELECT substring($1 FROM char_length($1)-$2+1) $$;

Using right to extract the right most characters from a string sounds
very much like BASIC to me but may have moved to other languages now.
Humm, lets have a look; BASIC has Right$(), Pascal has RightStr(), MySql
has Right(), MS Sql Server has Right().  The other languages I chose to
look at (C, C++, Java, PHP, Lua, Ruby, Javascript) all naively expose a
substring function, like PG and Oracle, and leave anything like Right()
up to the user.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ask: select right(column) ???

2009-02-16 Thread Lennin Caro
 
 I have simple question
 I tried following code
 
  select right(column, number_of_character) from table
 
 but it didn't work, saying that pg doesn't have the
 function
 is there any way to achieve such output?
 
 honestly I have no idea that such simple feature
 doesn't exist in postgresql
 or am I wrong? since I look at SQL Key Words table and
 it's written as
 reserved
 
 Thank you
 Regards
 Hendra

you are right but you can use the substring function, like this 

select 'test123',substring('test123' from '...$')

this return '123'





  

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] idle DB and resources

2009-02-16 Thread Craig Ringer
Ivan Sergio Borgonovo wrote:

 Meanwhile we will have 2 large DB, one of them being nearly idle.
 Is the idle DB going to have any impact on performance?

I'm far from an expert here, but AFAIK it shouldn't have much impact.
That does depend, though, on just how idle it really is, and how much of
it is accessed at a time when it is used.

If you have queries that involve scans of large proportions of the
tables, or require the reading of big indexes, then even infrequent use
of the database will probably cause performance blips in the main
database due to contention for disk I/O, and will probably push data
from the main DB out of cache too, forcing it to be re-read from disk later.

On the other hand, if your queries don't read huge parts of the DB in at
once and you have small, efficient indexes, I imagine you won't notice
it at all.

You might be able to reduce the impact by putting the idle DB (or if
space is a concern, at least its indexes) on a different disk or disk
array. It won't help with contention for cache memory though, and I
don't know if there's anything much to be done about that.

--
Craig Ringer

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] left outer join without rows from left table

2009-02-16 Thread Scara Maccai
Thank you: that's exactly what I needed.



 I think you want to use a full outer join with slightly unusual
 bracketing:
 
   SELECT t.id, COALESCE(a.t,b.t) AS t, a.a, b.b
   FROM tabid t LEFT JOIN (
   taba a FULL OUTER JOIN tabb b ON (a.id,a.t) = (b.id,b.t))
 ON t.id = COALESCE(a.id,b.id);
 
 This will start by doing the inner most thing first, join taba to tabb
 matching rows where the id and timestamp is the same, then go on to
 match these to tabid.  The COALESCEs are needed because when either side
 is missing their values will be NULL and hence we need to look at both.


  Passa a Yahoo! Mail.

La webmail che ti offre GRATIS spazio illimitato, 
antispam e messenger integrato.
http://it.mail.yahoo.com/  


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] clearing the buffer cache

2009-02-16 Thread Chris Mayfield

With recent versions of Linux you can flush the system's buffer cache by
doing:

  # echo 3  /proc/sys/vm/drop_caches


You can also try something like this:
http://wiki.services.openoffice.org/wiki/Cold-start-simulator

Shutting down the server and running fillmem has worked for me, but that 
might take too long in your situation of thousands of small queries. But 
I guess SELECT * FROM large_table would have taken just as long anyway.


--Chris

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ask: select right(column) ???

2009-02-16 Thread Sam Mason
On Mon, Feb 16, 2009 at 07:10:11AM -0800, Lennin Caro wrote:
 you can use the substring function, like this 
 
 select 'test123',substring('test123' from '...$')
 
 this return '123'

Note that regexps are slower than substrings; as an example, I did:

  SELECT COUNT(s) FROM (
SELECT 'test'::text AS s
FROM generate_series(1,10) n
OFFSET 0) x;

We have to put the OFFSET 0 in to force evaluation otherwise PG is
smart enough to optimize code away and invalidate the test.  I replaced
the COUNT(s) with various exressions to see how it performed:

  test  expression
  a COUNT(s)
  b COUNT(substr(s))
  c COUNT(substr(s,char_length(s)-3+1))
  d COUNT(substring(s from '...$'))

Over several iterations:

  test  mean  stddev
  a 72.21.09
  b109.90.75
  c140.21.19
  d569.2   59.46

Not sure why I'm getting so much variance on the last run, strange.
Anyway...  Also note that because PG is nice about expanding SQL
functions, test c is the same as calling the right() function I
defined earlier.  I got a mean of 146.2 and a standard deviation of 9.04
so they're basically the same.

The basic string functions (substr and char_length) take about 0.4
microseconds to execute on my computer, and the regex function about 10
times as long at just under 0.5 microseconds.

The useful result being that substring(s from pattern) is easy to use,
and for small numbers of rows (i.e. less than a few thousand) you're
not going to notice much difference in performance.  It's only when you
start dealing with a hundred thousand or so rows the difference is going
to be really noticeable.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] clearing the buffer cache

2009-02-16 Thread Sam Mason
On Mon, Feb 16, 2009 at 11:09:59AM -0500, Chris Mayfield wrote:
 With recent versions of Linux you can flush the system's buffer cache by
 doing:
 
   # echo 3  /proc/sys/vm/drop_caches
 
 You can also try something like this:
 http://wiki.services.openoffice.org/wiki/Cold-start-simulator
 
 Shutting down the server and running fillmem has worked for me, but that 
 might take too long in your situation of thousands of small queries.

Why is this better than asking the kernel to drop its caches?  The
kernel could reasonably assume that because you only touched the data in
a page once it could swap the fillmems data out in preference to the
page cache.

Even if it did, you need to restart the server which is what prevents
testing going quickly in the first place.

 But I guess SELECT * FROM large_table would have taken just as long
 anyway.

This doesn't work on several levels any more; PG is smart enough to
recognize that this is what's happening and only uses a small buffer for
this (256KB at the moment).  Linux can be smart enough as well which is
why you need to ask it to drop the page cache.


-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partial unique index and the planner

2009-02-16 Thread Michal Politowski
On Sun, 15 Feb 2009 13:41:05 -0500, Tom Lane t...@sss.pgh.pa.us wrote:
  Michal Politowski mpol...@meep.pl writes:
  Is it normal that plans using a scan on a partial unique index
  estimate that much more than one row is returned?
 
  There isn't currently any special logic to recognize that case;
  the estimate is just whatever is going to come out of the normal
  statistics-based estimation.

Too bad. It seems then that the schema is not well suited to what Postgres
would like. Maybe changing it will be the right thing to do.

The situation is that there are potentially several versions of a row,
only one of which is active at any given moment. The partial unique index
lets a query find the active row quickly, but since it is not known to the
planner that there is only one such row, a join caused problems.

So it looks like, at least for the current problem, separating the active
and inactive rows in their own tables would help.

On the other hand it seems that the table finally grew big enough for
Postgres to prefer index scans to heap scans even with the skewed
estimates.

  I'm unsure how hard it'd be to improve the situation.  If we've already
  identified relevant partial indexes before any of the stats code has to
  run then it'd be pretty easy, but that might be a bit fragile.

  Anyway, the usual advice for such cases is to see if raising the
  statistics target helps.

Unfortunately it's already 1000. And it seems to me it woud not help here
anyway. The estimate is very correct for the number of rows with the most
common identifier, only it cannot take the active status into consideration.

-- 
Michał Politowski
Talking has been known to lead to communication if practiced carelessly.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partial unique index and the planner

2009-02-16 Thread Alban Hertroys


On Feb 16, 2009, at 7:18 PM, Michal Politowski wrote:

On Sun, 15 Feb 2009 13:41:05 -0500, Tom Lane t...@sss.pgh.pa.us  
wrote:

Michal Politowski mpol...@meep.pl writes:

Is it normal that plans using a scan on a partial unique index
estimate that much more than one row is returned?


There isn't currently any special logic to recognize that case;
the estimate is just whatever is going to come out of the normal
statistics-based estimation.


Too bad. It seems then that the schema is not well suited to what  
Postgres

would like. Maybe changing it will be the right thing to do.

The situation is that there are potentially several versions of a row,
only one of which is active at any given moment. The partial unique  
index
lets a query find the active row quickly, but since it is not known  
to the

planner that there is only one such row, a join caused problems.

So it looks like, at least for the current problem, separating the  
active

and inactive rows in their own tables would help.


I don't know your exact situation, but you could define a foreign key  
from some other table to the 'active' rows in your table. That key  
would then need to be updated by a few triggers (on insert, update   
delete) on your table. Put an index on the row that's referencing the  
foreign key and you get the same index content-wise that your partial  
index was covering, except that all the rows in the referencing table  
(and thus the index) are known to have only one matching row in your  
table. This way you shouldn't have your earlier problem with the  
estimates.


Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4999b115747031962913450!



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Which SQL is the best for servers?

2009-02-16 Thread Troels Arvin
joel garry wrote:

[...]
 Check out Oracle XE
[...]

As far as I know, Oracle has never issued any patches for Oracle XE. 
Given the stream of patches for the regular Oracle database, I fear 
that an Oracle XE installation will have a number of known bugs - 
possibly security bugs.

Or?

(Note: Follow-up set to comp.databases and comp.databases.oracle.server.)

-- 
Troels


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] clearing the buffer cache

2009-02-16 Thread Greg Smith

On Mon, 16 Feb 2009, Sam Mason wrote:


On Mon, Feb 16, 2009 at 11:09:59AM -0500, Chris Mayfield wrote:

You can also try something like this:
http://wiki.services.openoffice.org/wiki/Cold-start-simulator
might take too long in your situation of thousands of small queries.


Why is this better than asking the kernel to drop its caches?


fillmem/flushdisk also work with kernels before 2.6.16, which means that 
it's not avaialble on still common platforms (RHEL4 for example).  If 
you've got drop_caches, it's the better approach, that pages gives an 
answer if you don't too.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] audit table

2009-02-16 Thread Robert Treat
On Thursday 12 February 2009 22:13:05 Craig Ringer wrote:
 Sim Zacks wrote:
  I want a trigger on every table that inserts the old row into an audit
  table (for updates and deletes). If the audit table was per table, then
  I could easily have a field of type that table and insert old into it.
 
  Is there any way that I could accomplish this functionality with any
  other type, so I could input any record into it?

 You want a single audit table that looks like this:

 CREATE TABLE audit (
id SERIAL PRIMARY KEY,
table_changed regclass,
changed_by VARCHAR,
changed_when TIMESTAMP WITH TIME ZONE,
oldrow ANY_ROW_TYPE
 );

 ie you want a field that can dynamically contain anything?

 AFAIK that's not possible unless you want to store a textual
 representation of the row. I'm not sure of an easy way to do it even
 then, and of course you can't read it out again as a real row.

 What you might want to look at doing is using table inheritance. Your
 master audit table looks like this:

 CREATE TABLE audit (
id SERIAL PRIMARY KEY,
table_changed regclass,
changed_by VARCHAR,
changed_when TIMESTAMP WITH TIME ZONE,
 );

 and then you have child audit tables for each audited table, each of
 which looks like this:

 CREATE TABLE audit_tablename (
 old_row tablename;
 ) INHERITS audit;


http://pgfoundry.org/projects/tablelog/

-- 
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Update table with random values from another table

2009-02-16 Thread Brent Wood
I'm not sure if that query will do what you want, but to make it work, one 
thing you might try, is to pre calculate the random values for each record, 
then order by those, eg: 

select trip_code, random() as rand from obs order by rand;

works for me, so the following might for you:

:
  UPDATE
users   
SET 
t_firstname = x.firstname,
t_surname   = x.lastname,
t_username  = x.username,
t_contact   = x.email  
FROM
(select firstname, lastname, username, email, random() as rand
 from testnames order by rand) 
WHERE
t_firstname  x.firstname;


Cheers,

  Brent Wood


Brent Wood
DBA/GIS consultant
NIWA, Wellington
New Zealand
 Rory Campbell-Lange r...@campbell-lange.net 02/17/09 4:33 PM 
I have a test system for which I need to replace actual user's data (in
'users') with anonymised data from another table ('testnames') on
postgres 8.3. 

The tricky aspect is that each row from testnames must be randomised to
avoid each row in users being updated with the same value.

I've been trying to avoid a correlated subquery for each column I am trying
to update, and instead trying the following, which never returns. There are
2000 records in testnames and about 200 in users. 

   UPDATE
users   
SET 
t_firstname = x.firstname,
t_surname   = x.lastname,
t_username  = x.username,
t_contact   = x.email   
FROM
(select firstname, lastname, username, email
 from testnames order by random()) x
WHERE
t_firstname  x.firstname;


Any help much appreciated
Rory 



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

NIWA is the trading name of the National Institute of Water  Atmospheric 
Research Ltd.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] transfering tables into other schema

2009-02-16 Thread Ivan Sergio Borgonovo
I've around 150-200 tables in the same schema.
Some of them have pk/fk relationships and are referenced into
functions (~20).

One of them is surely referenced by most of those 20 and it is the
largest.

I'd like to move all the 200 tables to a new schema and leave that
one in the public schema.

The DB size accordingly to select pg_size_pretty(pg_database_size is
1.5Gb

I can afford 1 day down (WE).

I was wondering what would be the cheapest and safest way to do
the above without making a mess of the relationships, constraints,
functions considering... I can afford one day down.

Of course running a well crafted set of SQL statement may do...
But sed and plain text backup may do as well.

Any other options?
Any advices on how to procede once an option is the clear winner?

thanks

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general