[GENERAL] Normalized Tables & SELECT [was: Find "smallest common year"]

2007-09-28 Thread Stefan Schwarzer
What would you recommend for, say, 500 global national statistical
variables, 500 regional and 500 subregional and 500 global
aggregations? The years covered range between 10 and 60 years for each
of these variables. All available for 240 countries/territories.


I generally approach such problems by putting the data right
(normalized) at the start, then munging the data into summary tables
to handle the problems you're seeing now.

I find it far easier to maintain normalized tables that produce
non-normalized ones (for things like data warehousing) than it is to
maintain non-normalized tables and try to produce normalized data
from that.


Ok, I do understand that.

So, instead of the earlier mentioned database design, I would have  
something like this:


   - one table for the country names/ids/etc. (Afghanistan, 1; Albania, 2)
   - one table for the variable names/ids/etc. (GDP, 1; Population, 2; Fish Catch, 3; )
   - one table for the years names/ids/etc. (1970, 1; 1971, 2; 1973, 3; )

and
   - one table for all "statistical data" with four fields -
     id_variable, id_country, id_year, and the actual value
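
A minimal SQL sketch of this design (all table and column names are
illustrative; following a suggestion made elsewhere in this thread, the
year is stored directly rather than through an id_year lookup):

```sql
-- Lookup tables (illustrative names)
CREATE TABLE country (
    id   integer PRIMARY KEY,
    name text NOT NULL            -- e.g. 'Afghanistan'
);

CREATE TABLE variable (
    id   integer PRIMARY KEY,
    name text NOT NULL            -- e.g. 'GDP', 'Population'
);

-- Fact table: one row per (variable, country, year)
CREATE TABLE statistical_data (
    id_variable integer NOT NULL REFERENCES variable(id),
    id_country  integer NOT NULL REFERENCES country(id),
    year        integer NOT NULL CHECK (year > 1900),
    value       numeric,
    PRIMARY KEY (id_variable, id_country, year)
);
```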


You say


I find it far easier to maintain normalized tables that produce
non-normalized ones (for things like data warehousing) than it is to
maintain non-normalized tables and try to produce normalized data
from that.


It now seems to me more difficult to produce non-normalized output
based on the normalized tables. What would a query look like if I now
need to SELECT, say, 100 countries and 20 years? Something like this
(simplified and without joins):


SELECT
    value,
    id_year
FROM
    main_table
WHERE
    year = '1970' OR
    year = '1971' OR
    ...
    country_name = 'Afghanistan' OR
    country_name = 'Albania' OR
    ...

Actually, last time we came up with SubSelects for each year. So,  
does this make sense?


Thanks a lot for your help!

Stef

 

  Stefan Schwarzer

  Lean Back and Relax - Enjoy some Nature Photography:
  http://photoblog.la-famille-schwarzer.de

  Appetite for Global Data? UNEP GEO Data Portal:
  http://geodata.grid.unep.ch
  




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


[GENERAL] Re: Why the ERROR: duplicate key violates unique constraint "master_pkey" is raised? - Is this a Bug?

2007-09-28 Thread Nis Jørgensen
Alban Hertroys wrote:
> Nis Jørgensen wrote:
>> If you can't wait, you are probably better off working around the
>> problem. Standard solution is to do:
>>
>> UPDATE  master SET m2 = -m2;
>> UPDATE  master SET m2 = -m2+1;
>>
>> or something similar.
> 
> Would something like
> 
> UPDATE master set m2 = master2.m2
>   FROM (
>   SELECT m2 +1
> FROM master m
>WHERE m.master_id = master.master_id
>ORDER BY m2 DESC
>  ) master2
> 
> work? I think it might be faster (and possibly cause less index bloat)
> than doing two consequent updates.


I don't understand your query. I don't think you can use a correlated
subquery in that way.

Anyway, tricks like these might work. They might stop working without
warning, if the plan changes. Relying on unspecified behavior is a
recipe for trouble.

Nis





Re: [GENERAL] Normalized Tables & SELECT [was: Find "smallest common year"]

2007-09-28 Thread Nis Jørgensen
A few more comments on your table design.

Stefan Schwarzer wrote:

> So, instead of the earlier mentioned database design, I would have
> something like this:
> 
>- one table for the country names/ids/etc. (Afghanistan, 1; Albania,
> 2)

There is a well-established natural key for countries - ISO 3166. Use
that instead of "id".
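
Using the ISO 3166 code as the natural primary key might look like this
(a sketch; column names are illustrative):

```sql
CREATE TABLE country (
    iso_code char(2) PRIMARY KEY,  -- ISO 3166-1 alpha-2, e.g. 'AF'
    name     text NOT NULL         -- e.g. 'Afghanistan'
);
```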

>- one table for the variable names/ids/etc. (GDP, 1; Population, 2;
> Fish Catch, 3;)
>- one table for the years names/ids/etc. (1970, 1; 1971, 2; 1973, 3;
> )

No need for an ID column on this one. Just use the year as the primary key.



Nis




[GENERAL] Preventing selection of data during transaction

2007-09-28 Thread Henrik

Hello List,

I wonder if it is possible to make data selected in one query NOT  
accessible for a SELECT in another query?


The thing is that I have a cleaning script that selects some rows  
from a table and processes them.
During that processing I don't want these rows to end up in another  
result set. But I also don't want the other result set to wait until  
my cleaning transaction is done, I just want it to ignore it.


Is this possible in the database level or do I need to take care of  
it on the  application level?


Regards,
Henke




Re: [GENERAL] Debian problem...

2007-09-28 Thread Tom Allison


On Sep 12, 2007, at 3:52 AM, Tino Wildenhain wrote:



Hi,

Tom Allison schrieb:

On Sep 11, 2007, at 5:49 AM, Tom Allison wrote:


I was able get my database working again.
Never figured out why...
My database data (sorry about the redundancy there) is sitting on  
a RAID1 array with LVM and ReiserFS.
I've heard some dissension about the use of ReiserFS and was
wondering:


No problem with ReiserFS if your hardware does not fail. But if...
chances are you have to use your backup.

Alternatively you could use XFS but I'm not sure if it performs
better on failing hardware. I guess not.

Regards
Tino


Well, when one is looking at hardware failure, the performance  
doesn't really matter that much.

Is EXT3 better for dealing with a RAID1 failure?
I know reiserfs does better performance wise, but there's no point in  
going fast if you can't steer.




Re: [GENERAL] access privileges: grant select on (all current and future tables)?

2007-09-28 Thread John Smith
On 9/27/07, John Smith <[EMAIL PROTECTED]> wrote:
> On 9/27/07, John Smith <[EMAIL PROTECTED]> wrote:
> > how'd i "grant select on (all current and future tables inside a
> > private schema) to username" without turning that user into superuser?
> > "grant usage on..." doesn't do it.
> >
> > or do i, every time i batch/auto create the tables, do a "grant select
> > on (new table) to username"?
>
> also how'd i find access privileges for a schema. something like "\z
> schemaname" not "\dp schemaname."?

ok, let me ask this another way:
when i "drop user username", which system tables does it consult before replying:
ERROR: role "username" cannot be dropped...
DETAIL: access to schema schemaname

where is this "access to schema..." info stored?



Re: [GENERAL] question about pg_dump -a

2007-09-28 Thread Alvaro Herrera
Ottavio Campana wrote:
> Richard Huxton wrote:
> > Ottavio Campana wrote:
> >>
> >> Is there a way to export tables in order, so that dependencies are
> >> always met? reading the manpage of pg_dump I found the -Fc flag, but I
> >> haven't understood if it is good for me and how it works. Or is there a
> >> way to relax constraints while loading data?
> > 
> > Try a pg_dump with -Fc and then pg_restore --data-only.
> > 
> > If all else fails, you can control item-by-item what gets restored by
> > producing a list from pg_restore (--list), commenting out lines and then
> > using it as a specification with (--use-list). See manuals for full
> > details.
> 
> with -L I was able to solve it, thanks.
> 
> But why doesn't pg_dump already export data such that earlier
> tables do not depend on later ones?

It does -- but it can't with data-only dumps.  (I think that it just
punts and doesn't care.  Maybe we could improve it to make a "best
effort".)  The current suggested usage is to avoid data-only dumps.

-- 
Alvaro Herrera                               http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [GENERAL] How to avoid: 9.50184e+06

2007-09-28 Thread Martijn van Oosterhout
On Fri, Sep 28, 2007 at 02:08:18PM +0200, Stefan Schwarzer wrote:
> how can I avoid results like this: 9.50184e+06
> 
> Instead it should return the "real" value, as 950184.

Presumably to_char() would do what you want. Alternatively, perhaps you
intended your column to be of type numeric?
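
As a sketch (the column name is assumed; to_char() is PostgreSQL's
formatting function, and the FM prefix suppresses leading padding):

```sql
-- format a float column as a plain integer string
SELECT to_char(value, 'FM9999999999') FROM main_table;

-- or, for a literal:
SELECT to_char(9501840::real, 'FM9999999999');
```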

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [GENERAL] How to avoid: 9.50184e+06

2007-09-28 Thread Scott Marlowe
On 9/28/07, Stefan Schwarzer <[EMAIL PROTECTED]> wrote:
> Hi there,
>
> how can I avoid results like this: 9.50184e+06
>
> Instead it should return the "real" value, as 950184.
>
> Thanks for any hints!

Cast it to numeric:

select 9.50184e+06::real::numeric;

But know that you're losing accuracy if you're working with real / float types.

If you need all the parts of the number kept around, use numeric from the start.



[GENERAL] Request for feature: pg_dump schema masquerade flag

2007-09-28 Thread Owen Hartnett


I don't think this would be too hard to effect:

When pg_dumping a schema, have an additional flag -m , 
that would convert all references in the dump from the original 
schema to the new schema name.


Thus the command:

pg_dump -c -s myoldschemaname -m mynewschemaname mydatabase -f foo

would generate a dump file such that all the data that was in 
myoldschemaname would, upon psql mydatabase < foo would be in a new 
(or replaced) schema mynewschemaname.


The present strategy is to go through the dump and manually change 
the schema names in the dump text.  This is not a reliable mechanism, 
as there may be name collisions with the schema name and other names, 
and there's always the possibility that you might miss one when 
you're hand modifying the code.


I'd be happy to help on the effort if that makes sense.  I don't know 
what the code to pg_dump is like, though.


-Owen



Re: [GENERAL] How to avoid: 9.50184e+06

2007-09-28 Thread Ted Byers

--- Stefan Schwarzer <[EMAIL PROTECTED]>
wrote:

> Hi there,
> 
> how can I avoid results like this: 9.50184e+06
> 
> Instead it should return the "real" value, as
> 950184.
> 
But 9.50184e+06 IS the real value!  That is about nine
and a half million, not nine hundred and fifty
thousand, BTW.  I do not see why you would want the
database back end to divide the real number by ten and
then display the result as an integer.

I have not checked the behaviour of the functions
provided by PostgreSQL to convert numbers to strings,
but I would be surprised if a function suitable for
use in serializing a floating point number failed to
show 9.50184e+06 as "9.50184e+06"; to have it
automagically convert it to an integer would be
counter-intuitive to me.  Really, how a number is
displayed is properly in the domain of the client
application.  If it is written in C, then you have
functions like printf, with all their glorious format
specifiers, to give you exactly what you want and
expect.  And you have similar control with IO streams
in C++ and Java.  ALL real programming languages
provide support for producing formatted output, and
give you absolute control over the format used.

Whether you are using a thin client or a thick client,
manipulating how floating point numbers are displayed
really belongs in the interface layer of the client.

If you want your numbers displayed as integers, then
you used the wrong type (as others have also
suggested).  If the data really requires use of
floating point numbers, then use the libraries
provided by whatever language you're using to develop
your client to produce the format you want.

HTH

Ted



[GENERAL] problemas zona horaria

2007-09-28 Thread Fernando De Pedro
Dear all... I have a problem with the time zone (GMT-06:00) Guadalajara,
Ciudad de México, Monterrey - Nuevo.

When I connect from a Java application to a Postgres server (everything on
the same machine), the times get shifted: inserted records come out 5 hours
ahead. I imagine this is because this timezone is not defined in Postgres;
when I run SHOW timezone it says GMT. If instead I use (GMT-06:00)
Guadalajara, Ciudad de México, Monterrey - Antiguo it reports
"America/Mexico_City" and works fine. Does anybody know how to solve this?

Thanks in advance.

Regards, Fernando


[GENERAL] How to avoid: 9.50184e+06

2007-09-28 Thread Stefan Schwarzer

Hi there,

how can I avoid results like this: 9.50184e+06

Instead it should return the "real" value, as 950184.

Thanks for any hints!

Stef

 

  Lean Back and Relax - Enjoy some Nature Photography:
  http://photoblog.la-famille-schwarzer.de

  Appetite for Global Data? UNEP GEO Data Portal:
  http://geodata.grid.unep.ch
  







[GENERAL] Triggers & inheritance

2007-09-28 Thread Scott Ribe
Triggers have never been inherited, right? Not in any version?

I'm pretty sure that's the case, but I'm debugging some old logging and just
need to confirm it.

-- 
Scott Ribe
[EMAIL PROTECTED]
http://www.killerbytes.com/
(303) 722-0567 voice





Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread Andrew Sullivan
On Fri, Sep 28, 2007 at 12:50:34PM -0400, Alvaro Herrera wrote:
> 
> But he does have a WHERE condition.  The problem is probably that the
> condition is not selective enough, so the planner chooses to do a
> seqscan.

Or else the planner has a bad idea of how selective the condition is. 
I've found that this sort of inconsistent selectivity on an indexed
column can be because the column statistics aren't good enough.  You
can try ALTER TABLE...SET STATISTICS to see if it helps.
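
For example (table and column names taken from the message being
answered; 100 is just an illustrative target, the default in 8.1 being 10):

```sql
-- collect more detailed statistics for the indexed timestamp column
ALTER TABLE mytable ALTER COLUMN evtime SET STATISTICS 100;
ANALYZE mytable;
```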

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. 
--Dennis Ritchie



Re: [GENERAL] access privileges: grant select on (all current and future tables)?

2007-09-28 Thread Alvaro Herrera
John Smith wrote:
> On 9/27/07, John Smith <[EMAIL PROTECTED]> wrote:
> > On 9/27/07, John Smith <[EMAIL PROTECTED]> wrote:
> > > how'd i "grant select on (all current and future tables inside a
> > > private schema) to username" without turning that user into superuser?
> > > "grant usage on..." doesn't do it.
> > >
> > > or do i, every time i batch/auto create the tables, do a "grant select
> > > on (new table) to username"?
> >
> > also how'd i find access privileges for a schema. something like "\z
> > schemaname" not "\dp schemaname."?
> 
> ok, let me ask this another way:
> when i "drop user username", which system tables does it consult before replying:
> ERROR: role "username" cannot be dropped...
> DETAIL: access to schema schemaname
> 
> where is this "access to schema..." info stored?

pg_shdepend
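
For example, to see what a role still owns or holds privileges on (a
sketch; 'username' is the role from the question):

```sql
SELECT * FROM pg_shdepend
 WHERE refobjid = (SELECT oid FROM pg_catalog.pg_roles
                    WHERE rolname = 'username');
```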


-- 
Alvaro Herrera                               http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[GENERAL] row->ARRAY or row->table casting?

2007-09-28 Thread Nico Sabbi

Hi,
is there any way to cast a generic row to an array or to a table type?

The example is trivial, but it explains what I'm trying to do:

nb1=# select * from tab1;
a | t
---+---
1 | a
2 | b
3 | c
(3 rows)


nb1=# select r from (select row(tab1.*) as r from tab1)x;
  r
---
(1,a)
(2,b)
(3,c)
(3 rows)

nb1=# select r[1] from (select row(tab1.*) as r from tab1)x;
ERROR:  cannot subscript type record because it is not an array

nb1=# select (r::tab1).a from (select row(tab1.*) as r from tab1)x;
ERROR:  cannot cast type record to tab1

The real use case is much more complex than this example of course :)



Re: [GENERAL] question about pg_dump -a

2007-09-28 Thread Ottavio Campana
Richard Huxton wrote:
> Ottavio Campana wrote:
>>
>> Is there a way to export tables in order, so that dependencies are
>> always met? reading the manpage of pg_dump I found the -Fc flag, but I
>> haven't understood if it is good for me and how it works. Or is there a
>> way to relax constraints while loading data?
> 
> Try a pg_dump with -Fc and then pg_restore --data-only.
> 
> If all else fails, you can control item-by-item what gets restored by
> producing a list from pg_restore (--list), commenting out lines and then
> using it as a specification with (--use-list). See manuals for full
> details.

with -L I was able to solve it, thanks.

But why doesn't pg_dump already export data such that earlier
tables do not depend on later ones?





Re: [GENERAL] row->ARRAY or row->table casting?

2007-09-28 Thread Nico Sabbi

Tom Lane wrote:

Nico Sabbi <[EMAIL PROTECTED]> writes:
  

is there any way to cast a generic row to an array or to a table type?



"row(...)::composite_type" should work in 8.2 and up.

regards, tom lane


I'm using 8.2.4.

What's the right syntax for a case like this?

nb1=# select r.a from (select row(tab1.*)::tab1 as r from tab1)x;
ERROR:  missing FROM-clause entry for table "r"
LINE 1: select r.a from (select row(tab1.*)::tab1 as r from tab1)x;
  ^

I tried many variations (including casting x as tab1) obtaining only 
syntax errors.
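
An aside not from the original thread: if I remember the syntax
correctly, field selection from a composite value needs parentheses
around the value, so something along these lines may be what is missing:

```sql
-- parenthesize the composite value before selecting a field from it
SELECT (r).a FROM (SELECT row(tab1.*)::tab1 AS r FROM tab1) x;
```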


Thanks,
   Nico



Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread Alvaro Herrera
A. Kretschmer wrote:
> On Fri, 28.09.2007 at 11:56:46 -0400, Mike Charnoky wrote the following:
> > Hi,
> > 
> > I am still having problems performing a count(*) on a large table.  This
> > 
> > Now, certain count(*) queries are failing to complete for certain time
> > ranges (I killed the query after about 24 hours).  The table is indexed
> > on a timestamp field.  Here is one query that hangs:
> 
> Again: an index can't help! Because of MVCC, 'select count(*)' without
> a WHERE condition forces a sequential table scan.

But he does have a WHERE condition.  The problem is probably that the
condition is not selective enough, so the planner chooses to do a
seqscan.

-- 
Alvaro Herrera                               http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [GENERAL] DAGs and recursive queries

2007-09-28 Thread Jeff Davis
On Thu, 2007-09-27 at 23:58 +0100, Gregory Stark wrote:
> It keeps the same information in more than one place. Consider:
> 
> 1
> 1.1
> 1.1.1
> 
> Note that all three records contain the root's id of "1". If you want to
> reparent 1.1 to be 2.1 you have to know that all its children also need to be
> reparented as well.

Aren't there still some update anomalies with any schema representing a
DAG?

Regards,
Jeff Davis




Re: [GENERAL] Re: Why the ERROR: duplicate key violates unique constraint "master_pkey" is raised? - Is this a Bug?

2007-09-28 Thread Alban Hertroys
Nis Jørgensen wrote:
> Alban Hertroys wrote:
>> Would something like
>>
>> UPDATE master set m2 = master2.m2
>>   FROM (
>>  SELECT m2 +1
>>FROM master m
>>   WHERE m.master_id = master.master_id
>>   ORDER BY m2 DESC
>>  ) master2
>>
>> work? I think it might be faster (and possibly cause less index bloat)
>> than doing two consequent updates.
> 
> 
> I don't understand your query. I don't think you can use a correlated
> subquery in that way.

Hmm, indeed, it complains somewhat vaguely: "ERROR:  subquery in FROM may
not refer to other relations of same query level". I'm not sure why.

Effectively it orders the updates descending, so that the new value of
m2 can never be updated to an already existing value, because that has
been updated previously.

The WHERE condition makes the query look a bit more complex than it
actually is, but is necessary of course.

> Anyway, tricks like these might work. They might stop working without
> warning, if the plan changes. Relying on unspecified behavior is a
> recipe for trouble.

If I specifically ask for an ordering, I don't think the planner should
change or ignore that ordering. So I'm not relying on unspecified behaviour.

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //




Re: [GENERAL] Normalized Tables & SELECT [was: Find "smallest common year"]

2007-09-28 Thread Stefan Schwarzer
> Ah, but there is a standardised list of country-codes ideal for the
> DBA since the code is (usually) easily understandable.
>
>   http://en.wikipedia.org/wiki/ISO_3166
>
> The only problem might be if historical data uses old boundaries
> (e.g. Soviet Union, Yugoslavia).

Yep, have all of them (ISO-2, ISO-3, UN), but they don't cover all
countries + territories. Nevertheless, in most cases the list is
sufficient...


Thanks nevertheless for the link!

Stef





Re: [GENERAL] Debian problem...

2007-09-28 Thread Vivek Khera


On Sep 28, 2007, at 5:09 AM, Tom Allison wrote:

I know reiserfs does better performance wise, but there's no point  
in going fast if you can't steer.


I recently had to replace 16 Western Digital 10kRPM SATA drives with  
Hitachi 7.2kRPM drives because the WD drives kept randomly (and  
falsely) reporting failure to the RAID system.  The performance loss  
was noticeable, but the reliability was more important.




Re: [GENERAL] Normalized Tables & SELECT [was: Find "smallest common year"]

2007-09-28 Thread Alban Hertroys
Stefan Schwarzer wrote:
> Ok, I do understand that.
> 
> So, instead of the earlier mentioned database design, I would have
> something like this:
> 
>- one table for the country names/ids/etc. (Afghanistan, 1; Albania,
> 2)
>- one table for the variable names/ids/etc. (GDP, 1; Population, 2;
> Fish Catch, 3;)


>- one table for the years names/ids/etc. (1970, 1; 1971, 2; 1973, 3;
> )

If you _do_ need this table (because you want to constrain your
statistical data to only contain a specific set of years, or because you
need a quick list of available years to select from): make the year the
primary key and drop the artificial id.

Years are perfectly fine data to constrain on, and it saves you the
joins with that table (the foreign key constraint constrains your data
sufficiently).

You may want to constrain your years further by adding a check
constraint, e.g. CHECK (year > 1900).


The same argument goes for the country names, but I generally don't like
to have text data as a primary key.

Unlike years, their values might change (in which case an ON UPDATE
CASCADE FK constraint trigger would have to go through your data to
update all the references). Plus they take up a bit more space than
integers, although with country names that won't matter that much.

OTOH... If your country names contain non-ascii characters and the
database is configured to collate on those properly, it will be slower.

That aside (they're kind of minor points), the country name is also a
good candidate to become key (primary and foreign respectively).

> and
>- one table for all "statistical data" with four fields -
> id_variable, id_country, id_year, and the actual value

Some things may have changed here due to my above suggestions.

> It seems to me more difficult now to produce a non-normalized output
> based on the normalized table. How would look a query like, if I need
> now to SELECT, say 100 countries and 20 years? Something like this
> (simplified and without joins):
> 
> SELECT
> value,
> id_year
> FROM
> main_table
> WHERE
> year = '1970' OR
> year = '1971' OR
> 
> country_name = 'Afghanistan' OR
>  country_name = 'Albania' OR
> ...

I don't really see what the problem is; with the years as columns in
your table they're just in different places in this query.

Without knowing what problem you're trying to solve with this query it's
a bit of a guess. I assume those years and countries are user input?

I usually use BETWEEN or IN (...) for such cases, but that boils down to
the same query. It's just shorter to write.

BTW, you didn't actually use type text for your year column, did you? No
quotes needed then. Otherwise you'd have to make sure your year values
are all the same length, or sorting gets... interesting.

SELECT
value,
year
FROM
main_table
WHERE
year BETWEEN 1970 AND 1975
AND country_name IN ('Afghanistan', 'Albania', ...)


But I think you'll usually be looking for aggregates, so GROUP BY and
HAVING are probably more appropriate for your queries.

SELECT country_name, year, SUM(value)
  FROM main_table
 WHERE country_name IN ('Afghanistan', 'Albania', ...)
   AND year BETWEEN 1970 AND 1975
 GROUP BY country_name, year
 HAVING SUM(value) > 0;


-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //



Re: [GENERAL] Normalized Tables & SELECT [was: Find "smallest common year"]

2007-09-28 Thread Nis Jørgensen
Stefan Schwarzer wrote:
>>> What would you recommend for, say, 500 global national statistical
>>> variables, 500 regional and 500 subregional and 500 global
>>> aggregations? The years covered range between 10 and 60 years for
>>> each of these variables. All available for 240 countries/territories.
>>
>> I generally approach such problems by putting the data right
>> (normalized) at the start, then munging the data into summary tables
>> to handle the problems you're seeing now.
>>
>> I find it far easier to maintain normalized tables that produced
>> non-normalized ones (for things like data warehousing) than it is to
>> maintain non-normalized tables and trying to produce normalized data
>> from that.
> 
> Ok, I do understand that.
> 
> So, instead of the earlier mentioned database design, I would have
> something like this:
> 
>- one table for the country names/ids/etc. (Afghanistan, 1; Albania,
> 2)
>- one table for the variable names/ids/etc. (GDP, 1; Population, 2;
> Fish Catch, 3;)
>- one table for the years names/ids/etc. (1970, 1; 1971, 2; 1973, 3;
> )
> and
>- one table for all "statistical data" with four fields -
> id_variable, id_country, id_year, and the actual value

This is one possibility. Another is to have one table for each variable.
This has the benefit of not mixing different units/data types in the
same field. It does mean you cannot use the same (parameterized) query
for getting different measures.

Since it is easy to create views converting from one of these
presentations to the other, which one you choose is not that important.

Also, there is no obvious need to have a lookup table for years - just
store the year as an integer in your data table(s). If necessary, add a
constraint indicating which years are valid. You can produce rows from
missing years by left joining with generate_series(start_year, end_year)
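
A sketch of that left join (table, column, and id values are
illustrative):

```sql
-- one row per year in the range, with NULL value for missing years
SELECT y.year, d.value
  FROM generate_series(1970, 1975) AS y(year)
  LEFT JOIN statistical_data d
         ON d.year = y.year
        AND d.id_country = 1
        AND d.id_variable = 1;
```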

Even if you choose to store the valid years in a table, the id_year is
unnecessary - just use the year itself as the primary key.

More in another reply.

Nis




Re: [GENERAL] Re: Why the ERROR: duplicate key violates unique constraint "master_pkey" is raised? - Is this a Bug?

2007-09-28 Thread Alban Hertroys
Nis Jørgensen wrote:
> If you can't wait, you are probably better off working around the
> problem. Standard solution is to do:
> 
> UPDATE  master SET m2 = -m2;
> UPDATE  master SET m2 = -m2+1;
> 
> or something similar.

Would something like

UPDATE master set m2 = master2.m2
  FROM (
SELECT m2 +1
  FROM master m
 WHERE m.master_id = master.master_id
 ORDER BY m2 DESC
 ) master2

work? I think it might be faster (and possibly cause less index bloat)
than doing two consequent updates.

Haven't tested this though...

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //




Re: [GENERAL] question about pg_dump -a

2007-09-28 Thread Vivek Khera


On Sep 28, 2007, at 9:07 AM, Ottavio Campana wrote:


But why doesn't pg_dump already export data such that earlier
tables do not depend on later ones?


Because you can't always sort your tables that way.  The restore  
procedure is responsible for either sorting or disabling the FK  
checks during bulk load.  The latter is more efficient, especially if  
there are no indexes yet, as in a full restore from dump.




Re: [GENERAL] access privileges: grant select on (all current and future tables)?

2007-09-28 Thread John D. Burger

John Smith wrote:


and "grant usage on new tables in schema..." doesn't exist yet. which
leads to my next question (see
http://svr5.postgresql.org/pgsql-hackers/2005-01/msg01070.php)- are we
there yet?


If I understand your question, you cannot grant "anticipatory"  
privileges to tables which do not yet exist, which I discovered in  
this thread:


  http://archives.postgresql.org/pgsql-general/2007-02/msg00911.php

If you follow the thread, you'll find that one reply pointed to some  
existing functions for managing this stuff.  I found these useful.


- John D. Burger
  MITRE





Re: [GENERAL] question about pg_dump -a

2007-09-28 Thread Richard Huxton

Ottavio Campana wrote:


Is there a way to export tables in order, so that dependencies are
always met? reading the manpage of pg_dump I found the -Fc flag, but I
haven't understood if it is good for me and how it works. Or is there a
way to relax constraints while loading data?


Try a pg_dump with -Fc and then pg_restore --data-only.

If all else fails, you can control item-by-item what gets restored by 
producing a list from pg_restore (--list), commenting out lines and then 
using it as a specification with (--use-list). See manuals for full details.


--
  Richard Huxton
  Archonet Ltd



[GENERAL] more problems with count(*) on large table

2007-09-28 Thread Mike Charnoky
Hi,

I am still having problems performing a count(*) on a large table.  This
is a followup from a recent thread:

http://archives.postgresql.org/pgsql-general/2007-09/msg00561.php

Since the last time these problems happened, we have tweaked some
postgresql config parameters (fsm, etc).  I also recreated the large
table, with the assumption it was somehow corrupted.

Now, certain count(*) queries are failing to complete for certain time
ranges (I killed the query after about 24 hours).  The table is indexed
on a timestamp field.  Here is one query that hangs:

select count(*) from mytable where evtime between '2007-09-26' and
'2007-09-27';

However, this query runs successfully and takes 2 minutes:

select count(*) from mytable where evtime between '2007-09-25' and
'2007-09-26';

  count
--
 14150928
(1 row)

What is going on?  I analyzed the table before running the query and
have no reason to believe that the amount of data added to the table
varies much from day to day.  No data has been deleted from the table
yet, just added.

Here is some config info:

PostgreSQL 8.1.8 on Fedora 3

shared_buffers = 8000
temp_buffers = 1000
work_mem = 16384
maintenance_work_mem = 262144
max_fsm_pages = 50
max_fsm_relations = 3


Mike

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] How to avoid: 9.50184e+06

2007-09-28 Thread Merlin Moncure
On 9/28/07, Stefan Schwarzer <[EMAIL PROTECTED]> wrote:
> Hi there,
>
> how can I avoid results like this: 9.50184e+06
>
> Instead it should return the "real" value, as 950184.

The type 'real' in postgresql is, iirc, just an alias for
single-precision binary floating point (float4).  It stores an
approximation of the value, which is why large values get displayed in
scientific notation.

In your case, it looks like you should be using the integer or the numeric type.
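
A minimal illustration of the difference (the table and column names in
the ALTER are hypothetical; exact float display varies by version):

```sql
SELECT 9501840::real;     -- single-precision float; may display as 9.50184e+06
SELECT 9501840::numeric;  -- exact decimal: 9501840
SELECT 9501840::integer;  -- exact integer: 9501840

-- Converting an existing column to an exact type:
ALTER TABLE stats ALTER COLUMN value TYPE numeric;
```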

merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Normalized Tables & SELECT [was: Find "smallest common year"]

2007-09-28 Thread Richard Huxton

Alban Hertroys wrote:

If you _do_ need this table (because you want to constrain your
statistical data to only contain a specific set of years, or because you
need a quick list of available years to select from): Make the year
primary key and drop the artificial index.

Years are perfectly fine data to constrain on, and it saves you the
joins with that table (the foreign key constraint constrains your data
sufficiently).

You may want to constrain your years further by adding a check
constraint, fe. CHECK (year > 1900).

The same argument goes for the country names, but I generally don't like
to have text data for primary key.


Ah, but there is a standardised list of country-codes ideal for the DBA 
since the code is (usually) easily understandable.

  http://en.wikipedia.org/wiki/ISO_3166
The only problem might be if historical data uses old boundaries (e.g. 
Soviet Union, Yugoslavia).
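
A sketch of what this could look like, combining the ISO code idea with
the year check constraint mentioned above (all names are illustrative):

```sql
CREATE TABLE country (
    code char(2) PRIMARY KEY,  -- ISO 3166-1 alpha-2, e.g. 'AF', 'AL'
    name text NOT NULL UNIQUE
);

-- Statistical data keyed directly on the natural keys:
CREATE TABLE statistic (
    country_code char(2) NOT NULL REFERENCES country (code),
    year integer NOT NULL CHECK (year > 1900),
    value numeric,
    PRIMARY KEY (country_code, year)
);
```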


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread A. Kretschmer
On Fri, 28 Sep 2007 at 11:56:46 -0400, Mike Charnoky wrote:
> Hi,
> 
> I am still having problems performing a count(*) on a large table.  This
> 
> Now, certain count(*) queries are failing to complete for certain time
> ranges (I killed the query after about 24 hours).  The table is indexed
> on a timestamp field.  Here is one query that hangs:

Again: an index can't help! Because of MVCC, 'select count(*)' without
a WHERE condition forces a sequential table scan. 


Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread Sean Davis
Mike Charnoky wrote:
> Hi,
> 
> I am still having problems performing a count(*) on a large table.  This
> is a followup from a recent thread:
> 
> http://archives.postgresql.org/pgsql-general/2007-09/msg00561.php
> 
> Since the last time these problems happened, we have tweaked some
> postgresql config parameters (fsm, etc).  I also recreated the large
> table, with the assumption it was somehow corrupted.
> 
> Now, certain count(*) queries are failing to complete for certain time
> ranges (I killed the query after about 24 hours).  The table is indexed
> on a timestamp field.  Here is one query that hangs:
> 
> select count(*) from mytable where evtime between '2007-09-26' and
> '2007-09-27';
> 
> However, this query runs successfully and takes 2 minutes:
> 
> select count(*) from mytable where evtime between '2007-09-25' and
> '2007-09-26';
> 
>   count
> --
>  14150928
> (1 row)
> 
> What is going on?  I analyzed the table before running the query and
> have no reason to believe that the amount of data added to the table
> varies much from day to day.  No data has been deleted from the table
> yet, just added.
> 
> Here is some config info:
> 
> PostgreSQL 8.1.8 on Fedora 3

Have you vacuumed recently?

Sean


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread A. Kretschmer
On Fri, 28 Sep 2007 at 12:50:34 -0400, Alvaro Herrera wrote:
> A. Kretschmer wrote:
> > On Fri, 28 Sep 2007 at 11:56:46 -0400, Mike Charnoky wrote:
> > > Hi,
> > > 
> > > I am still having problems performing a count(*) on a large table.  This
> > > 
> > > Now, certain count(*) queries are failing to complete for certain time
> > > ranges (I killed the query after about 24 hours).  The table is indexed
> > > on a timestamp field.  Here is one query that hangs:
> > 
> > Again: an index can't help! Because of MVCC: 'select count(*)' without
> > WHERE-condition forces an seq. table-scan. 
> 
> But he does have a WHERE condition.  The problem is, probably, that the
> condition is not selective enough so the planner chooses to do a
> seqscan.

Yes, I'm sorry, I hadn't noticed that. Maybe the statistics are wrong. As
Andrew suggested, an 'ALTER TABLE ... SET STATISTICS' can help.
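
As a sketch (table and column names taken from the earlier posts; the
target of 100 is just a starting point, the maximum on 8.1 being 1000):

```sql
-- Collect more detailed statistics for the timestamp column:
ALTER TABLE mytable ALTER COLUMN evtime SET STATISTICS 100;

-- The new target only takes effect at the next ANALYZE:
ANALYZE mytable;
```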


Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


[GENERAL] Re: Why the ERROR: duplicate key violates unique constraint "master_pkey" is raised? - Is this a Bug?

2007-09-28 Thread Nis Jørgensen
Alban Hertroys skrev:
> Nis Jørgensen wrote:
>> Alban Hertroys skrev:
>>> Would something like
>>>
>>> UPDATE master set m2 = master2.m2
>>>   FROM (
>>> SELECT m2 +1
>>>   FROM master m
>>>  WHERE m.master_id = master.master_id
>>>  ORDER BY m2 DESC
>>>  ) master2
>>>
>>> work? I think it might be faster (and possibly cause less index bloat)
>>> than doing two consequent updates.
>>
>> I don't understand your query. I don't think you can use a correlated
>> subquery in that way.
> 
> Hmm indeed, it complains something vague: "ERROR:  subquery in FROM may
> not refer to other relations of same query level". Not sure why?

As I said, I don't understand what you think it does. What you are doing
is similar to writing

SELECT m2
FROM master, (
SELECT m2
FROM master m
WHERE m.master_id = master.master_id
)

Which doesn't make any sense either.

You probably want

UPDATE master SET m2 = master2.m2
FROM (
SELECT master_id, m2 + 1 AS m2
FROM master m
ORDER BY m2 DESC
) master2
WHERE master2.master_id = master.master_id

> Effectively it orders the updates descending, so that the new value of
> m2 can never be updated to an already existing value, because that has
> been updated previously.
> 
> The WHERE condition makes the query look a bit more complex than it
> actually is, but is necessary of course.
> 
>> Anyway, tricks like these might work. They might stop working without
>> warning, if the plan changes. Relying on unspecified behavior is a
>> recipe for trouble.
> 
> If I specifically ask for an ordering, I don't think the planner should
> change or ignore that ordering. So I'm not relying on unspecified behaviour.

According to the SQL spec, all the updates happen at the same time. Thus
any order this happens in is an implementation detail.

The only places where an "ORDER BY" clause is guaranteed to yield
specific results are those which are documented. - off the top of my
head, I can think of "outputting rows to the client", "LIMIT" and
"DISTINCT ON".

The fact that you stick an "ORDER BY" into a subquery guarantees
nothing. The planner might even see that it has no effect (according to
the spec) and ignore it. For instance this

SELECT *
FROM (SELECT *
FROM mytable
ORDER BY id
)
WHERE some_criteria

is not guaranteed to return an ordered result set. Thus the planner can
ignore the ORDER BY (but might not do so).

Nis


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[GENERAL] Re: Why the ERROR: duplicate key violates unique constraint "master_pkey" is raised? - Is this a Bug?

2007-09-28 Thread Alban Hertroys
I had to manipulate the headers a bit, as I hadn't noticed the message
that reached me first was from the newsgroup instead of the ML.

Nis Jørgensen wrote:
> Alban Hertroys skrev:

> As I said, I don't understand what you think it does. What you are doing
> is similar to writing
> 
> SELECT m2
> FROM master, (
>   SELECT m2
>   FROM master m
>   WHERE m.master_id = master.master_id
>   )
> 
> Which doesn't make any sense either.
> 
> You probably want
> 
> UPDATE master set m2 = master2.m2
>   FROM (
>   SELECT m2 +1
>   FROM master m
>   ORDER BY m2 DESC
>   ) master2
> WHERE master2.master_id = master.master_id

So you do understand.

As I mentioned earlier, I didn't test that query. The extra alias
bothered me as unnecessary, and now I see why - I put the where clause
at the wrong place.

> According to the SQL spec, all the updates happen at the same time. Thus
> any order this happens in is an implementation detail.

According to the SQL spec the original update statement should have worked.

But it doesn't, so the updates _don't_ all happen at the same time. That
means there is an order in which they occur, and that order is likely to
be manipulatable.

> The fact that you stick an "ORDER BY" into a subquery guarantees
> nothing. The planner might even see that it has no effect (according to
> the spec) and ignore it. For instance this
> 
> SELECT *
> FROM (SELECT *
>   FROM mytable
>   ORDER BY id
> )
> WHERE some_criteria
> 
> is not guaranteed to return an ordered result set. Thus the planner can
> ignore the ORDER BY (but might not do so).

You are probably right that there's no way to guarantee that ordering,
but the method I suggested works in at least the version of Postgres I
have available (8.1.8), and it will also work in database versions that
update atomically.

There _might_ be a small window of future PG versions where the planner
outsmarts this "trick" while it doesn't yet update atomically, but I
believe that to be rather unlikely.

I expect the priorities of the developers to be on atomic updates as
opposed to filtering out explicitly requested but unnecessary ordering.
The latter may be in use by many to massage the planner into picking a
different plan (even though it's not the right way to fix a bad plan of
course).

Here's some proof:

CREATE TABLE update_test (
update_id serial NOT NULL PRIMARY KEY,
num integer NOT NULL UNIQUE
);

INSERT INTO update_test (num) SELECT * FROM generate_series(5, 15);

-- fails
UPDATE update_test SET num = u2.num
FROM (
SELECT update_id, num +1 AS num
FROM update_test
) u2
WHERE update_test.update_id = u2.update_id;

-- succeeds
UPDATE update_test SET num = u2.num
FROM (
SELECT update_id, num +1 AS num
FROM update_test
ORDER BY num DESC
) u2
WHERE update_test.update_id = u2.update_id;

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


[GENERAL] Re: Why the ERROR: duplicate key violates unique constraint "master_pkey" is raised? - Is this a Bug?

2007-09-28 Thread Nis Jørgensen
Alban Hertroys skrev:
> 
> Nis Jørgensen wrote:
>> Alban Hertroys skrev:
> 
>> As I said, I don't understand what you think it does. What you are doing
>> is similar to writing
>>
>> SELECT m2
>> FROM master, (
>>  SELECT m2
>>  FROM master m
>>  WHERE m.master_id = master.master_id
>>  )
>>
>> Which doesn't make any sense either.
>>
>> You probably want
>>
>> UPDATE master set m2 = master2.m2
>>  FROM (
>>  SELECT m2 +1
>>  FROM master m
>>  ORDER BY m2 DESC
>>  ) master2
>> WHERE master2.master_id = master.master_id
> 
> So you do understand.
> 
> As I mentioned earlier, I didn't test that query. The extra alias
> bothered me as unnecessary, and now I see why - I put the where clause
> at the wrong place.
> 
>> According to the SQL spec, all the updates happen at the same time. Thus
>> any order this happens in is an implementation detail.
> 
> According to the SQL spec the original update statement should have worked.
> 
> But it doesn't, so the updates _don't_ all happen at the same time. That
> means there is an order in which they occur, and that order is likely to
> be manipulatable.

Yes. No dispute here.

> You are probably right that there's no way to guarantee that ordering,
> but the method I suggested works in at least the version of Postgres I
> have available (8.1.8), and they'll also work in database versions that
> update atomically.

It works right now, for the current contents of the table. It might not
work tomorrow, when the planner chooses a different plan.

As an example, I just tried disabling seqscans. After doing this, the
update fails. The plans given for the two cases are estimated to 87 and
97 units, respectively. Do you really want to bet your money on this
plan staying ahead?

Nis


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] row->ARRAY or row->table casting?

2007-09-28 Thread Tom Lane
Nico Sabbi <[EMAIL PROTECTED]> writes:
> is there any way to cast a generic row to an array or to a table type?

"row(...)::composite_type" should work in 8.2 and up.
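
A minimal sketch of the syntax (the type name and values are illustrative):

```sql
-- Define a composite type, then cast a row constructor to it:
CREATE TYPE pair AS (a integer, b text);

SELECT ROW(1, 'one')::pair;       -- yields a composite value of type pair
SELECT (ROW(1, 'one')::pair).a;   -- parentheses needed to access a field
```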

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread Mike Charnoky
The autovacuum is turned on.  Since this is pg8.1, I don't know when the
table was actually last vacuumed.  I *did* run analyze on the table,
though.  Also, nothing has been deleted in this table... so vacuum
should have no effect, right?


Mike

Sean Davis wrote:
> Mike Charnoky wrote:
>> Hi,
>>
>> I am still having problems performing a count(*) on a large table.
> Have you vacuumed recently?
> 
> Sean

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread Gregory Stark
"Alvaro Herrera" <[EMAIL PROTECTED]> writes:

> A. Kretschmer wrote:
>> On Fri, 28 Sep 2007 at 11:56:46 -0400, Mike Charnoky wrote:
>> > Hi,
>> > 
>> > I am still having problems performing a count(*) on a large table.  This
>> > 
>> > Now, certain count(*) queries are failing to complete for certain time
>> > ranges (I killed the query after about 24 hours).  The table is indexed
>> > on a timestamp field.  Here is one query that hangs:
>> 
>> Again: an index can't help! Because of MVCC: 'select count(*)' without
>> WHERE-condition forces an seq. table-scan. 
>
> But he does have a WHERE condition.  The problem is, probably, that the
> condition is not selective enough so the planner chooses to do a
> seqscan.

What does EXPLAIN SELECT ... say?



-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] row->ARRAY or row->table casting?

2007-09-28 Thread Gregory Stark
"Nico Sabbi" <[EMAIL PROTECTED]> writes:

> nb1=# select r.a from (select row(tab1.*)::tab1 as r from tab1)x;
> ERROR:  missing FROM-clause entry for table "r"
> LINE 1: select r.a from (select row(tab1.*)::tab1 as r from tab1)x;
>   ^
> I tried many variations (including casting x as tab1) obtaining only syntax
> errors.

r.a would be the column "a" in the table named "r", but the only table in the
FROM list is "x". So you have to use a workaround to make it clear to the
parser that you're referring to the column "r", it would look like 

SELECT (r).a from (select row(tab1.*)::tab1 as r from tab1)x;

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread Bill Moran
In response to Mike Charnoky <[EMAIL PROTECTED]>:

> The autovacuum is turned on.  Since this is pg8.1, I don't know when the
> table was actually last vacuumed.  I *did* run analyze on the table,
> though.  Also, nothing has been deleted in this table... so vacuum
> should have no affect, right?

Updated rows also produce dead tuples that require vacuuming.  If the
table is insert only, you don't need vacuum.
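
One way to check this on 8.1 (the per-table dead-tuple counters in
pg_stat_user_tables only arrived in later releases; table name is
illustrative):

```sql
-- The VERBOSE output reports removable (dead) vs. nonremovable row
-- versions per table and index:
VACUUM VERBOSE mytable;
```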

-- 
Bill Moran
http://www.potentialtech.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Using RETURNING with INTO inside pgsql

2007-09-28 Thread Bruce Momjian
Ben wrote:
> Woah, when did that come around? Talk about sweet syntactic sugar

8.2

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] question about pg_dump -a

2007-09-28 Thread Ottavio Campana
Vivek Khera ha scritto:
> 
> On Sep 28, 2007, at 9:07 AM, Ottavio Campana wrote:
> 
>> But why does pg_dump does not already exports data such that previous
>> tables do not depend on successive ones?
> 
> Because you can't always sort your tables that way.  The restore
> procedure is responsible for either sorting or disabling the FK checks
> during bulk load.  The latter is more efficient, especially if there are
> no indexes yet, as in a full restore from dump.

How can FK checks be disabled? Is there a command?





Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread Mike Charnoky
With respect to the ALTER TABLE SET STATISTICS... how do I determine a
good value to use?  This wasn't really clear in the pg docs.  Also, do I
need to run ANALYZE on the table after I change the statistics?

Here are the EXPLAINs from the queries:

db=# explain select count(*) from prediction_accuracy where evtime
between '2007-09-25' and '2007-09-26';

QUERY PLAN

-
 Aggregate  (cost=475677.40..475677.41 rows=1 width=0)
   ->  Index Scan using pred_acc_evtime_index on prediction_accuracy
(cost=0.00..51.44 rows=12490383 width=0)
 Index Cond: ((evtime >= '2007-09-25 00:00:00-07'::timestamp
with time zone) AND (evtime <= '2007-09-26 00:00:00-07'::timestamp with
time zone))
(3 rows)

db=# explain select count(*) from prediction_accuracy where evtime
between '2007-09-26' and '2007-09-27';

QUERY PLAN

-
 Aggregate  (cost=486615.04..486615.05 rows=1 width=0)
   ->  Index Scan using pred_acc_evtime_index on prediction_accuracy
(cost=0.00..454671.07 rows=12777586 width=0)
 Index Cond: ((evtime >= '2007-09-26 00:00:00-07'::timestamp
with time zone) AND (evtime <= '2007-09-27 00:00:00-07'::timestamp with
time zone))
(3 rows)


Mike

Gregory Stark wrote:
> "Alvaro Herrera" <[EMAIL PROTECTED]> writes:
> 
>> A. Kretschmer wrote:
>>> On Fri, 28 Sep 2007 at 11:56:46 -0400, Mike Charnoky wrote:
>>>> Hi,
>>>>
>>>> I am still having problems performing a count(*) on a large table.  This
>>>>
>>>> Now, certain count(*) queries are failing to complete for certain time
>>>> ranges (I killed the query after about 24 hours).  The table is indexed
>>>> on a timestamp field.  Here is one query that hangs:
>>> Again: an index can't help! Because of MVCC: 'select count(*)' without
>>> WHERE-condition forces an seq. table-scan. 
>> But he does have a WHERE condition.  The problem is, probably, that the
>> condition is not selective enough so the planner chooses to do a
>> seqscan.
> 
> What does EXPLAIN SELECT ... say?
> 
> 
> 

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Triggers & inheritance

2007-09-28 Thread Tom Lane
Scott Ribe <[EMAIL PROTECTED]> writes:
> Triggers have never been inherited, right? Not in any version?

AFAIR, no.  If they had been I kinda doubt we would have removed it.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] more problems with count(*) on large table

2007-09-28 Thread Scott Marlowe
On 9/28/07, Mike Charnoky <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am still having problems performing a count(*) on a large table.  This
> is a followup from a recent thread:
>
> http://archives.postgresql.org/pgsql-general/2007-09/msg00561.php
>
> Since the last time these problems happened, we have tweaked some
> postgresql config parameters (fsm, etc).  I also recreated the large
> table, with the assumption it was somehow corrupted.
>
> Now, certain count(*) queries are failing to complete for certain time
> ranges (I killed the query after about 24 hours).  The table is indexed
> on a timestamp field.  Here is one query that hangs:
>
> select count(*) from mytable where evtime between '2007-09-26' and
> '2007-09-27';
>
> However, this query runs successfully and takes 2 minutes:
>
> select count(*) from mytable where evtime between '2007-09-25' and
> '2007-09-26';

I had a similar problem pop up recently in my medium large reporting
database.  I found the problem solved by two things, one was upping
the stats target on the time column to 100, and the other was running
regularly scheduled analyzes on the large reporting table, about once
every 8 hours.

autovacuum works fine otherwise, but was not running analyze often enough

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[GENERAL] Please change default characterset for database cluster

2007-09-28 Thread CN
Hi!
"initdb" uses SQL_ASCII as the default character set encoding when it is
not given the option "-E" and when it cannot correctly derive one from
the locale. I suggest "initdb" use UNICODE instead of SQL_ASCII, because
UNICODE is far more useful than SQL_ASCII.

Not all webmasters are willing to spend time reading the "initdb"
documentation. I have encountered a free web host providing
PhpPgAdmin, through which I can create my databases. The problem is that
all newly created databases use SQL_ASCII, which is completely useless
to me. Their PhpPgAdmin script does not support the "-E" switch for
"createdb". As a result, I had to abandon that service altogether. Were
"initdb" to use UNICODE as the default character set, everything would
be perfect.

Regards,

CN

-- 
http://www.fastmail.fm - Same, same, but different


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Please change default characterset for database cluster

2007-09-28 Thread Carlos Moreno
CN wrote:
> Hi!
> "initdb" use SQL_ASCII as the default characterset encoding when it is
> not given option "-E" and when it can not correctly derive one from
> locale. I suggest "initdb" use UNICODE instead of SQL_ASCII because
> UNICODE is far more useful than SQL_ASCII.
>
> Not all webmasters are willing to spend time reading "initdb"
> documentation. I have encountered a free web hosting providing
> PhpPgAdmin through which I can create my databases. Problem is that all
> newly created databases use SQL_ASCII which is completely useless to me.
> Their PhpPgAdmin script does not support "-E" switch for "createdb". As
> a result, I have to abandon that service all together. Was "initdb"
> using UNICODE as the default characterset, everthing would be perfect.

In addition to the general comment that the world does not necessarily
revolve around you, and that you should not expect all software products
in the world to be customized to suit *your* needs, I have to highlight
how horrifying this is:

> Not all webmasters are willing to spend time reading "initdb"
> documentation.

This is truly horrifying --- well, fortunately, one can hope that it
is as wrong as the rest of your message. That dumb and lazy end users
and computer-illiterate people are not willing to spend time reading
documentation or instructions is one thing... But webmasters and
database administrators??? Do you *seriously* expect that highly
complex software like a DB server should be handled by people who are
not willing to read documentation? That's the most preposterous notion
I've read in the last few months!

Another detail to add --- for a lot of people, Unicode is a useless
feature that has a very important performance hit. For a *very large*
fraction of applications, I see it generally advised to use a database
with no encoding (which SQL_ASCII essentially is), and in the situations
where some locale-aware processing is needed, then the client
application can do it.

Of course, there are also many many applications where a DB with
Unicode encoding is very useful. In those cases, the administrators
can create a database with Unicode encoding (you seem to be one of
those that are too busy to be willing to spend time reading the
documentation of *createdb*), regardless of what default encoding was
specified with initdb.
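
As a sketch (database name illustrative; on older releases the requested
encoding must be compatible with the cluster's locale):

```sql
-- Choose the encoding per database, regardless of the cluster default:
CREATE DATABASE mydb WITH ENCODING 'UNICODE';
```

The shell equivalent would be `createdb -E UNICODE mydb`.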

Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
encoding for initdb is ... Ta-da!!! Unicode!!!

Carlos
--


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Please change default characterset for database cluster

2007-09-28 Thread Michael Fuhr
On Fri, Sep 28, 2007 at 09:32:43PM -0400, Carlos Moreno wrote:
> Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
> encoding for initdb is . Ta-d!!! Unicode !!!

No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
it's because initdb is picking it up from your environment.

http://www.postgresql.org/docs/8.2/interactive/app-initdb.html

"The default is derived from the locale, or SQL_ASCII if that does not work."
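
To avoid depending on locale detection entirely, the encoding can be
given explicitly at cluster creation time (the data directory path is
illustrative):

```shell
# Initialize a cluster with an explicit UTF8 default encoding:
initdb -D /usr/local/pgsql/data -E UTF8
```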

-- 
Michael Fuhr

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Please change default characterset for database cluster

2007-09-28 Thread Tom Lane
Michael Fuhr <[EMAIL PROTECTED]> writes:
> No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
> it's because initdb is picking it up from your environment.

Which initdb has done since 8.0.  If the OP is such a rabid UTF8 fan,
one wonders why his default locale setting isn't using UTF8 ...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Please change default characterset for database cluster

2007-09-28 Thread Ron Johnson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 09/28/07 21:12, Tom Lane wrote:
> Michael Fuhr <[EMAIL PROTECTED]> writes:
>> No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
>> it's because initdb is picking it up from your environment.
> 
> Which initdb has done since 8.0.  If the OP is such a rabid UTF8 fan,
> one wonders why his default locale setting isn't using UTF8 ...

He uses Windows?

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFG/bXTS9HxQb37XmcRAjquAJ9EkSRbr4oHmQsFgKbSl7KJzZbqwgCfWp41
6ccK8ThWWoScU9yv3nCq3xQ=
=xcfs
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Please change default characterset for database cluster

2007-09-28 Thread Carlos Moreno

Michael Fuhr wrote:
> On Fri, Sep 28, 2007 at 09:32:43PM -0400, Carlos Moreno wrote:
>> Oh, and BTW, welcome to version 8 of PostgreSQL ... The default
>> encoding for initdb is ... Ta-da!!! Unicode!!!
> 
> No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
> it's because initdb is picking it up from your environment.
> 
> http://www.postgresql.org/docs/8.2/interactive/app-initdb.html
> 
> "The default is derived from the locale, or SQL_ASCII if that does not work."


Right --- I made the "over-assumption" based on the fact that all the
systems where I've installed it (all Fedora flavors of Linux) use UTF8
as the system locale, and thus that one in a sense becomes the
"default" ...  Not sure about other flavors of Unix, but certainly in
the Windows world all bets would be off
(not like anyone would care, huh?  ;-) )

Carlos
--


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Please change default characterset for database cluster

2007-09-28 Thread Magnus Hagander
Ron Johnson wrote:
> On 09/28/07 21:12, Tom Lane wrote:
>> Michael Fuhr <[EMAIL PROTECTED]> writes:
>>> No, it isn't.  If you get UTF8 (formerly UNICODE) as a default then
>>> it's because initdb is picking it up from your environment.
>> Which initdb has done since 8.0.  If the OP is such a rabid UTF8 fan,
>> one wonders why his default locale setting isn't using UTF8 ...
> 
> He uses Windows?

Just FYI: The next version of the Windows installer will attempt to pick
up the locale from the environment. If that succeeds, it will use that
locale and UNICODE encoding. Only if that fails will it pick SQL_ASCII.

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq