[PERFORM] [OT] Very strange postgresql behaviour

2007-01-29 Thread Arnau

Hi all,

  I have postgresql 7.4.2 running on debian and I have the oddest
postgresql behaviour I've ever seen.

I do the following queries:


espsm_asme=# select customer_app_config_id, customer_app_config_name
from customer_app_config where customer_app_config_id = 5929 or
customer_app_config_id = 11527 order by customer_app_config_id;


 customer_app_config_id | customer_app_config_name
------------------------+--------------------------
                   5929 | INFO
(1 row)


  I do the same query but changing the order of the or conditions:


espsm_asme=# select customer_app_config_id, customer_app_config_name
from customer_app_config where customer_app_config_id = 11527 or
customer_app_config_id = 5929 order by customer_app_config_id;


 customer_app_config_id | customer_app_config_name
------------------------+--------------------------
                  11527 | MOVIDOSERENA TONI 5523
(1 row)



  As you can see, configurations 5929 and 11527 both exist, but each
query returns only one of them.

  Below are the execution plans. The queries use an index; I have run
reindex table customer_app_config but nothing changed.

espsm_asme=# explain analyze select customer_app_config_id,
customer_app_config_name from customer_app_config where
customer_app_config_id = 11527 or customer_app_config_id = 5929 order by
customer_app_config_id;

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=10.28..10.29 rows=2 width=28) (actual time=0.252..0.253
rows=1 loops=1)
   Sort Key: customer_app_config_id
   ->  Index Scan using pk_cag_customer_application_id,
pk_cag_customer_application_id on customer_app_config  (cost=0.00..10.27
rows=2 width=28) (actual time=0.168..0.232 rows=1 loops=1)
         Index Cond: ((customer_app_config_id = 11527::numeric) OR
(customer_app_config_id = 5929::numeric))
 Total runtime: 0.305 ms
(5 rows)

espsm_asme=# explain analyze select customer_app_config_id,
customer_app_config_name from customer_app_config where
customer_app_config_id = 5929 or customer_app_config_id = 11527 order by
customer_app_config_id;

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=10.28..10.29 rows=2 width=28) (actual time=0.063..0.064
rows=1 loops=1)
   Sort Key: customer_app_config_id
   ->  Index Scan using pk_cag_customer_application_id,
pk_cag_customer_application_id on customer_app_config  (cost=0.00..10.27
rows=2 width=28) (actual time=0.034..0.053 rows=1 loops=1)
         Index Cond: ((customer_app_config_id = 5929::numeric) OR
(customer_app_config_id = 11527::numeric))
 Total runtime: 0.114 ms
(5 rows)

  The table definition is the following:

espsm_asme=# \d customer_app_config
                 Table "public.customer_app_config"
          Column          |         Type          |     Modifiers
--------------------------+-----------------------+--------------------
 customer_app_config_id   | numeric(10,0)         | not null
 customer_app_config_name | character varying(32) | not null
 keyword                  | character varying(43) |
 application_id           | numeric(10,0)         | not null
 customer_id              | numeric(10,0)         | not null
 customer_app_contents_id | numeric(10,0)         |
 number_access_id         | numeric(10,0)         |
 prefix                   | character varying(10) |
 separator                | numeric(1,0)          | default 0
 on_hold                  | numeric(1,0)          | not null default 0
 with_toss                | numeric(1,0)          | not null default 0
 number_id                | numeric(10,0)         |
 param_separator_id       | numeric(4,0)          | default 1
 memory_timeout           | integer               |
 with_memory              | numeric(1,0)          | default 0
 session_enabled          | numeric(1,0)          | default 0
 session_timeout          | integer               |
 number                   | character varying(15) |
Indexes:
    "pk_cag_customer_application_id" primary key, btree (customer_app_config_id)
    "un_cag_kwordnumber" unique, btree (keyword, number_id)
    "idx_cappconfig_ccontentsid" btree (customer_app_contents_id)
    "idx_cappconfig_cusidappid" btree (customer_id, application_id)
    "idx_cappconfig_customerid" btree (customer_id)
    "idx_cappconfig_onhold" btree (on_hold)
    "idx_cappconfig_onholdkeyw" btree (on_hold, keyword)
Rules:

  A lot of rules that I don't paste for the sake of length.


  Do you have any idea about how I can fix this?

--
Arnau

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] [OT] Very strange postgresql behaviour

2007-01-29 Thread Bill Moran
In response to Arnau [EMAIL PROTECTED]:
 
I have postgresql 7.4.2 running on debian and I have the oddest
 postgresql behaviour I've ever seen.
 
 I do the following queries:
 
 
 espsm_asme=# select customer_app_config_id, customer_app_config_name
 from customer_app_config where customer_app_config_id = 5929 or
 customer_app_config_id = 11527 order by customer_app_config_id;
 
 
   customer_app_config_id | customer_app_config_name
  ------------------------+--------------------------
                     5929 | INFO
  (1 row)
 
 
I do the same query but changing the order of the or conditions:
 
 
 espsm_asme=# select customer_app_config_id, customer_app_config_name
 from customer_app_config where customer_app_config_id = 11527 or
 customer_app_config_id = 5929 order by customer_app_config_id;
 
 
   customer_app_config_id | customer_app_config_name
  ------------------------+--------------------------
                    11527 | MOVIDOSERENA TONI 5523
  (1 row)
 
 
 
 As you can see, configurations 5929 and 11527 both exist, but each
 query returns only one of them.

[snip]

Just a guess, but perhaps your index is damaged.  Have you tried
REINDEXing?

-- 
Bill Moran
Collaborative Fusion Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] [OT] Very strange postgresql behaviour

2007-01-29 Thread Heikki Linnakangas

Arnau wrote:

Hi all,

  I have postgresql 7.4.2 running on debian and I have the oddest
postgresql behaviour I've ever seen.


You should upgrade. The latest 7.4 release is 7.4.15.


I do the following queries:
...


At first glance this looks like a bug in PostgreSQL, but...


Rules:

  A lot of rules that I don't paste as matter of length.


Are there any SELECT rules by chance that might explain this?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] work-mem how do I identify the proper size

2007-01-29 Thread Bill Moran
In response to Campbell, Lance [EMAIL PROTECTED]:

 If I set work-mem at a particular amount of memory how do I answer the
 following questions:
 
 1)   How many of my queries were able to run inside the memory I
 allocated for work-mem?
 
 2)   How many of my queries had to run from disk because work-mem
 was not set high enough?
 
 3)   If a query had to go to disk in order to be sorted or completed
 is there a way to identify how much memory it would have taken in order
 to run the query from memory?

I don't know of any good way to answer these questions on current versions.

I have a patch in for 8.3 that logs the usage of temporary files, which
helps with some of this.

It'd be nice to have additional debug logging that tells you:
1) when a sort/join operation uses disk instead of memory
2) A higher-level debugging that announces "this query used temp files for
   some operations."

#1 would be nice for optimizing, but may involve a lot of overhead.

#2 could (potentially) be enabled on production servers to flag queries
that need to be investigated, without generating a significant amount of
logging overhead.
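As it happens, logging along the lines of #2 did eventually land: 8.3 gained
a log_temp_files setting. A sketch of its use (on 8.3 or newer):

```sql
-- log_temp_files takes a size threshold in kB: 0 logs every temp file,
-- -1 (the default) disables the logging entirely.
SET log_temp_files = 0;
-- Any subsequent sort/hash that spills to disk now emits a LOG line
-- reporting the temp file's size, alongside the query that caused it.
```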

Hopefully I'll get some time to try to hack some stuff together for this
soon.

A little bit of playing around shows that cost estimates for queries change
radically when the system thinks it will be creating temp files (which
makes sense ...)

Notice these two partial explains:
   ->  Sort  (cost=54477.32..55674.31 rows=478798 width=242)
   ->  Sort  (cost=283601.32..284798.31 rows=478798 width=242)

These are explains of the same query (a simple select * + order by on a
non-indexed column).  The first one is taken with work_mem set at 512MB,
which would appear to be enough space to do the entire sort in memory.
The second is with work_mem set to 128kB.
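The comparison is easy to reproduce in a single psql session; a sketch, with
illustrative table and column names (the actual query isn't shown in the post):

```sql
-- Same query under two work_mem settings: watch the Sort node's cost change.
SET work_mem = '512MB';
EXPLAIN SELECT * FROM big_table ORDER BY some_unindexed_col;
-- the planner prices an in-memory sort

SET work_mem = '128kB';
EXPLAIN SELECT * FROM big_table ORDER BY some_unindexed_col;
-- the planner now prices an on-disk merge sort, hence the much higher cost
```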

More interesting is that the actual runtime doesn't differ by nearly
that much: 3100ms vs 2200ms.  (I've realized that my setting for
random_page_cost is too damn high for this hardware -- thanks for
causing me to look at that ... :)

Anyway -- hope that helps.

-- 
Bill Moran
Collaborative Fusion Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] [OT] Very strange postgresql behaviour

2007-01-29 Thread Arnau

Hi Bill,


In response to Arnau [EMAIL PROTECTED]:

   I have postgresql 7.4.2 running on debian and I have the oddest
postgresql behaviour I've ever seen.

I do the following queries:


espsm_asme=# select customer_app_config_id, customer_app_config_name
from customer_app_config where customer_app_config_id = 5929 or
customer_app_config_id = 11527 order by customer_app_config_id;


  customer_app_config_id | customer_app_config_name
 ------------------------+--------------------------
                    5929 | INFO
 (1 row)


   I do the same query but changing the order of the or conditions:


espsm_asme=# select customer_app_config_id, customer_app_config_name
from customer_app_config where customer_app_config_id = 11527 or
customer_app_config_id = 5929 order by customer_app_config_id;


  customer_app_config_id | customer_app_config_name
 ------------------------+--------------------------
                   11527 | MOVIDOSERENA TONI 5523
 (1 row)



   As you can see, configurations 5929 and 11527 both exist, but each
query returns only one of them.


[snip]

Just a guess, but perhaps your index is damaged.  Have you tried
REINDEXing?



  Yes, I have tried with:

   reindex table customer_app_config
   reindex index pk_cag_customer_application_id

but nothing changed. I also tried to drop the index:

espsm_asme=# begin; drop index pk_cag_customer_application_id;
BEGIN
ERROR:  cannot drop index pk_cag_customer_application_id because 
constraint pk_cag_customer_application_id on table customer_app_config 
requires it
HINT:  You may drop constraint pk_cag_customer_application_id on table 
customer_app_config instead.

espsm_asme=# rollback;
ROLLBACK

  But I can't remove the constraint, as it's the primary key and there
are foreign keys referencing it.



--
Arnau

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Tuning

2007-01-29 Thread Ron

At 06:24 PM 1/28/2007, Josh Berkus wrote:

John,

 -work_mem

Depends on the number of concurrent queries you expect to run and what size
sorts you expect them to do.
EXPLAIN ANALYZE is your friend.  It will tell you how much data each 
query is manipulating and therefore how much memory each query will chew.


The next step is to figure out how many of each query will be running 
concurrently.
Summing those will tell you the maximum work_mem each kind of query 
will be capable of using.


If you have a deep enough understanding of how your pg system is 
working, then you can set work_mem on a per query basis to get the 
most efficient use of the RAM in your system.
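Setting work_mem on a per-query basis, as suggested, can be done with a
transaction-local SET; a sketch, with an illustrative query:

```sql
-- SET LOCAL confines the override to the current transaction, so the
-- session's default work_mem is restored automatically at COMMIT/ROLLBACK.
BEGIN;
SET LOCAL work_mem = '256MB';  -- generous budget for one big sort
SELECT * FROM big_table ORDER BY some_unindexed_col;
COMMIT;
```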




 -maintenance_work_mem - 50% of the largest table?

Actually, in current code I've found that anything over 256mb doesn't
actually get used.

Is this considered a bug?  When will this limit go away?  Does
work_mem have a similar limit?




 -shared_buffers - max value 5

Actually, I need to update that.  On newer faster multi-core machines
you may want to allocate up to 1GB of shared buffers.

 -effective_cache_size - max 2/3 of available ram, ie 24GB on the
 hardware described above

Yes.

Why?  A max of 2/3 of available RAM sounds a bit hand-wavy, especially
with 32GB, 64GB, and 128GB systems available.


Is there a hidden effective or hard limit here as well?

For a dedicated pg machine, I'd assume one would want to be very 
aggressive about configuring the kernel, minimizing superfluous 
services, and configuring memory use so that absolutely as much as 
possible is being used by pg and in the most intelligent way given 
one's specific pg usage scenario.




 -shmmax - how large dare I set this value on dedicated postgres servers?

Set it to 2GB and you'll be covered.

I thought that on 32-bit systems the 2GB shmmax limit had been raised to
4GB, and that there essentially is no limit to shmmax on 64-bit systems?

What are Oracle and EnterpriseDB recommending for shmmax these days?


My random thoughts,
Ron Peacetree 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] int4 vs varchar to store ip addr

2007-01-29 Thread Pomarede Nicolas



Hello,

I have an authorization table that associates one customer IP with a service
IP to determine a TTL (used by a radius server).


table auth
 client varchar(15);
 service varchar(15);
 ttl int4;


client and service are both ip addr.

The number of distinct clients can be rather large (say around 4 million)
and the number of distinct services is around 1000.


table auth can contain between 10 M and 20 M lines.

there's a double index on ( client , service ).

Since I would like to maximize the chance to have the whole table cached 
by the OS (linux), I'd like to reduce the size of the table by replacing 
the varchar by another data type more suited to store ip addr.


I could use PG internal inet/cidr type to store the ip addrs, which would 
take 12 bytes per IP, thus gaining a few bytes per row.


Apart from gaining some bytes, would the btree index scan be faster with 
this data type compared to plain varchar ?



Also, in my case, I don't need the mask provided by inet/cidr ; is there a 
way to store an IPV4 addr directly into an INT4 but using the same syntax 
as varchar or inet/cidr (that is I could use '192.12.18.1' for example), 
or should I create my own data type and develop the corresponding function 
to convert from a text input to an int4 storage ?


This would really reduce the size of the table, since it would need 3 int4
for client/service/ttl, and I guess index scans would be faster with int4
data than with varchar(15)?
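One way to keep the plain '192.12.18.1' input syntax without a custom data
type is a small conversion function. A sketch, using a hypothetical helper
named ip2int; int8 is returned because int4 is signed and addresses above
127.255.255.255 would overflow it (storing the value shifted by 2^31 is the
usual trick if int4 is a hard requirement):

```sql
-- Hypothetical helper: pack a dotted-quad text IP into an integer.
CREATE FUNCTION ip2int(text) RETURNS int8 AS '
    SELECT split_part($1, ''.'', 1)::int8 * 16777216
         + split_part($1, ''.'', 2)::int8 * 65536
         + split_part($1, ''.'', 3)::int8 * 256
         + split_part($1, ''.'', 4)::int8
' LANGUAGE SQL IMMUTABLE STRICT;

-- Example: ip2int('192.12.18.1') => 3222016513
```

Queries could then write WHERE client = ip2int('192.12.18.1'), at the cost of
doing the conversion on every lookup unless the function is marked IMMUTABLE
as above.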


Thanks for any input.


Nicolas

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] int4 vs varchar to store ip addr

2007-01-29 Thread Florian Weimer
* Pomarede Nicolas:

 I could use PG internal inet/cidr type to store the ip addrs, which
 would take 12 bytes per IP, thus gaining a few bytes per row.

I thought it's down to 8 bytes in PostgreSQL 8.2, but I could be
mistaken.

 Apart from gaining some bytes, would the btree index scan be faster
 with this data type compared to plain varchar ?

It will be faster because less I/O is involved.

For purposes like yours, there is a special ip4 type in a contributed
package which brings down the byte count to 4.  I'm not sure if it's
been ported to PostgreSQL 8.2 yet.

-- 
Florian Weimer[EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] int4 vs varchar to store ip addr

2007-01-29 Thread Cosimo Streppone

Nicolas wrote:

I have an authorization table that associates 1 customer IP to a service 
IP to determine a TTL (used by a radius server).


table auth
 client varchar(15);
 service varchar(15);
 ttl int4;
client and service are both ip addr.

The number of distinct clients can be rather large (say around 4
million) and the number of distinct services is around 1000.


there's a double index on ( client , service ).


Another solution comes to mind... I don't know if it is better or worse,
but you could give it a try.
Store IP addresses as 4 distinct columns, like the following:

CREATE TABLE auth (
    client_ip1 smallint,
    client_ip2 smallint,
    client_ip3 smallint,
    client_ip4 smallint,
    service    varchar(15),
    ttl        int4
);

And then index by client_ip4/3/2/1, then service.

CREATE INDEX auth_i1 ON auth (client_ip4, client_ip3, client_ip2, client_ip1);

or:

CREATE INDEX auth_i1 ON auth (client_ip4, client_ip3, client_ip2, client_ip1, service);

I'm curious to know from pg internals experts if this could be a
valid idea or is totally nonsense.

Probably the contrib ip4 type is better suited for these tasks?

--
Cosimo

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] [OT] Very strange postgresql behaviour

2007-01-29 Thread Carlos Moreno

Arnau wrote:


Hi Bill,


In response to Arnau [EMAIL PROTECTED]:


   I have postgresql 7.4.2 running on debian and I have the oddest
postgresql behaviour I've ever seen.

I do the following queries:


espsm_asme=# select customer_app_config_id, customer_app_config_name
from customer_app_config where customer_app_config_id = 5929 or
customer_app_config_id = 11527 order by customer_app_config_id;




Just wild guessing:  is there any chance that there may be some form of
implicit limit modifier for the select statements on this table?  Does the
behaviour change if you add "limit 2" at the end of the query?  Does it
change if you use "customer_app_config_id in (5929, 11527)" instead?

Another wild guess:  if the data is somewhat corrupt, maybe a vacuum
analyze would detect it?  Or perhaps try pg_dumping, to see if pg_dump
at some point complains about missing or corrupt data?
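Spelled out, the checks Carlos suggests would look something like:

```sql
-- Variant with an IN list instead of OR, plus an explicit LIMIT:
SELECT customer_app_config_id, customer_app_config_name
FROM customer_app_config
WHERE customer_app_config_id IN (5929, 11527)
ORDER BY customer_app_config_id
LIMIT 2;

-- Rescan the whole table; corruption sometimes surfaces as errors here:
VACUUM ANALYZE customer_app_config;
```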

HTH,

Carlos
--


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[PERFORM] Thanks All!

2007-01-29 Thread Saranya Sivakumar
Hi All,

Thanks to all in this support community. You are doing a great job!
With all the information/support from the community and documentation, we
successfully upgraded from 7.3.2 to 8.1 on our production environment! It
was a smooth switchover.

Just wanted to say thanks to everyone on the community!

Thanks,
Saranya Sivakumar

 

[PERFORM] Partitioning

2007-01-29 Thread Abu Mushayeed
I have partitioned a table based on period (e.g., cdate >= '2007-01-01'::date
and cdate <= '2007-03-31'::date).

Now, when I issue a query like cdate >= CURRENT_DATE - 1 and cdate <=
CURRENT_DATE, it scans all the partitions. But if I do cdate >=
'2007-01-01'::date and cdate <= '2007-03-31'::date it picks the correct
partition. Also if I join on the cdate field with another table, it does not
pick the correct partition.

I would like to know if it is possible to pick the correct partition using the 
above example.
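The likely culprit is that constraint exclusion only prunes on values the
planner can see as constants at plan time; CURRENT_DATE is evaluated at run
time, so every partition survives planning. A sketch of the usual workaround,
with the application interpolating literal dates (table name is illustrative):

```sql
-- Literal date bounds: the planner compares them against each partition's
-- CHECK constraint and prunes the non-matching children.
SELECT *
FROM partitioned_table
WHERE cdate >= '2007-01-28'::date   -- "yesterday", computed by the client
  AND cdate <= '2007-01-29'::date;  -- "today"

-- By contrast, CURRENT_DATE - 1 is not a plan-time constant, so a query
-- written with it scans all partitions.
```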

Thanks
Abu

 

Re: [PERFORM] Partitioning

2007-01-29 Thread Tomas Vondra
 Can anybody help me out?

 I just wanted to know what the configuration settings should be for a
 partitioned table, so as to make inserts faster on the partitioned tables.

Well, that depends. Many questions are important here. Will you insert
into several partitions or only into a single one? Do you need to enforce
some constraint between several partitioned tables?

If you need to insert into several partitions, it can be faster as you
can place them on different drives. If you need to insert only into the
last one (that's usually the case with 'logging' tables) then this
probably won't give a huge performance benefit.

If you need to enforce some kind of constraint between multiple
partitions (possibly from several tables), you'll have to do that
manually using a plpgsql procedure (for example). This is the case with a
UNIQUE constraint on a single table, a FOREIGN KEY between multiple
partitioned tables, etc. This can mean a serious performance penalty,
especially when you do mostly insert/update on that table.

This is mostly about application architecture - if you use partitions
incorrectly it's almost impossible to fix that by changing settings in
postgresql.conf.
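For reference, the inheritance-style partitioning both posts assume looks
roughly like this (names are illustrative):

```sql
CREATE TABLE log_parent (ts timestamptz NOT NULL, msg text);

-- One child per month; the CHECK constraint is what constraint
-- exclusion uses to skip the partition at plan time.
CREATE TABLE log_2007_01 (
    CHECK (ts >= '2007-01-01' AND ts < '2007-02-01')
) INHERITS (log_parent);

-- Inserting directly into the newest child avoids any trigger/rule
-- routing overhead on the parent.
INSERT INTO log_2007_01 VALUES (now(), 'example row');
```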

Tomas

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly