Re: [PERFORM] Regarding pg_dump utility

2006-06-08 Thread Christopher Kings-Lynne

Personally I think it would be neat.  For example the admin-tool guys
would be able to get a dump without invoking an external program.
Second it would really be independent of core releases (other than being
tied to the output format.)  pg_dump would be just a simple caller of
such a library, and anyone else would be able to get dumps easily, in
whatever format.


What about fully completing our SQL API for dumping?

ie. We finish adding pg_get_blahdef() for all objects, add a function 
that returns the proper ordering of all objects in the database, and 
then somehow drop out a dump with a single JOIN :D


Chris


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Regarding pg_dump utility

2006-06-08 Thread Christopher Kings-Lynne

It probably wouldn't be terribly difficult to put the guts of pg_dump
into a library that you could interface with via C. I'm not sure if the
community would accept such a patch, though; I seem to recall other
people asking for this on occasion.
 

I think script support is a bit risky, because if anything goes wrong while
taking a backup using pg_dump, the user will not understand the cause of the
failure.

If only script support is possible, which should we prefer: Perl or shell?


Depends on what you're trying to accomplish. Perl is a much more capable
language than shell, obviously.



In phpPgAdmin we just execute pg_dump as a child process and capture its 
output



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Query performance

2006-05-30 Thread Christopher Kings-Lynne
I'm executing the queries from phpPgAdmin. 
The above are for explain analyse. I was referring to the pure query

execution time.
Does anyone have an idea why the OR-query takes so long?
Any server-side tuning possibilities? I wouldn't like to change the code of
ldap's back-sql...


If you're using phpPgAdmin's timings, they can be well off from the real 
explain analyze timings.  Make sure you're using the figure given by explain 
analyze itself.


Chris


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] lowering priority automatically at connection

2006-05-25 Thread Christopher Kings-Lynne

That guy doesn't actually have the foggiest idea what he's doing.
The reason there is no built-in capability to do that is that it *does
not work well*.  Search the list archives for "priority inversion" to
find out why not.


http://en.wikipedia.org/wiki/Priority_inversion


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] IMMUTABLE?

2006-05-16 Thread Christopher Kings-Lynne
Yes, but there are definitely programming cases where 
memoization/caching definitely helps. And it's easy to tell for a given 
function whether or not it really helps by simply trying it with CACHED 
and without.


Would this be a simple thing to implement?


It's called a "table" :)


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Inserts optimization?

2006-04-19 Thread Christopher Kings-Lynne

Scott Marlowe <[EMAIL PROTECTED]> writes:

It's the refusal of people to stop using MyISAM table types that's the
real issue.


Isn't MyISAM still the default over there?  It's hardly likely that the
average MySQL user would use anything but the default table type ...


Since MySQL 5, InnoDB tables are the default, I recall.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] SELECT FOR UPDATE performance is bad

2006-04-18 Thread Christopher Kings-Lynne

Suppose you have a table codes :
(
game_id INT,
code    TEXT,
used    BOOL NOT NULL DEFAULT 'f',
prize   ...
...
PRIMARY KEY (game_id, code)
)

Just UPDATE codes SET used='t' WHERE used='f' AND game_id=... AND 
code=...


Then check the rowcount : if one row was updated, the code was not 
used yet. If no row was updated, the code either did not exist, or was 
already used.


You can use a stored procedure with exceptions, no?

Try this:

http://www.postgresql.org/docs/8.1/interactive/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE
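For instance, a rough sketch of wrapping the update-and-check-rowcount idea in a 
function (the function and parameter names here are invented; the linked docs 
example adds an exception handler for the insert-or-update case):

CREATE OR REPLACE FUNCTION claim_code(p_game_id int, p_code text) RETURNS boolean AS $$
BEGIN
    -- mark the code used only if it is still unused
    UPDATE codes SET used = 't'
     WHERE game_id = p_game_id AND code = p_code AND used = 'f';
    -- FOUND is true only if the single unused row was updated
    RETURN FOUND;
END;
$$ LANGUAGE plpgsql;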

Chris



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Restore performance?

2006-04-11 Thread Christopher Kings-Lynne

Well, your pg_dump command lost your BLOBs since the plain text
format doesn't support them.


Well, no.. they are stored as BYTEA not Large Objects.. They are encoded
in ASCII in the pg_dump output.


As a side note: the plain text dump format in 8.1 supports LOBs


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Slow performance on Windows .NET and OleDb

2006-03-29 Thread Christopher Kings-Lynne
You should run the select query from the psql utility to determine if 
it's PostgreSQL, or your OleDb driver that's being slow.  It takes like 
185ms on one of my tables to get 7000 rows.


Greg Quinn wrote:

how many rows does it return ? a few, or a lot ?


3000 Rows - 7 seconds - very slow

Which client library may have a problem? I am using OleDb, though 
haven't tried the .NET connector yet.


Network configuration?? I am running it off my home PC with no network. 
It is P4 2.4 with 1 Gig Ram. Windows XP


- Original Message - From: "PFC" <[EMAIL PROTECTED]>
To: "Greg Quinn" <[EMAIL PROTECTED]>
Sent: Wednesday, March 29, 2006 11:02 AM
Subject: Re: [PERFORM] Slow performance on Windows .NET and OleDb





select * from users
which returns 4 varchar fields, there is no where clause


how many rows does it return ? a few, or a lot ?

Yes, I am running the default postgres config. Basically I have been 
a MySQL user and thought I would like to check out PostGreSql.


Good idea...

From the tests I made, on simple queries like yours, with no joins, 
speed of pg 8.x is about the same as mysql 5.x; that is to say, very 
fast. If you have a performance problem on something so basic, and 
moreover on windows, it smells like a problem in the client library, 
or in the TCP transport between client and server.
I remember messages saying postgres on windows was slow some time ago 
here, and it turned out to be a problem in the network configuration 
of the machine.






---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

      http://www.postgresql.org/docs/faq


--
Christopher Kings-Lynne

Technical Manager
CalorieKing
Tel: +618.9389.8777
Fax: +618.9389.8444
[EMAIL PROTECTED]
www.calorieking.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Really really slow query. What's a better way?

2006-02-23 Thread Christopher Kings-Lynne

how about something like:

DELETE FROM cds.cds_mspecxx WHERE NOT EXISTS (SELECT 1 FROM 
cds_stage.cds_Catalog stage where stage.countryCode =  'us' and 
stage.ProdId=cds.cds_mspecxx.ProdId) and countryCode = 'us';


Run explain on it first to see how it will be planned.  Both tables 
should have an index over (countryCode, ProdId) I think.
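Something along these lines, say (the index names here are invented):

CREATE INDEX cds_mspecxx_country_prodid_idx
    ON cds.cds_mspecxx (countryCode, ProdId);
CREATE INDEX cds_catalog_country_prodid_idx
    ON cds_stage.cds_Catalog (countryCode, ProdId);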


Chris

Brendan Duddridge wrote:

Hi,

We're executing a query that has the following plan and we're wondering 
given the size of the data set, what's a better way to write the query? 
It's been running since 2pm 2 days ago.


explain DELETE FROM cds.cds_mspecxx WHERE ProdID not in (SELECT 
stage.ProdID FROM cds_stage.cds_Catalog stage where stage.countryCode = 
'us') and countryCode = 'us';
                                QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using pk_mspecxx on cds_mspecxx  (cost=53360.87..208989078645.48 rows=7377879 width=6)
   Index Cond: ((countrycode)::text = 'us'::text)
   Filter: (NOT (subplan))
   SubPlan
     ->  Materialize  (cost=53360.87..77607.54 rows=1629167 width=12)
           ->  Seq Scan on cds_catalog stage  (cost=0.00..43776.70 rows=1629167 width=12)
                 Filter: ((countrycode)::text = 'us'::text)
(7 rows)

Thanks,
*
*
*Brendan Duddridge* | CTO | 403-277-5591 x24 |  [EMAIL PROTECTED] 


*
*ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com 




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Joins and full index scans...mysql vs postgres?

2006-02-22 Thread Christopher Kings-Lynne
The pgAdmin query tool is known to give an answer about 5x the real 
answer - don't believe it!


ryan groth wrote:

Hmm, it came from the timer on the pgadmin III sql query tool. I guess
the 1,000ms includes the round-trip? See, the weird thing is that
mysqlserver is running default configuration on a virtual machine
(P3/1.3GHZ conf'd for 128mb ram) over a 100m/b ethernet connection.
Postgres is running on a real P4/3.0ghz 4GB running localhost. Timings
from the mysql query tool indicate that the 6.5k record query runs in
"1.3346s (.3361s)" vs. the pgadmin query tool saying that the query runs
"997+3522 ms". Am I reading these numbers wrong? Are these numbers
reflective of application performance? Is there an optimization I am
missing?

Ryan



On Wed, 22 Feb 2006, ryan groth wrote:


Does this work:

"Merge Left Join  (cost=0.00..2656.36 rows=6528 width=1522) (actual
time=0.057..123.659 rows=6528 loops=1)"
"  Merge Cond: ("outer".uid = "inner".uid)"
"  ->  Merge Left Join  (cost=0.00..1693.09 rows=6528 width=1264)
(actual time=0.030..58.876 rows=6528 loops=1)"
"Merge Cond: ("outer".uid = "inner".user_id)"
"->  Index Scan using users_pkey on users  (cost=0.00..763.81
rows=6528 width=100) (actual time=0.016..9.446 rows=6528 loops=1)"
"->  Index Scan using phorum_users_base_pkey on
phorum_users_base  (cost=0.00..822.92 rows=9902 width=1168) (actual
time=0.007..15.674 rows=9845 loops=1)"
"  ->  Index Scan using useraux_pkey on useraux  (cost=0.00..846.40
rows=7582 width=262) (actual time=0.007..11.935 rows=7529 loops=1)"
"Total runtime: 127.442 ms"

Well, this implies the query took about 127 ms on the server side. Where
did the 1000 ms number come from (was that on a client, and if so, what
type)?

---(end of broadcast)---
TIP 6: explain analyze is your friend







---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Need pointers to "standard" pg database(s) for

2006-02-19 Thread Christopher Kings-Lynne
Relating to this.  If anyone can find govt or other free db's and 
convert them into pgsql format, I will host them on the dbsamples page. 
 The dbsamples are _really_ popular!


Chris

Scott Marlowe wrote:

On Fri, 2006-02-17 at 10:51, Ron wrote:

I assume we have such?


Depends on what you wanna do.
For transactional systems, look at some of the stuff OSDL has done.

For large geospatial type stuff, the government is a good source, like
www.usgs.gov or the fcc transmitter database.

There are other ones out there.  Really depends on what you wanna test.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Need pointers to "standard" pg database(s) for testing

2006-02-19 Thread Christopher Kings-Lynne

Not really, but you can check out the sample databases project:

http://pgfoundry.org/projects/dbsamples/

Chris

Ron wrote:

I assume we have such?

Ron



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Christopher Kings-Lynne
Ouch! That confirms my problem. I generated the random test case because 
it was easier than including the dump of my tables, but you can 
appreciate that tables 20 times the size are basically crippled when it 
comes to creating an index on them.



I have to say that I restored a few gigabyte dump on freebsd the other 
day, and most of the restore time was in index creation - I didn't think 
too much of it though at the time.  FreeBSD 4.x.


Chris


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Sequential scan being used despite indexes

2006-01-31 Thread Christopher Kings-Lynne
Reading about this issue further in the FAQ, it seems that I should 
ensure that Postgres has adequate and accurate information about the 
tables in question by regularly running VACUUM ANALYZE, something I 
don't do currently.


Well, then you'll always get rubbish performance in PostgreSQL...

I strongly suggest you run autovacuum if you don't really understand 
PostgreSQL vacuuming/analyzing.


Chris


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Autovacuum / full vacuum

2006-01-17 Thread Christopher Kings-Lynne
So my question is: What's the use of an autovacuum daemon if I still 
have to use a cron job to do full vacuums? wouldn't it just be a minor 
job to enhance autovacuum to be able to perform full vacuums, if one 
really wants it to do that - even if some developers think that it's the 
wrong approach?


You should never have to do full vacuums...

Chris

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] NOT LIKE much faster than LIKE?

2006-01-09 Thread Christopher Kings-Lynne

UNLIKELY string LIKE '%% PREEMPT %%'

or:

LIKELY string NOT LIKE '%% PREEMPT %%'


You should be using contrib/tsearch2 for an un-anchored text search perhaps?


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne
... This seems like a case where PostgreSQL's current optimiser should 
easily know what to do if your config settings are correct and you've 
been running ANALYZE, so I'd like to see your settings and the explain 
analyze plan...


I could, but it would divert us from the main topic of this discussion.  
It's not about that query, which was just an example.  It's the larger 
issue.


So your main example bad query is possibly just a case of lack of 
analyze stats and wrong postgresql.conf config?  And that's what causes 
you to shut down your database?  Don't you want your problem FIXED?


But like I said - no developer is interested in doing planner hints. 
Possibly you could get a company to sponsor it.  Maybe what you want is 
a statement of "If someone submits a good, working, fully implemented 
patch that does planner hints, then we'll accept it."


Chris


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne

Boy, you picked a *really* bad example ;-)

The problem is that Postgres decided to filter on myfunc() *first*, and 
then filter on row_num, resulting in a query time that jumped from 
seconds to hours.  And there's no way for me to tell Postgres not to do 
that!


Can you paste your explain analyze output and your effective_cache_size, etc. settings?

So, "you still have no problem" is exactly wrong, because Postgres 
picked the wrong plan.  Postgres decided that applying myfunc() to 
10,000,000 rows was a better plan than an index scan of 50,000 
row_nums.  So I'm screwed.


This seems like a case where PostgreSQL's current optimiser should 
easily know what to do if your config settings are correct and you've 
been running ANALYZE, so I'd like to see your settings and the explain 
analyze plan...


Chris


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne
  > Right on. Some of these "coerced" plans may perform much better.
  > If so, we can look at tweaking your runtime config: e.g.
  >
  > effective_cache_size
  > random_page_cost
  > default_statistics_target
  >
  > to see if said plans can be chosen "naturally".

I see this over and over.  Tweak the parameters to "force" a certain 
plan, because there's no formal way for a developer to say, "I know the 
best plan."


No, this is "fixing your wrong, inaccurate parameters so that 
postgresql can choose a better plan".


I don't necessarily disagree with your assertion that we need planner 
hints, but unless you or someone else is willing to submit a patch with 
the feature it's unlikely to ever be implemented...


Chris


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne

  select * from my_table where row_num >= 5 and row_num < 10
   and myfunc(foo, bar);


You just create an index on myfunc(foo, bar)


only if myfunc(foo, bar) is immutable...


And if it's not then the best any database can do is to index scan 
row_num - so still you have no problem.


Chris


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Christopher Kings-Lynne

   select * from my_table where row_num >= 5 and row_num < 10
and myfunc(foo, bar);


You just create an index on myfunc(foo, bar)
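Roughly like this (index names invented; the partial-index variant is just 
another option, not something from this thread):

-- expression index on the function result; note the extra parentheses,
-- and myfunc must be IMMUTABLE for this to be accepted
CREATE INDEX my_table_myfunc_idx ON my_table ((myfunc(foo, bar)));

-- alternative with the same IMMUTABLE restriction: a partial index on
-- row_num covering only the rows where myfunc(foo, bar) is true
CREATE INDEX my_table_rownum_myfunc_idx ON my_table (row_num)
    WHERE myfunc(foo, bar);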

Chris


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL performance question.

2005-12-14 Thread Christopher Kings-Lynne
I have been using PostgreSQL (currently 7.4.7) for several years now and 
am very happy with it but I currently run a website that has had a 
little bit of a boost and I am starting to see some performance problems 
(Not necessarily PostgreSQL).


PostgreSQL 8.1.1 should give you greater performance...

The database has been allocated 2Gb worth of shared buffers and I have 
tweaked most of the settings in the config recently to see if I could 
increase the performance any more and have seen very little performance 
gain for the various types of queries that I am running.


That sounds like far too many shared buffers?  I wouldn't usually use 
more than a few tens of thousands, eg. 10k-50k.  And that'd only be on 
8.1 that has more efficient buffer management.


Get it into RAM hence the slight delay here. This delay has a serious 
impact on the user waiting in the web application.


# select * from test where text = 'uk' ;
Time: 477.739 ms


You need to show us the explain analyze plan output for this.  But 477ms 
is far too slow for an index scan on a million row table.



max_fsm_pages = 50 # I am thinking this might be a bit low.
max_fsm_relations = 1000


Maybe do a once-off vacuum full to make sure all your tables are clean?

Chris


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] INSERTs becoming slower and slower

2005-12-08 Thread Christopher Kings-Lynne
You might find it faster to install contrib/tsearch2 for text indexing 
sort of purposes...


Nörder-Tuitje wrote:



Hi,

I am breaking up huge texts (between 25K and 250K words) into single 
words using PgPlsql.


For this I am using a temp table in the first step :

LOOP
    vLeft := vRight;
    vTmp  := vLeft;

    LOOP
        vChr := SUBSTRING(pText FROM vTmp FOR 1);
        vTmp := vTmp + 1;
        EXIT WHEN (vChr = ' ' OR vChr IS NULL OR vTmp = cBorder);
    END LOOP;

    vRight := vTmp;

    vLit := SUBSTRING(pText FROM vLeft FOR (vRight - vLeft - 1));

    IF (LENGTH(vLit) > 0) THEN
        WRDCNT := WRDCNT + 1;
        INSERT INTO DEX_TEMPDOC(TMP_DOO_ID
                            ,   TMP_SEQ_ID
                            ,   TMP_RAWTEXT)
        VALUES (pDOO_ID
            ,   I
            ,   vLIT
            );
    END IF;

    I := I + 1;

    vTmp := LENGTH(vLIT);

    IF ((WRDCNT % 100) = 0) THEN
        PROGRESS = ROUND((100 * I) / DOCLEN, 0);
        RAISE NOTICE '[PROC] % WORDS -- LAST LIT % (Len %) [% PCT / % of %]', WRDCNT, vLIT, vTMP, PROGRESS, I, DOCLEN;
    END IF;

    EXIT WHEN vRight >= cBorder;

END LOOP;


The doc is preprocessed; between each word there is only a single blank.

My problem is: the first 25K words are quite quick, but the inserts become 
slower and slower. Starting with 1K words per second, I end up with 100 words 
in 10 seconds (when I reach 80K-100K words).

The only (nonunique) index on tempdoc is on RAWTEXT.

What can I do? Should I drop the index?

Here is my config:

shared_buffers = 2000   # min 16, at least max_connections*2, 
8KB each

work_mem = 32768# min 64, size in KB
maintenance_work_mem = 16384# min 1024, size in KB
max_stack_depth = 8192  # min 100, size in KB

enable_hashagg = true
enable_hashjoin = true
enable_indexscan = true
enable_mergejoin = true
enable_nestloop = true
enable_seqscan = false

The machine is a XEON 3GHz, 1GB RAM, SATA RAID 1 Array running 8.0.4 
i686 precompiled



Thanks !



Kind regards,
Dipl.-Inform. Marcus Noerder-Tuitje
Developer

software technology AG
Kortumstraße 16
44787 Bochum
Tel:  0234 / 52 99 6 26
Fax: 0234 / 52 99 6 22
E-Mail:   [EMAIL PROTECTED]
Internet: www.technology.de





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] TSearch2 vs. Apache Lucene

2005-12-07 Thread Christopher Kings-Lynne
No, my problem is that using TSearch2 interferes with other core 
components of postgres like (auto)vacuum or dump/restore.


That's nonsense...seriously.

The only trick with dump/restore is that you have to install the 
tsearch2 shared library before restoring.  That's the same as all 
contribs though.


Chris


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Query Fails with error calloc - Cannot alocate memory

2005-12-06 Thread Christopher Kings-Lynne
If you're trying to retrieve 26 million rows into RAM in one go of 
course it'll be trouble.


Just use a cursor.  (DECLARE/FETCH/MOVE)
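A minimal sketch, using the table from the query below (batch size arbitrary):

BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM flash_by_branches;
FETCH 10000 FROM big_cur;   -- repeat until no rows come back
-- MOVE can skip over rows without transferring them to the client
CLOSE big_cur;
COMMIT;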

Chris


Howard Oblowitz wrote:

Hi …

I am trying to run a query that selects 26 million rows from a

table with 68 byte rows.

When run on the Server via psql the following error occurs:

calloc : Cannot allocate memory

When run via ODBC from Cognos Framework Manager only works

if we limit the retrieval to 3 million rows.

I notice that the memory used by the query when run on the Server increases

to about 2.4 GB before the query fails.

Postgres version is 7.3.4

Running on Linux Redhat 7.2

4 GB memory

7 Processor 2.5 Ghz

Shmmax set to 2 GB

Configuration Parameters

Shared Buffers  12 288

Max Connections 16

Wal buffers 24

Sort Mem            40960

Vacuum Mem  80192

Checkpoint Timeout  600

Enable Seqscan  false

Effective Cache Size    20


Results of explain analyze and expain analyze verbose:

explain analyze select * from flash_by_branches;

  QUERY 
PLAN 


--

 Seq Scan on flash_by_branches  (cost=1.00..100567542.06 
rows=26854106 width=68) (actual time=12.14..103936.35 rows=26854106 loops=1)


 Total runtime: 122510.02 msec

(2 rows)

explain analyze verbose:

{ SEQSCAN

:startup_cost 1.00

:total_cost 100567542.06

:rows 26854106

:width 68

:qptargetlist (

   { TARGETENTRY

   :resdom

  { RESDOM

  :resno 1

  :restype 1043

  :restypmod 8

  :resname br_code

  :reskey 0

  :reskeyop 0

  :ressortgroupref 0

  :resjunk false

  }

   :expr

  { VAR

  :varno 1

  :varattno 1

  :vartype 1043

  :vartypmod 8

  :varlevelsup 0

  :varnoold 1

  :varoattno 1

  }

   }

   { TARGETENTRY

   :resdom

  { RESDOM

  :resno 2

  :restype 23

  :restypmod -1

  :resname fty_code

  :reskey 0

  :reskeyop 0

  :ressortgroupref 0

  :resjunk false

  }

   :expr

  { VAR

  :varno 1

  :varattno 2

  :vartype 23

  :vartypmod -1

  :varlevelsup 0

  :varnoold 1

  :varoattno 2

}

   }

   { TARGETENTRY

   :resdom

  { RESDOM

  :resno 3

  :restype 1082

  :restypmod -1

  :resname period

  :reskey 0

  :reskeyop 0

  :ressortgroupref 0

  :resjunk false

  }

   :expr

  { VAR

  :varno 1

  :varattno 3

  :vartype 1082

  :vartypmod -1

  :varlevelsup 0

  :varnoold 1

  :varoattno 3

  }

   }

   { TARGETENTRY

   :resdom

  { RESDOM

  :resno 4

  :restype 1700

  :restypmod 786436

  :resname value

  :reskey 0

  :reskeyop 0

  :ressortgroupref 0

  :resjunk false

  }

   :expr

  { VAR

  :varno 1

  :varattno 4

  :vartype 1700

  :vartypmod 786436

  :varlevelsup 0

  :varnoold 1

  :varoattno 4

  }

   }

   { TARGETENTRY

   :resdom

  { RESDOM

  :resno 7

  :restype 1700

  :restypmod 786438

  :resname value1

  :reskey 0

  :reskeyop 0

  :ressortgroupref 0

  :resjunk false

  }

   :expr

  { VAR

  :varno 1

  :varattno 7

  :vartype 1700

  :vartypmod 786438

  :varlevelsup 0

  :varnoold 1

  :varoattno 7

  }

   }

)

:qpqual <>

:lefttree <>

:righttree <>

:extprm ()

:locprm ()

:initplan <>

:nprm 0

:scanrelid 1

}

 Seq Scan on flash_by_branches  (cost=1.00..100567542.06 
rows=26854106 width=68) (actual time=6.59..82501.15 rows=2685


4106 loops=1)

 Total runtime: 102089.00 msec

(196 rows)



Please assist.

Thanks,

Howard Oblowitz







---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] TSearch2 vs. Apache Lucene

2005-12-06 Thread Christopher Kings-Lynne

...

So you'll avoid a non-core product and instead only use another non-core 
product...?


Chris

Michael Riess wrote:


Has anyone ever compared TSearch2 to Lucene, as far as performance is 
concerned?



I'll stay away from TSearch2 until it is fully integrated in the 
postgres core (like "create index foo_text on foo (texta, textb) USING 
TSearch2"). Because a full integration is unlikely to happen in the near 
future (as far as I know), I'll stick to Lucene.


Mike

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Help speeding up delete

2005-11-17 Thread Christopher Kings-Lynne

I do not really see why all the distributions couldn't do something like this,
instead of mucking around with special statically compiled pg_dumps and the
like...


Contrib modules and tablespaces.

Plus, no version of pg_dump before 8.0 is able to actually perform such 
reliable dumps and reloads (due to bugs).  However, that's probably moot 
these days.


Chris


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [pgsql-www] [PERFORM] Help speeding up delete

2005-11-17 Thread Christopher Kings-Lynne
That way if someone wanted to upgrade from 7.2 to 8.1, they 
can just grab the latest dumper from the website, dump their 
old database, then upgrade easily.


But if they're upgrading to 8.1, don't they already have the new
pg_dump? How else are they going to dump their *new* database?


Erm.  Usually when you install the new package/port for 8.1, you cannot 
have both new and old installed at the same time man.  Remember they 
both store exactly the same binary files in exactly the same place.


In my experience not many pgsql admins have test servers or 
the skills to build up test machines with the latest pg_dump, 


I don't, but I still dump with the latest version - works fine both on
linux and windows for me... 


So you're saying you DO have the skills to do it then...

etc.  (Seriously.) In fact, few realise at all that they 
should use the 8.1 dumper.


That most people don't know they should use the new one I understand
though. But I don't see how this will help against that :-)


It'll make it easy...

Chris

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Help speeding up delete

2005-11-17 Thread Christopher Kings-Lynne

Isn't your distribution supposed to do this for you? Mine does these days...


A distribution that tries to automatically do a major postgresql update 
is doomed to fail - spectacularly...


Chris

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Help speeding up delete

2005-11-16 Thread Christopher Kings-Lynne

Perhaps we should put a link on the home page underneath LATEST RELEASEs
saying
7.2: de-supported

with a link to a scary note along the lines of the above.

ISTM that there are still too many people on older releases.

We probably need an explanation of why we support so many releases (in
comparison to licenced software) and a note that this does not imply the
latest releases are not yet production (in comparison to MySQL or Sybase
who have been in beta for a very long time).


By the way, is anyone interested in creating some sort of online 
repository on pgsql.org or pgfoundry where we can keep statically 
compiled pg_dump/all for several platforms for 8.1?


That way if someone wanted to upgrade from 7.2 to 8.1, they can just 
grab the latest dumper from the website, dump their old database, then 
upgrade easily.


In my experience not many pgsql admins have test servers or the skills 
to build up test machines with the latest pg_dump, etc.  (Seriously.) 
In fact, few realise at all that they should use the 8.1 dumper.


Chris


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Help speeding up delete

2005-11-16 Thread Christopher Kings-Lynne

Update to 7.4 or later ;-)

Quite seriously, if you're still using 7.2.4 for production purposes
you could justifiably be accused of negligence.  There are three or four
data-loss-grade bugs fixed in the later 7.2.x releases, not to mention
security holes; and that was before we abandoned support for 7.2.
You *really* need to be thinking about an update.



Perhaps we should put a link on the home page underneath LATEST RELEASEs
saying
7.2: de-supported

with a link to a scary note along the lines of the above.


I strongly support an explicit desupported notice for 7.2 and below on 
the website...


Chris


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Improving performance on multicolumn query

2005-11-09 Thread Christopher Kings-Lynne

transfer=> explain analyse SELECT * FROM test WHERE test_a=9091150001
AND test_b=1 AND test_c=2 AND test_d=0 AND test_e=0;

 Index Scan using test_idx on test  (cost=0.00..50.27 rows=1 width=1891)
(actual time=0.161..0.167 rows=1 loops=1)
   Index Cond: (test_a = 9091150001::bigint)
   Filter: ((test_b = 1) AND (test_c = 2) AND (test_d = 0) AND (test_e = 0))

So, what to do to speed things up? If I understand correctly this
output, the planner uses my index (test_idx is the same as test_pkey
created along with the table), but only for the first column.


Hi Jan,

If you're using 7.4.x then the planner can't use the index for unquoted 
bigints.  Try this:


SELECT * FROM test WHERE test_a='9091150001' AND test_b='1' AND 
test_c='2' AND test_d='0' AND test_e='0';


Chris

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Sort performance on large tables

2005-11-08 Thread Christopher Kings-Lynne

I'd set up a trigger to maintain summary tables perhaps...
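A minimal sketch of that idea for the query quoted below, assuming an 
insert-only workload (all names except completechain/tlid/ogc_fid are 
invented; updates and deletes would need extra handling, and concurrent 
inserts of a brand-new tlid could still collide):

CREATE TABLE completechain_summary (
    tlid        integer PRIMARY KEY,
    min_ogc_fid integer NOT NULL
);
-- seed it once:
--   INSERT INTO completechain_summary
--   SELECT tlid, min(ogc_fid) FROM completechain GROUP BY tlid;

CREATE OR REPLACE FUNCTION completechain_summary_upd() RETURNS trigger AS $$
BEGIN
    -- keep the per-tlid minimum current as rows arrive
    UPDATE completechain_summary
       SET min_ogc_fid = LEAST(min_ogc_fid, NEW.ogc_fid)
     WHERE tlid = NEW.tlid;
    IF NOT FOUND THEN
        INSERT INTO completechain_summary (tlid, min_ogc_fid)
        VALUES (NEW.tlid, NEW.ogc_fid);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER completechain_summary_trg
    AFTER INSERT ON completechain
    FOR EACH ROW EXECUTE PROCEDURE completechain_summary_upd();

The GROUP BY query then becomes a plain scan of the small summary table.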

Chris


Charlie Savage wrote:

Thanks everyone for the feedback.

I tried increasing work_mem:

set work_mem to 30;

select tlid, min(ogc_fid)
from completechain
group by tld;

The results are:

"GroupAggregate  (cost=9041602.80..10003036.88 rows=48071704 width=8)
(actual time=4371749.523..5106162.256 rows=47599910 loops=1)"
"  ->  Sort  (cost=9041602.80..9161782.06 rows=48071704 width=8) (actual
time=4371690.894..4758660.433 rows=48199165 loops=1)"
"Sort Key: tlid"
"->  Seq Scan on completechain  (cost=0.00..2228584.04
rows=48071704 width=8) (actual time=49.518..805234.970 rows=48199165
loops=1)"
"Total runtime: 5279988.127 ms"

Thus the time decreased from 8486 seconds to 5279 seconds - which is a 
nice improvement.  However, that still leaves postgresql about 9 times 
slower.


I tried increasing work_mem up to 50, but at that point the machine 
started using its swap partition and performance degraded back to the 
original values.


Charlie


Richard Huxton wrote:
 > Charlie Savage wrote:
 >> Hi everyone,
 >>
 >> I have a question about the performance of sort.
 >
 >> Note it takes over 10 times longer to do the sort than the full
 >> sequential scan.
 >>
 >> Should I expect results like this?  I realize that the computer is
 >> quite low-end and is very IO bound for this query, but I'm still
 >> surprised that the sort operation takes so long.
 >
 > The sort will be spilling to disk, which will grind your I/O to a halt.
 >
 >> work_mem =  16384# in Kb
 >
 > Try upping this. You should be able to issue "set work_mem = 10"
 > before running your query IIRC. That should let PG do its sorting in
 > larger chunks.
 >
 > Also, if your most common access pattern is ordered via tlid look into
 > clustering the table on that.





---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Temporary Table

2005-11-07 Thread Christopher Kings-Lynne

Or you could just run the 'vacuumdb' utility...

Put something like this in cron:

# Vacuum full local pgsql database
30 * * * *  postgres  vacuumdb -a -q -z

You really should read the manual.

Chris

Christian Paul B. Cosinas wrote:

I see.

But How Can I put this in the Cron of my Linux Server?
I really don't have an idea :)
What I want to do is to loop around all the databases in my server and
execute the vacuum of these 3 tables in each tables.

-Original Message-
From: Joshua D. Drake [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 08, 2005 2:11 AM
To: Christian Paul B. Cosinas
Cc: 'Alvaro Nunes Melo'; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Temporary Table

Christian Paul B. Cosinas wrote:


I try to run this command in my linux server.
VACUUM FULL pg_class;
VACUUM FULL pg_attribute;
VACUUM FULL pg_depend;

But it give me the following error:
-bash: VACUUM: command not found



That needs to be run from psql ...










---(end of 
broadcast)---

TIP 4: Have you searched our list archives?

  http://archives.postgresql.org







---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Temporary Table

2005-11-07 Thread Christopher Kings-Lynne

In what directory in my linux server will I find these 3 tables?


Directory?  They're tables in your database...


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Temporary Table

2005-11-07 Thread Christopher Kings-Lynne
Ummm...they're SQL commands.  Run them in PostgreSQL, not on the unix 
command line...


Christian Paul B. Cosinas wrote:

I try to run this command in my linux server.
VACUUM FULL pg_class;
VACUUM FULL pg_attribute;
VACUUM FULL pg_depend;

But it give me the following error:
-bash: VACUUM: command not found








---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Performance PG 8.0 on dual opteron / 4GB / 3ware

2005-11-06 Thread Christopher Kings-Lynne

Now *I* am confused. What does PgAdmin do more than giving the query to
the database?


It builds it into the data grid GUI object.

Chris


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] browsing table with 2 million records

2005-10-26 Thread Christopher Kings-Lynne

Who needs a paginated view with 100.000 pages ?

- Select min(date) and max(date) from your table
- Present a nifty date selector to choose the records from any day, 
hour,  minute, second

- show them, with "next day" and "previous day" buttons

- It's more useful to the user (most likely he wants to know what  
happened on 01/05/2005 rather than view page 2857)
- It's faster (no more limit/offset ! just "date BETWEEN a AND b",  
indexed of course)

- no more new items pushing old ones to the next page while you browse
- you can pretend to your boss it's just like a paginated list


All very well and good, but now do it generically...


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] browsing table with 2 million records

2005-10-26 Thread Christopher Kings-Lynne
We have a GUI that lets the user browse through the records page by page, at 
about 25 records at a time. (Don't ask me why, but we have to have this 
GUI). This translates to something like


  select count(*) from table   <-- to give feedback about the DB size
  select * from table order by date limit 25 offset 0


Heh, sounds like phpPgAdmin...I really should do something about that.

Tables seem properly indexed, with vacuum and analyze run regularly. 
Still, these very basic SQLs take up to a minute to run.


Yes, COUNT(*) on a large table is always slow in PostgreSQL.  Search the 
mailing lists for countless discussions about it.


Chris


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] prepared transactions that persist across sessions?

2005-10-23 Thread Christopher Kings-Lynne

I am using PHP's PDO PGSQL interface - I haven't read up enough on it
to determine whether a persistent connection can re-use server-side
prepared queries as an option. Anybody know?


It re-uses server-side prepared queries by default, if you are using the 
 PDOPrepare/PDOExecute stuff.


Chris


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Deleting Records

2005-10-20 Thread Christopher Kings-Lynne

What could possibly I do so that I can make this fast?

Here is the code inside my function:

FOR temp_rec IN SELECT * FROM item_qc_doer LOOP
DELETE FROM qc_session WHERE item_id = temp_rec.item_id;
DELETE FROM item_qc_doer WHERE item_id = temp_rec.item_id;
END LOOP;

Item_qc_oder table contains 22,000 records.


I'd check to see if I have foreign keys on those tables and if the 
columns that refer to them are properly indexed.  (For cascade delete or 
even just checking restrict.)
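For example (index names invented; the last, commented-out one goes on 
whatever hypothetical child table references these two):

-- the columns matched by the per-row DELETEs
CREATE INDEX qc_session_item_id_idx   ON qc_session (item_id);
CREATE INDEX item_qc_doer_item_id_idx ON item_qc_doer (item_id);

-- and, for each table with an FK pointing at qc_session/item_qc_doer,
-- an index on its referencing column, e.g.:
-- CREATE INDEX some_child_session_id_idx ON some_child (qc_session_id);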


Chris


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Deleting Records

2005-10-20 Thread Christopher Kings-Lynne

Here is the code inside my function:

FOR temp_rec IN SELECT * FROM item_qc_doer LOOP
DELETE FROM qc_session WHERE item_id = temp_rec.item_id;
DELETE FROM item_qc_doer WHERE item_id = temp_rec.item_id;
END LOOP;

Item_qc_oder table contains 22,000 records.


Also, check you have an index on both those item_id columns.

Also, why don't you just not use the loop and do this instead:

DELETE FROM qc_session WHERE item_id IN (SELECT item_id FROM item_qc_doer);
DELETE FROM item_qc_doer;

Chris


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Server misconfiguration???

2005-10-10 Thread Christopher Kings-Lynne

Yes, you're right, it really boosted things a little.
I want to improve the system performance. Are there any more tips?


The rest of the numbers look vaguely ok...

On this server runs only a webserver with a PHP application which uses the 
Postgres DB. Should I give more memory to Postgres? From what I noticed, 
this is the most memory-hungry service on this system.


The best thing you can do is use two servers so that pgsql does not 
compete with web server for RAM...  Personally I'd start looking at my 
queries themselves next, see where I could optimise them.


Chris

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Server misconfiguration???

2005-10-10 Thread Christopher Kings-Lynne

A lot of them are too large.  Try:

Andy wrote:

Hi to all,
 
I have the following configuration:

Dual Xeon 2.8 Ghz, 1G RAM and postgre 8.0.3 installed.
 
Modified configuration parameters:
 
max_connections = 100
 
shared_buffers = 64000  # 500MB = 500 x 1024 x 1024 / (8 x 1024) (8KB)


shared_buffers = 1


work_mem = 51200  # 50MB = 50 x 1024 KB


work_mem = 4096


maintenance_work_mem = 102400 # 50MB = 100 x 1024 KB
 
checkpoint_segments = 10
 
effective_cache_size = 25600 # 200MB = 50 x 1024 / 8
 
client_min_messages = notice 
log_min_messages = notice

log_min_duration_statement = 2000
 
 
 
I get the feeling the server is somehow misconfigured or is not working at 
full capacity. If I look at memory allocation, it never goes over 250MB 
whatever I do with the database. The kernel shmmax is set to 600MB. Database 
size is around 550MB.


That's because you have work_mem set massively high.  Remember that's 
PER SORT.  If you have 10 queries running each doing 3 sorts that's 30x 
the work_mem right there.


Chris


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] [HACKERS] Query in SQL statement

2005-09-29 Thread Christopher Kings-Lynne



CREATE SEQUENCE ai_id;
CREATE TABLE badusers (
  id int DEFAULT nextval('ai_id') NOT NULL,
  UserName varchar(30),
  Date  datetime DEFAULT '-00-00 00:00:00' NOT NULL,
  Reason varchar(200),
  Admin varchar(30) DEFAULT '-',
  PRIMARY KEY (id),
  KEY UserName (UserName),
  KEY Date (Date)
);


Am always getting foll. Errors,

ERROR:  relation "ai_id" already exists
ERROR:  syntax error at or near "(" at character 240


You have just copied the Mysql code to Postgresql.  It will in no way 
work.  Your default for 'Date' is illegal in postgresql and hence it 
must allow NULLs.  There is no such thing as a 'datetime' type.  There 
is no such thing as 'Key'.  Also your mixed case identifiers won't be 
preserved.  You want:


CREATE TABLE badusers (
  id SERIAL PRIMARY KEY,
  UserName varchar(30),
  Date  timestamp,
  Reason varchar(200),
  Admin varchar(30) DEFAULT '-'
);

CREATE INDEX UserName_Idx ON badusers(Username);
CREATE INDEX Date_Idx ON badusers(Date);


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Too slow query, do you have an idea to optimize?

2005-09-09 Thread Christopher Kings-Lynne
Generate them all into a table and just delete them as you use them.
It's only 1 rows...
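A rough sketch of that (the pool table name is made up; the 0-9999 range is 
taken from the ID[0-9]{4} format, and account.user_id from the query below):

-- build the pool of unused ids once
CREATE TABLE free_user_ids AS
    SELECT 'ID' || LPAD(r::text, 4, '0') AS user_id
    FROM generate_series(0, 9999) AS r
    WHERE 'ID' || LPAD(r::text, 4, '0') NOT IN (SELECT user_id FROM account);

-- hand one out, then remove it from the pool once it is actually used
SELECT user_id FROM free_user_ids ORDER BY random() LIMIT 1;
DELETE FROM free_user_ids WHERE user_id = 'ID0042';   -- the id just handed out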

Chris

Choe, Cheng-Dae wrote:
> I need to generate unused random id with format is ID[0-9]{4}
> so i write below query but it seems to be too slow
> 
> SELECT * FROM ( 
> SELECT user_id FROM (
> SELECT 'ID' || LPAD(r, 4, '0') AS user_id 
> FROM generate_series(1, ) as r) AS s 
> EXCEPT
> SELECT user_id FROM account ) AS t 
> ORDER BY random() 
> LIMIT 1
> 
> and I execute explain analyze query.
> --
> Limit  (cost=318.17..318.17 rows=1 width=32) (actual
> time=731.703..731.707 rows=1 loops=1)
>->  Sort  (cost=318.17..318.95 rows=312 width=32) (actual
> time=731.693..731.693 rows=1 loops=1)
>  Sort Key: random()
>  ->  Subquery Scan t  (cost=285.79..305.24 rows=312 width=32)
> (actual time=424.299..659.193 rows= loops=1)
>->  SetOp Except  (cost=285.79..301.35 rows=311
> width=16) (actual time=424.266..566.254 rows= loops=1)
>  ->  Sort  (cost=285.79..293.57 rows=3112
> width=16) (actual time=424.139..470.529 rows=12111 loops=1)
>Sort Key: user_id
>->  Append  (cost=0.00..105.24 rows=3112
> width=16) (actual time=5.572..276.485 rows=12111 loops=1)
>  ->  Subquery Scan "*SELECT* 1" 
> (cost=0.00..30.00 rows=1000 width=4) (actual time=5.565..149.615
> rows= loops=1)
>->  Function Scan on
> generate_series r  (cost=0.00..20.00 rows=1000 width=4) (actual
> time=5.553..63.224 rows= loops=1)
>  ->  Subquery Scan "*SELECT* 2" 
> (cost=0.00..75.24 rows=2112 width=16) (actual time=0.030..28.473
> rows=2112 loops=1)
>->  Seq Scan on account 
> (cost=0.00..54.12 rows=2112 width=16) (actual time=0.019..10.155
> rows=2112 loops=1)
> Total runtime: 738.809 ms
> 
> 
> do you have any idea for optimize?


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Poor performance of delete by primary key

2005-09-06 Thread Christopher Kings-Lynne

Unfortunately there's no very simple way to determine which FK is the
problem.  (In 8.1 it'll be possible to do that with EXPLAIN ANALYZE,
but in existing releases EXPLAIN doesn't break out the time spent in
each trigger ...)  You have to just eyeball the schema :-(.


phpPgAdmin has a handy info feature where you can see all tables that 
refer to the current one.  You can always go and steal that query to 
find them...


Chris

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] shared buffers

2005-08-29 Thread Christopher Kings-Lynne

I forgot to say that it's a 12GB database...


That's actually not that large.

Ok, I'll set shared buffers to 30.000 pages but even so "meminfo" and 
"top" shouldn't show some shared pages?


Yeah. The reason for not setting buffers so high is because PostgreSQL 
cannot efficiently manage huge shared buffers, so you're better off 
giving the RAM to Linux's disk cache.


Chris


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] shared buffers

2005-08-29 Thread Christopher Kings-Lynne
I've configured postgresql to use 1GB of shared buffers but meminfo and 
"top" are indicating 0 shared buffer pages. Why?


1GB shared buffers is far too much.  Set it back to like 3 buffers 
max...


Chris


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] limit number of concurrent callers to a stored proc?

2005-08-17 Thread Christopher Kings-Lynne
You could use a 1 column/1 row table perhaps.  Use some sort of locking 
mechanism.


Also, check out contrib/userlock
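A minimal sketch of the one-row gate table idea (names invented); note this 
serializes callers one at a time rather than letting N of them through:

CREATE TABLE proc_gate (id int PRIMARY KEY);
INSERT INTO proc_gate VALUES (1);

-- inside the function, before the expensive part:
--     PERFORM 1 FROM proc_gate WHERE id = 1 FOR UPDATE;
-- the row lock is held until the caller's transaction ends, so the other
-- callers queue on it instead of all piling into the expensive section.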

Chris

Alan Stange wrote:

Hello all,

is there a simple way to limit the number of concurrent callers to a 
stored proc?


The problem we have is about 50 clients come and perform the same 
operation at nearly the same time.  Typically, this query takes a few 
seconds to run, but in the case of this thundering herd the query time 
drops to 70 seconds or much more.  The query can return up to 15MB of data.


The machine is a dual opteron, 8 GB memory, lots of fiber channel disk, 
Linux 2.6, etc.


So, I'm thinking that a semaphore than will block more than N clients 
from being in the core of the function at one time would be a good thing.

Thanks!

-- Alan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Indexed views.

2005-08-04 Thread Christopher Kings-Lynne

No, unless you use some custom triggers.

prasanna s wrote:
Does postgres support indexed views/materialised views that some of the 
other databases support?


Thanks
Prasanna S



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] [IMPORTANT] - My application performance

2005-07-26 Thread Christopher Kings-Lynne



Roberto Germano Vieweg Neto wrote:

My application is using Firebird 1.5.2

I have at my database:
- 150 Doamins
- 318 tables
- 141 Views
- 365 Procedures
- 407 Triggers
- 75 generators
- 161 Exceptions
- 183 UDFs
- 1077 Indexes

My question is:

Postgre SQL will be more faster than Firebird? How much (in percent)?


I think you can probably expect around 10341.426% improvement.







ps. Yes, I am joking just in case...


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] faster INSERT with possible pre-existing row?

2005-07-26 Thread Christopher Kings-Lynne
Insert into a temp table then use INSERT INTO...SELECT FROM to insert 
all rows into the proper table that don't have a relationship.
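A rough sketch, with placeholder names for the target table and its three 
identifying columns:

CREATE TEMP TABLE staging (LIKE target_table);
-- ...bulk-load the incoming rows into staging (COPY is fastest)...

INSERT INTO target_table
SELECT s.*
FROM staging s
WHERE NOT EXISTS (
    SELECT 1 FROM target_table t
    WHERE t.col_a = s.col_a AND t.col_b = s.col_b AND t.col_c = s.col_c
);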


Chris

Dan Harris wrote:
I am working on a process that will be inserting tens of million rows  
and need this to be as quick as possible.


The catch is that for each row I could potentially insert, I need to  
look and see if the relationship is already there  to prevent  multiple 
entries.  Currently I am doing a SELECT before doing the  INSERT, but I 
recognize the speed penalty in doing two operations.  I wonder if there 
is some way I can say "insert this record, only if it  doesn't exist 
already".  To see if it exists, I would need to compare  3 fields 
instead of just enforcing a primary key.


Even if this could be a small increase per record, even a few percent  
faster compounded over the whole load could be a significant reduction.


Thanks for any ideas you might have.

-Dan

---(end of broadcast)---
TIP 6: explain analyze is your friend



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Mirroring PostgreSQL database

2005-07-25 Thread Christopher Kings-Lynne

Try Slony: www.slony.info

Shashi Kanth Boddula wrote:


Hi,
I have one customer who is using PostgreSQL 7.4.8 on Linux . He has some 
problems with database mirroring . The details are follows.
The customer is using Linux on which PostgreSQL 7.4.8 along with Jboss 
3.2.3 is running . He has 2 servers , one is acting as a live server 
(primary) and another is acting as a fail-over (secondary)  server 
.  Secondary server is placed in remote location . These servers are 
acting as a Attendence server for daily activities . Nearly 50,000 
employees depend on the live server .
 
The customer is using DBmirror tool to mirror the database records of 
primary to secondary . The customer is complaining that there is one day 
(24 hours) delay between primary and secondary for database 
synchronization . They have dedicated line and bandwidth , but still the 
problems exists.
 
I just want to know , for immediate data mirroring , what is the best 
way for PostgreSQL . PostgreSQL is offering many mirror tools , but 
which one is the best ?. Is there any other way to accomplish the task ?
 
Thank you . Waiting for your reply.
 


Thanks & Regards,
Shashi Kanth
Consultant - Linux
RHCE , LPIC-2
Onward Novell - Bangalore
9886455567




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Profiler for PostgreSQL

2005-07-13 Thread Christopher Kings-Lynne

Try turning on query logging and using the 'pqa' utility on pgfoundry.org.

Chris

Agha Asif Raza wrote:
Is there any MS-SQL Server like 'Profiler' available for PostgreSQL? A 
profiler is a tool that monitors the database server and outputs a 
detailed trace of all the transactions/queries that are executed on a 
database during a specified period of time. Kindly let me know if any of 
you knows of such a tool for PostgreSQL.
 
Agha Asif Raza



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] General DB Tuning

2005-07-12 Thread Christopher Kings-Lynne

Is there a different kind of 'prepared' statements
that we should be using in the driver to get logging
to work properly?  What is the 'new' protocol?


The 8.0.2 jdbc driver uses real prepared statements instead of faked 
ones.  The problem is the new protocol (that the 8.0.2 driver uses) has 
a bug where protocol-prepared queries don't get logged properly.


I don't know if it's been fixed...

Chris


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] General DB Tuning

2005-07-12 Thread Christopher Kings-Lynne
we are using jdbc -- the "log_min_duration_statement = 3000 " 
statement works fine for me.  Looks like there's no other work around 
for the bug(?).  Not sure since I have no interest in logging a 
million statements a day, I only want to see the poorly performing hits. 


Doesn't it depend on what jdbc driver you are using?


It depends if he's using new-protocol prepared queries which don't get 
logged properly.  Wasn't that fixed for 8.1 or something?


Chris


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Why the planner is not using the INDEX .

2005-07-04 Thread Christopher Kings-Lynne
I'm a bit surprised by that behavior though, since it means that if we 
delete a row from table A, all tables (B, C, D) with an FK pointing to this 
table (A) must be scanned.
If there is no index on those tables, it means we are going to do sequential 
scans. That can cause a significant performance problem!


Correct.

Is there a reason why implicit indexes aren't created when FKs are 
declared?


Because it's not a requirement...

I looked into the documentation and I haven't found a way to tell 
postgresql to automatically create an index when creating a FK.  Does it 
mean I need to manage it EXPLICITLY with a create index statement?  Is 
there another way?


No other way - you need to explicitly create them.  It's not that hard 
either to write a query to search the system catalogs for unindexed FK's.
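A rough cut at such a query (it only compares the leading column of each FK 
against the leading column of each index, so treat the output as a starting 
point rather than a verdict):

SELECT c.conrelid::regclass AS fk_table,
       c.conname            AS constraint_name
FROM pg_constraint c
WHERE c.contype = 'f'
  AND NOT EXISTS (
        SELECT 1
        FROM pg_index i
        WHERE i.indrelid = c.conrelid
          AND i.indkey[0] = c.conkey[1]   -- indkey is 0-based, conkey 1-based
  );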


Chris


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Too slow querying a table of 15 million records

2005-06-28 Thread Christopher Kings-Lynne

database=> explain select date_trunc('hour', time),count(*) as total from
test where p1=53 and time > now() - interval '24 hours' group by
date_trunc order by date_trunc ;


Try going:

time > '2005-06-28 15:34:00'

ie. put in the time 24 hours ago as a literal constant.
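That is, the query from above becomes something like (the cutoff is whatever 
"24 hours ago" is at the moment you run it):

SELECT date_trunc('hour', time), count(*) AS total
FROM test
WHERE p1 = 53
  AND time > '2005-06-28 15:34:00'
GROUP BY date_trunc('hour', time)
ORDER BY date_trunc('hour', time);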

Chris


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] How can I speed up this function?

2005-06-27 Thread Christopher Kings-Lynne

What's wrong with Slony?


Because it's not multi-master. Our mirroring package is.


I'm curious - how did you write a multi-master replication package in 
pgsql, when pgsql doesn't have 2 phase commits or any kind of 
distributed syncing or conflict resolution in a release version?


Chris


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] How can I speed up this function?

2005-06-27 Thread Christopher Kings-Lynne

What's wrong with Slony?

David Mitchell wrote:
We have the following function in our home grown mirroring package, but 
it isn't running as fast as we would like. We need to select statements 
from the pending_statement table, and we want to select all the 
statements for a single transaction (pending_trans) in one go (that is, 
we either select all the statements for a transaction, or none of them). 
We select as many blocks of statements as it takes to top the 100 
statement limit (so if the last transaction we pull has enough 
statements to put our count at 110, we'll still take it, but then we're 
done).


Here is our function:

CREATE OR REPLACE FUNCTION dbmirror.get_pending()
  RETURNS SETOF dbmirror.pending_statement AS
$BODY$

DECLARE
count INT4;
transaction RECORD;
statement dbmirror.pending_statement;
BEGIN
count := 0;

FOR transaction IN SELECT t.trans_id as ID
FROM pending_trans AS t WHERE fetched = false
ORDER BY trans_id LIMIT 50
LOOP
update pending_trans set fetched =  true where trans_id = 
transaction.id;


FOR statement IN SELECT s.id, s.transaction_id, s.table_name, 
s.op, s.data

FROM dbmirror.pending_statement AS s
WHERE s.transaction_id = transaction.id
ORDER BY s.id ASC
LOOP
count := count + 1;

RETURN NEXT statement;
END LOOP;

IF count > 100 THEN
EXIT;
END IF;
END LOOP;

RETURN;
END;$BODY$
  LANGUAGE 'plpgsql' VOLATILE;

Table Schemas:

CREATE TABLE dbmirror.pending_trans
(
  trans_id oid NOT NULL,
  fetched bool DEFAULT false,
  CONSTRAINT pending_trans_pkey PRIMARY KEY (trans_id)
)
WITHOUT OIDS;

CREATE TABLE dbmirror.pending_statement
(
  id oid NOT NULL DEFAULT nextval('dbmirror.statement_id_seq'::text),
  transaction_id oid NOT NULL,
  table_name text NOT NULL,
  op char NOT NULL,
  data text NOT NULL,
  CONSTRAINT pending_statement_pkey PRIMARY KEY (id)
)
WITHOUT OIDS;

CREATE UNIQUE INDEX idx_stmt_tran_id_id
  ON dbmirror.pending_statement
  USING btree
  (transaction_id, id);

Postgres 8.0.1 on Linux.

Any Help would be greatly appreciated.

Regards




---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Postgres 8 vs Postgres 7.4/cygwin

2005-06-23 Thread Christopher Kings-Lynne

PostgreSQL 8 for Windows is faster AND more reliable :)

Chris

Scott Goldstein wrote:
I'm currently trying to make a decision on whether to use the Cygwin 
port of Postgres 7.4 or Postgres 8.0 for a windows installation.  Can 
someone provide some comparison info from a performance point of view?  
I was thinking that the Cygwin port has the overhead of the translation 
layer, but 8.0 is a newer product and may still have performance issues.  
Can anyone comment on this? 
 
Thanks for the help.
 
Scott



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] regular expression search

2005-06-14 Thread Christopher Kings-Lynne

Just read the docs in contrib/tsearch2 in the PostgreSQL distribution.

Pierre A. Fortier wrote:

I search for particular strings using regular expressions (e.g. where
column ~* $query) through a text data type column which contains notes
(some html code like bold is included).

It works but my question is whether there would be a way to speed up
searches?


From my limited investigation, I see the features "CREATE INDEX" and
"tsearch2", but I'm not clear on how these work, whether they would be
appropriate, and whether there would be a better approach.

I'd appreciate being pointed in the right direction.

Pierre


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings



---(end of broadcast)---
TIP 6: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Updates on large tables are extremely slow

2005-06-13 Thread Christopher Kings-Lynne

Ok, if all 21 are affected, I can understand the problem.
But allow me to say that this is a "functional error"


No, it's normal MVCC design...


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Recommendations for configuring a 200 GB

2005-06-09 Thread Christopher Kings-Lynne

We're hoping PostgreSQL can match or beat Sybase performance, and
preliminary tests look good.  We should be able to get some load testing
going within a week, and we're shooting for slipping these machines into
the mix around the end of this month.  (We've gone to some lengths to
keep our code portable.)


Just make sure to set up and run the contrib/pg_autovacuum daemon, or 
make sure you fully read 'regular database maintenance' in the manual.


Chris

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Christopher Kings-Lynne

Is effective_cache_size set the same on the test and live?

Jona wrote:
Thanks... have notified our sys admin of that so he can make the correct 
changes.


It still doesn't explain the difference in query plans though?

I mean, it's the same database server the two instances of the same 
database is running on.
One instance (the live) just insists on doing the seq scan of the 50k 
records in Price_Tbl and the 6.5k records in SCT2SubCatType_Tbl.

Seems weird

Cheers
Jona

Christopher Kings-Lynne wrote:

  Thank you for the swift reply, the following is the output of the 
SHOW ALL for shared_buffers and effective_cache_size.

shared_buffers:  13384
effective_cache_size: 4000
server memory: 2GB




effective_cache_size should be 10-100x larger perhaps...

Chris






---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Query plan changes after pg_dump / pg_restore

2005-06-09 Thread Christopher Kings-Lynne
  Thank you for the swift reply, the following is the output of the SHOW 
ALL for shared_buffers and effective_cache_size.

shared_buffers:  13384
effective_cache_size: 4000
server memory: 2GB


effective_cache_size should be 10-100x larger perhaps...

Chris


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] How to find the size of a database - reg.

2005-06-08 Thread Christopher Kings-Lynne

contrib/dbsize in the postgresql distribution.

Shanmugasundaram Doraisamy wrote:

Dear Group!
 Thank you for all the support you all have been 
providing from time to time.  I have a small question: How do I find the 
actual size of the Database?  Awaiting your replies,


Shan.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-06-05 Thread Christopher Kings-Lynne

So, our problem in installing is we don't know a cluster or SSL from a
hole in the ground.  Things get confusing about contexts- are we
talking about a user of the system or the database?  Yikes, do I need
to write down the 30+ character autogenerated password?  


No you don't need to write it down :)


We just want to use JDBC, code SQL queries and essentially not care
what database is under us.  We would love to find a good tool that runs
as an Eclipse plug-in that lets us define our database, generate a
script file to create it and perhaps also help us concoct queries.


Dunno if such a thing exists?


Our experience is that the many UNIX-ish thing about postgres are there
and we don't know UNIX.  This makes you realize how much you take for
granted about the OS you do know.  Of course, we'll learn, but postgres
people, if you're listening: good job, now take us a little farther and
we will be your most ardent supporters.


Just ask questions on the lists, or get instant answers on #postgresql 
on irc.freenode.org.


Cheers,

Chris

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] strategies for optimizing read on rather large tables

2005-06-04 Thread Christopher Kings-Lynne
Without reading too hard, I suggest having a quick look at contrib/ltree 
module in the PostgreSQL distribution.  It may or may not help you.
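As a minimal sketch of what ltree gives you for tree-shaped data such as 
the categories described below (this assumes the contrib module is 
installed; the table and column names are made up):

CREATE TABLE category (
    id   integer PRIMARY KEY,
    path ltree                  -- e.g. 'a.b.e' for category e under b under a
);

CREATE INDEX category_path_gist ON category USING gist (path);

-- every category at or below 'a.b' (b itself, plus d and e in the example tree)
SELECT * FROM category WHERE path <@ 'a.b';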


Chris

hubert lubaczewski wrote:

hi
first let me draw the outline.
we have a database which stores "adverts".
each advert is in one category, and one or more "region".
regions and categories form (each) tree structure.
assume category tree:

 a
/ \
   b   c
  / \
 d   e

if any given advert is in category "e", it means it is also in "b" and
"a".
same goes for regions.

as for now we have approx. 400 categories, 1300 regions, and 100
adverts.

since we are always checking over the trees of categories and regions, we
created an acr_cache table (advert/category/region)
which stores information on all adverts and all categories and regions
this particular region is in.
plus some more information for sorting purposes.

this table is ~ 11 million records.

now.
we query this in more or less this manner:

select advert_id from acr_cache where category_id = ? and region_id = ?
order by XXX {asc|desc} limit 20;

where XXX is one of 5 possible fields,
timestamp,
timestamp,
text,
text,
numeric

we created index on acr_cache (category_id, region_id) 
and it works rather well.

usually.
if a given "crossing" (category + region) has small amount of ads (less
then 1) - the query is good enough (up to 300 miliseconds).
but when we enter the crossings which result in 5 ads - the query
takes up to 10 seconds.
which is almost "forever".

we thought about creating indices like this:
index on acr_cache (effective_date);
where effective_date is one of the timestamp fields.
it worked well for the crossings with lots of ads, but when we asked for a
small crossing (like 1000 ads) it took > 120 seconds!
it appears that postgresql was favoring this new index instead of
using the much better index on category_id and region_id.

actually - i'm not sure what to do next.
i am even thinking about creating special indices (partial) for big
crossings, but that's just weird. plus the fact that the
acr_cache vacuum time already exceeds 3 hours!


any suggestions?
hardware is dual xeon 3 ghz, 4G ram, hardware scsi raid put into raid 1.
settings in postgresql.conf:
listen_addresses = '*'
port = 5800
max_connections = 300
superuser_reserved_connections = 50
shared_buffers = 131072
work_mem = 4096
maintenance_work_mem = 65536
fsync = false
commit_delay = 100
commit_siblings = 5
checkpoint_segments = 10
effective_cache_size = 1
random_page_cost = 1.1
log_destination = 'stderr'
redirect_stderr = true
log_directory = '/home/pgdba/logs'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = false
log_rotation_age = 1440
log_rotation_size = 502400
log_min_duration_statement = -1
log_connections = true
log_duration = true
log_line_prefix = '[%t] [%p] <[EMAIL PROTECTED]> '
log_statement = 'all'
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = true
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'

actual max number of connections is 120 plus some administrative connections 
(psql sessions).
postgresql version 8.0.2 on linux debian sarge.

best regards,

depesz



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] poor performance involving a small table

2005-05-30 Thread Christopher Kings-Lynne
 Seq Scan on sensor  (cost=0.00..1.25 rows=1 width=6) (actual 
time=0.055..0.068 rows=1 loops=1)

   Filter: (sensor_id = 12)
 Total runtime: 801641.333 ms
(3 rows)

Can anybody help me out?  Thanks so much!


Does your table have millions of dead rows?  Do you vacuum once an hour? 
 Run VACUUM FULL ANALYZE sensor;


Chris


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] sequential scan performance

2005-05-29 Thread Christopher Kings-Lynne

When I do an EXPLAIN ANALYZE on the above query, the result is:

 Seq Scan on city_alias  (cost=0.00..59282.31 rows=2 width=42)  (actual 
time=73.369..3330.281 rows=407 loops=1)

   Filter: ((name)::text ~~ '%FRANCISCO'::text)
Total runtime: 3330.524 ms
(3 rows)


this is a query that our system needs to do a LOT.   Is there any way  
to improve the performance on this either with changes to our query  or 
by configuring the database deployment?   We have an index on  city_name 
but when using the % operator on the front of the query  string 
postgresql can't use the index.


Of course not.  There really is no way to make your literal query above 
fast.  You could try making a functional index on the reverse() of the 
string and querying for the reverse() of 'francisco'.
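Something along these lines, assuming a reverse() function is available 
(newer releases have one built in; older ones need a small helper marked 
IMMUTABLE) and that the pattern is anchored at the end as in your example.  
In a non-C locale the index may also need the text_pattern_ops operator 
class for LIKE to use it:

CREATE INDEX city_alias_name_rev ON city_alias (reverse(name));

-- '%FRANCISCO' becomes a prefix search on the reversed string, which can
-- use the index above (add lower() on both sides if you need the
-- case-insensitive behaviour of ~*)
SELECT *
FROM city_alias
WHERE reverse(name) LIKE reverse('FRANCISCO') || '%';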


Or, if you want a general full text index, you should absolutely be 
using contrib/tsearch2.


Chris

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] slow queries, possibly disk io

2005-05-26 Thread Christopher Kings-Lynne

I have some queries that have significan't slowed down in the last
couple days. It's gone from 10 seconds to over 2 mins.

The cpu has never gone over 35% in the servers lifetime, but the load
average is over 8.0 right now. I'm assuming this is probably due to
disk io.


You sure it's not a severe lack of vacuuming that's the problem?

Chris

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] [PORTS] Which library has these symbols? -- Eureka

2005-05-25 Thread Christopher Kings-Lynne

My Next Task: Finding a Stress Test Harness to Load, and Query Data.

Anyone have ideas?

I am eagerly awaiting the *destruction* of Oracle around here, and
"yes" I am an Oracle DBA and think it's *very* good technology.


Have you tried the simple 'gmake test'?

Other than that, try http://osdb.sourceforge.net/ perhaps...

Chris

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread Christopher Kings-Lynne
Would CLUSTER / REINDEX still have an effect if our queries were done 
via sequential scan? 


SELECTS don't write to the database, so they have no effect at all on 
vacuuming/analyzing.  You only need to worry about that with writes.


This is an old database (as in built by me when I 
was just starting to learn unix / postgres) so the database design is 
pretty horrible (little normalisation, no indexes).


No indexes?  Bloody hell :D

Use EXPLAIN ANALYZE SELECT ... ; on all of your selects to see where 
they are slow and where you can add indexes...


Chris

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread Christopher Kings-Lynne
Can anyone explain why this may be occurring and how I might be able to 
keep the original database running at the same speed as "tempdb"?


You're not vacuuming anywhere near often enough.  Read up the database 
maintenance section of the manual.  Then, set up contrib/pg_autovacuum 
to vacuum your database regularly, or make a cron job to run "vacuumdb 
-a -z -q" once an hour, say.


You can fix for the case when you haven't been vacuuming enough by a 
once off VACUUM FULL ANALYZE command, but this will lock tables 
exclusively as it does its work.


Chris

---(end of broadcast)---
TIP 6: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Can anyone explain this: duplicate dbs.

2005-05-24 Thread Christopher Kings-Lynne
If I pg_dump that database then create a new database (e.g. "tempdb") 
and upload the dump file (thus making a duplicate) then the same query 
only takes 190ms !!
Vacuum, vacuum analyse, and vacuum full analyse does not seem to have an 
impact on these times.


Damn, for some reason I didn't read that you had already tried vacuum 
full.  In that case, I can't explain it except perhaps you aren't 
vacuuming properly, or the right thing, or it's a disk cache thing.


Chris

---(end of broadcast)---
TIP 6: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Christopher Kings-Lynne

--As Chris pointed out, how real-world is this test?
His point is valid. The database we're planning will
have a lot of rows and require a lot of summarization
(hence my attempt at a "test"), but we shouldn't be
pulling a million rows at a time.


If you want to do lots of aggregate analysis, I suggest you create a 
sepearate summary table, and create triggers on the main table to 
maintain your summaries in the other table...
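As a sketch of the idea (table, column and function names here are made 
up, and it ignores UPDATEs and the contention you get when many sessions 
touch the same summary row):

CREATE TABLE mtable_day_summary (
    day   text PRIMARY KEY,
    total bigint NOT NULL DEFAULT 0
);

CREATE OR REPLACE FUNCTION mtable_summary_trig() RETURNS trigger AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        UPDATE mtable_day_summary SET total = total + 1 WHERE day = NEW.day;
        IF NOT FOUND THEN
            INSERT INTO mtable_day_summary (day, total) VALUES (NEW.day, 1);
        END IF;
    ELSIF TG_OP = ''DELETE'' THEN
        UPDATE mtable_day_summary SET total = total - 1 WHERE day = OLD.day;
    END IF;
    RETURN NULL;   -- AFTER trigger, return value is ignored
END;
' LANGUAGE plpgsql;

CREATE TRIGGER mtable_summary
    AFTER INSERT OR DELETE ON mtable
    FOR EACH ROW EXECUTE PROCEDURE mtable_summary_trig();

The big count(*) then becomes a single-row lookup on the summary table.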



--MSSQL's ability to hit the index only and not having
to go to the table itself results in a _big_
performance/efficiency gain. If someone who's in
development wants to pass this along, it would be a
nice addition to PostgreSQL sometime in the future.
I'd suspect that as well as making one query faster,
it would make everything else faster/more scalable as
the server load is so much less.


This is well-known and many databases do it.  However, due to MVCC 
considerations in PostgreSQL, it's not feasible for us to implement it...


Chris

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Select performance vs. mssql

2005-05-23 Thread Christopher Kings-Lynne



select count(*) from mtable where day='Mon'

Results:

1. P3 600 512MB RAM MSSQL. It takes about 4-5 secs to
run. If I run a few queries and everything is cached,
it is sometimes  just 1 second.

2. Athlon 1.3 Ghz 1GB RAM. PostgreSQL takes 7 seconds.
I have played with the buffers setting and currently
have it at 7500. At 2 it took over 20 seconds to
run.

5 seconds vs 7 isn't that big of a deal, but 1 second
vs 7 seconds is. Also, the slower performance is with
much lesser hardware.


Post the result of this for us:

explain analyze select count(*) from mtable where day='Mon';

On both machines.

Chris

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Optimizing for writes. Data integrity not critical

2005-05-19 Thread Christopher Kings-Lynne
I'm doing the writes individually.  Is there a better way?  Combining 
them all into a transaction or something?
Use COPY of course :)
Or at worst bundle 1000 inserts at a time in a transaction...
And if you seriously do not care about your data at all, set fsync = off 
in your postgresql.conf for a mega speedup.
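A rough sketch of both options (hypothetical table and file names):

-- COPY: the server reads the file directly (or use COPY ... FROM STDIN from a client)
COPY mytable (id, val) FROM '/tmp/mytable.dat';

-- or, failing that, wrap batches of plain INSERTs in one transaction:
BEGIN;
INSERT INTO mytable (id, val) VALUES (1, 'a');
INSERT INTO mytable (id, val) VALUES (2, 'b');
-- ... a thousand or so of these ...
COMMIT;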

Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Is there any other way to do this?

2005-05-17 Thread Christopher Kings-Lynne
This time it worked! But VACUUM FULL requires an exclusive lock on the 
table which I don't really want to grant. So my question is: why didn't 
VACUUM ANALYZE do the job? Is there any setting I can tweak to 
make a VACUUM without granting a exclusive lock?
You need to run normal vacuum analyze every few minutes or so, to stop 
it growing.  I suggest pg_autovacuum.

Chris
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Christopher Kings-Lynne
Acceptable Answers to 'So, when/is PG meant to be getting a decent  
partitioning system?':

1. Person X is working on it I believe.
2. It's on the list, but nobody has done anything about it yet
3. You're welcome to take a stab at it; I expect the community would 
support your efforts as well.
4. If you have a huge pile of money you could probably buy the  
Moon. Thinking along those lines, you can probably pay someone to  write 
it for you.
5. It's a stupid idea, and it's never going to work, and here's 
why...

Unacceptable Answers to the same question:
1. Yours.
Be more helpful, and less arrogant please. Everyone else who has  
contributed to this thread has been very helpful in clarifying the  
state of affairs and pointing out what work is and isn't being done,  
and alternatives to just waiting for PG to do it for you.
Please YOU be more helpful and less arrogant.  I thought your initial 
email was arrogant, demanding and insulting.  Your followup email has 
done nothing to dispel my impression.  Simon (one of PostgreSQL's major 
contributors AND one of the very few people working on partitioning in 
PostgreSQL, as you requested) told you all the reasons clearly and politely.

Chris
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Prefetch

2005-05-10 Thread Christopher Kings-Lynne
Another trick you can use with large data sets like this when you want 
results
back in seconds is to have regularly updated tables that aggregate the data
along each column normally aggregated against the main data set.

Maybe some bright person will prove me wrong by posting some working
information about how to get these apparently absent features working.
Most people just use simple triggers to maintain aggregate summary tables...
Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Christopher Kings-Lynne

*laff*
Yeah, like they've been working on views for the last 5 years, and
still haven't released them :D :D :D

? 
http://dev.mysql.com/doc/mysql/en/create-view.html
...for MySQL 5.0.1+ ?
Give me a call when it's RELEASED.
Chris
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Christopher Kings-Lynne
This is why I mention partitioning. It solves this issue by storing  
different data sets on different machines under the same schema.  These 
seperate chunks of the table can then be replicated as well for  data 
redundancy and so on. MySQL are working on these things
*laff*
Yeah, like they've been working on views for the last 5 years, and still 
haven't released them :D :D :D

Chris
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Bad choice of query plan from PG 7.3.6 to PG 7.3.9

2005-05-06 Thread Christopher Kings-Lynne
You didn't do analyze.
Chris
Jona wrote:
  Results of VACUUM VERBOSE from both servers
Test server:
comm=# VACUUM VERBOSE StatCon_Tbl;
INFO:  --Relation public.statcon_tbl--
INFO:  Pages 338: Changed 338, Empty 0; Tup 11494: Vac 0, Keep 0, UnUsed 0.
Total CPU 0.02s/0.00u sec elapsed 0.04 sec.
INFO:  --Relation pg_toast.pg_toast_179851--
INFO:  Pages 85680: Changed 85680, Empty 0; Tup 343321: Vac 0, Keep 0, 
UnUsed 0.
Total CPU 4.03s/0.40u sec elapsed 70.99 sec.
VACUUM

Live Server:
comm=# VACUUM VERBOSE StatCon_Tbl;
INFO:  --Relation public.statcon_tbl--
INFO:  Pages 424: Changed 0, Empty 0; Tup 12291: Vac 0, Keep 0, UnUsed 6101.
Total CPU 0.01s/0.00u sec elapsed 0.60 sec.
INFO:  --Relation pg_toast.pg_toast_891830--
INFO:  Pages 89234: Changed 0, Empty 0; Tup 352823: Vac 0, Keep 0, 
UnUsed 5487.
Total CPU 4.44s/0.34u sec elapsed 35.48 sec.
VACUUM

Cheers
Jona
Tom Lane wrote:
Jona <[EMAIL PROTECTED]>  writes:
 

I'm currently experiencing problems with long query execution times.
What I believe makes these problems particularly interesting is the 
difference in execution plans between our test server running PostGreSQL 
7.3.6 and our production server running PostGreSQL 7.3.9.
The test server is an upgraded "home machine", a Pentium 4 with 1GB of 
memory and IDE disk.
The production server is a dual CPU XEON Pentium 4 with 2GB memory and 
SCSI disks.
One should expect the production server to be faster, but apparently 
not, as the query plans outlined below show.
   

I think the plans are fine; it looks to me like the production server
has serious table-bloat or index-bloat problems, probably because of
inadequate vacuuming.  For instance compare these entries:
->  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..6.01 rows=1 
width=4) (actual time=0.05..0.31 rows=39 loops=4)
 Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 
1))
->  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..20.40 rows=5 
width=4) (actual time=27.97..171.84 rows=39 loops=4)
 Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 
1))
Appears to be exactly the same task ... but the test server spent
1.24 msec total while the production server spent 687.36 msec total.
That's more than half of your problem right there.  Some of the other
scans seem a lot slower on the production machine too.
 

1) How come the query plans between the 2 servers are different?
   

The production server's rowcount estimates are pretty good, the test
server's are not.  How long since you vacuumed/analyzed the test server?
It'd be interesting to see the output of "vacuum verbose statcon_tbl"
on both servers ...
regards, tom lane
PS: if you post any more query plans, please try to use software that
doesn't mangle the formatting so horribly ...
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
 

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] COPY vs INSERT

2005-05-04 Thread Christopher Kings-Lynne
So what's the difference between a COPY and a batch of INSERT
statements.  Also, surely, fsyncs only occur at the end of a
transaction, no need to fsync before a commit has been issued, right?
With COPY, the data being inserted itself does not have to pass through 
the postgresql parser.

Chris
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] batch inserts are "slow"

2005-05-02 Thread Christopher Kings-Lynne
conn.setAutoCommit(false);
pst = conn.prepareStatement("INSERT INTO tmp (...) VALUES (?,?)");
for (int i = 0; i < len; i++) {
   pst.setInt(1, 2);
   pst.setString(2, "xxx");
   pst.addBatch();
}
pst.executeBatch();
conn.commit();
This snip takes 1.3 secs in postgresql. How can I lower that?
You're batching them as one transaction, and using a prepared query both 
of which are good.  I guess the next step for a great performance 
improvement is to use the COPY command.  However, you'd have to find out 
how to access that via Java.

I have a nasty suspicion that the release JDBC driver doesn't support it 
and you may have to apply a patch.

Ask on pgsql-jdbc@postgresql.org perhaps.
Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-20 Thread Christopher Kings-Lynne
Yay!  You're right!  I wonder why have I forgotten about LANGUAGE SQL. :)
It's 30 seconds vs 5 seconds for CASE ... END inside PLpgsql vs CASE ... END
LANGUAGE SQL. :)  I.e. it's almost the same as in-place entered SQL.
  Regards,
  Dawid
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-20 Thread Christopher Kings-Lynne
Yay!  You're right!  I wonder why have I forgotten about LANGUAGE SQL. :)
It's 30 seconds vs 5 seconds for CASE ... END inside PLpgsql vs CASE ... END
LANGUAGE SQL. :)  I.e. it's almost the same as in-place entered SQL.
Probably because simple SQL functions get inlined by the optimiser.
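For example, a trivial SQL function like the following (made-up names) is 
expanded into the calling query by the planner, so it costs roughly the 
same as writing the CASE expression inline, whereas the plpgsql version is 
called once per row:

CREATE OR REPLACE FUNCTION status_label(integer) RETURNS text AS '
    SELECT CASE WHEN $1 = 1 THEN ''active'' ELSE ''inactive'' END
' LANGUAGE sql IMMUTABLE;

SELECT id, status_label(status) FROM orders;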
Chris
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] How to tell what your postgresql server is doing

2005-04-19 Thread Christopher Kings-Lynne
Is there a way to look at the stats tables and tell what is jamming up 
your postgres server the most?  Other than seeing long running queries 
and watch top, atop, iostat, vmstat in separate xterms...I'm wondering 
if postgres keeps some stats on what it spends the most time doing or if 
there's a way to extract that sort of info from other metrics it keeps 
in the stats table?

Maybe a script which polls the stats table and correlates the info with 
stats about the system in /proc?
Turn on logging of all queries, sample for a few hours or one day.  Then 
 run Practical Query Analyzer (PQA on pgfoundry.org) over it to get 
aggregate query information.
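In postgresql.conf that is roughly the following (parameter names as of 
7.4/8.0; log_min_duration_statement = 0 is an alternative that also 
records per-query runtimes):

log_statement = 'all'    # log the text of every statement
log_duration = true      # log how long each statement took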

Chris
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Postgresql works too slow

2005-04-17 Thread Christopher Kings-Lynne
Try increasing your checkpoint_segments. In my example, our database 
restore took 75 mins. After increasing checkpoint_segments to 20, we cut 
it down to less than 30 minutes.

Increasing maintenance_work_mem might help too ... or several other
settings ... with no information about exactly *what* is slow, it's
hard to say.
Try setting fsync = false for the duration of your reload.
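For a one-off restore, something like this in postgresql.conf (example 
values only - and remember to turn fsync back on afterwards):

checkpoint_segments = 20        # checkpoints happen less often during the load
maintenance_work_mem = 262144   # in kB; speeds up CREATE INDEX / ADD CONSTRAINT steps
fsync = false                   # unsafe, but fine if you can simply redo the restore after a crash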
Chris
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-15 Thread Christopher Kings-Lynne
Am I correct in assuming that the timings are calculated locally by psql 
on my client, thus including network latency?
No - explain analyze is done on the server...
Chris
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Christopher Kings-Lynne
My problem with this really is that in my database it is hard to predict 
which inserts will be huge (and thus need FKs dissabled), so I would 
have to code it around all inserts. Instead I can code my own integrity 
logic and avoid using FKs altogether.
Just drop the fk and re-add it, until postgres gets more smarts.
Chris
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Christopher Kings-Lynne
Thanks for the pointer. I got this from the archives:

update pg_class set reltriggers=0 where relname = 'YOUR_TABLE_NAME';
to enable them after you are done, do
update pg_class set reltriggers = (select count(*) from pg_trigger
where pg_class.oid = tgrelid) where relname = 'YOUR_TABLE_NAME';


I assume the re-enabling will cause an error when the copy/insert added 
data that does not satisfy the FK. In that case I'll indeed end up with 
invalid data, but at least I will know about it.
No it certainly won't warn you.  You have _avoided_ the check entirely. 
 That's why I was warning you...

If you wanted to be really careful, you could:
begin;
lock tables for writes...
turn off triggers
insert
delete where rows don't match fk constraint
turn on triggers
commit;
Chris
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Christopher Kings-Lynne
Deferring makes no difference to FK checking speed...

But why then is the speed acceptable if I copy and then manually add the 
FK? Is the check done by the FK so much different from when it is done 
automatically using an active deferred FK?
Yeah I think it uses a different query formulation...  Actually I only 
assume that deferred fk's don't use that - I guess your experiment 
proves that.

Well, that's what people do - even pg_dump will restore data and add 
the foreign key afterward...
If I have to go this route, is there a way of automatically dropping and 
re-adding FKs? I can probably query pg_constraints and drop the 
appropriate ones, but how do I re-add them after the copy/insert?
Actually, you can just "disable" them if you want to be really dirty :) 
 You have to be confident that the data you're inserting does satisfy 
the FK, however otherwise you can end up with invalid data.

To see how to do that, try pg_dump with --disable-triggers mode enabled. 
 Just do a data-only dump.
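Roughly (hypothetical table and database names):

pg_dump --data-only --disable-triggers -t mytable mydb > mytable-data.sql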

Chris
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Foreign key slows down copy/insert

2005-04-14 Thread Christopher Kings-Lynne
I am new to cross references between tables, and I am trying to
understand how they impact performance. From reading the documentation I
was under the impression that deferring foreign keys would yield about
the same performance as dropping them before a copy, and adding them
after. However, I cannot see this in my test case.
Even if you defer them, it just defers the check, doesn't eliminate it...
I have a table A with an int column ID that references table B column
ID. Table B has about 150k rows, and has an index on B.ID. When trying
to copy 1 million rows into A, I get the following \timings:
1) drop FK, copy (200s), add FK (5s)
2) add FK deferrable initially deferred, copy (I aborted after 30min)
3) add FK deferrable initially deferred, begin, copy (200s), commit (I
aborted after 30min)
How do I explain why test cases 2 and 3 do not come close to case 1? Am
I missing something obvious?
Deferring makes no difference to FK checking speed...
Since the database I am working on has many FKs, I would rather not have
to drop/add them when I am loading large data sets.
Well, that's what people do - even pg_dump will restore data and add the 
foreign key afterward...

Chris
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Use of data within indexes

2005-04-14 Thread Christopher Kings-Lynne
To be more explicit, let's say I have table with two fields a and b. If 
I have an index on (a,b) and I do a request like "SELECT b FROM table 
WHERE a=x", will Postgresql use only the index, or will it need to also 
read the table page for that (those) row(s)?
It must read the table because of visibility considerations.
There might be a reason why this is not possible (I don't know if the 
indexes have all necessary transaction ID information?) but otherwise 
this could possibly provide an interesting performance gain for some 
operations, in particular with some types of joins. Or maybe it already 
does it.
It's already been thought of :)
The 4 or so columns that store visibility information are not in the 
indexes; storing them there would add a significant write cost.

Chris
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Is Indexed View Supported in psql 7.1.3??

2005-04-10 Thread Christopher Kings-Lynne
   I'd like to know whether indexed views are supported in psql 7.1.3?
No...
Is there any performance analysis tool for psql?
No, we keep telling you to upgrade to newer PostgreSQL.  Then you can 
use EXPLAIN ANALYZE.

Chris
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

