Re: [GENERAL] Installing support for python on windows

2006-12-31 Thread novnov

Thanks to both of you. I tried the following and got an error that
plpython.dll couldn't be found:

D:\postgresql\bin>createlang -U sauser plpythonu mydb
Password:
createlang: language installation failed: ERROR:  could not load library
"D:/postgresql/lib/plpython.dll": The specified module could not be found.

The file spec'd by the error message does indeed exist, though the slashes
in Windows would of course be the other way around in normal use. Then I
tried leaving off the u in plpythonu:

D:\postgresql\bin>createlang -U sauser plpython mydb
Password:
createlang: language installation failed: ERROR:  unsupported language
"plpython"
HINT:  The supported languages are listed in the pg_pltemplate system
catalog.

Then I tried something like what Adrian had suggested:

D:\postgresql\bin>createlang -U sauser -d mydb plpythonu
Password:
createlang: language installation failed: ERROR:  could not load library
"D:/postgresql/lib/plpython.dll": The specified module could not be found.

It's interesting that createlang knows to look in the peer lib directory for
the language file but somehow does not like the plpython that it sees there.
I also tried plpython (no trailing u) and had the same error as the earlier
experiment.




Re: [GENERAL] Database versus filesystem for storing images

2006-12-31 Thread Guy Rouillier

Clodoaldo wrote:

A web application will store hundreds of thousands of images.

What is the best practice in this situation? I mean not only the two
options as above but any.


This discussion comes up regularly.  See the archives for a thread 
titled "Storing images in PostgreSQL databases (again)" for the latest.


--
Guy Rouillier




Re: [GENERAL] Installing support for python on windows

2006-12-31 Thread Tom Lane
novnov <[EMAIL PROTECTED]> writes:
> - Per most docs the installation of a language is done with createlang. I've
> tried many times and cannot get the syntax right. Typically I log in to psql
> in the \bin directory as the main postgres user. After that, a typical
> attempt is looks like this:

> postgres=# createlang 'd:\postgresql\lib\plpython' mydb;

> Typical error is:
> ERROR:  syntax error at or near "createlang"
> LINE 1: createlang d:

createlang is a command-line program, not a SQL command.  You can use
the equivalent CREATE LANGUAGE command if you want to do it from the
SQL prompt.
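
For instance, a minimal sketch of the SQL-level equivalent, run from psql while
connected to the target database (this assumes the server can actually load
the plpython DLL, which is a separate problem):

CREATE LANGUAGE plpythonu;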

regards, tom lane



Re: [GENERAL] Installing support for python on windows

2006-12-31 Thread Adrian Klaver
On Sunday 31 December 2006 2:09 pm, novnov wrote:
> I've spent a couple of hours on this and would like to ask for an assist at
> this point. I need to install python support on windows. The postgresql
> install is the windows 8.2.0-1 binary, and pl/pgsql was included in the
> initial installation. I have python 2.5 installed. The workstation's path
> does not ref either the python or postgres directories.
>
> - I take it that plpython is what I should be installing? I first read in
> this list that psycopg is commonly used to work with python in postgres,
> and I installed the windows port of that, but 'nothing happened' ie no
> python lang support materialized. I'm not at all sure what the score is
> with psycopg and plpython, but am now aiming for plpython as that seems to
> be the offering that is included with the postgres installer. But I'm
> curious, why psycopg when plpython is there? Are they complimentary, or
> should it be one or the other?
Psycopg is an interface from Python to Postgres. In other words, it works from 
the outside. You use it to connect to a Postgres database and manipulate data as 
needed. plpython is a procedural language for Postgres that allows one to use 
Python from within the server. For what you want to do, plpython is the choice.
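
For instance, a toy plpythonu function (just a sketch, and it assumes the
plpythonu language is already installed in the database) runs its Python body
inside the server:

CREATE FUNCTION pymax(integer, integer) RETURNS integer AS $$
# args[] holds the function's arguments inside the server-side Python body
return max(args[0], args[1])
$$ LANGUAGE plpythonu;

SELECT pymax(3, 7);
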
>
> - The postgresql docs reference plpythonu (untrusted) but what I find in
> \lib is plpython? Are the docs out of date or am I mixing up information?
> (http://www.postgresql.org/docs/8.2/interactive/plpython.html)
>
I haven't installed Postgres on Windows so I am not familiar with the naming 
there. At one time there was a plpython, but it has been replaced by plpythonu 
because Python did not have a robust mechanism for keeping the language from 
straying outside the server.
> - Per most docs the installation of a language is done with createlang.
> I've tried many times and cannot get the syntax right. Typically I log in
> to psql in the \bin directory as the main postgres user. After that, a
> typical attempt is looks like this:
>
> postgres=# createlang 'd:\postgresql\lib\plpython' mydb;
>
> Typical error is:
> ERROR:  syntax error at or near "createlang"
> LINE 1: createlang d:
>
> This is a later attempt where I've added the lib path. I've tried no path,
> double quotes, many things. I've tried executing from the windows command
> line (ie not while in psql), adding the dll extension. Rather than spending
> the rest of the year on this (ie ten hours ) I hope someone can knock
> out a line or two to steer me in the right direction.

Just for reference's sake, did you try createlang -d mydb plpythonu

See URL below for more information
http://www.postgresql.org/docs/8.2/interactive/app-createlang.html


-- 
Adrian Klaver
[EMAIL PROTECTED]

[GENERAL] Application validation of data on insert

2006-12-31 Thread John Cohorn

Hello!

I'm only moderately familiar with postgres and the DBI-related modules that
I'm using to access it, so forgive me if this is a simple question. I've been
looking for several hours for information on this and have not found a
satisfying answer.

I'm writing a simple CGI application to do simple create, update, delete
operations on a few tables, and although doing these basic operations is
trivial I am having a great deal of trouble figuring out the best way to
validate data being inserted and handle related errors.

My question in a nutshell is: when performing an INSERT, how do I determine
exactly which columns are responsible for the error when one occurs (so that
within the application I can say "hey user, check how you typed field x and
field y")? The SQLSTATE error codes are very non-specific and I cannot find
another method for checking what failed without parsing text
output (unreasonable for many reasons). Perhaps there is a way to define an
INSERT trigger that could place/return the names of the problem columns
somewhere? Doing this in a way that does not tie one only to PGSQL (although
it's my favorite) is even more of a problem.
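
For example (a rough sketch with a hypothetical table; the names are
placeholders), one partial approach is to give constraints predictable names
that encode the column, so a failed INSERT at least reports a stable token
the application can map back to a field:

CREATE TABLE account (
    account_id  serial PRIMARY KEY,
    email       text NOT NULL
                CONSTRAINT account_email_format CHECK (email LIKE '%@%'),
    age         integer
                CONSTRAINT account_age_range CHECK (age BETWEEN 0 AND 150)
);

A violating INSERT then reports the constraint name (e.g. "account_age_range"),
though the name still arrives inside the error text, so some matching is
unavoidable.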

It seems to me that all information about acceptable values, constraints and
types for columns is already present within the table definitions so
duplicating those checks and definitions within the application and user
view is folly.

Thanks for your enlightenment!


[GENERAL] Installing support for python on windows

2006-12-31 Thread novnov

I've spent a couple of hours on this and would like to ask for an assist at
this point. I need to install Python support on Windows. The PostgreSQL
install is the Windows 8.2.0-1 binary, and pl/pgsql was included in the
initial installation. I have Python 2.5 installed. The workstation's path
does not reference either the Python or Postgres directories.

- I take it that plpython is what I should be installing? I first read in
this list that psycopg is commonly used to work with Python in Postgres, and
I installed the Windows port of that, but 'nothing happened', i.e. no Python
language support materialized. I'm not at all sure what the score is with
psycopg and plpython, but am now aiming for plpython as that seems to be the
offering that is included with the Postgres installer. But I'm curious, why
psycopg when plpython is there? Are they complementary, or should it be one
or the other?

- The postgresql docs reference plpythonu (untrusted) but what I find in
\lib is plpython? Are the docs out of date or am I mixing up information?
(http://www.postgresql.org/docs/8.2/interactive/plpython.html)

- Per most docs the installation of a language is done with createlang. I've
tried many times and cannot get the syntax right. Typically I log in to psql
in the \bin directory as the main postgres user. After that, a typical
attempt looks like this:

postgres=# createlang 'd:\postgresql\lib\plpython' mydb;

Typical error is:
ERROR:  syntax error at or near "createlang"
LINE 1: createlang d:

This is a later attempt where I've added the lib path. I've tried no path,
double quotes, many things. I've tried executing from the windows command
line (ie not while in psql), adding the dll extension. Rather than spending
the rest of the year on this (ie ten hours ) I hope someone can knock out
a line or two to steer me in the right direction. 









[GENERAL] Database versus filesystem for storing images

2006-12-31 Thread Clodoaldo

A web application will store hundreds of thousands of images.

The images' metadata will be stored in PostgreSQL, but I'm in doubt
about where to store the images themselves: in the database as byte
strings or in the file system.

Would one of them have a faster image search than the other?

In the database the images would be indexed by a serial id, that of
the file submitter, and an image number. They would always be searched
by the same key.

In the file system there would be a two or three level directory
structure and the directory names would be hexadecimal numbers like
'/images_directory/f/0/d' taken from the 2/3 first letters of the file
name, which would be a hash.

Whenever I want to know the hash name of an image so I can find it in
the file system I would have to search it first in the database by the
index. So I suppose the database will always be faster.
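
To make that concrete, the metadata table I have in mind looks roughly like
this (the names are placeholders):

CREATE TABLE image_meta (
    submitter_id  integer NOT NULL,
    image_no      integer NOT NULL,
    hash          text NOT NULL,  -- e.g. 'f0d...', drives /images_directory/f/0/d/
    data          bytea,          -- only used if the image bytes live in the database
    PRIMARY KEY (submitter_id, image_no)
);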

But there is an advantage in using the file system. It is that
Apache's HTTP file caching management is there for free. If I store
the images in the database I will have to handle the request and
response HTTP headers myself if I want the images to be cached by the
clients.

What is the best practice in this situation? I mean not only the two
options as above but any.

Regards,
--
Clodoaldo Pinto Neto



Re: [GENERAL] Generic timestamp function for updates where field names vary

2006-12-31 Thread novnov

Thanks Adrian, I'll see what I can do with that, I'll learn a lot by going
through what you've done.


Adrian Klaver wrote:
> 
> On Sunday 31 December 2006 8:48 am, novnov wrote:
>> OK. python would be the preference, if anyone is interested in showing me
>> how it would be done, I've never used one of the dynamic languages with
>> postgres.
>>
>> Why would not be possible in plpgsql? It has loop etc, the only part I'm
>> not sure it can do it use the variable as field name.
>>
> 
>> >http://archives.postgresql.org/
> Here is a function I wrote in python to do something similar.  My
> timestamp 
> fields are of the form tc_ts_update where tc is  a table code that can be 
> found by looking up the table name in the table_code table. In pl/pythonu 
> that ships with 8.2 it is no longer necessary to do the relid look up. 
> There 
> is a TD["table_name"] variable that returns the table name directly.
> 
> CREATE OR REPLACE FUNCTION public.ts_update()
> RETURNS trigger AS
> $Body$
> table_oid=TD["relid"]
> plan_name=plpy.prepare("SELECT relname FROM pg_class WHERE
> oid=$1",["oid"])
> plan_code=plpy.prepare("SELECT tc_table_code FROM table_code WHERE 
> tc_table_name=$1",["text"])
> rs_name=plpy.execute(plan_name,[table_oid])
> rs_code=plpy.execute(plan_code,[rs_name[0]["relname"]])
> fld_name="_ts_update"
> tbl_code=rs_code[0]["tc_table_code"]
> full_name=tbl_code+fld_name
> TD["new"][full_name]="now()"
> return "MODIFY"
> $Body$
> LANGUAGE plpythonu SECURITY DEFINER;
> -- 
> Adrian Klaver
> [EMAIL PROTECTED]
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster
> 
> 





Re: [GENERAL] Generic timestamp function for updates where field names vary

2006-12-31 Thread novnov

Yes thanks that does make it clear.


Martijn van Oosterhout wrote:
> 
> On Sun, Dec 31, 2006 at 08:48:07AM -0800, novnov wrote:
>> Why would not be possible in plpgsql? It has loop etc, the only part I'm
>> not
>> sure it can do it use the variable as field name.
> 
> The "dynamic" in dynamic language usually refers to dynamic typing.
> pl/pgsql is a strictly typed language which means one can look at the
> code and determine what type each variable will be.
> 
> Languages like perl and python are dynamic because what type a variable
> is does not matter until runtime and the type can change every time you
> run it.
> 
> The problem with using a variable as a field name is that you can no
> longer tell what type the result will be. It could be any type
> supported by postgres. pl/pgsql can't handle that, perl and python
> can...
> 
> Hope this helps,
> -- 
> Martijn van Oosterhout  http://svana.org/kleptog/
>> From each according to his ability. To each according to his ability to
>> litigate.
> 
>  
> 





Re: [GENERAL] Generic timestamp function for updates where field names vary

2006-12-31 Thread Adrian Klaver
On Sunday 31 December 2006 8:48 am, novnov wrote:
> OK. python would be the preference, if anyone is interested in showing me
> how it would be done, I've never used one of the dynamic languages with
> postgres.
>
> Why would not be possible in plpgsql? It has loop etc, the only part I'm
> not sure it can do it use the variable as field name.
>

> >http://archives.postgresql.org/
Here is a function I wrote in python to do something similar.  My timestamp 
fields are of the form tc_ts_update where tc is a table code that can be 
found by looking up the table name in the table_code table. In the plpythonu 
that ships with 8.2 it is no longer necessary to do the relid lookup.  There 
is a TD["table_name"] variable that returns the table name directly.

CREATE OR REPLACE FUNCTION public.ts_update()
RETURNS trigger AS
$Body$
table_oid=TD["relid"]
plan_name=plpy.prepare("SELECT relname FROM pg_class WHERE oid=$1",["oid"])
plan_code=plpy.prepare("SELECT tc_table_code FROM table_code WHERE tc_table_name=$1",["text"])
rs_name=plpy.execute(plan_name,[table_oid])
rs_code=plpy.execute(plan_code,[rs_name[0]["relname"]])
fld_name="_ts_update"
tbl_code=rs_code[0]["tc_table_code"]
full_name=tbl_code+fld_name
TD["new"][full_name]="now()"
return "MODIFY"
$Body$
LANGUAGE plpythonu SECURITY DEFINER;
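
A rough sketch of attaching it (hypothetical table "customer" whose code in
table_code is "cu", so the column being stamped is cu_ts_update):

CREATE TRIGGER customer_ts_update
    BEFORE UPDATE ON customer
    FOR EACH ROW EXECUTE PROCEDURE public.ts_update();
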
-- 
Adrian Klaver
[EMAIL PROTECTED]



Re: [GENERAL] slow speeds after 2 million rows inserted

2006-12-31 Thread Chad Wagner

On 12/31/06, Nikola Milutinovic <[EMAIL PROTECTED]> wrote:


> > 1. There is no difference (speed-wise) between committing every 1K or
> > every 250K rows.
>
> It was really some time ago, since I have experimented with this. My last
> experiment was on PG 7.2 or 7.3. I was inserting circa 800,000 rows.
> Inserting without transactions took 25 hrs. Inserting with 10,000 rows per
> transaction took about 2.5 hrs. So, the speedup was 10x. I have not
> experimented with the transaction batch size, but I suspect that 1,000 would
> not show much speedup.



I would imagine the overhead here is sending the BEGIN/COMMIT (or the
behavior could be version dependent), and the commit forcing a log
flush.  According to the documentation, multiple inserts have the potential
of being flushed in one shot when the database wakes up to do a log flush
automatically, so committing more frequently would actually appear to slow
you down.
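
In other words (a minimal sketch, hypothetical table), batching rows into one
transaction pays for the WAL flush once rather than once per row:

BEGIN;
INSERT INTO measurements (id, val) VALUES (1, 'a');
INSERT INTO measurements (id, val) VALUES (2, 'b');
-- ... many more rows ...
COMMIT;  -- a single log flush covers the whole batch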

The time to commit depends on hardware; on my hardware it was around
40ms (though you have to question the precision of that number; it is probably
less than 40ms, and that is an upper limit under load).

My experiment was with 8.2, default configuration, so there is room for
improvement.  It was installed from Devrim's RPM packages.


> > 2. Vacuuming also makes no difference for a heavy insert-only table, only
> > slows it down.
>
> Makes sense. Since my application was dumping all records each month and
> inserting new ones, vacuum was really needed, but no speedup.



I agree, vacuuming is clearly important.  I would also think that if you are
going to do a massive one-time update/delete as a maintenance item on a
normally read-only table, you should plan on doing a vacuum full to
recover the space used by the "old" rows.  Logically, the fewer pages on
disk, the less I/O will result in scenarios where you are doing
sequential scans and probably even many index scans.

It seems that the MVCC implementation would introduce fragmentation (with
respect to a btree-indexed field) if your table design had an indexed
creation date field and you often range-scanned on that field but also
updated the records; over time the optimizer would favor the index less
as the correlation approached 0.  Obviously this is a great feature for a
"last update date" field.  :)  Not so great if your primary queries are on a
creation date field.



> > 3. Table size plays no real factor.
>
> The reason I saw a speedup must have to do with the fact that without
> transactions, each insert was its own transaction. That was eating
> resources.




I would agree.  I am also reusing the same statement handles (prepare once,
execute many) with DBD::Pg.  The benefit here appears to be that it prepares
the cursor once (one time to parse and generate the execution plan), and
executes the same plan multiple times.  The difference in inserts was about
2000 inserts/s!

This is the one of the reasons why everyone keeps saying use COPY instead of
INSERT, COPY is essentially a one time prepare and execute many.
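
Roughly, at the SQL level the same two ideas look like this (hypothetical
table "item"; just a sketch):

PREPARE ins (integer, text) AS INSERT INTO item (id, name) VALUES ($1, $2);
EXECUTE ins (1, 'first');   -- parsed and planned once, reused for each row
EXECUTE ins (2, 'second');
DEALLOCATE ins;

COPY item (id, name) FROM STDIN WITH DELIMITER ',';  -- one command, many rows
3,third
4,fourth
\.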

Test #1 (prepare once, execute many):
4000 inserts 0.92 secs, 4368.84 inserts/s, commit 0.04 secs.
4000 inserts 0.93 secs, 4303.47 inserts/s, commit 0.02 secs.
4000 inserts 0.93 secs, 4319.78 inserts/s, commit 0.02 secs.
4000 inserts 0.93 secs, 4306.38 inserts/s, commit 0.02 secs.
4000 inserts 0.93 secs, 4299.53 inserts/s, commit 0.02 secs.
4000 inserts 0.92 secs, 4345.44 inserts/s, commit 0.02 secs.
4000 inserts 0.93 secs, 4298.67 inserts/s, commit 0.03 secs.
4000 inserts 0.91 secs, 4382.13 inserts/s, commit 0.04 secs.
4000 inserts 0.92 secs, 4347.44 inserts/s, commit 0.02 secs.
4000 inserts 0.93 secs, 4314.66 inserts/s, commit 0.02 secs.
inserts/s (StdDev) = 28.7435
inserts/s (Mean) = 4328.6351

Test #2 (prepare/execute each time):
4000 inserts 1.92 secs, 2086.21 inserts/s, commit 0.04 secs.
4000 inserts 1.91 secs, 2092.67 inserts/s, commit 0.02 secs.
4000 inserts 1.91 secs, 2094.54 inserts/s, commit 0.05 secs.
4000 inserts 1.96 secs, 2042.55 inserts/s, commit 0.03 secs.
4000 inserts 1.91 secs, 2098.57 inserts/s, commit 0.04 secs.
4000 inserts 1.91 secs, 2098.91 inserts/s, commit 0.03 secs.
4000 inserts 1.91 secs, 2098.38 inserts/s, commit 0.03 secs.
4000 inserts 1.92 secs, 2083.94 inserts/s, commit 0.02 secs.
4000 inserts 1.95 secs, 2050.07 inserts/s, commit 0.03 secs.
4000 inserts 1.92 secs, 2086.14 inserts/s, commit 0.02 secs.
inserts/s (StdDev) = 19.2360
inserts/s (Mean) = 2083.1987


Re: [GENERAL] slow speeds after 2 million rows inserted

2006-12-31 Thread Richard Broersma Jr
> It was really some time ago, since I have experimented with this. My las 
> experiment was on PG
> 7.2 or 7.3. I was inserting cca 800,000 rows. Inserting without transactions 
> took 25 hrs.
> Inserting with 10,000 rows per transaction took about 2.5 hrs. So, the 
> speedup was 10x. I have
> not experimented with the transaction batch size, but I suspect that 1,000 
> would not show much
> speedup.
> 
> > 2. Vacuuming also makes no difference for a heavy insert-only table, only 
> > slows it down.
> 
> Makes sense. Since my application was dumping all records each month and 
> inserting new ones,
> vacuum was really needed, but no speedup.
> 
> > 3. Table size plays no real factor.

Maybe this link may be useful; it contains additional links to various
PostgreSQL performance tests.
http://archives.postgresql.org/pgsql-general/2006-10/msg00662.php

Regards,

Richard Broersma Jr.



Re: [GENERAL] Generic timestamp function for updates where field names vary

2006-12-31 Thread Martijn van Oosterhout
On Sun, Dec 31, 2006 at 08:48:07AM -0800, novnov wrote:
> Why would not be possible in plpgsql? It has loop etc, the only part I'm not
> sure it can do it use the variable as field name.

The "dynamic" in dynamic language usually refers to dynamic typing.
pl/pgsql is a strictly typed language which means one can look at the
code and determine what type each variable will be.

Languages like perl and python are dynamic because what type a variable
is does not matter until runtime and the type can change every time you
run it.

The problem with using a variable as a field name is that you can no
longer tell what type the result will be. It could be any type
supported by postgres. pl/pgsql can't handle that, perl and python
can...

Hope this helps,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [GENERAL] Generic timestamp function for updates where field names vary

2006-12-31 Thread novnov

When responding just now I mentioned loops because another issue I'm working
on involves those; loops wouldn't be involved here.


novnov wrote:
> 
> OK. python would be the preference, if anyone is interested in showing me
> how it would be done, I've never used one of the dynamic languages with
> postgres.
> 
> Why would not be possible in plpgsql? It has loop etc, the only part I'm
> not sure it can do it use the variable as field name.
> 
> 
> David Fetter wrote:
>> 
>> On Sat, Dec 30, 2006 at 10:33:01AM -0800, novnov wrote:
>>> 
>>> The pagila database has generic trigger function called last_updated()
>>> (shown
>>> below) which is used to update timestamp columns in various tables. The
>>> reason I can't use the function 'as is' for my own purposes is that in
>>> my
>>> app the timestamp fields are not all named alike. The field names do
>>> follow
>>> a pattern, two example names would be "user_datem "and "item_datem".
>> 
>> In cases like these, it's better to use a more dynamic language for
>> your trigger like PL/Perl.
>> 
>> Cheers,
>> David.
>>> I know
>>> I could change my db so that all these timestamp fields are named
>>> "datem",
>>> but I'd prefer to keep the names distinct, and of course I don't want to
>>> create a tigger funtion for each table. Using the pagila trigger
>>> function as
>>> a starting point, can someone suggest a solution? I am pretty sure that
>>> a
>>> simple solution would be to pass in the prefix value, and concatenate
>>> with
>>> the common "_datem".  Or is there a better solution? I will give the
>>> approach I've outlined a try, but I'm not even sure it's doable
>>> (primarliy,
>>> using the contatenated field name inplace of the "last-update" in
>>> "NEW.last_update = CURRENT_TIMESTAMP;", that's just stuff I've not done
>>> in
>>> plpgsql)...I'm all thumbs with plpgsql syntax, so anyone that wants to
>>> lay a
>>> solution down would be helping out a lot.
>>> 
>>> >From pagila:
>>> CREATE or REPLACE FUNCTION "public"."last_updated"()
>>> RETURNS "pg_catalog"."trigger" AS 
>>> $BODY$
>>> BEGIN
>>> NEW.last_update = CURRENT_TIMESTAMP;
>>> RETURN NEW;
>>> END
>>> $BODY$
>>> LANGUAGE 'plpgsql' VOLATILE;
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/Generic-timestamp-function-for-updates-where-field-names-vary-tf2899327.html#a8100353
>>> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---(end of broadcast)---
>>> TIP 4: Have you searched our list archives?
>>> 
>>>http://archives.postgresql.org/
>> 
>> -- 
>> David Fetter <[EMAIL PROTECTED]> http://fetter.org/
>> phone: +1 415 235 3778AIM: dfetter666
>>   Skype: davidfetter
>> 
>> Remember to vote!
>> 
>> ---(end of broadcast)---
>> TIP 4: Have you searched our list archives?
>> 
>>http://archives.postgresql.org/
>> 
>> 
> 
> 





Re: [GENERAL] Generic timestamp function for updates where field names vary

2006-12-31 Thread novnov

OK, Python would be the preference, if anyone is interested in showing me how
it would be done; I've never used one of the dynamic languages with
Postgres.

Why would it not be possible in plpgsql? It has loops etc.; the only part I'm
not sure it can do is use a variable as the field name.


David Fetter wrote:
> 
> On Sat, Dec 30, 2006 at 10:33:01AM -0800, novnov wrote:
>> 
>> The pagila database has generic trigger function called last_updated()
>> (shown
>> below) which is used to update timestamp columns in various tables. The
>> reason I can't use the function 'as is' for my own purposes is that in my
>> app the timestamp fields are not all named alike. The field names do
>> follow
>> a pattern, two example names would be "user_datem "and "item_datem".
> 
> In cases like these, it's better to use a more dynamic language for
> your trigger like PL/Perl.
> 
> Cheers,
> David.
>> I know
>> I could change my db so that all these timestamp fields are named
>> "datem",
>> but I'd prefer to keep the names distinct, and of course I don't want to
>> create a tigger funtion for each table. Using the pagila trigger function
>> as
>> a starting point, can someone suggest a solution? I am pretty sure that a
>> simple solution would be to pass in the prefix value, and concatenate
>> with
>> the common "_datem".  Or is there a better solution? I will give the
>> approach I've outlined a try, but I'm not even sure it's doable
>> (primarliy,
>> using the contatenated field name inplace of the "last-update" in
>> "NEW.last_update = CURRENT_TIMESTAMP;", that's just stuff I've not done
>> in
>> plpgsql)...I'm all thumbs with plpgsql syntax, so anyone that wants to
>> lay a
>> solution down would be helping out a lot.
>> 
>> >From pagila:
>> CREATE or REPLACE FUNCTION "public"."last_updated"()
>> RETURNS "pg_catalog"."trigger" AS 
>> $BODY$
>> BEGIN
>> NEW.last_update = CURRENT_TIMESTAMP;
>> RETURN NEW;
>> END
>> $BODY$
>> LANGUAGE 'plpgsql' VOLATILE;
>> -- 
>> View this message in context:
>> http://www.nabble.com/Generic-timestamp-function-for-updates-where-field-names-vary-tf2899327.html#a8100353
>> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>> 
>> 
>> ---(end of broadcast)---
>> TIP 4: Have you searched our list archives?
>> 
>>http://archives.postgresql.org/
> 
> -- 
> David Fetter <[EMAIL PROTECTED]> http://fetter.org/
> phone: +1 415 235 3778AIM: dfetter666
>   Skype: davidfetter
> 
> Remember to vote!
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org/
> 
> 





Re: [GENERAL] slow speeds after 2 million rows inserted

2006-12-31 Thread Nikola Milutinovic
> 1. There is no difference (speed-wise) between committing every 1K or every 
> 250K rows.

It was really some time ago, since I have experimented with this. My last 
experiment was on PG 7.2 or 7.3. I was inserting circa 800,000 rows. Inserting 
without transactions took 25 hrs. Inserting with 10,000 rows per transaction 
took about 2.5 hrs. So, the speedup was 10x. I have not experimented with the 
transaction batch size, but I suspect that 1,000 would not show much speedup.

> 2. Vacuuming also makes no difference for a heavy insert-only table, only 
> slows it down.

Makes sense. Since my application was dumping all records each month and 
inserting new ones, vacuum was really needed, but no speedup.

> 3. Table size plays no real factor.




The reason I saw a speedup must have to do with the fact that without 
transactions, each insert was its own transaction. That was eating resources.

Nix.




Re: [GENERAL] Autovacuum Improvements

2006-12-31 Thread Christopher Browne
A long time ago, in a galaxy far, far away, [EMAIL PROTECTED] (Alvaro Herrera) 
wrote:
> Christopher Browne wrote:
>
>> Seems to me that you could get ~80% of the way by having the
>> simplest "2 queue" implementation, where tables with size < some
>> threshold get thrown at the "little table" queue, and tables above
>> that size go to the "big table" queue.
>> 
>> That should keep any small tables from getting "vacuum-starved."
>
> Hmm, would it make sense to keep 2 queues, one that goes through the
> tables in smaller-to-larger order, and the other one in the reverse
> direction?

Interesting approach; that would mean having just one priority queue
for all the work.  That seems to simplify things a bit, which is a
good thing.

Unifying policies further might have some merit, too.  The worker
processes (that do the vacuuming) could be set up to alternate between
head and tail of the queue.  That is, a worker process could vacuum
the littlest table and then go after the biggest table.  That way,
they'd eat at both ends towards the middle.  Adding more workers could
easily add to the speed at which both ends of the queue get eaten
(assuming you've got the I/O to support having 4 or 5 vacuums running
concurrently).

There is one potentially bad thing with that: the thing we never want
is for all the workers to get busy on the biggest tables so that
little ones are no longer being serviced.  So there needs to be a way
to make sure that there's one worker devoted to "little tables."  I
suppose the rule may be that the 1st worker process *never* goes after
the biggest tables.

That ought to be enough to prevent starvation.

> I am currently writing a design on how to create "vacuum queues" but
> I'm thinking that maybe it's getting too complex to handle, and a
> simple idea like yours is enough (given sufficient polish).

There's plenty to like about coming up with a reasonable set of
heuristics...
-- 
output = ("cbbrowne" "@" "acm.org")
http://linuxdatabases.info/info/slony.html
Rules of the Evil Overlord #191.  "I will not appoint a relative to my
staff of advisors.  Not only  is nepotism the cause of most breakdowns
in policy, but it also causes trouble with the EEOC."

