Re: [HACKERS] insert ... delete ... returning ... ?

2008-02-24 Thread Tom Lane
Mark Mielke <[EMAIL PROTECTED]> writes:
> I'm noticing a massive reduction in on-disk storage required for my 
> database that I believe is primarily attributable to Tom's reduced 
> overhead for short strings.

Twasn't my work; Greg Stark gets most of the credit for that one, and
you might be seeing some benefit from Heikki's work to cut the tuple
header size too.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_dump additional options for performance

2008-02-24 Thread Pavan Deolasee
On Sun, Feb 24, 2008 at 6:52 PM, Jochem van Dieten <[EMAIL PROTECTED]> wrote:
>
>  Or we could have a switch that specifies a directory and have pg_dump
>  split the dump not just in pre-schema, data and post-schema, but also
>  split the data in a file for each table. That would greatly facilitate
>  a parallel restore of the data through multiple connections.
>


How about having a single switch like --optimize and having pg_dump behave
differently based on the "level"? For example, if optimization is
turned off (i.e. -O0), pg_dump just dumps the schema and data. At level 1,
it will dump the pre-schema, data and post-schema.

We can then add more levels and optimize it further. For example, postponing
the creation of non-constraining indexes, splitting the data into
multiple files, etc.
I can also think of adding constructs to the dump that identify what
can be restored in parallel, with pg_restore using that information
during restore.


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] dblink doesn't honor interrupts while waiting a result

2008-02-24 Thread Florian G. Pflug

Hi

dblink in 8.3 blocks without any possibility of interrupting it while
waiting for an answer from the remote server. Here is a strace:
[pid 27607] rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
[pid 27607] sendto(56, "Q\0\0\0008lock table travelhit.booking_code in 
exclusive mode\0", 57, 0, NULL, 0) = 57

[pid 27607] rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
[pid 27607] poll([{fd=56, events=POLLIN|POLLERR}], 1, -1) = ? 
ERESTART_RESTARTBLOCK (To be restarted)

[pid 27607] --- SIGTERM (Terminated) @ 0 (0) ---
[pid 27607] rt_sigreturn(0xf)   = -1 EINTR (Interrupted system call)
[pid 27607] poll(

As you can see, I'm trying to lock the table travelhit.booking_code,
which blocks because someone else is already holding that lock. When
I send a SIGTERM to the backend, the poll() syscall is interrupted -
but immediately restarted.

I'm not sure what a proper fix for this would look like, since the
blocking actually happens inside libpq - but this certainly makes
working with dblink painful...


regards, Florian Pflug

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] insert ... delete ... returning ... ?

2008-02-24 Thread Mark Mielke

Tom Lane wrote:
> "Jonah H. Harris" <[EMAIL PROTECTED]> writes:
>> Not stupid, it doesn't work :)  This was a limitation of the original
>> design based on (IIRC) executor-related issues.
>
> There are definitional issues not only implementation ones; in
> particular, in subquery-like cases it's entirely unclear how many times
> the DML operation will or should get evaluated.


Interesting. Would it be cheating to only allow it in cases where the
evaluation should definitely happen only once? For example, insert ...
delete, create table ... delete, or part of a join expression?


In any case - I don't have the know-how to fix it, and it's certainly
more of a "would be cute" than an "I must have it." I'll settle for my
table locks for now. It's no big deal for my application.


I'm noticing a massive reduction in on-disk storage required for my
database that I believe is primarily attributable to Tom's reduced
overhead for short strings. Some of the tables I am importing have 10-20
short string fields (many 0-length strings!). Unfortunately, I
wasn't looking for this specifically, so I didn't keep my old database
instance around. But from memory, I believe the biggest table now has
1/3 the number of relpages in 8.3 that it had in 8.2. Good job all
around, hackers. Again - *NO* problems. It just works.


Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>



Re: [HACKERS] insert ... delete ... returning ... ?

2008-02-24 Thread Jonah H. Harris
On Sun, Feb 24, 2008 at 5:44 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Jonah H. Harris" <[EMAIL PROTECTED]> writes:
>  > Not stupid, it doesn't work :)  This was a limitation of the original
>  > design based on (IIRC) executor-related issues.
>
>  There are definitional issues not only implementation ones; in
>  particular, in subquery-like cases it's entirely unclear how many times
>  the DML operation will or should get evaluated.

Yup, that's what it was.  I think I remember the trigger-level and
top-level executor-related stuff.  If I'm in that area of the code
soon, I'll see how much would be involved and, if I think I have enough
time, submit a proposal for it.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation                | fax: 732.331.1301
499 Thornall Street, 2nd Floor          | [EMAIL PROTECTED]
Edison, NJ 08837                        | http://www.enterprisedb.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] insert ... delete ... returning ... ?

2008-02-24 Thread Tom Lane
"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> Not stupid, it doesn't work :)  This was a limitation of the original
> design based on (IIRC) executor-related issues.

There are definitional issues not only implementation ones; in
particular, in subquery-like cases it's entirely unclear how many times
the DML operation will or should get evaluated.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] insert ... delete ... returning ... ?

2008-02-24 Thread Jonah H. Harris
On Sun, Feb 24, 2008 at 4:21 PM, Mark Mielke <[EMAIL PROTECTED]> wrote:
>  I tried using insert ... delete ... return ... and get a syntax error:

Yeah...

>  In the past I've executed insert ... select and then the delete.
>  However, I believe there is race condition here as the delete may see
>  more or less rows than the insert ... select. I thought the above would
>  be a clever PostgreSQL-8.3 alternative, but I'm either stupid or it
>  doesn't work... :-)

Not stupid, it doesn't work :)  This was a limitation of the original
design based on (IIRC) executor-related issues.  I've seen about 6 or
so posts now about using DELETE returning in the same manner as you're
discussing, and I agree it would be quite useful.  Unfortunately, with
the amount of changes required to make it work properly, no one has
wanted to pick that up and add it yet :(

Depending on what else I'm working on, I'd like to get this fixed for
8.4.  Though, I'll probably be working on other, more important
projects.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation                | fax: 732.331.1301
499 Thornall Street, 2nd Floor          | [EMAIL PROTECTED]
Edison, NJ 08837                        | http://www.enterprisedb.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] insert ... delete ... returning ... ?

2008-02-24 Thread Guillaume Smet
On Sun, Feb 24, 2008 at 10:39 PM, Guillaume Smet
<[EMAIL PROTECTED]> wrote:
>  On Sun, Feb 24, 2008 at 10:21 PM, Mark Mielke <[EMAIL PROTECTED]> wrote:
>  >  I'm at PostgreSQL 8.3 for my production database and everything is
>  >  working great. I had no problems converting free text search from 8.2 to
>  >  8.3, and I really like the improvements.
>  >
>  >  I tried using insert ... delete ... return ... and get a syntax error:
>  >
>  >  > pccyber=# insert into product_manufacturer_archived (itemno,
>  >  > manufacturer_id)
>
>  I would expect a semicolon here.

Mmmmh, my bad, I missed your point. Sorry for the noise :).

--
Guillaume

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] insert ... delete ... returning ... ?

2008-02-24 Thread Guillaume Smet
Hi Mark,

On Sun, Feb 24, 2008 at 10:21 PM, Mark Mielke <[EMAIL PROTECTED]> wrote:
>  I'm at PostgreSQL 8.3 for my production database and everything is
>  working great. I had no problems converting free text search from 8.2 to
>  8.3, and I really like the improvements.
>
>  I tried using insert ... delete ... return ... and get a syntax error:
>
>  > pccyber=# insert into product_manufacturer_archived (itemno,
>  > manufacturer_id)

I would expect a semicolon here.

>  > pccyber-#   delete from product_manufacturer
>  > pccyber-# where not exists (select * from icitem
>  > pccyber(# where icitem.itemno =
>  > product_manufacturer.itemno and
>  > pccyber(#   not inactive)
>  > pccyber-# returning itemno, manufacturer_id;
>  > ERROR:  syntax error at or near "delete"
>  > LINE 2:   delete from product_manufacturer
>  >   ^

--
Guillaume

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] insert ... delete ... returning ... ?

2008-02-24 Thread Mark Mielke

Hi all:

I'm at PostgreSQL 8.3 for my production database and everything is 
working great. I had no problems converting free text search from 8.2 to 
8.3, and I really like the improvements.


I tried using insert ... delete ... return ... and get a syntax error:

pccyber=# insert into product_manufacturer_archived (itemno, 
manufacturer_id)

pccyber-#   delete from product_manufacturer
pccyber-# where not exists (select * from icitem
pccyber(# where icitem.itemno = 
product_manufacturer.itemno and

pccyber(#   not inactive)
pccyber-# returning itemno, manufacturer_id;
ERROR:  syntax error at or near "delete"
LINE 2:   delete from product_manufacturer
  ^


The goal here is to move inactive records to an archived table. This is
to be performed as part of a daily batch job instead of as a trigger.
Assume my model is correct - my question isn't how I can do this. I
would like to know whether insert .. delete .. returning is intended to
work or not.


In the past I've executed insert ... select and then the delete.
However, I believe there is a race condition here, as the delete may see
more or fewer rows than the insert ... select. I thought the above would
be a clever PostgreSQL-8.3 alternative, but I'm either stupid or it
doesn't work... :-)
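
One lock-based way to avoid the race (just a sketch, reusing the tables
above; the lock blocks other writers for the duration of the batch) would
be:

begin;
-- block concurrent writers so the insert and the delete see the same rows
lock table product_manufacturer in share row exclusive mode;
insert into product_manufacturer_archived (itemno, manufacturer_id)
    select itemno, manufacturer_id
      from product_manufacturer
     where not exists (select * from icitem
                        where icitem.itemno = product_manufacturer.itemno
                          and not inactive);
delete from product_manufacturer
 where not exists (select * from icitem
                    where icitem.itemno = product_manufacturer.itemno
                      and not inactive);
commit;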


Any ideas?

Thanks,
mark

--
Mark Mielke <[EMAIL PROTECTED]>


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Batch update of indexes on data loading

2008-02-24 Thread Simon Riggs
On Thu, 2008-02-21 at 13:26 +0900, ITAGAKI Takahiro wrote:
> This is a proposal of fast data loading using batch update of indexes for 8.4.
> It is a part of pg_bulkload (http://pgbulkload.projects.postgresql.org/) and
> I'd like to integrate it in order to cooperate with other parts of postgres.
> 
> The basic concept is spooling newly arriving data, and merging the spool and
> the existing indexes into a new index at the end of data loading. It is
> 5-10 times faster than per-row index insertion, which is the way it is done in 8.3.
> 
> 
> One of the problems is locking; index building in bulkload is similar to
> REINDEX rather than INSERT, so we need an ACCESS EXCLUSIVE LOCK during it.
> Bulkloading is not an upward-compatible method, so I'm thinking about
> adding a new "WITH LOCK" option for the COPY command.
> 
>   COPY tbl FROM 'datafile' WITH LOCK;
> 

I'm very excited to see these concepts going into COPY.

One of the reasons why I hadn't wanted to pursue earlier ideas to use
LOCK was that applying a lock will prevent running in parallel, which
ultimately may prevent further performance gains.

Is there a way of doing this that will allow multiple concurrent COPYs?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Behaviour of rows containg not-null domains in plpgsql

2008-02-24 Thread Florian G. Pflug

Tom Lane wrote:
> "Florian G. Pflug" <[EMAIL PROTECTED]> writes:
>> Plus, the fact that we don't support "default" specifications in
>> pl/pgsql for row types turns this inconvenience into a major PITA,
>
> You mean initialization expressions, not defaults, correct?  (I would
> consider the latter to mean that whatever attrdef entries were attached
> to the rowtype's parent table would be used implicitly.)

Yeah, I mean writing "declare v_var schema.table default row()"

regards, Florian Pflug

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Behaviour of rows containg not-null domains in plpgsql

2008-02-24 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> Plus, the fact that we don't support "default" specifications in
> pl/pgsql for row types turns this inconvenience into a major PITA,

You mean initialization expressions, not defaults, correct?  (I would
consider the latter to mean that whatever attrdef entries were attached
to the rowtype's parent table would be used implicitly.)

> Is there some difficulty in implementing row-type defaults, or is it
> just that nobody cared enough about them to do the work?

The second statement is certainly true, I don't know about the first.
Feel free to take a shot at it.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Behaviour of rows containg not-null domains in plpgsql

2008-02-24 Thread Florian G. Pflug

Tom Lane wrote:
> "Florian G. Pflug" <[EMAIL PROTECTED]> writes:
>> I just stumbled over the following behaviour, introduced with 8.3,
>> and wondered if this is by design or an oversight.
>
> No, this was in 8.2.

Ah, sorry - I'm porting an app from 8.1 straight to 8.3, and blindly
assumed that it'd have worked with 8.2...

>> If you define a domain over some existing type, constrain it to
>> non-null values, and use that domain as a field type in a table
>> definition, it seems to be impossible to declare pl/pgsql variables
>> of that table's row type. The problem seems to be that upon
>> declaration, the row variable is filled with nulls - but since the
>> domain is marked not-null, that immediately triggers an exception.
>
> What else would you expect it to do?  AFAICS any other behavior would
> be contrary to spec.

It's the inconsistency between row types (where the not-null constraint
in the table definition *doesn't* prevent a declaration like "myvar
mytable" in pl/pgsql) and domains (where the not-null constraint *does*
prevent such a declaration) that bugs me.

Plus, the fact that we don't support "default" specifications in
pl/pgsql for row types turns this inconvenience into a major PITA,
forcing you to use "record" when you know the correct type perfectly
well...

Is there some difficulty in implementing row-type defaults, or is it
just that nobody cared enough about them to do the work?

regards, Florian Pflug


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Joshua D. Drake

Tom Lane wrote:
> Andrew Dunstan <[EMAIL PROTECTED]> writes:
>> Joshua D. Drake wrote:
>>> I would also question the 64KB at a time. Why not a 1024KB (arbitrary)
>>> at a time? Is it a resource issue? In the old days when we actually
>>> had people trying to run postgresql on 128 and 256 megs of ram, o.k.
>>> but now?
>
>> It would be simple enough to change. Try it and see if it actually makes
>> a difference. All you have to change is the define of RAW_BUF_SIZE.
>
> Seems unlikely that making it bigger than (a fraction of) L2 cache
> would be a smart move.
>
> regards, tom lane

O.k. these CPUs have 1meg of L2 so I will try with 512k.

Joshua D. Drake




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Joshua D. Drake

Andrew Dunstan wrote:
> Joshua D. Drake wrote:
>> Heikki Linnakangas wrote:
>>> We read 64 KB at a time, and then CopyReadLineText returns one line
>>> at a time from that buffer.
>>
>> O.k. I am sure I am oversimplifying things but why are we returning
>> one line at a time? That reads expensive to me. Just following the
>> general, don't do inserts one at a time, do them in batch idea for
>> example.
>
> Quite simply because one line corresponds to one record. And yes, I
> believe you are oversimplifying, or under several misapprehensions about
> what can be done at this level.

Well, I find that without questioning, I won't get the answers, so :).

> It would be simple enough to change. Try it and see if it actually makes
> a difference. All you have to change is the define of RAW_BUF_SIZE.

Fair enough. May I assume this is the only place I need to change it?

http://doxygen.postgresql.org/backend_2commands_2copy_8c-source.html#l00158

>> It looks like CopyReadAttributesText is used as part of the column
>> breakup. It also appears that this happens "before" insert right? So
>> if that is the case we are going to pay an additional penalty on the
>> data checking.
>
> What? I don't understand what you are talking about.

Data checking on insert into the DB itself. I have no doubt that I may be
wrong on this.


Joshua D. Drake


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Joshua D. Drake wrote:
>> I would also question the 64KB at a time. Why not a 1024KB (arbitrary) 
>> at a time? Is it a resource issue? In the old days when we actually 
>> had people trying to run postgresql on 128 and 256 megs of ram, o.k. 
>> but now?

> It would be simple enough to change. Try it and see if it actually makes 
> a difference. All you have to change is the define of RAW_BUF_SIZE.

Seems unlikely that making it bigger than (a fraction of) L2 cache
would be a smart move.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Behaviour of rows containg not-null domains in plpgsql

2008-02-24 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> What seems worse is that it still fails even if you declare the domain 
> to have a default value.

Hmm, that seems like it could be a bug.  We don't currently consider
that a rowtype includes the parent table's defaults or constraints.
But if we are going to honor a domain's constraints then maybe the
domain's default has to float along with that.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Andrew Dunstan



Joshua D. Drake wrote:
> Heikki Linnakangas wrote:
>> We read 64 KB at a time, and then CopyReadLineText returns one line
>> at a time from that buffer.
>
> O.k. I am sure I am oversimplifying things but why are we returning
> one line at a time? That reads expensive to me. Just following the
> general, don't do inserts one at a time, do them in batch idea for
> example.

Quite simply because one line corresponds to one record. And yes, I
believe you are oversimplifying, or under several misapprehensions about
what can be done at this level.

> I would also question the 64KB at a time. Why not a 1024KB (arbitrary)
> at a time? Is it a resource issue? In the old days when we actually
> had people trying to run postgresql on 128 and 256 megs of ram, o.k.
> but now?

It would be simple enough to change. Try it and see if it actually makes
a difference. All you have to change is the define of RAW_BUF_SIZE.

> In reading:
>
> http://doxygen.postgresql.org/backend_2commands_2copy_8c-source.html
>
> It looks like CopyReadAttributesText is used as part of the column
> breakup. It also appears that this happens "before" insert right? So
> if that is the case we are going to pay an additional penalty on the
> data checking.

What? I don't understand what you are talking about.

cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Improved (De)Serialization Support

2008-02-24 Thread Andy Pavlo
On Saturday 23 February 2008 00:40, Tom Lane wrote:
> Andy Pavlo <[EMAIL PROTECTED]> writes:
> > I have added support in readfuncs.c to write out Query and PlannedStmt
> > objects using nodeToString() and then read them back in. We needed this
> > so that we could use PREPARE and write the arguments to
> > StorePreparedStatement() out to a file. We are going then read them back
> > into Postgres on a different server and use EXECUTE.
>
> Um, surely Query objects can be written/read already?

I meant specifically UpdateStmt, DeleteStmt, and InsertStmt objects. 
SelectStmt curiously was already in place.

> I'm not very excited about adding a load of code that will never be
> exercised during normal use.  That will inevitably be a hotbed of
> bit-rot.  Now, if the feature that needs it is also being proposed for
> inclusion in core, then maybe we could talk.

The research project that we are working on will not be merged back into
Postgres. The exercise of updating this part of the code for the serialization
that we need was very useful for helping me understand the internals of Postgres
better. I did spend about two days updating the out/read functions, and I
figure that my enhancements would probably be useful for somebody else in the
future.

At the very least you might want to consider taking my minimal changes to 
outfuncs.c, since using nodeToString() is useful for new developers to 
understand the nested structures used by the planner and optimizer.
-- 
Andy Pavlo
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Behaviour of rows containg not-null domains in plpgsql

2008-02-24 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> I just stumbled over the following behaviour, introduced with 8.3, and
> wondered if this is by design or an oversight.

No, this was in 8.2.

> If you define a domain over some existing type, constrain it to
> non-null values, and use that domain as a field type in a table 
> definition, it seems to be impossible to declare pl/pgsql variables
> of that table's row type. The problem seems to be that upon declaration,
> the row variable is filled with nulls - but since the domain is marked
> not-null, that immediately triggers an exception.

What else would you expect it to do?  AFAICS any other behavior would be
contrary to spec.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Joshua D. Drake

Tom Lane wrote:
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>> At some point, I think we have to bite the bullet and find a way to use
>> multiple CPUs for a single load. I don't have any good ideas or plans
>> for that, but hopefully someone does.
>
> As already mentioned upthread, we could do that today, with zero backend
> changes, by making pg_restore drive multiple sessions.  Now there are
> scenarios where this wouldn't help too much --- eg, a database with only
> one enormous table.  We couldn't parallelize the loading of that table,
> although we could parallelize creation of its indexes.  But for an
> example such as JD is complaining about, we ought to have no trouble
> thrashing his disks into the ground ;-)

Bring it on! But I would note that with the current pg_restore I was
not able to thrash my disks. I only used four backends (I could have used
8) but even with that, I was only doing ~45M a second. So if I double,
I still have bandwidth. It would still be a huge improvement though.

> What you would need for this is an -Fc or -Ft dump, because a plain
> script output from pg_dump doesn't carry any dependency information,

I would focus on -Fc. With the limitations of -Ft this would be a good
way to start phasing -Ft out.

> much less any index of where in the file different bits of data are.
> Just armwaving, I envision a multiprocess pg_restore like this:
>
> * one controller process that tracks the not-yet-loaded
>   TOC items in the dump
> * N child processes that just transmit a selected TOC item
>   to a connected backend, and then ask the controller
>   what to do next
>
> Most likely, the bottleneck with this sort of thing would be multiple
> parallel reads from the pg_dump archive file.  Possibly the controller
> process could be taught to schedule COPY and CREATE INDEX operations
> so that not too many processes are trying to read lots of archive
> data at the same time.

A less hacker and more DBA bottleneck will be limiting the number of
backends being created for the restore. We don't really want to have more
than one backend per CPU, otherwise we just start context switching.


Sincerely,

Joshua D. Drake

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Hannes Dorbath

Tom Lane wrote:
> Greg Smith <[EMAIL PROTECTED]> writes:
>> On Fri, 22 Feb 2008, Tom Lane wrote:
>>> Counts are useless here, we need to see the sequence of write locations
>>> to find out if there's a lot of nonconsecutive writes happening.
>>
>> How were you planning to analyze the strace output to quantify that?
>
> I didn't really have any preconceived ideas about that.  I just want to
> see some raw data to see if something shows up.

Isn't blktrace the tool to get that kind of information? Anyway, as the
following threads point out, the problem seems to be somewhere else..



--
Best regards,
Hannes Dorbath

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Joshua D. Drake

Heikki Linnakangas wrote:
> Joshua D. Drake wrote:
>> On Sun, 24 Feb 2008 00:43:18 +
>> "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote:
>>
>> I know that copy is in theory a bulk loader but, when performing the
>> readline how many lines are we reading?  Do we read up to 8192? Or do we
>> shove in say 8megs of data before we invoke DoCopy?
>
> We read 64 KB at a time, and then CopyReadLineText returns one line at a
> time from that buffer.


O.k. I am sure I am oversimplifying things but why are we returning one 
line at a time? That reads expensive to me. Just following the general, 
don't do inserts one at a time, do them in batch idea for example.


I would also question the 64KB at a time. Why not a 1024KB (arbitrary) 
at a time? Is it a resource issue? In the old days when we actually had 
people trying to run postgresql on 128 and 256 megs of ram, o.k. but now?


> Looking at your profile more, and after the memchr patch, the "raw input
> side" of copy, consisting of reading the data from disk in 64KB blocks,
> splitting that into lines, and splitting lines into columns, still takes
> ~20% of the CPU time. I suspect CopyReadAttributesText is the biggest
> culprit there.


In reading:

http://doxygen.postgresql.org/backend_2commands_2copy_8c-source.html

It looks like CopyReadAttributesText is used as part of the column 
breakup. It also appears that this happens "before" insert right? So if 
that is the case we are going to pay an additional penalty on the data 
checking.




> You could avoid the ~8% spent in XLogInsert in PostgreSQL 8.3, by
> creating the table (or truncating it) in the same transaction with the
> COPY.


In the same transaction? Oh that's interesting. So that might be a TODO 
right there. Change pg_dump so it does:


create,copy,create,copy,index

Instead of:

create,create,copy,copy,index



> After that, heap_formtuple is high on the list. I wonder if we could do
> something about that.


Just from the name I am going to guess this is where we turn it into 
something that is actually injected into PostgreSQL.


> I don't see any piece of code that's causing problems. We can shave off
> a few percents here and there I think, but don't expect a 300%
> improvement anytime soon. A few ideas I've thought about are:


Well don't get me wrong, I am not expecting miracles here. I am just 
confounded at the complete lack of performance in this arena. I don't 
think a lot of people recognize what a significant issue this is since 
we don't have in place backups.




> - use a specialized version of strtol, for base 10. That won't help on
> your table, but I've seen strtol consume a significant amount of time on
> tables with numeric/integer columns.
>
> - Instead of pallocing and memcpying the text fields, leave a little bit
> of room between fields in the attribute_buf, and write the varlen header
> there directly. This might help you since your table has a lot of text
> fields.
>
> - Instead of the normal PG function calling conventions, provide
> specialized fastpath input functions for the most common data types.
> InputFunctionCall consumed 4.5% of the CPU time in your profile.
>
> - Use a simpler memory context implementation, that's like a stack with
> no pfree support, for the per-tuple context.




By my calculations you are presenting a possibility of at least ~ 30% 
improvement. That is significant in my book. Hopefully as we explore 
these options we will find others.


Sincerely,

Joshua D. Drake






---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Behaviour of rows containg not-null domains in plpgsql

2008-02-24 Thread Florian G. Pflug

Andrew Dunstan wrote:
> Florian G. Pflug wrote:
>> If you define a domain over some existing type, constrain it to
>> non-null values, and use that domain as a field type in a table
>> definition, it seems to be impossible to declare pl/pgsql variables
>> of that table's row type. The problem seems to be that upon
>> declaration, the row variable is filled with nulls - but since the
>> domain is marked not-null, that immediately triggers an exception.
>>
>> Here is an example
>
> What seems worse is that it still fails even if you declare the
> domain to have a default value.

I didn't try that, but what I *did* try was providing a default value for
the row variable - which doesn't work either, since we do not currently
support row variable defaults.

The only workaround I found was to define the variable as "record".
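
(For illustration, a sketch of that workaround using the example table
from upthread - a record variable has no structure, and hence no domain
constraints to check, until it is first assigned:)

CREATE FUNCTION f_rec() RETURNS VOID AS $$
DECLARE
  v_t record;
BEGIN
  SELECT * INTO v_t FROM t LIMIT 1;
END;
$$ LANGUAGE 'plpgsql' VOLATILE;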

regards, Florian Pflug


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> At some point, I think we have to bite the bullet and find a way to use 
> multiple CPUs for a single load. I don't have any good ideas or plans 
> for that, but hopefully someone does.

As already mentioned upthread, we could do that today, with zero backend
changes, by making pg_restore drive multiple sessions.  Now there are
scenarios where this wouldn't help too much --- eg, a database with only
one enormous table.  We couldn't parallelize the loading of that table,
although we could parallelize creation of its indexes.  But for an
example such as JD is complaining about, we ought to have no trouble
thrashing his disks into the ground ;-)

What you would need for this is an -Fc or -Ft dump, because a plain
script output from pg_dump doesn't carry any dependency information,
much less any index of where in the file different bits of data are.
Just armwaving, I envision a multiprocess pg_restore like this:

* one controller process that tracks the not-yet-loaded
  TOC items in the dump
* N child processes that just transmit a selected TOC item
  to a connected backend, and then ask the controller
  what to do next

The controller would need to use the dependency information to avoid,
eg, handing out a CREATE INDEX command before the parent table was
created and loaded.

One issue is that this couldn't use "-1" single-transaction restoring,
since obviously each child would need its own transaction, and
furthermore would have to commit before going back to the controller
for more work (since dependent TOC items might well get loaded by
a different child later).  That defeats a couple of optimizations that
Simon put in recently.  The one for no XLOG during COPY is not too
hard to see how to re-enable, but I'm not sure what else there was.

Most likely, the bottleneck with this sort of thing would be multiple
parallel reads from the pg_dump archive file.  Possibly the controller
process could be taught to schedule COPY and CREATE INDEX operations
so that not too many processes are trying to read lots of archive
data at the same time.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Behaviour of rows containg not-null domains in plpgsql

2008-02-24 Thread Andrew Dunstan



Florian G. Pflug wrote:
> Hi
>
> I just stumbled over the following behaviour, introduced with 8.3, and
> wondered if this is by design or an oversight.
>
> If you define a domain over some existing type, constrain it to
> non-null values, and use that domain as a field type in a table
> definition, it seems to be impossible to declare pl/pgsql variables
> of that table's row type. The problem seems to be that upon declaration,
> the row variable is filled with nulls - but since the domain is marked
> not-null, that immediately triggers an exception.
>
> Here is an example
> CREATE DOMAIN d AS varchar NOT NULL;
> CREATE TABLE t (txt d);
> CREATE FUNCTION f() RETURNS VOID AS $$
> DECLARE
>   v_t t;
> BEGIN
> END;
> $$ LANGUAGE 'plpgsql' VOLATILE;
> SELECT f();
>
> Note that the following works.
> CREATE TABLE t2 (txt varchar not null);
> CREATE FUNCTION f2() RETURNS VOID AS $$
> DECLARE
>   v_t t2;
> BEGIN
> END;
> $$ LANGUAGE 'plpgsql' VOLATILE;
> SELECT f2();
>
> If you guys agree that this is a bug, I'll try to find a fix and send
> a patch.





What seems worse is that it still fails even if you declare the domain 
to have a default value.
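
For instance, a variant of the example with a domain default (a sketch
along the same lines; d3/t3/f3 are just new names):

CREATE DOMAIN d3 AS varchar NOT NULL DEFAULT '';
CREATE TABLE t3 (txt d3);
CREATE FUNCTION f3() RETURNS VOID AS $$
DECLARE
  v_t t3;
BEGIN
END;
$$ LANGUAGE 'plpgsql' VOLATILE;
SELECT f3();  -- still fails with a not-null violation for the domain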


cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Behaviour of rows containg not-null domains in plpgsql

2008-02-24 Thread Florian G. Pflug

Hi

I just stumbled over the following behaviour, introduced with 8.3, and
wondered if this is by design or an oversight.

If you define a domain over some existing type, constrain it to
non-null values, and use that domain as a field type in a table 
definition, it seems to be impossible to declare pl/pgsql variables

of that table's row type. The problem seems to be that upon declaration,
the row variable is filled with nulls - but since the domain is marked
not-null, that immediately triggers an exception.

Here is an example
CREATE DOMAIN d AS varchar NOT NULL;
CREATE TABLE t (txt d);
CREATE FUNCTION f() RETURNS VOID AS $$
DECLARE
  v_t t;
BEGIN
END;
$$ LANGUAGE 'plpgsql' VOLATILE;
SELECT f();

Note that the following works.
CREATE TABLE t2 (txt varchar not null);
CREATE FUNCTION f2() RETURNS VOID AS $$
DECLARE
  v_t t2;
BEGIN
END;
$$ LANGUAGE 'plpgsql' VOLATILE;
SELECT f2();

If you guys agree that this is a bug, I'll try to find a fix and send a 
patch.


greetings, Florian Pflug


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] idea: simple variadic functions in SQL and PL/pgSQL

2008-02-24 Thread Pavel Stehule
Hello,

I found an easy implementation of variadic functions. It's based on an
adaptation of FuncnameGetCandidates. When a variadic function is found,
the right number of trailing arguments has to be created (the difference
between pronargs and nargs). A variadic function can be marked via a flag
or via some pseudotype. A flag is better - it allows variadic arguments of
any type. In static languages (like SQL or PL/pgSQL) the variadic
arguments can be accessed via an array (variadic arguments can only be
non-array). This isn't a problem in the C language, where the arguments
are available directly.

Sample:

CREATE OR REPLACE FUNCTION Least(anyelement)
RETURNS anyelement AS $$
  SELECT MIN($1[i])
 FROM generate_series(1, array_upper($1,1)) g(i);
$$  LANGUAGE SQL IMMUTABLE VARIADIC;
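
A call would then look something like this (hypothetical, since the
proposed VARIADIC marker doesn't exist yet; the trailing arguments would
be collected into the array bound to $1):

SELECT Least(3, 1, 2);  -- should return 1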

This sample is really simple. The goal is to support sophisticated libraries
like JSON support: http://www.mysqludf.org/lib_mysqludf_json/index.php

The main change is in FuncnameGetCandidates:

if (!OidIsValid(variadic_oid))
{
memcpy(newResult->args, procform->proargtypes.values,
  pronargs * sizeof(Oid));
}
else
{
int j;
/* copy nonvariadic parameters */
memcpy(newResult->args, procform->proargtypes.values,
pronargs * sizeof(Oid));
/* set variadic parameters, !!! */
for (j = pronargs - 1; j < nargs; j++)
  newResult->args[j] = variadic_oid;
}

I invite any ideas and notes.

Regards
Pavel Stehule

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread andy

Heikki Linnakangas wrote:
> Joshua D. Drake wrote:
>> On Sun, 24 Feb 2008 00:43:18 +
>> "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote:
>>
>>> Incidentally, I've been working on a patch to speed up CopyReadLine.
>>> I was going to run some more tests first, but since we're talking
>>> about it, I guess I should just post the patch. I'll post to
>>> pgsql-patches shortly.
>>
>> On your post to patches you mentioned only about a 5% improvement.
>> Don't get me wrong, 5% is 5% and I respect it greatly but as far as I
>> can tell we are about 300% behind the curve.
>
> Yeah. Looking at the profile, the time is spent really all over the
> place. There's no one clear bottleneck to focus on. I think we could do
> a few more ~5% improvements, but
>
> At some point, I think we have to bite the bullet and find a way to use
> multiple CPUs for a single load. I don't have any good ideas or plans
> for that, but hopefully someone does.

There was talk elsewhere about making pg_dump/restore smarter.  It could
create tables, then COPY, and create the indexes last.  Add to that
pg_restore using multiple connections and you'd have it.  One connection
could do a COPY, then a second connection could be created to start the
CREATE INDEX's for that table, while the first connection went on to
COPY the next table.


Or something like that...

-Andy

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] pg_dump additional options for performance

2008-02-24 Thread Jochem van Dieten
On Mon, Feb 11, 2008 at 5:51 PM, Tom Lane wrote:
>  I agree.  Since any multiple-output-file case can't usefully use stdout,
>  I think we should combine the switches and just have one switch that
>  says both that you want separated output and what the target filename
>  is.  Thus something like
>
> --pre-schema-file = foo
> --data-file = bar
> --post-schema-file = baz

Or we could have a switch that specifies a directory and have pg_dump
split the dump not just in pre-schema, data and post-schema, but also
split the data in a file for each table. That would greatly facilitate
a parallel restore of the data through multiple connections.

Jochem

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] RFP: Recursive query in 8.4

2008-02-24 Thread Gregory Stark

[This message is mostly for the benefit of the list -- he and I already talked
a bit about this here at FOSDEM. Ishii-san, if you have a chance we should sit
down and talk about this in more detail before we leave!]


Tatsuo Ishii wrote:
>> On Tue, Feb 19, 2008 at 3:36 AM, Tatsuo Ishii <[EMAIL PROTECTED]> wrote:
>> 
> I hope so. But the first thing I would like to do is, to implement the
> right thing (i.e. following the standard).
>
> I don't see any reason that the proposal gets less performance than
> existing functions.  Moreover the proposal could better cooperate with
> the optimizer since it can feed more info to it. Any ideas to enhance
> the performance are welcome.

I agree about following the standard but I think it's true that the standard
creates some challenges for the optimizer.

The standard recursive query syntax is quite general. It can represent
arbitrary non-linear recursive queries including possibly mutually recursive
queries, for example. The challenge is that there are no extra hints when you
have the more usual case of a simple linear recursion.
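
(For concreteness, a typical linear case in the standard syntax would be
something along these lines - just a sketch, assuming a table
edges(parent_id, child_id):

WITH RECURSIVE subtree (id) AS (
    SELECT child_id FROM edges WHERE parent_id = 1
  UNION ALL
    SELECT e.child_id
      FROM edges e
      JOIN subtree s ON e.parent_id = s.id
)
SELECT id FROM subtree;

The recursive term references "subtree" only once, which is what makes it
linear.)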

You really do want to discover such linear recursive structures because you
can use simpler algorithms and recover memory sooner if you know you have a
linear recursive query. You can also support the SEARCH and CYCLE clauses to
do depth-first searches which you can't do for arbitrary recursive queries. I
also don't have much hope for good optimizer estimates for general recursive
queries but for linear recursive queries we can probably do better.

But I think (surprisingly) it's actually easier to implement the general case
than the special nodes to handle the linear case more efficiently. To handle
the general case we need the memoize node to handle recursive loops in the
plan and then we can use otherwise normal plan nodes.

My plan was to implement the general case first, then look for ways to add
intelligence in the planner to discover linearity and add new paths to take
advantage of it.

-- 
greg


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] RFP: Recursive query in 8.4

2008-02-24 Thread Tatsuo Ishii
> On Tue, Feb 19, 2008 at 3:36 AM, Tatsuo Ishii <[EMAIL PROTECTED]> wrote:
> >  We propose to implement the recursive query (WITH RECURSIVE clause)
> >  defined in SQL:1999 and later. With the recursive query, one can
> >  easily inquire the data expressed as tree and graph structures.  The
> >  actual syntax we prefer is the one defined in SQL:2008 (it's not
> >  published yet, but I have a closest draft).
> 
> I am sure you are aware of various ad hoc approaches that are
> currently possible.  The recursive clause seems to generalize these
> approaches.
> 
> Do you expect that your proposed solution will have performance
> advantages over solutions like using recursive functions and (for tree
> organized data) arrays?

I hope so. But the first thing I would like to do is, to implement the
right thing (i.e. following the standard).

I don't see any reason that the proposal gets less performance than
existing functions.  Moreover the proposal could better cooperate with
the optimizer since it can feed more info to it. Any ideas to enhance
the performance are welcome.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Heikki Linnakangas

Heikki Linnakangas wrote:
> I don't see any piece of code that's causing problems.


Meant to say: I don't see any *single* piece of code that's causing the 
problems...


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] 8.3 / 8.2.6 restore comparison

2008-02-24 Thread Heikki Linnakangas

Joshua D. Drake wrote:
> On Sun, 24 Feb 2008 00:43:18 +
> "Heikki Linnakangas" <[EMAIL PROTECTED]> wrote:
>
>> Incidentally, I've been working on a patch to speed up CopyReadLine.
>> I was going to run some more tests first, but since we're talking
>> about it, I guess I should just post the patch. I'll post to
>> pgsql-patches shortly.
>
> On your post to patches you mentioned only about a 5% improvement.
> Don't get me wrong, 5% is 5% and I respect it greatly but as far as I
> can tell we are about 300% behind the curve.


Yeah. Looking at the profile, the time is spent really all over the 
place. There's no one clear bottleneck to focus on. I think we could do 
a few more ~5% improvements, but


At some point, I think we have to bite the bullet and find a way to use 
multiple CPUs for a single load. I don't have any good ideas or plans 
for that, but hopefully someone does.


> My tests were maxing out at ~22G an hour. On hardware that can do
> 360G an hour, and that is assuming > 50% overhead between OS, libs,
> etc... I have no choice but to conclude we have a much, much deeper and
> fundamental issue going on with COPY. I am inspired by Itagaki Takahiro
> and his batch update of indexes which should help greatly overall but
> doesn't help my specific issue.


Yep, the index build idea is an I/O improvement, not a CPU one.


> Forgive me for not being a C programmer and Alvaro is not online so I
> would vet these questions with him first.
>
> I know that copy is in theory a bulk loader but, when performing the
> readline how many lines are we reading?  Do we read up to 8192? Or do we
> shove in say 8megs of data before we invoke DoCopy?


We read 64 KB at a time, and then CopyReadLineText returns one line at a 
time from that buffer.


Looking at your profile more, and after the memchr patch, the "raw input 
side" of copy, consisting of reading the data from disk in 64KB blocks, 
splitting that into lines, and splitting lines into columns, still takes 
~20% of the CPU time. I suspect CopyReadAttributesText is the biggest 
culprit there.


You could avoid the ~8% spent in XLogInsert in PostgreSQL 8.3, by 
creating the table (or truncating it) in the same transaction with the COPY.
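
Something along these lines (a sketch; the table name and file path are
just placeholders, and the WAL skip only applies when WAL archiving is
not enabled):

BEGIN;
TRUNCATE my_table;   -- or CREATE TABLE my_table (...) in the same transaction
COPY my_table FROM '/path/to/data.copy';
COMMIT;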


After that, heap_formtuple is high on the list. I wonder if we could do 
something about that.



> I am just curious if there is some simple low hanging fruit that is
> possibly missing.


I don't see any piece of code that's causing problems. We can shave off 
a few percents here and there I think, but don't expect a 300% 
improvement anytime soon. A few ideas I've thought about are:


- use a specialized version of strtol, for base 10. That won't help on 
your table, but I've seen strtol consume a significant amount of time on 
tables with numeric/integer columns.


- Instead of pallocing and memcpying the text fields, leave a little bit 
of room between fields in the attribute_buf, and write the varlen header 
there directly. This might help you since your table has a lot of text 
fields.


- Instead of the normal PG function calling conventions, provide 
specialized fastpath input functions for the most common data types. 
InputFunctionCall consumed 4.5% of the CPU time in your profile.


- Use a simpler memory context implementation, that's like a stack with 
no pfree support, for the per-tuple context.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly