Re: [HACKERS] ecpg leaves broken files around

2007-06-11 Thread Michael Meskes
On Sun, Jun 10, 2007 at 09:56:44PM +0200, Magnus Hagander wrote:
 AFAIK, most other compilers delete their output if it's not valid. Is
 there any particular reason why ecpg doesn't do this?

No, not really. Sometimes it comes in handy to see what was already
processed, but you're right, it's not what I would expect from a
compiler either.

Any objections to changing this behaviour?

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-11 Thread ITAGAKI Takahiro
Heikki Linnakangas [EMAIL PROTECTED] wrote:

 True. On the other hand, if we issue writes in essentially random order, 
 we might fill the kernel buffers with random blocks and the kernel needs 
 to flush them to disk as almost random I/O. If we did the writes in 
 groups, the kernel has better chance at coalescing them.

If the kernel can treat sequential writes better than random writes, 
is it worth sorting dirty buffers in block order per file at the start
of checkpoints? Here is the pseudo code:

  buffers_to_be_written =
  SELECT buf_id, tag FROM BufferDescriptors
    WHERE (flags & BM_DIRTY) != 0 ORDER BY tag.rnode, tag.blockNum;
  for { buf_id, tag } in buffers_to_be_written:
  if BufferDescriptors[buf_id].tag == tag:
  FlushBuffer(BufferDescriptors[buf_id])

We can also avoid writing buffers newly dirtied after the checkpoint was
started with this method.
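For illustration, here is a minimal, self-contained C sketch of the same idea
(the structs and the flush callback are simplified stand-ins, not actual
PostgreSQL buffer-manager APIs):

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned rnode; unsigned blockNum; } BufTag;   /* simplified buffer tag */
typedef struct { BufTag tag; bool dirty; } BufferDescSim;       /* simplified descriptor */
typedef struct { int buf_id; BufTag tag; } ToWrite;             /* snapshot entry */

static int
cmp_to_write(const void *a, const void *b)
{
    const ToWrite *x = a, *y = b;

    if (x->tag.rnode != y->tag.rnode)
        return x->tag.rnode < y->tag.rnode ? -1 : 1;
    if (x->tag.blockNum != y->tag.blockNum)
        return x->tag.blockNum < y->tag.blockNum ? -1 : 1;
    return 0;
}

/* Snapshot the dirty buffers, sort them by (file, block), and flush them;
 * flush_fn stands in for FlushBuffer().  Returns the number written. */
int
checkpoint_sorted_writes(BufferDescSim *bufs, int nbuffers, void (*flush_fn)(int buf_id))
{
    ToWrite *list = malloc(sizeof(ToWrite) * nbuffers);
    int      n = 0, written = 0;

    for (int i = 0; i < nbuffers; i++)          /* snapshot dirty buffers at checkpoint start */
        if (bufs[i].dirty)
        {
            list[n].buf_id = i;
            list[n].tag = bufs[i].tag;
            n++;
        }

    qsort(list, n, sizeof(ToWrite), cmp_to_write);   /* file/block order */

    for (int i = 0; i < n; i++)
    {
        /* A changed tag means the buffer was recycled since the snapshot and
         * now holds a page dirtied after the checkpoint started: skip it. */
        if (memcmp(&bufs[list[i].buf_id].tag, &list[i].tag, sizeof(BufTag)) == 0)
        {
            flush_fn(list[i].buf_id);
            written++;
        }
    }

    free(list);
    return written;
}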


 I tend to agree that if the goal is to finish the checkpoint as quickly 
 as possible, the current approach is better. In the context of load 
 distributed checkpoints, however, it's unlikely the kernel can do any 
 significant overlapping since we're trickling the writes anyway.

Some kernels or storage subsystems treat all I/Os too fairly, so user
transactions waiting for reads are blocked by checkpoint writes. That
behavior is unavoidable, but we can split the writes into small batches.


 I'm starting to feel we should give up on smoothing the fsyncs and 
 distribute the writes only, for 8.3. As we get more experience with that 
 and its shortcomings, we can enhance our checkpoints further in 8.4.

I agree with distributing only the writes for 8.3. The new parameters
introduced by it (checkpoint_write_percent and checkpoint_write_min_rate)
should remain alive without major changes in the future, but the other
parameters seem more volatile.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Truncate Permission

2007-06-11 Thread Ewald Geschwinde

Yes, there is a use-case for it: you don't have triggers or
transactional concerns on the table and you want users to be able to
truncate tables while not allowing them to do things like change the
table structure.  I proposed a patch a while ago to implement a separate
permission for truncate but it was turned down because of concern over
using the few remaining bits in the ACL structure.




I second this proposal.
My problem is that some users don't have access to change the structure, but
they want to delete all data from the table.
They try TRUNCATE - it does not work because they are not the owner.
So they do a DELETE on a really big table.

So I would like to see a truncate permission - it makes some things easier in
my opinion.
--
Ewald Geschwinde
http://www.postgresql.at


Re: [HACKERS] So, why isn't *every* buildfarm member failing ecpg right now?

2007-06-11 Thread Kris Jurka



On Mon, 4 Jun 2007, Kris Jurka wrote:


On Mon, 4 Jun 2007, Andrew Dunstan wrote:


turnip_moth is also a Solaris 9 box and doesn't seem to have the same issue.

Kris, is there anything unusual installed on the box that would make it 
behave like this?




Not sure what's going on here.  I did a manual run of the ecpg tests and it 
completed normally.  This machine is quite out of date and it has a large mix 
of GNU tools with the Solaris ones.  Since turnip_moth is maintained by Sun I 
would expect it to be up to date on patches and have few non-Sun tools 
installed.  So it could be using a different interpreter or it could be using 
a broken tool for which a patch has been released. I'll try doing a run with 
the buildfarm client later today to try and reproduce this.




Running ecpg's pg_regress script with -x yields:

+ cp connect/test2.c results/connect-test2.c
+ connect/test2
+ mv results/connect-test2.c results/connect-test2.c.tmp
+ cat results/connect-test2.c.tmp
+ sed -e s,^\(#line [0-9]*\) .*/\([^/]*\),\1 \2,
+ rm results/connect-test2.c.tmp
+ [ yes = yes ]
./pg_regress: bad substitution

So it looks like it's bailing on this line:

   if [ $enable_threading = yes ] && [ ${i%%/*} = thread ]; then

and it doesn't like the ${i%%/*} construct.  Still not sure why it happens 
some places and not others.


Kris Jurka

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-11 Thread Greg Smith

On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:

If the kernel can treat sequential writes better than random writes, is 
it worth sorting dirty buffers in block order per file at the start of 
checkpoints?


I think it has the potential to improve things.  There are three obvious 
arguments and one subtle one against it that I can think of:


1) Extra complexity for something that may not help.  This would need some 
good, robust benchmarking improvements to justify its use.


2) Block number ordering may not reflect actual order on disk.  While 
true, it's got to be better correlated with it than writing at random.


3) The OS disk elevator should be dealing with this issue, particularly 
because it may really know the actual disk ordering.


Here's the subtle thing:  by writing in the same order the LRU scan occurs 
in, you are writing dirty buffers in the optimal fashion to eliminate 
client backend writes during BufferAlloc.  This makes the checkpoint a 
really effective LRU clearing mechanism.  Writing in block order will 
change that.


I spent some time trying to optimize the elevator part of this operation, 
since I knew that on the system I was using block order was actual order. 
I found that under Linux, the behavior of the pdflush daemon that manages 
dirty memory had a more serious impact on writing behavior at checkpoint 
time than playing with the elevator scheduling method did.  The way 
pdflush works actually has several interesting implications for how to 
optimize this patch.  For example, how writes get blocked when the dirty 
memory reaches certain thresholds means that you may not get the full 
benefit of the disk elevator at checkpoint time the way most would expect.


Since much of that was basically undocumented, I had to write my own 
analysis of the actual workings, which is now available at 
http://www.westnet.com/~gsmith/content/linux-pdflush.htm  Anyone who 
wants more information about how Linux kernel parameters like 
dirty_background_ratio actually work, and how they impact the writing 
strategy, should find that article helpful.


Some kernels or storage subsystems treat all I/Os too fairly so that 
user transactions waiting for reads are blocked by checkpoints writes.


In addition to that (which I've seen happen quite a bit), in the Linux 
case another fairness issue is that the code that handles writes allows a 
single process writing a lot of data to block writes for everyone else. 
That means that in addition to being blocked on actual reads, if a client 
backend starts a write in order to complete a buffer allocation to hold 
new information, that can grind to a halt because of the checkpoint 
process as well.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] ecpg leaves broken files around

2007-06-11 Thread Magnus Hagander
On Mon, Jun 11, 2007 at 08:05:16AM +0200, Michael Meskes wrote:
 On Sun, Jun 10, 2007 at 09:56:44PM +0200, Magnus Hagander wrote:
  AFAIK, most other compilers delete their output if it's not valid. Is
  there any particular reason why ecpg doesn't do this?
 
 No, not really. Sometimes it comes in handy to see what was already
 processed, but you're right, it's not what I would expect from a
 compiler either.
 
 Any objections to changing this behaviour?

Certainly not from me :)

If you find the other behaviour useful, perhaps add a commandline switch
that makes it leave the file around? Just make the
remove-the-file-on-failure default.

Oh, and it seems if you want to keep the feature, it needs fixing. It looks
like ecpg doesn't flush/close the file descriptor before error-exit, so the
file that drops out isn't even complete up to the point of error. At least
that's what it looks like from a quick glance - it ends mid-row on
something that's not related to the error itself.

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] ToDO, support UPDATE/DELETE WHERE CURRENT OF cursor_name in plpgsql

2007-06-11 Thread Pavel Stehule

Hello,

Updatable cursors can be supported in PL/pgSQL.

Explicit PL/pgSQL cursors are named, so we only need to skip the CURRENT OF
clause when parsing SQL statements.

Regards
Pavel Stehule

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Truncate Permission

2007-06-11 Thread Martijn van Oosterhout
On Mon, Jun 11, 2007 at 09:40:08AM +0200, Ewald Geschwinde wrote:
 My problem is that some users don't have access to change the structure, but
 they want to delete all data from the table.
 They try TRUNCATE - it does not work because they are not the owner.
 So they do a DELETE on a really big table.

Wouldn't it be far more logical to decide that if a user has the
permissions to do a DELETE FROM table; then they have permission to do
a TRUNCATE? Why make an additional permission?

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Truncate Permission

2007-06-11 Thread Heikki Linnakangas

Martijn van Oosterhout wrote:

On Mon, Jun 11, 2007 at 09:40:08AM +0200, Ewald Geschwinde wrote:

My problem is that some users don't have access to change the structure, but
they want to delete all data from the table.
They try TRUNCATE - it does not work because they are not the owner.
So they do a DELETE on a really big table.


Wouldn't it be far more logical to decide that if a user has the
permissions to do a DELETE FROM table; then they have permission to do
a TRUNCATE? Why make an additional permission?


Truncate doesn't fire ON DELETE triggers.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-11 Thread Heikki Linnakangas

ITAGAKI Takahiro wrote:

Heikki Linnakangas [EMAIL PROTECTED] wrote:

True. On the other hand, if we issue writes in essentially random order, 
we might fill the kernel buffers with random blocks and the kernel needs 
to flush them to disk as almost random I/O. If we did the writes in 
groups, the kernel has better chance at coalescing them.


If the kernel can treat sequential writes better than random writes, 
is it worth sorting dirty buffers in block order per file at the start
of checkpoints? Here is the pseudo code:

  buffers_to_be_written =
  SELECT buf_id, tag FROM BufferDescriptors
    WHERE (flags & BM_DIRTY) != 0 ORDER BY tag.rnode, tag.blockNum;
  for { buf_id, tag } in buffers_to_be_written:
  if BufferDescriptors[buf_id].tag == tag:
  FlushBuffer(BufferDescriptors[buf_id])

We can also avoid writing buffers newly dirtied after the checkpoint was
started with this method.


That's worth testing, IMO. Probably won't happen for 8.3, though.

I tend to agree that if the goal is to finish the checkpoint as quickly 
as possible, the current approach is better. In the context of load 
distributed checkpoints, however, it's unlikely the kernel can do any 
significant overlapping since we're trickling the writes anyway.


Some kernels or storage subsystems treat all I/Os too fairly, so user
transactions waiting for reads are blocked by checkpoint writes. That
behavior is unavoidable, but we can split the writes into small batches.


That's really the heart of our problems. If the kernel had support for 
prioritizing the normal backend activity and LRU cleaning over the 
checkpoint I/O, we wouldn't need to throttle the I/O ourselves. The 
kernel has the best knowledge of what it can and can't do, and how busy 
the I/O subsystems are. Recent Linux kernels have some support for read 
I/O priorities, but not for writes.


I believe the best long term solution is to add that support to the 
kernel, but it's going to take a long time until that's universally 
available, and we have a lot of platforms to support.


I'm starting to feel we should give up on smoothing the fsyncs and 
distribute the writes only, for 8.3. As we get more experience with that 
and its shortcomings, we can enhance our checkpoints further in 8.4.


I agree with distributing only the writes for 8.3. The new parameters
introduced by it (checkpoint_write_percent and checkpoint_write_min_rate)
should remain alive without major changes in the future, but the other
parameters seem more volatile.


I'm going to start testing with just distributing the writes. Let's see 
how far that gets us.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] little PITR annoyance

2007-06-11 Thread ohp
Hi Simon,

I'll be glad to test it for you when you're ready!
Thanks for looking at this issue.

Best regards,
On Sun, 10 Jun 2007, Simon Riggs wrote:

 Date: Sun, 10 Jun 2007 23:55:32 +0100
 From: Simon Riggs [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: pgsql-hackers list pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] little PITR annoyance

 On Sun, 2007-06-10 at 20:48 +0200, [EMAIL PROTECTED] wrote:
  
My question was: why don't we start the archiving *BEFORE* postmaster to
make room.
  
   The archiver is executed from the postmaster, so that's not possible.
  
  I'm aware of that; my point is maybe the archiver doesn't need the postmaster
  to be fully operational (responding to db requests) to be started.
   We could investigate where best to put some code, but it wouldn't be
   executed very frequently.
  I agree. OTOH, the more PITR is used on big, busy DBs, the more this may
  happen.
  
   Why not just execute the archive_command in a script, replacing
   the .ready with .done files in the archive_status directory when it's
   processed?
  
  Sure, but if *I* can do it, why can't the system?
 
  What do you think,

 Just looked at the code. Does seem possible to start archiver earlier -
 it has no hooks into anything else and doesn't need transactions.

 Starting archiver earlier would not be the only change required, since
 recovery could be very short. That will take some thought on how to
 resolve.

 I have other things pressing on me now, but I'll add this to my todo,
 though I'll be relying on you to test it when I get round to it.



-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges+33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Truncate Permission

2007-06-11 Thread Zeugswetter Andreas ADI SD

  Wouldn't it be far more logical to decide that if a user has the 
  permissions to do a DELETE FROM table; then they have permission to do 
  a TRUNCATE? Why make an additional permission?
 
 Truncate doesn't fire ON DELETE triggers.

Yes, but it would imho be ok if there aren't any ON DELETE triggers on
the table.

Andreas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] one click install?

2007-06-11 Thread Timasmith
Hi,

I can probably figure it out on Linux, but I would like to do a
one-click install based upon defined defaults for the PostgreSQL database
(creating it as a service and loading my SQL file which creates the
database) - has anyone written such a how-to?

thanks

Tim


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Truncate Permission

2007-06-11 Thread Stephen Frost
* Zeugswetter Andreas ADI SD ([EMAIL PROTECTED]) wrote:
 
   Wouldn't it be far more logical to decide that if a user has the 
   permissions to do a DELETE FROM table; then they have permission to do 
   a TRUNCATE? Why make an additional permission?
  
  Truncate doesn't fire ON DELETE triggers.
 
 Yes, but it would imho be ok if there aren't any ON DELETE triggers on
 the table.

Nope, it doesn't follow MVCC rules properly either.  It really needs to
be a separate permission.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ecpg leaves broken files around

2007-06-11 Thread Michael Meskes
On Mon, Jun 11, 2007 at 10:03:57AM +0200, Magnus Hagander wrote:
 If you find the other behaviour useful, perhaps add a commandline switch
 that makes it leave the file around? Just make the
 remove-the-file-on-failure default.

Should be fixed now. I don't think such a command line switch is needed.
After all you could run ecpg with -o - and make it output to stdout.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Performance regression on CVS head

2007-06-11 Thread Heikki Linnakangas

Heikki Linnakangas wrote:

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:
I tried to repeat the DBT-2 runs with the oldestxmin refresh patch, 
but to my surprise the baseline run with CVS head, without the patch, 
behaved very differently than it did back in March.


I reran a shorter 1h test with CVS head from May 20th and from March 
6th (which is when I ran the earlier tests), and something has 
clearly changed between those dates that affects the test. Test 
run 248 is with a CVS checkout from May 20th, and 249 is from March 6th:


May 20th is not quite my idea of HEAD ;-).  It might be worth checking
current code before investing any think-time on this.  But having said
that, it looks a bit like a planner problem --- if I'm reading the
graphs correctly, I/O wait time goes through the roof, suggesting a
change to a much less efficient plan.


I tracked this down to the patch to enable plan invalidation for SPI plans:

http://archives.postgresql.org/pgsql-committers/2007-03/msg00136.php

Apparently the vacuum causes a plan invalidation and a worse plan is 
chosen. I'll dig deeper into which queries are being affected and why. 
Unless someone has any better ideas.


Ok, found it. The plan for the stock-level transaction changed as a result 
of a lot of dead tuples in the district table.


I turned autovacuum on for the small, frequently-updated tables, and 
that fixed the problem.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] What's with the StartDb:2 failures on skylark?

2007-06-11 Thread Andrew Dunstan



Tom Lane wrote:

Every so often, buildfarm member skylark reports a StartDb:2 failure,
as for instance here:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=skylark&dt=2007-06-10%2023:00:01
but the logs contain no trace of any actual failure.  What's happening,
and why is the buildfarm failing to record any useful information?

  



There is certainly something odd happening on the box. Magnus and I 
discussed it about a month ago and he was going to test something I 
suggested, but I'm not sure if he did. I have committed some small 
buildfarm changes that might take care of things. Magnus, can you please 
upgrade your run_build.pl to CVS HEAD (1.83) and we'll continue to keep 
an eye on things?


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] What's with the StartDb:2 failures on skylark?

2007-06-11 Thread Magnus Hagander
On Mon, Jun 11, 2007 at 10:23:06AM -0400, Andrew Dunstan wrote:
 
 
 Tom Lane wrote:
 Every so often, buildfarm member skylark reports a StartDb:2 failure,
 as for instance here:
 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=skylark&dt=2007-06-10%2023:00:01
 but the logs contain no trace of any actual failure.  What's happening,
 and why is the buildfarm failing to record any useful information?
 
   
 
 
 There is certainly something odd happening on the box. Magnus and I 
 discussed it about a month ago and he was going to test something I 
 suggested, but I'm not sure if he did. I have committed some small 
  buildfarm changes that might take care of things. Magnus, can you please 
 upgrade your run_build.pl to CVS HEAD (1.83) and we'll continue to keep 
 an eye on things?

Done.

FWIW, I think you wanted me to stick sleeps at some places in the code, but
I never managed to figure out where to put them. So if you want them in
somewhere, please tell me where to put them and I'll make sure to do that
as well.

//Magnus


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] COPYable logs status

2007-06-11 Thread Andrew Dunstan



Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:
  

The idea of one pipe per process is not really workable, because it
would mean having as many pipes as backends, which does not sound very
good.  But how about a mixed approach -- like having all the backends
share a pipe, controlled by an LWLock, and the auxiliary processes each
have a separate pipe?



Multiple pipes seem like a mess, and in any case the above still doesn't
work for stderr output produced by non-cooperative software (dynamic
loader for instance).

The only solution that I can see is to invent some sort of simple
protocol for the syslogger pipe.  Assume that the kernel honors PIPE_BUF
(this assumption may need proving, see other message).  We could imagine
having elog.c divvy up its writes to the pipe into chunks of less than
PIPE_BUF bytes, where each chunk carries info sufficient to let it be
reassembled.  Perhaps something on the order of

\0 \0 2-byte-length source-PID end-flag text...

The syslogger reassembles these by joining messages with the same
origination PID, until it gets one with the end-flag set.  It would need
enough code to track multiple in-progress messages.

The logger would have to also be able to deal with random text coming
down the pipe (due to aforesaid non-cooperative software).  I would be
inclined to say just take any text not preceded by \0\0 as a standalone
message, up to the next \0\0.  Long chunks of non-protocol text would
risk getting treated as multiple messages, but there's probably not a
lot of harm in that.
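For concreteness, here is a minimal sketch of what the writing side could look
like under that layout (field sizes and helper names here are only
illustrative, not a finished design):

#include <limits.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define CHUNK_HDR_SIZE  9                           /* \0 \0, 2-byte len, 4-byte PID, 1-byte end flag */
#define CHUNK_PAYLOAD   (PIPE_BUF - CHUNK_HDR_SIZE)

/* Split one log message into protocol chunks of at most PIPE_BUF bytes. */
void
write_chunked(int pipe_fd, const char *msg, size_t len)
{
    char    buf[PIPE_BUF];
    int32_t pid = (int32_t) getpid();

    while (len > 0)
    {
        size_t   n = len > CHUNK_PAYLOAD ? CHUNK_PAYLOAD : len;
        uint16_t n16 = (uint16_t) n;
        char     last = (n == len) ? 1 : 0;         /* end-of-message flag */

        buf[0] = '\0';
        buf[1] = '\0';
        memcpy(buf + 2, &n16, 2);                   /* 2-byte payload length */
        memcpy(buf + 4, &pid, sizeof(pid));         /* origin PID */
        buf[8] = last;
        memcpy(buf + CHUNK_HDR_SIZE, msg, n);

        /* Each write is <= PIPE_BUF, so the kernel delivers it atomically and
         * chunks from different backends cannot interleave mid-chunk. */
        (void) write(pipe_fd, buf, CHUNK_HDR_SIZE + n);

        msg += n;
        len -= n;
    }
}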

BTW, exactly what is the COPYable-logs code going to do with random
text?  I trust the answer is not throw it away.


  


The CSVlog pipe is a separate pipe from the stderr pipe. Anything that 
goes to stderr now will continue to go to stderr, wherever that is.


I like this scheme for a couple of reasons:
. it will include the ability to tell the real end of a message
. it will let us handle non-protocol messages (although there shouldn't 
be any in the CSVlog pipe).


I'll try to get a patch out for just the stderr case, which should be 
back-patchable, then adjust the CSVlog patch to use it.


I'm thinking of handling the partial lines with a small dynahash of 
StringInfo buffers, which get discarded whenever we don't have a partial 
line for the PID.


cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] What's with the StartDb:2 failures on skylark?

2007-06-11 Thread Andrew Dunstan



Magnus Hagander wrote:

On Mon, Jun 11, 2007 at 10:23:06AM -0400, Andrew Dunstan wrote:
  

Tom Lane wrote:


Every so often, buildfarm member skylark reports a StartDb:2 failure,
as for instance here:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=skylark&dt=2007-06-10%2023:00:01
but the logs contain no trace of any actual failure.  What's happening,
and why is the buildfarm failing to record any useful information?

 
  
There is certainly something odd happening on the box. Magnus and I 
discussed it about a month ago and he was going to test something I 
suggested, but I'm not sure if he did. I have committed some small 
buildfarm changes that might take care of things. Magnus, can you please 
upgrade your run_build.pl to CVS HEAD (1.83) and we'll continue to keep 
an eye on things?



Done.

FWIW, I think you wanted me to stick sleeps at some places in the code, but
I never managed to figure out where to put them. So if you want them in
somewhere, please tell me where to put them and I'll make sure to do that
as well.


  


It's in the update - nothing else for you to do.

cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] COPYable logs status

2007-06-11 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 I'll try to get a patch out for just the stderr case, which should be 
 back-patchable, then adjust the CSVlog patch to use it.

Sounds like a plan.

 I'm thinking of handling the partial lines with a small dynahash of 
 StringInfo buffers, which get discarded whenever we don't have a partial 
 line for the PID.

A hashtable might be overkill --- based on reports so far, it's unlikely
you'd have more than two or three messages being received concurrently,
so a simple list or array might be quicker to search.
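For illustration, a sketch of that simpler arrangement (a plain C array and
realloc stand in here for whatever buffer type the patch actually ends up
using):

#include <stdlib.h>
#include <string.h>

#define NSLOTS 8                /* only a few concurrent partial messages expected */

typedef struct
{
    int     pid;                /* 0 = slot unused */
    char   *data;
    size_t  len;
} PartialMsg;

static PartialMsg slots[NSLOTS];

/* Append one chunk for the given PID; when 'last' is set, hand the completed
 * message to emit() and release the slot. */
void
append_chunk(int pid, const char *chunk, size_t n, int last,
             void (*emit)(const char *msg, size_t len))
{
    PartialMsg *slot = NULL;

    for (int i = 0; i < NSLOTS; i++)        /* linear search: NSLOTS is tiny */
        if (slots[i].pid == pid)
        {
            slot = &slots[i];
            break;
        }
    if (slot == NULL)
        for (int i = 0; i < NSLOTS; i++)
            if (slots[i].pid == 0)
            {
                slot = &slots[i];
                slot->pid = pid;
                break;
            }
    if (slot == NULL)
        return;                              /* out of slots; real code would grow the list */

    slot->data = realloc(slot->data, slot->len + n);
    memcpy(slot->data + slot->len, chunk, n);
    slot->len += n;

    if (last)
    {
        emit(slot->data, slot->len);
        free(slot->data);
        memset(slot, 0, sizeof(*slot));
    }
}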

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] ecpg compile error in regression tests

2007-06-11 Thread Magnus Hagander
Hi!

Still working on the ecpg-regression-tests-on-msvc. Update to follow
later today I hope. But for now:

I get the following error when trying to build the sql/parser.pgc test:
c:\prog\pgbin\pgsql\bin\ecpg --regression  -o parser.c parser.pgc
parser.pgc:26: ERROR: syntax error at or near NULLS


Any pointers for where to look? (Kinda seems it could be the ecpg binary
not being up-to-date, but I did a make clean and rebuild of it)

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] Is this a feature?

2007-06-11 Thread Joshua D. Drake

Hello,

It is past feature freeze, which means we can't introduce new features. 
Is it possible to submit a patch for slightly different logging output?


Take the following:

INFO:  analyzing pg_catalog.pg_authid
INFO:  pg_authid: scanned 1 of 1 pages, containing 5 live rows and 0 
dead rows; 5 rows in sample, 5 estimated total rows


The above is completely redundant. Why not just say:

INFO:  pg_authid: scanned 1 of 1 pages, containing 5 live rows and 0 
dead rows; 5 rows in sample, 5 estimated total rows


If the first line is meant to be an indicator, then make the above line 
do this:


INFO: analyzing pg_catalog.pg_authid :

Don't add a new line, and when the next step of information comes up 
append it to the existing line to get:


INFO: analyzing pg_catalog.pg_authid: scanned 1 of 1 pages, containing 
5 live rows and 0 dead rows; 5 rows in sample, 5 estimated total rows


Sincerely,

Joshua D. Drake


--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Is this a feature?

2007-06-11 Thread Florian G. Pflug

Joshua D. Drake wrote:

Take the following:

INFO:  analyzing pg_catalog.pg_authid
INFO:  pg_authid: scanned 1 of 1 pages, containing 5 live rows and 0 
dead rows; 5 rows in sample, 5 estimated total rows


The above is completely redundant. Why not just say:

INFO:  pg_authid: scanned 1 of 1 pages, containing 5 live rows and 0 
dead rows; 5 rows in sample, 5 estimated total rows


If the first line is meant to be an indicator, then make the above line 
do this:


INFO: analyzing pg_catalog.pg_authid :

Don't add a new line, and when the next step of information comes up 
append it to the existing line to get:


INFO: analyzing pg_catalog.pg_authid: scanned 1 of 1 pages, containing 
5 live rows and 0 dead rows; 5 rows in sample, 5 estimated total rows


But then the line could only be pushed to the client *after* the analysis
of the table has finished, while with the current output you know what
postgres is currently doing, because you get analyzing ... *before*
the operation starts.

greetings, Florian Pflug



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
SELECT 1  FROM test.dbo.a_003 

 

gets about 60,000 records per second

 

SELECT '1'  FROM test.dbo.a_003 

 

gets about 600 records per second.

 

The cause is that postgres describes the return column as unknown
length 65534 in the 2nd case.

 

Since the value is a constant, it seems rather odd to make the length
65534 characters.  Why not make it char(1) or some other appropriate and
less costly data type?  After all, it isn't going to grow during the
query.

 



Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 SELECT 1  FROM test.dbo.a_003
 gets about 60,000 records per second
 SELECT '1'  FROM test.dbo.a_003
 gets about 600 records per second.

 The cause is that postgres describes the return column as unknown
 length 65534 in the 2nd case.

Postgres describes it in no such fashion --- "unknown" will always have a
typmod of -1, which means unspecified.  Possibly you have some client
code that knows much less than it thinks it does about the meanings of
typmod values?

The actual volume of data transmitted is going to be just about the same
either way, so I'm not sure you've diagnosed the cause of slowdown
correctly.  Trying the example in psql seems to be about the same speed
both ways, with if anything a slight advantage to select '1'.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Gregory Stark
Dann Corbit [EMAIL PROTECTED] writes:

 SELECT 1  FROM test.dbo.a_003 

 gets about 60,000 records per second

 SELECT '1'  FROM test.dbo.a_003 

 gets about 600 records per second.

 The cause is that postgres describes the return column as unknown
 length 65534 in the 2nd case.

Wait, back up. How does this cause it to go slower?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Gregory Stark [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 12:48 PM
 To: Dann Corbit
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Selecting a constant question
 
 Dann Corbit [EMAIL PROTECTED] writes:
 
  SELECT 1  FROM test.dbo.a_003
 
  gets about 60,000 records per second
 
  SELECT '1'  FROM test.dbo.a_003
 
  gets about 600 records per second.
 
  The cause is that postgres describes the return column as unknown
  length 65534 in the 2nd case.
 
 Wait, back up. How does this cause it to go slower?

The issue is this:

Postgres describes the column with a typmod of -1 (unknown) and a length
of 65534.

This means that any client application must reserve 65534 bytes of
space for every row of data (in a grid control, for example), even though
postgres should know (and report) that the maximum length of the column
is 1.

This is not a PSQL issue, it's an issue with other products relying on
the accuracy of the reported postgres metadata for a given SQL
statement.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Martijn van Oosterhout
On Mon, Jun 11, 2007 at 12:55:55PM -0700, Dann Corbit wrote:
 The issue is this:
 
 Postgres describes the column with a typmod of -1 (unknown) and a length
 of 65534.

Postgres does no such thing. How can it possibly know the maximum size
of a column before executing the query?

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 The issue is this:
 Postgres describes the column with a typmod of -1 (unknown) and a length
 of 65534.

Oh, you're looking at typlen not typmod.  Please observe the comments in
pg_type.h:

/*
 * For a fixed-size type, typlen is the number of bytes we use to
 * represent a value of this type, e.g. 4 for an int4.  But for a
 * variable-length type, typlen is negative.  We use -1 to indicate a
 * varlena type (one that has a length word), -2 to indicate a
 * null-terminated C string.
 */
int2		typlen;

You should be treating typlen as signed not unsigned, and not assuming a
fixed width for any negative value.

Since the width refers to the server internal representation, and not to
what comes down the wire, I find it pretty strange for an application to
be using typlen for anything at all actually.
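For illustration, a minimal libpq sketch of the client-side interpretation
being described (the functions are standard libpq calls; the printed wording
is just an example):

#include <stdio.h>
#include <libpq-fe.h>

/* Sketch only: report each result column's width the way a client should
 * interpret PQfsize() -- a positive value is a fixed internal width, any
 * negative value means variable-length, so use PQgetlength() per row. */
void
describe_columns(PGresult *res)
{
    int ncols = PQnfields(res);

    for (int col = 0; col < ncols; col++)
    {
        int size = PQfsize(res, col);   /* server-internal typlen, signed */

        if (size > 0)
            printf("%s: fixed width, %d bytes\n", PQfname(res, col), size);
        else
            printf("%s: variable length (typlen %d); check PQgetlength() per row\n",
                   PQfname(res, col), size);
    }
}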

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 1:32 PM
 To: Dann Corbit
 Cc: Gregory Stark; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Selecting a constant question
 
 Dann Corbit [EMAIL PROTECTED] writes:
  The issue is this:
  Postgres describes the column with a typmod of -1 (unknown) and a length
  of 65534.
 
 Oh, you're looking at typlen not typmod.  Please observe the comments in
 pg_type.h:
 
   /*
    * For a fixed-size type, typlen is the number of bytes we use to
    * represent a value of this type, e.g. 4 for an int4.  But for a
    * variable-length type, typlen is negative.  We use -1 to indicate a
    * varlena type (one that has a length word), -2 to indicate a
    * null-terminated C string.
    */
   int2		typlen;
 
 You should be treating typlen as signed not unsigned, and not assuming a
 fixed width for any negative value.
 
 Since the width refers to the server internal representation, and not to
 what comes down the wire, I find it pretty strange for an application to
 be using typlen for anything at all actually.

Thanks for the response.

Since libpq function PQfsize returns -2 for all constant character
strings in SQL statements ... What is the proper procedure to determine
the length of a constant character column after query execution but
before fetching the first row of data?


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Trying the example in psql seems to be about the same speed both ways, with
 if anything a slight advantage to select '1'.

Fwiw I see a slight advantage for '1' as well. I wonder why.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 1:46 PM
 To: Dann Corbit
 Subject: Re: [HACKERS] Selecting a constant question
 
 On Mon, Jun 11, 2007 at 01:29:37PM -0700, Dann Corbit wrote:
  Our application is using the libPQ interface to access postgres.
 
  The query is select '123' from tablename  .. the table is not
  important.
 
  After executing the query, we interrogate the metadata of the result set
  using the PQfsize, PQfmod and PQftype functions.
 
 Did you read the documentation of the PQfsize function?
 
 PQfsize returns the space allocated for this column in a database row,
 in other words the size of the server's internal representation of the
 data type. (Accordingly, it is not really very useful to clients.) A
 negative value indicates the data type is variable-length.
 

http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html#LIBPQ-EXEC-SELECT-INFO
 
  The size of the column is returned as 65534 (or -2 if you consider this
  a signed short value)
 
 It's variable length, you can't say anything more.

So what you are saying is that the constant '1' is variable length, and
there is no way to find out the maximum length from the database.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Dann Corbit
 Sent: Monday, June 11, 2007 1:52 PM
 To: Martijn van Oosterhout
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Selecting a constant question
 
  -Original Message-
  From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED]
  Sent: Monday, June 11, 2007 1:46 PM
  To: Dann Corbit
  Subject: Re: [HACKERS] Selecting a constant question
 
  On Mon, Jun 11, 2007 at 01:29:37PM -0700, Dann Corbit wrote:
   Our application is using the libPQ interface to access postgres.
  
   The query is select '123' from tablename  .. the table is not
   important.
  
   After executing the query, we interrogate the metadata of the result set
   using the PQfsize, PQfmod and PQftype functions.
 
  Did you read the documentation of the PQfsize function?
 
  PQfsize returns the space allocated for this column in a database row,
  in other words the size of the server's internal representation of the
  data type. (Accordingly, it is not really very useful to clients.) A
  negative value indicates the data type is variable-length.
 
 

http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html#LIBPQ-EXEC-SELECT-INFO
 
   The size of the column is returned as 65534 (or -2 if you consider this
   a signed short value)
 
  It's variable length, you can't say anything more.
 
  So what you are saying is that the constant '1' is variable length, and
  there is no way to find out the maximum length from the database.

I have a PostgreSQL feature request:

Report the maximum size of a variable length string from the server.

Surely, we cannot be the only people who will need this information.  If
(for example) someone wants to bind to a grid, then the maximum size has
to be known in advance.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Gregory Stark
Dann Corbit [EMAIL PROTECTED] writes:

 Surely, we cannot be the only people who will need this information.  If
 (for example) someone wants to bind to a grid, then the maximum size has
 to be known in advance.

In fact psql needs it and implements this. It has to skim through the entire
result set to calculate the column widths. It's quite a lot of work but the
server is in no better position to do it than psql.

On the contrary the server is missing quite a bit of information of how you
intend to display the information. Do you need the number of bytes or
characters? Are all the characters the same width in your display system? What
about currency symbols? Do you intend to reverse any quoting or just display
backslashes?

Even knowing how many characters and assuming fixed character widths that
wouldn't even be enough to set your grid control widths. Usually people like
numeric quantities decimal aligned and so two records 1.00 and 0.01 will
take much more width than two records with 1.00 and 2.00.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Gregory Stark [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 2:41 PM
 To: Dann Corbit
 Cc: Martijn van Oosterhout; pgsql-hackers@postgresql.org; Larry McGhaw
 Subject: Re: [HACKERS] Selecting a constant question
 
 Dann Corbit [EMAIL PROTECTED] writes:
 
  Surely, we cannot be the only people who will need this information.  If
  (for example) someone wants to bind to a grid, then the maximum size has
  to be known in advance.
 
 In fact psql needs it and implements this. It has to skim through the entire
 result set to calculate the column widths. It's quite a lot of work but the
 server is in no better position to do it than psql.

Reading the data twice sounds a little painful.  What if there are 30
million rows?
 
 On the contrary the server is missing quite a bit of information of how you
 intend to display the information. Do you need the number of bytes or
 characters? Are all the characters the same width in your display system? What
 about currency symbols? Do you intend to reverse any quoting or just display
 backslashes?

Giving me the information about the data type will be enough.  As an
example, in this case we have varchar data.  If the server should be so
kind as to report varchar(1) for '1' or varchar(3) for '123' then I
would not have any difficulty binding the data to a grid.
 
 Even knowing how many characters and assuming fixed character widths that
 wouldn't even be enough to set your grid control widths. Usually people like
 numeric quantities decimal aligned and so two records 1.00 and 0.01 will
 take much more width than two records with 1.00 and 2.00.

SQL*Server, Oracle, Ingres, DB/2 and other database systems somehow
manage to do it, so I guess it is not technically intractable.

I suspect that your own ODBC/JDBC and other drivers suffer from this
same effect.

Now, I do recognize that sometimes nobody is going to know how big
something is, including the server.  But with a query using a constant
it seems like it ought to be well defined to me.  Perhaps the
difficulties are escaping me because I am not familiar with the low
level guts of this problem.  But I suspect that lots of people besides
me would benefit if sizes of things were known when it is possible to
know them.

As I said before, I see that it cannot be known right now. So I am
putting it in as a feature request.

If you could be so kind as to point out the right spot to look in the
server code, I imagine we could fix it and check in the patch ourselves.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Alvaro Herrera
Dann Corbit wrote:

  Dann Corbit [EMAIL PROTECTED] writes:
  
  In fact psql needs it and implements this. It has to skim through the entire
  result set to calculate the column widths. It's quite a lot of work but the
  server is in no better position to do it than psql.
 
 Reading the data twice sounds a little painful.  What if there are 30
 million rows?

You get an out of memory error.

  On the contrary the server is missing quite a bit of information of
  how you intend to display the information. Do you need the number of
  bytes or characters? Are all the characters the same width in your
  display system?  What about currency symbols? Do you intend to
  reverse any quoting or just display backslashes?
 
 Giving me the information about the data type will be enough.  As an
 example, in this case we have varchar data.  If the server should be so
 kind as to report varchar(1) for '1' or varchar(3) for '123' then I
 would not have any difficulty binding the data to a grid.

Oh, you have the length information for each datum all right.  It's on
the first four bytes of it.

-- 
Alvaro Herrera   http://www.PlanetPostgreSQL.org/
The West won the world not by the superiority of its ideas or values
or religion but rather by its superiority in applying organized violence.
Westerners often forget this fact, non-Westerners never do.
(Samuel P. Huntington)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Dann Corbit
These two calls make our remote queries via libpq about twice as fast on
average.  It seems to me like it might be a nice addition to the core
product's libpq (I poked it into the spot where the Nagle algorithm is
turned off, but another place would be fine too).  Can anyone give me a
reason why it is a bad idea to add this in?  If it were made a parameter
with a default of 64K, that would be even better.  Then it could be
tuned to particular systems for maximum throughput.

 

  /* 'on' is an int declared earlier in the surrounding libpq code. */
  on = 65535;

  /* Ask the kernel for a larger receive buffer (TCP window). */
  if (setsockopt(conn->sock, SOL_SOCKET, SO_RCVBUF, (char *) &on,
                 sizeof(on)) < 0)
  {
      char  sebuf[256];

      printfPQExpBuffer(&conn->errorMessage,
                        libpq_gettext("could not set socket SO_RCVBUF window size: %s\n"),
                        SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
      return 0;
  }

  on = 65535;

  /* And likewise for the send buffer. */
  if (setsockopt(conn->sock, SOL_SOCKET, SO_SNDBUF, (char *) &on,
                 sizeof(on)) < 0)
  {
      char  sebuf[256];

      printfPQExpBuffer(&conn->errorMessage,
                        libpq_gettext("could not set socket SO_SNDBUF window size: %s\n"),
                        SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
      return 0;
  }

 



Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Alvaro Herrera [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 3:16 PM
 To: Dann Corbit
 Cc: Gregory Stark; Martijn van Oosterhout;
pgsql-hackers@postgresql.org;
 Larry McGhaw
 Subject: Re: [HACKERS] Selecting a constant question
 
 Dann Corbit wrote:
 
   Dann Corbit [EMAIL PROTECTED] writes:
  
   In fact psql needs it and implements this. It has to skim through the entire
   result set to calculate the column widths. It's quite a lot of work but the
   server is in no better position to do it than psql.
 
  Reading the data twice sounds a little painful.  What if there are 30
  million rows?
 
 You get an out of memory error.
 
   On the contrary the server is missing quite a bit of information of
   how you intend to display the information. Do you need the number of
   bytes or characters? Are all the characters the same width in your
   display system?  What about currency symbols? Do you intend to
   reverse any quoting or just display backslashes?
 
  Giving me the information about the data type will be enough.  As an
  example, in this case we have varchar data.  If the server should be so
  kind as to report varchar(1) for '1' or varchar(3) for '123' then I
  would not have any difficulty binding the data to a grid.
 
 Oh, you have the length information for each datum all right.  It's on
 the first four bytes of it.

Sure, but when I bind to a grid, I need to know a-priori how big the
biggest returned instance can be.  Reading the entire data set twice to
learn the size of a constant seems rather conceptually odd to me.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Alvaro Herrera
Dann Corbit wrote:

  Oh, you have the length information for each datum all right.  It's on
  the first four bytes of it.
 
 Sure, but when I bind to a grid, I need to know a-priori how big the
 biggest returned instance can be.  Reading the entire data set twice to
 learn the size of a constant seems rather conceptually odd to me.

Did you read up on typmod already?  I think that's part of the info sent
down in the query response.

-- 
Alvaro Herrera http://www.flickr.com/photos/alvherre/
No single strategy is always right (Unless the boss says so)
  (Larry Wall)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Martijn van Oosterhout
On Mon, Jun 11, 2007 at 03:18:33PM -0700, Dann Corbit wrote:
 Sure, but when I bind to a grid, I need to know a-priori how big the
 biggest returned instance can be.  Reading the entire data set twice to
 learn the size of a constant seems rather conceptually odd to me.

To be honest, the concept that a widget requires a constant that can't
be changed later is also a bit odd. There are many times you won't know
beforehand how big the data is, surely the framework should be smart
enough to handle these cases?

Start the width at 100, if it turns out to be too small, make it
bigger...

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 Giving me the information about the data type will be enough.  As an
 example, in this case we have varchar data.  If the server should be so
 kind as to report varchar(1) for '1' or varchar(3) for '123' then I
 would not have any difficulty binding the data to a grid.

This seems merest fantasy.  Reflect on multibyte character sets for a
bit --- even if it's known that the column is varchar(3) there is no
guarantee that the value will fit in 3 bytes.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Andrew Dunstan



Dann Corbit wrote:

I have a PostgreSQL feature request:

Report the maximum size of a variable length string from the server.

Surely, we cannot be the only people who will need this information.  If
(for example) someone wants to bind to a grid, then the maximum size has
to be known in advance.



  


Does PQfmod not tell you what you need if the field is varchar(n) ?
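For illustration, a small libpq sketch along those lines; it assumes the usual
convention that the type modifier of a varchar(n) column is n plus the 4-byte
varlena header, and it gives up (returns -1) for anything without a declared
limit, such as an untyped string constant:

#include <libpq-fe.h>

/* Sketch only: recover the declared character limit of a varchar(n) result
 * column from its type modifier.  Returns -1 when there is no declared
 * limit (typmod -1), which is what an untyped constant like '1' reports. */
int
declared_varchar_limit(PGresult *res, int col)
{
    int typmod = PQfmod(res, col);

    if (typmod < 0)
        return -1;          /* no declared limit to report */
    return typmod - 4;      /* strip the 4-byte VARHDRSZ to get n */
}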

cheers

andrew



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 3:29 PM
 To: Dann Corbit
 Cc: Alvaro Herrera; Gregory Stark; pgsql-hackers@postgresql.org; Larry
 McGhaw
 Subject: Re: [HACKERS] Selecting a constant question
 
 On Mon, Jun 11, 2007 at 03:18:33PM -0700, Dann Corbit wrote:
  Sure, but when I bind to a grid, I need to know a-priori how big the
  biggest returned instance can be.  Reading the entire data set twice to
  learn the size of a constant seems rather conceptually odd to me.
 
 To be honest, the concept that a widget requires a constant that can't
 be changed later is also a bit odd.

Not when the data itself is a constant that cannot be changed.

 There are many times you won't know
 beforehand how big the data is, surely the framework should be smart
 enough to handle these cases?

If it were impossible to know the size of a string constant supplied in
the query, then I think I would agree with you here.  However, it seems
to me that the maximum possible size of such a known, constant-width
string is not hard to determine.

 Start the width at 100, if it turns out to be too small, make it
 bigger...

If that were a good idea, then why report data sizes at all?  Just let
it always be a surprise when it comes streaming down the pipe.

Honestly, I cannot fathom this answer.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 These two calls make our remote queries via libpq about twice as fast on
 average.

And, perhaps, cause even greater factors of degradation in other
scenarios (not to mention the possibility of complete failure on some
platforms).  You haven't provided nearly enough evidence that this is
a safe change to make.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Alvaro Herrera
Dann Corbit wrote:

 If the server bound the data as UNICODE, then it will tell me
 UNICODE(3).  I know how big this will be.
 
 In the worst case scenario it will fit in 3*4 = 12 bytes.
 
 If the server is built without UNICODE enabled, then it will definitely
 fit in 3 bytes.

Unless it's some other multibyte encoding.  And nowadays, the server is
always unicode enabled.  The stuff sent down the wire is unicode or
not depending on a configuration parameter.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 To be honest, the concept that a widget requires a constant that can't
 be changed later is also a bit odd.

 Not when the data itself is a constant that cannot be changed.

Surely this case is not sufficiently important to justify designing
your entire application (not to mention the client/server protocol)
around it.  You're always going to have variable-width columns in there
somewhere.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 3:41 PM
 To: Dann Corbit
 Cc: pgsql-hackers@postgresql.org; Larry McGhaw
 Subject: Re: [HACKERS] Got no response last time on setsockopt post,
so I
 thought I would reiterate.
 
 Dann Corbit [EMAIL PROTECTED] writes:
  These two calls make our remote queries via libpq about twice as
fast on
  average.
 
 And, perhaps, cause even greater factors of degradation in other
 scenarios (not to mention the possibility of complete failure on some
 platforms).  You haven't provided nearly enough evidence that this is
 a safe change to make.

May I suggest:
http://www-didc.lbl.gov/TCP-tuning/setsockopt.html
http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html

We test against dozens of operating systems and we have never had a
problem (generally we use our own TCP/IP network objects for
communication, and we only recently figured out why PostgreSQL was
lagging so far behind and patched libpq ourselves).  Now, it will be
about 2 weeks before our full regressions have run against PostgreSQL on
all of our platforms, but we do adjust the TCP/IP window on all of our
clients and servers, and we have yet to find a platform that cannot
either negotiate a decent size or, at worst, simply ignore our request.
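
For reference, the kind of change I am talking about is just a pair of
setsockopt() calls on the connection socket, ideally made before
connect() so the window scale can be negotiated.  A minimal sketch (the
helper name and the 64K figure are only illustrative, and where exactly
it would hook into libpq is a separate question):

#include <sys/types.h>
#include <sys/socket.h>

/*
 * Sketch: ask the stack for larger socket buffers.  The kernel may clamp
 * or ignore the request, so failure is deliberately not treated as fatal.
 */
static void
set_socket_buffers(int sock, int bytes)
{
    (void) setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                      (char *) &bytes, sizeof(bytes));
    (void) setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                      (char *) &bytes, sizeof(bytes));
}

/* e.g. set_socket_buffers(sock, 64 * 1024); -- the size should be configurable */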

However, I won't twist your arm.  I just wanted to be sure that those at
the PostgreSQL organization were aware of this simple trick.  Our
products run on:
Aix
BeOS
Hpux
Linux (everywhere, including mainframe zLinux)
MVS
SunOS
Solaris
OpenVMS Alpha
OpenVMS VAX
OpenVMS Itanium
Windows

And several others


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Alvaro Herrera [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 3:44 PM
 To: Dann Corbit
 Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout; pgsql-
 [EMAIL PROTECTED]; Larry McGhaw
 Subject: Re: [HACKERS] Selecting a constant question
 
 Dann Corbit wrote:
 
  If the server bound the data as UNICODE, then it will tell me
  UNICODE(3).  I know how big this will be.
 
  In the worst case scenario it will fit in 3*4 = 12 bytes.
 
  If the server is built without UNICODE enabled, then it will
definitely
  fit in 3 bytes.
 
 Unless it's some other multibyte encoding.  And nowadays, the server
is
 always unicode enabled.  The stuff sent down the wire is unicode or
 not depending on a configuration parameter.

Even at that, we still know an absolute maximum of 12 bytes.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 3:50 PM
 To: Dann Corbit
 Cc: Martijn van Oosterhout; Alvaro Herrera; Gregory Stark; pgsql-
 [EMAIL PROTECTED]; Larry McGhaw
 Subject: Re: [HACKERS] Selecting a constant question
 
 Dann Corbit [EMAIL PROTECTED] writes:
  To be honest, the concept that a widget requires a constant that
can't
  be changed later is also a bit odd.
 
  Not when the data itself is a constant that cannot be changed.
 
 Surely this case is not sufficiently important to justify designing
 your entire application (not to mention the client/server protocol)
 around it.  You're always going to have variable-width columns in
there
 somewhere.

Right.  But normally I get back a length for those variable length
columns, or I can collect it from the metadata of the database.

Surely, PostgreSQL can determine the size of a constant string.
Otherwise it would be impossible to know if it would be safe to insert a
constant string into a database column.

PostgreSQL has decided upon a data type, and gives me data bound in that
type.  It is only the length that it is unwilling to divulge.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Greg Smith

On Mon, 11 Jun 2007, Dann Corbit wrote:


These two calls make our remote queries via libpq about twice as fast on
average.


Can you comment a bit on what your remote queries are typically doing? 
You'll need to provide at least an idea what type of test case you're 
seeing the improvement on for others to try and replicate it.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Greg Smith [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 4:09 PM
 To: Dann Corbit
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Got no response last time on setsockopt post,
so I
 thought I would reiterate.
 
 On Mon, 11 Jun 2007, Dann Corbit wrote:
 
  These two calls make our remote queries via libpq about twice as
fast on
  average.
 
 Can you comment a bit on what your remote queries are typically doing?
 You'll need to provide at least an idea what type of test case you're
 seeing the improvement on for others to try and replicate it.

We have literally thousands (maybe hundreds of thousands -- I'm not
totally sure exactly how many there are, because I am in development and
not in testing) of queries that take dozens of machines over a week to
run.

Our queries include inserts, updates, deletes, joins, views, um... You
name it.

Our product is database middleware and so we have to test against
anything that is a legal SQL query against every sort of database and
operating system combination (PostgreSQL is one of many environments
that we support).

If you have seen the NIST SQL verification suite, that is part of our
test suite.  We also found the PostgreSQL suite useful (though the
PostgreSQL specific things we only run against PostgreSQL).  We also
have our own collection of regression tests that we have gathered over
the past 20 years or so.

I can't be specific because we run every sort of query.  Most of our
hardware is fairly high end (generally 1 Gb/s Ethernet, but we do have
some machines with only 100 Mb/s network cards in them).

I guess that our usage is atypical for general business use but fairly
typical for those companies that produce middleware tool sets.  However,
many of our regressions came from customer feedback and so we do test
lots and lots of valid customer requirements.

I have a simple suggestion:
Put the setsockopt calls in (with the necessary fluff to make them
robust) and then run the OSDB test.  I expect that unless the OSDB test
puts the clients and the server on the same physical hardware, you will
see a very large benefit from a very simple change.

 --
 * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore,
MD


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Andrew Dunstan



Dann Corbit wrote:

However, I won't twist your arm.  I just wanted to be sure that those at
the PostgreSQL organization were aware of this simple trick.  Our
products run on:
Aix
BeOS
Hpux
Linux (everywhere, including mainframe zLinux)
MVS
SunOS
Solaris
OpenVMS Alpha
OpenVMS VAX
OpenVMS Itanium
Windows

And several others



  


We already set the SNDBUF on Windows for reasons documented in the code.

I think if you were to quantify the alleged improvement by platform it 
might allay suspicion.


cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Andrew Dunstan [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 4:35 PM
 To: Dann Corbit
 Cc: Tom Lane; pgsql-hackers@postgresql.org; Larry McGhaw
 Subject: Re: [HACKERS] Got no response last time on setsockopt post,
so I
 thought I would reiterate.
 
 Dann Corbit wrote:
  However, I won't twist your arm.  I just wanted to be sure that
those at
  the PostgreSQL organization were aware of this simple trick.  Our
  products run on:
  Aix
  BeOS
  Hpux
  Linux (everywhere, including mainframe zLinux)
  MVS
  SunOS
  Solaris
  OpenVMS Alpha
  OpenVMS VAX
  OpenVMS Itanium
  Windows
 
  And several others
 
 
 
 
 
 We already set the SNDBUF on Windows for reasons documented in the
code.

The only place I see it is for Windows *only* in PQCOMM.C (to 32K).  Did
I miss it somewhere else?

 I think if you were to quantify the alleged improvement by platform it
 might allay suspicion.

I do not know if you will see the same results as we do.  We support
ancient and modern operating systems, on ancient and modern hardware (we
have OpenVMS 6.1 running Rdb as old as 4.2, for instance -- 1980's
technology).

The only way for you to see if your environments have the same sort of
benefits that we see is to test it yourselves.

The TCP/IP window size is such a well known optimization setting (in
fact the dominant one) that I am kind of surprised to be catching anyone
unawares.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Larry McGhaw
I think perhaps we have lost sight of the main issue:

1) libpq can properly describe the maximum internal data size of any
numeric or char column in a table via Pqfsize
2) libpq can properly describe the maximum internal data size of any
varchar column via Pqfmod
3) libpq can properly describe the maximum internal data size of any
numeric constant in a SQL statement via Pqfsize
4) libpq **cannot** describe the maximum internal data size of a char or
varchar constant!
Example:  select '123' from any table

This is clearly a bug or serious oversight in libpq that should be
addressed.

The database *knows* this size of the char constant (obviously), and
should report the size via a metadata call, as all other relational
databases do.
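
To make the gap concrete, here is roughly how a driver interrogates the
result metadata today (a sketch only; PQftype, PQfmod and PQfsize are the
existing libpq calls, and the -1/-2 values that come back for the constant
column are what we observe, not something a driver can size a buffer from):

#include <stdio.h>
#include <libpq-fe.h>

/*
 * Sketch: describe each result column the way a driver would.  For a query
 * such as  select '123' from tab  the constant column typically reports
 * fmod = -1 and fsize = -2, which is exactly the information gap at issue.
 */
static void
describe_columns(PGconn *conn, const char *query)
{
    PGresult *res = PQexec(conn, query);
    int       i;

    if (PQresultStatus(res) == PGRES_TUPLES_OK)
    {
        for (i = 0; i < PQnfields(res); i++)
            printf("%s: type oid %u, fmod %d, fsize %d\n",
                   PQfname(res, i), PQftype(res, i),
                   PQfmod(res, i), PQfsize(res, i));
    }
    PQclear(res);
}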

Thanks

lm


-Original Message-
From: Alvaro Herrera [mailto:[EMAIL PROTECTED]
Sent: Monday, June 11, 2007 3:44 PM
To: Dann Corbit
Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout;
pgsql-hackers@postgresql.org; Larry McGhaw
Subject: Re: [HACKERS] Selecting a constant question

Dann Corbit wrote:

 If the server bound the data as UNICODE, then it will tell me 
 UNICODE(3).  I know how big this will be.
 
 In the worst case scenario it will fit in 3*4 = 12 bytes.
 
 If the server is built without UNICODE enabled, then it will 
 definitely fit in 3 bytes.

Unless it's some other multibyte encoding.  And nowadays, the server is
always unicode enabled.  The stuff sent down the wire is unicode or
not depending on a configuration parameter.

-- 
Alvaro Herrera
http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Kris Jurka



On Mon, 11 Jun 2007, Larry McGhaw wrote:


I think perhaps we have lost sight of the main issue:

2) libpq can properly describe the maximum internal data size of any
varchar column via Pqfmod


SELECT cola || colb FROM tab;


3) libpq can properly describe the maximum internal data size of any
numeric constant in a SQL statement via Pqfsize


SELECT 3::numeric;

Kris Jurka


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Kris Jurka
 Sent: Monday, June 11, 2007 5:04 PM
 To: Larry McGhaw
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Selecting a constant question
 
 
 
 On Mon, 11 Jun 2007, Larry McGhaw wrote:
 
  I think perhaps we have lost sight of the main issue:
 
  2) libpq can properly describe the maximum internal data size of any
  varchar column via Pqfmod
 
 SELECT cola || colb FROM tab;

Suggestion:
Return (column size of cola) + (column size of colb) in the maximum
length field.
 
  3) libpq can properly describe the maximum internal data size of any
  numeric constant in a SQL statement via Pqfsize
 
 SELECT 3::numeric;

Suggestion:
Return sizeof (numeric(1,0)) -- after all, it's a constant here.

In the words of the great poet Spike Lee:
'Always do the right thing.'


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Andrew Dunstan



Larry McGhaw wrote:

4) libpq **cannot** describe the maximum internal data size of a char or
varchar constant!
Example:  select '123' from any table

This is clearly a bug or serious oversight in libpq that should be
addressed.

The database *knows* this size of the char constant (obviously), and
should report the size via a metadata call, as all other relational
databases do.

  
  


What is not clear to me is why it is so important for you to know the 
length of a piece of data you are supplying. If it is so vitally 
important, you could always cast it, e.g. select '123'::varchar(3)
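
For instance, something like this (a sketch; the assumption that the
varchar typmod is the declared length plus 4 bytes of header is mine and
worth verifying before relying on it):

#include <stdio.h>
#include <libpq-fe.h>

/*
 * Sketch: with an explicit cast the constant is described as varchar(3),
 * and PQfmod returns its typmod, from which a client can derive a maximum
 * width.  The "typmod minus 4" arithmetic is an assumption, not gospel.
 */
static void
show_cast_width(PGconn *conn)
{
    PGresult *res = PQexec(conn, "select '123'::varchar(3) as c");

    if (PQresultStatus(res) == PGRES_TUPLES_OK)
    {
        int typmod = PQfmod(res, 0);                /* expected 7 */
        int maxlen = (typmod >= 4) ? typmod - 4 : -1;

        printf("declared maximum length: %d\n", maxlen);
    }
    PQclear(res);
}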


cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 May I suggest:
 http://www-didc.lbl.gov/TCP-tuning/setsockopt.html
 http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html

I poked around on those pages and almost immediately came across
http://www.psc.edu/networking/projects/tcptune/
which appears more up-to-date than the other pages, and it specifically
recommends *against* setting SO_SNDBUF or SO_RCVBUF on modern Linuxen.
So that's one fairly large category where we probably do not want this.

You have not even made it clear whether you were increasing the sizes in
the server-to-client or client-to-server direction, and your handwaving
about the test conditions makes it even harder to know what you are
measuring.  I would think for instance that local vs remote connections
make a big difference and might need different tuning.

BTW, if we look at this issue we ought to also look at whether the
send/recv quantum in libpq and the backend should be changed.  It's been
8K for about ten years now ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Andrew Dunstan
 Sent: Monday, June 11, 2007 5:12 PM
 To: Larry McGhaw
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Selecting a constant question
 
 
 
 Larry McGhaw wrote:
  4) libpq **cannot** describe the maximum internal data size of a
char or
  varchar constant!
  Example:  select '123' from any table
 
  This is clearly a bug or serious oversight in libpq that should be
  addressed.
 
  The database *knows* this size of the char constant (obviously), and
  should report the size via a metadata call, as all other relational
  databases do.
 
 
 
 
 What is not clear to me is why it is so important for you to know the
 length of a piece of data you are supplying. If it is so vitally
 important, you could always cast it, e.g. select '123'::varchar(3)

We're a middleware company.  We are not in control of the queries that
are sent.  We can intercept and reformat them, and perhaps that is what
we will need to do for PostgreSQL.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Tom Lane
Larry McGhaw [EMAIL PROTECTED] writes:
 I think perhaps we have lost sight of the main issue:
 1) libpq can properly describe the maximum internal data size of any
 numeric or char column in a table via Pqfsize
 2) libpq can properly describe the maximum internal data size of any
 varchar column via Pqfmod
 3) libpq can properly describe the maximum internal data size of any
 numeric constant in a SQL statement via Pqfsize

None of the above statements are actually true, at least not when you
take off your blinders and note the existence of unconstrained-width
numeric and text columns.

 The database *knows* this size of the char constant (obviously),

No, what it knows (and reports) is type information.  There are a small
number of datatypes where you can infer a maximum width from knowledge
of the datatype.  There are many others where you can't set an upper
bound from this knowledge --- at least not a usefully tight one.

Anyway, if we were to cast those constants to something other than
unknown, it would be text, not varchar, and you'd still have the same
issue.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Got no response last time on setsockopt post, so I thought I would reiterate.

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 5:12 PM
 To: Dann Corbit
 Cc: pgsql-hackers@postgresql.org; Larry McGhaw
 Subject: Re: [HACKERS] Got no response last time on setsockopt post,
so I
 thought I would reiterate.
 
 Dann Corbit [EMAIL PROTECTED] writes:
  May I suggest:
  http://www-didc.lbl.gov/TCP-tuning/setsockopt.html
  http://www.ncsa.uiuc.edu/People/vwelch/net_perf/tcp_windows.html
 
 I poked around on those pages and almost immediately came across
 http://www.psc.edu/networking/projects/tcptune/
 which appears more up-to-date than the other pages, and it
specifically
 recommends *against* setting SO_SNDBUF or SO_RCVBUF on modern Linuxen.
 So that's one fairly large category where we probably do not want
this.

It can still be a good idea to set it:
http://datatag.web.cern.ch/datatag/howto/tcp.html
64K was just an example.  Like I said before, it should be configurable.
 
 You have not even made it clear whether you were increasing the sizes
in
 the server-to-client or client-to-server direction, and your
handwaving
 about the test conditions makes it even harder to know what you are
 measuring.  I would think for instance that local vs remote
connections
 make a big difference and might need different tuning.

The configuration is a negotiation between client and server.  You may
or may not get what you ask for.  I suggest that it is simple to
implement and worthwhile to test.  But it was only a suggestion.

 BTW, if we look at this issue we ought to also look at whether the
 send/recv quantum in libpq and the backend should be changed.  It's
been
 8K for about ten years now ...

I suspect that TCP/IP packetizing will moderate the effects of changes
to this parameter.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 5:32 PM
 To: Larry McGhaw
 Cc: Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van
Oosterhout;
 pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Selecting a constant question
 
 Larry McGhaw [EMAIL PROTECTED] writes:
  I think perhaps we have lost sight of the main issue:
  1) libpq can properly describe the maximum internal data size of any
  numeric or char column in a table via Pqfsize
  2) libpq can properly describe the maximum internal data size of any
  varchar column via Pqfmod
  3) libpq can properly describe the maximum internal data size of any
  numeric constant in a SQL statement via Pqfsize
 
 None of the above statements are actually true, at least not when you
 take off your blinders and note the existence of unconstrained-width
 numeric and text columns.

Unconstrained-width columns are not what is being discussed here.  The
discussion is about constant expressions of known width.
 
  The database *knows* this size of the char constant (obviously),
 
 No, what it knows (and reports) is type information.  There are a
small
 number of datatypes where you can infer a maximum width from knowledge
 of the datatype.  There are many others where you can't set an upper
 bound from this knowledge --- at least not a usefully tight one.

If you do not know how large 1::numeric is, then how can you know
whether it is safe or not to insert it into a column of type
numeric(12,4)?

If you do not know how large 'Joe'::varchar is, then how can you know
whether it is safe to insert it into a column of type varchar(256)?

Clearly, neither of these operations will cause any problems and so the
size of a constant can be determined.
 
 Anyway, if we were to cast those constants to something other than
 unknown, it would be text, not varchar, and you'd still have the same
 issue.

Other database systems can manage this, and the programmers of those
database systems are not smarter than the programmers of the PostgreSQL
group.  Therefore I can conclude that if the PostgreSQL group decides it
is important, then they can figure out the size of a string or numeric
constant.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Tom Lane [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 3:35 PM
 To: Dann Corbit
 Cc: Gregory Stark; Martijn van Oosterhout;
pgsql-hackers@postgresql.org;
 Larry McGhaw
 Subject: Re: [HACKERS] Selecting a constant question
 
 Dann Corbit [EMAIL PROTECTED] writes:
  Giving me the information about the data type will be enough.  As an
  example, in this case we have varchar data.  If the server should be
so
  kind as to report varchar(1) for '1' or varchar(3) for '123' then I
  would not have any difficulty binding the data to a grid.
 
 This seems merest fantasy.  Reflect on multibyte character sets for a
 bit --- even if it's known that the column is varchar(3) there is no
 guarantee that the value will fit in 3 bytes.

If the server bound the data as UNICODE, then it will tell me
UNICODE(3).  I know how big this will be.

In the worst case scenario it will fit in 3*4 = 12 bytes.

If the server is built without UNICODE enabled, then it will definitely
fit in 3 bytes.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] COPYable logs status

2007-06-11 Thread FAST PostgreSQL

Andrew Dunstan wrote:

 The CSVlog pipe is a separate pipe from the stderr pipe. Anything that
 goes to stderr now will continue to go to stderr, wherever that is.

 I like this scheme for a couple of reasons:
 . it will include the ability to tell the real end of a message
 . it will let us handle non-protocol messages (although there shouldn't
 be any in the CSVlog pipe).

Another important reason I went for two separate pipes is that, on
Windows, the pipe calls are blocking calls, so performance deteriorates
badly unless we increase the buffer allocated to the pipes dramatically.


On a rather decent machine, simply running the regression tests would 
consume a lot of resources, especially when it comes to the errors tests.


Rgds,
Arul Shaji


Andrew Dunstan wrote:



Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:
 

The idea of one pipe per process is not really workable, because it
would mean having as many pipes as backends, which does not sound very
good.  But how about a mixed approach -- have all the backends share a
pipe, controlled by an LWLock, and give each auxiliary process a
separate pipe?



Multiple pipes seem like a mess, and in any case the above still doesn't
work for stderr output produced by non-cooperative software (dynamic
loader for instance).

The only solution that I can see is to invent some sort of simple
protocol for the syslogger pipe.  Assume that the kernel honors PIPE_BUF
(this assumption may need proving, see other message).  We could imagine
having elog.c divvy up its writes to the pipe into chunks of less than
PIPE_BUF bytes, where each chunk carries info sufficient to let it be
reassembled.  Perhaps something on the order of

\0 \0 2-byte-length source-PID end-flag text...

The syslogger reassembles these by joining messages with the same
origination PID, until it gets one with the end-flag set.  It would need
enough code to track multiple in-progress messages.

The logger would have to also be able to deal with random text coming
down the pipe (due to aforesaid non-cooperative software).  I would be
inclined to say just take any text not preceded by \0\0 as a standalone
message, up to the next \0\0.  Long chunks of non-protocol text would
risk getting treated as multiple messages, but there's probably not a
lot of harm in that.
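
To illustrate, a chunk along those lines might look roughly like this (a
sketch only -- the struct name, field types and exact sizes are
illustrative, not what elog.c does today):

#include <stdint.h>

/*
 * Sketch of the proposed chunk layout.  Each chunk (header plus payload)
 * is kept under PIPE_BUF so the kernel writes it atomically.
 */
typedef struct
{
    char     nuls[2];   /* always \0\0: marks a protocol chunk      */
    uint16_t len;       /* number of payload bytes in data[]        */
    int32_t  pid;       /* PID of the originating process           */
    char     is_last;   /* nonzero on the final chunk of a message  */
    char     data[1];   /* payload follows                          */
} PipeProtoChunk;

/*
 * Writer: split each message into chunks whose total size stays below
 * PIPE_BUF and write() them to the syslogger pipe.  Reader: append chunks
 * with the same pid to a per-PID buffer until is_last is seen, then emit
 * the assembled message.
 */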

BTW, exactly what is the COPYable-logs code going to do with random
text?  I trust the answer is not throw it away.

   
  


The CSVlog pipe is a separate pipe from the stderr pipe. Anything that 
goes to stderr now will continue to go to stderr, wherever that is.


I like this scheme for a couple of reasons:
. it will include the ability to tell the real end of a message
. it will let us handle non-protocol messages (although there shouldn't 
be any in the CSVlog pipe).


I'll try to get a patch out for just the stderr case, which should be 
back-patchable, then adjust the CSVlog patch to use it.


I'm thinking of handling the partial lines with a small dynahash of 
StringInfo buffers, which get discarded whenever we don't have a partial 
line for the PID.


cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly





---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Larry McGhaw
As far as I am aware these statements are true.  If you have a specific example 
you could provide to the contrary that would be interesting.
 
Even if there are such conditions it does not change the fact that libpq and/or 
postgresql is deficient in this area.
 
For any query, the database should be capable of describing the metadata for 
the columns, which includes
1) the column type
and
2) the column maximum length.
 
This is such a basic database interface principle that I am very disappointed that 
someone has not recognized this and simply said yes, we see the issue, we will 
work on it.
 
Again, *all* other major relational databases do this ...  even blob fields 
have a maximum length reported from the database.
 
I hope someone who truly understands database interfaces will read this thread 
and address the issue.
For now we will have to special case postgres in our application until it is 
addressed.
 
Thanks
lm



From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Mon 6/11/2007 5:32 PM
To: Larry McGhaw
Cc: Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van Oosterhout; 
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Selecting a constant question 



Larry McGhaw [EMAIL PROTECTED] writes:
 I think perhaps we have lost sight of the main issue:
 1) libpq can properly describe the maximum internal data size of any
 numeric or char column in a table via Pqfsize
 2) libpq can properly describe the maximum internal data size of any
 varchar column via Pqfmod
 3) libpq can properly describe the maximum internal data size of any
 numeric constant in a SQL statement via Pqfsize

None of the above statements are actually true, at least not when you
take off your blinders and note the existence of unconstrained-width
numeric and text columns.

 The database *knows* this size of the char constant (obviously),

No, what it knows (and reports) is type information.  There are a small
number of datatypes where you can infer a maximum width from knowledge
of the datatype.  There are many others where you can't set an upper
bound from this knowledge --- at least not a usefully tight one.

Anyway, if we were to cast those constants to something other than
unknown, it would be text, not varchar, and you'd still have the same
issue.

regards, tom lane




Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Hannu Krosing
On one fine day, Mon, 2007-06-11 at 13:38, Dann Corbit wrote:
  -Original Message-
  From: Tom Lane [mailto:[EMAIL PROTECTED]
  Sent: Monday, June 11, 2007 1:32 PM
  To: Dann Corbit
  Cc: Gregory Stark; pgsql-hackers@postgresql.org
  Subject: Re: [HACKERS] Selecting a constant question
...
  You should be treating typlen as signed not unsigned, and not assuming
 a
  fixed width for any negative value.
  
  Since the width refers to the server internal representation, and not
 to
  what comes down the wire, I find it pretty strange for an application
 to
  be using typlen for anything at all actually.
 
 Thanks for the response.
 
 Since libpq function PQfsize returns -2 for all constant character
 strings in SQL statements ... What is the proper procedure to determine
 the length of a constant character column after query execution but
 before fetching the first row of data?

Why not just get the first row and determine the width from it before
you actually use any of the data?
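
Something like this, say (just a sketch; PQgetlength reports the actual
length of each fetched field, so the width can grow as rows arrive instead
of being fixed up front):

#include <libpq-fe.h>

/*
 * Sketch: derive a display width from the rows actually fetched, widening
 * it as longer values arrive rather than fixing it before the first fetch.
 */
static int
column_width(PGresult *res, int col)
{
    int width = 0;
    int row;

    for (row = 0; row < PQntuples(res); row++)
    {
        int len = PQgetlength(res, row, col);

        if (len > width)
            width = len;
    }
    return width;
}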

--
Hannu



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Hannu Krosing
On one fine day, Mon, 2007-06-11 at 22:11, Larry McGhaw wrote:
 As far as I am aware these statements are true.  If you have a
 specific example you could provide to the contrary that would be
 interesting.
  
 Even if there are such conditions it does not change the fact that
 libpq and/or postgresql is deficient in this area.
  
 For any query, the database should be capable of describing the
 metadata for the columns, which includes
 1) the column type
 and
 2) the column maximum length.
  
 This is such a basic database interface principle that I very
 disappointed that someone has not recognized this and simply said 
 yes, we see the issue we will work on it.
  
 Again, *all* other major relational databases do this ...  even blob
 fields have a maximum length reported from the database.
  
 I hope someone who truly understands database interfaces will read
 this thread and address the issue.
 For now we will have to special case postgres in our application
 until it is addressed.
  

or redesign your application so that it allocates memory as needed and
won't waste client memory by allocating the maximum possible amount for
each and every grid cell whether needed or not ;)

As I understand from this discussion you are writing some kind of
middleware (i.e. tools), and I'd expect toolmakers to do the right
thing.

allocating as much as could possibly ever be needed is something that
would be excusable in a quick-n-dirty end-user application, but not in a
tool.


Hannu





---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Hannu Krosing [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 8:42 PM
 To: Dann Corbit
 Cc: Tom Lane; Gregory Stark; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Selecting a constant question
 
 On one fine day, Mon, 2007-06-11 at 13:38, Dann Corbit wrote:
   -Original Message-
   From: Tom Lane [mailto:[EMAIL PROTECTED]
   Sent: Monday, June 11, 2007 1:32 PM
   To: Dann Corbit
   Cc: Gregory Stark; pgsql-hackers@postgresql.org
   Subject: Re: [HACKERS] Selecting a constant question
 ...
   You should be treating typlen as signed not unsigned, and not assuming
  a
   fixed width for any negative value.
  
   Since the width refers to the server internal representation, and not
  to
   what comes down the wire, I find it pretty strange for an application
  to
   be using typlen for anything at all actually.
 
  Thanks for the response.
 
  Since libpq function PQfsize returns -2 for all constant character
  strings in SQL statements ... What is the proper procedure to determine
  the length of a constant character column after query execution but
  before fetching the first row of data?
 
 Why not just get the first row and determine the width from it before
 you actually use any of tha data ?

What if the second row is 1000x longer?

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Selecting a constant question

2007-06-11 Thread Dann Corbit
 -Original Message-
 From: Hannu Krosing [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 11, 2007 10:43 PM
 To: Larry McGhaw
 Cc: Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van
 Oosterhout; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Selecting a constant question
 
 On one fine day, Mon, 2007-06-11 at 22:11, Larry McGhaw wrote:
  As far as I am aware these statements are true.  If you have a
  specific example you could provide to the contrary that would be
  interesting.
 
  Even if there are such conditions it does not change the fact that
  libpq and/or postgresql is deficient in this area.
 
  For any query, the database should be capable of describing the
  metadata for the columns, which includes
  1) the column type
  and
  2) the column maximum length.
 
  This is such a basic database interface principle that I very
  disappointed that someone has not recognized this and simply said 
  yes, we see the issue we will work on it.
 
  Again, *all* other major relational databases do this ...  even blob
  fields have a maximum length reported from the database.
 
  I hope someone who truly understands database interfaces will read
  this thread and address the issue.
  For now we will have to special case postgres in our application
  until it is addressed.
 
 
 or redesign your application so that it allocates memory as needed and
 won't waste client memory by allocating maximum possible amount for each
 and every grid cell weather needed or not ;)
 
 As I understand from this discussion you are writing some kind of
 middleware (i.e. tools), and I'd expect toolmakers to do the right
 thing.

In this case the middleware is:
ODBC/JDBC/OLEDB/.NET data drivers for PostgreSQL.

There are other related tools, but the above is the product for which the bug 
needs to be corrected.

 
 allocating as much as possibly ever needed is something that would be
 excusable in quick-n-dirty end user application, but not in a tool.

It's a requirement of the ODBC/JDBC/OLEDB/.NET specifications.  I suppose we 
could scan the table twice to figure out how large a column might be, but that 
would make the PostgreSQL driver run at 1/2 speed.  Not a very appetizing 
solution.

None of the other database vendors has any trouble reporting this information 
correctly.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq