Re: [HACKERS] Vacuum questions...

2005-09-25 Thread Hannu Krosing
On L, 2005-09-24 at 20:25 -0700, Joshua D. Drake wrote:

 Actually this also probably would not gain you much in 8.1
 as vacuum in theory is already dealing with itself.

Interesting. Could you explain it in a more detailed way?
How does vacuum deal with itself in 8.1?

  Also, would it be possible to add some means to check the status of a
  running vacuum? Even with vacuum verbose, once it starts in on a large
  table you have no way to know how far along it is.
 
 That is an interesting thought... Perhaps a quick scan of
 the table to see how many dead rows there are? Then check
 back every n/10? Hmmm... I am not a C guy so I don't know if
 that is technically feasible (although probably possible) but it
 is interesting from a DBA perspective.

Not sure of a quick scan approach, especially for tables big enough
that the progress info would be interesting (in my experience a scan is
never quick).

Perhaps VACUUM could send some statistics after every N pages, and this
could then be made available through something similar to the
pg_statistic table.
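To make the idea concrete, here is the sort of thing a DBA might then
query (a purely hypothetical view and columns, sketched for
illustration; nothing like this exists today):

-- Hypothetical progress view; neither the view name nor the columns
-- exist in any release.
SELECT relname, pages_scanned, pages_total,
       round(100.0 * pages_scanned / pages_total, 1) AS pct_done
  FROM pg_stat_vacuum_progress
 WHERE relname = 'my_big_table';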

-- 
Hannu Krosing [EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Discarding relations from FSM

2005-09-25 Thread Hannu Krosing
On L, 2005-09-24 at 19:32 -0500, Jim C. Nasby wrote:
 On Sat, Sep 24, 2005 at 07:21:19PM -0400, Tom Lane wrote:
   Of course maybe a better question is why we even limit based on the
   number of relations...
  
  Shared memory is fixed-size.
 
 True, but can't the fixed memory required per-relation just be shared
 with the fixed memory used to store free pages?
 
 Though, the idea mentioned recently of just using one shared memory
 segment for everything and allocating dynamically within that probably
 makes more sense...

I guess that communicating those changes to all running backends may be
expensive.
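For reference, the fixed allocations in question are the ones sized by
the FSM settings, which can be inspected from SQL but only changed with
a server restart:

-- Shared-memory FSM sizing; both values are fixed at server start.
SHOW max_fsm_relations;
SHOW max_fsm_pages;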

-- 
Hannu Krosing [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] \d on database with a lot of tables is slow

2005-09-25 Thread Hannu Krosing
On L, 2005-09-24 at 18:59 -0500, Jim C. Nasby wrote:
 I have a client with a database that contains 4000 relations according
 to vacuum verbose, and \d in psql is painfully slow. In particular...
 
 ->  Seq Scan on pg_class c  (cost=0.00..2343.09 rows=6124 width=73) (actual time=0.325..22100.840 rows=16856 loops=1)
       Filter: (((relkind = 'r'::"char") OR (relkind = 'v'::"char") OR (relkind = 'S'::"char") OR (relkind = ''::"char")) AND pg_table_is_visible(oid))
 
 That's off my laptop, but they're seeing similar issues on an 8-way
 Opteron as well...

I expect the issue on the 8-way Opteron to be more one of high load
than of a slow scan. It seems that sometimes a database with lots of
activity slows down considerably. I suspect some locking issue, but I'm
not sure that is the case.

Also, if a lot of temp tables are used, then pg_class and pg_attribute
(at least) get bloated quickly and need vacuuming.
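A rough way to check for that kind of catalog bloat, and to clean it up
(a sketch; compare relpages against reltuples after a fresh ANALYZE):

-- Unusually high relpages relative to reltuples on the catalogs
-- suggests they need a vacuum.
SELECT relname, relpages, reltuples
  FROM pg_class
 WHERE relname IN ('pg_class', 'pg_attribute');

VACUUM ANALYZE pg_class;
VACUUM ANALYZE pg_attribute;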

 I've messed around with adding indexes to a copy of pg_class to no
 avail. Any ideas on how to improve the performance?
 
 Also, not sure if this matters, but they're occasionally getting errors
 like 'Did not find relation named "table that exists"' (where "table
 that exists" is the name of some table that is in the catalog) from \d.
 Does anyone know what causes that?

Mostly this happens with temp tables from other connections, which have
managed to disappear by the time their detailed info is requested, and
which would actually not show up in \d output due to visibility checks.

-- 
Hannu Krosing [EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] \x output blowing up

2005-09-25 Thread Martijn van Oosterhout
On Sat, Sep 24, 2005 at 07:18:16PM -0400, Bruce Momjian wrote:
 
 Well, it seems we are going to have to fix it somehow for 8.1.  It is
 not crashing here so I can't work up a patch.  Can you submit a minimal
 fix for 8.1?  Thanks.

Ah, it would only happen if your encoding was UTF-8 since that's the
only case psql handles differently. I've attached a patch which fixes
it. With a bit more rearrangement you could probably simplify it a bit
but this works.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.
Index: print.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/print.c,v
retrieving revision 1.74
diff -c -r1.74 print.c
*** print.c     24 Sep 2005 17:53:27 -0000      1.74
--- print.c     25 Sep 2005 09:16:01 -0000
***************
*** 697,714 ****
              else
                  fputs(" ", fout);
  
              {
                  char       *my_cell = pg_local_malloc(cell_w[i] + 1);
  
                  strcpy(my_cell, *ptr);
!                 if (opt_align[i % col_count] == 'r' && opt_numeric_locale)
!                     format_numeric_locale(my_cell);
                  if (opt_border < 2)
                      fprintf(fout, "%s\n", my_cell);
                  else
                      fprintf(fout, "%-s%*s |\n", my_cell, dwidth - cell_w[i], "");
                  free(my_cell);
              }
          }
  
          if (opt_border == 2)
--- 697,726 ----
              else
                  fputs(" ", fout);
  
+             /* We separate the cases for numeric_locale because for UTF-8 (and
+              * other) encodings, cell_w[i] <= strlen(*ptr). We can't just use
+              * strlen() in the malloc because there needs to be enough room
+              * for format_numeric_locale() to store its result. */
+ 
+             if (opt_align[i % col_count] == 'r' && opt_numeric_locale)
              {
                  char       *my_cell = pg_local_malloc(cell_w[i] + 1);
  
                  strcpy(my_cell, *ptr);
!                 format_numeric_locale(my_cell);
                  if (opt_border < 2)
                      fprintf(fout, "%s\n", my_cell);
                  else
                      fprintf(fout, "%-s%*s |\n", my_cell, dwidth - cell_w[i], "");
                  free(my_cell);
              }
+             else
+             {
+                 if (opt_border < 2)
+                     fprintf(fout, "%s\n", *ptr);
+                 else
+                     fprintf(fout, "%-s%*s |\n", *ptr, dwidth - cell_w[i], "");
+             }
          }
  
          if (opt_border == 2)




Re: [HACKERS] stack depth limit exceeded problem.

2005-09-25 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

I rely on the signal handler that the JVM uses for page-faults (which a
stack overflow generally amounts to) and FPE exceptions, so I know that
they will generate Java exceptions in a controlled way (which I in turn
translate to elog(ERROR) on the main thread).

Well, actually, what I was thinking is: if someone sends a -INT or -TERM
to the backend, which thread will catch it? You have to block it in
every thread except the one you want to catch it in if you want to
control it. This means that for any signal handler that PostgreSQL
installs, you need to intercept it with a wrapper function to make sure
it runs on the right stack.

Actually, while running backend code, you're probably fine since the
elog stuff will handle it. But if a signal is received while the JVM is
running, the signal handler will get the stack of the JVM. Now,
PostgreSQL's signal handlers tend not to do much, so you may be safe.
They tend not to throw errors, but who knows...

Still, this is all solvable I think...
 

Yes, the signal handling in PL/Java needs a bit more work. Interrupts
don't work well when using PL/Java at present. This is what I plan to
do (I think this is what you mean too, right?).


Many threads are spawned by the JVM and will never enter the backend. I 
can't control this and I can't add thread initialization code. Hence, I 
have no way of blocking signals on a per-thread basis the normal way. 
Instead, all PostgreSQL handlers that might break when called from 
another thread must be replaced by a wrapper that checks the interrupted 
thread. Since an arbitrary thread will receive the signal, the wrapper 
must sometimes dispatch the signal to another thread. The final receiver 
of the signal must be either the thread that currently executes a 
backend request, or if no such thread exists, the main thread.


PL/Java will be limited to platforms where signals can be dispatched to
specific threads. I don't consider that a limitation. I think many JVMs
have the same restriction.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Vacuum questions...

2005-09-25 Thread Joshua D. Drake

Hannu Krosing wrote:


 On L, 2005-09-24 at 20:25 -0700, Joshua D. Drake wrote:
 
  Actually this also probably would not gain you much in 8.1
  as vacuum in theory is already dealing with itself.
 
 Interesting. Could you explain it in a more detailed way?
 How does vacuum deal with itself in 8.1?

Autovacuum is integrated into the backend for 8.1.
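To be precise: in 8.1 it ships turned off and depends on the row-level
statistics collector, so roughly the following postgresql.conf settings
are needed before it will run:

-- Shown via SHOW for convenience; the settings themselves live in
-- postgresql.conf: stats_start_collector = on, stats_row_level = on,
-- autovacuum = on.
SHOW autovacuum;
SHOW stats_row_level;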


 Not sure of a quick scan approach, especially for tables big enough
 that the progress info would be interesting (in my experience a scan is
 never quick).

It would be a seq scan, so on a larger table it would probably take
a long time. I was thinking there might be some map of known dead
rows or something, so we wouldn't have to scan the pages just for the
statistics.

We would of course still scan to do the actual work, but if vacuum
cleared the map while doing that work it might not be that bad.

Sincerely,

Joshua D. Drake



 Perhaps VACUUM could send some statistics after every N pages, and this
 could then be made available through something similar to the
 pg_statistic table.

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Vacuum questions...

2005-09-25 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 Perhaps VACUUM could send some statistics after every N pages, and this
 could then be made available through something similar to the
 pg_statistic table.

Why not just have it send some text to be displayed in the current
command field of pg_stat_activity?  The infrastructure is all there
already for that.
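A DBA could then simply poll for it, e.g. (assuming VACUUM reported its
progress into the current query text as suggested, and that
stats_command_string is on so current_query is populated):

SELECT procpid, query_start, current_query
  FROM pg_stat_activity
 WHERE current_query LIKE 'VACUUM%';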

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] roundoff problem in time datatype

2005-09-25 Thread Tom Lane
Inserting into a time field with limited precision rounds off, which
is good except for this case:

regression=# select '23:59:59.9'::time(0);
   time   
----------
 24:00:00
(1 row)

This is bad because:

regression=# select '24:00:00'::time(0);
ERROR:  date/time field value out of range: "24:00:00"

which means that data originally accepted will fail to dump and reload.
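In concrete terms (hypothetical table, same error text as above):

create table t (ts time(0));
insert into t values ('23:59:59.9');  -- accepted, silently stored as 24:00:00
-- pg_dump then emits '24:00:00', and restoring that value fails with
-- the same "date/time field value out of range" error shown above.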

I see this behavior in all versions back to 7.3.  7.2 was even more
broken:

regression=# select '23:59:59.9'::time(0);
   time   
----------
 00:00:00
(1 row)

I think the correct behavior has to be to check for overflow again
after rounding off.  Alternatively: why are we forbidding the value
24:00:00 anyway?  Is there a reason not to allow the hours field
to exceed 23?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Releasing memory during External sorting?

2005-09-25 Thread Simon Riggs
On Fri, 2005-09-23 at 11:31 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Since we know the predicted size of the sort set prior to starting the
  sort node, could we not use that information to allocate memory
  appropriately? i.e. if sort size is predicted to be more than twice the
  size of work_mem, then just move straight to the external sort algorithm
  and set the work_mem down at the lower limit?
 
 Have you actually read the sort code?

Yes and Knuth too. Your research and code are incredible, almost
untouchable. Yet sort performance is important and empirical evidence
suggests that this can be improved upon significantly, so I am and will
be spending time trying to improve upon that. Another time...

This thread was aiming to plug a problem I saw with 8.1's ability to use
very large work_mem settings. I felt that either my performance numbers
were wrong or we needed to do something; I've not had anybody show me
performance numbers that prove mine doubtful, yet.

 During the run-forming phase it's definitely useful to eat all the
 memory you can: that translates directly to longer initial runs and
 hence fewer merge passes.  

Sounds good, but maybe that is not the dominant effect. I'll retest, on
the assumption that there is a benefit and that something was wrong with
my earlier tests.

 During the run-merging phase it's possible
 that using less memory would not hurt performance any, but as already
 stated, I don't think it will actually end up cutting the backend's
 memory footprint --- the sbrk point will be established during the run
 forming phase and it's unlikely to move back much until transaction end.

 Also, if I recall the development of that code correctly, the reason for
 using more than minimum memory during the merge phase is that writing or
 reading lots of tuples at once improves sequentiality of access to the
 temp files.  So I'm not sure that cutting down the memory wouldn't hurt
 performance.

Cutting memory below about 16 MB definitely hurts external sort
performance; I explain that as being the effect of sequential access. I
haven't looked to nail down the breakpoint exactly, since it seemed more
important simply to say that there looked to be one. It's just that
raising work_mem above that mark doesn't help much, according to my
current results.
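For reference, the kind of retest I have in mind (table and column
names assumed; note that work_mem is set in kilobytes here):

SET work_mem = 16384;    -- 16 MB, near the apparent breakpoint
EXPLAIN ANALYZE SELECT * FROM test_table ORDER BY unindexed_col;
SET work_mem = 262144;   -- 256 MB, to see whether more memory helps
EXPLAIN ANALYZE SELECT * FROM test_table ORDER BY unindexed_col;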

I'll get some more test results and repost them, next week. I will be
very happy if the results show that more memory helps.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] Releasing memory during External sorting?

2005-09-25 Thread Simon Riggs
On Fri, 2005-09-23 at 12:48 -0400, Ron Peacetree wrote:

  I have some indications from private tests that very high memory settings
 may actually hinder performance of the sorts, though I cannot explain that
 and wonder whether it is the performance tests themselves that have issues.
 
 Hmmm.  Are you talking about amounts so high that you are throwing the OS
 into paging and swapping thrash behavior?  If not, then the above is weird.

Thanks for your thoughts. I'll retest, on the assumption that there is a
benefit and that something was wrong with my earlier tests.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Questions about proper newline handling in psql output

2005-09-25 Thread Martijn van Oosterhout
Hi,

I basically have a functional version for aligned output, examples at
the bottom of this email. It handles multiline data values and
multiline headers. However, there are some areas where I could use some
input.

1. To be able to control the spacing, psql now has to be very careful
about its output, e.g. \r is printed as \r, ASCII control characters are
output in \x00 style, and other control chars as \u. This is a
change from previous behaviour, yet you're pretty much forced to do so
if you want to control the output.

Is this change acceptable?

2. Currently I've changed the aligned outputs but not the unaligned
ones. Given you're not worrying about alignment there anyway, why do
the work? Also, we recommend unaligned output for script users so I
don't feel right changing it.

Is this distinction acceptable?

3. How to show that a value is continued? As you can see below, I use
':' before columns whose data is continued. This obviously doesn't work
for the first column if there's no outer border. If your border style is
0 you're totally out of luck.

I remember a discussion on this before but couldn't find it in the
archives. Either a reference or some other hints would be appreciated.

4. Some system output like pg_views has really really long strings,
would it be acceptable to change the output there to add newlines at
various places to make it output nicer with this change?

5. Auto string folding. If a string is really long, fold it so it fits
in a screen width, perhaps with '\' continuation. I haven't done this
but I can imagine some people (including me) would love it.

6. Currently I've implemented support for UTF-8 and all ASCII
compatible single-byte encodings. Given that psql didn't support the
others anyway maybe no-one cares, but I have to ask: does anyone care?
If so, I need info on *how* to support an encoding.

Thanks for your attention. See you tomorrow.

Query is: select oid, prosrc as "HdrLine1
HdrLine2", proacl from pg_proc limit 1;

Border style is 1.
  oid  |                          HdrLine1                           | proacl
       |                          HdrLine2                           |
-------+-------------------------------------------------------------+--------
 17009 | select 1 union all select 2 union all select 3 union all    |
       : select 4 union all select 5 union all select 6 union all
       : select 7 union all select 8 union all select 9 union all
       : select 10 union all select 11 union all select 12 union all
       : select 13 union all select 14 union all select 15 union all
       : select 16 union all select 17 union all select 18 union all
       : select 19 union all select 20 union all select 21 union all
       : select 22 union all select 23 union all select 24 union all
       : select 25 union all select 26 union all select 27 union all
       : select 28 union all select 29 union all select 30 union all
       : select 31 union all select 32
(1 row)

Expanded display is on.
-[ RECORD 1 ]--------------------------------------------------------
oid  | 17009
HdrLine1 | select 1 union all select 2 union all select 3 union all
HdrLine2 : select 4 union all select 5 union all select 6 union all
 : select 7 union all select 8 union all select 9 union all
 : select 10 union all select 11 union all select 12 union all
 : select 13 union all select 14 union all select 15 union all
 : select 16 union all select 17 union all select 18 union all
 : select 19 union all select 20 union all select 21 union all
 : select 22 union all select 23 union all select 24 union all
 : select 25 union all select 26 union all select 27 union all
 : select 28 union all select 29 union all select 30 union all
 : select 31 union all select 32
proacl   | 

# select chr(8);
 chr  
------
 \x08
(1 row)


-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] statement logging / extended query protocol issues

2005-09-25 Thread Oliver Jowett
Bruce Momjian wrote:

 Simon's page is in the patches queue.  What would you like changed,
 exactly?

I'm not going to have time to comment on this any time soon, sorry :( ..
I guess I will try to look at it for 8.2.

-O

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match