Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Mark Mielke

Jeff Davis wrote:

My first thought would be that we would need a new executor node (e.g.
ParallelSort) that would only be chosen when the cost of the sort is
large enough to outweigh other factors (such as process creation time,
dividing available work_mem, and any necessary IPC).

It seems to me the simplest way to do it would be to allow each sub
process to allocate work_mem/P where P is the degree of parallelization.
However, that somewhat works against our schemes for dynamic run
handling and forecasting, and may lead to more random reading from disk.
Any other scheme I can think of would involve more IPC, which might make
the idea just too complex.
  
I am curious - what algorithms exist to efficiently do a parallel sort? 
Do you mean that if sorting 1 million items, it is possible to separate this 
into 2 sets of 500 thousand each, execute them in separate threads 
(with task administration and synchronization overhead), combine the 
results, and complete the task in significantly less time than doing it 
in one thread? I am skeptical that this is possible, and suspect that 
the overall efficiency of the system would go down even if the 
throughput of a single execution increases.


Or do you mean being able to perform parts of the query plan fully in 
parallel? If this, then one would need a lot more than ParallelSort...


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Michał Zaborowski
2007/12/19, Mark Mielke [EMAIL PROTECTED]:
 Jeff Davis wrote:
  My first thought would be that we would need a new executor node (e.g.
  ParallelSort) that would only be chosen when the cost of the sort is
  large enough to outweigh other factors (such as process creation time,
  dividing available work_mem, and any necessary IPC).
 
  It seems to me the simplest way to do it would be to allow each sub
  process to allocate work_mem/P where P is the degree of parallelization.
  However, that somewhat works against our schemes for dynamic run
  handling and forecasting, and may lead to more random reading from disk.
  Any other scheme I can think of would involve more IPC, which might make
  the idea just too complex.
 
 I am curious - what algorithms exist to efficiently do a parallel sort?
 Do you mean that if sorting 1 million items, it is possible to separate this 
 into 2 sets of 500 thousand each, execute them in separate threads 
 (with task administration and synchronization overhead), combine the 
 results, and complete the task in significantly less time than doing it
 in one thread? I am skeptical that this is possible, and suspect that
 the overall efficiency of the system would go down even if the
 throughput of a single execution increases.

Ok - say we want to sort a table with quicksort, on N threads. Each thread
gets the begin and end indices of a piece of the table; the first step starts
at 0 and ends at count - 1. A single step: find the median (pivot) value and
move lower values before it and bigger values after it. In the normal case we
use a recursive call, so the same procedure is called for those two parts.
With threads, we can instead put the index ranges on a side list, and use a
queue of threads to pick up work from that list. We can use a common table;
access to the side list of indices has to be serialized.
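
A minimal standalone sketch of that scheme (pthreads, the fixed-size side
list, and all constants here are assumptions of the illustration, not
anything proposed for the backend):

/*
 * Toy task-queue parallel quicksort: workers pull [lo,hi] ranges from a
 * shared side list, partition in place, and push the two sub-ranges back.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define N        100000
#define NTHREADS 4
#define QMAX     4096	/* assumed large enough: live ranges <= N/CUTOFF */
#define CUTOFF   64		/* small ranges are sorted serially */

typedef struct { int lo, hi; } Range;

static int data[N];
static Range queue[QMAX];
static int qlen = 0, busy = 0;
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv  = PTHREAD_COND_INITIALIZER;

/* access to the side list is serialized, as noted above */
static void push(int lo, int hi)
{
	pthread_mutex_lock(&mtx);
	queue[qlen++] = (Range) {lo, hi};
	pthread_cond_signal(&cv);
	pthread_mutex_unlock(&mtx);
}

static int cmp(const void *a, const void *b)
{
	int x = *(const int *) a, y = *(const int *) b;
	return (x > y) - (x < y);
}

static void *worker(void *arg)
{
	for (;;)
	{
		Range r;

		pthread_mutex_lock(&mtx);
		while (qlen == 0 && busy > 0)
			pthread_cond_wait(&cv, &mtx);
		if (qlen == 0 && busy == 0)
		{
			/* no queued ranges and nobody working: all done */
			pthread_cond_broadcast(&cv);
			pthread_mutex_unlock(&mtx);
			return NULL;
		}
		r = queue[--qlen];
		busy++;
		pthread_mutex_unlock(&mtx);

		if (r.hi - r.lo < CUTOFF)
			qsort(data + r.lo, r.hi - r.lo + 1, sizeof(int), cmp);
		else
		{
			/* partition: lower values before the pivot, bigger after */
			int pivot = data[r.hi], i = r.lo, t;

			for (int j = r.lo; j < r.hi; j++)
				if (data[j] < pivot)
				{
					t = data[i]; data[i] = data[j]; data[j] = t; i++;
				}
			t = data[i]; data[i] = data[r.hi]; data[r.hi] = t;
			/* recursion becomes two pushes onto the side list */
			if (i - 1 > r.lo) push(r.lo, i - 1);
			if (i + 1 < r.hi) push(i + 1, r.hi);
		}

		pthread_mutex_lock(&mtx);
		if (--busy == 0 && qlen == 0)
			pthread_cond_broadcast(&cv);
		pthread_mutex_unlock(&mtx);
	}
}

int main(void)
{
	pthread_t th[NTHREADS];

	for (int i = 0; i < N; i++) data[i] = rand();
	push(0, N - 1);
	for (int i = 0; i < NTHREADS; i++) pthread_create(&th[i], NULL, worker, NULL);
	for (int i = 0; i < NTHREADS; i++) pthread_join(th[i], NULL);
	for (int i = 1; i < N; i++)
		if (data[i - 1] > data[i]) { puts("not sorted"); return 1; }
	puts("sorted");
	return 0;
}

The serialized push() is exactly the side-list access mentioned above;
whether its locking overhead eats the parallel gain is the open question.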

 Or do you mean being able to perform parts of the query plan fully in
 parallel? If this, then one would need a lot more than ParallelSort...

Nice to have, but rather for data warehouses. In other cases - e.g. a backend
for an Internet application - there are many concurrent requests, and every
processor / core is already kept busy.

-- 
Regards,
  Michał Zaborowski (TeXXaS)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Testing mail list

2007-12-19 Thread Gregory Stark

I'm receiving bogus bounce messages like this (they are even malformed: the
Subject isn't properly encoded). I'm not sure which list is generating them,
or for what address, but if we can figure out who it is, could we drop them
from the list please?

---BeginMessage---

A mensagem de email enviada para [EMAIL PROTECTED] requer confirmação para ser entregue. Por favor, responda este e-mail informando os caracteres que você vê na imagem abaixo.

The email message sent to [EMAIL PROTECTED] requires a confirmation to be delivered. Please, answer this email informing the characters that you see in the image below.

Não remova a próxima linha / Don't remove next line
captchakey:asbEJKOW1nbXcwMTY4Mjk

---End Message---


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] WORM and Read Only Tables (v0.1)

2007-12-19 Thread Hannu Krosing

On one fine day, Mon, 2007-12-17 at 09:20, Simon Riggs wrote:
 On Sat, 2007-12-15 at 13:32 +0100, Albert Cervera i Areny wrote:
   Read-Only Tables
   
   Postgres supports the concept of freezing tuples, so they can live
   forever within the database without needing further writes. Currently
   there is no command that will guarantee that a table has been completely
   frozen. This makes it difficult to reliably write data files to WORM
   media for longer term archiving. (WORM means Write-Once, Read-Many).
   It's also a pain having to VACUUM a large table again just because a
   small number of rows need to be frozen.
  
  
  I'm not an expert at all, but I'd like to understand this: do you plan that 
  READ-ONLY tables wouldn't even store transaction information? That should 
  save quite a lot of space. Maybe when the table is moved to the 
  compressed tablespace, MVCC information could be dropped too? Of course 
  that would prevent future insert & update possibilities though.
 
 It could, but it's a lot of work for little gain. The tuple headers look
 like they will compress fairly well, so why bother to remove them at
 all?

One place for removing them would be if we do column-stores where there
would be one header per column instead of one per tuple.

--
Hannu




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Andreas Joseph Krogh
On Tuesday 18 December 2007 10:03:25 Dimitri Fontaine wrote:
 Hi,

 On Tuesday 18 December 2007, Ron Mayer wrote:
  Has anyone looked into sorting algorithms that could use
  more than one CPU or core at a time?

 [...]

  PS: Yeah, I know multi-threading is a hot-button on these
  lists; but sorting seems a relatively isolated part of the code,
  and I'd wonder if it'd be isolate-able enough that multiple
  CPUs could be used there.

 And before objections to a multi-threaded implementation and portability
 concerns arise, what about using a coroutine BSD-licenced portable
 implementation such as Protothreads to have backend code use several CPUs at
 a time?
   http://www.sics.se/~adam/pt/

 With such a tool, would it be possible to think about producer/consumer
 parallel executions for sorting, aggregates nodes or other parts of the
 executor?

 Hope this helps, regards,

And remember: users don't care about portability issues, they care about 
performance. If multi-threading is a way to speed up sorting considerably, it 
should, IMHO, be considered seriously.

--
Andreas Joseph Krogh

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Testing mail list

2007-12-19 Thread Andrew Dunstan



Gregory Stark wrote:

I'm receiving bogus bounce messages like this (they are even malformed: the
Subject isn't properly encoded). I'm not sure which list is generating them,
or for what address, but if we can figure out who it is, could we drop them
from the list please?

Subject:
Confirmação de envio / Sending confirmation (captchaid:1324333124c3)
From:
[EMAIL PROTECTED]

The email message sent to [EMAIL PROTECTED] requires a confirmation to 
be delivered. Please, answer this email informing the characters that 
you see in the image below

Receipt of messages like this guarantees an immediate entry in my 
junk filter. Use of this braindead software is bad enough, but being so 
clueless as not to whitelist a technical mailing list you subscribe to 
is truly horrible.


cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] buildenv.pl/buildenv.bat

2007-12-19 Thread Magnus Hagander
On Fri, Dec 14, 2007 at 10:47:02AM -0500, Andrew Dunstan wrote:
 
 
 Magnus Hagander wrote:
 On Fri, Dec 14, 2007 at 03:39:14PM +, Dave Page wrote:
   
 Andrew Dunstan wrote:
 
 Writing and calling a temp .bat file might be yucky - having to keep two
 environment files is a lot more yucky, IMNSHO, and we shouldn't make
 people do it.
   
 +1
 
 
 Ok, I guess I'm outvoted ;-) I don't care *that* much about it, so let's do
 it your way.
 
 You need help testing that patch further, or do you consider it ready for
 application? If so, go.
 
 
   
 
 Yes, please do test it - I don't use buildenv.* at all normally, as the 
 buildfarm client takes care of its own environment, and that's what I 
 usually use to build on Windows.

It works in my two standalone environments (win32 and win64 machine),
so I've applied the patch now to get full buildfarm testing.

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Magnus Hagander
On Sun, Dec 16, 2007 at 10:31:59AM +, Gregory Stark wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 
  Gregory Stark [EMAIL PROTECTED] writes:
  Andrew Dunstan [EMAIL PROTECTED] writes:
  Interesting. Maybe forever is going a bit too far, but retrying for n
  seconds or so.
 
  I think looping forever is the right thing. Having a fixed timeout just 
  means
  Postgres will break sometimes instead of all the time. And it introduces
  non-deterministic behaviour too.
 
  Looping forever would be considered broken by a very large fraction of
  the community.
 
 Really? I understood we're talking about having Postgres fail with an error if
 any of its files are opened by another program such as backup software. So
 with a 30s limit it means Postgres might or might not fail depending on how
 long this other software has the file open. That doesn't seem like an
 improvement.

If your software is locking a file for that long, that software is more
than just broken, it's horribly broken. Having a workaround against
something that might happen once or twice because of a bug in the other
software is one thing, but if it's actually *designed* to do that you
really need to get that software removed from your machine.

Having the server completely stop for 30 seconds waiting for something to
happen is bad enough, I think. Better to error out to let the user know
that there's a major configuration problem on the machine.

  IIRC we have a 30-second timeout in rename() for Windows, and that seems
  to be working well enough, so I'd be inclined to copy the behavior for
  this case.
 
 I thought it was unlink, and the worst-case there is that we leak a file until
 some later time. I wasn't exactly following that case though.

We do it on both rename and unlink.

//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Magnus Hagander
On Sat, Dec 15, 2007 at 10:31:38PM -0500, Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  Andrew Dunstan [EMAIL PROTECTED] writes:
  Interesting. Maybe forever is going a bit too far, but retrying for n
  seconds or so.
 
  I think looping forever is the right thing. Having a fixed timeout just 
  means
  Postgres will break sometimes instead of all the time. And it introduces
  non-deterministic behaviour too.
 
 Looping forever would be considered broken by a very large fraction of
 the community.
 
 IIRC we have a 30-second timeout in rename() for Windows, and that seems
 to be working well enough, so I'd be inclined to copy the behavior for
 this case.

Here's a patch that I think implements this ;) Alvaro - do you have a build
env so you can test it? I can't reproduce the problem in my environment...

Also, it currently just silently loops. Would it be interesting to
ereport(WARNING) that it's looping on the open, to let the user know
there's a problem? (Naturally, only warning the first time it tries it on
each file, so we don't spam the log too hard)

//Magnus
Index: src/port/open.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/port/open.c,v
retrieving revision 1.22
diff -c -r1.22 open.c
*** src/port/open.c	30 Nov 2007 11:16:43 -0000	1.22
--- src/port/open.c	19 Dec 2007 12:42:05 -0000
***************
*** 58,65 ****
  pgwin32_open(const char *fileName, int fileFlags,...)
  {
  	int			fd;
! 	HANDLE		h;
  	SECURITY_ATTRIBUTES sa;
  
  	/* Check that we can handle the request */
  	assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
--- 58,66 ----
  pgwin32_open(const char *fileName, int fileFlags,...)
  {
  	int			fd;
! 	HANDLE		h = INVALID_HANDLE_VALUE;
  	SECURITY_ATTRIBUTES sa;
+ 	int			loops = 0;
  
  	/* Check that we can handle the request */
  	assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
***************
*** 71,77 ****
  	sa.bInheritHandle = TRUE;
  	sa.lpSecurityDescriptor = NULL;
  
! 	if ((h = CreateFile(fileName,
  	/* cannot use O_RDONLY, as it == 0 */
  					  (fileFlags & O_RDWR) ? (GENERIC_WRITE | GENERIC_READ) :
  					 ((fileFlags & O_WRONLY) ? GENERIC_WRITE : GENERIC_READ),
--- 72,78 ----
  	sa.bInheritHandle = TRUE;
  	sa.lpSecurityDescriptor = NULL;
  
! 	while ((h = CreateFile(fileName,
  	/* cannot use O_RDONLY, as it == 0 */
  					  (fileFlags & O_RDWR) ? (GENERIC_WRITE | GENERIC_READ) :
  					 ((fileFlags & O_WRONLY) ? GENERIC_WRITE : GENERIC_READ),
***************
*** 88,93 ****
--- 89,108 ----
  						((fileFlags & O_DSYNC) ? FILE_FLAG_WRITE_THROUGH : 0),
  						NULL)) == INVALID_HANDLE_VALUE)
  	{
+ 		/*
+ 		 * Sharing violation or locking error can indicate antivirus, backup
+ 		 * or similar software that's locking the file. Try again for 30
+ 		 * seconds before giving up.
+ 		 */
+ 		if (GetLastError() == ERROR_SHARING_VIOLATION ||
+ 			GetLastError() == ERROR_LOCK_VIOLATION)
+ 		{
+ 			pg_usleep(100000);
+ 			loops++;
+ 			if (loops < 30)
+ 				continue;
+ 		}
+ 
  		_dosmaperr(GetLastError());
  		return -1;
  	}

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Gregory Stark
Magnus Hagander [EMAIL PROTECTED] writes:

 If your software is locking a file for that long, that software is more
 than just broken, it's horribly broken. Having a workaround against
 something that might happen once or twice because of a bug in the other
 software is one thing, but if it's actually *designed* to do that you
 really need to get that software removed from your machine.

I was under the impression though that this was just how open worked on
windows. Only one process can have a file open at the same time.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Testing mail list

2007-12-19 Thread Gregory Stark
Andrew Dunstan [EMAIL PROTECTED] writes:

 Receipt of messages like this guarantees an immediate entry in my junk
 filter. Use of this braindead software is bad enough, but being so clueless as
 not to whitelist a technical mailing list you subscribe to is truly horrible.

It's worse than that in this case. This is an *impressively* broken
configuration. What appears to be happening is that the mail server at this
university is looking at the To and From headers and treating it as a personal
email between those two addresses. It sends this captcha to the From header
claiming that the person in the To header is insisting on the captcha being
filled out. The first such bounce I looked at actually claimed it was on Tom's
behalf!

If I were the list maintainer here I would ban infotecnica.com.br addresses
from subscribing to any of our lists. Ideally with a message saying "as a
result of misconfigured mail software, addresses from infotecnica.com.br are
banned from pgsql mailing lists. Please contact your postmaster to request
they fix the problem."

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Magnus Hagander
On Wed, Dec 19, 2007 at 01:05:00PM +, Gregory Stark wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 
  If your software is locking a file for that long, that software is more
  than just broken, it's horribly broken. Having a workaround against
  something that might happen once or twice because of a bug in the other
  software is one thing, but if it's actually *designed* to do that you
  really need to get that software removed from your machine.
 
 I was under the impression though that this was just how open worked on
 windows. Only one process can have a file open at the same time.

Then you're wrong. You can open files in shared mode no problem - it's how
we have multiple backends opening the same file. You do have to remember to
specify the flag when you open the file - something backup software, for
example, has always done, unless it's really crappy design.
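
For illustration, the flag in question is the third argument of the Win32
CreateFile() call - the share mode. A standalone sketch (the path is made
up for the example):

#include <windows.h>
#include <stdio.h>

int main(void)
{
	/*
	 * Pass 0 as the share mode and a second opener gets
	 * ERROR_SHARING_VIOLATION; pass the FILE_SHARE_* flags and many
	 * processes can hold the file open at once, which is what the
	 * backend's pgwin32_open() relies on.
	 */
	HANDLE h = CreateFile("C:\\some\\data\\file",	/* hypothetical path */
						  GENERIC_READ,
						  FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
						  NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

	if (h == INVALID_HANDLE_VALUE)
		printf("CreateFile failed: error %lu\n", GetLastError());
	else
		CloseHandle(h);
	return 0;
}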

Most proper backup software will use VSS these days, which makes the whole
point moot.

And antivirus is supposed to be entirely transparent, so it's definitely a
bug when it happens. But we've seen a lot of buggy antivirus.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Magnus Hagander
On Wed, Dec 19, 2007 at 11:23:35AM -0300, Alvaro Herrera wrote:
 Magnus Hagander wrote:
  On Sat, Dec 15, 2007 at 10:31:38PM -0500, Tom Lane wrote:
   Gregory Stark [EMAIL PROTECTED] writes:
Andrew Dunstan [EMAIL PROTECTED] writes:
Interesting. Maybe forever is going a bit too far, but retrying for n
seconds or so.
   
I think looping forever is the right thing. Having a fixed timeout just 
means
Postgres will break sometimes instead of all the time. And it introduces
non-deterministic behaviour too.
   
   Looping forever would be considered broken by a very large fraction of
   the community.
   
   IIRC we have a 30-second timeout in rename() for Windows, and that seems
   to be working well enough, so I'd be inclined to copy the behavior for
   this case.
  
  Here's a patch that I think implements this ;) Alvaro - do you have a build
  env so you can test it? I can't reproduce the problem in my environment...
 
 Thanks -- forwarded to the appropriate parties. :-)

Thanks. Let us know the results :-)

  Also, it currently just silently loops. Would it be interesting to
  ereport(WARNING) that it's looping on the open, to let the user know
  there's a problem? (Naturally, only warning the first time it tries it on
  each file, so we don't spam the log too hard)
 
 Yeah, I think it would be useful to log one message if after (say) 5
 seconds we still haven't been able to open the file.

Either that, or on the first run.


 Is the sleep time correct?  If I'm reading it right, it sleeps 100 ms
 each time, 30 times, that totals 3 seconds ... ?

Uh, I copied that from pgunlink() and pgrename(), but forgot a zero on the
loop. It's supposed to loop 300 times.

 (Are we OK with the idea of sleeping 1 second each time?)

I think not. 0.1 seconds is better. We don't want to delay a full second if
it's just a transient thing.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Alvaro Herrera
Magnus Hagander wrote:
 On Sat, Dec 15, 2007 at 10:31:38PM -0500, Tom Lane wrote:
  Gregory Stark [EMAIL PROTECTED] writes:
   Andrew Dunstan [EMAIL PROTECTED] writes:
   Interesting. Maybe forever is going a bit too far, but retrying for n
   seconds or so.
  
   I think looping forever is the right thing. Having a fixed timeout just 
   means
   Postgres will break sometimes instead of all the time. And it introduces
   non-deterministic behaviour too.
  
  Looping forever would be considered broken by a very large fraction of
  the community.
  
  IIRC we have a 30-second timeout in rename() for Windows, and that seems
  to be working well enough, so I'd be inclined to copy the behavior for
  this case.
 
 Here's a patch that I think implements this ;) Alvaro - do you have a build
 env so you can test it? I can't reproduce the problem in my environment...

Thanks -- forwarded to the appropriate parties. :-)

 Also, it currently just silently loops. Would it be interesting to
 ereport(WARNING) that it's looping on the open, to let the user know
 there's a problem? (Naturally, only warning the first time it tries it on
 each file, so we don't spam the log too hard)

Yeah, I think it would be useful to log one message if after (say) 5
seconds we still haven't been able to open the file.

Is the sleep time correct?  If I'm reading it right, it sleeps 100 ms
each time, 30 times, that totals 3 seconds ... ?

(Are we OK with the idea of sleeping 1 second each time?)



-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Zeugswetter Andreas ADI SD

  Yeah, I think it would be useful to log one message if after (say) 5
  seconds we still haven't been able to open the file.
 
 Either that, or on the first run.

Imho 1-5s is better, so that would be after the 10-50th try.

 loop. It's supposed to loop 300 times.

Yes.
 
  (Are we OK with the idea of sleeping 1 second each time?)
 
 I think not. 0.1 seconds is better. We don't want to delay a full
second if
 it's just a transient thing.

Yes 0.1 s is imho good. Btw. m$ is talking about milliseconds
(http://support.microsoft.com/kb/316609) 

Andreas

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Andrew Dunstan



Zeugswetter Andreas ADI SD wrote:

Yes 0.1 s is imho good. Btw. m$ is talking about milliseconds
(http://support.microsoft.com/kb/316609) 

We have seen cases in the past where these locks last quite a long time. 
That 30s total timeout in rename and unlink was not chosen arbitrarily - 
it's based on experience.

I think a retry interval of 0.1s here should be OK.

cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Magnus Hagander
On Wed, Dec 19, 2007 at 04:20:23PM +0100, Zeugswetter Andreas ADI SD wrote:
 
   Yeah, I think it would be useful to log one message if after (say) 5
   seconds we still haven't been able to open the file.
  
  Either that, or on the first run.
 
 Imho 1-5s is better, so that would be after the 10-50th try.

Ok, so I'll put in a warning after 50 tries.


  loop. It's supposed to loop 300 times.
 
 Yes.
  
   (Are we OK with the idea of sleeping 1 second each time?)
  
  I think not. 0.1 seconds is better. We don't want to delay a full
 second if
  it's just a transient thing.
 
 Yes 0.1 s is imho good. Btw. m$ is talking about milliseconds
 (http://support.microsoft.com/kb/316609) 

Their sample code sleeps for 0.25 seconds though. (Sleep() takes
milliseconds). So we're definitely fine with 0.1sec I think.

//Magnus

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Benchmark for GiST index?

2007-12-19 Thread Teodor Sigaev

What kind of benchmark? I have a benchmark for text search,
http://www.sigaev.ru/cvsweb/cvsweb.cgi/ftsbench/
and several others based on it.


Koichi Suzuki wrote:

Hi,

Does anybody have any information about GiST index benchmarks?  I'm
planning to run a GiST benchmark and analyze the WAL structure to see PG's
dynamic behavior.

Regards;

--
Koichi Suzuki

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Magnus Hagander
On Wed, Dec 19, 2007 at 04:44:12PM +0100, Magnus Hagander wrote:
 On Wed, Dec 19, 2007 at 04:20:23PM +0100, Zeugswetter Andreas ADI SD wrote:
  
Yeah, I think it would be useful to log one message if after (say) 5
seconds we still haven't been able to open the file.
   
   Either that, or on the first run.
  
  Imho 1-5s is better, so that would be after the 10-50th try.
 
 Ok, so I'll put in a warning after 50 tries.

Updated version attached. Comments on the wording of the messages are also
welcome ;-)

//Magnus
Index: src/port/open.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/port/open.c,v
retrieving revision 1.22
diff -c -r1.22 open.c
*** src/port/open.c	30 Nov 2007 11:16:43 -0000	1.22
--- src/port/open.c	19 Dec 2007 15:50:06 -0000
***************
*** 13,19 ****
  
  #ifdef WIN32
  
! #include "c.h"
  
  #include <windows.h>
  #include <fcntl.h>
--- 13,23 ----
  
  #ifdef WIN32
  
! #ifndef FRONTEND
! #include "postgres.h"
! #else
! #include "postgres_fe.h"
! #endif
  
  #include <windows.h>
  #include <fcntl.h>
***************
*** 58,65 ****
  pgwin32_open(const char *fileName, int fileFlags,...)
  {
  	int			fd;
! 	HANDLE		h;
  	SECURITY_ATTRIBUTES sa;
  
  	/* Check that we can handle the request */
  	assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
--- 62,70 ----
  pgwin32_open(const char *fileName, int fileFlags,...)
  {
  	int			fd;
! 	HANDLE		h = INVALID_HANDLE_VALUE;
  	SECURITY_ATTRIBUTES sa;
+ 	int			loops = 0;
  
  	/* Check that we can handle the request */
  	assert((fileFlags & ((O_RDONLY | O_WRONLY | O_RDWR) | O_APPEND |
***************
*** 71,77 ****
  	sa.bInheritHandle = TRUE;
  	sa.lpSecurityDescriptor = NULL;
  
! 	if ((h = CreateFile(fileName,
  	/* cannot use O_RDONLY, as it == 0 */
  					  (fileFlags & O_RDWR) ? (GENERIC_WRITE | GENERIC_READ) :
  					 ((fileFlags & O_WRONLY) ? GENERIC_WRITE : GENERIC_READ),
--- 76,82 ----
  	sa.bInheritHandle = TRUE;
  	sa.lpSecurityDescriptor = NULL;
  
! 	while ((h = CreateFile(fileName,
  	/* cannot use O_RDONLY, as it == 0 */
  					  (fileFlags & O_RDWR) ? (GENERIC_WRITE | GENERIC_READ) :
  					 ((fileFlags & O_WRONLY) ? GENERIC_WRITE : GENERIC_READ),
***************
*** 88,93 ****
--- 93,121 ----
  						((fileFlags & O_DSYNC) ? FILE_FLAG_WRITE_THROUGH : 0),
  						NULL)) == INVALID_HANDLE_VALUE)
  	{
+ 		/*
+ 		 * Sharing violation or locking error can indicate antivirus, backup
+ 		 * or similar software that's locking the file. Try again for 30
+ 		 * seconds before giving up.
+ 		 */
+ 		if (GetLastError() == ERROR_SHARING_VIOLATION ||
+ 			GetLastError() == ERROR_LOCK_VIOLATION)
+ 		{
+ 			pg_usleep(100000);
+ 			loops++;
+ 
+ #ifndef FRONTEND
+ 			if (loops == 50)
+ 				ereport(WARNING,
+ 						(errmsg("could not open file \"%s\" due to sharing violation. Will retry for 30 seconds.",
+ 								fileName),
+ 						 errhint("You may have antivirus, backup or similar software interfering with the database.")));
+ #endif
+ 
+ 			if (loops < 300)
+ 				continue;
+ 		}
+ 
  		_dosmaperr(GetLastError());
  		return -1;
  	}

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Tom Lane
Andreas Joseph Krogh [EMAIL PROTECTED] writes:
 And remember; Users don't care about portability-issues, they care about 
 performance.

Nonsense.  If Postgres stops working on their machine, they'll care.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Alvaro Herrera
Zeugswetter Andreas ADI SD wrote:

  I think not. 0.1 seconds is better. We don't want to delay a full
  second if it's just a transient thing.
 
 Yes 0.1 s is imho good. Btw. m$ is talking about milliseconds
 (http://support.microsoft.com/kb/316609) 

Hm, the article only mentions ERROR_SHARING_VIOLATION but we're also
considering ERROR_LOCK_VIOLATION.  Should we retry only on the
SHARING violation?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Testing mail list

2007-12-19 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 It's worse than that in this case. This is an *impressively* broken
 configuration.

Understatement of the week.  The mail includes absolutely no evidence
about what message is allegedly being filtered.  Are you sure that
this is really a filtering engine at all, and not just random spam
hoping to draw responses from careless people?  I've heard of web
comment-spammers who try to get other people to decode captchas
for them this way.

Adding to my suspicion is that I don't recall having seen one of these
personally, and if it were really tied to posting on any of the PG
lists, I shoulda seen a lot ;-)

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Testing mail list

2007-12-19 Thread Alvaro Herrera
Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  It's worse than that in this case. This is an *impressively* broken
  configuration.
 
 Understatement of the week.  The mail includes absolutely no evidence
 about what message is allegedly being filtered.  Are you sure that
 this is really a filtering engine at all, and not just random spam
 hoping to draw responses from careless people?  I've heard of web
 comment-spammers who try to get other people to decode captchas
 for them this way.
 
 Adding to my suspicion is that I don't recall having seen one of these
 personally, and if it were really tied to posting on any of the PG
 lists, I shoulda seen a lot ;-)

Yeah, I think it comes from pgsql-performance.  I just got one
mentioning an address to which I had responded some minutes before.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Testing mail list

2007-12-19 Thread Andrew Sullivan
On Wed, Dec 19, 2007 at 11:15:37AM -0500, Tom Lane wrote:
 hoping to draw responses from careless people?  I've heard of web
 comment-spammers who try to get other people to decode captchas
 for them this way.

Yes.  This is the latest spammer trick.  They get people all over the globe
to decode the captchas.  It's way easier than programming to decode the
captchas (which itself isn't that hard -- there are plenty of toolkits out
there that will decode such things for you).

A


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Testing mail list

2007-12-19 Thread Tom Lane
I wrote:
 Adding to my suspicion is that I don't recall having seen one of these
 personally,

I take that back --- some digging in my mail logs shows that I have
gotten a few of these, but they went straight to /dev/null because
my spam filters thought they were a virus.  Have you checked whether
that gif is really an image, rather than a bit of malware?

The mail-log trace of the last such attempt is pretty interesting too:

Dec 16 13:05:16 sss2 sm-mta[27362]: lBGI5G1g027362: infotecnica.com.br 
[201.35.247.5] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Dec 16 13:05:16 sss2 sm-mta[27363]: lBGI5GFn027363: infotecnica.com.br 
[201.35.247.5] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Dec 16 13:05:17 sss2 sm-mta[27365]: lBGI5HIe027365: infotecnica.com.br 
[201.35.247.5] did not issue MAIL/EXPN/VRFY/ETRN during connection to MTA
Dec 16 13:05:52 sss2 sm-mta[27368]: lBGI5n2G027368: from=[EMAIL PROTECTED], 
size=27892, class=0, nrcpts=1, msgid=[EMAIL PROTECTED]
r, bodytype=8BITMIME, proto=ESMTP, daemon=MTA, relay=infotecnica.com.br 
[201.35.247.5]
Dec 16 13:05:52 sss2 sm-mta[27369]: lBGI5n2G027368: 
to=|/usr/local/bin/procmail -tYf- || exit 75 #tgl, ctladdr=[EMAIL 
PROTECTED] (301/20), delay=00:00:02, xdelay=0
0:00:00, mailer=prog, pri=58095, dsn=2.0.0, stat=Sent

Since 11 December there are consistently three no-op connections before
anything actually happens, which adds a whole new layer of incompetence
that could be charged against whoever is running this, if it actually is
a mail server --- which I grow increasingly dubious of.  I also see a
whole lot of connection attempts in the preceding months in which
nothing was *ever* sent, just "did not issue MAIL" reports in bursts of
three.

Looks like spamhaus.org was blocking them for portions of last month,
too, so other people have been unhappy about this as well.

Whoever these people are, I've seen enough; I'm off to add this IP
address to my local permanent blacklist.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Mark Mielke

Michał Zaborowski wrote:

Ok - say we want to sort a table with quicksort, on N threads. Each thread
gets the begin and end indices of a piece of the table; the first step starts
at 0 and ends at count - 1. A single step: find the median (pivot) value and
move lower values before it and bigger values after it. In the normal case we
use a recursive call, so the same procedure is called for those two parts.
With threads, we can instead put the index ranges on a side list, and use a
queue of threads to pick up work from that list. We can use a common table;
access to the side list of indices has to be serialized.
  
Stupid question #2: Is it well recognized that the CPU is the bottleneck 
in the PostgreSQL sorting mechanism? Or might it be memory bandwidth and 
I/O?


It would seem to me that any sort worth parallelizing (given the 
administrative and synchronization overhead) must have data larger than the 
L2 cache. If larger than the L2 cache, it runs at real memory speed. At real 
memory speed, wouldn't one CPU, without hardware synchronization, be able 
to fill the memory read/write pipe? And if we 'divide and conquer' to 
parallelize, wouldn't the values written from one thread often (1 / N) 
need to be read from another thread, requiring hardware data 
synchronization?


I see the wikipedia.org page describes how easy it is to parallelize 
quick sort, and scale performance linearly with the number of 
processors, but I don't see references to back this claim.
At least some of these steps seem difficult or impractical to 
parallelize. For example, the initial partition reorder that moves items 
lower than the pivot to the left, and items higher than the pivot to the 
right, would not be easy to parallelize using an in-place re-order. It 
needs to move one partition down before it can 'divide and conquer'. 
They say no synchronization is required, but I think they are missing 
the hardware synchronization required (especially in the innermost 
loops, where the thread's task becomes shorter and starts to fit in 
L1/L2). They say linear, but then talk about a 'new thread being 
created'. New thread creation has a cost, and if reduced to using a 
thread pool, then synchronization *is* required.


It sounds like a 'nice in theory' idea. :-) Which doesn't mean it is 
wrong...


I am curious enough to write a test...

Or do you mean being able to perform parts of the query plan fully in
parallel? If this, then one would need a lot more than ParallelSort..

Nice to have, but rather for data warehouses. In other cases... IE - backend
for Internet - there are many requests and every processor / core works nice.
  
I'm a fan of the model where each plan item is a task assigned to a 
pool, with each CPU grabbing tasks from the pool. Another 'nice in 
theory' idea (used by DB2?). As it is, though, I think PostgreSQL 
planning is heavily designed to maximize performance on a single CPU, 
and single queries would not easily scale to multiple CPUs. (Perhaps 
hashing could be done on another CPU, or, as you describe above, sorting.)


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]



Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

2007-12-19 Thread Gokulakannan Somasundaram
I have submitted the first working patch for the trailing-null optimization.
It currently does the following:
a) Doesn't store the null bitmap if the heap tuple / index tuple contains
only trailing nulls
b) In a heap tuple, the trailing nulls won't occupy space in the null bitmap.

The general design is like this:
a) After checking for trailing nulls, I reduce the number-of-attributes
field which gets stored in each heap tuple.
b) For indexes, I have changed Index_form_tuple to store the unaligned
total size in the size mask. While navigating through the index tuple, if
the offset exceeds the unaligned total size stored, then a null is returned.

Please review it and provide suggestions.



 
  I doubt you have fixed it; I doubt it's *possible* to fix it without
  significant rejiggering of IndexTuple representation.  The problem is
  that IndexTuple lacks a number-of-fields field, so there is no place
  to indicate how many null bitmap bits you have actually stored.


Actually, I have made one change to the structure of IndexTupleData: instead
of storing the aligned size in the size mask, I store the unaligned size,
i.e. the size before the final MAXALIGN. The interface remains unchanged:
IndexTupleSize does a MAXALIGN before returning the size value. The advantage
of storing the unaligned size is that we can then get both the aligned and
the unaligned size (as you may know). I have created two more macros to
return the unaligned size.
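
In terms of the existing itup.h definitions, the described change might look
roughly like this (a sketch only; IndexTupleUnalignedSize is an invented
name, not necessarily what the patch uses):

/* size mask holds the unaligned size; align it on the way out so the
 * existing IndexTupleSize interface is unchanged */
#define IndexTupleUnalignedSize(itup) \
	((Size) (((IndexTuple) (itup))->t_info & INDEX_SIZE_MASK))
#define IndexTupleSize(itup) \
	MAXALIGN(IndexTupleUnalignedSize(itup))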






  I would suggest forgetting that part and submitting the part that
  has some chance of getting accepted.


Actually, I want to submit the patch which is best according to me.




 I suspect there's also an awkward case that *does* need to handled when
 you
 insert a tuple which has a null column which you're leaving out of the
 tuple
 but which appears in an index. You would have to make sure that the index
 tuple has that datum listed as NULL even though it's entirely missing from
 the
 heap tuple.

 Actually this is taken care of because of your suggestion. When you add a new
column, it doesn't appear in the heap tuple, but if you create an index on
that column afterwards, the case is handled. There is a field in HeapTuple
which gives the number of attributes in the tuple. If we request attribute
numbers greater than this number, null is returned. So that problem is
taken care of.


-- 
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


trailing-nulls.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Jeff Davis
On Wed, 2007-12-19 at 12:08 -0500, Mark Mielke wrote:
 Stupid question #2: Is it well recognized that the CPU is the
 bottleneck in the PostgreSQL sorting mechanism? Or might it be memory
 bandwidth and I/O?
 

I think it depends a lot on several factors. It's probably a different
bottleneck for integers versus localized text, and depends on the
available memory and I/O characteristics.

 
 It would seem to me that any sort worth parallelizing (administrative
 and synchronization overhead), must have data larger than the L2
 cache. If larger than the L2 cache, it becomes real memory speed. If
 real memory speed, wouldn't one CPU without hardware synchronization,
 be able to fill the memory read/write pipe? 

We do an external merge sort, which involves merging M runs together.
You seem to be implying that we can generate the output run at disk
speed, and therefore the CPU speed is unimportant.

I suspect that the comparison costs are enough that the above statement
isn't true in all cases, particularly in the case of localized text. 

Also, there is probably a lot of memory copying going on, and that
probably destroys a lot of the effectiveness of L2 caching. When L2
caching is ineffective, the CPU spends a lot of time just waiting on
memory. In that case, it's better to have P threads of execution all
waiting on memory operations in parallel.

This would explain why 1p2t would outperform a 1p1t in Ron's
reference above.

These are just my first thoughts, however. There is a lot of existing
research out there that we can look into, and also a lot of tests that
we can run before jumping into this.

I think parallel sorting can be looked into separately from the other
sorting improvements.

Regards,
Jeff Davis




---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Ron Mayer
Mark Mielke wrote:
 I am curious - what algorithms exist to efficiently do a parallel sort?
 Do you mean that if sorting 1 million items, it is possible to separate this
 into 2 sets of 500 thousand each, execute them in separate threads
 (with task administration and synchronization overhead), combine the
 results, and complete the task in significantly less time than doing it
 in one thread? I am skeptical that this is possible...

The link in the beginning of the thread points to articles
that seem to describe one such algorithm; along with benchmarks.
(http://tinyurl.com/3bvu4u, http://tinyurl.com/32wg2m)
The improvements were pretty consistent from set sizes ranging
from very small sets (hundreds) to quite large ones (hundreds of K).

Interestingly, even multi-threading helped a lot.

   Our tests correlate well with previous research that showed
Intel’s implementation of SMT (Hyper-Threading) to be
adept at hiding this latency [6, 20, 12]. Table 4 shows that by
having two threads access memory at the same time, performance
improved over 80% when compared to the single-threaded version.

It uses both quicksort phases and merge phases; for the merge phase,
using 2 CPUs (no hyperthreading) apparently gave more than a 2X speed
improvement, apparently because it could better overlap memory access
with CPU work.

 Or do you mean being able to perform parts of the query plan fully in
 parallel? If this, then one would need a lot more than ParallelSort...

I wouldn't recommend that - it seems like a Hard Problem.

My guess is that the best way to use multiple threads in one backend
would be to find specific algorithms like sorting that  would be
easier to isolate.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

2007-12-19 Thread Andrew Dunstan



Gokulakannan Somasundaram wrote:
 
 
  



 I would suggest forgetting that part and submitting the part that
 has some chance of getting accepted.


Actually i want to submit the patch, which is best according to me.
 



That's not an attitude that is likely to succeed - you need to take 
suggestions from Tom very seriously.


Also, please submit patches as context diffs, as set out in the 
Developer FAQ, which you should probably read carefully: 
http://www.postgresql.org/docs/faqs.FAQ_DEV.html


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

2007-12-19 Thread Joshua D. Drake
On Wed, 19 Dec 2007 13:46:15 -0500
Andrew Dunstan [EMAIL PROTECTED] wrote:

   I would suggest forgetting that part and submitting the part
   that has some chance of getting accepted.
 
 
  Actually i want to submit the patch, which is best according to me.
   

You do need to be able to be able to feel that your work is up to a
standard that you find redeemable. However...

 That's not an attitude that is likely to succeed - you need to take 
 suggestions from Tom very seriously.

Andrew is absolutely correct here. If you do not agree with Tom, you
best prove why. Otherwise your patch will likely be ignored on
submission.

Sincerely,

Joshua D. Drake

-- 
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ 
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Mark Mielke

Ron Mayer wrote:

The link in the beginning of the thread points to articles
that seem to describe one such algorithm; along with benchmarks.
(http://tinyurl.com/3bvu4u, http://tinyurl.com/32wg2m)
The improvements were pretty consistent from set sizes ranging
from very small sets (hundreds) to quite large ones (hundreds of K).

Interestingly, even multi-threading helped a lot.

   Our tests correlate well with previous research that showed
Intel’s implementation of SMT (Hyper-Threading) to be
adept at hiding this latency [6, 20, 12]. Table 4 shows that by
having two threads access memory at the same time, performance
improved over 80% when compared to the single-threaded version.

It uses both quicksort phases and merge phases; for the merge phase,
using 2 CPUs (no hyperthreading) apparently gave more than a 2X speed
improvement, apparently because it could better overlap memory access
with CPU work.
  
Good points. I had forgotten about DDR and DDR2 having high throughput 
at the cost of high latency. Somewhere in there, keeping the largest number 
of memory requests in the queue would allow the hardware to hide this 
high-latency effect.



Or do you mean being able to perform parts of the query plan fully in
parallel? If this, then one would need a lot more than ParallelSort...


I wouldn't recommend that - it seems like a Hard Problem.

My guess is that the best way to use multiple threads in one backend
would be to find specific algorithms like sorting that  would be
easier to isolate.
  

Also a good point. :-)

Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]



Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Mark Mielke

Jeff Davis wrote:

On Wed, 2007-12-19 at 12:08 -0500, Mark Mielke wrote:
  

Stupid question #2: Is it well recognized that the CPU is the
bottleneck in the PostgreSQL sorting mechanism? Or might it be memory
bandwidth and I/O?


I think it depends a lot on several factors. It's probably a different
bottleneck for integers versus localized text, and depends on the
available memory and I/O characteristics.
  

Makes sense.


It would seem to me that any sort worth parallelizing (administrative
and synchronization overhead), must have data larger than the L2
cache. If larger than the L2 cache, it becomes real memory speed. If
real memory speed, wouldn't one CPU without hardware synchronization,
be able to fill the memory read/write pipe? 


We do an external merge sort, which involves merging M runs together.
You seem to be implying that we can generate the output run at disk
speed, and therefore the CPU speed is unimportant.
  
Correct. Or, alternatively, you could achieve the same effect using 
asychronous I/O or read ahead.

I suspect that the comparison costs are enough that the above statement
isn't true in all cases, particularly in the case of localized text.

That sounds possible, but I still find myself suspecting that disk reads 
will be much slower than localized text comparison. Perhaps I am 
overestimating the performance of the comparison function?


 
Also, there is probably a lot of memory copying going on, and that

probably destroys a lot of the effectiveness of L2 caching. When L2
caching is ineffective, the CPU spends a lot of time just waiting on
memory. In that case, it's better to have P threads of execution all
waiting on memory operations in parallel.
  
I didn't consider the high throughput / high latency effect. This could 
be true if the CPU prefetch isn't effective enough.



This would explain why 1p2t would outperform a 1p1t in Ron's
reference above.

These are just my first thoughts, however. There is a lot of existing
research out there that we can look into, and also a lot of tests that
we can run before jumping into this.

I think parallel sorting can be looked into separately from the other
sorting improvements.
  
Yep - I started to read up on it. It still sounds like it's a hard-ish 
problem (to achieve near N times speedup for N CPU cores without 
degrading performance for existing loads), but that doesn't mean 
impossible. :-)


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]



Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Dann Corbit
As long as sorting improvements are being considered, may I suggest an
experiment that uses a very simple model?

Assuming that you have K subfiles created by the initial sorting pass,
insert the top record of each file into a priority queue.

Then, emit records from the queue until the priority queue is empty.

Now, there will be the objection that we will be jumping willy-nilly all
over the disk because of reading one record at a time, but (depending on
how it is implemented) generally several records are buffered during a
read.

So (as a gentle suggestion) I suggest testing the model.  It works great
for a single CPU or multiple CPU system for the work that *I* do.  I
have no idea if it will be a benefit for PostgreSQL or not, but it
should be a very simple matter to try it.  As long as someone is doing
the work right now, it would be a good time to give it a go.

I am not very familiar with PostgreSQL internals, but I would be willing
to give a hand with it (not really sure how much time I can guarantee,
though, since I would be doing it on my free time).


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Jeff Davis
On Wed, 2007-12-19 at 15:51 -0500, Mark Mielke wrote: 
 That sounds possible, but I still feel myself suspecting that disk
 reads will be much slower than localized text comparison. Perhaps I am
 overestimating the performance of the comparison function?

I think this simple test will change your perceptions:

Do an initdb with --locale=en_US.UTF-8 and start postgres.

test=> create table sorter(t text, b bytea, f float);
CREATE TABLE
test=> insert into sorter select r AS rt, r::text::bytea AS rb, r AS rf
from (select random() as r from generate_series(1,1000000)) a;
INSERT 0 1000000
test=> select pg_size_pretty(pg_total_relation_size('sorter'));
 pg_size_pretty
----------------
 70 MB
(1 row)

test=> explain analyze select * from sorter order by t; 
test=> explain analyze select * from sorter order by b;
test=> explain analyze select * from sorter order by f;

On my machine this table fits easily in memory (so there aren't any disk
reads at all). Sorting takes 7 seconds for floats, 9 seconds for binary
data, and 20 seconds for localized text. That's much longer than it
would take to read that data from disk, since it's only 70MB (which
takes a fraction of a second on my machine).

I think this disproves your hypothesis that sorting happens at disk
speed.

 Yep - I started to read up on it. It still sounds like it's a hard-ish
 problem (to achieve near N times speedup for N CPU cores without
 degrading performance for existing loads), but that doesn't mean
 impossible. :-)
 

You don't even need multiple cores to achieve a speedup, according to
Ron's reference.

Regards,
Jeff Davis



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Jeff Davis
On Wed, 2007-12-19 at 14:41 -0800, Dann Corbit wrote:
 As long as sorting improvements are being considered, may I suggest an
 experiment that uses a very simple model?
 
 Assuming that you have K subfiles created by the initial sorting pass,
 insert the top record of each file into a priority queue.
 
 Then, emit records from the queue until the priority queue is empty.
 

What is the principal difference between that idea and our existing sort
algorithm?

There's a good explanation in the comment at the top of tuplesort.c.

Regards,
Jeff Davis


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Dann Corbit
 -Original Message-
 From: Jeff Davis [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, December 19, 2007 3:10 PM
 To: Dann Corbit
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Sorting Improvements for 8.4
 
 On Wed, 2007-12-19 at 14:41 -0800, Dann Corbit wrote:
  As long as sorting improvements are being considered, may I suggest an
  experiment that uses a very simple model?
 
  Assuming that you have K subfiles created by the initial sorting pass,
  insert the top record of each file into a priority queue.
 
  Then, emit records from the queue until the priority queue is empty.
 
 
 What is the principal difference between that idea and our existing sort
 algorithm?
 
 There's a good explanation in the comment at the top of tuplesort.c.


According to the comments, PostgreSQL uses replacement selection.
Replacement selection is a wonderful thing because it creates runs that
are twice as long as normal due to the snowplow effect.  See (for
instance):
http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/69/27216/01209012.pdf

Then, the merge routine will have half as many runs to merge the files
together.

So (for instance) where a sort without replacement selection would create
1024 subfiles, replacement selection will create 512.  That saves one
merge pass.
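
For readers who haven't seen the technique, here is a toy illustration of
replacement selection (integer keys and a tiny fixed heap are assumed;
tuplesort.c's real implementation is far more elaborate):

/*
 * The heap is ordered by (run, key), so records that must wait for the
 * next run sink below everything still belonging to the current one.
 */
#include <stdio.h>

#define HEAPMAX 4

typedef struct { int run, key; } Entry;

static Entry heap[HEAPMAX];
static int heapsize = 0;

static int before(Entry a, Entry b)
{
	return a.run < b.run || (a.run == b.run && a.key <= b.key);
}

static void sift_down(int i)
{
	for (;;)
	{
		int l = 2 * i + 1, r = l + 1, s = i;

		if (l < heapsize && before(heap[l], heap[s])) s = l;
		if (r < heapsize && before(heap[r], heap[s])) s = r;
		if (s == i) break;
		Entry t = heap[s]; heap[s] = heap[i]; heap[i] = t;
		i = s;
	}
}

int main(void)
{
	int input[] = {5, 1, 9, 2, 8, 3, 7, 4, 6, 0, 11, 10};
	int n = 12, next = 0, currun = 0;

	/* fill "workMem" with the first records, all tagged run 0 */
	while (heapsize < HEAPMAX && next < n)
		heap[heapsize++] = (Entry) {0, input[next++]};
	for (int i = heapsize / 2 - 1; i >= 0; i--) sift_down(i);

	while (heapsize > 0)
	{
		Entry e = heap[0];

		if (e.run != currun) { currun = e.run; printf("\n-- new run --\n"); }
		printf("%d ", e.key);			/* emit to the current run */
		if (next < n)
		{
			int k = input[next++];
			/* a key below the one just written must wait for the next run */
			heap[0] = (Entry) {k >= e.key ? currun : currun + 1, k};
		}
		else
			heap[0] = heap[--heapsize];
		sift_down(0);
	}
	printf("\n");
	return 0;
}

With the sample input above, the first run collects eight keys from a
four-entry heap - the 2X snowplow effect.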

The algorithm that I am suggesting will take exactly one pass to merge
all of the files.

It works like this...

Imagine an array of pointers to the subfiles:
[*subfile][*subfile]...[*subfile]

Step 0:
We sort the array by a comparison operator that examines the top element
of each subfile.  So now the array is ordered such that the record with
the smallest key is in array slot 0.

Step 1:
We remove the first record from the subfile in array slot 0.  Now, the
priority of the first element *may* have changed.  So if it is no longer
smaller than the subfile immediately to the right, we do a binary
insertion to put this subfile in its new location, moving the contents
of array slot[1] to array slot 0 if it is needed.

Step 2:
Is the entire list of subfiles empty?  If yes, then terminate, if no
then go to Step 1.

Like I said, it is ultra-simple and it sorts the entire contents of all
subfiles to the output with a single pass.
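
A toy version of that loop, with small in-memory integer arrays standing in
for the sorted subfiles on disk (all names here are invented for the
illustration):

#include <stdio.h>
#include <string.h>

typedef struct { const int *vals; int pos, len; } Run;

static int head(const Run *r) { return r->vals[r->pos]; }

static int merge(Run *runs, int n, int *out)
{
	int k = 0;

	/* Step 0: order the array by each subfile's head record */
	for (int i = 1; i < n; i++)
	{
		Run r = runs[i];
		int j = i;
		while (j > 0 && head(&runs[j - 1]) > head(&r)) { runs[j] = runs[j - 1]; j--; }
		runs[j] = r;
	}

	while (n > 0)				/* Step 2: stop when no subfiles remain */
	{
		/* Step 1: take the first record from the subfile in slot 0 */
		out[k++] = runs[0].vals[runs[0].pos++];
		if (runs[0].pos == runs[0].len)
		{
			memmove(&runs[0], &runs[1], --n * sizeof(Run));	/* exhausted */
			continue;
		}
		/* binary insertion: slide slot 0 right while its head is larger */
		Run moved = runs[0];
		int lo = 1, hi = n;
		while (lo < hi)
		{
			int mid = (lo + hi) / 2;
			if (head(&runs[mid]) < head(&moved)) lo = mid + 1; else hi = mid;
		}
		memmove(&runs[0], &runs[1], (lo - 1) * sizeof(Run));
		runs[lo - 1] = moved;
	}
	return k;
}

int main(void)
{
	static const int a[] = {1, 4, 7}, b[] = {2, 5, 8}, c[] = {0, 3, 6, 9};
	Run runs[3] = {{a, 0, 3}, {b, 0, 3}, {c, 0, 4}};
	int out[10];
	int k = merge(runs, 3, out);

	for (int i = 0; i < k; i++) printf("%d ", out[i]);	/* 0 1 2 ... 9 */
	printf("\n");
	return 0;
}

In practice the vals/pos pair would be a buffered read from disk; the trick
is that only slot 0 ever needs to move.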

Consider the way that current replacement selection works.  The actual
O(f(N)) behavior of replacement selection is just terrible O(n^2).  But
because we save one full merge pass, it is usually worth it anyway,
since memory access is much faster than disk.  And if we only have a few
subfiles, the savings will be large.  In the case of a priority queue
merge, we only have one single merge pass no matter how many subfiles
there are.



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Dann Corbit
P.S.
A beautiful paper on replacement selection is found here:
http://students.fim.uni-passau.de/~fickensc/Proseminar/Proseminar.pdf


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pgwin32_open returning EINVAL

2007-12-19 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Updated version attached. Comments on the wording of the messages are also
 welcome ;-)

Well, since you asked ;-) ... I don't particularly like this.

1. Doesn't distinguish SHARING_VIOLATION from LOCK_VIOLATION.
We might want to know that.

2. Do we really want this to be WARNING?  LOG seems a better idea,
since it's not warning about anything the client app did wrong.

3. Doesn't follow the message style guide: shoehorning two sentences
into errmsg() is surely not compliant.  Perhaps

errmsg: could not open file \"%s\": sharing violation
OR errmsg: could not open file \"%s\": lock violation

errdetail: Continuing to retry for up to 30 seconds.

errhint: OK as you have it.
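
Assembled into an ereport() call, it would look roughly like this (a
sketch only, not a drop-in patch: filePath stands in for whatever
variable the patch uses, and the hint text is elided):

ereport(LOG,
        (errmsg("could not open file \"%s\": sharing violation",
                filePath),
         errdetail("Continuing to retry for up to 30 seconds."),
         errhint("...")));      /* hint text as in Magnus's patch */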

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Jeff Davis
On Wed, 2007-12-19 at 15:19 -0800, Dann Corbit wrote:
 The algorithm that I am suggesting will take exactly one pass to merge
 all of the files.
 

From tuplesort.c:

In the current code we determine the number of tapes M on the basis of
workMem: we want workMem/M to be large enough that we read a fair amount
of data each time we preread from a tape, so as to maintain the locality
of access described above.  Nonetheless, with large workMem we can have
many tapes.

It seems like you are just choosing M to be equal to the number of
initial runs, whereas the current code takes into account the cost of
having workMem/M too small.

We do want to increase the number of runs that can be merged at once;
that's what dynamic run handling and forecasting are all about. But we
want to avoid unnecessary seeking, also.
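
As a back-of-the-envelope sketch of that trade-off (my constants, not
the ones tuplesort.c actually uses):

/* Cap the merge order M so each tape keeps a usefully large preread
 * buffer, rather than blindly setting M = number of initial runs.
 */
#include <stdio.h>

#define MIN_BUFFER (256 * 1024)   /* assumed minimum preread per tape */

static int merge_order(long work_mem, int num_runs)
{
    int m = (int) (work_mem / MIN_BUFFER);  /* tapes memory can feed */

    if (m < 2)
        m = 2;              /* a merge needs at least two inputs */
    if (m > num_runs)
        m = num_runs;       /* no point in more tapes than runs */
    return m;
}

int main(void)
{
    /* 64MB of work_mem, 1024 initial runs: fan-in capped at 256 */
    printf("M = %d\n", merge_order(64L * 1024 * 1024, 1024));
    return 0;
}

With those numbers you pay two passes (a 256-way pass leaving 4 runs,
then a 4-way pass) instead of one, but every preread stays sequential
instead of degenerating into 1024 tiny, seek-bound reads.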

Regards,
Jeff Davis


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Tom Lane
Mark Mielke [EMAIL PROTECTED] writes:
 Jeff Davis wrote:
 Also, there is probably a lot of memory copying going on, and that
 probably destroys a lot of the effectiveness of L2 caching. When L2
 caching is ineffective, the CPU spends a lot of time just waiting on
 memory. In that case, it's better to have P threads of execution all
 waiting on memory operations in parallel.
 
 I didn't consider the high throughput / high latency effect. This could 
 be true if the CPU prefetch isn't effective enough.

Note that if this is the argument, then there's a ceiling on the speedup
you can expect to get: it's just the extent of mismatch between the CPU
and memory speeds.  I can believe that suitable test cases would show
2X improvement for 2 threads, but it doesn't follow that you will get
10X improvement with 10 threads, or even 4X with 4.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Gregory Stark

Jeff Davis [EMAIL PROTECTED] writes:

 test= explain analyze select * from sorter order by t; 
 test= explain analyze select * from sorter order by b;
 test= explain analyze select * from sorter order by f;

 On my machine this table fits easily in memory (so there aren't any disk
 reads at all). Sorting takes 7 seconds for floats, 9 seconds for binary
 data, and 20 seconds for localized text. That's much longer than it
 would take to read that data from disk, since it's only 70MB (which
 takes a fraction of a second on my machine).

 I think this disproves your hypothesis that sorting happens at disk
 speed.

I suspect most of that is spent just copying the data around. Which would not
be helped by having multiple threads doing the copying -- and in fact might be
exacerbated if it required an extra copy to consolidate all the data in the
end.

How long does a simple explain analyze select * from sorter take?

And assuming you're doing disk sorts (in disk cache) you're doing quite a lot
of copying to temporary files (in disk cache) and then back to memory.


Note that speeding up a query from 20s to 5s isn't terribly useful. If it's
OLTP you can't be using all your cores for each user anyways. And if it's DSS
20s isn't a problem.

Where parallel processing like this becomes attractive is when you're running
a 2 hour query on a machine sequentially running scheduled batch jobs which
can be sped up to 30 minutes. But in that case you're almost certainly being
limited by your disk bandwidth, not your cpu speed.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] New style of hash join proposal

2007-12-19 Thread Decibel!

On Dec 13, 2007, at 7:13 AM, Gregory Stark wrote:
We currently execute a lot of joins as Nested Loops which would be more
efficient if we could batch together all the outer keys and execute a
single inner bitmap index scan for all of them together.

Essentially what I'm saying is that we're missing a trick with Hash
Joins, which currently require that we can execute the inner side once
without any parameters from the outer side.

Instead what we could do is build up the hash table, then scan the hash
table building up an array of keys and pass them as a parameter to the
inner side. The inner side could do a bitmap index scan to fetch them
all at once and start returning them just as normal to the hash join.

There are a couple of details:

1) Batched hash joins. Actually I think this would be fairly
   straightforward. You want to rescan the inner side once for each
   batch. That would actually be easier than what we currently do with
   saving tuples to files and all that.

2) How to pass the keys. This could be a bit tricky, especially for
   multi-column keys. My first thought was to build up an actual Array
   node, but that only really works for single-column keys I think.
   Besides, it would be more efficient to somehow arrange to pass over
   a reference to the whole hash.

I fear the real complexity would be (as always) in the planner rather
than the executor. I haven't really looked into what it would take to
arrange this or how to decide when to do it.
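
As a toy model of the batching idea (my illustration: plain int arrays
stand in for the hash table's keys and for the inner index, and one
ordered pass plays the role of the single bitmap index scan):

/* Instead of probing the inner side once per outer key, collect the
 * outer keys, sort and de-duplicate them, and make one ordered pass
 * over the inner side for all of them at once.
 */
#include <stdio.h>
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;

    return (x > y) - (x < y);
}

int main(void)
{
    int outer[] = {42, 7, 42, 19, 7, 3};      /* keys from the hash table */
    int inner[] = {3, 5, 7, 11, 19, 23, 42};  /* inner side, index order */
    int nouter = 6, ninner = 7, nkeys = 0;
    int keys[6];

    /* batch up the outer keys, sorted and de-duplicated */
    qsort(outer, nouter, sizeof(int), cmp_int);
    for (int i = 0; i < nouter; i++)
        if (nkeys == 0 || keys[nkeys - 1] != outer[i])
            keys[nkeys++] = outer[i];

    /* one merge-style pass over the inner side for all keys at once */
    for (int i = 0, k = 0; i < ninner && k < nkeys; i++)
    {
        while (k < nkeys && keys[k] < inner[i])
            k++;
        if (k < nkeys && keys[k] == inner[i])
            printf("match: %d\n", inner[i]);
    }
    return 0;
}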


TODO?
--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828






Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Gregory Stark
 Sent: Wednesday, December 19, 2007 5:26 PM
 To: Jeff Davis
 Cc: Mark Mielke; Michał Zaborowski; Simon Riggs; Ron Mayer; pgsql-
 [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Sorting Improvements for 8.4
 
 
 Jeff Davis [EMAIL PROTECTED] writes:
 
  test= explain analyze select * from sorter order by t;
  test= explain analyze select * from sorter order by b;
  test= explain analyze select * from sorter order by f;
 
  On my machine this table fits easily in memory (so there aren't any disk
  reads at all). Sorting takes 7 seconds for floats, 9 seconds for binary
  data, and 20 seconds for localized text. That's much longer than it
  would take to read that data from disk, since it's only 70MB (which
  takes a fraction of a second on my machine).
 
  I think this disproves your hypothesis that sorting happens at disk
  speed.
 
 I suspect most of that is spent just copying the data around. Which would
 not be helped by having multiple threads doing the copying -- and in fact
 might be exacerbated if it required an extra copy to consolidate all the
 data in the end.

Benchmarking a single system will really only explain that system.
Someone may have a disk farm with 2GB/Sec throughput:
http://www.sql-server-performance.com/articles/per/system_storage_configuration_p1.aspx

But such a configuration is very unlikely.

Someone may have 10Gb/s NICs, but those, too, are rare.

So for any benchmark, we will really just end up with a number for that system.

Typically, disk is the bottleneck.
I found this on the net somewhere, but it's quite a useful table for capacity 
planning (to find the weak link in the chain using back of the envelope 
calculations):

Interface                                  Width           Frequency    Bytes/Sec  Bits/Sec
4-way interleaved PC1600 (DDR200) SDRAM    4 x 64bits      100 MHz DDR  6.4 GB/s   51 Gbps
Opteron HyperTransport memory bus          128bits         200 MHz DDR  6.4 GB/s   51 Gbps
Pentium 4 800 MHz FSB                      64bits          200 MHz QDR  6.4 GB/s   51 Gbps
PC2 6400 (DDR-II 800) SDRAM                64bits          400 MHz DDR  6.4 GB/s   51 Gbps
PC2 5300 (DDR-II 667) SDRAM                64bits          333 MHz DDR  5.3 GB/s   43 Gbps
Pentium 4 533 MHz FSB                      64bits          133 MHz QDR  4.3 GB/s   34 Gbps
PC2 4300 (DDR-II 533) SDRAM                64bits          266 MHz DDR  4.3 GB/s   34 Gbps
2-channel PC1066 RDRAM                     2 x 16bits      533 MHz DDR  4.3 GB/s   34 Gbps
PCI-X 533                                  64bits          533 MHz      4.3 GB/s   34 Gbps
PCI-Express x16                            serial/16lanes  2.5 GHz      4 GB/s     32 Gbps
Pentium 4 400 MHz FSB                      64bits          100 MHz QDR  3.2 GB/s   25.6 Gbps
2-channel PC800 RDRAM                      2 x 16bits      400 MHz DDR  3.2 GB/s   25.6 Gbps
2-way interleaved PC1600 (DDR200) SDRAM    2 x 64bits      100 MHz DDR  3.2 GB/s   25.6 Gbps
PC2 3200 (DDR-II 400) SDRAM                64bits          200 MHz DDR  3.2 GB/s   25.6 Gbps
PC3200 (DDR400) SDRAM                      64bits          200 MHz DDR  3.2 GB/s   25.6 Gbps
PC2700 (DDR333) SDRAM                      64bits          167 MHz DDR  2.7 GB/s   21 Gbps
PC2100 (DDR266) SDRAM                      64bits          133 MHz DDR  2.1 GB/s   17 Gbps
AGP 8x                                     32bits          533 MHz      2.1 GB/s   17 Gbps
PCI-X 266                                  64bits          266 MHz      2.1 GB/s   17 Gbps
PCI-Express x8                             serial/8lanes   2.5 GHz      2 GB/s     16 Gbps
EV6 bus (Athlon/Duron FSB)                 64bits          100 MHz DDR  1.6 GB/s   13 Gbps
PC1600 (DDR200) SDRAM                      64bits          100 MHz DDR  1.6 GB/s   13 Gbps
PC800 RDRAM                                16bits          400 MHz DDR  1.6 GB/s   13 Gbps
PC150 SDRAM                                64bits          150 MHz      1.3 GB/s   10.2 Gbps
10 gigabit ethernet                        serial          10 GHz       1.25 GB/s  10 Gbps
OC-192                                     serial          9.953 GHz    1.24 GB/s  9.953 Gbps
133 MHz FSB                                64bits          133 MHz      1.06 GB/s  8.5 Gbps
PC133 SDRAM                                64bits          133 MHz      1.06 GB/s  8.5 Gbps
AGP 4x                                     32bits          266 MHz      1.06 GB/s  8.5 Gbps
PCI-X                                      64bits          133 MHz      1.06 GB/s  8.5 Gbps
PCI-Express x4                             serial/4lanes   2.5 GHz      1 GB/s     8 Gbps
100 MHz FSB                                64bits          100 MHz      800 MB/s   6.4 Gbps
PC100 SDRAM                                64bits          100 MHz      800 MB/s   6.4 Gbps
PC66 SDRAM                                 64bits          66 MHz       533 MB/s   4.3 Gbps
fast/wide PCI                              64bits          66 MHz       533 MB/s   4.3 Gbps
AGP 2x                                     32bits          133 MHz      533 MB/s   4.3 Gbps
single-link DVI                            12bits          165 MHz DDR  495 MB/s   3.96 Gbps
Ultra-320 SCSI                             16bits          160 MHz      320 MB/s   2.6 Gbps
OC-48 network                              serial          2.488 GHz    311 MB/s   2.488 Gbps
AGP                                        32bits          66 MHz       266 MB/s   2.1 Gbps
PCI-Express x1                             serial          2.5 GHz      250 MB/s   2 Gbps
Serial ATA/1500 disk                       serial          1.5 GHz      187 MB/s   1.5 Gbps
Ultra-160 SCSI                             16bits          80 MHz       160 MB/s   1.3 Gbps
OC-24 network                              serial          1.244 GHz    155 MB/s   1.244 Gbps
PCI                                        32bits          33 MHz       133 MB/s   1.06 Gbps
ATA/133 disk                               8bits           66 MHz DDR   133 MB/s   1.06 Gbps
gigabit ethernet                           serial          1 GHz        125 MB/s   1 Gbps
ATA/100 disk                               8bits           50 MHz DDR   100 MB/s

Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Greg Smith

On Wed, 19 Dec 2007, Dann Corbit wrote:


Benchmarking a single system will really only explain that system.
Someone may have a disk farm with 2GB/Sec throughput
But such a configuration is very unlikely.


If you believe comments like those at 
http://www.c0t0d0s0.org/archives/1792-Do-it-yourself-X4500.html it's 
possible to hit 2GB/s total to the 48 disks in one of the Sun X4500 
servers, which start at $24K.  May be unlikely to you, but I was reading 
there after I set one up last night, and that's a boringly standard
configuration for some Sun and Greenplum customers.


Also, that's today--by the time 8.4 is mainstream, high-end machines will
be even faster.  Wanna make a bet on how much disk throughput will be
available as SSDs go mainstream in the next two years?


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Gregory Stark
Dann Corbit [EMAIL PROTECTED] writes:

 Note that speeding up a query from 20s to 5s isn't terribly useful. If it's
 OLTP you can't be using all your cores for each user anyways. And if it's
 DSS 20s isn't a problem.

 Unless (of course) there are 20,000 users doing the queries that would take 20
 seconds but now they take 5 (when run single-user). They will still have a bit
 of a wait, of course.

I'm not exactly following. If you have 20,000 users then you're probably using
all the processors already. If you process them one by one on 4 cores in 5s
then you'll get the same throughput as if you ran them four at a time on 1
core each in 20s.

 Where parallel processing like this becomes attractive is when you're
 running a 2 hour query on a machine sequentially running scheduled batch
 jobs which can be sped up to 30 minutes. But in that case you're almost
 certainly being limited by your disk bandwidth, not your cpu speed.

 A linear speedup of 2 or more is always worthwhile[*]. Since sorting (e.g.
 for 'group by' and 'order by') and sort joins are a major database task, I
 guess that a linear speedup by a factor of 2 might make database operations
 on the whole 10% faster or so {OK, it's a SWAG}. I guess it would look good
 on the benchmarks, if nothing else.

Except note that you're not getting this linear speedup for free. To get a
linear speedup of 2x you'll be using more than 2x the cpu resources. If there
is nothing else contending for that resource (such as the scenario I described
where you're running a single large batch query on a system and want to use
all available resources to run it as fast as possible), then you'll get a 2x
speedup. 

But if there is more than one query running on the system then you're not
actually gaining anything. Each query will run faster but you won't be able to
run as many simultaneously without having them slow back down. And the
overhead of parallelizing the query will be a net loss.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Gregory Stark
Greg Smith [EMAIL PROTECTED] writes:

 On Wed, 19 Dec 2007, Dann Corbit wrote:

 Benchmarking a single system will really only explain that system.
 Someone may have a disk farm with 2GB/Sec throughput
 But such a configuration is very unlikely.

 If you believe comments like those at
 http://www.c0t0d0s0.org/archives/1792-Do-it-yourself-X4500.html it's possible
 to hit 2GB/s total to the 48 disks in one of the Sun X4500 servers, which
 start at $24K.  May be unlikely to you, but I was reading there after I set
 one up last night, and that's a boringly standard configuration for some Sun
 and Greenplum customers.

Surely such machines have kickass memory backplanes too though? How could it
ever be reasonable to have an i/o controller with more bandwidth than your
memory?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Ron Mayer
Tom Lane wrote:
 ...I can believe that suitable test cases would show
 2X improvement for 2 threads,

One other thing I found interesting is that their test case showed a
near-2X improvement with hyperthreading; I haven't heard of many other
ways to get hyperthreading to show improvements for PostgreSQL.


 but it doesn't follow that you will get
 10X improvement with 10 threads, or even 4X with 4.

Yeah - unless those 10 cores have additional I/O bandwidth to memory
compared to a 1-core system (which I'd hope would be the case, or else
I'd expect many apps to run into memory bottlenecks on such systems,
no?).




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] PGCon 2008 - call for papers

2007-12-19 Thread Dan Langille

Hello folks,

PGCon 2008 will be held 22-23 May 2008, in Ottawa at the University of
Ottawa.  It will be preceded by two days of tutorials on 20-21 May 2008.

We are now requesting proposals for presentations.

If you are doing something interesting with PostgreSQL, please submit
a proposal.  You might be one of the backend hackers or work on a
PostgreSQL-related project and want to share your know-how with
others. You might be developing an interesting system using
PostgreSQL as the foundation. Perhaps you migrated from another
database to PostgreSQL and would like to share details.  These and
other stories are welcome. Both users and developers are encouraged
to share their experiences.

Here are a few ideas to jump start your proposal process:

 - novel, unique, or complex ways in which PostgreSQL is used
 - migration of production systems to PostgreSQL
 - data warehousing with PostgreSQL
 - tuning PostgreSQL for different work loads
 - replicating data on top of PostgreSQL

The schedule is:

 19 Dec 2007 Proposal acceptance begins
 19 Jan 2008 Proposal acceptance ends
 19 Feb 2008 Confirmation of accepted proposals
 19 Apr 2008 Final papers/slides must arrive no later than this date

See also http://www.pgcon.org/2008/papers.php

Instructions for submitting a proposal to PGCon 2008 are available
from: http://www.pgcon.org/2008/submissions.php

--
Dan Langille - http://www.langille.org/
BSDCan - The Technical BSD Conference: http://www.bsdcan.org/
PGCon  - The PostgreSQL Conference: http://www.pgcon.org/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Sorting Improvements for 8.4

2007-12-19 Thread Greg Smith

On Thu, 20 Dec 2007, Gregory Stark wrote:


Surely such machines have kickass memory backplanes too though? How could it
ever be reasonable to have an i/o controller with more bandwidth than your
memory?


Dann had the right general numbers here--max of 6.4GB/s between processors 
and you might coax an aggregate of double that out of the DDR RAM with 2 
4-way interleaved banks of memory.  Let's call it 12GB/s theoretical max. 
If the theoretical max of the disks is 2GB/s, that's only a 6:1 headroom, 
and with a decent cache rate it's not outrageous to imagine you could 
bottleneck on memory with some things before you run out of disk 
throughput.


Right now I think a lot of the disk bottlenecks are seek-limited more than 
anything, but SSD will knock that one out for apps that are more about 
throughput than maximum storage.  I could already usefully switch to SSD
today for some of what I do that's in that category; it's just a bit too
expensive to do yet.  Soon, though.


Just trying to estimate where the edge of that back-of-the-envelope
calculation should be.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Benchmark for GiST index?

2007-12-19 Thread Koichi Suzuki
I'm trying to run some GiST activity and look at the XLOG record structure.

The README says it requires PG 8.2 or later.  Does it run with PG 8.3beta4?
It seems to depend on tsearch2 in contrib.

2007/12/20, Teodor Sigaev [EMAIL PROTECTED]:
 What kind of benchmark? I have a benchmark for text search
 http://www.sigaev.ru/cvsweb/cvsweb.cgi/ftsbench/
 and several others, based on it.


 Koichi Suzuki wrote:
  Hi,
 
  Does anybody have any information about GiST index benchmarks?  I'm
  planning to run GiST benchmark and analyze WAL structure to see PG's
  dynamic behavior.
 
  Regards;
 
  --
  Koichi Suzuki
 
  ---(end of broadcast)---
  TIP 3: Have you checked our extensive FAQ?
 
 http://www.postgresql.org/docs/faq

 --
 Teodor Sigaev   E-mail: [EMAIL PROTECTED]
 WWW: http://www.sigaev.ru/



-- 
--
Koichi Suzuki

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org