Re: [HACKERS] [SQL] info is a reserved word?

2006-01-12 Thread Tom Lane
[ moved to -hackers ]

Michael Fuhr <[EMAIL PROTECTED]> writes:
> On Thu, Jan 12, 2006 at 11:21:28PM -0500, Tom Lane wrote:
>> plpgsql is not very good about reserving words "minimally", ie, not
>> treating a word as a keyword outside the context where the keyword
>> is meaningful.
>> 
>> This could probably be fixed, or at least greatly reduced, with some
>> flex/bison hacking.  Anyone up for it?

> Possibly.  Would it involve much more than what the main parser's
> grammar does with unreserved_keyword and friends?  I suppose this
> ought to move to pgsql-hackers.

The keyword-classification tactic would be one approach.  For the
specific case of the RAISE severity codes, I'd be inclined to eliminate
all those "keywords" entirely and let them be lexed/parsed as simple
identifiers --- there's no strong reason not to do strcmp's for specific
identifiers at the point where we're building a raise_level value.
There are probably some other methods that might apply in other places.
But in any case it's a fairly self-contained problem; you don't need
any vast knowledge of Postgres internals to tackle it, just some
understanding of the flex and bison tools.
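
To sketch what that might look like (purely illustrative -- the names below
are hypothetical, not the actual plpgsql source), the severity word would be
lexed as an ordinary identifier and resolved by string comparison afterwards,
so INFO, NOTICE, etc. never need to be scanner keywords:

/* Hypothetical sketch only -- not the real plpgsql code. */
#include <stdbool.h>
#include <strings.h>

typedef enum
{
    RAISE_DEBUG, RAISE_LOG, RAISE_INFO,
    RAISE_NOTICE, RAISE_WARNING, RAISE_EXCEPTION
} raise_level;

static bool
identifier_to_raise_level(const char *ident, raise_level *level)
{
    /* the backend would use pg_strcasecmp; strcasecmp stands in for it */
    if (strcasecmp(ident, "debug") == 0)          *level = RAISE_DEBUG;
    else if (strcasecmp(ident, "log") == 0)       *level = RAISE_LOG;
    else if (strcasecmp(ident, "info") == 0)      *level = RAISE_INFO;
    else if (strcasecmp(ident, "notice") == 0)    *level = RAISE_NOTICE;
    else if (strcasecmp(ident, "warning") == 0)   *level = RAISE_WARNING;
    else if (strcasecmp(ident, "exception") == 0) *level = RAISE_EXCEPTION;
    else
        return false;          /* not a recognized severity */
    return true;
}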

regards, tom lane



Re: [HACKERS] Overflow of bgwriter's request queue

2006-01-12 Thread ITAGAKI Takahiro
I'm sorry if you have received multiple copies of this mail. I sent it earlier,
but it did not seem to be delivered, so I am sending it again.


Tom Lane <[EMAIL PROTECTED]> wrote:

> > I encountered overflow of bgwriter's file-fsync request queue.
> I can't help thinking that this is a situation that could only be got
> into with a seriously misconfigured database --- per the comments for
> ForwardFsyncRequest, we really don't want this code to run at all,
> let alone run so often that a queue with NBuffers entries overflows.
> What exactly are the test conditions under which you're seeing this
> happen?

It happened in two environments:
  [1] TPC-C(DBT-2) / RHEL4 U1 (2.6.9-11)
  XFS, 8 S-ATA disks / 8GB memory(shmem=512MB)
  [2] TPC-C(DBT-2) / RHEL4 U2 (2.6.9-22)
  XFS, 6 SCSI disks / 6GB memory(shmem=1GB)

I don't think the configuration is that bad. There seems to be a problem with
the combination of XFS and heavy update workloads, but the total throughput
on XFS with my patch was better than on ext3.

I suspect that a queue length of NBuffers is not enough. If all buffers
are dirty, ForwardFsyncRequest can be called more than NBuffers times
during BufferSync, so the queue can become full.


> If there actually is a problem that needs to be solved, I think it'd be
> better to try to do AbsorbFsyncRequests somewhere in the main checkpoint
> loops.  I don't like the idea of holding the BgWriterCommLock long
> enough to do a qsort ... especially not if this occurs only with very
> large NBuffers settings.

OK, I agree. I have sent a patch to -patches that calls AbsorbFsyncRequests
in the loops of BufferSync and mdsync.
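
To make the shape of that concrete, here is a toy model (illustrative only,
not the actual patch or backend code; names and sizes are made up): a
fixed-size request queue that the checkpoint loop drains periodically cannot
fill up even when every buffer is dirty.

/* Toy model, not PostgreSQL code. */
#include <stdio.h>

#define NBUFFERS   8192            /* stands in for shared_buffers */
#define QUEUE_SIZE NBUFFERS        /* request queue sized like the real one */

static int queue_len;
static int overflows;

static void forward_fsync_request(void)
{
    if (queue_len == QUEUE_SIZE)
    {
        overflows++;               /* the backend would fsync() directly here */
        return;
    }
    queue_len++;
}

static void absorb_fsync_requests(void)
{
    queue_len = 0;                 /* bgwriter notes the files and empties the queue */
}

int main(void)
{
    /* Worst case: every buffer is dirty and produces two requests. */
    for (int buf = 0; buf < NBUFFERS; buf++)
    {
        forward_fsync_request();
        forward_fsync_request();
        if (buf % 1024 == 0)       /* drain the queue every so often */
            absorb_fsync_requests();
    }
    printf("overflows: %d\n", overflows);   /* 0 with periodic draining */
    return 0;
}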


> Also, what if the qsort fails to eliminate any
> duplicates, or eliminates only a few?  You could get into a scenario
> where the qsort gets repeated every few ForwardFsyncRequest calls, in
> which case it'd become a drag on performance itself.

Now I think the above solution is better than qsort, but qsort would also
work reasonably well. NBuffers is at least one thousand, while the number of
files that need fsync is at most in the hundreds, so duplicate elimination
would work well. In fact, on my machine the queue became full twice during a
checkpoint, and duplicate elimination reduced the queue length from 65536 to *32*.
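
For reference, the duplicate elimination amounts to something like the
following (illustrative only; plain ints stand in for the real
relation/segment entries in the queue):

#include <stdio.h>
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Sort the pending requests and squeeze out duplicates in place;
 * returns the new queue length. */
static int compact_requests(int *req, int n)
{
    if (n < 2)
        return n;
    qsort(req, n, sizeof(int), cmp_int);
    int out = 1;
    for (int i = 1; i < n; i++)
        if (req[i] != req[out - 1])
            req[out++] = req[i];
    return out;
}

int main(void)
{
    /* 65536 queued requests that touch only 32 distinct files,
     * as in the report above. */
    enum { N = 65536 };
    static int req[N];
    for (int i = 0; i < N; i++)
        req[i] = i % 32;
    printf("queue length: %d -> %d\n", N, compact_requests(req, N));
    return 0;
}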

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories





Re: [HACKERS] Contrib Schemas

2006-01-12 Thread John DeSoi


On Jan 12, 2006, at 9:03 PM, Christopher Kings-Lynne wrote:

> Default schema really has to be public to help the newbies out
> there. All contribs should come with some sort of standard
> uninstall.sql script though.


Speaking of contrib newbies and install/uninstall, I recently had the
need to install tsearch2 in a shared hosting environment (where, of
course, I don't have superuser access). Since it contains C language
functions, I could not install it. The hosting provider's solution was to
install it in template1, which seems bad because (1) now everyone gets
it whether they want it or not, and (2) restoring the database is going
to have permission problems.


Would it be reasonable for there to be a way for the super user to  
grant access to load "approved" modules and/or C language functions?

John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL




Re: [HACKERS] Contrib Schemas

2006-01-12 Thread Christopher Kings-Lynne
Default schema really has to be public to help the newbies out there. 
All contribs should come with some sort of standard uninstall.sql script 
though.


Chris

Mark Kirkwood wrote:

David Fetter wrote:


Folks,

I'm picturing something like this:

make install  # Option 0  Leave as-is
make install --auto_schema  [--generate_path_mod] # Option 1
make install --schema=contrib # Option 2
make install --schema=foo # Option 3

What do you think?



I like the idea of being able to specify a schema as an option, but 
would prefer the default schema to be 'public'.


i.e.


make install [--schema=foo] # Option 4

Cheers

Mark






[HACKERS] Contrib Schemas

2006-01-12 Thread David Fetter
Folks,

For the same reason they feel that *n*x installs shouldn't be messing
around in /usr/local, some people also feel that contrib modules
shouldn't install in the public schema by default.  Before I go
a-patching, I'd like to see what people think about how to handle
this.  A few possibilities come to mind:

1.  One schema per contrib module.  In some sense, this is a clean way
to do things because the modules can't step on each other's
namespaces, and it makes it easy to remove a module cleanly.

A downside of this is the increased maintenance overhead of adding
those schemas to search_paths as appropriate, although this might be
worked around by having the install script generate some search_path
modifications that the person installing could choose to run.

2.  One big contrib schema.  This avoids the above search_path
problem, but leaves the namespace collision of putting things in
public.

3.  Name a schema on installation.  This could help people decide
exactly where they want things to go.

I'm picturing something like this:

make install  # Option 0  Leave as-is
make install --auto_schema  [--generate_path_mod] # Option 1
make install --schema=contrib # Option 2
make install --schema=foo # Option 3

What do you think?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!



Re: [HACKERS] Contrib Schemas

2006-01-12 Thread Josh Berkus
Tom,

> People who want the contrib stuff segregated can easily make it
> happen by modifying the contrib .sql files before they run them.
> The .sql files already have a skeleton for this, eg
>   -- Adjust this setting to control where the objects get created.
>   SET search_path = public;
> I don't really see a need to go further than that.

Tangentially, I filed a bug with Tsearch2 because that SET statement is
outside the transaction in the .sql file, which means that stuff will end
up in the public schema if the admin typos the schema name.  Not sure if
other contrib modules have the same issue.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] Contrib Schemas

2006-01-12 Thread Tom Lane
David Fetter <[EMAIL PROTECTED]> writes:
> What do you think?

I think it's not broken and doesn't need fixing.  I have not seen any
groundswell of demand for moving the contrib stuff out of the public
schema.  On the other hand, doing so *will* break many if not all
existing applications that use contrib modules.

People who want the contrib stuff segregated can easily make it
happen by modifying the contrib .sql files before they run them.
The .sql files already have a skeleton for this, eg
-- Adjust this setting to control where the objects get created.
SET search_path = public;
I don't really see a need to go further than that.

regards, tom lane



Re: [HACKERS] Contrib Schemas

2006-01-12 Thread Andrew Dunstan
David Fetter said:

> I'm picturing something like this:
>
> make install  # Option 0  Leave
> as-is make install --auto_schema  [--generate_path_mod] # Option 1
> make install --schema=contrib # Option 2
> make install --schema=foo # Option 3
>


Or maybe set it up at configure time, cf. installation directories.

cheers

andrew





Re: [HACKERS] Please Help: PostgreSQL Query Optimizer

2006-01-12 Thread Anjan Kumar. A.



Through googling, I found that a normal disk has an external data transfer
rate of around 40 MB/s (does this include seek and rotational latency?),
whereas main memory has a data transfer rate ranging from 1.6 GB/s to 2.8 GB/s.

As we can see, the ratio between disk and main memory data transfer rates is
around 50. Then, if we multiply all cpu_* parameters by 50, the resulting
values will be:


   random_page_cost = 1;
   cpu_tuple_cost = 0.5;
   cpu_index_tuple_cost = 0.05;
   cpu_operator_cost = 0.0125;


Would this be a suitable approach?  We would appreciate comments and
suggestions on these calculations.  Thank you.

On Sun, 11 Dec 2005, Tom Lane wrote:


[ trimming cc list to something sane ]

"Anjan Kumar. A." <[EMAIL PROTECTED]> writes:
In a Main Memory Database (MMDB), the entire database on disk is loaded
into main memory during initial startup of the system.  Thereafter, all
references are made to the database in main memory.  When the system is
going to shut down, we write the database in main memory back to disk.
For the sake of recovery, we write log records to disk during transaction
execution.


Don't you get 99.9% of this for free with Postgres' normal behavior?
Just increase shared_buffers.

  Can anyone tell me what modifications need to be made to PostgreSQL so
that it considers only processing costs during query optimization?


Assuming that a page fetch costs zero is wrong even in an all-in-memory
environment.  So I don't see any reason you can't maintain the
convention that a page fetch costs 1.0 unit, and just adjust the other
cost parameters in the light of a different idea about what that
actually means.

Will it be sufficient if we change the default values of the above parameters
in src/include/optimizer/cost.h and
src/backend/utils/misc/postgresql.conf.sample as follows:



 random_page_cost = 4;
 cpu_tuple_cost = 2;
 cpu_index_tuple_cost = 0.2;
 cpu_operator_cost = 0.05;


You'd want random_page_cost = 1 since there is presumably no penalty for
random access in this context.  Also, I think you'd want
cpu_operator_cost a lot higher than that (maybe you dropped a decimal
place?  You scaled the others up by 200 but this one only by 20).
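
For reference, taking the then-current defaults (cpu_tuple_cost = 0.01,
cpu_index_tuple_cost = 0.001, cpu_operator_cost = 0.0025; quoted here only
for illustration), the scaling works out as:

    cpu_tuple_cost:        0.01   * 200 = 2    (matches the proposed 2)
    cpu_index_tuple_cost:  0.001  * 200 = 0.2  (matches the proposed 0.2)
    cpu_operator_cost:     0.0025 * 200 = 0.5  (the proposed 0.05 is only 20x)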

It's entirely possible that the ratios of the cpu_xxx_cost values
aren't very good and will need work.  In the past we've never had
occasion to study them very carefully, since they were only marginal
contributions anyway.

regards, tom lane

--
Regards.

Anjan Kumar A.
MTech2,  Comp Sci.,
www.cse.iitb.ac.in/~anjankumar
__
Bradley's Bromide:
If computers get too powerful, we can organize
them into a committee -- that will do them in.



Re: [HACKERS] Contrib Schemas

2006-01-12 Thread Mark Kirkwood

David Fetter wrote:

Folks,

I'm picturing something like this:

make install  # Option 0  Leave as-is
make install --auto_schema  [--generate_path_mod] # Option 1
make install --schema=contrib # Option 2
make install --schema=foo # Option 3

What do you think?



I like the idea of being able to specify a schema as an option, but 
would prefer the default schema to be 'public'.


i.e.


make install [--schema=foo] # Option 4

Cheers

Mark



Re: [HACKERS] Libpq COPY optimization

2006-01-12 Thread Tom Lane
"Alon Goldshuv" <[EMAIL PROTECTED]> writes:
> Please help me understand this better. It appears to me that when the
> client->backend pipe fills up, pqSendSome() consumes any incoming
> NOTICE/WARNING messages before waiting, which should prevent deadlock.

Hm, I had forgotten that the low-level pqSendSome routine does that.
That makes the PQconsumeInput call in PQputCopyData redundant (or
almost; see below).  The parseInput call is still needed, because it's
there to pull NOTICE messages out of the input buffer and get rid of
them, rather than possibly having the input buffer grow to exceed
memory.  But when there's nothing for it to do, parseInput is cheap
enough that there's no real need to bypass it.

In short, if you just remove the PQconsumeInput call I think you'll find
that it does what you want.

The only case where it's helpful to have it there is if there's an
incomplete message in the input buffer, as parseInput isn't quite so
fast if it has to determine that the message is incomplete.  Without
the PQconsumeInput call, the incomplete-message state could persist
for a long time, and you'd pay the parseInput overhead each time through
PQputCopyData.  However, that's certainly not the normal situation,
so I think we could leave that case slightly pessimal.  It's certainly
true that that path in parseInput is a lot faster than a kernel call,
so it'd still be better than it is now.
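
For context, the client-side loop that exercises this path is ordinary libpq
COPY usage, roughly like the following (the table name and row contents are
placeholders, and this sketch says nothing about the internal change itself):

#include <stdio.h>
#include <libpq-fe.h>

static int
copy_rows(PGconn *conn)
{
    PGresult *res = PQexec(conn, "COPY test_table FROM STDIN");

    if (PQresultStatus(res) != PGRES_COPY_IN)
    {
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
        PQclear(res);
        return -1;
    }
    PQclear(res);

    for (int i = 0; i < 100000; i++)
    {
        char line[64];
        int  len = snprintf(line, sizeof(line), "%d\tvalue %d\n", i, i);

        /* Every row goes through PQputCopyData; how cheaply it clears
         * NOTICEs out of the input buffer is what is being discussed. */
        if (PQputCopyData(conn, line, len) != 1)
            return -1;
    }

    if (PQputCopyEnd(conn, NULL) != 1)
        return -1;

    res = PQgetResult(conn);           /* final command status for the COPY */
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK) ? 0 : -1;
    PQclear(res);
    return ok;
}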

regards, tom lane



Re: [HACKERS] Checkpoint question

2006-01-12 Thread Qingqing Zhou


On Thu, 12 Jan 2006, Jim C. Nasby wrote:

>
> It sounds like worrying about this would be much more interesting on a
> machine that is seeing both a fairly heavy IO load (meaning checkpoint
> will both take longer and affect other workloads more) and is seeing a
> pretty high rate of buffer updates (meaning that we'd likely do a bunch
> of extra work as part of the checkpoint if we didn't take note of
> exactly what buffers needed to be flushed). Unfortunately I don't think
> there's any way for the backend to know much about either condition
> right now, so it couldn't decide when it made sense to make a list of
> buffers to flush. Maybe in the future...
>

The scenario you mentioned happens in many OLTP applications. There is no
need for the backend to know this -- we can leave the decision to the DBA:
CHECKPOINT FULL or CHECKPOINT PARTIAL. If you have machines where you can
observe the CHECKPOINT duration, that would be sweet.

Regards,
Qingqing




Re: [HACKERS] Checkpoint question

2006-01-12 Thread Jim C. Nasby
On Thu, Jan 12, 2006 at 04:50:30AM -0500, Qingqing Zhou wrote:
> 
> 
> On Thu, 12 Jan 2006, Simon Riggs wrote:
> >
> > The only buffers this will miss are ones that were clean throughout the
> > whole of the last checkpoint cycle, yet have been dirtied between the
> > start of the checkpoint pass and when the pass reaches it.
> 
> I agree with the analysis, but I am not sure about the current interval
> between checkpoints, so it depends. On an I/O-intensive machine I would
> guess the checkpoint interval is not small. Also, in this environment, one
> more round of lock cycles should be relatively cheap. But we currently
> don't have any numbers on hand ...

It sounds like worrying about this would be much more interesting on a
machine that is seeing both a fairly heavy IO load (meaning checkpoint
will both take longer and affect other workloads more) and is seeing a
pretty high rate of buffer updates (meaning that we'd likely do a bunch
of extra work as part of the checkpoint if we didn't take note of
exactly what buffers needed to be flushed). Unfortunately I don't think
there's any way for the backend to know much about either condition
right now, so it couldn't decide when it made sense to make a list of
buffers to flush. Maybe in the future...

As for the questionable benefit of delaying work for the bgwriter or the next
checkpoint, I think there are a number of scenarios where it would make
sense. A simple example is doing some kind of I/O-intensive processing once a
minute, with default checkpoint timing. Sometimes a checkpoint will occur at
the same time as the once-a-minute process, and in those cases reducing the
amount of work the checkpoint does will definitely help even out the load on
the machine.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Libpq COPY optimization

2006-01-12 Thread Alon Goldshuv
Tom,

> It's not enough better, because it will still deadlock given a
> sufficiently large message-to-send.  I don't think you can postpone
> the clearing-input action until after all the data is sent.
> 
> regards, tom lane

Please help me understand this better. It appears to me that when the
client->backend pipe fills up, pqSendSome() consumes any incoming
NOTICE/WARNING messages before waiting, which should prevent deadlock. I'll
look at the code again, maybe I missed something.

Thx,
Alon.





Re: [HACKERS] Checkpoint question

2006-01-12 Thread Qingqing Zhou


On Thu, 12 Jan 2006, Simon Riggs wrote:
>
> The only buffers this will miss are ones that were clean throughout the
> whole of the last checkpoint cycle, yet have been dirtied between the
> start of the checkpoint pass and when the pass reaches it.

I agree with the analysis, but I am not sure about the current interval
between checkpoints, so it depends. On an I/O-intensive machine I would
guess the checkpoint interval is not small. Also, in this environment, one
more round of lock cycles should be relatively cheap. But we currently
don't have any numbers on hand ...

Regards,
Qingqing





Re: [HACKERS] Checkpoint question

2006-01-12 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
>> On Wed, 11 Jan 2006, Tom Lane wrote:
>>> It'd be possible to do something like this: after establishing
>>> RedoRecPtr, make one quick pass through the buffers and make a list of
>>> what needs to be dumped at that instant.  Then go back and do the actual
>>> I/O for only those buffers.

> To compile the list, you'd need to stop all buffer write activity while
> you compile it, which sounds a high price for the benefit.

Not really --- I was only thinking of narrowing the window for "extra"
writes to get in, not removing the window entirely.  Don't need any sort
of global lock for that.

But I agree with your analysis that the extra cycles won't save much in
practice.  The objection I see is that two lock cycles on each targeted
buffer are a nontrivial expense in SMP machines.
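
A toy sketch of the two-pass shape (no locking at all, nothing like the real
buffer manager; the locking omitted here is exactly the nontrivial cost
mentioned above):

#include <stdbool.h>
#include <stdio.h>

#define NBUFFERS 16

static bool dirty[NBUFFERS];

static int collect_dirty(int *list)        /* pass 1: snapshot at RedoRecPtr */
{
    int n = 0;
    for (int i = 0; i < NBUFFERS; i++)
        if (dirty[i])
            list[n++] = i;
    return n;
}

static void write_listed(const int *list, int n)   /* pass 2: do the I/O */
{
    for (int i = 0; i < n; i++)
    {
        printf("checkpoint writes buffer %d\n", list[i]);
        dirty[list[i]] = false;
    }
}

int main(void)
{
    for (int i = 0; i < NBUFFERS; i += 2)
        dirty[i] = true;                   /* dirty at checkpoint start */

    int list[NBUFFERS];
    int n = collect_dirty(list);

    dirty[3] = true;                       /* dirtied after the snapshot ... */
    write_listed(list, n);                 /* ... so it is left for next time */
    return 0;
}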

regards, tom lane



Re: [HACKERS] Checkpoint question

2006-01-12 Thread Simon Riggs
On Wed, 2006-01-11 at 22:33 -0500, Qingqing Zhou wrote: 
> On Wed, 11 Jan 2006, Tom Lane wrote:

> > It'd be possible to do something like this: after establishing
> > RedoRecPtr, make one quick pass through the buffers and make a list of
> > what needs to be dumped at that instant.  Then go back and do the actual
> > I/O for only those buffers.  I'm dubious that this will really improve
> > matters though, as the net effect is just to postpone I/O that will
> > happen anyway soon after the checkpoint (as soon as the bgwriter goes
> > back to normal activity).
> >
> Looks like a good idea. I don't worry too much about the problem you
> mentioned. AFAIK, checkpoint has two targets: (1) cleanup buffer pool; (2)
> reduce recovery time;

I think it's a good idea, but I agree it does not save much in practice.

The only buffers this will miss are ones that were clean throughout the
whole of the last checkpoint cycle, yet have been dirtied between the
start of the checkpoint pass and the time the pass reaches them. Given the
relative durations of those two intervals, I would guess that this would
yield very few buffers.

Further, if you miss a buffer at one checkpoint, it cannot avoid being
written at the next. If the buffer is written to again in the next
checkpoint cycle, then we combine the two I/Os and save effort. If the
buffer is not written to in the next cycle -- and this seems likely, since
it wasn't written to in the last -- we do not avoid the I/O, we just defer
it to the next checkpoint.

So the only buffer I/O we would save is for buffers that
- are not written to in checkpoint cycle, n (by definition)
- are written to *during* the checkpoint
- are written to again during the next checkpoint cycle, n+1

You could do the math, or measure it, though my guess is that this
wouldn't save more than a few percentage points on the checkpoint
process.

To compile the list, you'd need to stop all buffer write activity while
you compile it, which sounds like a high price for the benefit.

> For (2), it is clear that the above idea will work, since recovery will
> always read the data page to check its LSN -- that is the source of the
> cost. For (1), we have the bgwriter, and part of the reason it is designed
> is to clean up the buffer pool.

Deferring I/O gains us nothing in the long run, though it would speed up
recovery time by a fraction -- but then crash recovery time is not much of
an issue, is it? If it is, there are other optimizations.

Best Regards, Simon Riggs




Re: [HACKERS] Checkpoint question

2006-01-12 Thread Simon Riggs
On Wed, 2006-01-11 at 20:46 -0500, Qingqing Zhou wrote:
> 
> On Wed, 11 Jan 2006, Simon Riggs wrote:
> 
> >
> > Probably a good idea to read the Gray & Reuter or Weikum & Vossen books on
> > transactional systems theory before any such discussion.
> >
> So can you give me some hints why my thoughts are just wrong?

Your thoughts are very often good ones, ISTM. This area is a minefield
of misunderstanding, as are many others, but in this case some external
wisdom is readily available. Those books helped me understand things
better; it was wrong of me to assume you hadn't already read them.

Best Regards, Simon Riggs

