Re: [HACKERS] stats_block_level

2007-07-28 Thread Erik Jones

On Jul 27, 2007, at 6:45 PM, Jim Nasby wrote:


On Jul 26, 2007, at 2:03 PM, Tom Lane wrote:

So maybe the *real* question to ask is why we have separate GUCs for
stats_row_level and stats_block_level.  Shouldn't we fold them into a
single switch?  It's hard to see what having just one of them  
turned on

will save.


IIRC, the guys at Emma have seen a performance difference with  
stats_block_level off and row_level on, presumable due in part to  
having 150k tables.


Erik? Kim?


Well, we turned it off at the same time we moved from 8.2.3 to 8.2.4  
so the actual culprit may have been what prompted the stats collector  
improvement that went into that release.  I could test turning it  
back on this week if you like -- I certainly would like to have my  
blks_read/cach_hits stats back.  Toggling stats_block_level will  
respond to a reload, yes?


Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Quick idea for reducing VACUUM contention

2007-07-28 Thread Jim Nasby

On Jul 27, 2007, at 1:49 AM, Alvaro Herrera wrote:

ITAGAKI Takahiro wrote:

"Simon Riggs" <[EMAIL PROTECTED]> wrote:


Read the heap blocks in sequence, but make a conditional lock for
cleanup on each block. If we don't get it, sleep, then try again  
when we

wake up. If we fail the second time, just skip the block completely.


It would be cool if we could do something like sweep a range of pages,
initiate IO for those that are not in shared buffers, and while  
that is
running, lock and clean up the ones that are in shared buffers,  
skipping

those that are not lockable right away; when that's done, go back to
those buffers that were gotten from I/O and clean those up.  And retry
the locking for those that couldn't be locked the first time around,
also conditionally.  And when that's all done, a third pass could get
those blocks that weren't cleaned up in none of the previous passes  
(and

this time the lock would not be conditional).


Would that be substantially easier than just creating a bgreader?
--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] stats_block_level

2007-07-28 Thread Jim Nasby

On Jul 26, 2007, at 2:03 PM, Tom Lane wrote:

So maybe the *real* question to ask is why we have separate GUCs for
stats_row_level and stats_block_level.  Shouldn't we fold them into a
single switch?  It's hard to see what having just one of them  
turned on

will save.


IIRC, the guys at Emma have seen a performance difference with  
stats_block_level off and row_level on, presumable due in part to  
having 150k tables.


Erik? Kim?
--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Machine available for community use

2007-07-28 Thread Jim Nasby
Where in Texas? I might be able to assist on-site if needed (though I  
don't know much about linuxes).


On Jul 25, 2007, at 11:31 AM, Gavin M. Roy wrote:


One thing to take into account is I dont have physical access to the
box (It is in TX, I am in PA).  All installs but Gentoo will be
performed by a well trained NOC monkey. *cough*

On 7/25/07, Dave Page <[EMAIL PROTECTED]> wrote:



> --- Original Message ---
> From: Tom Lane <[EMAIL PROTECTED]>
> To: Greg Smith <[EMAIL PROTECTED]>
> Sent: 25/07/07, 18:54:50
> Subject: Re: [HACKERS] Machine available for community use
>
> Another fairly big issue is that we need to know whether  
measurements we
> take in August are comparable to measurements we take in  
October, so a
> fairly stable platform is important.  As you say, a fast- 
changing kernel
> would make it difficult to have any confidence about  
comparability over
> time.  That would tend to make me vote for RHEL/Centos, where  
long-term
> stability is an explicit development goal.  Debian stable might  
do too,
> though I'm not as clear about their update criteria as I am  
about Red Hat's.


Perhaps RH could donate us a RHEL/RHN licence for this?

/D

---(end of  
broadcast)---

TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate



---(end of  
broadcast)---

TIP 6: explain analyze is your friend



--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] New Index

2007-07-28 Thread David Fuhry

algatt,

   You might find the examples of PostGIS [1] and SP-GiST [2] helpful 
in implementing your PR-Tree index, although the former implements only 
2D R-Trees and the latter implements tries, quadtrees, and kd-trees but 
not R-Trees (and extends GiST).


   Yours may be the first PostgreSQL R-Tree implementation which takes 
advantage of bulk-loading.  The Oracle Spatial guys have had a 
bulk-loading implementation [3] since 10g, based on [4].


-Dave Fuhry

[1] http://postgis.refractions.net/
[2] http://www.cs.purdue.edu/spgist/
[3] Ning An, Kothuri Venkata Ravi Kanth, Siva Ravada: Improving 
Performance with Bulk-Inserts in Oracle R-Trees. VLDB 2003: 948-951

[4] L. Arge, K. Hinrichs, et al. Efficient Bulk Operations
on Dynamic R-trees. In Workshop on Algorithm Engi-
neering and Experimentation (ALENEX), pages 328–
348, 1999.


Heikki Linnakangas wrote:

algatt wrote:

I am trying to create a new index for high multi-dimensional indexing.
The index I'm trying to use in PR-Tree ( www.cs.duke.edu/~yike/prtree/
).

I have created a new source file in "src/backend/access/" and for now
I have created the empty functions: build, insert, vacuum, etc... I
believe that this has to be done in C. Then I ran "/src/include/
catalog/unused_oids" to find the unused oids, and then i added the
oids to the functions in the source "file /src/include/catalog/
pg_proc.h". I re-ran initdb and inserted the functions in pg_am table.


You seem to be on the right track.


Now obviously I have to add the proper code to the functions, but the
version of the code I have for the PR-Tree is in C++. Is there a way
to implement the functions using the existing PR-Tree code in C++?


No, we don't support C++ in the backend. It's almost certainly easier to
just start from scratch anyway, because integrating it with all the
PostgreSQL infrastructure like the buffer manager, vacuum, WAL-logging,
data type and operator system.



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] default_text_search_config and expression indexes

2007-07-28 Thread Oleg Bartunov

On Fri, 27 Jul 2007, Bruce Momjian wrote:


Magnus Hagander wrote:

However, the big problem is that the expressions used in expression
indexes should not change their output based on the value of a GUC
variable (because it would corrupt the index), but in the case above,
default_text_search_config controls what configuration is used, and
hence the output of to_tsvector is changed if default_text_search_config
changes.


It wuoldn't actually *corrupt* the index, right? You could end up with
wrong results, which might be regarded as corruption in one way, but as
long as you change the value back the index still works, no?


Right, it would _temporarily_ corrupt it.  ;-)


We have a few possible options:

1) Document the problem and do nothing else.
2) Make default_text_search_config a postgresql.conf-only
   setting, thereby making it impossible to change by non-super
   users, or make it a super-user-only setting.
3) Remove default_text_search_config and require the
   configuration to be specified in each function call.

If we remove default_text_search_config, it would also make ::tsvector
casting useless as well.


I think 3 is a really bad solution.

2 is a half-bad solution. Do we have a way to say that it can be set at
database-level for example, but not at user session? Making it
superuser-only to change it but not postgresql.conf-only could accomplish
that, along with warnings in the docs for the super user about the effects
on current indexes by changing it.


OK, here is what I am thinking.  If we make default_text_search_config
super-user-only, then the user can't do SET (using "zero_damaged_pages"
as a superuser-only example):

test=> set zero_damaged_pages = on;
ERROR:  permission denied to set parameter "zero_damaged_pages"

test=> alter user guest set zero_damaged_pages = on;
ERROR:  permission denied to set parameter "zero_damaged_pages"

but the super-user can set it in postgresql.conf, or:

test=# alter user guest set zero_damaged_pages = on;
ALTER ROLE

or

test=# alter database vendor3 set zero_damaged_pages = on;
ALTER ROLE

meaning while it will be super-user-only, the administrator can set the
default for specific databases and users.  Is that the best approach?

A user can still over-ride the default by specifying the configuration
in the function call.


This is ok, but it will not work in hosting environment and still 
doesn't prevent errors.


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] New Index

2007-07-28 Thread Heikki Linnakangas
algatt wrote:
> I am trying to create a new index for high multi-dimensional indexing.
> The index I'm trying to use in PR-Tree ( www.cs.duke.edu/~yike/prtree/
> ).
> 
> I have created a new source file in "src/backend/access/" and for now
> I have created the empty functions: build, insert, vacuum, etc... I
> believe that this has to be done in C. Then I ran "/src/include/
> catalog/unused_oids" to find the unused oids, and then i added the
> oids to the functions in the source "file /src/include/catalog/
> pg_proc.h". I re-ran initdb and inserted the functions in pg_am table.

You seem to be on the right track.

> Now obviously I have to add the proper code to the functions, but the
> version of the code I have for the PR-Tree is in C++. Is there a way
> to implement the functions using the existing PR-Tree code in C++?

No, we don't support C++ in the backend. It's almost certainly easier to
just start from scratch anyway, because integrating it with all the
PostgreSQL infrastructure like the buffer manager, vacuum, WAL-logging,
data type and operator system.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] default_text_search_config and expression indexes

2007-07-28 Thread Magnus Hagander
Bruce Momjian wrote:
> Magnus Hagander wrote:
>>> However, the big problem is that the expressions used in expression
>>> indexes should not change their output based on the value of a GUC
>>> variable (because it would corrupt the index), but in the case above,
>>> default_text_search_config controls what configuration is used, and
>>> hence the output of to_tsvector is changed if default_text_search_config
>>> changes.
>> It wuoldn't actually *corrupt* the index, right? You could end up with
>> wrong results, which might be regarded as corruption in one way, but as
>> long as you change the value back the index still works, no?
> 
> Right, it would _temporarily_ corrupt it.  ;-)
> 
>>> We have a few possible options:
>>>
>>> 1) Document the problem and do nothing else.
>>> 2) Make default_text_search_config a postgresql.conf-only
>>>setting, thereby making it impossible to change by non-super
>>>users, or make it a super-user-only setting.
>>> 3) Remove default_text_search_config and require the
>>>configuration to be specified in each function call.
>>>
>>> If we remove default_text_search_config, it would also make ::tsvector
>>> casting useless as well.
>> I think 3 is a really bad solution.
>>
>> 2 is a half-bad solution. Do we have a way to say that it can be set at
>> database-level for example, but not at user session? Making it
>> superuser-only to change it but not postgresql.conf-only could accomplish
>> that, along with warnings in the docs for the super user about the effects
>> on current indexes by changing it.
> 
> OK, here is what I am thinking.  If we make default_text_search_config
> super-user-only, then the user can't do SET (using "zero_damaged_pages"
> as a superuser-only example):
> 
>   test=> set zero_damaged_pages = on;
>   ERROR:  permission denied to set parameter "zero_damaged_pages"
>   
>   test=> alter user guest set zero_damaged_pages = on;
>   ERROR:  permission denied to set parameter "zero_damaged_pages"
> 
> but the super-user can set it in postgresql.conf, or:
> 
>   test=# alter user guest set zero_damaged_pages = on;
>   ALTER ROLE
> 
> or
> 
>   test=# alter database vendor3 set zero_damaged_pages = on;
>   ALTER ROLE
> 
> meaning while it will be super-user-only, the administrator can set the
> default for specific databases and users.  Is that the best approach?

That's exactly the one I was trying to suggest. And then with clear
warnings in the docs around the place that if you set it to different
values accessing the same index (for different users, for example) weird
things may happen.

But I see the ALTER DATABASE a fairly common scenario (I know I would
use it), and from what I can tell thereis no risk at all with that.

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] New Index

2007-07-28 Thread algatt
Hello,

I am trying to create a new index for high multi-dimensional indexing.
The index I'm trying to use in PR-Tree ( www.cs.duke.edu/~yike/prtree/
).

I have created a new source file in "src/backend/access/" and for now
I have created the empty functions: build, insert, vacuum, etc... I
believe that this has to be done in C. Then I ran "/src/include/
catalog/unused_oids" to find the unused oids, and then i added the
oids to the functions in the source "file /src/include/catalog/
pg_proc.h". I re-ran initdb and inserted the functions in pg_am table.

Now obviously I have to add the proper code to the functions, but the
version of the code I have for the PR-Tree is in C++. Is there a way
to implement the functions using the existing PR-Tree code in C++?

Thanks for your help.

Regards,
Alan


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq