Re: [HACKERS] How to REINDEX in high volume environments?

2002-09-27 Thread Shridhar Daithankar

On 28 Sep 2002 at 17:08, Justin Clift wrote:

> Have moved the indexes to another drive, then created symlinks to them.
> Ran a benchmark against the database, REINDEX'd the tables, VACUUM FULL
> ANALYZE'd, prepared to re-run the benchmark again and guess what?
> 
> The indexes were back on the original drive.
> Is there a way to allow REINDEX to work without having this side effect?
> 
> Pre-creating a bunch of dangling symlinks doesn't work (tried that; it
> gives an "ERROR:  cannot create accounts_pkey: File exists" on FreeBSD
> 4.6.2 when using the REINDEX).

Looks like we should have a subdirectory in the database directory which 
stores indexes.

Maybe transaction logs and indexes could go in separate directories which 
can be symlinked. Linking a directory is a much simpler solution than 
linking a file.

I suggest we have per-database transaction logs and indexes created in 
separate subdirectories for each database. Further, given that large 
tables are segmented after one GB in size, a table could optionally have 
its own subdirectory.

At the cost of a few inodes, PostgreSQL would gain much more flexibility 
and hence tunability.

Maybe a TODO for 7.4? Anyone?

Bye
 Shridhar

--
Software, n.:   Formal evening attire for female computer analysts.





[HACKERS] How to REINDEX in high volume environments?

2002-09-27 Thread Justin Clift

Hi all,

Am experimenting to find out what kind of performance gains are achieved
from moving indexes to different SCSI drives than the WAL files and the
data itself, etc.

Have come across an interesting problem.

Have moved the indexes to another drive, then created symlinks to them.

Ran a benchmark against the database, REINDEX'd the tables, VACUUM FULL
ANALYZE'd, prepared to re-run the benchmark again and guess what?

The indexes were back on the original drive.

The process of REINDEX-ing obviously creates another file and then drops the
original.

Is there a way to allow REINDEX to work without having this side effect?

Pre-creating a bunch of dangling symlinks doesn't work (tried that; it
gives an "ERROR:  cannot create accounts_pkey: File exists" on FreeBSD
4.6.2 when using the REINDEX).

Any suggestions?

:-)

Regards and best wishes,

Justin Clift

-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
   - Indira Gandhi




Re: [HACKERS] making use of large TLB pages

2002-09-27 Thread Neil Conway

Okay, I did some more research into this area. It looks like it will
be feasible to use large TLB pages for PostgreSQL.

Tom Lane <[EMAIL PROTECTED]> writes:
> It wasn't clear from your description whether large-TLB shmem segments
> even have IDs that one could use to determine whether "the segment still
> exists".

There are two types of hugepages:

(a) private: Not shared on fork(), not accessible to processes
other than the one that allocates the pages.

(b) shared: Shared across a fork(), accessible to other
processes: different processes can access the same segment
if they call sys_alloc_hugepages() with the same key.

So for a standalone backend, we can just use private pages (probably
worth using private hugepages rather than malloc, although I doubt it
matters much either way).

> > Another possibility might be to still allocate a small SysV shmem
> > area, and use that to provide the interlock, while we allocate the
> > buffer area using sys_alloc_hugepages. That's somewhat of a hack, but
> > I think it would resolve the interlock problem, at least.
> 
> Not a bad idea ... I have not got a better one offhand ... but watch
> out for SHMMIN settings.

As it turns out, this will be completely unnecessary. Since hugepages
are an in-kernel data structure, the kernel takes care of ensuring
that dying processes don't orphan any unused hugepage segments. The
logic works like this (for shared hugepages):

(a) sys_alloc_hugepages() without IPC_EXCL will return a
pointer to an existing segment, if there is one that
matches the key. If an existing segment is found, the
usage counter for that segment is incremented. If no
matching segment exists, an error is returned. (I'm pretty
sure the usage counter is also incremented after a fork(),
but I'll double-check that.)

(b) sys_free_hugepages() decrements the usage counter

(c) when a process that has allocated a shared hugepage dies
for *any reason* (even kill -9), the usage counter is
decremented

(d) if the usage counter for a given segment ever reaches
zero, the segment is deleted and the memory is free'd.

If we used a key that would remain the same between runs of the
postmaster, this should ensure that there isn't a possibility of two
independent sets of backends operating on the same data dir. The most
logical way to do this IMHO would be to just hash the data dir, but I
suppose the current method of using the port number should work as
well.

To elaborate on (a) a bit, we'd want to use this logic when allocating
a new set of hugepages on postmaster startup (a rough code sketch
follows the list):

(1) call sys_alloc_hugepages() without IPC_EXCL. If it returns
an error, we're in the clear: there's no page matching
that key. If it returns a pointer to a previously existing
segment, panic: it is very likely that there are some
orphaned backends still active.

(2) If the previous call didn't find anything, call
sys_alloc_hugepages() again, specifying IPC_EXCL to create
a new segment.
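
To make that concrete, here's a minimal C sketch of the interlock. Note
that the syscall number and the sys_alloc_hugepages() signature below
are my assumptions based on the current 2.5 development patches -- the
API isn't settled -- so read this as pseudocode that happens to compile:

    /* Sketch of the postmaster startup interlock on shared hugepages.
     * ASSUMED: the syscall number and the (key, addr, len, prot, flag)
     * signature follow the experimental Linux 2.5 hugepage patches. */
    #include <sys/syscall.h>
    #include <sys/ipc.h>             /* IPC_CREAT, IPC_EXCL */
    #include <sys/mman.h>            /* PROT_*, MAP_FAILED */
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>

    #ifndef __NR_alloc_hugepages
    #define __NR_alloc_hugepages 250 /* assumed 2.5 syscall number */
    #endif

    static void *
    alloc_hugepages(int key, unsigned long len, int prot, int flag)
    {
        return (void *) syscall(__NR_alloc_hugepages, key, 0UL, len,
                                prot, flag);
    }

    static void *
    create_shared_buffers(int key, unsigned long len)
    {
        /* Step 1: probe without IPC_EXCL.  Per the semantics above,
         * this attaches an existing segment, or fails if no segment
         * matches the key. */
        void *seg = alloc_hugepages(key, len, PROT_READ | PROT_WRITE, 0);

        if (seg != MAP_FAILED)
        {
            /* A live segment exists, so orphaned backends are very
             * likely still attached to it: refuse to start. */
            fprintf(stderr, "hugepage segment for key %d already "
                    "exists; orphaned backends still running?\n", key);
            exit(1);
        }

        /* Step 2: nothing matched; create our own segment exclusively
         * (I'm assuming IPC_CREAT is wanted alongside IPC_EXCL). */
        seg = alloc_hugepages(key, len, PROT_READ | PROT_WRITE,
                              IPC_CREAT | IPC_EXCL);
        if (seg == MAP_FAILED)
        {
            perror("sys_alloc_hugepages");
            exit(1);
        }
        return seg;
    }

The key would come from hashing the data dir (or from the port number,
as now); the kernel refcounting in (c)/(d) is what guarantees the
step-1 probe only succeeds while some process still holds the segment.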

Now, the question is: how should this be implemented? You recently
did some of the legwork toward supporting different APIs for shared
memory / semaphores, which makes this work easier -- unfortunately,
some additional stuff is still needed. Specifically, support for
hugepages is a kernel configuration option that may or may not be enabled
(if it's disabled, the syscall returns a specific error). So I believe
the logic is something like the following (a runtime fallback sketch
appears after the list):

- if compiling on a Linux system, enable support for hugepages
  (the regular SysV stuff is still needed as a backup)

- if we're compiling on a Linux system but the kernel headers
  don't define the syscalls we need, use some reasonable
  defaults (e.g. the syscall numbers for the current hugepage
  syscalls in Linux 2.5)

- at runtime, try to make one of these syscalls. If it fails,
  fall back to the SysV stuff.
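
In code, that runtime check could be as small as the sketch below,
reusing the alloc_hugepages() wrapper from the earlier sketch. Which
errno signals "compiled without hugepage support" is a guess on my
part, and SysVSharedMemoryCreate() is just a hypothetical name for the
existing SysV code path:

    /* hypothetical name for the existing SysV shmem code path */
    extern void *SysVSharedMemoryCreate(int key, unsigned long len);

    static void *
    shared_memory_create(int key, unsigned long len)
    {
        void *seg = alloc_hugepages(key, len, PROT_READ | PROT_WRITE,
                                    IPC_CREAT | IPC_EXCL);

        if (seg != MAP_FAILED)
            return seg;          /* hugepages are available: use them */

        /* Any failure -- ENOSYS from a kernel without the syscalls,
         * the config-disabled error, or plain exhaustion -- falls
         * back to the regular SysV shmem path. */
        return SysVSharedMemoryCreate(key, len);
    }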

Does that sound reasonable?

Any other comments would be appreciated.

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]> || PGP Key ID: DB3C29FC





Re: [HACKERS] Reconstructing FKs in pg_dump

2002-09-27 Thread Bruce Momjian


Both are done, and in CVS in /contrib/adddepend.

---

Matthew T. O'Connor wrote:
> From: "Tom Lane" <[EMAIL PROTECTED]>
> > However, if we are going to put that kind of knowledge into pg_dump,
> > it would only be a small further step to have it dump these triggers
> > as ALTER TABLE ADD CONSTRAINT commands instead.  Which would be a lot
> > better for forward compatibility than dumping the raw triggers.
> 
> There was some talk of adding Rod Taylor's identifies upgrade script to
> contrib, or mentioning it in the release.  I think that it upgrades Foreign
> key, Unique, and Serial constraints, is that relevant here?  Could it be
> used (or modified) to handle this situation?  Just a thought.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




Re: [HACKERS] version mismatch detection doesn't work

2002-09-27 Thread Bruce Momjian


I didn't think we were supposed to throw an error on a mismatch, were
we?

---

Alvaro Herrera wrote:
> Hackers,
> 
> Seems the functionality to detect old versions of the postmaster with
> a newer psql doesn't work.  Here, the server is 7.2.1:
> 
> $ psql alvherre
> ERROR:  parser: parse error at or near "."
> Welcome to psql 7.3b1, the PostgreSQL interactive terminal.
> 
> Type:  \copyright for distribution terms
>\h for help with SQL commands
>\? for help on internal slash commands
>\g or terminate with semicolon to execute query
>\q to quit
> 
> alvherre=> select version();
>                            version
> --------------------------------------------------------------
>  PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.96
> (1 row)
> 
> alvherre=> 
> 
> -- 
> Alvaro Herrera ()
> "Nunca confiar? en un traidor.  Ni siquiera si el traidor lo he creado yo"
> (Bar?n Vladimir Harkonnen)

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




[HACKERS] version mismatch detection doesn't work

2002-09-27 Thread Alvaro Herrera

Hackers,

Seems the functionality to detect old versions of the postmaster with
a newer psql doesn't work.  Here, the server is 7.2.1:

$ psql alvherre
ERROR:  parser: parse error at or near "."
Welcome to psql 7.3b1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help on internal slash commands
   \g or terminate with semicolon to execute query
   \q to quit

alvherre=> select version();
                           version
--------------------------------------------------------------
 PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

alvherre=> 

-- 
Alvaro Herrera ()
"Nunca confiaré en un traidor.  Ni siquiera si el traidor lo he creado yo"
(Barón Vladimir Harkonnen)




Re: [HACKERS] Reconstructing FKs in pg_dump

2002-09-27 Thread Matthew T. O'Connor

From: "Tom Lane" <[EMAIL PROTECTED]>
> However, if we are going to put that kind of knowledge into pg_dump,
> it would only be a small further step to have it dump these triggers
> as ALTER TABLE ADD CONSTRAINT commands instead.  Which would be a lot
> better for forward compatibility than dumping the raw triggers.

There was some talk of adding Rod Taylor's identifies upgrade script to
contrib, or mentioning it in the release.  I think that it upgrades Foreign
key, Unique, and Serial constraints; is that relevant here?  Could it be
used (or modified) to handle this situation?  Just a thought.




Re: [HACKERS] Cause of missing pg_clog files

2002-09-27 Thread Bruce Momjian


OK, we need a decision on whether we are going to do a 7.2.3 or just
have it in beta3.  If it is in 7.2.3, I would not mention it in the
beta3 release notes.

---

Tom Lane wrote:
> Yesterday I reported a WAL problem that could lead to tuples not being
> marked as committed-good or committed-dead after we'd already removed
> the pg_clog segment that had their transaction's commit status.
> I wasn't completely satisfied with that, though, because on further
> reflection it seemed a very low-probability mechanism.  I kept digging,
> and finally came to the kind of bug that qualifies as a big DOH :-(
> 
> If you run a database-wide VACUUM (one with no specific target table
> mentioned) as a non-superuser, then the VACUUM doesn't process tables
> that don't belong to you.  But it will advance pg_database.datvacuumxid
> anyway, which means that pg_clog could be truncated while old transaction
> references still remain unmarked in those other tables.
> 
> In words of one syllable: running VACUUM as a non-superuser can cause
> irrecoverable data loss in any 7.2.* release.
> 
> I think this qualifies as a "must fix" bug.  I recommend we back-patch
> a fix for this into the REL7_2 branch and put out a 7.2.3 release.
> We should also fix the "can't wait without a PROC" bug that was solved
> a few days ago.
> 
>   regards, tom lane

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




Re: [HACKERS] Cause of missing pg_clog files

2002-09-27 Thread Bruce Momjian

Tom Lane wrote:
> Yesterday I reported a WAL problem that could lead to tuples not being
> marked as committed-good or committed-dead after we'd already removed
> the pg_clog segment that had their transaction's commit status.
> I wasn't completely satisfied with that, though, because on further
> reflection it seemed a very low-probability mechanism.  I kept digging,
> and finally came to the kind of bug that qualifies as a big DOH :-(
> 
> If you run a database-wide VACUUM (one with no specific target table
> mentioned) as a non-superuser, then the VACUUM doesn't process tables
> that don't belong to you.  But it will advance pg_database.datvacuumxid
> anyway, which means that pg_clog could be truncated while old transaction
> references still remain unmarked in those other tables.
> 
> In words of one syllable: running VACUUM as a non-superuser can cause
> irrecoverable data loss in any 7.2.* release.
> 
> I think this qualifies as a "must fix" bug.  I recommend we back-patch
> a fix for this into the REL7_2 branch and put out a 7.2.3 release.
> We should also fix the "can't wait without a PROC" bug that was solved
> a few days ago.

Wow, you sure have found some good bugs in the past few days.  Nice job.

Yes, I agree we should push out a 7.2.3, and I think we are now ready
for beta3.  I will work on docs and packaging now.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




[HACKERS] Cause of missing pg_clog files

2002-09-27 Thread Tom Lane

Yesterday I reported a WAL problem that could lead to tuples not being
marked as committed-good or committed-dead after we'd already removed
the pg_clog segment that had their transaction's commit status.
I wasn't completely satisfied with that, though, because on further
reflection it seemed a very low-probability mechanism.  I kept digging,
and finally came to the kind of bug that qualifies as a big DOH :-(

If you run a database-wide VACUUM (one with no specific target table
mentioned) as a non-superuser, then the VACUUM doesn't process tables
that don't belong to you.  But it will advance pg_database.datvacuumxid
anyway, which means that pg_clog could be truncated while old transaction
references still remain unmarked in those other tables.

In words of one syllable: running VACUUM as a non-superuser can cause
irrecoverable data loss in any 7.2.* release.

I think this qualifies as a "must fix" bug.  I recommend we back-patch
a fix for this into the REL7_2 branch and put out a 7.2.3 release.
We should also fix the "can't wait without a PROC" bug that was solved
a few days ago.

regards, tom lane




Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-27 Thread Florian Weimer

Tom Lane <[EMAIL PROTECTED]> writes:

> We'd be happiest with a filesystem that journals its own metadata and
> not the user data in the file(s).  I dunno if there are any.

Most journalling file systems work this way.  Data journalling is not
very widespread, AFAIK.

-- 
Florian Weimer[EMAIL PROTECTED]
University of Stuttgart   http://CERT.Uni-Stuttgart.DE/people/fw/
RUS-CERT  fax +49-711-685-5898




Re: [HACKERS] [ODBC] [s.hetze@linux-ag.de: PostgreSQL integration Visual Basic,

2002-09-27 Thread Joe Conway

Michael Meskes wrote:
> into. As far as I understand the problem, the application uses stored
> procedures for each and every select statement. Thus he needs his
> procedures to return the whole query result, which is not doable with
> our functions.

It is in 7.3.

If the return tuple definition is fixed, instead of:

  exec sp_myproc()
  go

do:

  select * from sp_myproc();

If the return tuple definition is *not* fixed, do:

  select * from sp_myproc() as table_alias([col definition list]);

Does this help any? Can he try the 7.3 beta?

Joe




Re: [HACKERS] About connectby() again

2002-09-27 Thread Masaru Sugawara

On Thu, 26 Sep 2002 16:32:08 -0700
Joe Conway <[EMAIL PROTECTED]> wrote:


> Masaru Sugawara wrote:
> > The previous patch fixed an infinite recursion bug in 
> > contrib/tablefunc/tablefunc.c:connectby. But another unmanageable error
> > seems to occur even if a table has commonplace tree data (see below).
> > 
> > I would think the patch, ancestor check, should be
> > 
> >   if (strstr(branch_delim || branchstr->data || branch_delim,
> >branch_delim || current_key || branch_delim))
> > 
> > This is my image, not real code. However, if branchstr->data includes
> > branch_delim, my image will not be perfect.
> 
> Good point. Thank you Masaru for the suggested fix.
> 
> Attached is a patch to fix the bug found by Masaru. His example now produces:
> 
> regression=# SELECT * FROM connectby('connectby_tree', 'keyid', 
> 'parent_keyid', '11', 0, '-') AS t(keyid int, parent_keyid int, level int, 
> branch text);
>  keyid | parent_keyid | level |  branch
> -------+--------------+-------+----------
>     11 |              |     0 | 11
>     10 |           11 |     1 | 11-10
>    111 |           11 |     1 | 11-111
>      1 |          111 |     2 | 11-111-1
> (4 rows)
> 
> While making the patch I also realized that the "no show branch" form of the 
> function was not going to work very well for recursion detection. Therefore 
> there is now a default branch delimiter ('~') that is used internally, for 
> that case, to enable recursion detection to work. If you need a different 
> delimiter for your specific data, you will have to use the "show branch" form 
> of the function.
> 
> If there are no objections, please apply. Thanks,


I have no objection to your internally adding strings to detect recursion,
and I agree with your definition--the default delimiter is a tilde.
Thanks a lot.
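
For what it's worth, here is a tiny standalone C sketch of the
delimiter-wrapping idea (the function name and fixed-size buffers are
mine, not the tablefunc code), showing why it stops key "11" from
falsely matching inside "111":

    /* Hypothetical sketch of the delimiter-wrapped ancestor check.
     * Wrapping both strings in the delimiter makes strstr() match
     * whole keys only; assumes keys don't contain the delimiter. */
    #include <stdio.h>
    #include <string.h>

    static int
    key_in_branch(const char *branch, const char *key, const char *delim)
    {
        char wbranch[1024];
        char wkey[256];

        snprintf(wbranch, sizeof(wbranch), "%s%s%s", delim, branch, delim);
        snprintf(wkey, sizeof(wkey), "%s%s%s", delim, key, delim);
        return strstr(wbranch, wkey) != NULL;
    }

    int
    main(void)
    {
        printf("%d\n", key_in_branch("11-111", "1", "-"));   /* 0: no false match */
        printf("%d\n", key_in_branch("11-111", "111", "-")); /* 1: real recursion */
        return 0;
    }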



Regards,
Masaru Sugawara






Re: [HACKERS] [PHP] WebDB Developers Wanted

2002-09-27 Thread Robert Treat

AFAIK it's because it is a members-only list. It is archived, and the
archives are web-viewable if you know where to look (not that I can
remember where, but I have done it before).

Robert Treat

On Fri, 2002-09-27 at 02:48, Markus Bertheau wrote:
> On Mon, 2002-09-16 at 10:52, Christopher Kings-Lynne wrote:
> > Developers mailing list: [EMAIL PROTECTED]
> 
> Hmm, that list does not appear on the sourceforge Lists page. Why?
> 
> -- 
> Markus Bertheau.
> Berlin, Berlin.
> Germany.
> 







Re: [HACKERS] hacker help: PHP-4.2.3 patch to allow restriction of

2002-09-27 Thread Larry Rosenman

On Thu, 2002-09-26 at 22:42, Tom Lane wrote:
> Jim Mercer <[EMAIL PROTECTED]> writes:
> > as best i can understand, there is no way to get apach/php/pgsql configured
> > (using "PostgreSQL's native access mappings") that would disallow php code
> > in one virtual host from connecting to any database on the system.
> 
> Betraying my ignorance of PHP here: what does a server supporting
> multiple virtual hosts look like from the database's end?  Can we
> tell the difference at all between connections initiated on behalf
> of one virtual host from those initiated on behalf of another?
Nope, you can't, to the best of my knowledge.  You just get a standard
connect string.  (Assuming name-based vhosts here, which is what most
should be, modulo SSL-based ones.)

[snip]
 
-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749





Re: [HACKERS] Web site

2002-09-27 Thread Sean Chittenden

> > > could be done based on IP (yes it is inaccurate but it is close enough
> > > and has the same net effect: pushing people off the main web server) or
> > > it could be done by simply redirecting to a random mirror.
> > 
> > Have tried both in the past with disastrous results ...
> 
> What method will be employed instead?

Anyone thought about using GeoIP and writing a script that'd dump its
database into something PostgreSQL could use?  Just a
thought.  -sc

-- 
Sean Chittenden
