Re: [HACKERS] Cannot initdb in cvs tip

2004-06-20 Thread Andrew Dunstan
John Hansen said:
 On Sun, 2004-06-20 at 08:04, Dave Page wrote:
   although it says it's clearing the contents of the directory, in
   actual fact it leaves the directory structure in place, thus a
   subsequent initdb will not run without a manual clearup.
 
  Hm.  The rmtree() function in initdb.c is responsible for
  this, and I see it has WIN32-specific behavior, which is
  evidently wrong.
  Can you recommend a fix?

 The current solution does an rmdir /q /s $PGDATA if the datadir was
 created, and del /q /s $PGDATA if the directory already existed. The
 second case  will not work, as del will not remove directories.
 AFAICS, there is no easy way to do this using system() as rmdir won't
 accept wildcards, so we can't do del $PGDATA/*  rmdir $PGDATA/*.

 It seems to me that the simple answer is to put Andrew's recursive
 unlink code back in (as he suggested), which Bruce removed as rm etc.
 were being used in commands/dbcommands.c (which should work fine under
 Windows). Patch below


 you could of course rmdir /s /q $PGDATA  mkdir $PGDATA if the purpose
 is to leave the directory intact if it already existed prior to
 install.


No we can't. This was discussed months ago, IIRC. The user might very well
not have the privileges necessary to delete the directory, and might not
have the privileges to recreate it if they do.

The direct recursive delete is not a lot of code, and I must confess I
*hate* having C programs calling system() for such tasks. One of my goals
in rewriting initdb in C was to avoid any calls at all to any external
program other than postgres itself.

cheers

andrew



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Cannot initdb in cvs tip

2004-06-20 Thread Dave Page



-Original Message-
From: John Hansen [mailto:[EMAIL PROTECTED]
Sent: Sun 6/20/2004 2:27 AM
To: Dave Page
Cc: Tom Lane; PostgreSQL-development; [EMAIL PROTECTED]
Subject: Re: [HACKERS] Cannot initdb in cvs tip
 
 you could of course rmdir /s /q $PGDATA  mkdir $PGDATA if the purpose
 is to leave the directory intact if it already existed prior to install.

Permissions may not allow you to do that.

Regards, Dave

---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] batch write of dirty buffers

2004-06-20 Thread Qingqing Zhou
In checkpoint and background writer, we flush out dirty buffer pages one
page one time. Is it possible to do in a batch mode? That is, try to find
out the continous page(same tblNode, relNode, adjacent blockNum), then write
them together?
To find out continous pages, most cases can be handled by just a qsort() of
candidate dirty pages, exceptional conditions may include the segment
boundary check if we don't let OS manage file size. This change will reduce
the write times, esp. if the database is in a batch update mode. We except
to write hundreds of pages by issuing just one smgrwrite().

There are other two points may need attentions. One is in function
StartBufferIO(), which asserts InProgressBuf, that is, we can just do one
page write one time. I am not quite sure the consequence if we remove this
variable. The other is that since we will acquire many locks on the buffer
page, so we may have to increase MAX_SIMUL_LWLOCKS. This should not be a
problem.

What's your ideas?

Regards,

Qingqing
http://www.cs.toronto.edu/~zhouqq










---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] JDBC prepared statements: actually not server prepared

2004-06-20 Thread Henner Zeller

Hi,

While tracking down a query with a JDBC prepared statement, I
noticed, that the statement actually wasn't prepared but sent to the
server as 'normal' Statement. In my case, this led to a very slow query
since the normal statement does not propagate the correct type -- in my
case I queried an indexed coloumn of type 'int8', which
was correctly passed to the PreparedStatement with setLong() -- however it
was sent with a different type to the server (the
AbstractJdbc1Statement::m_bindTypes[] are never used in this case).

The reason for this is, that by default, the PreparedStatement is not
configured to use a server side prepared statement. I assume, this is
unintended.

I fixed this by adding this to my local version (current CVS):

--
--- org/postgresql/jdbc3/Jdbc3PreparedStatement.java29 Mar 2004 19:17:12 - 
 1.6
+++ org/postgresql/jdbc3/Jdbc3PreparedStatement.java20 Jun 2004 11:54:00 -
@@ -13,6 +13,7 @@
public Jdbc3PreparedStatement(Jdbc3Connection connection, String
sql) throws SQLException
{
super(connection, sql);
+setUseServerPrepare(true);
}

public BaseResultSet createResultSet (Field[] fields,
java.util.Vector tuples, String status, int updateCount, long insertOID)
throws SQLException
--
(same for Jdbc2PreparedStatement and Jdbc1PreparedStatement).

If this is the right way to do ? And if so .. could someone apply
this patch ?

Thanks,
 -hen
Bücher kaufen und Freie Software fördern | http://bookzilla.de/

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Tatsuo Ishii
 Attached is a patch implementing this functionality.
 
 I've modified make_new_heap() as well as swap_relfilenodes() to not assume
 that tablespaces remain the same from old to new heap. I thought it better
 to go down this road than introduce a lot of duplicate code.

I have tried your patches and it works great. Thanks.

One thing I noticed was if I change tablespace for a table having
indexes, they are left in the old tablespace and the table itself was
moved to the new tablespace. I regard this is a good thing since I
could assign different table spaces for table and indexes.
It would be even better to assign different tablespaces for each
index.
--
Tatsuo Ishii

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Tatsuo Ishii
  Attached is a patch implementing this functionality.
  
  I've modified make_new_heap() as well as swap_relfilenodes() to not assume
  that tablespaces remain the same from old to new heap. I thought it better
  to go down this road than introduce a lot of duplicate code.
 
 I have tried your patches and it works great. Thanks.
 
 One thing I noticed was if I change tablespace for a table having
 indexes, they are left in the old tablespace and the table itself was
 moved to the new tablespace. I regard this is a good thing since I
 could assign different table spaces for table and indexes.
 It would be even better to assign different tablespaces for each
 index.

Hm. It seems there's a problem with tablespaces. What I did was:

pgbench -i test
alter table accounts set tablespace mydb2;
\d accounts

backend crashes by signal 11...
--
Tatsuo Ishii

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] batch write of dirty buffers

2004-06-20 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes:
 In checkpoint and background writer, we flush out dirty buffer pages one
 page one time. Is it possible to do in a batch mode? That is, try to find
 out the continous page(same tblNode, relNode, adjacent blockNum), then write
 them together?

What for?  The kernel will have its own ideas about scheduling the
physical writes, anyway.  We are not flushing anything directly to disk
here, we are just pushing pages out to kernel buffers.

 There are other two points may need attentions. One is in function
 StartBufferIO(), which asserts InProgressBuf, that is, we can just do one
 page write one time. I am not quite sure the consequence if we remove this
 variable. The other is that since we will acquire many locks on the buffer
 page, so we may have to increase MAX_SIMUL_LWLOCKS. This should not be a
 problem.

If the bgwriter tries to lock more than one shared buffer at a time,
you will inevitably get deadlocks.  I don't actually see the point
of doing that anyway, even assuming that it's worth trying to do the
writes in block-number order.  It would hardly ever be the case that
successive pages would be located in adjacent shared buffers, and so
you'd almost always end up issuing separate write commands anyway.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] ERROR: cannot find attribute 1 of relation resoconti_iq_key ... How can I solve?

2004-06-20 Thread M.Z.
Hi all.

Since yesterday postgres 7.2.1 (in debian 3.0r1) tell me this error when I
vacuumdb my database.

Why?

Suddenly a lot of data of the same db are disappered.
Only the data of the last pg_restore are again in the db.

Can I recover the lost data?

Mauro



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] constraint upon view

2004-06-20 Thread Pierre Emmanuel Gros
In fact i want this kind of view , because my views are forms with a
(Bjdbc selection like
(B
(Bcreate view toto (integer,integer) as SELECT a,b FROM executeJDBC(jdbc
(Bconnection,jdbc uri, 'select a,b from toto');
(B
(BWhere the view toto is filled by the function executeJDBC (jdbc driver,
(Bjdbc uri, remote query )
(B
(BThe problem is i lose to much information about my remote table (primary
(Bkey, indexes...) to make good JOIN operation.(for big table , this join
(Boperation
(Bwithout information upon primary key are very slow...)
(B
(BSo i like to extends the create view syntax to add constraint
(Binformation , and i hope speed up my join operation.
(B
(BI would like to know what to change in the source to obtain this extension .
(BPierre
(BPS excuse me if i was not clear.
(B
(B
(B
(BJonathan Gardner wrote:
(B
(BOn Monday 31 May 2004 06:38 am, Pierre Emmanuel Gros wrote:
(B  
(B
(BHi,
(BI would like to know if i can add constraint and typed column upon a
(Bcreate view sentence.
(Bsomething like create view toto (a INTEGER primary key , b VARCHAR) as
(Bselect .
(BIf it is not possible , what to have to change in the backend sources to
(Bobtain the result ???
(B
(B
(B
(B
(BI don't think you want a view in this case. When you select against a view, 
(Bthe view is unrolled into the statement that forms it. Try a look at the 
(Bresults of "EXPLAIN ANALYZE" when you are selecting against a view.
(B
(B  
(B
(B
(B
(B
(B---(end of broadcast)---
(BTIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Casts question

2004-06-20 Thread Shachar Shemesh
Tom Lane wrote:
Shachar Shemesh [EMAIL PROTECTED] writes:
 

I have defined a datatype called varcharci, shamelessly yanking the 
input, output, recv and send functions from varchar. This means (as far 
as I understand things) that this type is binary compatible with varchar.
   

Use text, not varchar.
 

Why is that? Being as it is that no operator = is defined for varcharci, 
and that the cast from varchar to varcharci is as assignment anyways, 
shouldn't postgres be able to do the cast implicitly?
   

Yes, it can cast to varchar, but that doesn't help because there are no
varchar operators ;-).  To resolve the operator, it has to promote both
sides to text, and you didn't offer a cast to text.
regards, tom lane
 

I don't get it. The cast from varchar to text is using a no function 
cast - i.e. - they are binary compatible. And yet, there are two 
seperate functions for receiving from text and from binary 
representation. Why not use the same function?

Shachar
--
Shachar Shemesh
Lingnu Open Source Consulting ltd.
http://www.lingnu.com/
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Casts question

2004-06-20 Thread Tom Lane
Shachar Shemesh [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Yes, it can cast to varchar, but that doesn't help because there are no
 varchar operators ;-).  To resolve the operator, it has to promote both
 sides to text, and you didn't offer a cast to text.
 
 I don't get it.

When we look to see whether we can cast from type X to type Y, we look
to see whether there is a pg_cast entry from type X to type Y.  We do
not look to see if we could get there by casting X to some other type Z
and thence to Y (much less more-than-2-step sequences).  There are a
number of good reasons for this, but I'll just mention speed and
surprise factors.  Doing so would make for an exponential increase in
the number of potential cast paths, thereby probably paralyzing the
operator selection code instead of making it more useful.  (Too many
alternatives are as bad as none.)

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Gavin Sherry
On Sun, 20 Jun 2004, Tatsuo Ishii wrote:

   Attached is a patch implementing this functionality.
  
   I've modified make_new_heap() as well as swap_relfilenodes() to not assume
   that tablespaces remain the same from old to new heap. I thought it better
   to go down this road than introduce a lot of duplicate code.
 
  I have tried your patches and it works great. Thanks.
 
  One thing I noticed was if I change tablespace for a table having
  indexes, they are left in the old tablespace and the table itself was
  moved to the new tablespace. I regard this is a good thing since I
  could assign different table spaces for table and indexes.
  It would be even better to assign different tablespaces for each
  index.

 Hm. It seems there's a problem with tablespaces. What I did was:

 pgbench -i test
 alter table accounts set tablespace mydb2;
 \d accounts

 backend crashes by signal 11...

I seem to be clobbering memory some where but I cannot get assert or
valgrind to tell me. Anyone got any ideas?

Gavin

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Compile failure with SSL

2004-06-20 Thread Dave Page
OK, looks like the error below is a Win32 thing. The patch attached
#ifdef'd out the permissions check on the private key file as it won't
work under Windows anyway (a similar check in postmaster.c has has
already been ifdef'd out for the same reason).

Incidently, the consts are also used in initdb.c where they work just
fine - can't seem to figure out where it gets them from though (must be
missing something). Normally they're in sys/stat.h, but not in mingw
(where they can only be found in glibc/sys/stat.h). Magnus also
mentioned to me that Merlin's snapshots are building fine without this
patch - dunno if he has a different version of sys/stat.h...

Anyway, regardless of that I think the patch is relevant.

Regards, Dave.


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Dave Page
 Sent: 19 June 2004 23:24
 To: [EMAIL PROTECTED]
 Subject: [HACKERS] Compile failure with SSL
 
 I think this is another on of those 'might be Win32 specific' 
 problems.
 When building on XP, with OpenSSL 0.9.7c (from the bitWalk 
 MinGW tools), I get the following failure:
 
 gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes 
 -Wmissing-declarations -I../../../src/include 
 -I./src/include/port/win32 -DEXEC_BACKEND  
 -I../../../src/include/port/win32 -DBUILDING_DLL  -c -o 
 be-secure.o be-secure.c
 be-secure.c: In function `initialize_SSL':
 be-secure.c:653: `S_IRWXG' undeclared (first use in this function)
 be-secure.c:653: (Each undeclared identifier is reported only once
 be-secure.c:653: for each function it appears in.)
 be-secure.c:653: `S_IRWXO' undeclared (first use in this function)
 be-secure.c:654: warning: implicit declaration of function `getuid'
 make[3]: *** [be-secure.o] Error 1
 make[3]: Leaving directory `/cvs/pgsql/src/backend/libpq'
 
 I have run 'make distclean' and './configure --with-openssl' 
 to make sure everything is consistent.
 
 Any ideas?
 
 Regards, Dave
 
 ---(end of 
 broadcast)---
 TIP 1: subscribe and unsubscribe commands go to 
 [EMAIL PROTECTED]
 


be_secure.diff
Description: be_secure.diff

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Gavin Sherry
On Mon, 21 Jun 2004, Tatsuo Ishii wrote:

  On Sun, 20 Jun 2004, Tatsuo Ishii wrote:
 
 Attached is a patch implementing this functionality.

 I've modified make_new_heap() as well as swap_relfilenodes() to not assume
 that tablespaces remain the same from old to new heap. I thought it better
 to go down this road than introduce a lot of duplicate code.
   
I have tried your patches and it works great. Thanks.
   
One thing I noticed was if I change tablespace for a table having
indexes, they are left in the old tablespace and the table itself was
moved to the new tablespace. I regard this is a good thing since I
could assign different table spaces for table and indexes.
It would be even better to assign different tablespaces for each
index.
  
   Hm. It seems there's a problem with tablespaces. What I did was:
  
   pgbench -i test
   alter table accounts set tablespace mydb2;
   \d accounts
  
   backend crashes by signal 11...
 
  I seem to be clobbering memory some where but I cannot get assert or
  valgrind to tell me. Anyone got any ideas?

 First of all I would like to ask you if you intend to leave indexes in
 the old tables space or not.

Yes, that is intentional.


 Also I think we need to enhance ALTER INDEX to assign new table spaces
 for indexes. Assigning different tables spaces for tables and indexes
 are essential to gain more I/O speed IMO.

I thought about this. ALTER INDEX doesn't exist yet and I figured that,
unlike the case of tables, its easy to drop and recreate indexes in new
tablespaces.

I'm still stumped as to where I am corrupting memory with this patch
though. (There was another bug: I wasn't detecting the case where users
set tablespace to the tablespace that the table is already in).

gavin

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Tatsuo Ishii
 On Sun, 20 Jun 2004, Tatsuo Ishii wrote:
 
Attached is a patch implementing this functionality.
   
I've modified make_new_heap() as well as swap_relfilenodes() to not assume
that tablespaces remain the same from old to new heap. I thought it better
to go down this road than introduce a lot of duplicate code.
  
   I have tried your patches and it works great. Thanks.
  
   One thing I noticed was if I change tablespace for a table having
   indexes, they are left in the old tablespace and the table itself was
   moved to the new tablespace. I regard this is a good thing since I
   could assign different table spaces for table and indexes.
   It would be even better to assign different tablespaces for each
   index.
 
  Hm. It seems there's a problem with tablespaces. What I did was:
 
  pgbench -i test
  alter table accounts set tablespace mydb2;
  \d accounts
 
  backend crashes by signal 11...
 
 I seem to be clobbering memory some where but I cannot get assert or
 valgrind to tell me. Anyone got any ideas?

First of all I would like to ask you if you intend to leave indexes in
the old tables space or not.

Also I think we need to enhance ALTER INDEX to assign new table spaces
for indexes. Assigning different tables spaces for tables and indexes
are essential to gain more I/O speed IMO.
--
Tatsuo Ishii

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Tatsuo Ishii
  Also I think we need to enhance ALTER INDEX to assign new table spaces
  for indexes. Assigning different tables spaces for tables and indexes
  are essential to gain more I/O speed IMO.
 
 I thought about this. ALTER INDEX doesn't exist yet and I figured that,
 unlike the case of tables, its easy to drop and recreate indexes in new
 tablespaces.

Oh you are right. I forgot about CREATE INDEX ... TABLESPACE.

 I'm still stumped as to where I am corrupting memory with this patch
 though. (There was another bug: I wasn't detecting the case where users
 set tablespace to the tablespace that the table is already in).
 
 gavin
 

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Scott Marlowe
On Sun, 2004-06-20 at 17:15, Tatsuo Ishii wrote:
   Also I think we need to enhance ALTER INDEX to assign new table spaces
   for indexes. Assigning different tables spaces for tables and indexes
   are essential to gain more I/O speed IMO.
  
  I thought about this. ALTER INDEX doesn't exist yet and I figured that,
  unlike the case of tables, its easy to drop and recreate indexes in new
  tablespaces.
 
 Oh you are right. I forgot about CREATE INDEX ... TABLESPACE.
 
  I'm still stumped as to where I am corrupting memory with this patch
  though. (There was another bug: I wasn't detecting the case where users
  set tablespace to the tablespace that the table is already in).

On a related note, will there be a way to have implicit index creation
occur in a seperate table space automagically?  I.e.

create table test (id int4 primary key, n1 int unique);

so that the indexes created in id and n1 here would have a different
default namespace than the table?  Just wondering.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 On Mon, 21 Jun 2004, Tatsuo Ishii wrote:
 Also I think we need to enhance ALTER INDEX to assign new table spaces
 for indexes. Assigning different tables spaces for tables and indexes
 are essential to gain more I/O speed IMO.

 I thought about this. ALTER INDEX doesn't exist yet and I figured that,
 unlike the case of tables, its easy to drop and recreate indexes in new
 tablespaces.

The precedents we already have (ALTER OWNER, RENAME, SET STATISTICS)
are that ALTER TABLE applies to any relation type for which it makes
sense.  So I'd expect ALTER TABLE SET TABLESPACE to just work on
indexes, not that we'd go and invent an ALTER INDEX ... command.

Given that you implement the data transfer as a straight block-by-block
copy and not some kind of tuple-at-a-time thing, I would think that
it would be trivial to consider them the same case from an
implementation point of view, too.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Gavin Sherry
On Sun, 20 Jun 2004, Tom Lane wrote:

 Gavin Sherry [EMAIL PROTECTED] writes:
  On Mon, 21 Jun 2004, Tatsuo Ishii wrote:
  Also I think we need to enhance ALTER INDEX to assign new table spaces
  for indexes. Assigning different tables spaces for tables and indexes
  are essential to gain more I/O speed IMO.

  I thought about this. ALTER INDEX doesn't exist yet and I figured that,
  unlike the case of tables, its easy to drop and recreate indexes in new
  tablespaces.

 The precedents we already have (ALTER OWNER, RENAME, SET STATISTICS)
 are that ALTER TABLE applies to any relation type for which it makes
 sense.  So I'd expect ALTER TABLE SET TABLESPACE to just work on
 indexes, not that we'd go and invent an ALTER INDEX ... command.

Yes, of course.


 Given that you implement the data transfer as a straight block-by-block
 copy and not some kind of tuple-at-a-time thing, I would think that
 it would be trivial to consider them the same case from an
 implementation point of view, too.

But I did implement it as a tuple at a time thing. I reused the code from
rebuild_relation()...

What did you have in mind?

Gavin

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 But I did implement it as a tuple at a time thing. I reused the code from
 rebuild_relation()...

 What did you have in mind?

Something about like

for (b = 0; b  RelationGetNumberOfBlocks(src); b++)
{
smgrread(src, b, buf);
smgrwrite(dst, b, buf);
}

Given that the only files people are going to be troubling to reassign
to new tablespaces are enormous ones, you'd want the transfer to be as
efficient as reasonably possible.

The main thing this is omitting is what about wal-logging the move?
Perhaps we could emit one WAL record showing the source and dest
RelFileNodes and number of blocks for the copy, and then LSN-stamp
each copied block with that record's LSN.  However I'm not sure how to
replay that if the source file isn't there anymore when the replay needs
to run :-(.  Maybe you have to dump each block into WAL as you copy it.
That would be kinda ugly ... though in point of fact less of a WAL load
than writing individual tuples ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Gavin Sherry
On Sun, 20 Jun 2004, Tom Lane wrote:

 Gavin Sherry [EMAIL PROTECTED] writes:
  But I did implement it as a tuple at a time thing. I reused the code from
  rebuild_relation()...

  What did you have in mind?

 Something about like

   for (b = 0; b  RelationGetNumberOfBlocks(src); b++)
   {
   smgrread(src, b, buf);
   smgrwrite(dst, b, buf);
   }

 Given that the only files people are going to be troubling to reassign
 to new tablespaces are enormous ones, you'd want the transfer to be as
 efficient as reasonably possible.

 The main thing this is omitting is what about wal-logging the move?

Yes, that's what I was thinking.

 Perhaps we could emit one WAL record showing the source and dest
 RelFileNodes and number of blocks for the copy, and then LSN-stamp
 each copied block with that record's LSN.  However I'm not sure how to
 replay that if the source file isn't there anymore when the replay needs
 to run :-(.  Maybe you have to dump each block into WAL as you copy it.
 That would be kinda ugly ... though in point of fact less of a WAL load
 than writing individual tuples ...

Should I use the WAL-enabled case of  _bt_blwritepage() as a guide here?

Gavin

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Gavin Sherry
On Mon, 21 Jun 2004, Tom Lane wrote:

 Gavin Sherry [EMAIL PROTECTED] writes:
  On Sun, 20 Jun 2004, Tom Lane wrote:
  Maybe you have to dump each block into WAL as you copy it.
  That would be kinda ugly ... though in point of fact less of a WAL load
  than writing individual tuples ...

  Should I use the WAL-enabled case of  _bt_blwritepage() as a guide here?

 Yeah, actually that is a very good parallel.  If PITR archiving isn't
 turned on, you don't have to dump pages into WAL; you can substitute
 an fsync before commit, instead.  And if it's a temp table then you
 don't have to do either.  (Not sure anyone would ever do SET TABLESPACE
 on a temp table, but might as well get it right.)

 The xlog action here of copying a page image is currently
 btree-specific, but maybe we should move it to a more widely visible
 place, such as heapam.c.  I don't see any value in having identical
 xlog recovery actions in several different modules.

I was just thinking that. I imagine that this would be useful for WAL
logging of createdb() when that functionality gets implemented.

We might also be able to use it as a speed up for cluster() (some time
in the future). That is, we could form a complete page in memory in
relation_rebuild() and then write it out directly.

Gavin

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 On Sun, 20 Jun 2004, Tom Lane wrote:
 Maybe you have to dump each block into WAL as you copy it.
 That would be kinda ugly ... though in point of fact less of a WAL load
 than writing individual tuples ...

 Should I use the WAL-enabled case of  _bt_blwritepage() as a guide here?

Yeah, actually that is a very good parallel.  If PITR archiving isn't
turned on, you don't have to dump pages into WAL; you can substitute
an fsync before commit, instead.  And if it's a temp table then you
don't have to do either.  (Not sure anyone would ever do SET TABLESPACE
on a temp table, but might as well get it right.)

The xlog action here of copying a page image is currently
btree-specific, but maybe we should move it to a more widely visible
place, such as heapam.c.  I don't see any value in having identical
xlog recovery actions in several different modules.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] [PATCHES] ALTER TABLE ... SET TABLESPACE

2004-06-20 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 On Mon, 21 Jun 2004, Tatsuo Ishii wrote:
 First of all I would like to ask you if you intend to leave indexes in
 the old tables space or not.

 Yes, that is intentional.

There's a related issue: what about the table's TOAST table (if any)
and the index on same?

During CREATE TABLE, the toast table is assigned to the same tablespace
as the base table, as is its index.  I don't think this is wrong exactly
--- the fact that we separate data storage into main and toast tables is
an implementation detail that users shouldn't have to think about.

If you subscribe to that notion, then ALTER TABLE SET TABLESPACE had
better move the toast table and index too.  If you don't subscribe
to it, then ALTER TABLE SET TABLESPACE had better be able to move
toast tables, and the issue will have to be documented.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] Applying patches

2004-06-20 Thread Bruce Momjian
Because I am traveling, I will no longer be reviewing and applying
patches until I return on July 3.  Tom will take over most of this duty.
I will check things once I get back to be sure everything got in just so
Tom can relax knowing someone will make sure nothing was missed.

Also, I have learned I might have no connectivity in Armenia from June
26-July 3.  Until then, I am in Germany where connectivity will be
available but periodic.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org