Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread Jeremy Drake
This patch implements the ability for large objects to be larger than 2GB.
I believe the limit is now about 2TB, based on the fact that the large
object page size is 2048 bytes and the page number is still 32 bits.

There are a few things about this patch which probably require tweaking or
at least a closer inspection from the list.

1) The lo_*64 functions are added to catalog/pg_proc.h (spacing and exact
location are rough at the moment) with OID set to 0, while all other
entries in this file have OIDs explicitly defined.

2) The lo_*64 functions, in order to be convenient from the client end,
have counterparts added to libpq like the existing lo_* functions.  The
client side of libpq did not previously know anything about int64 or how
to send/receive them.  I added an include of postgres-fe.h (which,
according to the comment in that file, looks like it should go there) so
int64 would be defined, and also implemented functions (code mostly
stolen from the server libpq format functions for same) to convert them
to/from network byte order.  I did this in a somewhat inconsistent way
between the get and the put, as I did not want to change the existing API
at all, and wanted to change existing code as little as possible.
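
For illustration, the conversion amounts to something like the following
(an untested sketch, not the literal patch code; the pq_put_int64 /
pq_get_int64 names are made up here):

/*
 * Sketch: move an int64 to/from network byte order as two 32-bit
 * halves, the same trick the server-side format functions use.
 * Assumes a working 64-bit integer type and htonl/ntohl.
 */
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

static void
pq_put_int64(char *buf, int64_t value)
{
    uint32_t hi = htonl((uint32_t) (((uint64_t) value) >> 32));
    uint32_t lo = htonl((uint32_t) (((uint64_t) value) & 0xFFFFFFFF));

    memcpy(buf, &hi, 4);        /* high-order half first */
    memcpy(buf + 4, &lo, 4);
}

static int64_t
pq_get_int64(const char *buf)
{
    uint32_t hi, lo;

    memcpy(&hi, buf, 4);
    memcpy(&lo, buf + 4, 4);
    return (int64_t) ((((uint64_t) ntohl(hi)) << 32) | (uint64_t) ntohl(lo));
}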

3) The 32 bit box I tested this on was a PII 300MHz laptop.  Not exactly
the fastest.  The test consisted entirely of making sure it compiled.
Perhaps someone with a fast IA32 box and spare cycles can test it?  Also,
so far the only platforms I have tried to compile this on have been:

* Linux 2.6 (gentoo), AMD64, gcc-3.4.4
* Solaris 8, SPARCv9, gcc-3.4.2
* Linux 2.6 (debian unstable), i686, gcc-3.4.x (laptop, don't remember
exact version).

Would probably be a good idea to verify this on other platforms as well,
or at least other compilers.

Hopefully I did not break anything too badly with this.  All of the
regression tests still pass after the patch, and I made a 64-bit version
of tests/examples/testlo (included in the patch) which also works.  I
grepped the regression tests and could not find any usage of large
objects in them, which struck me as rather odd; that is why I used testlo
and my new testlo64 for testing instead.

On Tue, 20 Sep 2005, Jonah H. Harris wrote:

 Cool. We look forward to it.

 On 9/19/05, Mark Dilger [EMAIL PROTECTED] wrote:
 
  Jonah H. Harris wrote:
   Mark,
  
   If you don't mind contributing the changes, we'd be glad to take a look
   at them. Thanks.
  
   -Jonah
  
 
  Ok, we will post it back soon. We have tested it on two different 64-bit
  architectures (Sparc and AMD) and are now testing on pentium before
  posting up
  to the list.
 
  mark
 



 --
 Respectfully,

 Jonah H. Harris, Database Internals Architect
 EnterpriseDB Corporation
 http://www.enterprisedb.com/


-- 
Mere nonexistence is a feeble excuse for declaring a thing unseeable. You
*can* see dragons.  You just have to look in the right direction.
-- John Hasler

diff -Nur postgresql-8.0.3-orig/src/backend/libpq/be-fsstubs.c postgresql-8.0.3/src/backend/libpq/be-fsstubs.c
--- postgresql-8.0.3-orig/src/backend/libpq/be-fsstubs.c	2004-12-31 13:59:50.0 -0800
+++ postgresql-8.0.3/src/backend/libpq/be-fsstubs.c	2005-09-18 17:22:17.0 -0700
@@ -233,6 +233,34 @@
PG_RETURN_INT32(status);
 }
 
+
+Datum
+lo_lseek64(PG_FUNCTION_ARGS)
+{
+   int32   fd = PG_GETARG_INT32(0);
+   int64   offset = PG_GETARG_INT64(1);
+   int32   whence = PG_GETARG_INT32(2);
+   MemoryContext currentContext;
+   int64   status;
+
+   if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+   {
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+errmsg("invalid large-object descriptor: %d", fd)));
+   PG_RETURN_INT64(-1);
+   }
+
+   Assert(fscxt != NULL);
+   currentContext = MemoryContextSwitchTo(fscxt);
+
+   status = inv_seek64(cookies[fd], offset, whence);
+
+   MemoryContextSwitchTo(currentContext);
+
+   PG_RETURN_INT64(status);
+}
+
 Datum
 lo_creat(PG_FUNCTION_ARGS)
 {
@@ -283,6 +311,28 @@
PG_RETURN_INT32(inv_tell(cookies[fd]));
 }
 
+
+Datum
+lo_tell64(PG_FUNCTION_ARGS)
+{
+   int32   fd = PG_GETARG_INT32(0);
+
+   if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+   {
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+errmsg("invalid large-object descriptor: %d", fd)));
+   PG_RETURN_INT64(-1);
+   }
+
+   /*
+* We assume we do not need to switch contexts for inv_tell. That is
+* true for now, but is probably more than this module ought to
+* assume...
+*/
+   PG_RETURN_INT64(inv_tell(cookies[fd]));
+}
+
 Datum
 lo_unlink(PG_FUNCTION_ARGS)
 {
diff -Nur postgresql-8.0.3-orig/src/backend/storage/large_object/inv_api.c 

Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread Jeremy Drake
I sent this from the wrong address last time so it did not go to the list,
I apologize to anyone who may be getting it again...

On Fri, 23 Sep 2005, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  2) The lo_*64, in order to be convenient from the client end, have
  functions added to libpq as the existing lo_* functions.  The client side
  of libpq did not previously know anything about int64 or how to
  send/receive them.  I added an include of postgres-fe.h (which according
  to the comment in that file looks like it should go there) so int64 would
  be defined,

 Unfortunately that's completely unacceptable from a namespace-pollution
 point of view.

I don't quite understand.  Allow me to cite the source, so we both are
referring to the same thing here...

[EMAIL PROTECTED] postgresql-8.0.3 $ head -n17 src/include/postgres_fe.h
/*-
 *
 * postgres_fe.h
 *Primary include file for PostgreSQL client-side .c files
 *
 * This should be the first file included by PostgreSQL client libraries and
 * application programs --- but not by backend modules, which should include
 * postgres.h.
 *
 *
 * Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
 * Portions Copyright (c) 1995, Regents of the University of California
 *
 * $PostgreSQL: pgsql/src/include/postgres_fe.h,v 1.10 2004/12/31 22:03:19 
pgsql Exp $
 *
 *-
 */

Now I may not completely understand the term "client", but I think libpq
is a client library, and anything which may use it would be an
"application program".  So it seems it was an oversight on the part of
libpq not to include it.  Does the term "client" not mean what I thought
it did (anything which connects to a postgresql server)?


 The real problem here is that int64 isn't a well-defined portable
 datatype, and so it's going to be very hard to export these functions in
 a way that won't break on different platforms, applications compiled
 with a different compiler than libpq was, etc.

Umm, what wouldn't break if you switched compilers in a way that redefined
sizeof() of things?  I happen to know that even using the same compiler
but changing a compile flag (-m64) which changes the sizes of integral
types (sizeof(long) from 32 to 64 bits) will make such actions stop
working on one of my tested platforms.  It sucks, and I am not fond of
this because I tend not to have every library on my box built for both,
but it is the way life is.  I do not know of a platform where the size of
an integral type can change and still be able to link against libraries
and such.  And if the size of some type is not changing, then things
should already be correctly set for the platform.  But I admit I have not
met every platform in existence.  Can you cite a platform where this is
the case?


 For that matter, we can't even guarantee that they work at all: not all
 platforms even *have* int64 types.  We have so far avoided putting any
 fundamental dependencies on int64 arithmetic into the system, and I'm a
 bit worried that this patch will break LO support entirely on platforms
 that don't have working int64 arithmetic.

They should in fact break gracefully on such platforms, or at least as
gracefully as any other int64-using code might.  I did check a couple of
places for #ifdef INT64_BROKEN (or whatever it was called) to make sure
that on those platforms something at least somewhat sane would happen
(they use 32 bits instead).  Also, on those platforms you could always
use the non-64 versions if you were concerned about that.  The patches
would allow seeking past the old limit using the 32-bit function in
stages (seek 2G, seek 2G, seek 2G would put you at 6G) if you do not mind
weird return values and tell not working.  And if you use a platform
which does not support 64-bit integral types, then you cannot reasonably
expect those functions to work correctly anyway.  But they should at
least compile.
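
To make the staged-seek trick concrete, here is roughly what a client
would do with the existing 32-bit call (illustrative only; note that
exactly 2^31 does not fit in the int offset argument, so "2G" really
means INT_MAX here):

#include <stdio.h>      /* SEEK_SET / SEEK_CUR */
#include <limits.h>
#include <libpq-fe.h>

static int
seek_to_about_6gb(PGconn *conn, int fd)
{
    int     i;

    if (lo_lseek(conn, fd, 0, SEEK_SET) < 0)
        return -1;

    /*
     * Three relative steps of INT_MAX bytes land us at about 6GB.  The
     * return values wrap and are meaningless past 2GB, and lo_tell()
     * cannot report the final position either -- exactly the caveats
     * mentioned above.
     */
    for (i = 0; i < 3; i++)
        lo_lseek(conn, fd, INT_MAX, SEEK_CUR);

    return 0;
}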


   regards, tom lane

 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


-- 
I don't wanna argue, and I don't wanna fight,
But there will definitely be a party tonight...

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] 64-bit API for large objects

2005-09-23 Thread Jeremy Drake
On Fri, 23 Sep 2005, Tom Lane wrote:

 postgresql-fe.h defines a ton of stuff that has no business being
 visible to libpq's client applications.  It's designed to be used by
 our *own* client-side code (psql and the like), but we have not made
 any attempt to keep it from defining stuff that would likely break
 other peoples' code.

So does this mean that there is a different, more advanced client library
(one more likely to break random other code) where this call would fit
better?  If so, I would be happy to change the patch to put it there.  I
did not see one, but I did not look very hard.

If not, what is a client-side programmer to do if they want to pass
int64s around?  Does every client app have to write its own htonll (or
whatever you want to call it), perform its own detection of which type is
a 64-bit int, and cache the OIDs for the fastcall interface itself?
There seems to be a lot of overhead there which libpq saves you from.  Or
the client program could detect the type itself, also detect a function
which would reasonably serve as an atoll on the platform, then
snprintf(buf, 1024, "SELECT lo_seek64(%d, %lld, %d)", fh, offset,
SEEK_SET); exec the buf; check whether any tuples came back; if so, get
the first column of the first tuple and call atoll on it, else handle the
error; and in either case free the result?
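
In other words, every client ends up writing something like this (a
sketch; it assumes the server-side lo_seek64 from this patch exists, and
that the client has a working long long and atoll):

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

static long long
my_lo_seek64(PGconn *conn, int fd, long long offset, int whence)
{
    char        buf[1024];
    PGresult   *res;
    long long   pos = -1;

    snprintf(buf, sizeof(buf), "SELECT lo_seek64(%d, %lld, %d)",
             fd, offset, whence);
    res = PQexec(conn, buf);
    if (res != NULL &&
        PQresultStatus(res) == PGRES_TUPLES_OK &&
        PQntuples(res) > 0)
        pos = atoll(PQgetvalue(res, 0, 0)); /* first column, first tuple */
    PQclear(res);               /* free the result in either case */
    return pos;
}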



In any case, are there any comments on the changes below libpq (the
functions visible to queries on down)?  I don't want to get hung up on
the client issues just to find out later that the server stuff was
completely insane anyway...  The client library seems to me to be less
important anyway.  If the server can support it, the client can always
manage to do it somehow, and once the client lib supports it properly, it
should be fairly transparent to swap that in later, so that code using a
workaround could be updated without immediately breaking all other code.

So that means that if I get good feedback on the server-side code, I
could start having people code to it using one of the workaround methods
listed above, and then if we manage to come up with some way which would
be more correct (if that is the right word) than the libpq hack I did,
they could gradually switch over to that (or use sed -i).

-- 
All that glitters has a high refractive index.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] 64-bit API for large objects

2005-09-24 Thread Jeremy Drake
On Sat, 24 Sep 2005, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  In any case, are there any comments on the changes below libpq (the
  functions visible to queries on down)?

 Within the backend, I don't see the point in maintaining a distinction
 between 32- and 64-bit APIs for inv_api.c: you should just switch 'em
 to use int64.  You did it that way for inv_getsize but then proceeded
 to add separate inv_seek64/tell64 functions, which is inconsistent.

Right.  I did it the way you describe in my first cut (I did this several
times, changing my mind and starting over).  I was concerned (perhaps
needlessly) about breaking the signatures of the inv_* functions which
are visible outside of the .c file.  This is why I did getsize
differently - it is static.  But it sounds like there is no concern about
changing the signatures of these functions, so I will change my patch to
not maintain the separate code paths in inv_api.c.

 The submitted version of lo_tell64 isn't even calling the right one :-(

Oops.  That's what I get for lots of copy/paste and doing it multiple
times...  Bonehead mistake, thanks for catching it.

 The 32-bit version of lo_tell will need to check and error out if the
 value it'd need to return doesn't fit in int32.  (Come to think of it,
 so should the 32-bit version of lo_lseek.)

That makes sense.  Or it could return some sentinel value (INT_MAX?)
meaning the position is outside the range, so someone could still get at
the data even if they are using a backwards client?  I don't know if that
makes sense at all; it sounds pretty wacky, since such clients would have
no way of knowing where they are in the file.  Erroring would probably be
best.
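
Something along these lines for the 32-bit lo_tell in be-fsstubs.c, then
(a sketch against my patched tree, assuming inv_tell now returns int64 as
discussed above; not final code):

Datum
lo_tell(PG_FUNCTION_ARGS)
{
    int32       fd = PG_GETARG_INT32(0);
    int64       offset;

    if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
        ereport(ERROR,
                (errcode(ERRCODE_UNDEFINED_OBJECT),
                 errmsg("invalid large-object descriptor: %d", fd)));

    offset = inv_tell(cookies[fd]);

    /* error out rather than hand back a silently truncated position */
    if (offset != (int64) (int32) offset)
        ereport(ERROR,
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                 errmsg("large object seek position does not fit in an int32")));

    PG_RETURN_INT32((int32) offset);
}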

 All of the LO code needs to be eyeballed to make sure it still behaves
 sanely if int64 is really only 32 bits.

Of course.

 It would probably be a good idea also to introduce some overflow checks
 to detect cases where the current LO offset would overflow int64 after a
 read, write, or seek.  This is missing from the existing code :-(
 It is possible to code overflow checks that work regardless of the size
 of int64; see int8.c for some inspiration.

Yes.  That would be good.  These would be in the inv_* functions, correct?
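
For instance, advancing the current offset could be guarded the way
int8pl in int8.c does it (a sketch; lo_add_offset is a made-up helper
name, not something in the patch):

#define SAMESIGN(a,b)   (((a) < 0) == ((b) < 0))

static int64
lo_add_offset(int64 offset, int64 delta)
{
    int64       newoffset = offset + delta;

    /*
     * Overflow is only possible when both inputs have the same sign,
     * and it has happened iff the result then has the opposite sign.
     * This works no matter how wide int64 actually is.
     */
    if (SAMESIGN(offset, delta) && !SAMESIGN(newoffset, offset))
        ereport(ERROR,
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                 errmsg("large object offset out of range")));
    return newoffset;
}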

 I'd suggest also that the
 offset be defined as signed not unsigned (int64 not uint64) as this will
 simplify the overflow checks and eliminate the prospect of scenarios
 where lo_tell64 cannot return a correct value.

I intended to do that.  I think the only place I used uint64 rather than
int64 was the getsize function, and that could be int64 also.  I will
look at that code and make sure I am not mixing them in ways that are
neither necessary nor useful.


I will take these suggestions and make another revision of the patch
shortly.  Also, I was considering exposing an lo_getsize or lo_stat
function which would tell you how big a large object is without having to
seek to the end and look at the return value.  The seek approach requires
you to have the large object open at the time, loses your old seek
position (unless you do a tell first), and requires several more server
round-trips (if not open, it would involve open/seek/close; if open, it
could require tell/seek/seek).
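
For reference, this is the client-side dance I mean, using only the
existing 32-bit libpq calls (with all their limits):

#include <stdio.h>      /* SEEK_SET / SEEK_END */
#include <libpq-fe.h>

static int
lo_size_via_seek(PGconn *conn, int fd)
{
    int     oldpos;
    int     size;

    oldpos = lo_tell(conn, fd);             /* remember where we were */
    if (oldpos < 0)
        return -1;
    size = lo_lseek(conn, fd, 0, SEEK_END); /* size == offset of the end */
    lo_lseek(conn, fd, oldpos, SEEK_SET);   /* put the position back */
    return size;                            /* three round-trips total */
}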


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 64-bit API for large objects

2005-09-24 Thread Jeremy Drake
On Sat, 24 Sep 2005, Alvaro Herrera wrote:

 Hey,

 While you guys are hacking at the LO code, it would be nice to consider
 the suggestions outlined here:

 http://archives.postgresql.org/pgsql-bugs/2004-07/msg00143.php

Included from that message for easier reference:

 0) In Oid lo_creat(PGconn *conn, int mode), why is there a mode on
 lo_create? The mode is determined when the object is lo_open()ed, right?

I think the docs basically say it is a vestigial feature: it used to be
useful, but the code evolved in such a way that it ceased being so.  It
is probably still there to allow old code to continue to compile against
newer servers unchanged.

 1) There is no lo_truncate(PGconn *conn, int fd, off_t len).

I did not notice that one.  That is a good one to add, if adding
functions is in the cards.  I bet that when the people who are attempting
to write to this API get far enough, they will notice it too ;)

 2) There is no lo_length(PGconn *conn, int fd).

We did notice this one however.  There is also no lo_stat(PGconn *conn,
Oid lobjId).  I have been thinking about implementing these two.

I think I will make a revision of the patch at some point with these.
The size ones will be extremely easy; the functionality is already there,
and it is just a matter of exposing it.  The truncate is not too
difficult, but actually requires me to think a little more ;)

-- 
When does summertime come to Minnesota, you ask?  Well, last year, I
think it was a Tuesday.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] catalog corruption bug

2005-12-21 Thread Jeremy Drake
We have encountered a very nasty but apparently rare bug which appears to
result in catalog corruption.  I have not been able to pin down an exact
sequence of events which causes this problem; it appears to be a race
condition of some sort.  This is what I have been able to figure out so
far.

* It appears to be related to temp table creation/deletion.
* It requires at least 2 clients be connected simultaneously.
* It seems to be related to the autovacuum (not sure, just a theory).

I will attempt to explain the circumstances leading up to the problem, and
then show the symptoms.

We are working on a project which uses postgresql to store data which has
been decomposed into a large number of rows in stored procedures
(plpgsql/plperl).  The implementation we have been working with recently
has used temp tables to store intermediate stages of this decomposition so
that we can run multiple queries over it in the course of adding it to our
final tables without having to regenerate the set each time.  We were
running a timing test for a load of data which would result in tens of
millions of rows.  This load creates temp tables with ON COMMIT DROP and
also explicitly drops them.  It appears to do so at a rate of
approximately 10 per second (transactions are also being
created/committed at that same rate).  This works fine.  While this load
was running we were
working on some testing code to determine whether it might be better to
create the temp table with on commit delete rows instead and use a
plpgsql function to create the temp table with an EXCEPTION
duplicate_table block to handle when the table has already been created
for this connection.  We wrote the function at first on a postgres 8.0 box
which was not running the load, but when we were attempting to determine
what the error code thrown was we noticed that the SQLSTATE variable was
not available in 8.0 and copied the function onto the 8.1 box (which was
running the load) to try it out.  We ran this function a couple times to
get the error, and then had it catch the duplicate_table exception.  We
got the function working, and when we looked at the status of our load we
found that it had died with a message saying

ERROR:  pg_class entry for relid 7502381 vanished during vacuuming

We found this interesting, figuring it was a bug in postgres.  Googling
the non-variable pieces of that message turned up nothing relevant, so we
set about trying to reproduce it.

During the course of doing so, we restarted our load several times and
called the function.  We later put the calling of the function into a loop
in bash calling psql (so we could disconnect/reconnect) to speed up the
finding of the problem.

These are some of the interesting errors which we got while doing this
(all from the server log):

ERROR:  duplicate key violates unique constraint "pg_class_relname_nsp_index"
CONTEXT:  SQL statement "CREATE TEMP TABLE foo (a integer, b integer) ON
COMMIT DELETE ROWS"
PL/pgSQL function "temp_table_test" line 2 at SQL statement
ERROR:  relation "foo" does not exist
ERROR:  duplicate key violates unique constraint "pg_class_relname_nsp_index"
CONTEXT:  SQL statement "CREATE TEMP TABLE foo (a integer, b integer) ON
COMMIT DELETE ROWS"
PL/pgSQL function "temp_table_test" line 2 at SQL statement



ERROR:  relation "foo" does not exist
FATAL:  cache lookup failed for relation 7600066
LOG:  server process (PID 20942) exited with exit code 1
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited abnormally
and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat 
your command.


We also managed to get an error which was more bothersome than the
mysterious disappearing/reappearing temp tables.

ERROR:  relation "windowpos" does not exist
ERROR:  type "windowpos" already exists
ERROR:  cache lookup failed for relation 794218

Later:
ERROR:  relation "windowpos" already exists
ERROR:  catalog is missing 14 attribute(s) for relid 7577269
ERROR:  catalog is missing 14 attribute(s) for relid 7577269
ERROR:  catalog is missing 14 attribute(s) for relid 7577269
ERROR:  catalog is missing 14 attribute(s) for relid 7577269
ERROR:  catalog is missing 14 attribute(s) for relid 7577269
ERROR:  catalog is missing 14 attribute(s) for relid 7577269
ERROR:  catalog is missing 14 attribute(s) for relid 7577269
ERROR:  catalog is missing 14 attribute(s) for relid 7577269




Here is the temp table function we were testing:

CREATE OR REPLACE FUNCTION temp_table_test() RETURNS boolean AS $$
BEGIN
CREATE TEMP TABLE foo (a integer, b integer) ON COMMIT DELETE ROWS;
RETURN true;
EXCEPTION WHEN duplicate_table THEN
RETURN false;
END;
$$ LANGUAGE plpgsql;


And our bash command line for stressing:
for i in 

Re: [HACKERS] sending mail from Postgres

2005-12-28 Thread Jeremy Drake
Code from the Perl Cookbook, wrapped in a plperlu wrapper.

Not very portable; Net::SMTP would probably be better and would result in
more portable Perl code.

I didn't test this - it is copied straight from the Perl Cookbook via
google: http://www.unix.org.ua/orelly/perl/cookbook/ch18_04.htm

CREATE OR REPLACE FUNCTION send_email(from_address text, to_address text,
subject text, body text) RETURNS void AS $$

use Mail::Mailer;
my ($from_address, $to_address, $subject, $body) = @_;

my $mailer = Mail::Mailer->new("sendmail");
$mailer->open({ From    => $from_address,
                To      => $to_address,
                Subject => $subject,
              })
    or die "Can't open: $!";
print $mailer $body;
$mailer->close();

$$ LANGUAGE plperlu VOLATILE STRICT;

On Wed, 28 Dec 2005, Aftab Alam wrote:

 Hi there,

 How can i send mail form postgresql.

 any suggestion.

 thanx  regards
 aftab

 ---(end of broadcast)---
 TIP 6: explain analyze is your friend


-- 
Don't steal; thou'lt never thus compete successfully in business.  Cheat.
-- Ambrose Bierce

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] catalog corruption bug

2006-01-04 Thread Jeremy Drake
On Wed, 21 Dec 2005, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  We have encountered a very nasty but apparently rare bug which appears to
  result in catalog corruption.

 How much of this can you reproduce on 8.1.1?  We've fixed a few issues
 already.

We did not see this problem for a while.  I upgraded the second gentoo
box showing this problem to 8.1.1 basically as soon as the ebuild for it
was out.  It just started acting up today (but we had not stressed it for
a while).  It appears to be acting similarly (although corruption which
persisted into other backends has not appeared).

Here is the error message I currently get on 8.1.1 (names have been
changed):

DBD::Pg::st execute failed: ERROR:  type "push_temp" already exists
CONTEXT:  SQL statement "CREATE TEMPORARY TABLE push_temp (val text) ON
COMMIT DROP"
PL/pgSQL function "push_func" line 6 at SQL statement
DBD::Pg::st execute failed: ERROR:  type "push_temp" already exists
CONTEXT:  SQL statement "CREATE TEMPORARY TABLE push_temp (val text) ON
COMMIT DROP"
PL/pgSQL function "push_func" line 6 at SQL statement

postgres=# select version();
 version
--
 PostgreSQL 8.1.1 on x86_64-pc-linux-gnu, compiled by GCC
x86_64-pc-linux-gnu-gcc (GCC) 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0,
pie-8.7.8)
(1 row)


  This was built from the gentoo ebuild version 8.1.0

 I'd be even more interested if you can reproduce it on a non-gentoo
 machine.  Gentoo is not noted for stability.

This one was also on Gentoo, this time ebuild version 8.1.1.  It looks
like they are applying a couple of patches: one just changes some
makefile stuff around, and the other appears to add support for the SH
platform in s_lock.h.

Unfortunately, I don't have any non-gentoo boxes around which are on a
par with these two hardware-wise.  Also, I think the test cases I tried
to come up with were most likely wrong.  The code which is currently
croaking basically amounts to 9 processes calling functions which do
SELECT, INSERT, SELECT FOR UPDATE, DELETE, and UPDATE, as well as CREATE
TEMP TABLE ... ON COMMIT DROP.  ON COMMIT DROP is the only kind of temp
table that this code uses.

I could probably try to re-arrange the code in such a way that I can send
it, if that would be helpful (although I wouldn't want to waste the effort
if it wouldn't be helpful).

Also, what do you figure are the chances of that plperl locale problem
causing this?  I would guess pretty slim, seeing as I am only using ASCII
in my schemas and all of my databases are SQL_ASCII.  I am calling plperl
functions in both of the projects which are breaking...  Also, if I run
the command 'locale', everything it prints out is either empty or
"POSIX".

-- 
Take your dying with some seriousness, however.  Laughing on the way to
your execution is not generally understood by less advanced life forms,
and they'll call you crazy.
-- Messiah's Handbook: Reminders for the Advanced Soul

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] catalog corruption bug

2006-01-05 Thread Jeremy Drake
Here is some additional information that I have managed to gather today
regarding this.  It is not really about what causes it, so much as what
does not.

I removed all plperl from the loading processes.  I did a VACUUM FULL
ANALYZE, and then I reindexed everything in the database (including
starting the backend in standalone mode and running REINDEX SYSTEM
dbname).  They still failed.

So it is apparently not that plperl issue which was being discussed
earlier.

Also, what I said about the corruption not having persisted into other
backends was not quite correct.  It was leaving behind types in pg_type
which were in some of the pg_temp* schemas and corresponded to some of
the temp tables.  But I took those out and still had issues (slightly
different ones).

Here is some interesting stuff too.  I just stopped my processes to start
up a batch again so I could copy the error message I get now, but before
doing so I ran a VACUUM FULL ANALYZE VERBOSE so I could hopefully start
from a relatively clean state.  I got a few warnings I don't remember
seeing before.
INFO:  vacuuming "pg_catalog.pg_shdepend"
INFO:  "pg_shdepend": found 108 removable, 440 nonremovable row versions
in 15 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 53 to 53 bytes long.
There were 1492 unused item pointers.
Total free space (including removable row versions) is 89780 bytes.
7 pages are or will become empty, including 0 at the end of the table.
12 pages containing 89744 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_shdepend_depender_index" now contains 448 row versions
in 33 pages
DETAIL:  108 index row versions were removed.
23 index pages have been deleted, 23 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.10 sec.
WARNING:  index "pg_shdepend_depender_index" contains 448 row versions,
but table contains 440 row versions
HINT:  Rebuild the index with REINDEX.
INFO:  index "pg_shdepend_reference_index" now contains 448 row versions
in 12 pages
DETAIL:  108 index row versions were removed.
3 index pages have been deleted, 3 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING:  index "pg_shdepend_reference_index" contains 448 row versions,
but table contains 440 row versions
HINT:  Rebuild the index with REINDEX.
INFO:  "pg_shdepend": moved 4 row versions, truncated 15 to 4 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_shdepend_depender_index" now contains 448 row versions
in 33 pages
DETAIL:  4 index row versions were removed.
23 index pages have been deleted, 23 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING:  index "pg_shdepend_depender_index" contains 448 row versions,
but table contains 440 row versions
HINT:  Rebuild the index with REINDEX.
INFO:  index "pg_shdepend_reference_index" now contains 448 row versions
in 12 pages
DETAIL:  4 index row versions were removed.
4 index pages have been deleted, 4 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING:  index "pg_shdepend_reference_index" contains 448 row versions,
but table contains 440 row versions
HINT:  Rebuild the index with REINDEX.
INFO:  analyzing "pg_catalog.pg_shdepend"
INFO:  "pg_shdepend": scanned 4 of 4 pages, containing 440 live rows and 0
dead rows; 440 rows in sample, 440 estimated total rows



Similar for pg_type, there being 248 index row versions vs 244 row
versions in the table.

1631 vs 1619 in pg_attribute
95 vs 94 in pg_index


Looks like it may be time to start a standalone backend and REINDEX
again...

-- 
Don't smoke the next cigarette.  Repeat.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] catalog corruption bug

2006-01-06 Thread Jeremy Drake
On Thu, 5 Jan 2006, Tom Lane wrote:

 The ReadBuffer bug I just fixed could result in disappearance of catalog
 rows, so this observation is consistent with the theory that that's
 what's biting you.  It's not proof though...

Well, I applied that patch that you sent me the link to (the bufmgr.c
one), and rebuilt (PORTDIR_OVERLAY is cool...)

I ran my nine processes which hammer things overnight, and in the
morning one of them was dead.

DBD::Pg::st execute failed: ERROR:  duplicate key violates unique
constraint "pg_type_typname_nsp_index"
CONTEXT:  SQL statement "CREATE TEMPORARY TABLE push_tmp (val text) ON
COMMIT DROP"
PL/pgSQL function "push_func" line 6 at SQL statement
DBD::Pg::st execute failed: ERROR:  duplicate key violates unique
constraint "pg_type_typname_nsp_index"
CONTEXT:  SQL statement "CREATE TEMPORARY TABLE push_tmp (val text) ON
COMMIT DROP"
PL/pgSQL function "push_func" line 6 at SQL statement


I also write out the time as my processes progress, so I know roughly
when it happened too.  It happened at 1136534029 (the result of the perl
time() function), which when run through localtime() yields Thu Jan  5
23:53:49 2006.  It looks like I started the processes at about 18:30, so
they lasted a while at least.

Let me know if there is anything else I can try to help debug this
(asserts on?).

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] catalog corruption bug

2006-01-06 Thread Jeremy Drake
On Fri, 6 Jan 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  Well, I applied that patch that you sent me the link to (the bufmgr.c
  one), and rebuilt (PORTDIR_OVERLAY is cool...)

  I ran my nine processes which hammer things overnight, and in the
  morning one of them was dead.

  DBD::Pg::st execute failed: ERROR:  duplicate key violates unique
  constraint pg_type_typname_nsp_index

 Hm, did you REINDEX things beforehand?  This could be leftover corruption...

Yes.  I ran that VACUUM FULL ANALYZE VERBOSE which I emailed part of the
excerpt from, and then I started a standalone backend (postgres -D data
-P) and ran REINDEX SYSTEM dbname on the database in question.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] catalog corruption bug

2006-01-07 Thread Jeremy Drake
On Fri, 6 Jan 2006, Tom Lane wrote:

 OK, this must be a different issue then.  I think we have seen reports
 like this one before, but not been able to reproduce it.

 Could you rebuild with Asserts enabled and see if any asserts trigger?

I got an assert to fail.  I'm not entirely sure if this is helpful, but I
managed to get a core dump with --enable-debug and --enable-cassert (with
optimizations still on).  Let me know if there is anything else that would
be useful to get out of this core file...

(gdb) bt
#0  0x2b8a0cf9 in kill () from /usr/lib64/libc.so.6
#1  0x2b8a0a3d in raise () from /usr/lib64/libc.so.6
#2  0x2b8a1c82 in abort () from /usr/lib64/libc.so.6
#3  0x005f9838 in ExceptionalCondition (
conditionName=0x1abc <Address 0x1abc out of bounds>,
errorType=0x6 <Address 0x6 out of bounds>, fileName=0x0,
lineNumber=-1)
at assert.c:51
#4  0x005eec5d in AtEOXact_CatCache (isCommit=-68 '')
at catcache.c:575
#5  0x0047464f in CommitTransaction () at xact.c:1573
#6  0x00474d07 in CommitTransactionCommand () at xact.c:2184
#7  0x005581da in AutoVacMain (argc=6844, argv=0x6) at
autovacuum.c:688
#8  0x00558907 in autovac_start () at autovacuum.c:170
#9  0x0055e66b in ServerLoop () at postmaster.c:1269
#10 0x0055f9b9 in PostmasterMain (argc=3, argv=0x8832e0)
at postmaster.c:943
#11 0x0051fb43 in main (argc=3, argv=0x8832e0) at main.c:256
(gdb) frame 4
#4  0x005eec5d in AtEOXact_CatCache (isCommit=-68 '')
at catcache.c:575
575 Assert(!ct->dead);
(gdb) l
570 {
571 CatCTup    *ct = (CatCTup *) DLE_VAL(elt);
572
573 Assert(ct->ct_magic == CT_MAGIC);
574 Assert(ct->refcount == 0);
575 Assert(!ct->dead);
576 }
577 }
578 #endif
579 }
(gdb) set print pretty
(gdb) p *ct
$1 = {
  ct_magic = 1462113538,
  my_cache = 0x2aac3060,
  lrulist_elem = {
dle_next = 0x0,
dle_prev = 0x939ab0,
dle_val = 0x2ab19e18,
dle_list = 0x93b1a8
  },
  cache_elem = {
dle_next = 0x0,
dle_prev = 0x934b58,
dle_val = 0x2ab19e18,
dle_list = 0x2aac36c8
  },
  c_list = 0x0,
  refcount = 0,
  dead = 1 '\001',
  negative = 0 '\0',
  hash_value = 15438,
  tuple = {
t_len = 48,
t_self = {
  ip_blkid = {
bi_hi = 0,
bi_lo = 0
  },
  ip_posid = 70
},
t_tableOid = 2602,
t_datamcxt = 0x914998,
t_data = 0x2ab19f30
  }
}


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] catalog corruption bug

2006-01-07 Thread Jeremy Drake
On Sat, 7 Jan 2006, Tom Lane wrote:

 Fascinating --- that's not anywhere near where I thought your problem
 was.  Which cache is this tuple in?  (Print *ct->my_cache)

$2 = {
  id = 3,
  cc_next = 0x2aac1048,
  cc_relname = 0x2ab19df8 "pg_amop",
  cc_reloid = 2602,
  cc_indexoid = 2654,
  cc_relisshared = 0 '\0',
  cc_tupdesc = 0x2ab199e0,
  cc_reloidattr = 0,
  cc_ntup = 3,
  cc_nbuckets = 256,
  cc_nkeys = 2,
  cc_key = {5, 1, 0, 0},
  cc_hashfunc = {0x44e1a0 <hashoid>, 0x44e1a0 <hashoid>, 0, 0},
  cc_skey = {{
  sk_flags = 0,
  sk_attno = 5,
  sk_strategy = 3,
  sk_subtype = 0,
  sk_func = {
fn_addr = 0x5bb8c0 <oideq>,
fn_oid = 184,
fn_nargs = 2,
fn_strict = 1 '\001',
fn_retset = 0 '\0',
fn_extra = 0x0,
fn_mcxt = 0x914998,
fn_expr = 0x0
  },
  sk_argument = 0
}, {
  sk_flags = 0,
  sk_attno = 1,
  sk_strategy = 3,
  sk_subtype = 0,
  sk_func = {
fn_addr = 0x5bb8c0 <oideq>,
fn_oid = 184,
fn_nargs = 2,
fn_strict = 1 '\001',
fn_retset = 0 '\0',
fn_extra = 0x0,
fn_mcxt = 0x914998,
fn_expr = 0x0
  },
  sk_argument = 0
}, {
  sk_flags = 0,
  sk_attno = 0,
  sk_strategy = 0,
  sk_subtype = 0,
  sk_func = {
fn_addr = 0,
fn_oid = 0,
fn_nargs = 0,
fn_strict = 0 '\0',
fn_retset = 0 '\0',
fn_extra = 0x0,
fn_mcxt = 0x0,
fn_expr = 0x0
  },
  sk_argument = 0
}, {
  sk_flags = 0,
  sk_attno = 0,
  sk_strategy = 0,
  sk_subtype = 0,
  sk_func = {
fn_addr = 0,
fn_oid = 0,
fn_nargs = 0,
fn_strict = 0 '\0',
fn_retset = 0 '\0',
fn_extra = 0x0,
fn_mcxt = 0x0,
fn_expr = 0x0
  },
  sk_argument = 0
}},
  cc_isname = \000\000\000,
  cc_lists = {
dll_head = 0x935018,
dll_tail = 0x934c50
  },
  cc_bucket = {{
  dll_head = 0x0,
  dll_tail = 0x0
}}
}

Am I correct in interpreting this as the hash opclass for Oid?  That's
really bizarre.  I definitely didn't change that.

 The tableOid implies it's one of the caches on pg_amop, which makes
 the whole thing stranger yet.  pg_amop doesn't change during normal
 operation so there's no reason for one of its tuples to become dead.
 You aren't creating/deleting operator classes in this database are
 you?

Nope.  As a matter of fact, I never created any custom operator classes in
this database.


   regards, tom lane

 ---(end of broadcast)---
 TIP 6: explain analyze is your friend


-- 
Given my druthers, I'd druther not.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] catalog corruption bug

2006-01-07 Thread Jeremy Drake
On Sat, 7 Jan 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  Am I correct in interpreting this as the hash opclass for Oid?

 However, AFAICS the only consequence of this bug is to trigger
 that Assert failure if you've got Asserts enabled.  Dead catcache
 entries aren't actually harmful except for wasting some space.
 So I don't think this is related to your pg_type duplicate key
 problem.

 One weak spot in this theory is the assumption that somebody was
 vacuuming pg_amop.  It seems unlikely that autovacuum would do so
 since the table never changes (unless you had reached the point
 where an anti-XID-wraparound vacuum was needed, which is unlikely
 in itself).  Do you have any background processes that do full-database
 VACUUMs?

No.  Just the autovacuum, which is actually the process which had the
assert failure.

This appears to give the current xid
(gdb) p *s
$10 = {
  transactionId = 13568516,
  subTransactionId = 1,
  name = 0x0,
  savepointLevel = 0,
  state = TRANS_COMMIT,
  blockState = TBLOCK_STARTED,
  nestingLevel = 1,
  curTransactionContext = 0x9529c0,
  curTransactionOwner = 0x92eb40,
  childXids = 0x0,
  currentUser = 0,
  prevXactReadOnly = 0 '\0',
  parent = 0x0
}


 I'll go fix CatCacheRemoveCList, but I think this is not the bug
 we're looking for.

Incidentally, one of my processes did get that error at the same time.
All of the other processes had an error
DBD::Pg::st execute failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

But this one had the DBD::Pg::st execute failed: ERROR:  duplicate key
violates unique constraint "pg_type_typname_nsp_index"

It looks like my kernel did not have the option to append the pid to core
files, so perhaps they both croaked at the same time but only this one
got to write a core file?

I will enable this and try again, see if I can't get it to make 2 cores.

BTW, nothing of any interest made it into the backend log regarding what
assert(s) failed.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] catalog corruption bug

2006-01-07 Thread Jeremy Drake
On Sat, 7 Jan 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  On Sat, 7 Jan 2006, Tom Lane wrote:
  I'll go fix CatCacheRemoveCList, but I think this is not the bug
  we're looking for.

 A bit of a leap in the dark, but: maybe the triggering event for this
 situation is not a VACUUM pg_amop but a global cache reset due to
 sinval message buffer overrun.  It's fairly clear how that would lead
 to the CatCacheRemoveCList bug.  The duplicate-key failure could be an
 unrelated bug triggered by the same condition.  I have no idea yet what
 the mechanism could be, but cache reset is a sufficiently seldom-exercised
 code path that it's entirely plausible that there are bugs lurking in it.

 If this is correct then we could vastly increase the probability of
 seeing the bug by setting up something to force cache resets at a high
 rate.  If you're interested I could put together a code patch for that.

I tried that function you sent while running my other code.  It died, but
not in the same way.  None of my processes had the unique constraint
error, but two failed during commit.  Both of them died in the same place
as the last one, on pg_amop.

I think I am going to just run without the function this time and see if
it produces the duplicate type error and whether it generates two cores.




-- 
To kick or not to kick...
-- Somewhere on IRC, inspired by Shakespeare

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] catalog corruption bug

2006-01-09 Thread Jeremy Drake
On Sun, 8 Jan 2006, Tom Lane wrote:

 Yeah, that's not very surprising.  Running the forced-cache-resets
 function will definitely expose that catcache bug pretty quickly.
 You'd need to apply the patches I put in yesterday to have a system
 that has any chance of withstanding that treatment for any length of time.

  I think I am going to just run without the function running this time and
  see if it does the duplicate type error and if it will generate two cores.

I ran without that function you made, and it got the error, but not a
crash.  I stuck an Assert(false) right before the ereport for that
particular error, and I did end up with a core there, but I don't see
anything out of the ordinary (what little I know of the ordinary ;)

#0  0x2b8a0cf9 in kill () from /usr/lib64/libc.so.6
#1  0x2b8a0a3d in raise () from /usr/lib64/libc.so.6
#2  0x2b8a1c82 in abort () from /usr/lib64/libc.so.6
#3  0x005f9878 in ExceptionalCondition (
conditionName=0x2c53 <Address 0x2c53 out of bounds>,
errorType=0x6 <Address 0x6 out of bounds>, fileName=0x0,
lineNumber=-1)
at assert.c:51
#4  0x00460967 in _bt_doinsert (rel=0x2ab05568,
btitem=0xbec2c0,
index_is_unique=1 '\001', heapRel=0x8bf0f0) at nbtinsert.c:247
#5  0x00463773 in btinsert (fcinfo=0x2c53) at nbtree.c:228
#6  0x005fe869 in FunctionCall6 (flinfo=0x8, arg1=6, arg2=0,
arg3=18446744073709551615, arg4=0, arg5=0, arg6=0) at fmgr.c:1267
#7  0x0045bf4f in index_insert (indexRelation=0x2ab05568,
values=0x7fdfde20, isnull=0x7fdfde00 , heap_t_ctid=0xbebeac,
heapRelation=0x8bf0f0, check_uniqueness=1 '\001') at indexam.c:215
#8  0x0048f8fa in CatalogIndexInsert (indstate=0x2c53,
heapTuple=0xbebb88) at indexing.c:124
#9  0x0048f994 in CatalogUpdateIndexes (heapRel=0x2c53,
heapTuple=0xbebea8) at indexing.c:149
#10 0x0049bc67 in TypeCreate (typeName=0x7fdfe3e0
push_tmp,
typeNamespace=11057063, relationOid=12171371, relationKind=114 'r',
internalSize=-16728, typeType=99 'c', typDelim=44 ',',
inputProcedure=2290, outputProcedure=2291, receiveProcedure=2402,
sendProcedure=2403, analyzeProcedure=0, elementType=0, baseType=0,
defaultTypeValue=0x0, defaultTypeBin=0x0, passedByValue=-16 '',
alignment=100 'd', storage=120 'x', typeMod=-1, typNDims=0,
typeNotNull=0 '\0') at pg_type.c:316
#11 0x0048c361 in heap_create_with_catalog (
relname=0x7fdfe3e0 push_tmp, relnamespace=11057063,
reltablespace=0, relid=12171371, ownerid=16384, tupdesc=0xbeb8e8,
relkind=114 'r', shared_relation=0 '\0', oidislocal=0 '\0',
oidinhcount=0,
oncommit=ONCOMMIT_DROP, allow_system_table_mods=0 '\0') at heap.c:634
#12 0x004de220 in DefineRelation (stmt=0x93fc30, relkind=114 'r')
at tablecmds.c:423
#13 0x0058bfd0 in ProcessUtility (parsetree=0x93fc30, params=0x0,
dest=0x814b40, completionTag=0x0) at utility.c:497
#14 0x00515cb5 in _SPI_execute_plan (plan=0x93f9a8,
Values=0x9c5798,
Nulls=0x9c57b8 "~", '\177' <repeats 199 times>..., snapshot=0x0,
crosscheck_snapshot=0x0, read_only=0 '\0', tcount=0) at spi.c:1449
#15 0x005165fc in SPI_execute_plan (plan=0x93f9a8,
Values=0x9c5798,
Nulls=0x9c57b8 "~", '\177' <repeats 199 times>..., read_only=0 '\0',
tcount=0) at spi.c:336
#16 0x2c95d8a4 in exec_stmts (estate=0x7fdfe950, stmts=0x6)
at pl_exec.c:2280
#17 0x2c95ebc2 in exec_stmt_block (estate=0x7fdfe950,
block=0x8f2c70) at pl_exec.c:936
#18 0x2c95f5ab in plpgsql_exec_function (func=0x913bc8,
fcinfo=0x7fdfea90) at pl_exec.c:286
#19 0x2c9573f5 in plpgsql_call_handler (fcinfo=0x7fdfea90)
at pl_handler.c:123
#20 0x00501a74 in ExecMakeFunctionResult (fcache=0x90a7f0,
econtext=0x90a6c0, isNull=0x90ae38
"\177~\177\177\177\177\177\177!\006",
isDone=0x90aef0) at execQual.c:1095
#21 0x00505543 in ExecProject (projInfo=0x90ae58,
isDone=0x7fdfeef4) at execQual.c:3669
#22 0x0050ff5a in ExecResult (node=0x90a5a8) at nodeResult.c:157
#23 0x0050034d in ExecProcNode (node=0x90a5a8) at
execProcnode.c:306
#24 0x004ff5ea in ExecutorRun (queryDesc=0x90a5a8,
direction=ForwardScanDirection, count=0) at execMain.c:1110
#25 0x0058a5de in PortalRunSelect (portal=0x8e6c68, forward=1
'\001',
count=0, dest=0x8dad30) at pquery.c:794
#26 0x0058abdf in PortalRun (portal=0x8e6c68,
count=9223372036854775807, dest=0x8dad30, altdest=0x8dad30,
completionTag=0x7fdff320 ) at pquery.c:646
#27 0x00588fcb in PostgresMain (argc=9333864, argv=0x8dac18,
username=0x8853f0 "jeremyd") at postgres.c:1754
#28 0x0055e20a in ServerLoop () at postmaster.c:2853
#29 0x0055f9f9 in PostmasterMain (argc=3, argv=0x8832e0)
at postmaster.c:943
#30 0x0051fb83 in main (argc=3, argv=0x8832e0) at main.c:256


 Please also look at putting together a 

Re: [HACKERS] catalog corruption bug

2006-01-09 Thread Jeremy Drake
On Mon, 9 Jan 2006, Tom Lane wrote:

 Does your application drop these temp tables explicitly, or leave them
 to be dropped automatically during commit?  It might be interesting to
 see whether changing that makes any difference.

I drop them explicitly at the end of the function.

 I'm also curious
 whether the transaction that makes the temp table is ever rolled back
 instead of committed.

Not intentionally/explicitly.  The only time it should roll back is if it
gets an error (which tends to be this error).  I do sometimes hit ^C on
the perl scripts to tweak something, which would roll it back if in this
particular code, but I don't think i did that on the last run at least.


   regards, tom lane

 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


-- 
Every little picofarad has a nanohenry all its own.
-- Don Vonada

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jeremy Drake
On Thu, 19 Jan 2006, Jim C. Nasby wrote:

 Do you still have that patch that folks could look at? ISTM that this
 technique would be rather dependant on your actual workload, and as such
 could result in a big win for certain types of queries.


It is not a patch, per se.  It is a C-language function which calls some
of the nbtree functions to return things from the index.  The syntax for
calling it is rather obtuse, since those of us who don't understand the
parser are doomed to attempt circumventing it ;P

I tarred up the code, and put it on a web server so that interested
parties can play with it.  The url is
http://linux.apptechsys.com/~jeremyd/postgresql/fakeidxscan.tar.gz

It is very hackish, so definitely do not assume that it is in any way
correct; rather, assume the opposite.  I have run it on x86 and x86_64
boxes, and it compiles and runs on those.

Here is an example of its usage, so you can see the nasty syntax required
and perhaps grok how to use it better.


create table test_table (a integer, b integer);
create index test_table_a_b_idx on test_table (a, b);
insert into test_table (a, b) select a, b from generate_series(1,100) a,
generate_series(1,100) b;

select * from fakeidxrowscomposite(
'test_table', -- relation
'test_table_a_b_idx', -- index
1, --number of scan keys
ARRAY[1, 2]::smallint[], -- numbers of the index attributes to return
ARRAY[1]::smallint[], -- numbers of the attrs the scankeys apply to
ARRAY['=(integer,integer)'::regoperator], -- operators for the scankeys
ARRAY[3]::smallint[], -- btree strategy for the scankeys
(42,0) -- values for the scankeys to compare against (if there is only
   -- one, you have to put a fake one in since otherwise the parser
   -- does not think it is a record)
) AS (a integer, b integer); -- tell the parser what columns to expect



This example returns 100 rows in which the first column contains 42 and
the second column contains the numbers between 1 and 100, in order.

Feel free to do whatever with this, it's pretty fast for tables where
seeks to validate tuples would hurt, but you do get back dead things...


-- 
When you know absolutely nothing about the topic, make your forecast by
asking a carefully selected probability sample of 300 others who don't
know the answer either.
-- Edgar R. Fiedler

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] No heap lookups on index

2006-01-19 Thread Jeremy Drake
On Thu, 19 Jan 2006, Jim C. Nasby wrote:

  Feel free to do whatever with this, it's pretty fast for tables where
  seeks to validate tuples would hurt, but you do get back dead things...

 How'd you then weed out the dead tuples?

I didn't get that far with it.  The purpose of this function was to
quickly put something together to demonstrate that the overhead of seeking
to the proper tuples in the heap to determine their visibility was the
main component of the time being spent to satisfy our queries.

 Basically, numbers talk. If there were convincing numbers for something
 that wasn't a corner-case that showed a marked improvement then there'd
 be much more interest in getting this into the backend in some fashion.

I could get some numbers of how much time validating tuples adds to a
query, but I don't think that that would be horribly novel.  BTW,
hopefully I did not make you think that I intended to get this into
the official backend.  This function was only meant to demonstrate to the
people around here that the visibility check was the bottleneck we were
seeing.  The function may also be interesting as a demonstration of how
indexes are handled in postgres, as you can see when tuples are flagged as
no longer valid and when they are not.  I have put xmin into an index so
that I could use this function to better visualize when index tuples are
left behind (I tried to put xmax in there too, but I never saw them
change, after checking the code it turns out that the index is never told
about changes in xmax).


We were seeing this case: all rows in our table are visible (we were the
only transaction on the machine and we did a VACUUM FULL ANALYZE
beforehand).  We rebooted to ensure no caching.  We were seeing times
which, upon division by the number of rows returned by the index scan,
were remarkably close to the average seek time listed in the specs for
the hard drive in the test box.  This was about 5ms, which doesn't sound
like much, but given a large enough number of rows and a few joins, 5ms
per tuple adds up quickly.  This implies that we were seeing
approximately the worst case as far as the distribution of the relevant
tuples on pages goes, i.e. each tuple we wanted was on a different heap
page.

Digging back to some times we had collected from this experiment, we were
apparently taking about 15 to 20 seconds to run a particular query, and
when we used the function I previously posted those times were reduced to
5 seconds.  This was a while ago, however, so these times are probably
not very accurate, and we have probably made other tweaks to speed things
up since then.  But it gives an idea.  We could come up with more
absolute numbers, but I think people already know what they would look
like.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Jeremy Drake
On Fri, 3 Feb 2006, Josh Berkus wrote:

 The feature you proposed is a way to make your idiosyncratic setup easier
 to manage, but doesn't apply to anyone else's problems on this list, so
 you're going to have a hard time drumming up enthusiasm.

I am somewhat reluctant to interject into this discussion, but the
particular idiosyncratic setup referred to is not the only one where
this may be useful.

The immediate use I thought of was being able to have what appeared to
be multiple databases on the same server with different locale settings,
which cannot be changed post-initdb.  I could see having different
databases in different locales being a useful feature, perhaps in a
Wikipedia-type setup, so that the English, Chinese, and Arabic wikis
could each provide the correct sort order and other locale-specific
properties while still presenting a single logical database server for
connection strings.  This is just the first example I could think of in
which such a setup could be useful.

-- 
In the beginning, I was made.  I didn't ask to be made.  No one consulted
with me or considered my feelings in this matter.  But if it brought some
passing fancy to some lowly humans as they haphazardly pranced their way
through life's mournful jungle, then so be it.
- Marvin the Paranoid Android, from Douglas Adams' Hitchhiker's Guide to
the Galaxy Radio Scripts

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Getting the length of varlength data using

2006-02-10 Thread Jeremy Drake
It looks like pg_column_size gives you the actual size on disk, i.e.
after compression.

What looks interesting for you would be byteaoctetlen or the function it
wraps, toast_raw_datum_size.  See src/backend/access/heap/tuptoaster.c.
pg_column_size calls toast_datum_size, while byteaoctetlen/textoctetlen
calls toast_raw_datum_size.
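
For illustration, a rough sketch (untested; the function name and the
final check are hypothetical) of how a C function over such a type could
combine the two, getting the full on-disk length cheaply while fetching
only the first block:

#include "postgres.h"
#include "fmgr.h"
#include "access/tuptoaster.h"

/* Hypothetical example: decide something about the value from its full
 * length plus only a one-block prefix. */
Datum
datatype_foo_is_big(PG_FUNCTION_ARGS)
{
	/* Full uncompressed size (including the varlena header),
	 * obtained without detoasting the whole value. */
	Size		rawsize = toast_raw_datum_size(PG_GETARG_DATUM(0));

	/* Only the first BLCKSZ bytes are actually fetched. */
	struct varlena *prefix = (struct varlena *)
		PG_DETOAST_DATUM_SLICE(PG_GETARG_DATUM(0), 0, BLCKSZ);

	/* ... inspect VARDATA(prefix) as needed ... */
	(void) prefix;

	PG_RETURN_BOOL(rawsize - VARHDRSZ > BLCKSZ);
}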



On Sat, 11 Feb 2006, Bruce Momjian wrote:


 Have you looked at the 8.1.X built-in function pg_column_size()?

 ---

 Mark Dilger wrote:
  Hello, could anyone tell me, for a user contributed variable length
  data type, how can you access the length of the data without pulling
  the entire thing from disk?  Is there a function or macro for this?
 
  As a first cut, I tried using the PG_DETOAST_DATUM_SLICE macro, but to
  no avail.  grep'ing through the release source for version 8.1.2, I
  find very little usage of the PG_GETARG_*_SLICE and
  PG_DETOAST_DATUM_SLICE macros (and hence little clue how they are
  intended to be used.)  The only files where I find them referenced are:
 
  doc/src/sgml/xfunc.sgml
  src/backend/utils/adt/varlena.c
  src/include/fmgr.h
 
  I am writing a variable length data type and trying to optimize the
  disk usage in certain functions.  There are cases where the return
  value of the function can be determined from the length of the data
  and a prefix of the data without fetching the whole data from disk.
  (The prefix alone is insufficient -- I need to also know the length
  for the optimization to work.)
 
  The first field of the data type is the length, as follows:
 
  typedef struct datatype_foo {
  int32 length;
  char data[];
  } datatype_foo;
 
  But when I fetch the function arguments using
 
  datatype_foo * a = (datatype_foo *)
  PG_DETOAST_DATUM_SLICE(PG_GETARG_DATUM(0),0,BLCKSZ);
 
  the length field is set to the length of the fetched slice, not the
  length of the data as it exists on disk.  Is there some other function
  that gets the length without pulling more than the first block?
 
  Thanks for any insight,
 
  --Mark
 
  ---(end of broadcast)---
  TIP 1: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly
 



-- 
Contrary to popular belief, penguins are not the salvation of modern
technology.  Neither do they throw parties for the urban proletariat.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Exposing DEFAULT_PGSOCKET_DIR via a libpq function?

2006-03-28 Thread Jeremy Drake
On Tue, 28 Mar 2006, Tom Lane wrote:

 Larry Rosenman ler@lerctr.org writes:
 The other issue is borked installs where the server and libpq disagree.
 What I'm looking for is to expose what libpq has for its default as
 well as what the server is using.  There is currently no way to
 determine what libpq has for its default.  What happened in the irc
 case was a partial re-install with non-matching server and libpq.

 [ shrug... ]  So?  There isn't going to be any way that
 random-app-using-libpq is going to have a way to tell the user what the
 underlying copy of libpq is using for this default --- adding a call for
 that will be nothing more nor less than a waste of code space.  You'd be
 best off running strings(1) over the libpq.so file when the question
 comes up.

When I encounter such behavior, my tool of choice tends to be strace(1)
rather than strings(1).  That way, you can see exactly what it is
looking for and failing to find...


-- 
Nothing astonishes men so much as common sense and plain dealing.
-- Ralph Waldo Emerson

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] GIN - Generalized Inverted iNdex. Try 2.

2006-04-26 Thread Jeremy Drake
On Wed, 26 Apr 2006, Teodor Sigaev wrote:


 We (Oleg and I) are glad to present GIN to PostgreSQL.  If the community
 agrees, we will commit it to the HEAD branch.

 http://www.sigaev.ru/gin/gin.gz
 http://www.sigaev.ru/gin/README.txt

 Install:
 % cd  pgsql
 % zcat gin.gz | patch -p0
 make and initdb, install tsearch2

I just built this, and noticed that the regression test for opr_sanity
fails with your patch.  I attached the regression.diffs.

-- 
BOFH excuse #85:

Windows 95 undocumented feature
*** ./expected/opr_sanity.out   Wed Jan 25 18:35:51 2006
--- ./results/opr_sanity.out    Wed Apr 26 08:31:13 2006
***************
*** 778,785 ****
  WHERE p4.amopclaid = p2.oid AND
        p4.amopsubtype = p3.amopsubtype);
   oid | amname | oid | opcname | amopsubtype 
! -----+--------+-----+---------+-------------
! (0 rows)
  
  -- Check that amopopr points at a reasonable-looking operator, ie a binary
  -- operator yielding boolean.
--- 778,791 ----
  WHERE p4.amopclaid = p2.oid AND
        p4.amopsubtype = p3.amopsubtype);
   oid  | amname | oid  |  opcname  | amopsubtype 
! ------+--------+------+-----------+-------------
!  2742 | gin    | 2745 | _int4_ops |           0
!  2742 | gin    | 2745 | _int4_ops |           0
!  2742 | gin    | 2745 | _int4_ops |           0
!  2742 | gin    | 2746 | _text_ops |           0
!  2742 | gin    | 2746 | _text_ops |           0
!  2742 | gin    | 2746 | _text_ops |           0
! (6 rows)
  
  -- Check that amopopr points at a reasonable-looking operator, ie a binary
  -- operator yielding boolean.
***************
*** 825,831 ****
   783 |   10 | <<|
   783 |   11 | |>>
   783 |   12 | |&>
! (24 rows)
  
  -- Check that all operators linked to by opclass entries have selectivity
  -- estimators.  This is not absolutely required, but it seems a reasonable
--- 831,840 ----
   783 |   10 | <<|
   783 |   11 | |>>
   783 |   12 | |&>
! 2742 |    1 | &&
! 2742 |    2 | @
! 2742 |    3 | ~
! (27 rows)
  
  -- Check that all operators linked to by opclass entries have selectivity
  -- estimators.  This is not absolutely required, but it seems a reasonable
***************
*** 847,854 ****
  WHERE p1.amopopr = p2.oid AND p1.amopclaid = p3.oid AND
      NOT binary_coercible(p3.opcintype, p2.oprleft);
   amopclaid | amopopr | oid | oprname | opcname 
! -----------+---------+-----+---------+---------
! (0 rows)
  
  SELECT p1.amopclaid, p1.amopopr, p2.oid, p2.oprname, p3.opcname
  FROM pg_amop AS p1, pg_operator AS p2, pg_opclass AS p3
--- 856,869 ----
  WHERE p1.amopopr = p2.oid AND p1.amopclaid = p3.oid AND
      NOT binary_coercible(p3.opcintype, p2.oprleft);
   amopclaid | amopopr | oid  | oprname |  opcname  
! -----------+---------+------+---------+-----------
!       2746 |    2750 | 2750 | &&      | _text_ops
!       2745 |    2750 | 2750 | &&      | _int4_ops
!       2746 |    2751 | 2751 | @       | _text_ops
!       2745 |    2751 | 2751 | @       | _int4_ops
!       2746 |    2752 | 2752 | ~       | _text_ops
!       2745 |    2752 | 2752 | ~       | _int4_ops
! (6 rows)
  
  SELECT p1.amopclaid, p1.amopopr, p2.oid, p2.oprname, p3.opcname
  FROM pg_amop AS p1, pg_operator AS p2, pg_opclass AS p3

======================================================================


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 8.1.4 build failure on ICC 9.1

2006-05-25 Thread Jeremy Drake
On Thu, 25 May 2006, Tom Lane wrote:

 Joshua D. Drake [EMAIL PROTECTED] writes:
  I encountered this the other day and set up a build farm client for it.

  http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=meerkat&dt=2006-05-25%2018:16:36

 That NaN problem has been discussed before, and I believe we concluded
 it's a compiler bug.  See the archives for the switch to use to avoid
 it.

I think it was meant as a feature by Intel, but one man's feature is
another man's bug ;)

The flag to use is -mp1

Also, I see that you are getting the tons of output too.  Those drove me
nuts.  I sent in a patch for configure to take some of those -W flags
out, which is now applied to HEAD, as well as the addition of the -mp1
flag for the ICC compiler.

I was more interested in the failures on the HEAD build on that box.  I
have had no problems with pl/(perl|python) on my box, though it is using
9.0 vs. 9.1; I don't expect that they would have broken things THAT
badly...





---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] buildfarm - make check failures for leveret on 8.0

2006-08-07 Thread Jeremy Drake
On Mon, 7 Aug 2006, Tom Lane wrote:

 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
  *) why the large difference in the build-flags ?

 CVS HEAD configure.in knows about icc and the release branches don't.
 I think the changes were only put into HEAD because of lack of testing,
 but if we have buildfarm coverage I think it'd be OK to back-port the
 configure logic to the prior branches.  Any objections?

I sent the original patch.  I just sent it for HEAD because a) I could
still deal with previous branches by editing Makefile.global by hand after
configure, b) I reconfigured older branches seldom enough compared to
HEAD that it didn't bother me nearly as much, and c) I figured it would be
more readily accepted into HEAD than trying to get it back-ported.  Also I
was not sure about the acceptance of such things into back branches, since
it may be interpreted that supporting a new compiler is a new feature
and most projects don't like to add new features to old releases.


   regards, tom lane

 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


-- 
[EMAIL PROTECTED] (no longer valid - where are you now, Martin?)
-- from /usr/src/linux/drivers/cdrom/mcd.c

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] buildfarm - make check failures for leveret on 8.0

2006-08-07 Thread Jeremy Drake
On Mon, 7 Aug 2006, Tom Lane wrote:

 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
  *) why the large difference in the build-flags ?

 CVS HEAD configure.in knows about icc and the release branches don't.
 I think the changes were only put into HEAD because of lack of testing,
 but if we have buildfarm coverage I think it'd be OK to back-port the
 configure logic to the prior branches.

Plus if it is backported, I can enable 8.x builds on mongoose (my x86 icc
buildfarm box).

One reason I like to use icc is that I have found (at least in C++ code)
that it had a tendency to warn me about portability issues more than gcc
did.  But this was nasty, convoluted, nested-template C++ code, where it
is much easier to wander into territory that the standard did not
define, where compilers had vastly different interpretations, and where
tracking down these sorts of errors was a matter of trying to glean what
exactly the new compiler didn't like from a compiler error which wrapped
around the screen at least 3 times.  Intel's compiler was one of the most
standards-compliant C++ compilers around, which was good about pointing
out things that were not strictly compliant, while still accepting the
more obscure tricks that the standard did allow.  Ah, the good old days ;)

   regards, tom lane

 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


-- 
Neglect of duty does not cease, by repetition, to be neglect of duty.
-- Napoleon

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] buildfarm - make check failures for leveret on 8.0

2006-08-07 Thread Jeremy Drake
On Mon, 7 Aug 2006, Stefan Kaltenbrunner wrote:

 the just added new buildfarm member leveret (fedora core5 x86_64) is
 building with the recently released Intel C-compiler version 9.1.
 It passes all tests on -HEAD but fails on make check in both
 REL8_1_STABLE and REL8_0_STABLE.
 The logs of the later two branches also contain a very large number of
 annoying(stupid) warnings - those seem to be the result of -HEAD using:

 CFLAGS=-O2 -mp1 -fno-strict-aliasing -g

 while the older branches have

 CFLAGS=-O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline
 -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -g

Yep, I submitted the patch for this change a while back because the
warnings were driving me nuts, and I got tired of manually adjusting
Makefile.global every time I reconfigured.  Part of the reason for the
extra flags is that on Linux boxen, icc masquerades as gcc, so configure
cannot tell the difference.  This is intended to make porting easier:
every configure script in the world does not need to know about icc,
since it is close enough to gcc for most purposes.
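
(For what it's worth, telling them apart from code is easy enough -- a
sketch, relying on icc's own predefined macro:)

/* icc defines __GNUC__ for compatibility, so test for the
 * Intel-specific macro before concluding it is really gcc. */
#if defined(__INTEL_COMPILER)
	/* genuinely icc */
#elif defined(__GNUC__)
	/* gcc, or something else wearing the gcc mask */
#endif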


 three things to note here:

 *) why the large difference in the build-flags ?
Because the removed flags are either not supported in icc, or do something
rather stupid (see -Wall for an example of this).

 *) the large amount of warnings are probably caused by -Wall
That, and -Winline, which to icc means something like "display inlining
optimization stage remarks as warnings", which I don't think was the
intention of including the flag here.  -Wall also does things like
this.

 *) -HEAD has -mp1 which the intel compiler manual describes as:

 -mp1   Improve floating-point  precision.  -mp1   disables  fewer
   optimizations  and  has  less  impact on performance than
   -mp.

 could that be the reason why -HEAD passes the float4/float8 regression
 tests ?

Exactly.  Without -mp1, icc cheats in floating-point math, resulting in
non-standard behavior.  IIRC, this was NaN == 0.0, which according to
the standard should not be true, but which the code generated by icc
without -mp1 evaluated as true.  I suppose it could be argued whether
this is a bug or a feature resulting in (slightly) faster code at the
expense of standards compliance, but in either case, it does not work
out here.
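
(A trivial way to see which behavior a given compiler and flag set gives
you -- just a sketch, plain C:)

#include <stdio.h>

int
main(void)
{
	double		zero = 0.0;
	double		nan = zero / zero;	/* quiet NaN, computed at runtime */

	/* IEEE 754 says any ordered comparison involving NaN is false, so
	 * a conforming build prints 0; the cheating build prints 1. */
	printf("NaN == 0.0 -> %d\n", nan == 0.0);
	return 0;
}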



 Stefan

 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?

http://archives.postgresql.org


-- 
He is not only dull himself, he is the cause of dullness in others.
-- Samuel Johnson

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] large object regression tests

2006-09-05 Thread Jeremy Drake
Sorry if this gets through more than once; I seem to be having email
difficulties...

On Tue, 5 Sep 2006, Jeremy Drake wrote:

 I noticed when I was working on a patch quite a while back that there are
 no regression tests for large object support.  I know, large objects
 are not the most sexy part of the code-base, and I think they tend to be
 ignored/forgotten most of the time.  Which IMHO is all the more reason
 they should have some regression tests.  Otherwise, if someone managed to
 break them somehow, it is quite likely not to be noticed for quite some
 time.

 So in this vein, I have recently found myself with some free time, and a
 desire to contribute something, and decided this would be the perfect
 place to get my feet wet without stepping on any toes.

 I guess what I should ask is, would a patch to add a test for large
 objects to the regression suite be well received?  And, is there any
 advice for how to go about making these tests?

 I am considering, and I think that in order to get a real test of the
 large objects, I would need to load data into a large object which would
 be sufficient to be loaded into more than one block (large object blocks
 were 1 or 2K IIRC) so that the block boundary case could be tested.  Is
 there any precedent on where to grab such a large chunk of data from?  I
 was thinking about using an excerpt from a public domain text such as Moby
 Dick, but on second thought binary data may be better to test things with.

 My current efforts, and probably the preliminary portion of the final
 test, involves loading a small amount (less than one block) of text into a
 large object inline from a sql script and calling the various functions
 against it to verify that they do what they should.  In the course of
 doing so, I find that it is necessary to stash certain values across
 statements (large object ids, large object 'handles'), and so far I am
 using a temporary table to store these.  Is this reasonable, or is there a
 cleaner way to do that?



-- 
Never make anything simple and efficient when a way can be found to
make it complex and wonderful.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Win32 hard crash problem

2006-09-05 Thread Jeremy Drake
On Tue, 5 Sep 2006, Joshua D. Drake wrote:

 Right, but just took a reboot to fix it isn't very confidence inspiring ;)

Are you kidding?  This is standard procedure for troubleshooting Windows
problems :)

--
The world is coming to an end.  Please log off.

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] large object regression tests

2006-09-07 Thread Jeremy Drake
I noticed when I was working on a patch quite a while back that there are
no regression tests for large object support.  I know, large objects
are not the most sexy part of the code-base, and I think they tend to be
ignored/forgotten most of the time.  Which IMHO is all the more reason
they should have some regression tests.  Otherwise, if someone managed to
break them somehow, it is quite likely not to be noticed for quite some
time.

So in this vein, I have recently found myself with some free time, and a
desire to contribute something, and decided this would be the perfect
place to get my feet wet without stepping on any toes.

I guess what I should ask is, would a patch to add a test for large
objects to the regression suite be well received?  And, is there any
advice for how to go about making these tests?

I am considering, and I think that in order to get a real test of the
large objects, I would need to load data into a large object which would
be sufficient to be loaded into more than one block (large object blocks
were 1 or 2K IIRC) so that the block boundary case could be tested.  Is
there any precedent on where to grab such a large chunk of data from?  I
was thinking about using an excerpt from a public domain text such as Moby
Dick, but on second thought binary data may be better to test things with.

My current efforts, and probably the preliminary portion of the final
test, involves loading a small amount (less than one block) of text into a
large object inline from a sql script and calling the various functions
against it to verify that they do what they should.  In the course of
doing so, I find that it is necessary to stash certain values across
statements (large object ids, large object 'handles'), and so far I am
using a temporary table to store these.  Is this reasonable, or is there a
cleaner way to do that?

-- 
Even if you're on the right track, you'll get run over if you just sit there.
-- Will Rogers

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Fixed length data types issue

2006-09-10 Thread Jeremy Drake
On Sun, 10 Sep 2006, Kevin Brown wrote:

 Tom Lane wrote:
  (does anyone know the cost of ntohl() on modern
  Intel CPUs?)

 I have a system with an Athlon 64 3200+ (2.0 GHz) running in 64-bit
 mode, another one with the same processor running in 32-bit mode, a
 third running a Pentium 4 1.5 GHz processor, and a fourth running a
 pair of 2.8 GHz Xeons in hyperthreading mode.

 I compiled the test program on the 32-bit systems with the -std=c9x
 option so that the constant would be treated as unsigned.  Other than
 that, the compilation method I used was identical: no optimization,
 since it would skip the loop entirely in the version without the
 ntohl() call.  I compiled it both with and without defining
 CALL_NTOHL, and measured the difference in billed CPU seconds.

 Based on the above, on both Athlon 64 systems, each ntohl() invocation
 and assignment takes 1.04 nanoseconds to complete (I presume the
 assignment is to a register, but I'd have to examine the assembly to
 know for sure).  On the 1.5 GHz P4 system, each iteration takes 8.49
 nanoseconds.  And on the 2.8 GHz Xeon system, each iteration takes
 5.01 nanoseconds.

Of course, that depends on the particular OS and variant as well.  IIRC,
at some point an instruction was added to the x86 instruction set to do
byte swapping.

This is from /usr/include/netinet/in.h on a gentoo linux box with glibc
2.3

#ifdef __OPTIMIZE__
/* We can optimize calls to the conversion functions.  Either nothing has
   to be done or we are using directly the byte-swapping functions which
   often can be inlined.  */
# if __BYTE_ORDER == __BIG_ENDIAN
/* The host byte order is the same as network byte order,
   so these functions are all just identity.  */
# define ntohl(x)   (x)
# define ntohs(x)   (x)
# define htonl(x)   (x)
# define htons(x)   (x)
# else
#  if __BYTE_ORDER == __LITTLE_ENDIAN
#   define ntohl(x) __bswap_32 (x)
#   define ntohs(x) __bswap_16 (x)
#   define htonl(x) __bswap_32 (x)
#   define htons(x) __bswap_16 (x)
#  endif
# endif
#endif


And from bits/byteswap.h

/* To swap the bytes in a word the i486 processors and up provide the
   `bswap' opcode.  On i386 we have to use three instructions.  */
#  if !defined __i486__ && !defined __pentium__ && !defined __pentiumpro__ \
      && !defined __pentium4__
#   define __bswap_32(x) \
     (__extension__ \
      ({ register unsigned int __v, __x = (x); \
         if (__builtin_constant_p (__x)) \
           __v = __bswap_constant_32 (__x); \
         else \
           __asm__ ("rorw $8, %w0;" \
                    "rorl $16, %0;" \
                    "rorw $8, %w0" \
                    : "=r" (__v) \
                    : "0" (__x) \
                    : "cc"); \
         __v; }))
#  else
#   define __bswap_32(x) \
     (__extension__ \
      ({ register unsigned int __v, __x = (x); \
         if (__builtin_constant_p (__x)) \
           __v = __bswap_constant_32 (__x); \
         else \
           __asm__ ("bswap %0" : "=r" (__v) : "0" (__x)); \
         __v; }))
#  endif


/me searches around his hard drive for the ia32 developers reference

BSWAP
Opcode      Instruction   Description
0F C8+rd    BSWAP r32     Reverse the byte order of a 32-bit register

...

The BSWAP instruction is not supported on IA-32 processors earlier than
the Intel486 processor family. ...


I have read some odd stuff about instructions like these.  Apparently the
fact that this is a prefixed instruction (the 0F byte at the beginning)
costs an extra clock cycle, so though this instruction should take 1
cycle, it ends up taking 2.  I am unclear whether or not this is rectified
in later pentium chips.

So to answer the question about how much ntohl costs on recent Intel
boxes, a properly optimized build with a friendly libc like I quoted
should be able to do it in 2 cycles.
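
(For anyone who wants to repeat the measurement, a sketch of the sort of
loop described above -- an assumed reconstruction, not Kevin's actual
program.  Build it at -O0 twice, with and without -DCALL_NTOHL, and
compare the billed CPU seconds:)

#include <arpa/inet.h>
#include <stdint.h>

int
main(void)
{
	uint32_t	v = 0;
	uint32_t	x = 0x01020304U;
	long		i;

	for (i = 0; i < 1000000000L; i++)
	{
#ifdef CALL_NTOHL
		v = ntohl(x);		/* measured case */
#else
		v = x;			/* baseline: loop overhead only */
#endif
	}
	return (int) (v & 1);	/* use v so nothing is optimized away */
}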


-- 
In Ohio, if you ignore an orator on Decoration day to such an extent as
to publicly play croquet or pitch horseshoes within one mile of the
speaker's stand, you can be fined $25.00.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Cassowary failing to report the results back to the

2006-09-11 Thread Jeremy Drake
On Mon, 11 Sep 2006, Adrian Maier wrote:

 It's not clear to me where that date-in-the-future comes from.  The
 machine's date is set correctly:
 $ date
 Mon Sep 11 11:00:30 PST 2006

Um, no.  I am currently in the PST time zone, and I can say from
first-hand experience that the current time is 2:21 am, not 11 am.  I have
confirmed this by looking out the window and noticing a distinct lack of
light.  The time you have quoted is about 8.5 hours in the future.
Suggest you either verify your time zone, or look out your window
;)



 Any ideas about what might cause this?




-- 
If you can lead it to water and force it to drink, it isn't a horse.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Getting a move on for 8.2 beta

2006-09-13 Thread Jeremy Drake
On Wed, 13 Sep 2006, Tom Dunstan wrote:

  Another possibility would be to test these patches in some kind of virtual
  machine that gets blown away every X days, so that even if someone did get
  something malicious in there it wouldn't last long.

Or just have a snapshot which is reverted after each run, and read-only
access to files used to do the build.  I know vmware supports this,
probably others too...

 Yeah, nasties could be roughly separated into two categories: stuff which
 affects your box, or stuff which uses your box to affect someone else. A VM
 fixes the first, and a firewall blocking outgoing connections (with exceptions
 for the CVS server and patch buildfarm or whatever it is server) largely fixes
 the second.

 I was under the impression that most VM products are x86 centric, which
 wouldn't lead to huge amounts of diversity in the buildfarm results. At least,
 not as far as architecture goes.

I have played with QEMU (www.qemu.org), which is open source and supports
multiple target architectures.  I'm not sure how stable all of the
different targets are; I know that sparc64 is not quite done yet.


-- 
The problem with engineers is that they tend to cheat in order to get
results.

The problem with mathematicians is that they tend to work on toy
problems in order to get results.

The problem with program verifiers is that they tend to cheat at toy
problems in order to get results.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] polite request about syntax

2006-09-15 Thread Jeremy Drake
On Fri, 15 Sep 2006, Dave Page wrote:
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of
  Ricardo Malafaia
  Sent: 15 September 2006 16:35
  To: Andrew Dunstan
  Cc: pgsql-hackers@postgresql.org
  Subject: Re: [HACKERS] polite request about syntax
 
  my complaint is that, like i said, timestamp with time zone is no
  good substitute for a simple datetime.  Here, someone suggested a
  CREATE DOMAIN to create an alias for it.  Why isn't it provided there
  out-of-the-box by default?  So you have the SQL standard timestamp and
  the industry standard datetime.

 Because adding everybody's idea of industry-standard typenames, function
 name, operators etc will lead to bloated system catalogs and insanity
 for people trying to understand what differences between objects there
 may or may not be.

 We follow the SQL standards. If you need to provide compatibility types
 and functions to migrate from another product, then unlike many others
 we provide the capability for you to add them yourself.

I hate to comment on what is shaping up to be a bit of a tinderbox of a
thread, but I can't help myself.

When I was first dealing with postgres, I found it extremely annoying
that I had to type out "double precision" rather than just "double",
since every sane programming language (as well as Java) uses "double".
I eventually figured out that it was because "double precision" is the
standard name, but I don't like to type, and although I know I could use
float8, I am used to typing "double".

I have found the same thing with the type "timestamp without time zone".
The verbosity of type names seems rather extreme.  But it is just not
important enough to warrant creating a domain or anything else to do
something about it; it just slightly irks me every time I have to type
them.

I have probably now written more on this than it deserves :)

-- 
Fertility is hereditary.  If your parents didn't have any children,
neither will you.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] polite request about syntax

2006-09-20 Thread Jeremy Drake
On Tue, 19 Sep 2006, Alvaro Herrera wrote:

 Jeremy Drake wrote:

  I have found the same thing with the type timestamp without time zone.
  The verbosity of type names seems rather extreme.

 Then use simply timestamptz (with TZ) or timestamp (without).

Didn't know about these; learn something new every day, I guess.  I knew
that "double" did not work, from the countless times I forgot the
precision :)

-- 
Armadillo:
To provide weapons to a Spanish pickle

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Patch for UUID datatype (beta)

2006-09-20 Thread Jeremy Drake
On Wed, 20 Sep 2006, Gregory Stark wrote:


 [EMAIL PROTECTED] writes:

  I have the impression I'm not being heard.
 
  *I* control the MAC address assignment for all of *MY* units.

 No, you're missing the point. How does that help *me* avoid collisions with
 your UUIDs? UUIDs are supposed to be unique period, not just unique on your
 database.


I must jump in with my amusement at this whole conversation.  I just
looked up the standard (http://www.ietf.org/rfc/rfc4122.txt) and it
includes this abstract:

Abstract

   This specification defines a Uniform Resource Name namespace for
   UUIDs (Universally Unique IDentifier), also known as GUIDs (Globally
   Unique IDentifier).  A UUID is 128 bits long, and can guarantee
   uniqueness across space and time.  UUIDs were originally used in the
   Apollo Network Computing System and later in the Open Software
   Foundation's (OSF) Distributed Computing Environment (DCE), and then
   in Microsoft Windows platforms.


It then goes on to detail multiple versions of them which are generated
in various ways.  But they are all called UUID, and thus should all be
UNIVERSALLY unique, and the statement "can guarantee uniqueness across
space and time" should apply equally to all versions, as it is an
absolute statement.  So perhaps the IETF have been drinking the kool-aid
(or whatever), or perhaps you plan to use your databases in multiple
universes.  But the standard seems to make the whole discussion moot by
guaranteeing all UUIDs to be unique across space and time.  Or am I
misreading that?

So I guess I am just ROFL at the fact that people can't seem to get
their definition of "universe" quite straight.  Either the UUID is
misnamed, or some people here are vastly underestimating the scope of
the universe, or perhaps both.  Or perhaps it's just that it's 3am and
this thing seems extraordinarily funny to me right now ;)


-- 
Menu, n.:
A list of dishes which the restaurant has just run out of.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] advisory locks and permissions

2006-09-20 Thread Jeremy Drake
On Wed, 20 Sep 2006, Bruce Momjian wrote:


 Doesn't creating many temp tables in a transaction do the same thing?

 ---

Like this?

jeremyd=# CREATE OR REPLACE FUNCTION testy(n integer) returns integer as $$
BEGIN
 EXECUTE 'CREATE TEMP TABLE testy_' || n::text || ' (a integer, b text);';
 RETURN n;
END;
$$ LANGUAGE plpgsql VOLATILE STRICT;
CREATE FUNCTION
jeremyd=# select testy(n) from generate_series(1,100) n;
WARNING:  out of shared memory
CONTEXT:  SQL statement "CREATE TEMP TABLE testy_3323 (a integer, b
text);"
PL/pgSQL function "testy" line 2 at execute statement
ERROR:  out of shared memory
HINT:  You may need to increase max_locks_per_transaction.
CONTEXT:  SQL statement "CREATE TEMP TABLE testy_3323 (a integer, b
text);"
PL/pgSQL function "testy" line 2 at execute statement




 Josh Berkus wrote:
  All,
 
  I vote for locking down to superuser access (let's be frank here: I
  would estimate 90%+ of database installations run with the application
  as root) so we are not losing much.
 
  Not in my experience.   Note that making them superuser-only pretty much 
  puts
  them out of the hands of hosted applications.
 
  How simple would it be to limit the number of advisory locks available to a
  single request?  That would at least make the DOS non-trivial.  Or to put in
  a handle (GUC?) that allows turning advisory locks off?
 
  Hmmm ... I'll bet I could come up with other ways to use generate_series in 
  a
  DOS, even without advisory locks ...
 
  --
  Josh Berkus
  PostgreSQL @ Sun
  San Francisco
 
  ---(end of broadcast)---
  TIP 6: explain analyze is your friend



-- 
Two percent of zero is almost nothing.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] large object regression tests

2006-09-24 Thread Jeremy Drake
On Thu, 21 Sep 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  I put together a patch which adds a regression test for large objects,
  hopefully attached to this message.  I would like some critique of it, to
  see if I have gone about it the right way.  Also I would be happy to hear
  any additional tests which should be added to it.

 I'd prefer it if we could arrange not to need any absolute paths
 embedded into the test, because maintaining tests that require such is
 a real PITA --- instead of just committing the actual test output, one
 has to reverse-convert it to a .source file.

I just copied how the test for COPY worked, since I perceived a similarity
in what I needed to do (use external files to load data).

 I suggest that instead of testing the server-side lo_import/lo_export
 functions, perhaps you could test the psql equivalents and write and
 read a file in psql's working directory.

I did not see any precedent for that when I was looking around in the
existing tests for an example of how to do things.  I am not even sure
where the cwd of psql is, so I can put an input file there.  Could you
provide an example of how this might look, by telling me where to put a
file in the src/test/regress tree and the path to give to \lo_import?
Besides which, shouldn't both the server-side and psql versions be tested?
When I was looking at the copy tests, it looked like the server-side ones
were tested, and then the psql ones were tested by exporting and then
importing data which was originally loaded from the server-side method.
Am I correctly interpreting the precedent, or are you suggesting that the
precedent be changed?  I was trying to stay as close to the copy tests as
possible since the functionality is so similar (transferring data to/from
files in the filesystem, either via server-side functions which require
absolute paths or via psql \ commands (which I forgot about for the lo
funcs)).

 I think we could do without the Moby Dick extract too ...

I am open to suggestions.  I saw one suggestion that I use an image of
an elephant, but I suspect that was tongue-in-cheek.  I am not very fond
of the idea of generating repetitious data, as I think it would make it
more difficult to determine whether or not the lo_lseek/lo_tell
functions put me in the right place in the middle of the file.  Perhaps
if there were a way to generate deterministic pseudo-random data, that
would work (it has to be deterministic so the diffs of the output come
out right).  Anyone have a good example of seeding a random number
generator and generating a bunch of bytea which is deterministic
cross-platform?


   regards, tom lane


In the mean time, I will alter the test to also test the psql backslash
commands based on how the copy equivalents are tested, since I had
forgotten them and they need to be tested also.
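
As a side note on the deterministic-data question above, one possibility
(a sketch under my own assumptions, not an established convention) would
be a tiny fixed-constant generator; since it uses only unsigned 32-bit
arithmetic, the byte stream is identical on every platform and the
regression diffs stay stable:

#include <stdio.h>
#include <stdint.h>

/* Emit n pseudo-random bytes from a fixed seed using a classic LCG
 * step; uint32_t wraparound is well defined everywhere. */
int
main(void)
{
	uint32_t	state = 12345;	/* fixed seed */
	int			n = 4096;	/* enough to cross a 2K LO page boundary */
	int			i;

	for (i = 0; i < n; i++)
	{
		state = state * 1103515245u + 12345u;
		putchar((int) ((state >> 16) & 0xff));
	}
	return 0;
}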

-- 
Any sufficiently advanced technology is indistinguishable from a rigged
demo.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] pls disregard, testing majordomo settings

2006-09-24 Thread Jeremy Drake
I just messed with a bunch of my majordomo settings and I wanted to make
sure things are working the way I thought.  Please disregard.  Sorry to
bother everyone

-- 
I'll defend to the death your right to say that, but I never said I'd
listen to it!
-- Tom Galloway with apologies to Voltaire

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] large object regression tests

2006-09-24 Thread Jeremy Drake
On Sun, 24 Sep 2006, Jeremy Drake wrote:

 On Thu, 21 Sep 2006, Tom Lane wrote:

  I suggest that instead of testing the server-side lo_import/lo_export
  functions, perhaps you could test the psql equivalents and write and
  read a file in psql's working directory.

 I did not see any precedent for that when I was looking around in the
 existing tests for an example of how to do things.
snip
 When I was looking at the copy tests, it looked like the server-side ones
 were tested, and then the psql ones were tested by exporting and then
 importing data which was originally loaded from the server-side method.

I just went back and looked at the tests again.  The only time the psql
\copy command was used was in the (quite recent IIRC) copyselect test, and
then only via stdout (never referring to psql working directory, or to
files at all).  Did I misunderstand, and you are proposing a completely
new way of doing things in the regression tests?  I am not particularly
fond of the sed substitution stuff myself, but it seems to be the only
currently supported/used method in the regression tests...  I do think
that making the large object test and the copy test consistent would make
a lot of sense, since as I said before, the functionality of file access
is so similar...

-- 
We demand rigidly defined areas of doubt and uncertainty!
-- Vroomfondel

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] large object regression tests

2006-09-26 Thread Jeremy Drake
On Sun, 24 Sep 2006, Jeremy Drake wrote:

 On Thu, 21 Sep 2006, Tom Lane wrote:

  I think we could do without the Moby Dick extract too ...

 I am open to suggestions.  I saw one suggestion that I use an image of an
 elephant, but I suspect that was tongue-in-cheek.  I am not very fond of
 the idea of generating repetitious data, as I think it would be more
 difficult to determine whether or not the loseek/tell functions put me in
 the right place in the middle of the file.

I just had the idea that I could use one of the existing data files
which are used for testing COPY instead of the Moby Dick extract.  They
are already there, a few of them are pretty good sized, they have data
in the file which is not just simple repetition (so it would be pretty
obvious if the seek function broke), and they are very unlikely to
change.  I am considering changing the test I put together to use
tenk.data as the input file, and sending in what I have again tomorrow,
since I am also doing a test of \lo_import (which also requires a patch
to psql I sent in earlier to fix the output of the \lo_* commands to
respect the output settings).

-- 
When does summertime come to Minnesota, you ask?
Well, last year, I think it was a Tuesday.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Developer's Wiki

2006-09-27 Thread Jeremy Drake
On Wed, 27 Sep 2006, Lukas Kahwe Smith wrote:

 Dave Page wrote:
  I have now moved the wiki installation to:
 
  http://developer.postgresql.org/

 BTW: I am wondering if there is an RSS feed of the changes?

 On my wiki I have an RSS feed for every page, subwiki (aka area) and the
 entire wiki people can subscribe to:
 http://oss.backendmedia.com/rss.php?area=PHPTODO&page=HomePage
 http://oss.backendmedia.com/rss.php?area=PHPTODO
 http://oss.backendmedia.com/rss.php

I only really know of the entire wiki one, but that's the only one I have
ever wanted to do.  I think it may be able to limit to namespaces, but I
am not sure about that.

http://developer.postgresql.org/index.php?title=Special:Recentchanges&feed=rss

There are a bunch of knobs on the Special:Recentchanges page which might
also apply to the RSS version, but I have never tried them, so I don't
know.


 regards,
 Lukas


-- 
Besides the device, the box should contain:

* Eight little rectangular snippets of paper that say WARNING

* A plastic packet containing four 5/17 inch pilfer grommets and two
  club-ended 6/93 inch boxcar prawns.

YOU WILL NEED TO SUPPLY: a matrix wrench and 60,000 feet of tram
cable.

IF ANYTHING IS DAMAGED OR MISSING: You IMMEDIATELY should turn to your
spouse and say: Margaret, you know why this country can't make a car
that can get all the way through the drive-through at Burger King
without a major transmission overhaul?  Because nobody cares, that's
why.

WARNING: This is assuming your spouse's name is Margaret.
-- Dave Barry, Read This First!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] large object regression tests

2006-09-27 Thread Jeremy Drake
On Sun, 24 Sep 2006, Jeremy Drake wrote:

 On Thu, 21 Sep 2006, Tom Lane wrote:

  I suggest that instead of testing the server-side lo_import/lo_export
  functions, perhaps you could test the psql equivalents and write and
  read a file in psql's working directory.
snip
 In the mean time, I will alter the test to also test the psql backslash
 commands based on how the copy equivalents are tested, since I had
 forgotten them and they need to be tested also.

I just tried using the \lo_import command in a regression test, and I
think I figured out why this will not work:

$ make check
...
 largeobject  ... FAILED
...

$ cat regression.diffs
*** ./expected/largeobject.out  Sun Sep 24 19:55:25 2006
--- ./results/largeobject.out   Sun Sep 24 19:55:58 2006
***************
*** 188,194 ****
  (1 row)

  \lo_import 'results/lotest.txt'
! lo_import 31138
  \set newloid :LASTOID
  -- This is a hack to test that export/import are reversible
  -- This uses knowledge about the inner workings of large object mechanism
--- 188,194 ----
  (1 row)

  \lo_import 'results/lotest.txt'
! lo_import 31199
  \set newloid :LASTOID
  -- This is a hack to test that export/import are reversible
  -- This uses knowledge about the inner workings of large object mechanism

======================================================================


Yes, that's the large object OID in the output there, and it is different
each run (as I expect).  If you look at src/bin/psql/large_obj.c line 192,
you see:

fprintf(pset.queryFout, "lo_import %u\n", loid);

This is executed unconditionally whenever the lo_import succeeds.  While
in normal circumstances it is quite necessary to know the loid, since it
changes on each call, here it serves only to break the diffs, and so I
guess it is impossible to use the \lo_import command in a regression
test as-is.


-- 
The first time, it's a KLUDGE!
The second, a trick.
Later, it's a well-established technique!
-- Mike Broido, Intermetrics

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] large object regression tests

2006-09-27 Thread Jeremy Drake
On Mon, 25 Sep 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  I just tried using the \lo_import command in a regression test, and I
  think I figured out why this will not work:
  ...
  Yes, that's the large object OID in the output there, and it is different
  each run (as I expect).

 Right.  I'd suggest temporarily setting ECHO off to hide the
 unpredictable part of the output.  There are similar measures taken in
 many of the contrib tests.

I tried this:
jeremyd=# \set QUIET
jeremyd=# \set ECHO off
jeremyd=# BEGIN;
jeremyd=# \lo_import results/lotest.txt
lo_import 84951
jeremyd=# ROLLBACK;

From what I could tell in the code, the message is printed regardless of
setting.

It looks like the large_obj.c output is missing much of the output
settings handling which is in the PrintQueryStatus function in common.c,
such as handling quiet mode and HTML output.  I will dig around and try
to put together a patch to make it respect the settings like the other
commands...
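
The minimal shape of such a change might be something like this (a
sketch only, not the patch I will submit; exactly how psql tests for
quiet mode here is an assumption):

/* in src/bin/psql/large_obj.c -- suppress the status line in quiet
 * mode, the way PrintQueryStatus() in common.c does */
if (!pset.quiet)
	fprintf(pset.queryFout, "lo_import %u\n", loid);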

-- 
"You are old," said the youth, "and your programs don't run,
And there isn't one language you like;
Yet of useful suggestions for help you have none --
Have you thought about taking a hike?"

"Since I never write programs," his father replied,
"Every language looks equally bad;
Yet the people keep paying to read all my books
And don't realize that they've been had."

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] src/tools/msvc usage instructions

2006-10-01 Thread Jeremy Drake
I was just trying to build using the src/tools/msvc scripts on windows,
and I was wondering if there were any instructions on how to do this, what
prerequisites there are, where to get them, etc.  I couldn't find any, but
I may not know the correct place to look.

Sorry if this is the wrong list for this question.

-- 
People need good lies.  There are too many bad ones.
-- Bokonon, Cat's Cradle by Kurt Vonnegut, Jr.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] src/tools/msvc usage instructions

2006-10-02 Thread Jeremy Drake
On Mon, 2 Oct 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  I grabbed flex and bison from GNUwin32
  (http://gnuwin32.sourceforge.net/packages/bison.htm)

  This appears to not work out well.  If I copy the generated files from
  bison from a linux box, then they are ok, but if I try to use ones
  generated using that version of bison, it does not compile.

 Hm, have you tried diff'ing the output files from the two cases?  This
 is really probably a question for the flex and bison maintainers, not
 us, but it seems like it should work for moderately up-to-date versions
 of those tools.  What compile failures do you get exactly?

I was just going to chalk it up to a bad matching of tool ports or
something and try to find a different bison, but if you are really
interested...

I get errors on any bison generated file.  For simplicity of the diff, I
did not use the first failure I got, which was gram.c, but instead used
the much smaller bootparse.c file.  I grabbed the bootparse.c files
generated on windows and on linux, did a diff -cw between them, and tarred
up the three files, which you can get from
http://www.jdrake.com/postgresql/bison-files-win32.tar.gz

The errors I got on this file were:
1>------ Build started: Project: postgres, Configuration: Release Win32 ------
1>Compiling...
1>bootparse.c
1>bootparse.tab.c(1065) : error C2449: found '{' at file scope (missing function header?)
1>bootparse.tab.c(1858) : error C2059: syntax error : '}'

and then a whole lot of random, uninteresting errors of the kind you get
when the compiler is so confused it no longer knows what it is doing.

I am currently trying to build a newer version of bison using mingw and
use it, but I am running into issues with that also.

Oh, I just got the email from Magnus which says to use 1.875 and not
v2.1, so I guess that's what I did wrong.  Oops!


   regards, tom lane

 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match



-- 
Drive defensively.  Buy a tank.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] src/tools/msvc usage instructions

2006-10-02 Thread Jeremy Drake
On Mon, 2 Oct 2006, Magnus Hagander wrote:

  This appears to not work out well.  If I copy the generated
  files from bison from a linux box, then they are ok, but if I
  try to use ones generated using that version of bison, it
  does not compile.  I'll look around for a different one.

 That's the one I'm using.  However, be sure to get version 1.875-4, and
 *not* version 2.1.

Oops, that was it.

 If you do "build solution" it should build all projects in the correct
 order - there are dependency references set between them that should
 take care of this automatically.

If I do "build solution" it tells me "Project not selected to build for
this solution configuration" for all projects, then "55 skipped" at the
end.

  Do you have any idea how to get the environment to know where
  windows.h is?  I even explicitly added the directory to the INCLUDE
  environment variable, but it did not work.  I will try switching to
  short paths in there in case it is an issue of paths with spaces.

 In my environment, that gets set when I start the Visual Studio command
 prompt - that's the whole point about using the VS command prompt and
 not a normal one.  I think you get a question about integrating the
 Platform SDK with Visual Studio when you install it - any chance you
 missed that one?

Well, it works in the GUI, so I thought I got that integrated correctly.
One of the deals with the Visual C++ Express thing is that it does not
come with the headers and libraries, and you have to use the Platform
SDK instead.



   I already have ActivePerl and ActivePython installed, so those should
   work out.  I am not really concerned about krb5 and ldap, so as long
   as commenting them out will disable them, that is good.

 You can safely leave LDAP in, because it uses only the builtin
 functionality in the OS and no external dependencies.

 //Magnus

 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq



-- 
"I don't know what you mean by `glory,'" Alice said.
Humpty Dumpty smiled contemptuously.  "Of course you don't --
till I tell you.  I meant `there's a nice knock-down argument for
you!'"
"But glory doesn't mean `a nice knock-down argument,'" Alice
objected.
"When I use a word," Humpty Dumpty said, in a rather scornful
tone, "it means just what I choose it to mean -- neither more nor
less."
"The question is," said Alice, "whether you can make words mean
so many different things."
"The question is," said Humpty Dumpty, "which is to be master --
that's all."
-- Lewis Carroll, Through the Looking Glass

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] src/tools/msvc usage instructions

2006-10-02 Thread Jeremy Drake
On Sun, 1 Oct 2006, Jeremy Drake wrote:

 On Mon, 2 Oct 2006, Magnus Hagander wrote:

  If you do build solution it should build all project sin the correct
  order - there are dependency references set between them that should
  take care of this automatically.

 If I do "build solution" it tells me "Project not selected to build for
 this solution configuration" for all projects, then "55 skipped" at the
 end.

I clicked around a little, selected the postgres project in the project
list, and switched to the release configuration, and now "build
solution" works.  Hmm.


   Do you have any idea how to get the environment to know where
   windows.h is?  I even explicitly added the directory to the INCLUDE
   environment variable, but it did not work.  I will try switching to
   short paths in there in case it is an issue of paths with spaces.

I switched to short paths in the INCLUDE env var, but it seems to just
ignore it.  I'll have to look around for how to deal with this, but for
now perhaps the gui will work ok.  Is there anything that needs to happen
post-compile that may not get done if I use the gui?


--
H. L. Mencken suffers from the hallucination that he is H. L.
Mencken -- there is no cure for a disease of that magnitude.
-- Maxwell Bodenheim

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] src/tools/msvc usage instructions

2006-10-02 Thread Jeremy Drake
On Mon, 2 Oct 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  The errors I got on this file were:
  1>bootparse.tab.c(1065) : error C2449: found '{' at file scope (missing function header?)

 I looked at this.  Line 1065 is the left brace starting yyparse().  On
 my Fedora Core 5 box with Bison 2.1 installed, the stuff leading up to
 it is

 #ifdef YYPARSE_PARAM
 ... some uninteresting stuff, since we don't define YYPARSE_PARAM ...
 #else /* ! YYPARSE_PARAM */
 #if defined (__STDC__) || defined (__cplusplus)
 int
 yyparse (void)
 #else
 int
 yyparse ()

 #endif
 #endif
 {

 But lookee here, your Windows-built version has

 #ifdef YYPARSE_PARAM
 ...
 #else /* ! YYPARSE_PARAM */
 #if defined (__STDC__) || defined (__cplusplus)
 int
 yyparse (void)
 #else
 int
 yyparse ()
 ;
 #endif
 #endif
 {

 So that semicolon is the source of the trouble.  That's clearly a bison
 bug, and in fact digging in Red Hat's SRPM shows that they are carrying
 a patch for it:

 2005-10-05  Paul Eggert  [EMAIL PROTECTED]

   * data/m4sugar/m4sugar.m4 (_m4_map): New macro.
   (m4_map, m4_map_sep): Use it.  Handle the empty list correctly.

snip patch

 Presumably bison 2.2 includes this fix.  But I guess the real question
 is why the devil doesn't MSVC define __STDC__ ?  Are they that far
 removed from spec compliance?

In the bison 2.2 generated code, the #if check is

#if (defined __STDC__ || defined __C99__FUNC__ \
 || defined __cplusplus || defined _MSC_VER)

which looks like they figured out that they needed to check for
Microsoft C explicitly.  I have no idea why Microsoft does not define
__STDC__, however.



   regards, tom lane


-- 
A person is just about as big as the things that make him angry.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] src/tools/msvc usage instructions

2006-10-02 Thread Jeremy Drake
I now get things to compile, but now I get linker errors on any dll which
needs to access symbols from postgres.exe via postgres.lib.  For example:

1>------ Build started: Project: autoinc, Configuration: Release Win32 ------
1>Generate DEF file
1>Not re-generating AUTOINC.DEF, file already exists.
1>Linking...
1>   Creating library Release\autoinc\autoinc.lib and object Release\autoinc\autoinc.exp
1>autoinc.obj : error LNK2019: unresolved external symbol _SPI_modifytuple referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _pfree referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _pg_detoast_datum referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _nextval referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _DirectFunctionCall1 referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _textin referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _SPI_getbinval referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _SPI_gettypeid referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _errfinish referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol ___msvc_errcode referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _errmsg referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _errstart referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _SPI_fnumber referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _MemoryContextAlloc referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _SPI_getrelname referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _elog_finish referenced in function _autoinc
1>autoinc.obj : error LNK2019: unresolved external symbol _elog_start referenced in function _autoinc
1>.\Release\autoinc\autoinc.dll : fatal error LNK1120: 17 unresolved externals

I checked the project properties for linker options and it does list
Release\postgres\postgres.lib in the additional dependencies list.

Any ideas?  Am I missing something?


-- 
A penny saved is ridiculous.



Re: [HACKERS] src/tools/msvc usage instructions

2006-10-03 Thread Jeremy Drake
On Tue, 3 Oct 2006, Magnus Hagander wrote:

 Looks like the gendef script is failing. Check the contents of
 release\postgres\postgres.def - it should have thousands of symbols, but
 I'm willing to bet it's empty...

It contains one word: EXPORTS.  I assume this means it is empty.  What
should I do about it?  Is there something I can check to see why this is
failing?


 //Magnus


-- 
Honesty is the best policy, but insanity is a better defense.



Re: [HACKERS] src/tools/msvc usage instructions

2006-10-03 Thread Jeremy Drake
On Tue, 3 Oct 2006, Magnus Hagander wrote:

   Looks like the gendef script is failing. Check the contents of
   release\postgres\postgres.def - it should have thousands of
  symbols,
   but I'm willing to bet it's empty...
 
  It contains one word: EXPORTS.  I assume this means it is
  empty.  What should I do about it?  Is there something I can
  check to see why this is failing?

 Yup.

 Delete the DEF file and run the gendef command manually (see the project
 file for commandline, IIRC there are no parameters, but just to be
 sure). I'm wondering if you're seeing the same problem as Joachim
 Wieland (off-list conversation) where the output from dumpbin.exe goes
 to the console instead of the pipe in the perl program...

I was just checking this.  I read the gendef script and saw it would
short-circuit if postgres.def existed, so I deleted the file and ran a
build in Visual Studio again, and it printed all kinds of dumpbin output
into the Visual Studio output window, which I remember it did before.
Since you have seen this before, what was the fix (or was there one)?


-- 
It's raisins that make Post Raisin Bran so raisiny ...



Re: [HACKERS] src/tools/msvc usage instructions

2006-10-03 Thread Jeremy Drake
On Tue, 3 Oct 2006, Magnus Hagander wrote:

 Funky.
 Can you try having it run the dumpbin command into a tempfile, and then
 open-and-read that tempfile, to see if that makes a difference?
 (Assuming you know enough perl to do that, of course)

Doing it as
system("dumpbin /symbols $_ > $tmpfn")
still output to the console.

But, I got it to work with the attached patch to the script.  Note the use
of the handy /out:FILE parameter to dumpbin for redirecting the output ;)

Also, I changed the file glob to *.obj from * since I got an error trying
to run dumpbin on BuildLog.htm which is obviously not an object file.
Hopefully this is correct?

-- 
Q:  Why do mountain climbers rope themselves together?
A:  To prevent the sensible ones from going home.

Index: gendef.pl
===================================================================
RCS file: X:\\postgres\\cvsuproot/pgsql/src/tools/msvc/gendef.pl,v
retrieving revision 1.1
diff -c -r1.1 gendef.pl
*** gendef.pl   4 Sep 2006 21:30:40 -0000      1.1
--- gendef.pl   3 Oct 2006 07:20:26 -0000
***************
*** 10,18 ****
  
  print "Generating $defname.DEF from directory $ARGV[0]\n";
  
! while (<$ARGV[0]/*>) {
      print ".";
!     open(F,"dumpbin /symbols $_|") || die "Could not open $_\n";
      while (<F>) {
          s/\(\)//g;
          next unless /^\d/;
--- 10,23 ----
  
  print "Generating $defname.DEF from directory $ARGV[0]\n";
  
! while (<$ARGV[0]/*.obj>) {
      print ".";
!     #open(F,"dumpbin /symbols $_|") || die "Could not open $_\n";
!     s/\//\\/g;
!     system("dumpbin /symbols $_ /out:$_.syms") == 0 or die "Could not dumpbin $_\n";
!     my $tmpfn = "$_.syms";
!     open(F, $tmpfn) || die "Could not open $tmpfn\n";
! 
      while (<F>) {
          s/\(\)//g;
          next unless /^\d/;
***************
*** 31,36 ****
--- 36,42 ----
          push @def, $pieces[6];
      }
      close(F);
+     unlink $tmpfn;
  }
  print "\n";




[HACKERS] buildfarm failures in ECPG-Check

2006-10-03 Thread Jeremy Drake
It looks like something broke the ECPG-Check stage recently.  A number of
buildfarm members are failing.

On Tue, 3 Oct 2006, PG Build Farm wrote:



 The PGBuildfarm member mongoose had the following event on branch HEAD:

 Failed at Stage: ECPG-Check

 The snapshot timestamp for the build that triggered this notification is: 
 2006-10-03 22:30:01

 The specs of this machine are:
 OS:  Gentoo / 1.6.14
 Arch: i686
 Comp: icc / 9.0.032

 For more information, see 
 http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=mongoose&br=HEAD



-- 
Lewis's Law of Travel:
The first piece of luggage out of the chute doesn't belong to
anyone, ever.



[HACKERS] width_bucket function for timestamps

2006-10-08 Thread Jeremy Drake
I just came across this code I wrote about a year ago which implements a
function equivalent to width_bucket for timestamps.

I wrote this when I was trying to plot some data over time, and I had more
points than I needed.  This function allowed me to create a pre-determined
number of bins to average the data inside of so that I could get a sane
number of points.  Part of the problem was that there were so many data
points, that a sql implementation of the function (or plpgsql, I forget,
it was a year ago) was painfully slow.  This C function provided much
better performance than any other means at my disposal.

I wanted to share this code since it may be useful for someone else, but I
don't know exactly what to do with it.  So I am putting it out there, and
asking what the proper home for such a function might be.  I believe it
would be generally useful for people, but it is so small that it hardly
seems like a reasonable pgFoundry project.  Maybe there is a home for such
a thing in the core distribution in a future release?

The code can be found at
http://www.jdrake.com/postgresql/bintimestamp.tar.gz for a buildable PGXS
module, or I attached just the C code.  There is no documentation, the
parameters work the same as the width_bucket function.  The code is not
necessarily the most readable in the world, I was trying to get as much
speed out of it as possible, since I was calling it over a million times
as a group by value.

Thanks for any pointers...

-- 
Fortune's Office Door Sign of the Week:

Incorrigible punster -- Do not incorrige.

/*
 * file:    $RCSfile: bintimestamp.c,v $ $Revision: 1.1 $
 * module:  timestamp
 * authors: jeremyd
 * last mod: $Author: jeremyd $ at $Date: 2005/10/28 20:26:38 $
 *
 * created: Fri Oct 28 13:26:38 PDT 2005
 *
 */

#include <string.h>
#include <math.h>
#include "postgres.h"

#include "fmgr.h"
#include "libpq/pqformat.h"
#include "utils/builtins.h"
#include "funcapi.h"
#include "utils/timestamp.h"

#ifndef JROUND
#   define JROUND(x) (x)
#endif

Datum timestamp_get_bin_size(PG_FUNCTION_ARGS);
Datum timestamp_bin(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(timestamp_get_bin_size);
Datum
timestamp_get_bin_size(PG_FUNCTION_ARGS)
{
    Timestamp start = PG_GETARG_TIMESTAMP(0);
    Timestamp stop = PG_GETARG_TIMESTAMP(1);
    int32 nbuckets = PG_GETARG_INT32(2);
    Interval * retval = (Interval *) palloc(sizeof(Interval));

    if (!retval)
    {
        ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY),
                errmsg("insufficient memory for Interval allocation")));
        PG_RETURN_NULL();
    }

    memset(retval, 0, sizeof(Interval));

    retval->time = JROUND((stop - start) / nbuckets);

    PG_RETURN_INTERVAL_P(retval);
}

PG_FUNCTION_INFO_V1(timestamp_bin);
Datum
timestamp_bin(PG_FUNCTION_ARGS)
{
    /*Timestamp op = PG_GETARG_TIMESTAMP(0);*/
    Timestamp start = PG_GETARG_TIMESTAMP(1);
    /*Timestamp stop = PG_GETARG_TIMESTAMP(2);*/
    Timestamp binsz;
    /*int32 nbuckets = PG_GETARG_INT32(3)*/;

    binsz = (PG_GETARG_TIMESTAMP(2) - start) / PG_GETARG_INT32(3);

    PG_RETURN_TIMESTAMP(JROUND((int)((PG_GETARG_TIMESTAMP(0) - start) /
        binsz) * binsz + start));
}



Re: [HACKERS] width_bucket function for timestamps

2006-10-09 Thread Jeremy Drake
On Mon, 9 Oct 2006, Tom Lane wrote:

 It's not clear to me why we have width_bucket operating on numeric and
 not float8 --- that seems like an oversight, if not outright
 misunderstanding of the type hierarchy.

Would that make the below a lot faster?

 But if we had the float8
 version, I think Jeremy's problem would be solved just by applying
 the float8 version to extract(epoch from timestamp).  I don't really
 see the use-case for putting N versions of the function in there.

I found the function I used before I implemented the C version.  It was
significantly slower, which is why I wrote the C version.

-- given a date range and a number of buckets, round the given date to one
-- of the buckets such that any number of dates within the date range passed
-- in to this function will only return up to the number of buckets unique
-- values
CREATE OR REPLACE FUNCTION date_width_bucket
(tm TIMESTAMP WITHOUT TIME ZONE,
low TIMESTAMP WITHOUT TIME ZONE,
high TIMESTAMP WITHOUT TIME ZONE,
nbuckets INTEGER
) RETURNS TIMESTAMP WITHOUT TIME ZONE AS $$
SELECT ((EXTRACT(epoch FROM $3) - EXTRACT(epoch FROM $2)) / $4) *
(width_bucket(EXTRACT(epoch FROM $1)::NUMERIC,
EXTRACT(epoch FROM $2)::NUMERIC,
EXTRACT(epoch FROM $3)::NUMERIC,
$4)
- 1) * '1 second'::INTERVAL + $2;
$$ LANGUAGE sql IMMUTABLE STRICT;


-- 
I don't think they could put him in a mental hospital.  On the other
hand, if he were already in, I don't think they'd let him out.



Re: [HACKERS] Patch queue concern

2007-03-28 Thread Jeremy Drake
On Wed, 28 Mar 2007, Simon Riggs wrote:

 On Wed, 2007-03-28 at 17:12 -0400, Bruce Momjian wrote:

 If everybody knows where everybody stands then we'll all be better off.
 There may be other dependencies that need resolution, or last minute
 decisions required to allow authors to finish.

Wasn't this the purpose of the wiki page that was set up?  I notice it has
not been updated in a while...

http://developer.postgresql.org/index.php/Todo:WishlistFor83

-- 
If the aborigine drafted an IQ test, all of Western civilization would
presumably flunk it.
-- Stanley Garn



Re: [HACKERS] tsearch2 in 8.3

2007-04-25 Thread Jeremy Drake
On Tue, 24 Apr 2007, Bruce Momjian wrote:

 Naz Gassiep wrote:
  A few of us on IRC were wondering what the status of tsearch2 is in 8.3 ?
  Was it decided to include it in core or did we decide to keep FTS as a
  plugin?
  Some brief comments from anyone on the inside of the whole FTS issue
  would be greatly appreciated by us mere end users.
  Regards,

 The patch is in the patch queue and we will try to get it into 8.3.

Let me just say, that for me this is the most anticipated feature for 8.3.
Along with the patch to allow the database owner to create trusted PLs,
this will allow me to move all but one of my databases to my hosting
provider's PostgreSQL instance from my own instance running in my home
directory (the one I cannot move also requires dblink).  I can only
imagine there are other users out there in similar circumstances to mine.
I was lucky enough to find a hosting provider with shell access where I
can run a postgres instance and that I already had the know-how to do so.
Without running my own instance, my only other option was to choose the
lesser of two evils: do without FTS, or use MySQL. ;)

Sorry for the rant, I just wanted to make sure that people knew that this
is not just cosmetic, or a restructure for its own sake, but will actually
help real world users.


-- 
The cow is nothing but a machine which makes grass fit for us people to
eat.
-- John McNulty



[HACKERS] buildfarm failures after pgstat patch

2007-05-26 Thread Jeremy Drake
The buildfarm appears to be failing after the recent pgstat patch.

The failure seems to be caused by this failed assertion, which appears to
occur fairly consistently in the ECPG tests, in the postmaster log:

TRAP: FailedAssertion("!(entry->trans == 0L)", File: "pgstat.c", Line: 696)


-- 
Disco is to music what Etch-A-Sketch is to art.



Re: [HACKERS] ERROR: index row size

2007-06-03 Thread Jeremy Drake
Just glancing at this, a couple things stand out to me:

On Mon, 4 Jun 2007, Rodrigo Sakai wrote:

 Datum
 periodo_in(PG_FUNCTION_ARGS)
 {
 char       *str = PG_GETARG_CSTRING(0);
 char        tvi_char[MAXDATEFIELDS];
 char        tvf_char[MAXDATEFIELDS];

 tvi_char = (char *) palloc(strlen(MAXDATEFIELDS));

What are you doing here?  This is completely broken.  I think you meant to
say:

char *tvi_char;

tvi_char = palloc(MAXDATEFIELDS);

Or:

char tvi_char[MAXDATEFIELDS];

and no palloc.

 tvf_char = (char *) palloc(strlen(MAXDATEFIELDS));

Same as above.


 Periodo *result;

 if (sscanf(str, " ( %s , %s )", tvi_char, tvf_char) != 2)

This is asking for trouble if arbitrary input can be fed to this.

 ereport(ERROR,
 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 errmsg("invalid input syntax for periodo: \"%s\"", str)));

 result->tvi = StringToDateADT(tvi_char);
 result->tvi = StringToDateADT(tvf_char);

 result = (Periodo *) palloc(sizeof(Periodo));

 if (result->tvi > result->tvf)
 ereport(ERROR,
 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 errmsg("Initial date (TVi) must be smaller than final date
 (TVf)")));

 PG_RETURN_POINTER(result);
 }

   Please help me!

   Thanks in advance!

Hope this helps.
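
For reference, here is roughly how I would structure the whole function.
This is an untested sketch: it assumes your Periodo struct and
StringToDateADT() helper from the post above, and it also allocates
result before storing into it and fixes the tvf assignment, both of
which the code above gets wrong.

/* untested sketch; Periodo and StringToDateADT() are assumed from the
 * original post.  The %24s widths assume MAXDATEFIELDS == 25; keep
 * them smaller than the buffer size. */
Datum
periodo_in(PG_FUNCTION_ARGS)
{
    char       *str = PG_GETARG_CSTRING(0);
    char        tvi_char[MAXDATEFIELDS];
    char        tvf_char[MAXDATEFIELDS];
    Periodo    *result;

    /* bounded %s conversions so arbitrary input cannot overflow */
    if (sscanf(str, " ( %24s , %24s )", tvi_char, tvf_char) != 2)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                 errmsg("invalid input syntax for periodo: \"%s\"", str)));

    /* allocate the result before storing into it */
    result = (Periodo *) palloc(sizeof(Periodo));
    result->tvi = StringToDateADT(tvi_char);
    result->tvf = StringToDateADT(tvf_char);

    if (result->tvi > result->tvf)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                 errmsg("initial date (TVi) must be smaller than final date (TVf)")));

    PG_RETURN_POINTER(result);
}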

-- 
My love, he's mad, and my love, he's fleet,
And a wild young wood-thing bore him!
The ways are fair to his roaming feet,
And the skies are sunlit for him.
As sharply sweet to my heart he seems
As the fragrance of acacia.
My own dear love, he is all my dreams --
And I wish he were in Asia.
-- Dorothy Parker



[HACKERS] is_array_type vs type_is_array

2007-06-06 Thread Jeremy Drake
Was there some change in functionality reason for renaming is_array_type
to type_is_array?  It broke compilation of fulldisjunctions, which I build
and run regression tests on in my sandbox to keep it getting too horribly
broken with respect to current HEAD.  I got it to build and pass its
regression tests by adding this:

+ #if !defined(is_array_type) && defined(type_is_array)
+ #define is_array_type(x) type_is_array(x)
+ #endif

to the beginning of the one file which uses is_array_type.  Is this
reasonable to send back to the fulldisjunctions maintainer, or is there
some subtle change that prompted the name change to make uses of this
function immediately apparent?


-- 
Ducharme's Axiom:
If you view your problem closely enough you will recognize
yourself as part of the problem.



Re: [HACKERS] is_array_type vs type_is_array

2007-06-07 Thread Jeremy Drake
On Thu, 7 Jun 2007, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  Was there some change in functionality reason for renaming is_array_type
  to type_is_array?

 Just to sync style with type_is_enum ... there were more of the latter
 than the former.

OK, so it is safe to just #define one to the other, right?

  It broke compilation of fulldisjunctions,

 Sorry, but we change internal APIs every day, and twice on Sundays.
 Deal with it.

This is why I build fulldisjunctions in my sandbox, because when I decided
to try it out one time, I could not get it to compile.  After much effort
getting it happy with HEAD and sending the changes back to the maintainer
of fulldisjunctions, I thought a good thing for me to contribute is to
make sure it continues to compile cleanly against HEAD and send patches
when it breaks.  I just wanted to make sure that the functionality of this
function did not change in some way that I did not see before sending a
patch to the maintainer of fulldisjunctions.  "Deal with it" was not the
response I was expecting, as that is exactly what I am trying to do ;)


-- 
It is generally agreed that Hello is an appropriate greeting because
if you entered a room and said Goodbye, it could confuse a lot of
people.
-- Dolph Sharp, I'm O.K., You're Not So Hot



Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL

2007-06-16 Thread Jeremy Drake
On Sat, 16 Jun 2007, Michael Fuhr wrote:

 A message entitled Having Fun With PostgreSQL was posted to Bugtraq
 today.  I haven't read through the paper yet so I don't know if the
 author discusses security problems that need attention or if the
 article is more like a compilation of Stupid PostgreSQL Tricks.

 http://www.securityfocus.com/archive/1/471541/30/0/threaded

The crux of this seems to be two-fold:
1. If dblink is installed, an untrusted user could use it to gain
privileges, either by using trust/ident auth (you have a superuser named
after the account the postmaster is running as), or by scripting a
brute-force attack on passwords.
2. If you are a superuser, you can gain access to the external system, ie,
by creating C language functions.

Neither of these are news to me, but maybe some new postgres admin will
read it and figure out to disable trust auth and not to let untrusted
users call dblink (either not install it or REVOKE the rights to call it).



-- 
Around computers it is difficult to find the correct unit of time to
measure progress.  Some cathedrals took a century to complete.  Can you
imagine the grandeur and scope of a program that would take as long?
-- Epigrams in Programming, ACM SIGPLAN Sept. 1982



Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL

2007-06-26 Thread Jeremy Drake
On Tue, 26 Jun 2007, Tom Lane wrote:

 Gregory Stark [EMAIL PROTECTED] writes:
  All that really has to happen is that dblink should by default not be
  callable by any user other than Postgres.

 Yeah, that is not an unreasonable change.  Someone suggested it far
 upthread, but we seem to have gotten distracted :-(

An idea came to me while thinking about this.  The particular use-case
that I use dblink for is connecting to another database in the same
database cluster.  ISTM (without looking at any code) that the postmaster
could keep track of which user is properly authenticated in each backend,
and when a new connection originates from one of those backends, allow it
to connect as that backend's user.

I had a couple ideas about this:
1. If you can tell what process is connecting on a local socket, store a
mapping of pid to userid in the postmaster shmem and if a connection is
originating from a pid in this table and is attempting to authenticate as
the corresponding userid, allow it.

2. If you cannot tell what process is connecting on a local socket (which
I suspect you cannot do portably), generate a random token, stash it in
shared memory mapped to a userid, and then on authentication send this
token to the postmaster to prove that you have already authenticated.
The downside is that an exploit letting a non-privileged user read
arbitrary postgres memory could be parlayed into the privileges of any
logged-on user, but the best idea is to not have that kind of bug ;)


I know this is not the time for thinking about such things, but it may be
an idea for 8.4...

-- 
It's really quite a simple choice: Life, Death, or Los Angeles.



Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL

2007-06-28 Thread Jeremy Drake
On Tue, 26 Jun 2007, Andrew Dunstan wrote:

 Jeremy Drake wrote:

  2. If you cannot tell what process is connecting on a local socket (which
  I suspect you cannot portably),


 See ident_unix() in hba.c.

 It might not be 100% portable but I think it's fairly close for platforms
 that actually have unix sockets.

It looks to me (looking at docs on the various functions used there) that
only Linux supports getting the PID of the connecting process.  The other
various *BSD methods tend only to give the uid and gid, which will not be
helpful if the connection is coming from another backend in the same
cluster.

In the linux case, it looks like one would need to get the client pid, try
to get the PGPROC entry for it, if it exists get the roleid out of that
and allow connections as that role.

For any other case, some sort of painful protocol hack would be in order.
The best way I can see is to see if the client process is owned by the
same user as the database cluster, and if so send an auth request (like
the SCM_CRED one), which would be responded to with the pid and a random
sequence stored in the PGPROC entry.  The server then proves the backend
really is the one it claims to be by looking up the PID's PGPROC entry,
and making sure the token matches.

This is all just thinking out loud, of course...  I have no plans to
implement this in the short-term, but it may be an interesting project in
the future.




-- 
I like to believe that people in the long run are going to do more to
promote peace than our governments.  Indeed, I think that people want
peace so much that one of these days governments had better get out of
the way and let them have it.
-- Dwight D. Eisenhower



Re: [HACKERS] compiler warnings on the buildfarm

2007-07-12 Thread Jeremy Drake
On Thu, 12 Jul 2007, Stefan Kaltenbrunner wrote:

  What would probably be useful if you want to pursue this is to filter
  out the obvious spam like statement-not-reached, and see what's left.


I had gone through and looked at the warnings on mongoose before, but I am
running it against the current code now.  Let me know if you want line
numbers on any of these...

 count | msgtype | msgno |                   msg
-------+---------+-------+------------------------------------------
   552 | warning |  1292 | attribute "warn_unused_result" ignored

This is due to perl headers, so don't worry about this one

    77 | warning |   188 | enumerated type mixed with another type
    16 | warning |   186 | pointless comparison of unsigned integer with zero
     9 | warning |   167 | argument of type "int *" is incompatible with parameter of type "socklen_t={__socklen_t={unsigned int}} *restrict"
     2 | warning |   300 | const variable "all_zeroes" requires an initializer
     1 | warning |   556 | a value of type "void *" cannot be assigned to an entity of type "rl_completion_func_t *"
(6 rows)



-- 
Give thought to your reputation.  Consider changing name and moving to
a new town.



Re: [HACKERS] [EMAIL PROTECTED]: Re: [GENERAL] array_to_set functions]

2007-08-07 Thread Jeremy Drake
On Tue, 7 Aug 2007, Decibel! wrote:

 ISTM that having a built-in array_to_set function would be awfully
 useful... Is the aggregate method below an acceptable way to do it?

Umm, the array_to_set function is not an aggregate.  Personally, when I
need this functionality, I use this function conveniently present in the
default install:

select * from information_schema._pg_expandarray(ARRAY['foo', 'bar', 'baz']);
  x  | n
-+---
 foo | 1
 bar | 2
 baz | 3
(3 rows)


Not exactly well documented or well known, but it works.


 - Forwarded message from Merlin Moncure [EMAIL PROTECTED] -
 On 8/3/07, Guy Fraser [EMAIL PROTECTED] wrote:
  On Wed, 2007-08-01 at 07:14 +0530, Merlin Moncure wrote:
   On 8/1/07, Decibel! [EMAIL PROTECTED] wrote:
David Fetter and I just came up with these, perhaps others will find
them useful:
   
CREATE OR REPLACE FUNCTION array_to_set(anyarray, int) RETURNS SETOF 
anyelement LANGUAGE SQL AS $$
SELECT $1[i] from generate_series(array_lower($1, $2), 
array_upper($1, $2)) i
$$;
CREATE OR REPLACE FUNCTION array_to_set(anyarray) RETURNS SETOF 
anyelement LANGUAGE SQL AS $$
SELECT array_to_set($1, 1)
$$;
  
   very nice, although IMO there is a strong justification for these
   functions to be in core and written in C for efficiency (along with
   array_accum, which I have hand burn from copying and pasting out of
   the documentation).
  
   merlin
  
  Excellent timing guys. :^)
 
  I was trying to build a function to list the items of an array, but
  ran into problems and was going to post what I had been working on.
 
  Your functions work great.
 
  In case you don't have the function to generate an array from a set
  here is one I have been using :
 
 
  CREATE AGGREGATE array_accum (
  BASETYPE = anyelement,
  SFUNC = array_append,
  STYPE = anyarray,
  INITCOND = '{}'
  );

 I think that's what just about everyone uses.  Unfortunately the
 reverse of the function (array_to_set above) AFAIK does not map
 directly to the C array API.

 merlin



 - End forwarded message -



-- 
Mollison's Bureaucracy Hypothesis:
If an idea can survive a bureaucratic review and be implemented
it wasn't worth doing.



[HACKERS] buildfarm failure after ICC configure change

2007-09-11 Thread Jeremy Drake
I just saw that my buildfarm member (running ICC 9.0 on linux) failed
after the latest change to configure

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mongoosedt=2007-09-11%2020:45:01

I was the one who sent in the first patch to configure to add the check
for ICC, and as I recall at the time, the docs said -fno-strict-aliasing
was the default (which is why I said "not sure if this is needed" in the
comment).  I don't know what -fno-alias does, but I think it means that
the program does not do aliasing at all.  The docs say "You must
specify -fno-alias if you do not want aliasing to be assumed in the
program."  The true equivalent of -fno-strict-aliasing is -ansi-alias-,
"disable use of ANSI aliasing rules in optimizations".  But this is the
default...



-- 
Help me, I'm a prisoner in a Fortune cookie file!



Re: [HACKERS] buildfarm failure after ICC configure change

2007-09-11 Thread Jeremy Drake
On Wed, 12 Sep 2007, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  I just saw that my buildfarm member (running ICC 9.0 on linux) failed
  after the latest change to configure

 Argh!  Can someone quote chapter and verse from the ICC manual about
 this?  I was just following what Sergey said was the approved spelling
 of the switch ...

The docs are not particularly clear about these options, at least not in
9.0 (which is the version I have).  I figured -ansi-alias- was what
-fno-strict-aliasing meant: that the gcc people decided to follow the ISO
C standard strictly unless you gave -fno-strict-aliasing, which could
result in additional optimization.  This seems to match what the docs say
for -ansi-alias, but with a different default.

The -fno-alias seems to go entirely the other way, saying you are not
going to be doing any aliasing, standard permitted or otherwise, so it can
optimize better accordingly.

Here are the docs for the two options.

==

ansi-alias, Qansi-alias

Enable use of ANSI aliasing rules in optimizations.

Syntax
Linux:   -ansi-alias
 -ansi-alias-
Windows: /Qansi-alias
 /Qansi-alias-

Default
-ansi-alias-    Disable use of ANSI aliasing rules in optimizations.

This option tells the compiler to assume that the program adheres to ISO C
Standard aliasability rules.

If your program adheres to these rules, then this option allows the
compiler to optimize more aggressively. If it doesn't adhere to these
rules, then it can cause the compiler to generate incorrect code.


==

falias

Specifies that aliasing should be assumed in the program.

Syntax
Linux:   -falias
 -fno-alias
Windows: None

Default
-falias Aliasing is assumed in the program.

This option specifies that aliasing should be assumed in the program.

You must specify -fno-alias if you do not want aliasing to be assumed in
the program.

Alternate Options

Linux: None

Windows: /Oa


-- 
Remember:  Silly is a state of Mind, Stupid is a way of Life.
-- Dave Butler



Re: [HACKERS] Build farm failure

2007-10-01 Thread Jeremy Drake
On Tue, 2 Oct 2007, Gregory Stark wrote:

 (we don't seem to have a recent icc ia32 build farm member).

Sorry about that, my buildfarm member (mongoose) is down with hardware
problems, and probably will be for the foreseeable future.  For some
reason, it suddenly decided to stop recognizing its RAID card...

-- 
In the beginning was the word.
But by the time the second word was added to it,
there was trouble.
For with it came syntax ...
-- John Simon



Re: [HACKERS] modules

2008-04-03 Thread Jeremy Drake
On Thu, 3 Apr 2008, Peter Eisentraut wrote:

 Am Donnerstag, 3. April 2008 schrieb Andrew Dunstan:
  If this were at all true we would not not have seen the complaints from
  people along the lines of My ISP won't install contrib. But we have,
  and quite a number of times. We have concrete evidence that calling it
  contrib actually works against us.

 ISPs also won't install additional Perl modules, for example.  Yet, CPAN does
 exist successfully.

ISPs don't necessarily HAVE to install additional perl modules.  If I have
my own home directory and shell access, I can run perl Makefile.PL
PREFIX=/home/myuser/perlstuff, and just tweak PERL5LIB (or use lib) and I
can install modules without any superuser intervention.

This is where the CPAN comparison breaks down.  I can install any perl
module I want (native perl or even XS/C modules) without superuser
privileges.  With postgres, super user privileges are REQUIRED to install
any module, whatever it is called (contrib, modules, pgfoundry, gborg)...

IMHO, this is the Achilles heel of Postgres extensibility.  Look at this
library of plugins out there that do all of these nifty things, and if you
can't find one that fits your needs, you can always write a little C code
to do the job exactly how you want.  Too bad you can't use them if you
can't afford your own dedicated database server instance...

This was the most frustrating thing for me as a developer.  I know that
there are all of these fine modules out there, and I even have a few of my
own.  I have been spoiled by the extensibility of Postgres, only to have
it taken away when I want to move my databases from my own machine into
production on the hosting provider.


If I want to put geographical data in a database, I know PostGIS is out
there, but I can't install it.  I could use cube/earthdistance, but I
can't install that either.  So much for the geographical data.  How about
text search?  Nope, can't have that either, at least until 8.3 finds its
way into OpenBSD ports and the hosting provider gets around to installing
it.  At least I have that to look forward to.

My opinion is, it doesn't matter what you call the modules/contrib stuff
if I can't use it; and I can't use it if it is not loaded in my
database, and I can't load it without superuser privileges.


-- 
Never put off till tomorrow what you can avoid all together.



Re: [HACKERS] What in the world is happening on spoonbill?

2008-05-17 Thread Jeremy Drake
On Sat, 17 May 2008, Tom Lane wrote:

 Does anyone know how to get the child
 process exit status on Windows?

GetExitCodeProcess, if you've got the process handle handy (which I assume
you do, since you most likely were calling one of the WaitFor...Object
family of functions).

http://msdn.microsoft.com/en-us/library/ms683189(VS.85).aspx
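
Something like this minimal sketch (not actual backend code; it assumes
you kept the HANDLE that CreateProcess() returned, and error checking
is omitted):

#include <windows.h>
#include <stdio.h>

/* reap a child's exit status on Windows */
static void
report_child_exit(HANDLE child)
{
    DWORD       code;

    /* block until the child terminates */
    WaitForSingleObject(child, INFINITE);

    /* unlike Unix wait(), the status stays readable until the last
     * handle to the process object is closed */
    if (GetExitCodeProcess(child, &code))
        printf("child exited with code %lu\n", (unsigned long) code);

    CloseHandle(child);
}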


   regards, tom lane



-- 
Then a man said: Speak to us of Expectations.

He then said: If a man does not see or hear the waters of the Jordan,
then he should not taste the pomegranate or ply his wares in an open
market.

If a man would not labour in the salt and rock quarries then he should
not accept of the Earth that which he refuses to give of himself.

Such a man would expect a pear of a peach tree.
Such a man would expect a stone to lay an egg.
Such a man would expect Sears to assemble a lawnmower.
-- Kehlog Albran, The Profit



Re: [HACKERS] Extending varlena

2008-08-19 Thread Jeremy Drake
On Mon, 18 Aug 2008, Tom Lane wrote:

 What would make more sense is to redesign the large-object stuff to be
 somewhat modern and featureful, and provide stream-access APIs (think
 lo_read, lo_seek, etc) that allow offsets wider than 32 bits.

A few years ago, I was working on such a project for a company I used to
work for.  The company changed directions shortly thereafter, and the
project was dropped, but perhaps the patch might still be useful as a
starting point for someone else.

The original patch is
http://archives.postgresql.org/pgsql-hackers/2005-09/msg01026.php, and the
advice I was working on implementing was in
http://archives.postgresql.org/pgsql-hackers/2005-09/msg01063.php

I am attaching the latest version of the patch I found around.  As it was
almost 3 years ago, I am a little fuzzy on where I left off, but I do
remember that I was trying to work through the suggestions Tom Lane gave
in that second linked email.  I would recommend discarding the libpq
changes, since that seemed to not pass muster.

Note that this patch was against 8.0.3.  There only seem to be a few
issues applying it to the current head, but I haven't really dug into them
to see how difficult it would be to update.  Luckily, the large object
code is fairly slow-moving, so there aren't too many conflicts.  One thing
I did notice is that it looks like someone extracted one of the functions
I wrote in this patch and applied it as a 32-bit version.  Good for them.
I'm glad someone got some use out of this project, and perhaps more use
will come of it.



-- 
At the source of every error which is blamed on the computer you will
find at least two human errors, including the error of blaming it on
the computer.

diff -Nur postgresql-8.0.3-orig/src/backend/libpq/be-fsstubs.c postgresql-8.0.3/src/backend/libpq/be-fsstubs.c
--- postgresql-8.0.3-orig/src/backend/libpq/be-fsstubs.c	2004-12-31 13:59:50.0 -0800
+++ postgresql-8.0.3/src/backend/libpq/be-fsstubs.c	2005-10-03 11:43:36.0 -0700
@@ -233,6 +233,34 @@
 	PG_RETURN_INT32(status);
 }
 
+
+Datum
+lo_lseek64(PG_FUNCTION_ARGS)
+{
+	int32		fd = PG_GETARG_INT32(0);
+	int64		offset = PG_GETARG_INT64(1);
+	int32		whence = PG_GETARG_INT32(2);
+	MemoryContext currentContext;
+	int64		status;
+
+	if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("invalid large-object descriptor: %d", fd)));
+		PG_RETURN_INT64(-1);
+	}
+
+	Assert(fscxt != NULL);
+	currentContext = MemoryContextSwitchTo(fscxt);
+
+	status = inv_seek(cookies[fd], offset, whence);
+
+	MemoryContextSwitchTo(currentContext);
+
+	PG_RETURN_INT64(status);
+}
+
 Datum
 lo_creat(PG_FUNCTION_ARGS)
 {
@@ -283,6 +311,165 @@
 	PG_RETURN_INT32(inv_tell(cookies[fd]));
 }
 
+
+Datum
+lo_tell64(PG_FUNCTION_ARGS)
+{
+	int32		fd = PG_GETARG_INT32(0);
+
+	if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("invalid large-object descriptor: %d", fd)));
+		PG_RETURN_INT64(-1);
+	}
+
+	/*
+	 * We assume we do not need to switch contexts for inv_tell. That is
+	 * true for now, but is probably more than this module ought to
+	 * assume...
+	 */
+	PG_RETURN_INT64(inv_tell(cookies[fd]));
+}
+
+Datum
+lo_length(PG_FUNCTION_ARGS)
+{
+	int32		fd = PG_GETARG_INT32(0);
+	int32		sz = 0;
+	MemoryContext currentContext;
+
+	if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("invalid large-object descriptor: %d", fd)));
+		PG_RETURN_INT32(-1);
+	}
+	Assert(fscxt != NULL);
+	currentContext = MemoryContextSwitchTo(fscxt);
+
+	sz = inv_length(cookies[fd]);
+
+	MemoryContextSwitchTo(currentContext);
+
+	PG_RETURN_INT32(sz);
+}
+
+Datum
+lo_length64(PG_FUNCTION_ARGS)
+{
+	int32		fd = PG_GETARG_INT32(0);
+	int64		sz = 0;
+	MemoryContext currentContext;
+
+	if (fd < 0 || fd >= cookies_size || cookies[fd] == NULL)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("invalid large-object descriptor: %d", fd)));
+		PG_RETURN_INT64(-1);
+	}
+	Assert(fscxt != NULL);
+	currentContext = MemoryContextSwitchTo(fscxt);
+
+	sz = inv_length(cookies[fd]);
+
+	MemoryContextSwitchTo(currentContext);
+
+	

Re: [HACKERS] Open items for 8.3

2007-11-05 Thread Jeremy Drake
On Mon, 5 Nov 2007, Gregory Stark wrote:

 How many developers have even jumped through the hoops to get wiki accounts?

According to
http://developer.postgresql.org/index.php?title=Special:Listusers&group=pgdev&limit=500

there are currently 51 members of the group pgdev on the wiki.


-- 
Spare no expense to save money on this one.
-- Samuel Goldwyn



[HACKERS] Re: [COMMITTERS] pgsql: GIN index build's allocatedMemory counter needs to be long, not

2007-11-18 Thread Jeremy Drake
On Fri, 16 Nov 2007, Tom Lane wrote:

 GIN index build's allocatedMemory counter needs to be long, not uint32.
 Else, in a 64-bit machine with maintenance_work_mem set to above 4Gb,
 the counter overflows

I don't know if this has been discussed before, but you are aware that it
is not dictated by the C standard that sizeof(long) == sizeof(void*)?
The best counter-example I know is Windows x64, where sizeof(long) == 4
while sizeof(void*) == 8.  The standards-compliant way to deal with this
IIRC is using size_t or ptrdiff_t, depending on whether or not you need it
to be signed.
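
A trivial standalone illustration of the trap (names here are just for
the example; on LLP64 Windows x64 the first line prints 4 while the
other two print 8):

#include <stdio.h>
#include <stddef.h>

int
main(void)
{
    /* a byte counter sized to the address space, not to long */
    size_t      allocatedMemory;

    printf("sizeof(long)   = %lu\n", (unsigned long) sizeof(long));
    printf("sizeof(void *) = %lu\n", (unsigned long) sizeof(void *));
    printf("sizeof(size_t) = %lu\n", (unsigned long) sizeof(size_t));

    /* 5GB fits in a 64-bit size_t but would wrap a 32-bit long */
    allocatedMemory = (size_t) 5 * 1024 * 1024 * 1024;
    printf("counter        = %.0f bytes\n", (double) allocatedMemory);

    return 0;
}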

Sorry if this has been discussed before, but this commit just struck me as
someone who has just been working at porting some software to Win64...



Re: [HACKERS] The question of LOCALE at the time of a regression test.

2008-02-08 Thread Jeremy Drake
On Sat, 9 Feb 2008, Hiroshi Saito wrote:

 Um, I was flipped off by you

You shouldn't go around flipping people off: it's rude :)
http://www.merriam-webster.com/dictionary/flip%20off



Re: [HACKERS] Severe regression in autoconf 2.61

2008-02-18 Thread Jeremy Drake
On Mon, 18 Feb 2008, Tom Lane wrote:

 There seems to have been a bit of a brain cramp upstream :-(.
 Previously, AC_FUNC_FSEEKO did this to test if fseeko was available:

   return !fseeko;

 Now it does this:

   return fseeko (stdin, 0, 0) && (fseeko) (stdin, 0, 0);

 Unfortunately, that gives the compiler enough of a syntactic clue
 to guess that fseeko is probably an undeclared function, and therefore
 *it will not error out*, only generate a warning, if it's not seen
 a declaration for fseeko.


So that's what that was.  I had the same problem in another project I was
working on (which I used some PostgreSQL configure code in).

I had to add this in the gcc section of configure:
PGAC_PROG_CC_CFLAGS_OPT([-Werror-implicit-function-declaration])

But it would be nice to find a better fix.  I don't understand how calling
a function that has not been defined yet is ever not an error.
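
To make the failure mode concrete, here is essentially what the new
probe boils down to (a paraphrase, not the verbatim autoconf output):

/* with no declaration of fseeko in scope, a C89 compiler implicitly
 * declares it as returning int and only warns, so this still compiles
 * and links -- which is why -Werror-implicit-function-declaration
 * makes the misdetection go away */
int
main(void)
{
    /* deliberately no #include <stdio.h>: fseeko is undeclared here */
    return fseeko(0, 0, 0) && (fseeko)(0, 0, 0);
}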


-- 
In 1915 pancake make-up was invented but most people still preferred
syrup.



Re: [HACKERS] Including PL/PgSQL by default

2008-02-22 Thread Jeremy Drake
On Fri, 22 Feb 2008, D'Arcy J.M. Cain wrote:

 On Fri, 22 Feb 2008 07:37:55 +
 Dave Page [EMAIL PROTECTED] wrote:
  I know I'm gonna regret wading in on this, but in my mind this is akin
  to one of the arguments for including tsearch in the core server -
  namely that too many brain dead hosting providers won't add a contrib
  module or anything else in a customer's database because they don't

 So their clients will go somewhere <PLUG URL="http://www.Vex.Net/" />
 that does understand what they are installing and can support their
 users properly.  How far are we supposed to go to support the clueless?

Being someone on one of these clueless providers, I wrote the patch
(which made it into 8.3) which allows database owners to create trusted
languages.  For me, this was just far enough.  The clueless tend to
"CREATE DATABASE %s OWNER %s", so then I can CREATE LANGUAGE plpgsql if I
want it.  This does not provide any detriment to the clueful, who can
always REVOKE the privilege to create any PL (the patch also added ACL
stuff for this).  And, since the clueful tend to run web apps and such as
non-database owners, if the web app was compromised and the db did not
explicitly load plpgsql, the attacker could not use it.


  understand that just because it's not there by default doesn't mean
  it's in any way second rate. Including pl/pgsql in template1 will help
  those folks who forwhatever reason use such providers, whilst more
  savvy providers can easily disable it post-initdb if thats what they
  want to do.

 And the first time someone uses pl/pgsql to do harm, even if it is due
 to their mis-configuration, who gets blamed?



-- 
The primary theme of SoupCon is communication.  The acronym LEO
represents the secondary theme:

Law Enforcement Officials

The overall theme of SoupCon shall be:

Avoiding Communication with Law Enforcement Officials

-- M. Gallaher



Re: [HACKERS] Postgresql Caching

2006-10-15 Thread Jeremy Drake
On Sun, 15 Oct 2006, [EMAIL PROTECTED] wrote:

 On Sun, Oct 15, 2006 at 08:31:36PM +0530, Merlin Moncure wrote:
  On 10/15/06, Anon Mous [EMAIL PROTECTED] wrote:
   Would it be possible to combine a special memcache implementation of
   memcache with a Postgresql interface wrapper?
  have you seen
  http://people.freebsd.org/~seanc/pgmemcache/

 Interesting. I note that they don't address the view consistency
 problem any better than an application using memcached directly.
 And that's the real problem with memcached, and why people are
 tempted to 'indulge' by relying on PostgreSQL. Some people value
 the consistency. Others don't. memcached, whether application-side,
 or whether automatically invoked by triggers (pgmemcache) is a
 decision to ignore the consistency.

 Using memcache, I've had problems with consistency brought right to
 the front. Both of these have failed me:

 1) When updating a PostgreSQL record, I invalidate the memcache record.
If another process comes along in parallel before I commit, notices
that the memcache record is invalidated, it queries the data from
SQL, and updates the memcache record back to the old value. :-(

 2) When updating a PostgreSQL record, I updated the memcache record
to the new value. If another process comes along in parallel before
I commit, that is still looking at an older view, cross-referencing
may not work as expected.

Shouldn't you be able to use 2-stage commit for this?  Prepare to commit,
update the memcache record, then commit?  Or am I thinking of something
else?
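
Something like this ordering is what I had in mind (just a sketch:
memcache_set() is a hypothetical stand-in for a real memcached client
call, the table and key names are made up, and error handling is
omitted):

#include <libpq-fe.h>

/* hypothetical memcached client call */
extern void memcache_set(const char *key, const char *val);

static void
update_with_2pc(PGconn *conn)
{
    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "UPDATE widgets SET qty = qty - 1 WHERE id = 42"));

    /* phase 1: the database guarantees the transaction can commit */
    PQclear(PQexec(conn, "PREPARE TRANSACTION 'widget-42'"));

    /* update the cache only once the commit is guaranteed */
    memcache_set("widget:42", "new value");

    /* phase 2: make it visible and durable */
    PQclear(PQexec(conn, "COMMIT PREPARED 'widget-42'"));
}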

 I'm currently settled on 2), but setting a short timeout (5 seconds) on
 the data. Still an imperfect compromise between speed and accuracy, but
 it isn't causing me problems... yet.


-- 
Fortune's Real-Life Courtroom Quote #32:

Q:  Do you know how far pregnant you are right now?
A:  I will be three months November 8th.
Q:  Apparently then, the date of conception was August 8th?
A:  Yes.
Q:  What were you and your husband doing at that time?



[HACKERS] constraints in query plans

2006-10-15 Thread Jeremy Drake
I set up the following experiment:

CREATE DOMAIN m_or_p AS "char" CHECK (VALUE = 'm' OR VALUE = 'p');

CREATE TABLE test_domain (
 fkey integer not null,
 k    integer not null,
 x1   integer not null,
 x2   integer,
 mp   m_or_p not null
);

CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE mp = 'm';
CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp = 'p';

then added about 375000 rows, half where mp = 'm' and half where mp = 'p'

Now, I do analyze verbose test_domain
jeremyd=# ANALYZE verbose test_domain;
INFO:  analyzing "public.test_domain"
INFO:  "test_domain": scanned 2379 of 2379 pages, containing 375226 live
rows and 0 dead rows; 3000 rows in sample, 375226 estimated total rows
ANALYZE


Now, take a look at this
jeremyd=# explain select * from test_domain where k = 1255;
   QUERY PLAN
-
 Seq Scan on test_domain  (cost=0.00..7069.32 rows=127 width=17)
   Filter: (k = 1255)
(2 rows)

I turn constraint_exclusion on and I still get the same plan.  I tried
adding the domain's constraint to the table as well

ALTER TABLE test_domain ADD CHECK (mp = 'm' OR mp = 'p');

and I still get the same plan.  It seems the constraint is not
incorporated into the plan, since I get a different plan if I include the
constraint in the WHERE clause explicitly:

jeremyd=# explain select * from test_domain where k = 1255 AND (mp = 'm' OR mp 
= 'p');
QUERY PLAN
--
 Bitmap Heap Scan on test_domain  (cost=9.97..423.26 rows=95 width=17)
   Recheck Cond: (((k = 1255) AND ((mp)::"char" = 'm'::"char")) OR ((k = 1255) AND ((mp)::"char" = 'p'::"char")))
   ->  BitmapOr  (cost=9.97..9.97 rows=127 width=0)
         ->  Bitmap Index Scan on test_domain_k_x1_x2_m  (cost=0.00..4.98 rows=60 width=0)
               Index Cond: (k = 1255)
         ->  Bitmap Index Scan on test_domain_k_x1_x2_p  (cost=0.00..4.98 rows=67 width=0)
               Index Cond: (k = 1255)
(7 rows)




And the explain analyze for each:

jeremyd=# explain analyze select * from test_domain where k = 1255 AND (mp = 
'm' OR mp = 'p');
  QUERY PLAN
---
 Bitmap Heap Scan on test_domain  (cost=9.97..423.26 rows=95 width=17) (actual time=0.325..2.397 rows=261 loops=1)
   Recheck Cond: (((k = 1255) AND ((mp)::"char" = 'm'::"char")) OR ((k = 1255) AND ((mp)::"char" = 'p'::"char")))
   ->  BitmapOr  (cost=9.97..9.97 rows=127 width=0) (actual time=0.269..0.269 rows=0 loops=1)
         ->  Bitmap Index Scan on test_domain_k_x1_x2_m  (cost=0.00..4.98 rows=60 width=0) (actual time=0.150..0.150 rows=129 loops=1)
               Index Cond: (k = 1255)
         ->  Bitmap Index Scan on test_domain_k_x1_x2_p  (cost=0.00..4.98 rows=67 width=0) (actual time=0.101..0.101 rows=132 loops=1)
               Index Cond: (k = 1255)
 Total runtime: 3.238 ms
(8 rows)

jeremyd=# explain analyze select * from test_domain where k = 1255;
  QUERY PLAN
---
 Seq Scan on test_domain  (cost=0.00..7069.32 rows=127 width=17) (actual 
time=0.427..125.057 rows=261 loops=1)
   Filter: (k = 1255)
 Total runtime: 125.878 ms
(3 rows)

ISTM that with the constraint_exclusion flag on, it should see from the
constraints that all values but 'm' or 'p' are excluded for the column mp,
and thus the two queries I gave are exactly equivalent.  I noticed that
the docs said it looked at table constraints, so I added the check to the
table constraint as well, but it made no difference.  I'm not sure if this
is a bug or a limitation of the planner, but it seems that these two
queries are equivalent.  I wonder how it would work out with boolean
instead of the char column, it should definitely know that there are
only 2 possible values for a boolean not null column, true or false.

DROP INDEX test_domain_k_x1_x2_p;
DROP INDEX test_domain_k_x1_x2_m;
ALTER TABLE test_domain DROP CONSTRAINT test_domain_mp_check;
ALTER TABLE test_domain ALTER COLUMN mp TYPE boolean USING (CASE WHEN mp = 'm' 
THEN false ELSE true END);
CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp;
CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE NOT mp;

jeremyd=# ANALYZE verbose test_domain;
INFO:  analyzing "public.test_domain"
INFO:  "test_domain": scanned 2379 of 2379 pages, containing 375226 live
rows and 0 dead rows; 3000 rows in sample, 375226 estimated total rows
ANALYZE

jeremyd=# explain analyze select * from test_domain where k = 1255;
QUERY PLAN

Re: [HACKERS] constraints in query plans

2006-10-15 Thread Jeremy Drake
On Sun, 15 Oct 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  CREATE TABLE test_domain (
   fkey integer not null,
    k    integer not null,
   x1   integer not null,
   x2   integer,
   mp   m_or_p not null
  );

  CREATE INDEX test_domain_k_x1_x2_m ON test_domain (k, x1, x2) WHERE mp = 
  'm';
  CREATE INDEX test_domain_k_x1_x2_p ON test_domain (k, x1, x2) WHERE mp = 
  'p';

 Perhaps you need a non-partial index.

I just tried that,
CREATE INDEX test_domain_k_x1_x2_mp ON test_domain (k, x1, x2, mp);

and dropped the others.  That actually works properly.
jeremyd=# explain analyze select * from test_domain where k = 1255 and mp;
QUERY PLAN
--
 Bitmap Heap Scan on test_domain  (cost=5.37..237.21 rows=66 width=17) (actual time=0.115..0.707 rows=132 loops=1)
   Recheck Cond: (k = 1255)
   Filter: mp
   ->  Bitmap Index Scan on test_domain_k_x1_x2_mp  (cost=0.00..5.37 rows=66 width=0) (actual time=0.081..0.081 rows=132 loops=1)
         Index Cond: ((k = 1255) AND (mp = true))
 Total runtime: 1.137 ms
(6 rows)

I thought I had to refer to all of the columns in order for this to work,
that I could not skip some in the middle, but it seems to work.


-- 
If you can survive death, you can probably survive anything.



Re: [HACKERS] Postgresql Caching

2006-10-16 Thread Jeremy Drake
On Mon, 16 Oct 2006, [EMAIL PROTECTED] wrote:

 On Sun, Oct 15, 2006 at 06:33:36PM -0700, Jeremy Drake wrote:
   2) When updating a PostgreSQL record, I updated the memcache record
  to the new value. If another process comes along in parallel before
  I commit, that is still looking at an older view, cross-referencing
  may not work as expected.
  Shouldn't you be able to use 2-stage commit for this?  Prepare to commit,
  update the memcache record, then commit?  Or am I thinking of something
  else?

 Two stage commits makes the window of error smaller, it can't eliminate it.

Right, I was thinking there was still some raciness there.  I think what I
remembered is that if you updated the cache and then the transaction
failed (or rolled back for whatever reason) later on, the cache would have
data that was never committed.  The two-phase commit thing is intended to
deal with that eventuality.  Which is also a possibility for a consistency
issue.


-- 
Oh, I have slipped the surly bonds of earth,
And danced the skies on laughter silvered wings;
Sunward I've climbed and joined the tumbling mirth
Of sun-split clouds and done a hundred things
You have not dreamed of --
Wheeled and soared and swung
High in the sunlit silence.
Hovering there
I've chased the shouting wind along and flung
My eager craft through footless halls of air.
Up, up along delirious, burning blue
I've topped the wind-swept heights with easy grace,
Where never lark, or even eagle flew;
And, while with silent, lifting mind I've trod
The high untrespassed sanctity of space,
Put out my hand, and touched the face of God.
-- John Gillespie Magee Jr., High Flight



[HACKERS] row-wise comparison question/issue

2006-10-20 Thread Jeremy Drake
I noticed something odd when trying to use the row-wise comparison
mentioned in the release notes for 8.2 and in the docs
http://developer.postgresql.org/pgdocs/postgres/functions-comparisons.html#ROW-WISE-COMPARISON

This sets up a suitable test:

create type myrowtype AS (a integer, b integer);
create table myrowtypetable (rowval myrowtype);

insert into myrowtypetable select (a, b)::myrowtype from
generate_series(1,5) a, generate_series(1,5) b;

First I get this error:

select rowval < rowval from myrowtypetable ;
ERROR:  operator does not exist: myrowtype < myrowtype
LINE 1: select rowval < rowval from myrowtypetable ;
                      ^
HINT:  No operator matches the given name and argument type(s). You may
need to add explicit type casts.

OK, I guess I can live with that.  I did create a new type, and there are
no operators for it...

Now, I can do the following (pointless) query
select ROW((rowval).*) < ROW((rowval).*) from myrowtypetable ;

and I get 25 rows of 'f'.  So far so good.

But if I try to do
select rowval from myrowtypetable ORDER BY ROW((rowval).*);
ERROR:  could not identify an ordering operator for type record
HINT:  Use an explicit ordering operator or modify the query.

or even
select rowval from myrowtypetable ORDER BY ROW((rowval).*) USING <;
ERROR:  operator does not exist: record < record
HINT:  No operator matches the given name and argument type(s). You may
need to add explicit type casts.

I know that that less-than operator exists, because I just used it in the
query that worked above.  It seems that ORDER BY just can't find it for
some reason.

Is it supposed to not work in order by?  That doesn't really make sense to
me why order by should be special for this.



-- 
All extremists should be taken out and shot.



Re: [HACKERS] row-wise comparison question/issue

2006-10-20 Thread Jeremy Drake
On Fri, 20 Oct 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  select rowval from myrowtypetable ORDER BY ROW((rowval).*) USING <;
  ERROR:  operator does not exist: record < record

 This isn't required by the spec, and it's not implemented.  I don't
 see that it'd give any new functionality anyway, since you can always
 do ORDER BY rowval.f1, rowval.f2, ...

 The cases that are implemented are comparisons of explicit row
 constructors, eg (a,b,c) < (d,e,f) --- which I think is all
 you'll find support for in the spec.

I just think it is quite unexpected that the operator < is defined in some
places and not in others.  And the way I wrote the order by, it should
have been comparing explicit row constructors (compare the explicitly
constructed row for each rowval in order to sort).  I don't understand how
the operator < in a where clause would be different than the operator <
used by the order by.  If I were to make a custom type in C, and write
these same operators for it, they would work in both places, right?  Why
then would this be any different?


-- 
If someone had told me I would be Pope one day, I would have studied
harder.
-- Pope John Paul I



Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Jeremy Drake
On Mon, 23 Oct 2006, Mark Kirkwood wrote:

 Tom Lane wrote:
 
 
  Yah, I checked.  Several times... but if anyone else wants to repeat
  the experiment, please do.  Or look for bugs in either my test case
  or Gurjeet's.



Just for fun, I tried it out with both GCC and with Intel's C compiler
with some agressive platform-specific flags on my 2.8Ghz Xeon running
Gentoo.

                Std crc         Slice-8 crc

Intel P4 Xeon 2.8Ghz (Gentoo, gcc-3.4.5, -O2)

8192 bytes      4.697572        9.806341
1024 bytes      0.597429        1.181828
64 bytes        0.046636        0.086984

Intel P4 Xeon 2.8Ghz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)

8192 bytes      0.040000        0.001085
1024 bytes      0.040000        0.001292
64 bytes        0.030000        0.001078


So at this point I realize that Intel's compiler is optimizing the loop
away, at least for the std crc and probably for both.  So I make mycrc an
array of 2, and subscript mycrc[j & 1] in the loop.

                Std crc     Slice-8 crc

Intel P4 Xeon 2.8GHz (Gentoo, gcc-3.4.5, -O2)

8192 bytes      51.397146   9.523182
1024 bytes      6.430986    1.229043
64 bytes        0.400062    0.128579

Intel P4 Xeon 2.8GHz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)

8192 bytes      29.881708   0.001432
1024 bytes      3.750313    0.001432
64 bytes        0.238583    0.001431

So it looks like something fishy is still going on with the slice-8 under
the Intel compiler.

I have attached my changed testcrc.c file.


 FWIW - FreeBSD and Linux results using Tom's test program on almost identical
 hardware[1]:

 Std crc Slice-8 crc

 Intel P-III 1.26Ghz (FreeBSD 6.2)

 8192 bytes      12.975314   14.503810
 1024 bytes      1.633557    1.852322
 64 bytes        0.111580    0.206975


 Intel P-III 1.26Ghz (Gentoo 2006.1)


 8192 bytes      12.967997   28.363876
 1024 bytes      1.632317    3.626230
 64 bytes        0.111513    0.326557


 Interesting that the slice-8 algorithm seems to work noticeably better on
 FreeBSD than Linux - but still not as well as the standard one (for these
 tests anyway)...


 Cheers

 Mark

 [1] Both boxes have identical mobos, memory and CPUs (same sspec nos).


 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings



-- 
You can tune a piano, but you can't tuna fish.

#include "postgres.h"

#include <time.h>
#include <sys/time.h>

#include "pg_crc.h"

int
main()
{
	char		buffer[TESTSIZE];
	pg_crc32	mycrc[2];
	int			j;
	struct timeval tstart;
	struct timeval tstop;

	srand(time(NULL));
	for (j = 0; j < TESTSIZE; ++j)
		buffer[j] = (char) (255 * (rand() / (RAND_MAX + 1.0)));

	gettimeofday(&tstart, NULL);

	for (j = 0; j < NTESTS; j++)
	{
		/* alternate between two slots so the compiler can't treat every
		 * iteration as a dead store to the same variable */
		INIT_CRC32(mycrc[j & 1]);
		COMP_CRC32(mycrc[j & 1], buffer, TESTSIZE);
		FIN_CRC32(mycrc[j & 1]);
	}

	gettimeofday(&tstop, NULL);

	/* normalize the difference: borrow a second if needed */
	if (tstop.tv_usec < tstart.tv_usec)
	{
		tstop.tv_sec--;
		tstop.tv_usec += 1000000;
	}

	printf("bufsize = %d, loops = %d, elapsed = %ld.%06ld\n",
		   TESTSIZE, NTESTS,
		   (long) (tstop.tv_sec - tstart.tv_sec),
		   (long) (tstop.tv_usec - tstart.tv_usec));

return 0;
}

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Jeremy Drake
On Mon, 23 Oct 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  So at this point I realize that Intel's compiler is optimizing the loop
  away, at least for the std crc and probably for both.  So I make mycrc an
  array of 2, and subscript mycrc[j & 1] in the loop.

 That's not a good workaround, because making mycrc expensive to access
 means your inner loop timing isn't credible at all.  Instead try making the
 buffer array nonlocal --- malloc it, perhaps.

That did not make any difference.  The way I see it, the only way to
convince the compiler it really needs to do this loop more than once is to
make it think it is not overwriting the same variable every time.  The
subscript was the cheapest way I could think of to do that.  Any other
suggestions on how to do this are welcome.
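One more thing I might try is a volatile sink variable, so that every
iteration ends in a store the compiler is not allowed to delete.  An
untested sketch, using the same macros and variables as the attached
testcrc.c:

	volatile pg_crc32 sink;		/* volatile: these stores can't be removed */

	for (j = 0; j < NTESTS; j++)
	{
		INIT_CRC32(mycrc);
		COMP_CRC32(mycrc, buffer, TESTSIZE);
		FIN_CRC32(mycrc);
		sink = mycrc;			/* forces each iteration's CRC to be kept */
	}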


   regards, tom lane

 ---(end of broadcast)---
 TIP 6: explain analyze is your friend



-- 
I like being single.  I'm always there when I need me.
-- Art Leo

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Jeremy Drake
On Mon, 23 Oct 2006, Tom Lane wrote:

 Hmm.  Maybe store the CRCs into a global array somewhere?

   uint32 results[NTESTS];

   for ...
   {
   INIT/COMP/FIN_CRC32...
   results[j] = mycrc;
   }

 This still adds a bit of overhead to the outer loop, but not much.


That seems to have worked.

                Std crc     Slice-8 crc

Intel P4 Xeon 2.8GHz (Gentoo, gcc-3.4.5, -O2)

8192 bytes      26.765317   10.511143
1024 bytes      3.357843    1.280890
64 bytes        0.223213    0.103767

Intel P4 Xeon 2.8GHz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)

8192 bytes      29.495836   0.007107
1024 bytes      3.708665    0.012183
64 bytes        0.242579    0.008700


So the gcc times are reasonable, but the icc times for the slice-by-8 are
still too fast to be believed.  I will have to take a look at the
generated assembly later and see what gives.

My changed testcrc.c is attached, again.


-- 
I'd love to go out with you, but I did my own thing and now I've got
to undo it.

#include "postgres.h"

#include <time.h>
#include <sys/time.h>

#include "pg_crc.h"

int
main()
{
	volatile char buffer[TESTSIZE];
	pg_crc32	results[NTESTS];
	pg_crc32	mycrc;
	int			j;
	struct timeval tstart;
	struct timeval tstop;

	srand(time(NULL));
	for (j = 0; j < TESTSIZE; ++j)
		buffer[j] = (char) (255 * (rand() / (RAND_MAX + 1.0)));

	gettimeofday(&tstart, NULL);

	for (j = 0; j < NTESTS; j++)
	{
		INIT_CRC32(mycrc);
		COMP_CRC32(mycrc, buffer, TESTSIZE);
		FIN_CRC32(mycrc);
		results[j] = mycrc;		/* keep each CRC live so the loop isn't
								 * optimized away */
	}

	gettimeofday(&tstop, NULL);

	/* normalize the difference: borrow a second if needed */
	if (tstop.tv_usec < tstart.tv_usec)
	{
		tstop.tv_sec--;
		tstop.tv_usec += 1000000;
	}

	printf("bufsize = %d, loops = %d, elapsed = %ld.%06ld\n",
		   TESTSIZE, NTESTS,
		   (long) (tstop.tv_sec - tstart.tv_sec),
		   (long) (tstop.tv_usec - tstart.tv_usec));

return 0;
}

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] plperl/plperlu interaction

2006-10-26 Thread Jeremy Drake
On Thu, 26 Oct 2006, Alvaro Herrera wrote:

 Jeff Trout wrote:
 
  On Oct 26, 2006, at 3:23 PM, Martijn van Oosterhout wrote:
 
  On Thu, Oct 26, 2006 at 03:15:00PM -0400, Andrew Dunstan wrote:
  Perhaps people who use other platforms could look for these flags
  in the
  output of
 perl -e 'use Config qw(myconfig config_sh config_vars config_re);
  print config_sh();'
  
 
  OSX 10.4.8:
 
  usemultiplicity='define'
  usethreads='define'
  useithreads='define'

 Same here on Debian unstable (stock Perl packages).

On my current Gentoo box:
useithreads='undef'
usemultiplicity='undef'
usethreads='undef'

My USE flags have ithreads disabled, since the description of the feature
is "Enable Perl threads, has some compatibility problems".


-- 
Whether you can hear it or not
The Universe is laughing behind your back
-- National Lampoon, Deteriorata

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] beta3 CFLAGS issue on openbsd

2006-11-09 Thread Jeremy Drake
I was trying to compile 8.2beta3 on openbsd, and ran into an interesting
issue.  My account on the particular openbsd box has some restrictive
ulimit settings, so I don't have a lot of memory to work with.  I was
getting an out of memory issue linking postgres, while I did not before.
I figured out that the -g flag was being surreptitiously added to my
CFLAGS.  It was like pulling teeth trying to get the -g flag out.  I tried
--disable-debug to configure, which did not work.  I had to do
CFLAGS=-O2 ./configure ...

Is this a known feature of the betas, meant to get people running with -g
in case things break, or is it a configure bug?

Here is the first bit from configure, note the -g in the using CFLAGS line
at the end.

[EMAIL PROTECTED](~/build/postgres/postgresql-8.2beta3)$ ./configure 
--prefix=/home/jeremyd/progs/pg82 --with-perl --with-openssl --with-pgport=54322
checking build system type... x86_64-unknown-openbsd3.9
checking host system type... x86_64-unknown-openbsd3.9
checking which template to use... openbsd
checking whether to build with 64-bit integer date/time support... no
checking whether NLS is wanted... no
checking for default port number... 54322
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether cc accepts -g... yes
checking for cc option to accept ANSI C... none needed
checking if cc supports -Wdeclaration-after-statement... no
checking if cc supports -Wendif-labels... yes
checking if cc supports -fno-strict-aliasing... yes
configure: using CFLAGS=-O2 -g -pipe -Wall -Wmissing-prototypes -Wpointer-arith 
-Winline -Wendif-labels -fno-strict-aliasing


-- 
It's odd, and a little unsettling, to reflect upon the fact that
English is the only major language in which I is capitalized; in many
other languages You is capitalized and the i is lower case.
-- Sydney J. Harris

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Proposal: syntax of operation with tsearch's configuration

2006-11-17 Thread Jeremy Drake
On Fri, 17 Nov 2006, Tom Lane wrote:

 Peter Eisentraut [EMAIL PROTECTED] writes:
  I don't see any comparable arguments about this full-text search stuff.
  In particular I don't see any arguments why a change would necessary at
  all, including why moving to core would be necessary in the first
  place.

 AFAIR the only argument in favor of that is basically a marketing one:
 users perceive a feature as more real, or more supported, if it's in
 core.  I don't find this argument especially compelling myself.

I am currently in the position that my hosting provider is apprehensive
about installing modules in contrib because they believe they are less
secure.  They cited (real or imagined) security holes as the reason they
would not install tsearch2, or any other contrib module.  This leaves me
without any fulltext indexing option, as it requires a superuser to
install.  I have currently worked around this by running my own postgres
instance from my home directory, as they provide shell access and allow
running background processes, but I was really happy when I heard that
tsearch2 was going to be integrated into core in 8.3.

I think I would settle for an authoritative statement somewhere that the
contrib modules are no less secure than the postgres core, and are fully
supported by the developers.  I think if I could point them at that, I
might be able to convince them that it is safe.


   regards, tom lane



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] psql: core dumped

2006-12-19 Thread Jeremy Drake
On Wed, 20 Dec 2006, Philip Yarra wrote:

 Mario wrote:
Even if you get a core dump every time you press CTRL+\?  Why?

 Try "ulimit -c 0", then run it (you should get no core dump)
 Then "ulimit -c 50", then run it (you should get a core dump)

 SIGQUIT is supposed to dump core. Ulimit settings can suppress generation of
 core files. The difference between your machine and your friend's is likely
 just the ulimit settings.

If you don't want CTRL+\ to generate SIGQUIT, you can redefine which
character does with the stty quit setting.  For instance,

stty quit ^@




-- 
fortune's Contribution of the Month to the Animal Rights Debate:

I'll stay out of animals' way if they'll stay out of mine.
Hey you, get off my plate
-- Roger Midnight

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
I adjusted my buildfarm config (mongoose) to attempt to build HEAD
--with-libxml.  I added the following to build-farm.conf:

if ($branch eq 'HEAD' || $branch ge 'REL8_3')
{
	push(@{$conf{config_opts}},
	     "--with-includes=/usr/include/et:/usr/include/libxml2");
	push(@{$conf{config_opts}}, "--with-libxml");
}

As seen, I needed to add an include dir for configure to pass.  However,
make check fails now with the backend crashing.  This can be seen in the
buildfarm results for mongoose.

According to gentoo portage, I have libxml2 version 2.6.26 installed on my
system.

I am not clear if I should have pointed it at libxml version 1 or 2, but
configure seemed to be happy with libxml2.  If it needs version 1, perhaps
configure should do something to keep it from using version 2.

Here is the diff for the xml regression test:

*** ./expected/xml.out  Thu Dec 21 16:47:22 2006
--- ./results/xml.out   Thu Dec 21 16:59:32 2006
***************
*** 58,68 ****
  SELECT xmlelement(name element,
                    xmlattributes (1 as one, 'deuce' as two),
                    'content');
!                    xmlelement
! ------------------------------------------------
!  <element one="1" two="deuce">content</element>
! (1 row)
! 
  SELECT xmlelement(name element,
                    xmlattributes ('unnamed and wrong'));
  ERROR:  unnamed attribute value must be a column reference
--- 58,64 ----
  SELECT xmlelement(name element,
                    xmlattributes (1 as one, 'deuce' as two),
                    'content');
! ERROR:  cache lookup failed for type 0
  SELECT xmlelement(name element,
                    xmlattributes ('unnamed and wrong'));
  ERROR:  unnamed attribute value must be a column reference
***************
*** 73,145 ****
  (1 row)
  
  SELECT xmlelement(name employee, xmlforest(name, age, salary as pay)) FROM emp;
!                                xmlelement
! ----------------------------------------------------------------------
!  <employee><name>sharon</name><age>25</age><pay>1000</pay></employee>
!  <employee><name>sam</name><age>30</age><pay>2000</pay></employee>
!  <employee><name>bill</name><age>20</age><pay>1000</pay></employee>
!  <employee><name>jeff</name><age>23</age><pay>600</pay></employee>
!  <employee><name>cim</name><age>30</age><pay>400</pay></employee>
!  <employee><name>linda</name><age>19</age><pay>100</pay></employee>
! (6 rows)
! 
! SELECT xmlelement(name wrong, 37);
! ERROR:  argument of XMLELEMENT must be type xml, not type integer
! SELECT xmlpi(name foo);
!   xmlpi
! ---------
!  <?foo?>
! (1 row)
! 
! SELECT xmlpi(name xmlstuff);
! ERROR:  invalid XML processing instruction
! DETAIL:  XML processing instruction target name cannot start with "xml".
! SELECT xmlpi(name foo, 'bar');
!     xmlpi
! -------------
!  <?foo bar?>
! (1 row)
! 
! SELECT xmlpi(name foo, 'in?valid');
! ERROR:  invalid XML processing instruction
! DETAIL:  XML processing instruction cannot contain "?>".
! SELECT xmlroot (
!   xmlelement (
!     name gazonk,
!     xmlattributes (
!       'val' AS name,
!       1 + 1 AS num
!     ),
!     xmlelement (
!       NAME qux,
!       'foo'
!     )
!   ),
!   version '1.0',
!   standalone yes
! );
!                                          xmlroot
! -------------------------------------------------------------------------------------------
!  <?xml version="1.0" standalone="yes"?><gazonk name="val" num="2"><qux>foo</qux></gazonk>
! (1 row)
! 
! SELECT xmlserialize(content data as character varying) FROM xmltest;
!         data
! -------------------
!  <value>one</value>
!  <value>two</value>
! (2 rows)
! 
! -- Check mapping SQL identifier to XML name
! SELECT xmlpi(name ":::_xml_abc135.%-&_");
!                       xmlpi
! -------------------------------------------------
!  <?_x003A_::_x005F_xml_abc135._x0025_-_x0026__?>
! (1 row)
! 
! SELECT xmlpi(name "123");
!     xmlpi
! ---------------
!  <?_x0031_23?>
! (1 row)
! 
--- 69,75 ----
  (1 row)
  
  SELECT xmlelement(name employee, xmlforest(name, age, salary as pay)) FROM emp;
! server closed the connection unexpectedly
!   This probably means the server terminated abnormally
!   before or while processing the request.
! connection to server was lost



-- 
The very powerful and the very stupid have one thing in common.
Instead of altering their views to fit the facts, they alter the facts
to fit their views ... which can be very uncomfortable if you happen to
be one of the facts that needs altering.
-- Doctor Who, Face of Evil

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
On Fri, 22 Dec 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  As seen, I needed to add an include dir for configure to pass.  However,
  make check fails now with the backend crashing.  This can be seen in the
  buildfarm results for mongoose.

 Can you provide a stack trace for that crash?

#0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
#1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
    data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
#2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0 "qux",
    fully_escaped=0 '\0') at xml.c:933
#3  0x0811ce83 in transformXmlExpr (pstate=0x84202b8, x=0x8420034)
at parse_expr.c:1426
#4  0x0811ac91 in transformExpr (pstate=0x84202b8, expr=0x8420034)
at parse_expr.c:238
#5  0x0811ceb4 in transformXmlExpr (pstate=0x84202b8, x=0x8420174)
at parse_expr.c:1456
#6  0x0811ac91 in transformExpr (pstate=0x84202b8, expr=0x8420174)
at parse_expr.c:238
#7  0x081288a4 in transformTargetEntry (pstate=0x84202b8, node=0x8420174,
expr=0x0, colname=0x0, resjunk=0 '\0') at parse_target.c:74
#8  0x0812890e in transformTargetList (pstate=0x84202b8, targetlist=0x1)
at parse_target.c:146
#9  0x080ffcef in transformStmt (pstate=0x84202b8, parseTree=0x84201fc,
extras_before=0xbfd882c4, extras_after=0xbfd882c8) at analyze.c:2102
#10 0x08101421 in do_parse_analyze (parseTree=0x841ffc0, pstate=0x84202b8)
at analyze.c:251
#11 0x0810227a in parse_analyze (parseTree=0x84201fc,
    sourceText=0x841ffc0 "qux", paramTypes=0x841ffc0, numParams=138543040)
    at analyze.c:173
#12 0x0820b66e in pg_analyze_and_rewrite (parsetree=0x84201fc,
    query_string=0x841fb74 "SELECT xmlconcat(xmlcomment('hello'),\n", ' '
    <repeats 17 times>, "xmlelement(NAME qux, 'foo'),\n", ' ' <repeats 17
    times>, "xmlcomment('world'));", paramTypes=0x0, numParams=0) at
    postgres.c:567
#13 0x0820b91e in exec_simple_query (
    query_string=0x841fb74 "SELECT xmlconcat(xmlcomment('hello'),\n", ' '
    <repeats 17 times>, "xmlelement(NAME qux, 'foo'),\n", ' ' <repeats 17
    times>, "xmlcomment('world'));") at postgres.c:875
#14 0x0820d72b in PostgresMain (argc=4, argv=0x83c5c2c,
    username=0x83c5bfc "jeremyd") at postgres.c:3418
#15 0x081dfbd7 in ServerLoop () at postmaster.c:2924
#16 0x081e132c in PostmasterMain (argc=3, argv=0x83c4550) at
postmaster.c:958
#17 0x081991e0 in main (argc=3, argv=0x83c4550) at main.c:188


-- 
In Tennessee, it is illegal to shoot any game other than whales from a
moving automobile.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
On Fri, 22 Dec 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  Can you provide a stack trace for that crash?

  #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
  #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
  data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
  #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0 "qux",
  fully_escaped=0 '\0') at xml.c:933

 Hmm ... it seems to work for me here, using Fedora 5's libxml.

 Are you by any chance running this with a non-C locale?  The trace
 suggests an encoding-mismatch sort of issue...

Nope.

I saw another buildfarm member that looks like it croaked in the same
place:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=sponge&dt=2006-12-22%2022:30:02

So I guess it is not just me...


-- 
If you think education is expensive, try ignorance.
-- Derek Bok, president of Harvard

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
On Sat, 23 Dec 2006, Tom Lane wrote:

 Peter Eisentraut [EMAIL PROTECTED] writes:
  Jeremy Drake wrote:
  #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
  #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
  data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
  #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
  "qux", fully_escaped=0 '\0') at xml.c:933

  Obviously the datalen has gone off the map.

 I wouldn't put 100% faith in that display, unless Jeremy built with -O0.

I built this one with gcc 3.4.5 using --enable-debug --enable-cassert
configure options.  I will try with -O0 and see what I get...


-- 
NAPOLEON: What shall we do with this soldier, Guiseppe?
Everything he says is wrong.
GUISEPPE: Make him a general, Excellency,
and then everything he says will be right.

-- G. B. Shaw, The Man of Destiny

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] recent --with-libxml support

2006-12-22 Thread Jeremy Drake
On Fri, 22 Dec 2006, Jeremy Drake wrote:

 On Sat, 23 Dec 2006, Tom Lane wrote:

  Peter Eisentraut [EMAIL PROTECTED] writes:
   Jeremy Drake wrote:
   #0  0xb7c4dc85 in memcpy () from /lib/tls/libc.so.6
   #1  0x08190f59 in appendBinaryStringInfo (str=0xbfd87f90,
   data=0x841ffc0 "qux", datalen=138543040) at stringinfo.c:192
   #2  0x0828377f in map_sql_identifier_to_xml_name (ident=0x841ffc0
   "qux", fully_escaped=0 '\0') at xml.c:933
 
   Obviously the datalen has gone off the map.
 
  I wouldn't put 100% faith in that display, unless Jeremy built with -O0.

 I built this one with gcc 3.4.5 using --enable-debug --enable-cassert
 configure options.  I will try with -O0 and see what I get...

I just tried the same thing, but passing CFLAGS=-g -O0 to configure and
the xml test passed.  Maybe a '\0' termination issue?

I also recompiled everything with the defaults again (-O2) and the xml
test crashed in the same place.

So it is an issue of -O0 works vs -O2 does not.  Hate those...



-- 
When I get real bored, I like to drive downtown and get a great
parking spot, then sit in my car and count how many people ask me if
I'm leaving.
-- Steven Wright

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] recent --with-libxml support

2006-12-23 Thread Jeremy Drake
On Sat, 23 Dec 2006, Tom Lane wrote:

 Ah-hah, I've sussed it.  sqlchar_to_unicode() calls the
 mb2wchar_with_len converters, which are defined to return a *null
 terminated* pg_wchar string.  So even if you only ask for the conversion
 of a single character, you need a 2-pg_wchar array to hold the result,
 and that wasn't being allocated.  So the code was clobbering a word of
 stack space that evidently is critical on some machines and not others.

 Fix committed, please see if it works for you now.

The buildfarm run seems to be green now.  I am compiling a sandbox to test
it manually now, but I'm pretty sure that fixed it.  Yep, the make check
works there too...
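If I understand the bug right, the fix amounts to leaving room for the
terminator the converter appends -- roughly this shape (my sketch, not
the actual committed code; str stands in for the source character):

	pg_wchar	u[2];		/* the converted char plus the terminating
							 * zero the mb2wchar converter writes */

	pg_mb2wchar_with_len(str, u, 1);	/* convert a single character */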

-- 
As long as war is regarded as wicked, it will always have its
fascination.  When it is looked upon as vulgar, it will cease to be
popular.
-- Oscar Wilde

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Doc bug

2006-12-30 Thread Jeremy Drake
On Sun, 31 Dec 2006, Gurjeet Singh wrote:

 On 12/31/06, Andrew Dunstan [EMAIL PROTECTED] wrote:
 
 
  Gurjeet Singh wrote:
   BTW, I don't know how to make sure that the effect of a doc patch looks
   fine
   in a browser. I mean, how to view the doc/src/sgml/*.sgml in a browser,
   nicely formatted as we see on our website!
  
 
 
 
  Docs for CVS HEAD can always be seen here:
 
  http://momjian.us/main/writings/pgsql/sgml/


 I meant, before submitting the patch, or even while working on it, I would
 like to see its rendering on my local machine, from the sources.  One would
 like to be sure that he used the correct (or most appropriate) tags while
 writing the patch.

cd pgsql/doc/src/sgml
make html

See
http://developer.postgresql.org/pgdocs/postgres/docguide-build.html

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] float8 width_bucket function

2007-01-01 Thread Jeremy Drake
I came across this when looking through the patches_hold queue link that
Bruce sent out.

http://momjian.us/mhonarc/patches_hold/msg00162.html

There is no patch or anything associated with it, just the suggestion that
it be put in when 8.3 devel starts up.

Just thought I'd put this back out there now that 8.3 devel has started,
since I had just about forgotten about it until seeing it on that list...
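For anyone picking this up: the behavior wanted is SQL's WIDTH_BUCKET,
which for ascending bounds works out to roughly the following (my sketch
of the semantics, not a patch; float8_width_bucket is just an
illustrative name):

	/*
	 * Partition [b1, b2) into count equal-width buckets; values below b1
	 * land in bucket 0, values at or above b2 in bucket count + 1.
	 */
	static int
	float8_width_bucket(double op, double b1, double b2, int count)
	{
		if (op < b1)
			return 0;
		if (op >= b2)
			return count + 1;
		return (int) ((op - b1) / (b2 - b1) * count) + 1;
	}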


-- 
Putt's Law:
Technology is dominated by two types of people:
Those who understand what they do not manage.
Those who manage what they do not understand.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

