Re: [HACKERS] 64-bit API for large objects
Thread added to TODO. As far as I can see, this thread never produced an patch that could be applied. --- Tom Lane wrote: Jeremy Drake [EMAIL PROTECTED] writes: On Fri, 23 Sep 2005, Tom Lane wrote: postgresql-fe.h defines a ton of stuff that has no business being visible to libpq's client applications. It's designed to be used by our *own* client-side code (psql and the like), but we have not made any attempt to keep it from defining stuff that would likely break other peoples' code. So does this mean that there is a different, more advanced and more likely to break random other code, client library where this call would fit better? I've been thinking more about this and come to these conclusions: 1. libpq_fe.h definitely cannot include postgres_fe.h; in fact, it has no business even defining a type named int64. That is way too likely to collide with symbols coming from elsewhere in a client compilation. I think what we need is to declare a type named pg_int64 and use that in the externally visible declarations. The most reasonable place to put the typedef is postgres_ext.h. This will mean making configure generate postgres_ext.h from a template postgres_ext.h.in, but that's no big deal. 2. We need a strategy for what to do when configure doesn't find a working int64 type. My inclination is to just not export the functions in that case. So normally, postgres_ext.h would contain something like #define HAVE_PG_INT64 1 typedef long long int pg_int64; but neither of these would appear if configure couldn't find a working type. In libpq-fe.h, we'd have #ifdef HAVE_PG_INT64 extern pg_int64 lo_lseek64(PGconn *conn, int fd, pg_int64 offset, int whence); extern pg_int64 lo_tell64(PGconn *conn, int fd); #endif and similarly for all the code inside libpq. The reason this seems like a good idea is that client code could key off #ifdef HAVE_PG_INT64 to detect whether the lo64 functions are available; which is useful even if you don't care about machines without int64, because you still need to think about machines with pre-8.2 PG installations. 3. This is still not 100% bulletproof, as it doesn't address situations like building PG with gcc and then trying to compile client apps with a vendor cc that doesn't understand long long int. The compile would choke on the typedef even if you weren't trying to use large objects at all. I don't see any very nice way around that. It might be worth doing this in postgres_ext.h: #ifndef NO_PG_INT64 #define HAVE_PG_INT64 1 typedef long long int pg_int64; #endif which would at least provide an escape hatch for such situations: define NO_PG_INT64 before trying to build. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 64-bit API for large objects
This has been saved for the 8.2 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Tom Lane wrote: Jeremy Drake [EMAIL PROTECTED] writes: On Fri, 23 Sep 2005, Tom Lane wrote: postgresql-fe.h defines a ton of stuff that has no business being visible to libpq's client applications. It's designed to be used by our *own* client-side code (psql and the like), but we have not made any attempt to keep it from defining stuff that would likely break other peoples' code. So does this mean that there is a different, more advanced and more likely to break random other code, client library where this call would fit better? I've been thinking more about this and come to these conclusions: 1. libpq_fe.h definitely cannot include postgres_fe.h; in fact, it has no business even defining a type named int64. That is way too likely to collide with symbols coming from elsewhere in a client compilation. I think what we need is to declare a type named pg_int64 and use that in the externally visible declarations. The most reasonable place to put the typedef is postgres_ext.h. This will mean making configure generate postgres_ext.h from a template postgres_ext.h.in, but that's no big deal. 2. We need a strategy for what to do when configure doesn't find a working int64 type. My inclination is to just not export the functions in that case. So normally, postgres_ext.h would contain something like #define HAVE_PG_INT64 1 typedef long long int pg_int64; but neither of these would appear if configure couldn't find a working type. In libpq-fe.h, we'd have #ifdef HAVE_PG_INT64 extern pg_int64 lo_lseek64(PGconn *conn, int fd, pg_int64 offset, int whence); extern pg_int64 lo_tell64(PGconn *conn, int fd); #endif and similarly for all the code inside libpq. The reason this seems like a good idea is that client code could key off #ifdef HAVE_PG_INT64 to detect whether the lo64 functions are available; which is useful even if you don't care about machines without int64, because you still need to think about machines with pre-8.2 PG installations. 3. This is still not 100% bulletproof, as it doesn't address situations like building PG with gcc and then trying to compile client apps with a vendor cc that doesn't understand long long int. The compile would choke on the typedef even if you weren't trying to use large objects at all. I don't see any very nice way around that. It might be worth doing this in postgres_ext.h: #ifndef NO_PG_INT64 #define HAVE_PG_INT64 1 typedef long long int pg_int64; #endif which would at least provide an escape hatch for such situations: define NO_PG_INT64 before trying to build. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 64-bit API for large objects
On Sat, Sep 24, 2005 at 12:13:11PM -0400, Tom Lane wrote: David Fetter [EMAIL PROTECTED] writes: On Fri, Sep 23, 2005 at 05:40:09PM -0400, Tom Lane wrote: For that matter, we can't even guarantee that they work at all: not all platforms even *have* int64 types. What platforms that PG supports don't have int64 arithmetic? We claim to build with any ANSI C compiler, and there is no requirement for a 64-bit type in ANSI C. The historical project policy is that we should still build without such a type, and everything should still work except that the effective bounds of bigint data correspond to int32 instead of int64 limits. I see no reason to back off that policy. It's not very much harder to do it right. So what happens if you attempt to put a value greater than 2^32 into a bigint on a non-int64 platform? I would argue that by default we should not allow users to even create bigints on any platform where bigint = int. And if the default is overridden, we should still throw a warning. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 64-bit API for large objects
Jim C. Nasby [EMAIL PROTECTED] writes: So what happens if you attempt to put a value greater than 2^32 into a bigint on a non-int64 platform? You get the same error as if you tried to store a value greater than 2^64. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 64-bit API for large objects
Jeremy Drake [EMAIL PROTECTED] writes: On Fri, 23 Sep 2005, Tom Lane wrote: postgresql-fe.h defines a ton of stuff that has no business being visible to libpq's client applications. It's designed to be used by our *own* client-side code (psql and the like), but we have not made any attempt to keep it from defining stuff that would likely break other peoples' code. So does this mean that there is a different, more advanced and more likely to break random other code, client library where this call would fit better? I've been thinking more about this and come to these conclusions: 1. libpq_fe.h definitely cannot include postgres_fe.h; in fact, it has no business even defining a type named int64. That is way too likely to collide with symbols coming from elsewhere in a client compilation. I think what we need is to declare a type named pg_int64 and use that in the externally visible declarations. The most reasonable place to put the typedef is postgres_ext.h. This will mean making configure generate postgres_ext.h from a template postgres_ext.h.in, but that's no big deal. 2. We need a strategy for what to do when configure doesn't find a working int64 type. My inclination is to just not export the functions in that case. So normally, postgres_ext.h would contain something like #define HAVE_PG_INT64 1 typedef long long int pg_int64; but neither of these would appear if configure couldn't find a working type. In libpq-fe.h, we'd have #ifdef HAVE_PG_INT64 extern pg_int64 lo_lseek64(PGconn *conn, int fd, pg_int64 offset, int whence); extern pg_int64 lo_tell64(PGconn *conn, int fd); #endif and similarly for all the code inside libpq. The reason this seems like a good idea is that client code could key off #ifdef HAVE_PG_INT64 to detect whether the lo64 functions are available; which is useful even if you don't care about machines without int64, because you still need to think about machines with pre-8.2 PG installations. 3. This is still not 100% bulletproof, as it doesn't address situations like building PG with gcc and then trying to compile client apps with a vendor cc that doesn't understand long long int. The compile would choke on the typedef even if you weren't trying to use large objects at all. I don't see any very nice way around that. It might be worth doing this in postgres_ext.h: #ifndef NO_PG_INT64 #define HAVE_PG_INT64 1 typedef long long int pg_int64; #endif which would at least provide an escape hatch for such situations: define NO_PG_INT64 before trying to build. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 64-bit API for large objects
Jeremy Drake [EMAIL PROTECTED] writes: In any case, are there any comments on the changes below libpq (the functions visible to queries on down)? Within the backend, I don't see the point in maintaining a distinction between 32- and 64-bit APIs for inv_api.c: you should just switch 'em to use int64. You did it that way for inv_getsize but then proceeded to add separate inv_seek64/tell64 functions, which is inconsistent. The submitted version of lo_tell64 isn't even calling the right one :-( The 32-bit version of lo_tell will need to check and error out if the value it'd need to return doesn't fit in int32. (Come to think of it, so should the 32-bit version of lo_lseek.) All of the LO code needs to be eyeballed to make sure it still behaves sanely if int64 is really only 32 bits. It would probably be a good idea also to introduce some overflow checks to detect cases where the current LO offset would overflow int64 after a read, write, or seek. This is missing from the existing code :-( It is possible to code overflow checks that work regardless of the size of int64; see int8.c for some inspiration. I'd suggest also that the offset be defined as signed not unsigned (int64 not uint64) as this will simplify the overflow checks and eliminate the prospect of scenarios where lo_tell64 cannot return a correct value. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 64-bit API for large objects
On Fri, 23 Sep 2005, Tom Lane wrote: Jeremy Drake [EMAIL PROTECTED] writes: 2) The lo_*64, in order to be convenient from the client end, have functions added to libpq as the existing lo_* functions. The client side of libpq did not previously know anything about int64 or how to send/receive them. I added an include of postgres-fe.h (which according to the comment in that file looks like it should go there) so int64 would be defined, Unfortunately that's completely unacceptable from a namespace-pollution point of view. I don't quite understand. Allow me to cite the source, so we both are referring to the same thing here... [EMAIL PROTECTED] postgresql-8.0.3 $ head -n17 src/include/postgres_fe.h /*- * * postgres_fe.h *Primary include file for PostgreSQL client-side .c files * * This should be the first file included by PostgreSQL client libraries and * application programs --- but not by backend modules, which should include * postgres.h. * * * Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group * Portions Copyright (c) 1995, Regents of the University of California * * $PostgreSQL: pgsql/src/include/postgres_fe.h,v 1.10 2004/12/31 22:03:19 pgsql Exp $ * *- */ Now I may not completely understand the term client, but I think libpq is a client library and anything which may use it would be an application program. So it seems it was an oversight on the part of libpq to not include it. Does the term client not mean what I thought it did (anything which connects to a postgresql server)? The real problem here is that int64 isn't a well-defined portable datatype, and so it's going to be very hard to export these functions in a way that won't break on different platforms, applications compiled with a different compiler than libpq was, etc. Umm, what wouldn't break if you switched compilers in a way that redefined sizeof(things)? I happen to know, even using the same compiler but just changing a compile flag (-m64) which changes sizes of integral types (sizeof(long) from 32 to 64 bits) will make such actions stop working on one of my tested platform. It sucks, I happen to not be fond of this because I tend not to have every library which is on my box built for both, but it is the way life is. I do not know of a platform where the size of an integral type can change and still be able to link against libraries and things. And if the size of some type is not changing, then things should already be correctly set for the platform. But I admit I have not met every platform in existance. Do you happen to be able to cite a platform where this is the case? For that matter, we can't even guarantee that they work at all: not all platforms even *have* int64 types. We have so far avoided putting any fundamental dependencies on int64 arithmetic into the system, and I'm a bit worried that this patch will break LO support entirely on platforms that don't have working int64 arithmetic. They should in fact break gracefully on such platforms, or at least as gracefully as any other int64-using code might. I did check a couple places for #ifdef INT64_BROKEN (or whatever it was called) to make sure that on those platforms something at least somewhat sane would happen. (they use 32 bits instead). Also, on those platforms, you could always use the non-64 versions if you were concerned about that. The patches would allow seeking past the old limit using the 32 function in stages (seek 2G, seek 2G, seek 2G would put you at 6G) if you do not mind wierd return values and tell not working. And if you use a platform which does not support 64bit integral types, then you cannot reasonably expect those functions to work correctly anyway. But they should compile at least. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- I don't wanna argue, and I don't wanna fight, But there will definitely be a party tonight... ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 64-bit API for large objects
David Fetter [EMAIL PROTECTED] writes: On Fri, Sep 23, 2005 at 05:40:09PM -0400, Tom Lane wrote: For that matter, we can't even guarantee that they work at all: not all platforms even *have* int64 types. What platforms that PG supports don't have int64 arithmetic? We claim to build with any ANSI C compiler, and there is no requirement for a 64-bit type in ANSI C. The historical project policy is that we should still build without such a type, and everything should still work except that the effective bounds of bigint data correspond to int32 instead of int64 limits. I see no reason to back off that policy. It's not very much harder to do it right. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 64-bit API for large objects
Hey, While you guys are hacking at the LO code, it would be nice to consider the suggestions outlined here: http://archives.postgresql.org/pgsql-bugs/2004-07/msg00143.php -- Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4 Y una voz del caos me habló y me dijo Sonríe y sé feliz, podría ser peor. Y sonreí. Y fui feliz. Y fue peor. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 64-bit API for large objects
On Sat, 24 Sep 2005, Tom Lane wrote: Jeremy Drake [EMAIL PROTECTED] writes: In any case, are there any comments on the changes below libpq (the functions visible to queries on down)? Within the backend, I don't see the point in maintaining a distinction between 32- and 64-bit APIs for inv_api.c: you should just switch 'em to use int64. You did it that way for inv_getsize but then proceeded to add separate inv_seek64/tell64 functions, which is inconsistent. Right. I did it the way you describe my first cut (I did this several times and changed my mind and started over). I was concerned (perhaps needlessly) about breaking the signatures of the inv_* functions which are visible outside of the .c file. This is why I did the getsize differently - it is static. But it sounds like there is no concern about changing the signatures of these functions, so I will change my patch to not maintain the seperate code paths in inv_api.c. The submitted version of lo_tell64 isn't even calling the right one :-( Oops. That's what I get for lots of copy/paste and doing it multiple times... Bonehead mistake, thanks for catching it. The 32-bit version of lo_tell will need to check and error out if the value it'd need to return doesn't fit in int32. (Come to think of it, so should the 32-bit version of lo_lseek.) That makes sense. Or it could return some value (INT_MAX?) which could mean that it is outside the range, so someone could still get at the data even if they are using a backwards client box? I don't know if that makes sense at all, it sounds pretty wacky since these clients would have no way of knowing where they are in the file. Erroring would probably be best. All of the LO code needs to be eyeballed to make sure it still behaves sanely if int64 is really only 32 bits. Of course. It would probably be a good idea also to introduce some overflow checks to detect cases where the current LO offset would overflow int64 after a read, write, or seek. This is missing from the existing code :-( It is possible to code overflow checks that work regardless of the size of int64; see int8.c for some inspiration. Yes. That would be good. These would be in the inv_* functions, correct? I'd suggest also that the offset be defined as signed not unsigned (int64 not uint64) as this will simplify the overflow checks and eliminate the prospect of scenarios where lo_tell64 cannot return a correct value. I intended to do that. I think the only place I made uint64 vs int64 was the getsize function, and that could be int64 also. I will look at that code and make sure I am not mixing them in ways that are not necessary and useful. I will take these suggestions and make another revision of the patch shortly. Also, I was considering exposing up an lo_getsize or lo_stat function which would tell you how big a large object was without having to seek to the end and look at the return value, requiring you to have the large object open at the time, to loose your old seek position (unless you do a tell first), and requires several more server round-trips (if not open, would involve open/seek/close, if open, could require tell/seek/seek). ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 64-bit API for large objects
On Sat, 24 Sep 2005, Alvaro Herrera wrote: Hey, While you guys are hacking at the LO code, it would be nice to consider the suggestions outlined here: http://archives.postgresql.org/pgsql-bugs/2004-07/msg00143.php Included from that message for easier reference: 0) In Oid lo_creat(PGconn *conn, int mode), why is there a mode on lo_create? The mode is determined when the object is lo_open()ed, right? I think the docs basically said it is a vestigial feature, it used to be useful but the code evolved in such a way that it ceased being useful. It is probably still there to allow old code to continue to compile against newer servers without being recompiled. 1) There is no lo_truncate(PGconn *conn, int fd, off_t len). Did not notice that one. That is a good one to add if adding functions is in the cards. I bet when the person/people who are attempting to write to this api here get far enough, they would have noticed that too ;) 2) There is no lo_length(PGconn *conn, int fd). We did notice this one however. There is also no lo_stat(PGconn *conn, Oid lobjId). I have been thinking about implementing these two. I think I will make a revision of the patch at some point with these. The size ones will be extremely easy, the functionality is already there, just a matter of exposing it. The truncate is not too difficult, but actually requires me to think a little more ;) -- When does summertime come to Minnesota, you ask? Well, last year, I think it was a Tuesday. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 64-bit API for large objects
Jeremy Drake [EMAIL PROTECTED] writes: 0) In Oid lo_creat(PGconn *conn, int mode), why is there a mode on lo_create? The mode is determined when the object is lo_open()ed, right? I think the docs basically said it is a vestigial feature, it used to be useful but the code evolved in such a way that it ceased being useful. It is probably still there to allow old code to continue to compile against newer servers without being recompiled. Yeah. There were once multiple types of large objects, and I suppose the mode argument told lo_creat which kind to create. I have no idea how the read/write bits got included into that --- it doesn't make any sense. As of PG 8.1, lo_creat just ignores the mode argument. We can't delete the argument though without causing a lot of compatibility headaches. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 64-bit API for large objects
This patch implements the ability for large objects to be larger than 2GB. I believe the limit to now be about 2TB, based on the fact that the large object page size is 2048 bytes, and the page number is still 32 bits. There are a few things about this patch which probably require tweaking or at least a closer inspection from the list. 1) The lo_*64 functions are added to the catalog/pg_proc.h (spacing exact location atm) with OID set to 0, all other entries in this file have OIDs explicitly defined. 2) The lo_*64, in order to be convenient from the client end, have functions added to libpq as the existing lo_* functions. The client side of libpq did not previously know anything about int64 or how to send/receive them. I added an include of postgres-fe.h (which according to the comment in that file looks like it should go there) so int64 would be defined, also implemented functions (code mostly stolen from the server libpq format functions for same) to convert them to/from network byte order. I did this in a somewhat inconsistent way between the get and put, as I did not want to change the existing api at all, and existing code as little as possible. 3) The 32 bit box I tested this on was a PII 300MHz laptop. Not exactly the fastest. The test consisted entirely of making sure it compiled. Perhaps someone with a fast IA32 box and spare cycles can test it? Also, so far the only platforms I have tried to compile this on have been: * Linux 2.6 (gentoo), AMD64, gcc-3.4.4 * Solaris 8, SPARCv9, gcc-3.4.2 * Linux 2.6 (debian unstable), i686, gcc-3.4.x (laptop, don't remember exact version). Would probably be a good idea to verify this on other platforms as well, or at least other compilers. Hopefully I did not break anything too badly with this. All of the regression tests still pass after the patch, and I made a version of the tests/examples/testlo which uses 64bit (in the patch) which works also. I grepped in the regression tests, and I could not find any usage of large objects in them, which I found to be rather odd, which is why I used testlo and my new testlo64 to test them instead. On Tue, 20 Sep 2005, Jonah H. Harris wrote: Cool. We look forward to it. On 9/19/05, Mark Dilger [EMAIL PROTECTED] wrote: Jonah H. Harris wrote: Mark, If you don't mind contributing the changes, we'd be glad to take a look at them. Thanks. -Jonah Ok, we will post it back soon. We have tested it on two different 64-bit architectures (Sparc and AMD) and are now testing on pentium before posting up to the list. mark -- Respectfully, Jonah H. Harris, Database Internals Architect EnterpriseDB Corporation http://www.enterprisedb.com/ -- Mere nonexistence is a feeble excuse for declaring a thing unseeable. You *can* see dragons. You just have to look in the right direction. -- John Haslerdiff -Nur postgresql-8.0.3-orig/src/backend/libpq/be-fsstubs.c postgresql-8.0.3/src/backend/libpq/be-fsstubs.c --- postgresql-8.0.3-orig/src/backend/libpq/be-fsstubs.c2004-12-31 13:59:50.0 -0800 +++ postgresql-8.0.3/src/backend/libpq/be-fsstubs.c 2005-09-18 17:22:17.0 -0700 @@ -233,6 +233,34 @@ PG_RETURN_INT32(status); } + +Datum +lo_lseek64(PG_FUNCTION_ARGS) +{ + int32 fd = PG_GETARG_INT32(0); + int64 offset = PG_GETARG_INT64(1); + int32 whence = PG_GETARG_INT32(2); + MemoryContext currentContext; + int64 status; + + if (fd 0 || fd = cookies_size || cookies[fd] == NULL) + { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_OBJECT), +errmsg(invalid large-object descriptor: %d, fd))); + PG_RETURN_INT64(-1); + } + + Assert(fscxt != NULL); + currentContext = MemoryContextSwitchTo(fscxt); + + status = inv_seek64(cookies[fd], offset, whence); + + MemoryContextSwitchTo(currentContext); + + PG_RETURN_INT64(status); +} + Datum lo_creat(PG_FUNCTION_ARGS) { @@ -283,6 +311,28 @@ PG_RETURN_INT32(inv_tell(cookies[fd])); } + +Datum +lo_tell64(PG_FUNCTION_ARGS) +{ + int32 fd = PG_GETARG_INT32(0); + + if (fd 0 || fd = cookies_size || cookies[fd] == NULL) + { + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_OBJECT), +errmsg(invalid large-object descriptor: %d, fd))); + PG_RETURN_INT64(-1); + } + + /* +* We assume we do not need to switch contexts for inv_tell. That is +* true for now, but is probably more than this module ought to +* assume... +*/ + PG_RETURN_INT64(inv_tell(cookies[fd])); +} + Datum lo_unlink(PG_FUNCTION_ARGS) { diff -Nur postgresql-8.0.3-orig/src/backend/storage/large_object/inv_api.c
Re: [HACKERS] 64-bit API for large objects
Jeremy Drake [EMAIL PROTECTED] writes: 2) The lo_*64, in order to be convenient from the client end, have functions added to libpq as the existing lo_* functions. The client side of libpq did not previously know anything about int64 or how to send/receive them. I added an include of postgres-fe.h (which according to the comment in that file looks like it should go there) so int64 would be defined, Unfortunately that's completely unacceptable from a namespace-pollution point of view. The real problem here is that int64 isn't a well-defined portable datatype, and so it's going to be very hard to export these functions in a way that won't break on different platforms, applications compiled with a different compiler than libpq was, etc. For that matter, we can't even guarantee that they work at all: not all platforms even *have* int64 types. We have so far avoided putting any fundamental dependencies on int64 arithmetic into the system, and I'm a bit worried that this patch will break LO support entirely on platforms that don't have working int64 arithmetic. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 64-bit API for large objects
On Fri, Sep 23, 2005 at 05:40:09PM -0400, Tom Lane wrote: Jeremy Drake [EMAIL PROTECTED] writes: The real problem here is that int64 isn't a well-defined portable datatype, and so it's going to be very hard to export these functions in a way that won't break on different platforms, applications compiled with a different compiler than libpq was, etc. For that matter, we can't even guarantee that they work at all: not all platforms even *have* int64 types. We have so far avoided putting any fundamental dependencies on int64 arithmetic into the system, and I'm a bit worried that this patch will break LO support entirely on platforms that don't have working int64 arithmetic. What platforms that PG supports don't have int64 arithmetic? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 64-bit API for large objects
Jeremy Drake [EMAIL PROTECTED] writes: On Fri, 23 Sep 2005, Tom Lane wrote: Unfortunately that's completely unacceptable from a namespace-pollution point of view. I don't quite understand. postgresql-fe.h defines a ton of stuff that has no business being visible to libpq's client applications. It's designed to be used by our *own* client-side code (psql and the like), but we have not made any attempt to keep it from defining stuff that would likely break other peoples' code. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 64-bit API for large objects
I sent this from the wrong address last time so it did not go to the list, I apologize to anyone who may be getting it again... On Fri, 23 Sep 2005, Tom Lane wrote: Jeremy Drake [EMAIL PROTECTED] writes: 2) The lo_*64, in order to be convenient from the client end, have functions added to libpq as the existing lo_* functions. The client side of libpq did not previously know anything about int64 or how to send/receive them. I added an include of postgres-fe.h (which according to the comment in that file looks like it should go there) so int64 would be defined, Unfortunately that's completely unacceptable from a namespace-pollution point of view. I don't quite understand. Allow me to cite the source, so we both are referring to the same thing here... [EMAIL PROTECTED] postgresql-8.0.3 $ head -n17 src/include/postgres_fe.h /*- * * postgres_fe.h *Primary include file for PostgreSQL client-side .c files * * This should be the first file included by PostgreSQL client libraries and * application programs --- but not by backend modules, which should include * postgres.h. * * * Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group * Portions Copyright (c) 1995, Regents of the University of California * * $PostgreSQL: pgsql/src/include/postgres_fe.h,v 1.10 2004/12/31 22:03:19 pgsql Exp $ * *- */ Now I may not completely understand the term client, but I think libpq is a client library and anything which may use it would be an application program. So it seems it was an oversight on the part of libpq to not include it. Does the term client not mean what I thought it did (anything which connects to a postgresql server)? The real problem here is that int64 isn't a well-defined portable datatype, and so it's going to be very hard to export these functions in a way that won't break on different platforms, applications compiled with a different compiler than libpq was, etc. Umm, what wouldn't break if you switched compilers in a way that redefined sizeof(things)? I happen to know, even using the same compiler but just changing a compile flag (-m64) which changes sizes of integral types (sizeof(long) from 32 to 64 bits) will make such actions stop working on one of my tested platform. It sucks, I happen to not be fond of this because I tend not to have every library which is on my box built for both, but it is the way life is. I do not know of a platform where the size of an integral type can change and still be able to link against libraries and things. And if the size of some type is not changing, then things should already be correctly set for the platform. But I admit I have not met every platform in existance. Do you happen to be able to cite a platform where this is the case? For that matter, we can't even guarantee that they work at all: not all platforms even *have* int64 types. We have so far avoided putting any fundamental dependencies on int64 arithmetic into the system, and I'm a bit worried that this patch will break LO support entirely on platforms that don't have working int64 arithmetic. They should in fact break gracefully on such platforms, or at least as gracefully as any other int64-using code might. I did check a couple places for #ifdef INT64_BROKEN (or whatever it was called) to make sure that on those platforms something at least somewhat sane would happen. (they use 32 bits instead). Also, on those platforms, you could always use the non-64 versions if you were concerned about that. The patches would allow seeking past the old limit using the 32 function in stages (seek 2G, seek 2G, seek 2G would put you at 6G) if you do not mind wierd return values and tell not working. And if you use a platform which does not support 64bit integral types, then you cannot reasonably expect those functions to work correctly anyway. But they should compile at least. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- I don't wanna argue, and I don't wanna fight, But there will definitely be a party tonight... ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 64-bit API for large objects
On Fri, 23 Sep 2005, Tom Lane wrote: postgresql-fe.h defines a ton of stuff that has no business being visible to libpq's client applications. It's designed to be used by our *own* client-side code (psql and the like), but we have not made any attempt to keep it from defining stuff that would likely break other peoples' code. So does this mean that there is a different, more advanced and more likely to break random other code, client library where this call would fit better? If so, I would be happy to change the patch to put it there. I did not see it, but I did not look very hard. If not, what is a client side programmer to do if they want to pass int64s around? Every client app has to basically write their own htonll (or whatever you want to call it) and perform their own detection of what type is a 64bit int, and cache the oids for the fastcall interface themselves? There seems to be a lot of overhead which libpq saves you from. Or the client program could perform the detection of the type, and also detect a function which would reasonably serve as an atoll on the platform, and snprintf(buf, 1024, SELECT lo_seek64(%d, %lld, %d), fh, offset, SEEK_SET); exec the buf, check to see if any tuples came back, if so (get the first column of the first tuple, call atoll on that) else handle error, and in either case free the result? In any case, are there any comments on the changes below libpq (the functions visible to queries on down)? I don't want to get hung up in the client issues just to find out later that the server stuff was completely insane anyway... The client library seems to me to be less important anyway. If the server can support it, the client can always manage to do it some how, and then once the client lib can support it, it should be fairly transparent to swap that out later, so that code that worked around could be updated without immediately breaking all other code working around. So that means that if I get good feedback on the server side code, I could start having people code to it using one of the above workaround methods listed, and then if we manage to come up with some way which would be more correct (if that is the right word) than the libpq hack I did then they could gradually switch over to that (or use sed -i). -- All that glitters has a high refractive index. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 64-bit API for large objects
Cool. We look forward to it.On 9/19/05, Mark Dilger [EMAIL PROTECTED] wrote: Jonah H. Harris wrote: Mark, If you don't mind contributing the changes, we'd be glad to take a look at them.Thanks. -JonahOk, we will post it back soon.We have tested it on two different 64-bit architectures (Sparc and AMD) and are now testing on pentium before posting upto the list.mark-- Respectfully,Jonah H. Harris, Database Internals Architect EnterpriseDB Corporationhttp://www.enterprisedb.com/
Re: [HACKERS] 64-bit API for large objects
Mark, If you don't mind contributing the changes, we'd be glad to take a look at them. Thanks. -JonahOn 9/18/05, Mark Dilger [EMAIL PROTECTED] wrote: My company has written a 64-bit large object API, extending the postgresqlserver to be able to read/write/seek/tell/open/close objects larger than 2GB.If the hackers community considers this valuable, we will submit the changes back for the rest of the community to share. From one of my programmers, Jeremy Drake:I tested this out on my box with a 4gb dvd iso image, and it appears towork correctly.The test code I found for large object things does not really seem to exercise the api very well though.And the regressiontests do not seem to even touch large objects (they all still pass afterthis change).Mark, can you take a look at this and make sure I haven't broken anything too obviously?I wrote it into the same file as the large objectcode, since that filehas some static stuff for caching things which I would like to share.Iopted to add new functions tell64 and seek64 rather than changing the existing ones for backwards compatibility.I plugged them into thepg_proc catalog, but everything in that file has an explicit OID, and I donot feel comfortable (yet) grabbing up OIDs for stuff.So I set them to an OID of zero, which means the scripts will assign it one which is notused (in the range 1-something).Since the convention is that suchfunctions have explicit assigned OIDs, it would probably be required to get real ones if this were ever to be submitted back.Also, in the libpqstuff, at the moment I have it fail if it cannot find the seek64 or tell64functions.It may be best to have it work as long as you don't try to call them, in order to preserve backwards compatibility with other serverversions.If you think this is a reasonable patch, it might be nice to send it tothem, be a good neighbor and all that... ---(end of broadcast)---TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match-- Respectfully,Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporationhttp://www.enterprisedb.com/
Re: [HACKERS] 64-bit API for large objects
Jonah H. Harris wrote: Mark, If you don't mind contributing the changes, we'd be glad to take a look at them. Thanks. -Jonah Ok, we will post it back soon. We have tested it on two different 64-bit architectures (Sparc and AMD) and are now testing on pentium before posting up to the list. mark ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] 64-bit API for large objects
My company has written a 64-bit large object API, extending the postgresql server to be able to read/write/seek/tell/open/close objects larger than 2GB. If the hackers community considers this valuable, we will submit the changes back for the rest of the community to share. From one of my programmers, Jeremy Drake: I tested this out on my box with a 4gb dvd iso image, and it appears to work correctly. The test code I found for large object things does not really seem to exercise the api very well though. And the regression tests do not seem to even touch large objects (they all still pass after this change). Mark, can you take a look at this and make sure I haven't broken anything too obviously? I wrote it into the same file as the large object code, since that file has some static stuff for caching things which I would like to share. I opted to add new functions tell64 and seek64 rather than changing the existing ones for backwards compatibility. I plugged them into the pg_proc catalog, but everything in that file has an explicit OID, and I do not feel comfortable (yet) grabbing up OIDs for stuff. So I set them to an OID of zero, which means the scripts will assign it one which is not used (in the range 1-something). Since the convention is that such functions have explicit assigned OIDs, it would probably be required to get real ones if this were ever to be submitted back. Also, in the libpq stuff, at the moment I have it fail if it cannot find the seek64 or tell64 functions. It may be best to have it work as long as you don't try to call them, in order to preserve backwards compatibility with other server versions. If you think this is a reasonable patch, it might be nice to send it to them, be a good neighbor and all that... ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match