Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-24 Thread Kevin Houle
Tom Lane wrote:

Kevin Houle <[EMAIL PROTECTED]> writes:

The nature of the bug is that an 'lo_read' operation performed
with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
"hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
hang is actually a read() loop on the socket generating EAGAIN
error on each read().
I finally realized what's going on here.  7.3 branch CVS tip should fix
it.
Good news. I'll test as soon as I'm able and report back
only if I still see an issue. No news is good news.
Thanks for the effort, Tom.

Kevin



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-22 Thread Tom Lane
Kevin Houle <[EMAIL PROTECTED]> writes:
> The nature of the bug is that an 'lo_read' operation performed
> with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
> "hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
> hang is actually a read() loop on the socket generating EAGAIN
> error on each read().

I finally realized what's going on here.  7.3 branch CVS tip should fix
it.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-17 Thread Tom Lane
Kevin Houle <[EMAIL PROTECTED]> writes:
>> One more data point. The DBD::Pg 'lo_extract' function works
>> fine across SSL. There is no issue with large objects >= 32K
>> using 'lo_extract'. So that casts doubt on it being an OpenSSL
>> issue.

No, it does not, because lo_export writes the data to a file on the
server side --- the data does not pass across the encrypted connection.
It still appears that there's an issue with moving a large chunk of
data across the link.

Whether this is OpenSSL's fault or libpq's is not clear.  I'm prepared
to believe that we are not using SSL_read correctly, but if so, what
is the correct way?  The OpenSSL docs are not very clear on how one
does asynchronous reads.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-16 Thread Kevin Houle
Kevin Houle wrote:

Tom Lane wrote:

Kevin Houle <[EMAIL PROTECTED]> writes:

Is it just me, or are both sides reading waiting for the other
side to send data?


Sure looks like it.  Could it be an OpenSSL bug?
One more data point. The DBD::Pg 'lo_extract' function works
fine across SSL. There is no issue with large objects >= 32K
using 'lo_extract'. So that casts doubt on it being an OpenSSL
issue. Is there a different code path within libpq.so to move
data from the server to the client via SSL for lo_extract than
for lo_read that we can learn from? I'm looking at the code,
but for the first time.
s/lo_extract/lo_export/

Must  work  less.

Kevin



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-16 Thread Kevin Houle
Tom Lane wrote:

Kevin Houle <[EMAIL PROTECTED]> writes:

Is it just me, or are both sides reading waiting for the other
side to send data?
Sure looks like it.  Could it be an OpenSSL bug?
One more data point. The DBD::Pg 'lo_extract' function works
fine across SSL. There is no issue with large objects >= 32K
using 'lo_extract'. So that casts doubt on it being an OpenSSL
issue. Is there a different code path within libpq.so to move
data from the server to the client via SSL for lo_extract than
for lo_read that we can learn from? I'm looking at the code,
but for the first time.
Kevin





---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-16 Thread Kevin Houle
Kevin Houle wrote:

One more data point. The DBD::Pg 'lo_extract' function works
fine across SSL. There is no issue with large objects >= 32K
using 'lo_extract'. So that casts doubt on it being an OpenSSL
issue. Is there a different code path within libpq.so to move
data from the server to the client via SSL for lo_extract than
for lo_read that we can learn from? I'm looking at the code,
but for the first time.
This looks suspicious.

$ pwd; find . -type f | xargs grep inv_read
/local/kjh/pgRPM/postgresql-7.3.4/src
./backend/libpq/be-fsstubs.c:   status = inv_read(cookies[fd], buf, len);
./backend/libpq/be-fsstubs.c:   while ((nbytes = inv_read(lobj, buf, 
BUFSIZE)) > 0)
./backend/storage/large_object/inv_api.c:inv_read(LargeObjectDesc 
*obj_desc, char *buf, int nbytes)
./include/storage/large_object.h:extern int inv_read(LargeObjectDesc 
*obj_desc, char *buf, int nbytes);

inv_read(cookies[fd], buf, len) is in lo_read()
inv_read(lobj, buf, BUFSIZE) is in lo_export()
./backend/libpq/be-fsstubs.c:#define BUFSIZE8192

The lo_export() function loops calling inv_read() until the
entire object is read. The lo_read() function simply passes
the number of bytes to be read to a single invocation of the
inv_read() function. So if I use a length >= 32768 for
lo_read(), it is happily passed to inv_read in an int datatype,
which is 32k.
Seems to me lo_read() should loop around inv_read() using
BUFSIZE like lo_export() does.
Kevin



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-16 Thread Kevin Houle
Tom Lane wrote:
Kevin Houle <[EMAIL PROTECTED]> writes:

Is it just me, or are both sides reading waiting for the other
side to send data?
Sure looks like it.  Could it be an OpenSSL bug?
Well, redhat-9 ships with openssl-0.9.7a so I tried it
with openssl-0.9.7b and lo_read still caused a read()
loop after 32768 bytes. Then I used a Debian sid machine
for testing using the same test case. It uses
  postgresql 7.3.3
  openssl 0.9.7b
The lo_read worked flawlessly on the Debian box. So I grabbed
the postgresql 7.3.3 packages from the PGDG website and tried
those on redhat-9. Still experienced the loop after 32768 bytes.
I'm at a bit of a loss here. It would be nice if someone else
could run the test script (see original message in this thread)
on a redhat-9 box to eliminate my environment from the equation.
Kevin

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-15 Thread Kevin Houle
--On Friday, August 15, 2003 02:09:32 PM -0400 Tom Lane <[EMAIL PROTECTED]> 
wrote:

In 7.3.2 pqsecure_read will recurse to self when SSL_read returns
SSL_ERROR_WANT_READ.  I changed the recursion to a loop in 7.3.4.
Evidently, in 7.3.2 it's possible for the recursion to overflow your
alloted stack space before the process uses up its timeslice :-(.
In 7.3.4 the loop simply doesn't exit.
I'll consider 7.3.2 a bad choice.

I don't understand why, though.  What I expected would happen is that
the process would busy-wait until more data becomes available from the
connection.  That's not ideal but the full fix seemed safer to postpone
to 7.4.  Can you set up this situation, then attach with gdb to the
connected backend and see what it thinks it's doing?  A stacktrace from
that side of the connection might shed some light.
Okay. Here's stack traces from both sides, to put the information
in the same bucket. The gdb info on the client side...
(gdb) info stack
#0  0xe002 in ?? ()
#1  0x404dd242 in BIO_read () from /lib/libcrypto.so.4
#2  0x40455bf0 in ssl3_alert_code () from /lib/libssl.so.4
#3  0x40455d8c in ssl3_alert_code () from /lib/libssl.so.4
#4  0x4045705e in ssl3_read_bytes () from /lib/libssl.so.4
#5  0x40454619 in ssl3_write () from /lib/libssl.so.4
#6  0x404546a7 in ssl3_read () from /lib/libssl.so.4
#7  0x4045cd2b in SSL_read () from /lib/libssl.so.4
#8  0x404216d1 in pqsecure_read () from /usr/lib/libpq.so.3
#9  0x4041d76a in pqReadData () from /usr/lib/libpq.so.3
#10 0x4041c224 in PQfn () from /usr/lib/libpq.so.3
#11 0x4041f773 in lo_read () from /usr/lib/libpq.so.3
#12 0x4014c194 in pg_db_lo_read (dbh=0xfff5, fd=5,
   buf=0x5 , len=5) at dbdimp.c:545
#13 0x40144881 in XS_DBD__Pg__st_bind_param_inout (my_perl=0x804bcb8,
   cv=0x81b1794) at Pg.xs:282
#14 0x40400247 in XS_DBI_dispatch (my_perl=0x804bcb8, cv=0x8187ee0)
   at DBI.xs:2320
#15 0x400a34e5 in Perl_pp_entersub (my_perl=0x804bcb8) at pp_hot.c:2781
#16 0x40086c9a in Perl_runops_debug (my_perl=0x804bcb8) at dump.c:1414
#17 0x4003a9ab in S_run_body (my_perl=0x804bcb8, oldscope=-11) at 
perl.c:1705
#18 0x4003a635 in perl_run (my_perl=0x804bcb8) at perl.c:1624
#19 0x080493a3 in main ()
#20 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6

(gdb) info reg
eax0xfff5   -11
ecx0x81cc880136104064
edx0x5  5
ebx0x3  3
esp0xbfffeaa4   0xbfffeaa4
ebp0xbfffeac8   0xbfffeac8
esi0x0  0
edi0x81cc880136104064
eip0xe002   0xe002
eflags 0x246582
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0  0
gs 0x33 51
And here is the gbd info from the attached backend process.

(gdb) info stack
#0  0xe002 in ?? ()
#1  0x400d5242 in BIO_read () from /lib/libcrypto.so.4
#2  0x4004dbf0 in ssl3_alert_code () from /lib/libssl.so.4
#3  0x4004dd8c in ssl3_alert_code () from /lib/libssl.so.4
#4  0x4004f05e in ssl3_read_bytes () from /lib/libssl.so.4
#5  0x4004c619 in ssl3_write () from /lib/libssl.so.4
#6  0x4004c6a7 in ssl3_read () from /lib/libssl.so.4
#7  0x40054d2b in SSL_read () from /lib/libssl.so.4
#8  0x080eff16 in secure_read ()
#9  0x080f43d2 in StreamClose ()
#10 0x080f4495 in pq_getbyte ()
#11 0x0813a194 in HandleFunctionRequest ()
#12 0x0813a21a in HandleFunctionRequest ()
#13 0x0813b8ba in PostgresMain ()
#14 0x0811f37b in ClosePostmasterPorts ()
#15 0x0811ee03 in ClosePostmasterPorts ()
#16 0x0811dc09 in PostmasterMain ()
#17 0x0811d3a6 in PostmasterMain ()
#18 0x080f5113 in main ()
#19 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6
(gdb) info reg
eax0xfe00   -512
ecx0x8282740136849216
edx0x5  5
ebx0x8  8
esp0xbfffe0b4   0xbfffe0b4
ebp0xbfffe0d8   0xbfffe0d8
esi0x0  0
edi0x8282740136849216
eip0xe002   0xe002
eflags 0x246582
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0  0
gs 0x33 51
Is it just me, or are both sides reading waiting for the other
side to send data?
Kevin

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-15 Thread Kevin Houle
--On Friday, August 15, 2003 08:43:23 AM -0400 Tom Lane <[EMAIL PROTECTED]> 
wrote:

Kevin Houle <[EMAIL PROTECTED]> writes:
The nature of the bug is that an 'lo_read' operation performed
with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
"hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
hang is actually a read() loop on the socket generating EAGAIN
error on each read().
Can you get us a gdb stack trace from the segfault cases?  (And from
the loop cases too; a stack trace back from the repeated read() call
would be useful.)
The attached gdb-7.3.2.log is from the segfault case using
postgresql-libs-7.3.2 and the gdb-7.3.4.log is the loop case
from postgresql-libs-7.3.4. I had to SIGINT the loop case to
get the stack trace.
Also, does adding/removing SSL encryption affect your results?
Yes. It works flawlessly without SSL. I thought I had tested that
previously but in fact did not correctly; I re-tested and it works
without SSL and fails with SSL. The stack traces seem to point to
pqsecure_read().
Let me know if I can provide anything else.

Kevin




gdb-7.3.2.log
Description: Binary data


gdb-7.3.4.log
Description: Binary data

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-15 Thread Tom Lane
Kevin Houle <[EMAIL PROTECTED]> writes:
> Is it just me, or are both sides reading waiting for the other
> side to send data?

Sure looks like it.  Could it be an OpenSSL bug?

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-15 Thread Tom Lane
Kevin Houle <[EMAIL PROTECTED]> writes:
>> Can you get us a gdb stack trace from the segfault cases?

> #36 0x40421712 in pqsecure_read () from /usr/lib/libpq.so.3
> #37 0x40421712 in pqsecure_read () from /usr/lib/libpq.so.3
> #38 0x40421712 in pqsecure_read () from /usr/lib/libpq.so.3
> #39 0x40421712 in pqsecure_read () from /usr/lib/libpq.so.3
> ...
> cut thousands of lines; you get the idea

In 7.3.2 pqsecure_read will recurse to self when SSL_read returns
SSL_ERROR_WANT_READ.  I changed the recursion to a loop in 7.3.4.
Evidently, in 7.3.2 it's possible for the recursion to overflow your
alloted stack space before the process uses up its timeslice :-(.
In 7.3.4 the loop simply doesn't exit.

I don't understand why, though.  What I expected would happen is that
the process would busy-wait until more data becomes available from the
connection.  That's not ideal but the full fix seemed safer to postpone
to 7.4.  Can you set up this situation, then attach with gdb to the
connected backend and see what it thinks it's doing?  A stacktrace from
that side of the connection might shed some light.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-15 Thread Tom Lane
Kevin Houle <[EMAIL PROTECTED]> writes:
> The nature of the bug is that an 'lo_read' operation performed
> with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
> "hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
> hang is actually a read() loop on the socket generating EAGAIN
> error on each read().

Can you get us a gdb stack trace from the segfault cases?  (And from
the loop cases too; a stack trace back from the repeated read() call
would be useful.)

Also, does adding/removing SSL encryption affect your results?

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[BUGS] DBD::Pg 'lo_read' fails on >= 32768 byte large objects

2003-08-14 Thread Kevin Houle
Hello,

There appears to be a bug in the code used by DBD::Pg's 'lo_read'
function. I presume the code with the bug is a part of libpq, and
thus the postgresql-libs RPM binary distribution.
The nature of the bug is that an 'lo_read' operation performed
with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
"hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
hang is actually a read() loop on the socket generating EAGAIN
error on each read().
I've attached a short perl script that demonstrates the bug.
The database server does not seem to matter; I get the same results
using 7.2 as I do with 7.3.x servers. I also get the same results
when I vary the perl-DBD-Pg and perl-DBI module versions. The bug
seems to be following libpg.
All of my testing has been on the i686 platform using linux 2.4.20
kernels.
Test case #1:
- client = postgresql-libs-7.2.3-5 (redhat-7.3)
   perl-DBI-1.21-1 (redhat-7.3)
   perl-DBD-Pg-1.01-8 (redhat-7.3)
   perl-5.6.1
- server = postgresql-server-7.3.2-3 (redhat-9)
- transport = tcpip + ssl
$ perl test.pl
- reading testfile '/bin/ls'
- inserting testfile contents
- oid = '16265435'
- reading large object
success.. removing large object
Test case #2:
- client = postgresql-libs-7.3.2-3 (redhat-9)
   perl-DBI-1.32-5 (redhat-9)
   perl-DBD-Pg-1.21-2 (redhat-9)
-server = postgresql-server-7.3.2-3 (redhat-9)
-transport = tcpip + ssl
$ perl test.pl
- reading testfile '/bin/ls'
- inserting testfile contents
- oid = '16265436'
- reading large object
Segmentation fault
... after ~32768 bytes, process loops reading the socket
 read(3, 0x81cc938, 5) = -1 EAGAIN (Resource temporarily unavailable)
and eventually segfaults
Test case #3:
- same as #2 except
- transport = tcpip (localhost)
$ perl test.pl
- reading testfile '/bin/ls'
- inserting testfile contents
- oid = '16265436'
- reading large object
Segmentation fault
Test case #4:
- same as #2 and #3 except
- postgresql-libs-7.3.4 PGDG used
$ perl test.pl
- reading testfile '/bin/ls'
- inserting testfile contents
- oid = '16265437'
- reading large object
... after ~32768 bytes, process loops reading the socket
 read(3, 0x81cc938, 5) = -1 EAGAIN (Resource temporarily unavailable)
but doesn't appear to segfault (unless I am not patient enough)
This problem is a bastard as it makes getting data out of an
otherwise perfectly good database rather difficult. In other
words, its a show-stopper when migrating from 7.2 -> 7.3.
Regards,
Kevin


test.pl
Description: Binary data

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])