Re: [BUGS] [ODBC] Segmentation Fault in Postgres server when using psqlODBC

2013-06-13 Thread Hiroshi Inoue

(2013/06/12 1:26), Andres Freund wrote:

On 2013-06-11 19:20:57 +0300, Heikki Linnakangas wrote:

On 11.06.2013 19:04, Joshua Berry wrote:

Hiroshi Inoue has developed the attached patch to correct the issue that
was reported. More of the dialogue can be found in the pgsql-odbc list.


I tried to follow that thread over at pgsql-odbc, but couldn't quite
understand what the problem is. Did you have a test program to reproduce it?
Or failing that, what is the sequence of protocol messages that causes the
problem?


I'd guess creating a SQL level WITH HOLD cursor and then fetching that
via the extended protocol, outside the transaction, should do the trick.


OK I made a test C program which reproduces the crash.
The program uses libpq and a hack.

I attached the program.
Please modify the connection string to suit your environment.
Note that the connection must be non-SSL.
Also add error checking as needed.

regards,
Hiroshi Inoue

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>
#ifdef	WIN32
#include <WinSock2.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>	/* htonl */
#endif
#define	MY_CUR	"mycur"
int main(int argc, const char **argv)
{
	const char	*connstr;
	PGconn		*conn;
	PGresult	*result;
	int			sock;
	int			len, count;

	if (argc > 1)
		connstr = argv[1];
	else
		connstr = "host=localhost port=5432 dbname=x user=x password=x";
	conn = PQconnectdb(connstr);
	// declared outside a transaction block, so the cursor outlives its creating transaction
	result = PQexec(conn, "declare " MY_CUR " cursor with hold for select * from generate_series(1, 2) as i");
	PQclear(result);
	// the raw socket writes below bypass libpq, so they only work unencrypted
	if (PQgetssl(conn) != NULL)
	{
		printf("Use non-ssl connection\n");
		return 1;
	}
	sock = PQsocket(conn);
	if (sock < 0) 
	{
		printf("socket error\n");
		return 1;
	}
	// send execute message: 'E', length, portal name, max rows
	send(sock, "E", 1, 0); 
	len = sizeof(len) + strlen(MY_CUR) + 1 + sizeof(count);
	len = htonl(len);
	send(sock, (const char *) &len, sizeof(len), 0);
	send(sock, MY_CUR, strlen(MY_CUR) + 1, 0);
	count = htonl(1);
	send(sock, (const char *) &count, sizeof(count), 0);

	result = PQexec(conn, "close " MY_CUR);
	if (!result)
		printf("close error\n");
	else
		printf("result error=%s\n", PQresultErrorMessage(result));
	PQclear(result);

	PQfinish(conn);

	return 0;
}
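
The Execute message that the program above writes to the socket has a simple layout in the version 3 frontend/backend protocol: a type byte 'E', an Int32 length that counts itself but not the type byte, the portal name as a NUL-terminated string, and an Int32 row limit (0 means no limit). As a minimal sketch, the same bytes can be assembled in one buffer instead of four send() calls. build_execute_msg and put_be32 are hypothetical helper names, not libpq functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value in network (big-endian) order. */
void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char) (v >> 24);
    p[1] = (unsigned char) (v >> 16);
    p[2] = (unsigned char) (v >> 8);
    p[3] = (unsigned char) v;
}

/*
 * Hypothetical helper (not part of libpq): serialize a frontend Execute
 * message into buf. Returns the number of bytes written, or 0 if buf is
 * too small to hold the whole message.
 */
size_t build_execute_msg(unsigned char *buf, size_t cap,
                         const char *portal, uint32_t max_rows)
{
    size_t   name_len = strlen(portal) + 1;          /* include the NUL */
    uint32_t len = (uint32_t) (4 + name_len + 4);    /* length field counts itself */
    size_t   off = 0;

    if (1 + (size_t) len > cap)
        return 0;

    buf[off++] = 'E';                                /* message type */
    put_be32(buf + off, len);
    off += 4;
    memcpy(buf + off, portal, name_len);             /* portal (cursor) name */
    off += name_len;
    put_be32(buf + off, max_rows);                   /* row limit, 0 = no limit */
    off += 4;
    return off;
}
```

For the portal "mycur" and a row limit of 1 this produces 15 bytes: 'E', 00 00 00 0e, "mycur" plus its NUL, 00 00 00 01. The buffer can then be handed to a single send() call on the socket.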
-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] [ODBC] Segmentation Fault in Postgres server when using psqlODBC

2013-06-13 Thread Tom Lane
Hiroshi Inoue in...@tpf.co.jp writes:
> OK I made a test C program which reproduces the crash.
> The program uses libpq and a hack.

Oh, thank you, I was just about to go spend an hour doing that ...

regards, tom lane




Re: [BUGS] [ODBC] Segmentation Fault in Postgres server when using psqlODBC

2013-06-13 Thread Tom Lane
Hiroshi Inoue in...@tpf.co.jp writes:
> (2013/06/12 1:26), Andres Freund wrote:
>> I'd guess creating a SQL level WITH HOLD cursor and then fetching that
>> via the extended protocol, outside the transaction, should do the trick.

> OK I made a test C program which reproduces the crash.
> The program uses libpq and a hack.

I've committed a fix for this.  Thanks again for the test case.

regards, tom lane




Re: [BUGS] [ODBC] Segmentation Fault in Postgres server when using psqlODBC

2013-06-13 Thread Joshua Berry
Hiroshi, Tom, and Andres,

On Thu, Jun 13, 2013 at 12:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:

> Hiroshi Inoue in...@tpf.co.jp writes:
> > OK I made a test C program which reproduces the crash.
> > The program uses libpq and a hack.
>
> I've committed a fix for this.  Thanks again for the test case.


Many thanks for your time and effort in debugging, testing, and patching
this.

Kind Regards,
-Joshua


Re: [BUGS] [ODBC] Segmentation Fault in Postgres server when using psqlODBC

2013-06-11 Thread Joshua Berry
Hiroshi Inoue has developed the attached patch to correct the issue that
was reported. More of the dialogue can be found in the pgsql-odbc list.

The root issue:

 Inoue, Hiroshi in...@tpf.co.jp wrote:

 It's also preferable to fix the crash in the backend.
 The crash is caused by Execute commands issued after commit.


Regarding testing:


 Is there any test code that I could leverage to put together a test case
 which can quickly invoke the backend problem that I'm seeing? Perhaps
 something that is used in the psqlODBC project or something else you
 or others might have sitting around? I would like to have a
 testapp/function that could help verify that the issue has been fixed in
 a future backend patch/release.


 It seems difficult to provide test code. However, I can reproduce
 the crash by changing one line of the psqlodbc driver source code with a
 test case. For example, the crash is fixed by the attached patch.


  I've never explicitly used EXECUTE. Could I construct a plpgsql script
 which could use EXECUTE in a similar manner as psqlODBC, thus creating a
 test case that would have greater portability?


 Oops, it's an Execute message used in the extended query protocol, not an
 *EXECUTE* command.



printtup_holdable_cursor.patch
Description: Binary data



Re: [BUGS] [ODBC] Segmentation Fault in Postgres server when using psqlODBC

2013-06-11 Thread Heikki Linnakangas

On 11.06.2013 19:04, Joshua Berry wrote:

Hiroshi Inoue has developed the attached patch to correct the issue that
was reported. More of the dialogue can be found in the pgsql-odbc list.


I tried to follow that thread over at pgsql-odbc, but couldn't quite 
understand what the problem is. Did you have a test program to reproduce 
it? Or failing that, what is the sequence of protocol messages that 
causes the problem?


- Heikki




Re: [BUGS] [ODBC] Segmentation Fault in Postgres server when using psqlODBC

2013-06-11 Thread Andres Freund
On 2013-06-11 19:20:57 +0300, Heikki Linnakangas wrote:
> On 11.06.2013 19:04, Joshua Berry wrote:
> >Hiroshi Inoue has developed the attached patch to correct the issue that
> >was reported. More of the dialogue can be found in the pgsql-odbc list.
> 
> I tried to follow that thread over at pgsql-odbc, but couldn't quite
> understand what the problem is. Did you have a test program to reproduce it?
> Or failing that, what is the sequence of protocol messages that causes the
> problem?

I'd guess creating a SQL level WITH HOLD cursor and then fetching that
via the extended protocol, outside the transaction, should do the trick.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [BUGS] [ODBC] Segmentation Fault in Postgres server when using psqlODBC

2013-05-24 Thread Hiroshi Inoue

Hi,

Psqlodbc drivers send Execute requests for cursors instead of
issuing FETCH commands.
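
At the wire level the two approaches look quite different. A FETCH command is ordinary SQL text, which in the simple query protocol travels inside a Query message ('Q', Int32 length counting itself, NUL-terminated query string), whereas Execute is a separate extended-protocol message that names the portal directly. A minimal sketch of the Query form, for comparison; build_fetch_query_msg is a hypothetical helper, not part of psqlODBC or libpq, and it assumes the cursor name is a trusted identifier that needs no quoting.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: serialize a simple-protocol Query message carrying
 * "FETCH <count> FROM <cursor>". Returns the number of bytes written, or 0
 * if the SQL text or the output buffer is too small.
 */
size_t build_fetch_query_msg(unsigned char *buf, size_t cap,
                             const char *cursor, unsigned count)
{
    char     sql[128];
    int      n = snprintf(sql, sizeof sql, "FETCH %u FROM %s", count, cursor);
    size_t   body;
    uint32_t len;

    if (n < 0 || (size_t) n >= sizeof sql)
        return 0;                          /* formatting failed or truncated */

    body = (size_t) n + 1;                 /* query text plus its NUL */
    len = (uint32_t) (4 + body);           /* length field counts itself */
    if (1 + (size_t) len > cap)
        return 0;

    buf[0] = 'Q';                          /* message type */
    buf[1] = (unsigned char) (len >> 24);  /* length in network byte order */
    buf[2] = (unsigned char) (len >> 16);
    buf[3] = (unsigned char) (len >> 8);
    buf[4] = (unsigned char) len;
    memcpy(buf + 5, sql, body);
    return 1 + (size_t) len;
}
```

With the cursor "mycur" and a count of 1 the message is 24 bytes long and carries the text "FETCH 1 FROM mycur". The backend parses and runs that text as a regular SQL command, while an Execute message skips the SQL layer and runs the named portal directly.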

regards,
Hiroshi Inoue

(2013/05/25 1:55), Joshua Berry wrote:

Hi Groups,

I'm dealing with periodic backend process segmentation faults. I'm
posting to both the bugs and odbc lists as it seems that my
application's use of psqlODBC triggers a bug in the postgres backend.

The environment is:
Clients: win32 clients using various versions of the psqlODBC driver
(mostly using 8.04.0200). The connection string contains
UseDeclareFetch=1, which causes long idle transactions and heavy cursor
and savepoint use.
Server: Dell dual Xeon x86 48GB RAM (12GB PG shared mem) RHEL system
with Postgresql 9.2.4. Note that these same issues occurred on PG9.1 and
PG8.4.

I've experienced these issues for over a year now. However, during that
time several things have changed which may or may not be related:
* The schema has been modified with heavier use of triggers to update
manually created materialized views (i.e., not using the 9.3 CREATE
MATERIALIZED VIEW).
* The number of concurrent users has increased (from perhaps 15
concurrent users two years ago to 30 concurrent users now).
* The PG version used has changed from 8.4 to 9.1, and finally to 9.2.
* I've done recent tuning of the planner cost constants in order to
favor index scans over table scans in more situations.

I've looked through past error logs and I found that I had a segfault in
the server process while using PG 8.4 about a dozen times over a 12
month period. Each time one client's postgres process crashes, the
backend forcefully closes all active connections due to possibly
corrupted memory and then restarts. This leaves all active clients
stranded as the connection is closed, and all cursors and savepoint info
is lost. My app doesn't recover gracefully, and users are forced to
click through some cryptic error messages from the application framework
used for the app (Clarion) and then restart the app.

A few months ago, I upgraded to another server with PG 9.1, assuming
that the issue with the previous server with 8.4 was due to bad RAM, as
I did observe a high ECC error count on the previous system as logged by
the IPMI controller. However, I very quickly had another segfault on the
new system with PG9.1. The default settings of the OS and PG init script
disabled core dumps, so I only started collecting core files for the
past few months.

I posted details in pgsql-general on April 10, 2013; here is a link to the
thread:
http://www.postgresql.org/message-id/capmzxm03meden6nqqf_phs3m1dk-eaxp5_k-lmirneojmaq...@mail.gmail.com

The crash is always some variation of the following stack, as observed
in both PG91 and PG92 crashes:
(gdb) bt
#0  ResourceOwnerEnlargeCatCacheRefs (owner=0x0) at resowner.c:603
#1  0x0070f372 in SearchCatCache (cache=0x27fab90, v1=<value optimized out>, v2=<value optimized out>, v3=<value optimized out>, v4=<value optimized out>) at catcache.c:1136
#2  0x0071b1ae in getTypeOutputInfo (type=20, typOutput=0x2b3db80, typIsVarlena=0x2b3db88 "") at lsyscache.c:2482
#3  0x0045d127 in printtup_prepare_info (myState=0x2810290, typeinfo=0x29ad7b0, numAttrs=42) at printtup.c:263
#4  0x0045d4c4 in printtup (slot=0x3469650, self=0x2810290) at printtup.c:297
#5  0x0065a76a in RunFromStore (portal=0x285fed0, direction=<value optimized out>, count=10, dest=0x2810290) at pquery.c:1122
#6  0x0065a852 in PortalRunSelect (portal=0x285fed0, forward=<value optimized out>, count=10, dest=0x2810290) at pquery.c:940
#7  0x0065bcf8 in PortalRun (portal=0x285fed0, count=10, isTopLevel=1 '\001', dest=0x2810290, altdest=0x2810290, completionTag=0x7fffee67f7c0 "") at pquery.c:788
#8  0x00659552 in exec_execute_message (argc=<value optimized out>, argv=<value optimized out>, dbname=0x2768370 "DBNAME", username=<value optimized out>) at postgres.c:1929
#9  PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x2768370 "DBNAME", username=<value optimized out>) at postgres.c:4016
#10 0x00615161 in BackendRun () at postmaster.c:3614
#11 BackendStartup () at postmaster.c:3304
#12 ServerLoop () at postmaster.c:1367
#13 0x00617dcc in PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1127
#14 0x005b6830 in main (argc=5, argv=0x2766480) at main.c:199
(gdb)

Andres Freund and Tom Lane spent some time tracking down possible root
causes with some of the following summaries:

Tom Lane writes:
|Andres Freund and...@2ndquadrant.com writes:
|> Tom: It looks to me like printtup_prepare_info won't normally be called
|> in a held cursor. But if some concurrent DDL changed the number of
|> columns in typeinfo vs. those in the receiver, that could explain the
|> issue and why it's not seen all the time, right?
|
|It looks to me like there are probably two triggering conditions:
|
|1. Client is doing a direct protocol Execute on a held-cursor portal.
|
|2. Cache