El día viernes, abril 23, 2021 a las 11:32:02a. m. +0200, Matthias Apitz
escribió:
> El día viernes, abril 23, 2021 a las 10:48:24a. m. +0200, Laurenz Albe
> escribió:
>
> > > The serverlog has around this time (sorry for German):
> > >
> > > 2021-04-23 05:55:23.591 CEST [2317] LOG: unvollständige Message vom
> > > Client
> > > 2021-04-23 05:55:23.593 CEST [2317] FEHLER: Speicher aufgebraucht
> > > 2021-04-23 05:55:23.593 CEST [2317] DETAIL: Kann Zeichenkettenpuffer mit
> > > 0 Bytes nicht um 1380986158 Bytes vergrößern.
> > > 2021-04-23 05:55:23.593 CEST [2317] LOG: konnte Daten nicht an den
> > > Client senden: Datenübergabe unterbrochen (broken pipe)
> > > 2021-04-23 05:55:23.593 CEST [2317] FATAL: Verbindung zum Client wurde
> > > verloren
> > > 2021-04-23 05:57:20.339 CEST [27190] LOG: unerwartetes EOF auf
> > > Client-Verbindung mit einer offenen Transaktion
> > > 2021-04-23 05:59:17.419 CEST [26827] LOG: unerwartetes EOF auf
> > > Client-Verbindung mit einer offenen Transaktion
> > >
> > > What else could be checked for this? Any hints about this issue?
> >
> > It seems pretty clear that both the client and the server are waiting for
> > each other.
>
> Yes, they are both alive as processes. Both are running on the same
> Linux server, i.e. no real network between. And any loss of data would
> have caused perhaps the end of one (and then the other) process.
> I will try to gather a TCPDUMP for this.
>
I hit the same situation again. The ESQL/C client sent an UPDATE to the
PostgreSQL server:
# gdb /opt/lib/sisis/catserver/bin/INDEX 31864
...
#0 0x00007f688bb955c4 in poll () from /lib64/libc.so.6
#1 0x00007f688a82622f in pqSocketPoll () from
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#2 0x00007f688a8260ff in pqSocketCheck () from
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#3 0x00007f688a825f83 in pqWaitTimed () from
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#4 0x00007f688a825f55 in pqWait () from
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#5 0x00007f688a821d3a in PQgetResult () from
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#6 0x00007f688a822478 in PQexecFinish () from
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#7 0x00007f688a82212e in PQexecParams () from
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#8 0x00007f688b792216 in ecpg_execute ()
from /usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#9 0x00007f688b7935de in ecpg_do () from
/usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#10 0x00007f688b7936ef in ECPGdo () from
/usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#11 0x00007f688dd831c6 in update_record () at per_kette.pgc:1968
#12 0x00007f688dd7ed16 in per_kette (zugriff=10, scroll=-20000, lock=-20000,
key=-20000,
sto=-20000, p_daten=0x7ffef0499950, sel_anw=0x0, where_anw=0x0,
p_btw_daten=0x0,
order_by=0x0, auf_ab=0x0, group_by=0x0, having=0x0, into_temp=0x0,
count=0x7ffef0498670) at per_kette.pgc:497
#13 0x00007f688dcf770f in DB_rewr (tabmodul=0x7f688dd7e7ca <per_kette>,
p_daten=0x7ffef0499950) at dbcall.pgc:3274
#14 0x00007f688ea58821 in BKDeskInsert (setid=FSTAB_Personen, kette=...,
katkey=168952979) at ./BKDesk.c:1447
and the Pos server sits idle:
# select * from pg_stat_activity where client_port = 52288;
datid | datname | pid | leader_pid | usesysid | usename |
application_name | client_addr | client_hostname | client_port |
backend_start | xact_start |
query_start | state_change | wait_event_type |
wait_event | state | backend_xid | backend_xmin | query
| backend_type
--------+---------+-------+------------+----------+---------+---------------------------------+-------------+-----------------+-------------+---------------------------------+---------------------------------+---------------------------------+---------------------------------+-----------------+------------+---------------------+-------------+--------------+---------------------+----------------
951941 | sisis | 31894 | | 16384 | sisis | SunRise DBCALL
V7.2 (pid=31864) | 127.0.0.1 | | 52288 | 25.04.2021
10:57:15.922137 CEST | 25.04.2021 10:57:15.968081 CEST | 25.04.2021
10:57:16.051256 CEST | 25.04.2021 10:57:16.051273 CEST | Client |
ClientRead | idle in transaction | 18314796 | 18314796 | fetch
per_kette_seq | client backend
But this time I was collecting all TCP traffic of all clients to and
from the port 5432. This TCPDUMP command writes all traffic as binary
data to files of 100.000.000 bytes in size (-C 100) and appends a number
when a new file is needed:
# /usr/sbin/tcpdump -n -i lo -C 100 -w /var/spool/sisis/tcp5432. port 5432 &
Later one can look into the files for the local port 52288 of the
connection with:
# tcpdump -n -r tcp5432.2 -Xx -s1024 port 52288 > INDEX.52288.txt
10:57:15.921783 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [S], seq 3652783347,
win 65495, options [mss 65495,sackOK,TS val 3424445579 ecr 0,nop,wscale 7],
length 0
0x0000: 4500 003c b720 4000 4006 8599 7f00 0001 E..<..@.@.......
0x0010: 7f00 0001 cc40 1538 d9b9 0cf3 0000 0000 [email protected]........
0x0020: a002 ffd7 fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: cc1c e48b 0000 0000 0103 0307 ............
10:57:15.921792 IP 127.0.0.1.5432 > 127.0.0.1.52288: Flags [S.], seq
4241480135, ack 3652783348, win 65483, options [mss 65495,sackOK,TS val
3424445579 ecr 3424445579,nop,wscale 7], length 0
0x0000: 4500 003c 0000 4000 4006 3cba 7f00 0001 E..<..@.@.<.....
0x0010: 7f00 0001 1538 cc40 fccf d9c7 d9b9 0cf4 .....8.@........
0x0020: a012 ffcb fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: cc1c e48b cc1c e48b 0103 0307 ............
10:57:15.921799 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [.], ack 1, win 512,
options [nop,nop,TS val 3424445579 ecr 3424445579], length 0
0x0000: 4500 0034 b721 4000 4006 85a0 7f00 0001 E..4.!@.@.......
0x0010: 7f00 0001 cc40 1538 d9b9 0cf4 fccf d9c8 [email protected]........
0x0020: 8010 0200 fe28 0000 0101 080a cc1c e48b .....(..........
0x0030: cc1c e48b ....
And here you see the last command (the UPDATE) sent to the Pos server:
10:57:16.051388 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [P.], seq
14407:14812, ack 120304, win 512, options [nop,nop,TS val 3424445708 ecr
3424445708], length 405
0x0000: 4500 01c9 b84f 4000 4006 82dd 7f00 0001 E....O@.@.......
0x0010: 7f00 0001 cc40 1538 d9b9 453a fcd1 afb7 [email protected]:....
0x0020: 8018 0200 ffbd 0000 0101 080a cc1c e50c ................
0x0030: cc1c e50c 5000 0000 7e00 7570 6461 7465 ....P...~.update
0x0040: 2070 6572 5f6b 6574 7465 2073 6574 2064 .per_kette.set.d
0x0050: 6573 6b6e 7220 3d20 2431 2020 2c20 6672 esknr.=.$1..,.fr
0x0060: 6569 203d 2024 3220 202c 206d 6f64 6966 ei.=.$2..,.modif
0x0070: 203d 2024 3320 202c 2061 6e7a 203d 2024 .=.$3..,.anz.=.$
0x0080: 3420 202c 206b 6174 6c69 7374 203d 2024 4..,.katlist.=.$
0x0090: 3520 2077 6865 7265 2063 7572 7265 6e74 5..where.current
0x00a0: 206f 6620 6863 5f70 6572 5f6b 6574 7465 .of.hc_per_kette
0x00b0: 0000 0042 0000 00ff 0000 0000 0005 0000 ...B............
0x00c0: 0007 3133 3834 3131 3800 0000 0130 0000 ..1384118....0..
0x00d0: 0001 3000 0000 0232 3600 0000 d25c 7865 ..0....26....\xe
0x00e0: 6539 6530 3130 3031 6561 6630 3130 3035 e9e01001eaf01005
0x00f0: 3761 6630 3130 3030 3563 6530 3130 3038 7af010005ce01008
0x0100: 3264 3230 3130 3064 3466 3830 3130 3038 2d20100d4f801008
0x0110: 6530 6130 3230 3061 6633 3430 3230 3032 e0a0200af3402002
0x0120: 3533 6130 3230 3061 3538 6530 3230 3061 53a0200a58e0200a
0x0130: 6538 6530 3230 3034 3461 3630 3230 3033 e8e020044a602003
0x0140: 6561 3730 3230 3032 3066 3130 3330 3032 ea7020020f103002
0x0150: 3466 3130 3330 3035 3431 3730 3430 3061 4f1030054170400a
0x0160: 3034 3330 3430 3033 6636 3730 3430 3062 04304003f670400b
0x0170: 6138 3930 3430 3033 6265 6430 3030 3030 a8904003bed00000
0x0180: 3534 3930 3130 3039 3235 3230 3130 3037 5490100925201007
0x0190: 3435 3430 3130 3038 3336 3030 3130 3038 4540100836001008
0x01a0: 3238 3330 3130 3039 3330 3431 3230 6100 28301009304120a.
0x01b0: 0100 0044 0000 0006 5000 4500 0000 0900 ...D....P.E.....
0x01c0: 0000 0000 5300 0000 04 ....S....
10:57:16.051395 IP 127.0.0.1.5432 > 127.0.0.1.52288: Flags [.], ack 14812, win
512, options [nop,nop,TS val 3424445708 ecr 3424445708], length 0
0x0000: 4500 0034 453c 4000 4006 f785 7f00 0001 E..4E<@.@.......
0x0010: 7f00 0001 1538 cc40 fcd1 afb7 d9b9 46cf [email protected].
0x0020: 8010 0200 fe28 0000 0101 080a cc1c e50c .....(..........
0x0030: cc1c e50c ....
The send buffer is complete, the bytea string \x ....00 is terminated
by a NULL byte. TCP ACK'ed the package and then no further reaction of
the Pos server on this connection. It still sits there if someone needs
more information about this connection.
>From the collected data, I tend to say: this must be a bug in the Pos
server...
Any ideas?
matthias
--
Matthias Apitz, ✉ [email protected], http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub