Status: New
Owner: ----
New issue 1179 by bcandler...@googlemail.com: gnt-instance move socat error - certificate commonName does not match hostname
https://code.google.com/p/ganeti/issues/detail?id=1179
*What software version are you running?*
root@nuc1:~# gnt-cluster --version
gnt-cluster (ganeti 2.15.2-3) 2.15.2
root@nuc1:~# gnt-cluster version
Software version: 2.15.2
Internode protocol: 2150000
Configuration format: 2150000
OS api version: 20
Export interface: 0
VCS version: (ganeti) version 2.15.2-3
root@nuc1:~# hspace --version
hspace (ganeti) version 2.15.2-3
compiled with ghc 7.10
running on linux x86_64
*What distribution are you using?*
Two nodes both ubuntu 16.04
10.10.0.238 = nuc1.ws.nsrc.org
10.10.0.239 = nuc2.ws.nsrc.org
*What steps will reproduce the problem?*
root@nuc1:~# gnt-instance list -o name,pnode,snodes pfsense1
Instance Primary_node     Secondary_Nodes
pfsense1 nuc2.ws.nsrc.org
root@nuc1:~# gnt-instance move -n nuc1.ws.nsrc.org -d pfsense1
2016-05-08 16:22:41,729: gnt-instance move pid=6299 cli:1218 DEBUG Command line: gnt-instance move -n nuc1.ws.nsrc.org -d pfsense1
Instance pfsense1 will be moved. This requires a shutdown of the
instance. Continue?
y/[n]/?: y
Sun May 8 16:22:47 2016 - INFO: Not checking memory on the secondary node as instance will not be started
Sun May 8 16:22:48 2016 - INFO: Shutting down instance pfsense1 on source node nuc2.ws.nsrc.org
Sun May 8 16:22:49 2016 Exporting disk/0 from nuc2.ws.nsrc.org to nuc1.ws.nsrc.org
Sun May 8 16:22:53 2016 disk/0 is now listening, starting export
Sun May 8 16:22:56 2016 - WARNING: import 'import-disk0-2016-05-08_16_22_49-MXuNPT' on nuc1.ws.nsrc.org failed: Exited with status 1
Sun May 8 16:22:56 2016 disk/0 failed to receive data: Exited with status 1 (recent output: 0+0 records in\n0+0 records out\n0 bytes copied, 4.81635 s, 0.0 kB/s)
Sun May 8 16:22:56 2016 - WARNING: Aborting export 'export-disk0-2016-05-08_16_22_54-MM8F_x' on 0b180aa9-2c45-4250-a54a-e56c41b5cc68
Sun May 8 16:22:56 2016 - WARNING: export 'export-disk0-2016-05-08_16_22_54-MM8F_x' on nuc2.ws.nsrc.org failed: Exited with status 1
Sun May 8 16:22:56 2016 disk/0 failed to send data: Exited with status 1 (recent output: socat: E certificate is valid but its commonName does not match hostname\ndd: dd: error writing 'standard output': Broken pipe\ndd: error writing 'standard output': Broken pipe\n2+0 records in\n1+0 records out\n1114112 bytes (1.1 MB, 1.1 MiB) copied, 0.0976589 s, 11.4 MB/s)
Sun May 8 16:22:57 2016 - WARNING: Some disks failed to copy, aborting
2016-05-08 16:22:57,864: gnt-instance move pid=6299 cli:1225 ERROR Error during command processing
Traceback (most recent call last):
  File "/usr/share/ganeti/2.15/ganeti/cli.py", line 1221, in GenericMain
    result = func(options, args)
  File "/usr/share/ganeti/2.15/ganeti/client/gnt_instance.py", line 851, in MoveInstance
    SubmitOrSend(op, opts, cl=cl)
  File "/usr/share/ganeti/2.15/ganeti/cli.py", line 1011, in SubmitOrSend
    return SubmitOpCode(op, cl=cl, feedback_fn=feedback_fn, opts=opts)
  File "/usr/share/ganeti/2.15/ganeti/cli.py", line 976, in SubmitOpCode
    reporter=reporter)
  File "/usr/share/ganeti/2.15/ganeti/cli.py", line 955, in PollJob
    return GenericPollJob(job_id, _LuxiJobPollCb(cl), reporter)
  File "/usr/share/ganeti/2.15/ganeti/cli.py", line 777, in GenericPollJob
    errors.MaybeRaise(msg)
  File "/usr/share/ganeti/2.15/ganeti/errors.py", line 531, in MaybeRaise
    raise errcls(*args)
OpExecError: Errors during disk copy: Failed to transfer instance data
Failure: command execution error:
Errors during disk copy: Failed to transfer instance data
root@nuc1:~#
"gnt-cluster renew-crypto --new-node-certificates" is successful, but does
not affect the results. Restarting ganeti on both nodes makes no difference.
Replacing /usr/bin/socat with:
#!/bin/sh
echo "$0" "$@" >>/tmp/socat.out
exec socat.real "$@"
allowed me to capture the command arguments.
=== On nuc1: receiver of image and cluster master ===
/usr/bin/socat -ls -d -d -b1048576 -u
OPENSSL-LISTEN:0,reuseaddr,forever,intervall=0.01,keepalive,keepidle=60,keepintvl=10,keepcnt=5,verify=1,method=TLS1,cipher=HIGH:-DES:-3DES:-EXPORT:-DH,compress=none,key=/var/lib/ganeti/server.pem,cert=/var/lib/ganeti/server.pem,cafile=/var/run/ganeti/import-export/import-disk0-2016-05-08_16_27_40-mDO5uf/ca,pf=ipv4
stdout
=== On nuc2: sender of image ===
/usr/bin/socat -ls -d -d -b1048576 -u stdin
OPENSSL:10.10.0.238:46423,connect-timeout=20,retry=10,intervall=1,keepalive,keepidle=60,keepintvl=10,keepcnt=5,verify=1,method=TLS1,cipher=HIGH:-DES:-3DES:-EXPORT:-DH,compress=none,key=/var/lib/ganeti/server.pem,cert=/var/lib/ganeti/server.pem,cafile=/var/run/ganeti/import-export/export-disk0-2016-05-08_16_27_45-u0Sxcf/ca,pf=ipv4
OK, so it seems that the sender is trying to connect by IP address
(10.10.0.238 = nuc1), but socat fails because the hostname in the
certificate is "nuc1.ws.nsrc.org".
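To make that reading of the captured arguments explicit, here is a minimal Python sketch that splits the sender-side socat address into host, port and options, and checks whether the host is an IP literal. The helper names (parse_socat_address, is_ip_literal) are made up for illustration and are not part of Ganeti or socat; the address string is an abbreviated copy of the one captured on nuc2, and the parser assumes an IPv4 address or hostname (it would not handle a bracketed IPv6 address).

```python
import ipaddress


def parse_socat_address(address):
    """Split 'TYPE:host:port,opt=val,...' into its parts (IPv4 or hostname only)."""
    # Only the part before the first comma holds TYPE:host:port; later colons
    # (for example in the cipher list) belong to option values.
    head, *raw_options = address.split(",")
    address_type, host, port = head.split(":")
    options = {}
    for item in raw_options:
        name, sep, value = item.partition("=")
        options[name] = value if sep else None  # bare flags such as 'keepalive'
    return address_type, host, int(port), options


def is_ip_literal(host):
    """Return True if host is an IP address rather than a DNS name."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


# Abbreviated copy of the sender-side address captured on nuc2.
captured = (
    "OPENSSL:10.10.0.238:46423,connect-timeout=20,retry=10,keepalive,"
    "verify=1,cipher=HIGH:-DES:-3DES:-EXPORT:-DH,compress=none,pf=ipv4"
)
address_type, host, port, options = parse_socat_address(captured)
print(host, is_ip_literal(host), "openssl-commonname" in options)
```

For the captured address this reports that the target is an IP literal and that no openssl-commonname option is present, which is consistent with the error socat prints.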
It says in the socat manpage:
    NOTE: Up to version 1.7.2.4 the server certificate was only checked
    for validity against the system certificate store or cafile or capath,
    but not for match with the server’s name or its IP address. Since
    version 1.7.3.0 socat checks the peer certificate for match with the
    <host> parameter or the value of the openssl-commonname option. Socat
    tries to match it against the certificates subject commonName, and the
    certifications extension subjectAltName DNS names. Wildcards in the
    certificate are supported.
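As a rough illustration of the check described in that note, the following Python sketch compares a connection target against a certificate's commonName and subjectAltName DNS names, with simple single-label wildcard support. This is a simplified approximation written for this report, not socat's or OpenSSL's actual matching code, and the function names are hypothetical. The commonName value used in the example is the one this report assumes the node certificate carries.

```python
def name_matches(pattern, target):
    """Case-insensitive match; a leading '*.' covers exactly one leftmost label."""
    pattern = pattern.lower().rstrip(".")
    target = target.lower().rstrip(".")
    if pattern.startswith("*."):
        # Drop the leftmost label of the target and compare the remainder.
        _, _, rest = target.partition(".")
        return bool(rest) and rest == pattern[2:]
    return pattern == target


def certificate_matches(common_name, san_dns_names, target):
    """Return True if target matches the commonName or any subjectAltName DNS name."""
    return any(name_matches(name, target) for name in [common_name, *san_dns_names])


# Connecting by IP address: the target string never equals the DNS name.
print(certificate_matches("nuc1.ws.nsrc.org", [], "10.10.0.238"))
# Connecting by name: the same certificate matches.
print(certificate_matches("nuc1.ws.nsrc.org", [], "nuc1.ws.nsrc.org"))
```

Under these assumptions the first call fails and the second succeeds, which is the behaviour change between socat 1.7.2.4 and 1.7.3.0 that the note describes.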
I therefore think that either the connection should be made by name rather
than by IP address, *or* the openssl-commonname option should be added.
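The two alternatives can be sketched as follows. This is only an illustration of how the sender-side address string could be assembled, not Ganeti's actual code: build_openssl_connect_address is a hypothetical helper, the option list is shortened, and openssl-commonname is the option name as it appears in the socat manpage (socat 1.7.3.0 or later). The name passed as common_name is an assumption; it has to be whatever name the peer certificate really contains.

```python
def build_openssl_connect_address(host, port, options, common_name=None):
    """Return a socat 'OPENSSL:host:port,opt,...' address string.

    If common_name is given, openssl-commonname is appended so that the peer
    certificate is checked against that name instead of against host.
    """
    parts = ["OPENSSL:%s:%d" % (host, port)]
    parts.extend(options)
    if common_name is not None:
        parts.append("openssl-commonname=%s" % common_name)
    return ",".join(parts)


base_options = ["verify=1", "pf=ipv4"]  # shortened; the real list is longer

# Alternative 1: connect by node name instead of by IP address.
by_name = build_openssl_connect_address("nuc1.ws.nsrc.org", 46423, base_options)

# Alternative 2: keep the IP address and state the expected name explicitly.
by_ip = build_openssl_connect_address(
    "10.10.0.238", 46423, base_options, common_name="nuc1.ws.nsrc.org"
)

print(by_name)
print(by_ip)
```

The second form keeps the data transfer on the configured IP address, which may matter when a separate secondary network is in use.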
It's possible that my cluster is in an invalid state, as it has been through
several upgrades, but I don't think so: it seems to work OK apart from
instance move, and gnt-cluster verify is fine. Here is the relevant config
excerpt (reformatted via 'python -m json.tool'):
=== /var/lib/ganeti/config.data ===
...
    "nodes": {
        "0b180aa9-2c45-4250-a54a-e56c41b5cc68": {
            "ctime": 1462607653.915405,
            "drained": false,
            "group": "4f283bc2-68c9-4392-9155-65a569e06a97",
            "master_candidate": true,
            "master_capable": true,
            "mtime": 1462607653.915405,
            "name": "nuc2.ws.nsrc.org",
            "ndparams": {},
            "offline": false,
            "powered": true,
            "primary_ip": "10.10.0.239",
            "secondary_ip": "10.10.0.239",
            "serial_no": 1,
            "tags": [],
            "uuid": "0b180aa9-2c45-4250-a54a-e56c41b5cc68",
            "vm_capable": true
        },
        "52cee655-b738-45ff-b920-b797840a6113": {
            "ctime": 1462717482.202958,
            "drained": false,
            "group": "4f283bc2-68c9-4392-9155-65a569e06a97",
            "master_candidate": true,
            "master_capable": true,
            "mtime": 1462717482.202958,
            "name": "nuc1.ws.nsrc.org",
            "ndparams": {},
            "offline": false,
            "powered": true,
            "primary_ip": "10.10.0.238",
            "secondary_ip": "10.10.0.238",
            "serial_no": 1,
            "tags": [],
            "uuid": "52cee655-b738-45ff-b920-b797840a6113",
            "vm_capable": true
        }
...
== ssconf_node_primary_ips ==
nuc1.ws.nsrc.org 10.10.0.238
nuc2.ws.nsrc.org 10.10.0.239
== ssconf_master_candidates_ips ==
10.10.0.239
10.10.0.238