Re: [ceph-users] Signature check failures.

2018-02-19 Thread Cary
Gregory,


I greatly appreciate your assistance. I recompiled Ceph with the -ssl
and nss USE flags set, which is the opposite of what I was using. I am
now able to export from our pools without signature check failures.
Thank you for pointing me in the right direction.
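
For anyone else hitting this on Gentoo, the change amounted to roughly
the following (a sketch; the exact USE flag names depend on the ceph
ebuild version, so treat them as an example):

  # /etc/portage/package.use/ceph -- build the crypto code against NSS
  sys-cluster/ceph nss -ssl

  # rebuild Ceph with the new USE flags, then restart the daemons
  emerge --ask --oneshot sys-cluster/ceph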

Cary
-Dynamic



On Fri, Feb 16, 2018 at 11:29 PM, Gregory Farnum  wrote:
> On Thu, Feb 15, 2018 at 10:28 AM Cary  wrote:
>>
>> Hello,
>>
>> I have enabled debugging on my MONs and OSDs to help troubleshoot
>> these signature check failures. I was watching osd.4's log and saw
>> these errors when the signature check failure happened.
>>
>> 2018-02-15 18:06:29.235791 7f8bca7de700  1 --
>> 192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
>> conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).read_bulk peer
>> close file descriptor 81
>> 2018-02-15 18:06:29.235832 7f8bca7de700  1 --
>> 192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
>> conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).read_until read
>> failed
>> 2018-02-15 18:06:29.235841 7f8bca7de700  1 --
>> 192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
>> conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).process read
>> tag failed
>> 2018-02-15 18:06:29.235848 7f8bca7de700  1 --
>> 192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
>> conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).fault on lossy
>> channel, failing
>> 2018-02-15 18:06:29.235966 7f8bc0853700  2 osd.8 27498 ms_handle_reset
>> con 0x55f802746000 session 0x55f8063b3180
>>
>>
>>  Could someone please look at this? We have 3 different Ceph clusters
>> set up, and they all have this issue. This cluster is running Gentoo and
>> Ceph version 12.2.2-r1. The other two clusters are 12.2.2. Exporting
>> images causes signature check failures, and with larger files it
>> segfaults as well.
>>
>> When exporting the image from osd.4, this message shows up as well.
>> Exporting image: 1% complete...2018-02-15 18:14:05.283708 7f6834277700
>>  0 -- 192.168.173.44:0/122241099 >> 192.168.173.44:6801/72152
>> conn(0x7f681400ff10 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH
>> pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER
>>
>> The errors below show up on all OSD/MGR/MON nodes when exporting an image.
>> Exporting image: 8% complete...2018-02-15 18:15:51.419437 7f2b64ac0700
>>  0 SIGN: MSG 28 Message signature does not match contents.
>> 2018-02-15 18:15:51.419459 7f2b64ac0700  0 SIGN: MSG 28Signature on
>> message:
>> 2018-02-15 18:15:51.419460 7f2b64ac0700  0 SIGN: MSG 28sig:
>> 8338581684421737157
>> 2018-02-15 18:15:51.419469 7f2b64ac0700  0 SIGN: MSG 28Locally
>> calculated signature:
>> 2018-02-15 18:15:51.419470 7f2b64ac0700  0 SIGN: MSG 28
>> sig_check:5913182128308244
>> 2018-02-15 18:15:51.419471 7f2b64ac0700  0 Signature failed.
>> 2018-02-15 18:15:51.419472 7f2b64ac0700  0 --
>> 192.168.173.44:0/3919097436 >> 192.168.173.44:6801/72152
>> conn(0x7f2b4800ff10 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH
>> pgs=39 cs=1 l=1).process Signature check failed
>>
>> Our VMs crash when writing to disk. Libvirt's logs just say the VM
>> crashed. This is a blocker. Has anyone else seen this? This seems to
>> be an issue with Ceph Luminous, as we were not having these problems
>> with Jewel.
>
>
> When I search through my email, the only two reports of failed signatures
> are from people who in fact had misconfiguration issues resulting in one
> end using signatures and the other side not.
>
> Given that, and since you're on Gentoo and presumably compiled the packages
> yourself, the most likely explanation I can think of is something that went
> wrong between your packages and the compilation. :/
>
> I guess you could try switching from libnss to libcryptopp (or vice versa)
> by recompiling with the relevant makeflags if you want to do something that
> only involves the Ceph code. Otherwise, do a rebuild?
>
> Sadly I don't think there's much else we can suggest given that nobody has
> seen this with binary packages blessed by the upstream or a distribution.
> -Greg
>
>>
>>
>> Cary
>> -Dynamic
>>
>> On Thu, Feb 1, 2018 at 7:04 PM, Cary  wrote:
>> > Hello,
>> >
>> > I did not do anything special that I know of. I was just exporting an
>> > image from Openstack. We have recently upgraded from Jewel 10.2.3 to
>> > Luminous 12.2.2.
>> >
>> > Caps for admin:
>> > client.admin
>> > key: CENSORED
>> > auid: 0
>> > caps: [mgr] allow *
>> > caps: [mon] allow *
>> > caps: [osd] allow *
>> >
>> > Caps for Cinder:
>> > client.cinder
>> > key: CENSORED
>> > caps: [mgr] allow r
>> > caps: [mon] profile rbd, allow command "osd blacklist"
>> > caps: [osd] profile rbd pool=vms, profile rbd pool=volumes,
>> > profile rbd pool=images
>> >
>> > Caps for MGR:
>> > mgr.0
>> > key: CENSORED
>> > caps: [mon] allow *
>> >
>> > I believe this is 

Re: [ceph-users] Signature check failures.

2018-02-16 Thread Gregory Farnum
On Thu, Feb 15, 2018 at 10:28 AM Cary  wrote:

> Hello,
>
> I have enabled debugging on my MONs and OSDs to help troubleshoot
> these signature check failures. I was watching osd.4's log and saw
> these errors when the signature check failure happened.
>
> 2018-02-15 18:06:29.235791 7f8bca7de700  1 --
> 192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
> conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).read_bulk peer
> close file descriptor 81
> 2018-02-15 18:06:29.235832 7f8bca7de700  1 --
> 192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
> conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).read_until read
> failed
> 2018-02-15 18:06:29.235841 7f8bca7de700  1 --
> 192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
> conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).process read
> tag failed
> 2018-02-15 18:06:29.235848 7f8bca7de700  1 --
> 192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
> conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).fault on lossy
> channel, failing
> 2018-02-15 18:06:29.235966 7f8bc0853700  2 osd.8 27498 ms_handle_reset
> con 0x55f802746000 session 0x55f8063b3180
>
>
>  Could someone please look at this? We have 3 different Ceph clusters
> set up, and they all have this issue. This cluster is running Gentoo and
> Ceph version 12.2.2-r1. The other two clusters are 12.2.2. Exporting
> images causes signature check failures, and with larger files it
> segfaults as well.
>
> When exporting the image from osd.4, this message shows up as well.
> Exporting image: 1% complete...2018-02-15 18:14:05.283708 7f6834277700
>  0 -- 192.168.173.44:0/122241099 >> 192.168.173.44:6801/72152
> conn(0x7f681400ff10 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH
> pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER
>
> The errors below show up on all OSD/MGR/MON nodes when exporting an image.
> Exporting image: 8% complete...2018-02-15 18:15:51.419437 7f2b64ac0700
>  0 SIGN: MSG 28 Message signature does not match contents.
> 2018-02-15 18:15:51.419459 7f2b64ac0700  0 SIGN: MSG 28Signature on
> message:
> 2018-02-15 18:15:51.419460 7f2b64ac0700  0 SIGN: MSG 28sig:
> 8338581684421737157
> 2018-02-15 18:15:51.419469 7f2b64ac0700  0 SIGN: MSG 28Locally
> calculated signature:
> 2018-02-15 18:15:51.419470 7f2b64ac0700  0 SIGN: MSG 28
> sig_check:5913182128308244
> 2018-02-15 18:15:51.419471 7f2b64ac0700  0 Signature failed.
> 2018-02-15 18:15:51.419472 7f2b64ac0700  0 --
> 192.168.173.44:0/3919097436 >> 192.168.173.44:6801/72152
> conn(0x7f2b4800ff10 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH
> pgs=39 cs=1 l=1).process Signature check failed
>
> Our VMs crash when writing to disk. Libvirt's logs just say the VM
> crashed. This is a blocker. Has anyone else seen this? This seems to
> be an issue with Ceph Luminous, as we were not having these problems
> with Jewel.
>

When I search through my email, the only two reports of failed signatures
are from people who in fact had misconfiguration issues resulting in one
end using signatures and the other side not.
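
If you want to rule that out, it is worth confirming that the signing
settings actually agree everywhere. A rough sketch (the daemon names are
only examples, and this assumes the admin sockets are reachable):

  # compare the cephx settings on a client, an OSD and a monitor
  grep -i cephx /etc/ceph/ceph.conf
  ceph daemon osd.4 config show | grep -i cephx_sign
  ceph daemon mon.$(hostname -s) config show | grep -i cephx_sign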

Given that, and since you're on Gentoo and presumably compiled the packages
yourself, the most likely explanation I can think of is something that went
wrong between your packages and the compilation. :/

I guess you could try switching from libnss to libcryptopp (or vice versa)
by recompiling with the relevant makeflags if you want to do something that
only involves the Ceph code. Otherwise, do a rebuild?
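
One quick way to see which crypto backend a given build actually ended up
linked against (again just a sketch, and paths may differ on Gentoo):

  # look for libnss / libcryptopp among the daemons' linked libraries
  ldd /usr/bin/ceph-osd | egrep -i 'nss|cryptopp'
  ldd /usr/bin/ceph-mon | egrep -i 'nss|cryptopp'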

Sadly I don't think there's much else we can suggest given that nobody has
seen this with binary packages blessed by the upstream or a distribution.
-Greg


>
> Cary
> -Dynamic
>
> On Thu, Feb 1, 2018 at 7:04 PM, Cary  wrote:
> > Hello,
> >
> > I did not do anything special that I know of. I was just exporting an
> > image from Openstack. We have recently upgraded from Jewel 10.2.3 to
> > Luminous 12.2.2.
> >
> > Caps for admin:
> > client.admin
> > key: CENSORED
> > auid: 0
> > caps: [mgr] allow *
> > caps: [mon] allow *
> > caps: [osd] allow *
> >
> > Caps for Cinder:
> > client.cinder
> > key: CENSORED
> > caps: [mgr] allow r
> > caps: [mon] profile rbd, allow command "osd blacklist"
> > caps: [osd] profile rbd pool=vms, profile rbd pool=volumes,
> > profile rbd pool=images
> >
> > Caps for MGR:
> > mgr.0
> > key: CENSORED
> > caps: [mon] allow *
> >
> > I believe this is causing the virtual machines we have running to
> > crash. Any advice would be appreciated. Please let me know if I need
> > to provide any other details. Thank you,
> >
> > Cary
> > -Dynamic
> >
> > On Mon, Jan 29, 2018 at 7:53 PM, Gregory Farnum 
> wrote:
> >> On Fri, Jan 26, 2018 at 12:14 PM Cary  wrote:
> >>>
> >>> Hello,
> >>>
> >>>  We are running Luminous 12.2.2 on 6 OSD hosts with 12 1TB OSDs and 64GB
> >>> RAM. Each host has an SSD for 

Re: [ceph-users] Signature check failures.

2018-02-15 Thread Cary
Hello,

I have enabled debugging on my MONs and OSDs to help troubleshoot
these signature check failures. I was watching osd.4's log and saw
these errors when the signature check failure happened.
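
(One way such logging can be enabled is roughly the following; this is a
sketch, and the exact debug levels are my own assumptions:)

  # raise messenger and auth logging on a running OSD, revert afterwards
  ceph tell osd.4 injectargs '--debug_ms 1 --debug_auth 20'

  # or persistently in ceph.conf on the MON/OSD hosts:
  #   [global]
  #   debug ms = 1
  #   debug auth = 20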

2018-02-15 18:06:29.235791 7f8bca7de700  1 --
192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).read_bulk peer
close file descriptor 81
2018-02-15 18:06:29.235832 7f8bca7de700  1 --
192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).read_until read
failed
2018-02-15 18:06:29.235841 7f8bca7de700  1 --
192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).process read
tag failed
2018-02-15 18:06:29.235848 7f8bca7de700  1 --
192.168.173.44:6806/72264 >> 192.168.173.42:0/4264467021
conn(0x55f802746000 :6806 s=STATE_OPEN pgs=7 cs=1 l=1).fault on lossy
channel, failing
2018-02-15 18:06:29.235966 7f8bc0853700  2 osd.8 27498 ms_handle_reset
con 0x55f802746000 session 0x55f8063b3180


 Could someone please look at this? We have 3 different Ceph clusters
set up, and they all have this issue. This cluster is running Gentoo and
Ceph version 12.2.2-r1. The other two clusters are 12.2.2. Exporting
images causes signature check failures, and with larger files it
segfaults as well.

When exporting the image from osd.4, this message shows up as well.
Exporting image: 1% complete...2018-02-15 18:14:05.283708 7f6834277700
 0 -- 192.168.173.44:0/122241099 >> 192.168.173.44:6801/72152
conn(0x7f681400ff10 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH
pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER

The errors below show up on all OSD/MGR/MON nodes when exporting an image.
Exporting image: 8% complete...2018-02-15 18:15:51.419437 7f2b64ac0700
 0 SIGN: MSG 28 Message signature does not match contents.
2018-02-15 18:15:51.419459 7f2b64ac0700  0 SIGN: MSG 28Signature on message:
2018-02-15 18:15:51.419460 7f2b64ac0700  0 SIGN: MSG 28sig:
8338581684421737157
2018-02-15 18:15:51.419469 7f2b64ac0700  0 SIGN: MSG 28Locally
calculated signature:
2018-02-15 18:15:51.419470 7f2b64ac0700  0 SIGN: MSG 28
sig_check:5913182128308244
2018-02-15 18:15:51.419471 7f2b64ac0700  0 Signature failed.
2018-02-15 18:15:51.419472 7f2b64ac0700  0 --
192.168.173.44:0/3919097436 >> 192.168.173.44:6801/72152
conn(0x7f2b4800ff10 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH
pgs=39 cs=1 l=1).process Signature check failed

Our VMs crash when writing to disk. Libvirt's logs just say the VM
crashed. This is a blocker. Has anyone else seen this? This seems to
be an issue with Ceph Luminous, as we were not having these problems
with Jewel.

Cary
-Dynamic

On Thu, Feb 1, 2018 at 7:04 PM, Cary  wrote:
> Hello,
>
> I did not do anything special that I know of. I was just exporting an
> image from Openstack. We have recently upgraded from Jewel 10.2.3 to
> Luminous 12.2.2.
>
> Caps for admin:
> client.admin
> key: CENSORED
> auid: 0
> caps: [mgr] allow *
> caps: [mon] allow *
> caps: [osd] allow *
>
> Caps for Cinder:
> client.cinder
> key: CENSORED
> caps: [mgr] allow r
> caps: [mon] profile rbd, allow command "osd blacklist"
> caps: [osd] profile rbd pool=vms, profile rbd pool=volumes,
> profile rbd pool=images
>
> Caps for MGR:
> mgr.0
> key: CENSORED
> caps: [mon] allow *
>
> I believe this is causing the virtual machines we have running to
> crash. Any advice would be appreciated. Please let me know if I need
> to provide any other details. Thank you,
>
> Cary
> -Dynamic
>
> On Mon, Jan 29, 2018 at 7:53 PM, Gregory Farnum  wrote:
>> On Fri, Jan 26, 2018 at 12:14 PM Cary  wrote:
>>>
>>> Hello,
>>>
>>>  We are running Luminous 12.2.2 on 6 OSD hosts with 12 1TB OSDs and 64GB
>>> RAM. Each host has an SSD for Bluestore's block.wal and block.db.
>>> There are 5 monitor nodes as well with 32GB RAM. All servers run
>>> Gentoo with kernel 4.12.12-gentoo.
>>>
>>> When I export an image using:
>>> rbd export pool-name/volume-name  /location/image-name.raw
>>>
>>> Messages similar to those below are displayed. The signature check fails
>>> randomly, and sometimes there is a message about a bad authorizer, but
>>> not every time.
>>> The image is still exported successfully.
>>>
>>> 2018-01-24 17:35:15.616080 7fc8d4024700  0 cephx:
>>> verify_authorizer_reply bad nonce got 4552544084014661633 expected
>>> 4552499520046621785 sent 4552499520046621784
>>> 2018-01-24 17:35:15.616098 7fc8d4024700  0 --
>>> 172.21.32.16:0/1412094654 >> 172.21.32.6:6802/6219 conn(0x7fc8b0078a50
>>> :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0
>>> l=1)._process_connection failed verifying authorize reply
>>> 2018-01-24 17:35:15.699004 7fc8d4024700  0 SIGN: MSG 2 Message
>>> signature does not match contents.
>>> 2018-01-24 17:35:15.699020 7fc8d4024700  0 

Re: [ceph-users] Signature check failures.

2018-02-01 Thread Cary
Hello,

I did not do anything special that I know of. I was just exporting an
image from Openstack. We have recently upgraded from Jewel 10.2.3 to
Luminous 12.2.2.

Caps for admin:
client.admin
key: CENSORED
auid: 0
caps: [mgr] allow *
caps: [mon] allow *
caps: [osd] allow *

Caps for Cinder:
client.cinder
key: CENSORED
caps: [mgr] allow r
caps: [mon] profile rbd, allow command "osd blacklist"
caps: [osd] profile rbd pool=vms, profile rbd pool=volumes,
profile rbd pool=images

Caps for MGR:
mgr.0
key: CENSORED
caps: [mon] allow *
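
For completeness, caps like the cinder ones above can be applied with
something along these lines (a sketch; the pool names are from our setup
and may differ in yours):

  ceph auth caps client.cinder \
      mgr 'allow r' \
      mon 'profile rbd, allow command "osd blacklist"' \
      osd 'profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images'

  # verify afterwards with:
  ceph auth get client.cinder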

I believe this is causing the virtual machines we have running to
crash. Any advice would be appreciated. Please let me know if I need
to provide any other details. Thank you,

Cary
-Dynamic

On Mon, Jan 29, 2018 at 7:53 PM, Gregory Farnum  wrote:
> On Fri, Jan 26, 2018 at 12:14 PM Cary  wrote:
>>
>> Hello,
>>
>>  We are running Luminous 12.2.2 on 6 OSD hosts with 12 1TB OSDs and 64GB
>> RAM. Each host has an SSD for Bluestore's block.wal and block.db.
>> There are 5 monitor nodes as well with 32GB RAM. All servers run
>> Gentoo with kernel 4.12.12-gentoo.
>>
>> When I export an image using:
>> rbd export pool-name/volume-name  /location/image-name.raw
>>
>> Messages similar to those below are displayed. The signature check fails
>> randomly, and sometimes there is a message about a bad authorizer, but
>> not every time.
>> The image is still exported successfully.
>>
>> 2018-01-24 17:35:15.616080 7fc8d4024700  0 cephx:
>> verify_authorizer_reply bad nonce got 4552544084014661633 expected
>> 4552499520046621785 sent 4552499520046621784
>> 2018-01-24 17:35:15.616098 7fc8d4024700  0 --
>> 172.21.32.16:0/1412094654 >> 172.21.32.6:6802/6219 conn(0x7fc8b0078a50
>> :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0
>> l=1)._process_connection failed verifying authorize reply
>> 2018-01-24 17:35:15.699004 7fc8d4024700  0 SIGN: MSG 2 Message
>> signature does not match contents.
>> 2018-01-24 17:35:15.699020 7fc8d4024700  0 SIGN: MSG 2Signature on
>> message:
>> 2018-01-24 17:35:15.699021 7fc8d4024700  0 SIGN: MSG 2sig:
>> 8189090775647585001
>> 2018-01-24 17:35:15.699047 7fc8d4024700  0 SIGN: MSG 2Locally
>> calculated signature:
>> 2018-01-24 17:35:15.699048 7fc8d4024700  0 SIGN: MSG 2
>> sig_check:140500325643792
>> 2018-01-24 17:35:15.699049 7fc8d4024700  0 Signature failed.
>> 2018-01-24 17:35:15.699050 7fc8d4024700  0 --
>> 172.21.32.16:0/1412094654 >> 172.21.32.2:6807/153106
>> conn(0x7fc8bc020870 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH
>> pgs=26018 cs=1 l=1).process Signature check failed
>>
>> Does anyone know what could cause this, and what I can do to fix it?
>
>
> That's in the cephx authentication code and it's indicating that the secure
> signature sent with the message isn't what the local node thinks it should
> be. That's pretty odd (a bit flip or something that could actually change it
> ought to trigger the messaging checksums directly) and I'm not quite sure
> how it could happen.
>
> But, as you've noticed, it retries and apparently succeeds. How did you
> notice this?
> -Greg
>
>>
>>
>> Thank you,
>>
>> Cary
>> -Dynamic
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Signature check failures.

2018-01-29 Thread Gregory Farnum
On Fri, Jan 26, 2018 at 12:14 PM Cary  wrote:

> Hello,
>
>  We are running Luminous 12.2.2 on 6 OSD hosts with 12 1TB OSDs and 64GB
> RAM. Each host has an SSD for Bluestore's block.wal and block.db.
> There are 5 monitor nodes as well with 32GB RAM. All servers run
> Gentoo with kernel 4.12.12-gentoo.
>
> When I export an image using:
> rbd export pool-name/volume-name  /location/image-name.raw
>
> Messages similar to those below are displayed. The signature check fails
> randomly, and sometimes there is a message about a bad authorizer, but
> not every time.
> The image is still exported successfully.
>
> 2018-01-24 17:35:15.616080 7fc8d4024700  0 cephx:
> verify_authorizer_reply bad nonce got 4552544084014661633 expected
> 4552499520046621785 sent 4552499520046621784
> 2018-01-24 17:35:15.616098 7fc8d4024700  0 --
> 172.21.32.16:0/1412094654 >> 172.21.32.6:6802/6219 conn(0x7fc8b0078a50
> :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0
> l=1)._process_connection failed verifying authorize reply
> 2018-01-24 17:35:15.699004 7fc8d4024700  0 SIGN: MSG 2 Message
> signature does not match contents.
> 2018-01-24 17:35:15.699020 7fc8d4024700  0 SIGN: MSG 2Signature on message:
> 2018-01-24 17:35:15.699021 7fc8d4024700  0 SIGN: MSG 2sig:
> 8189090775647585001
> 2018-01-24 17:35:15.699047 7fc8d4024700  0 SIGN: MSG 2Locally
> calculated signature:
> 2018-01-24 17:35:15.699048 7fc8d4024700  0 SIGN: MSG 2
> sig_check:140500325643792
> 2018-01-24 17:35:15.699049 7fc8d4024700  0 Signature failed.
> 2018-01-24 17:35:15.699050 7fc8d4024700  0 --
> 172.21.32.16:0/1412094654 >> 172.21.32.2:6807/153106
> conn(0x7fc8bc020870 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH
> pgs=26018 cs=1 l=1).process Signature check failed
>
> Does anyone know what could cause this, and what I can do to fix it?
>

That's in the cephx authentication code and it's indicating that the secure
signature sent with the message isn't what the local node thinks it should
be. That's pretty odd (a bit flip or something that could actually change
it ought to trigger the messaging checksums directly) and I'm not quite
sure how it could happen.

But, as you've noticed, it retries and apparently succeeds. How did you
notice this?
-Greg


>
> Thank you,
>
> Cary
> -Dynamic
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Signature check failures.

2018-01-26 Thread Cary
Hello,

 We are running Luminous 12.2.2 on 6 OSD hosts with 12 1TB OSDs and 64GB
RAM. Each host has an SSD for Bluestore's block.wal and block.db.
There are 5 monitor nodes as well with 32GB RAM. All servers run
Gentoo with kernel 4.12.12-gentoo.

When I export an image using:
rbd export pool-name/volume-name  /location/image-name.raw

Messages similar to those below are displayed. The signature check fails
randomly, and sometimes there is a message about a bad authorizer, but
not every time.
The image is still exported successfully.
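
To capture more client-side detail, the export can also be re-run with
messenger/auth debugging turned up, something like the following (a
sketch; these are the standard global Ceph debug options and the levels
are my own guess):

  rbd export pool-name/volume-name /location/image-name.raw \
      --debug-ms 1 --debug-auth 20 2> rbd-export-debug.log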

2018-01-24 17:35:15.616080 7fc8d4024700  0 cephx:
verify_authorizer_reply bad nonce got 4552544084014661633 expected
4552499520046621785 sent 4552499520046621784
2018-01-24 17:35:15.616098 7fc8d4024700  0 --
172.21.32.16:0/1412094654 >> 172.21.32.6:6802/6219 conn(0x7fc8b0078a50
:-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0
l=1)._process_connection failed verifying authorize reply
2018-01-24 17:35:15.699004 7fc8d4024700  0 SIGN: MSG 2 Message
signature does not match contents.
2018-01-24 17:35:15.699020 7fc8d4024700  0 SIGN: MSG 2Signature on message:
2018-01-24 17:35:15.699021 7fc8d4024700  0 SIGN: MSG 2sig:
8189090775647585001
2018-01-24 17:35:15.699047 7fc8d4024700  0 SIGN: MSG 2Locally
calculated signature:
2018-01-24 17:35:15.699048 7fc8d4024700  0 SIGN: MSG 2
sig_check:140500325643792
2018-01-24 17:35:15.699049 7fc8d4024700  0 Signature failed.
2018-01-24 17:35:15.699050 7fc8d4024700  0 --
172.21.32.16:0/1412094654 >> 172.21.32.2:6807/153106
conn(0x7fc8bc020870 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH
pgs=26018 cs=1 l=1).process Signature check failed

Does anyone know what could cause this, and what I can do to fix it?

Thank you,

Cary
-Dynamic
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Signature check failures.

2018-01-25 Thread Cary
Hello,

 We are running Luminous 12.2.2 on 6 OSD hosts with 12 1TB OSDs and 64GB
RAM. Each host has an SSD for Bluestore's block.wal and block.db.
There are 5 monitor nodes as well with 32GB RAM. All servers run
Gentoo with kernel 4.12.12-gentoo.

When I export an image using:
rbd export pool-name/volume-name  /location/image-name.raw

Messages similar to those below are displayed. The signature check fails
randomly, and sometimes there is a message about a bad authorizer, but
not every time.
The image is still exported successfully.

2018-01-24 17:35:15.616080 7fc8d4024700  0 cephx:
verify_authorizer_reply bad nonce got 4552544084014661633 expected
4552499520046621785 sent 4552499520046621784
2018-01-24 17:35:15.616098 7fc8d4024700  0 --
172.21.32.16:0/1412094654 >> 172.21.32.6:6802/6219 conn(0x7fc8b0078a50
:-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0
l=1)._process_connection failed verifying authorize reply
2018-01-24 17:35:15.699004 7fc8d4024700  0 SIGN: MSG 2 Message
signature does not match contents.
2018-01-24 17:35:15.699020 7fc8d4024700  0 SIGN: MSG 2Signature on message:
2018-01-24 17:35:15.699021 7fc8d4024700  0 SIGN: MSG 2sig:
8189090775647585001
2018-01-24 17:35:15.699047 7fc8d4024700  0 SIGN: MSG 2Locally
calculated signature:
2018-01-24 17:35:15.699048 7fc8d4024700  0 SIGN: MSG 2
sig_check:140500325643792
2018-01-24 17:35:15.699049 7fc8d4024700  0 Signature failed.
2018-01-24 17:35:15.699050 7fc8d4024700  0 --
172.21.32.16:0/1412094654 >> 172.21.32.2:6807/153106
conn(0x7fc8bc020870 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH
pgs=26018 cs=1 l=1).process Signature check failed

Does anyone know what could cause this, and what I can do to fix it?

Thank you,

Cary
-Dynamic
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Signature check failures.

2018-01-24 Thread Cary
Hello,

 We are running Luminous 12.2.2 on 6 OSD hosts with 12 1TB OSDs and 64GB
RAM. Each host has an SSD for Bluestore's block.wal and block.db.
There are 5 monitor nodes as well with 32GB RAM. All servers run
Gentoo with kernel 4.12.12-gentoo.

When I export an image using:
rbd export pool-name/volume-name  /location/image-name.raw

The following messages show up. The signature check fails randomly.
The image is still exported successfully.

2018-01-24 17:35:15.616080 7fc8d4024700  0 cephx:
verify_authorizer_reply bad nonce got 4552544084014661633 expected
4552499520046621785 sent 4552499520046621784
2018-01-24 17:35:15.616098 7fc8d4024700  0 --
172.21.32.16:0/1412094654 >> 172.21.32.6:6802/6219 conn(0x7fc8b0078a50
:-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0
l=1)._process_connection failed verifying authorize reply
2018-01-24 17:35:15.699004 7fc8d4024700  0 SIGN: MSG 2 Message
signature does not match contents.
2018-01-24 17:35:15.699020 7fc8d4024700  0 SIGN: MSG 2Signature on message:
2018-01-24 17:35:15.699021 7fc8d4024700  0 SIGN: MSG 2sig:
8189090775647585001
2018-01-24 17:35:15.699047 7fc8d4024700  0 SIGN: MSG 2Locally
calculated signature:
2018-01-24 17:35:15.699048 7fc8d4024700  0 SIGN: MSG 2
sig_check:140500325643792
2018-01-24 17:35:15.699049 7fc8d4024700  0 Signature failed.
2018-01-24 17:35:15.699050 7fc8d4024700  0 --
172.21.32.16:0/1412094654 >> 172.21.32.2:6807/153106
conn(0x7fc8bc020870 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH
pgs=26018 cs=1 l=1).process Signature check failed

Does anyone know what could cause this, and what I can do to fix it?

Thank you,

Cary
-Dynamic
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com