[ceph-users] Re: How to trigger scrubbing in Ceph on-demand ?

2023-10-26 Thread Joe Comeau
Don't know if this will help you,

but we do all our scrubbing manually with cron tasks,
always picking the oldest non-scrubbed PG.

To check on scrubbing we use this, which reports the currently active scrubbing PGs:
ceph pg ls scrubbing | sort -k18 -k19 | head -n 20

For us a scrub takes about 5 minutes (+/- 3),
a deep scrub about 40 minutes (+/- 10),
all on slow HDDs.
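
If you want to script the same idea, here is a minimal sketch (assuming jq is 
installed and a release whose "ceph pg dump" JSON exposes pgid and 
last_deep_scrub_stamp; the JSON layout differs between versions, so check it 
on your cluster first):

#!/bin/bash
# Illustrative only: deep-scrub the PG with the oldest deep-scrub stamp.
# Older releases nest the stats under .pg_map.pg_stats, newer ones expose .pg_stats.
PG=$(ceph pg dump --format json 2>/dev/null \
      | jq -r '(.pg_map.pg_stats // .pg_stats)
               | sort_by(.last_deep_scrub_stamp)
               | .[0].pgid')
echo "deep-scrubbing oldest PG: ${PG}"
ceph pg deep-scrub "${PG}"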


hth  Joe





>>> Jayjeet Chakraborty  10/23/2023 1:59 PM >>>
Hi Reto,

Thanks a lot for the instructions. I tried the same, but still couldn't
trigger scrubbing deterministically. The first time I initiated scrubbing,
I saw scrubbing status in ceph -s, but for subsequent times, I didn't see
any scrubbing status. Do you know what might be going on potentially? Any
ideas would be appreciated. Thanks.

Best Regards,
*Jayjeet Chakraborty*
Ph.D. Student
Department of Computer Science and Engineering
University of California, Santa Cruz
*Email: jayje...@ucsc.edu *


On Wed, Oct 18, 2023 at 7:47 AM Reto Gysi  wrote:

> Hi
>
> I haven't updated to reef yet. I've tried this on quincy.
>
> # create a testfile on cephfs.rgysi.data pool
> root@zephir:/home/rgysi/misc# echo cephtest123 > cephtest.txt
>
> #list inode of new file
> root@zephir:/home/rgysi/misc# ls -i cephtest.txt
> 1099518867574 cephtest.txt
>
> # convert the inode value to hex
> root@zephir:/home/rgysi/misc# printf "%x" 1099518867574
> 16e7876
>
> # search for this value in the rados pool cephfs.rgysi.data, to find
> object(s)
> root@zephir:/home/rgysi/misc# rados -p cephfs.rgysi.data ls | grep
> 16e7876
> 16e7876.
>
> # find pg for the object
> root@zephir:/home/rgysi/misc# ceph osd map cephfs.rgysi.data
> 16e7876.
> osdmap e105365 pool 'cephfs.rgysi.data' (25) object
'16e7876.'
> -> pg 25.ee1befa1 (25.1) -> up ([0,2,8], p0) acting ([0,2,8], p0)
>
> #Initiate a deep-scrub for this pg
> root@zephir:/home/rgysi/misc# ceph pg deep-scrub 25.1
> instructing pg 25.1 on osd.0 to deep-scrub
>
> # check status of scrubbing
> root@zephir:/home/rgysi/misc# ceph pg ls scrubbing
> PG    OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES        OMAP_BYTES*  OMAP_KEYS*  LOG   STATE                        SINCE  VERSION         REPORTED        UP         ACTING     SCRUB_STAMP                  DEEP_SCRUB_STAMP             LAST_SCRUB_DURATION  SCRUB_SCHEDULING
> 25.1  37774    0         0          0        62869823142  0            0           2402  active+clean+scrubbing+deep  7s     105365'1178098  105365:8066292  [0,2,8]p0  [0,2,8]p0  2023-10-18T05:17:48.631392+  2023-10-08T11:30:58.883164+  3                    deep scrubbing for 1s
>
>
> Best Regards,
>
> Reto
>
> Am Mi., 18. Okt. 2023 um 16:24 Uhr schrieb Jayjeet Chakraborty <
> jayje...@ucsc.edu>:
>
>> Hi all,
>>
>> Just checking if someone had a chance to go through the scrub trigger
>> issue above. Thanks.
>>
>> Best Regards,
>> *Jayjeet Chakraborty*
>> Ph.D. Student
>> Department of Computer Science and Engineering
>> University of California, Santa Cruz
>> *Email: jayje...@ucsc.edu *
>>
>>
>> On Mon, Oct 16, 2023 at 9:01 PM Jayjeet Chakraborty wrote:
>>
>> > Hi all,
>> >
>> > I am trying to trigger deep scrubbing in Ceph reef (18.2.0) on demand on a
>> > set of files that I randomly write to CephFS. I have tried both invoking
>> > deep-scrub on CephFS using ceph tell and just deep scrubbing a
>> > particular PG. Unfortunately, none of that seems to be working for me. I am
>> > monitoring the ceph status output, it never shows any scrubbing
>> > information. Can anyone please help me out on this? In a nutshell, I need
>> > Ceph to scrub for me anytime I want. I am using Ceph with default configs
>> > for scrubbing. Thanks all.
>> >
>> > Best Regards,
>> > *Jayjeet Chakraborty*
>> > Ph.D. Student
>> > Department of Computer Science and Engineering
>> > University of California, Santa Cruz
>> > *Email: jayjeetc@ucsc.edu *
>> >
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] test message

2023-01-08 Thread Joe Comeau
 
Hi 
Just testing as I have not received a message from the list in a couple days
 
Thanks Joe
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-14 Thread Joe Comeau
That's correct - we use the kernel target not tcmu-runner
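
(For anyone following the thread: the ESXi path selection policy being discussed 
can be listed and, if you decide to, changed per device with something like the 
commands below - the device ID is a placeholder, and whether RR or a fail-over 
style policy such as MRU fits your setup is a separate decision:)

esxcli storage nmp device list
esxcli storage nmp device set --device naa.6001405xxxxxxxx --psp VMW_PSP_MRU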


>>> Xiubo Li  12/13/2022 6:02 PM >>>

On 14/12/2022 06:54, Joe Comeau wrote:
> I am curious about what is happening with your iscsi configuration
> Is this a new iscsi config or something that has just cropped up ?
>   
> We are using/have been using vmware for 5+ years with iscsi
> We are using the kernel iscsi vs tcmu
>   

Do you mean you are using kernel target, not the ceph-iscsi/tcmu-runner 
in user space, right ?

> We are running ALUA and all datastores are setup as RR
> We routinely reboot the iscsi gateways - during patching and updates and the 
> storage migrates to and from all servers without issue
> We usually wait about 10 minutes before a gateway restart, so there is not an 
> outage
>   
> It has been extremely stable for us
>   
> Thanks Joe
>   
>
>
>>>> Xiubo Li  12/13/2022 4:21 AM >>>
> On 13/12/2022 18:57, Stolte, Felix wrote:
>> Hi Xiubo,
>>
>> Thx for pointing me into the right direction. All involved esx host
>> seem to use the correct policy. I am going to detach the LUN on each
>> host one by one until i found the host causing the problem.
>>
>  From the logs it means the client was switching the path in turn.
>
> BTW, what's policy are you using ?
>
> Thanks
>
> - Xiubo
>
>> Regards Felix
>> -
>> -
>> Forschungszentrum Juelich GmbH
>> 52425 Juelich
>> Sitz der Gesellschaft: Juelich
>> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
>> Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
>> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
>> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
>> Dr. Astrid Lambrecht, Prof. Dr. Frauke Melchior
>> -
>> -
>>
>>> Am 12.12.2022 um 13:03 schrieb Xiubo Li :
>>>
>>> Hi Stolte,
>>>
>>> For the VMware config could you refer to :
>>> https://docs.ceph.com/en/latest/rbd/iscsi-initiator-esx/ ?
>>>
>>> What's the "Path Selection Policy with ALUA" you are using ? The
>>> ceph-iscsi couldn't implement the real AA, so if you use the RR I
>>> think it will be like this.
>>>
>>> - Xiubo
>>>
>>> On 12/12/2022 17:45, Stolte, Felix wrote:
>>>> Hi guys,
>>>>
>>>> we are using ceph-iscsi to provide block storage for Microsoft Exchange 
>>>> and vmware vsphere. Ceph docs state that you need to configure Windows 
>>>> iSCSI Initiator for fail-over-only but there is no such point for vmware. 
>>>> In my tcmu-runner logs on both ceph-iscsi gateways I see the following:
>>>>
>>>> 2022-12-12 10:36:06.978 33789 [WARN] tcmu_notify_lock_lost:222 
>>>> rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
>>>> 2022-12-12 10:36:06.993 33789 [INFO] alua_implicit_transition:570 
>>>> rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
>>>> 2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 
>>>> rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
>>>> 2022-12-12 10:36:09.067 33789 [WARN] tcmu_notify_lock_lost:222 
>>>> rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
>>>> 2022-12-12 10:36:09.071 33789 [INFO] alua_implicit_transition:570 
>>>> rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
>>>> 2022-12-12 10:36:10.109 33789 [WARN] tcmu_rbd_lock:762 
>>>> rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
>>>> 2022-12-12 10:36:11.104 33789 [WARN] tcmu_notify_lock_lost:222 
>>>> rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
>>>> 2022-12-12 10:36:11.106 33789 [INFO] alua_implicit_transition:570 
>>>> rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
>>>>
>>>> At the same time there are these log entries in ceph.audit.logs:
>>>> 2022-12-12T10:36:06.731621+0100 mon.mon-k2-1 (mon.1) 3407851 : audit [INF] 
>>>> from='client.? 10.100.8.55:0/2392201639' entity='client.admin' 
>>>> cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "

[ceph-users] Re: ceph-iscsi lock ping pong

2022-12-13 Thread Joe Comeau
I am curious about what is happening with your iscsi configuration
Is this a new iscsi config or something that has just cropped up ?
 
We are using/have been using vmware for 5+ years with iscsi
We are using the kernel iscsi vs tcmu
 
We are running ALUA and all datastores are setup as RR
We routinely reboot the iscsi gateways - during patching and updates and the 
storage migrates to and from all servers without issue
We usually wait about 10 minutes before a gateway restart, so there is not an 
outage 
 
It has been extremely stable for us
 
Thanks Joe
 


>>> Xiubo Li  12/13/2022 4:21 AM >>>

On 13/12/2022 18:57, Stolte, Felix wrote:
> Hi Xiubo,
>
> Thx for pointing me into the right direction. All involved esx host 
> seem to use the correct policy. I am going to detach the LUN on each 
> host one by one until i found the host causing the problem.
>
From the logs it means the client was switching the path in turn.

BTW, what's policy are you using ?

Thanks

- Xiubo

> Regards Felix
> -
> -
> Forschungszentrum Juelich GmbH
> 52425 Juelich
> Sitz der Gesellschaft: Juelich
> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
> Dr. Astrid Lambrecht, Prof. Dr. Frauke Melchior
> -
> -
>
>> Am 12.12.2022 um 13:03 schrieb Xiubo Li :
>>
>> Hi Stolte,
>>
>> For the VMware config could you refer to : 
>> https://docs.ceph.com/en/latest/rbd/iscsi-initiator-esx/ ?
>>
>> What's the "Path Selection Policy with ALUA" you are using ? The 
>> ceph-iscsi couldn't implement the real AA, so if you use the RR I 
>> think it will be like this.
>>
>> - Xiubo
>>
>> On 12/12/2022 17:45, Stolte, Felix wrote:
>>> Hi guys,
>>>
>>> we are using ceph-iscsi to provide block storage for Microsoft Exchange and 
>>> vmware vsphere. Ceph docs state that you need to configure Windows iSCSI 
>>> Initiator for fail-over-only but there is no such point for vmware. In my 
>>> tcmu-runner logs on both ceph-iscsi gateways I see the following:
>>>
>>> 2022-12-12 10:36:06.978 33789 [WARN] tcmu_notify_lock_lost:222 
>>> rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
>>> 2022-12-12 10:36:06.993 33789 [INFO] alua_implicit_transition:570 
>>> rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
>>> 2022-12-12 10:36:08.064 33789 [WARN] tcmu_rbd_lock:762 
>>> rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
>>> 2022-12-12 10:36:09.067 33789 [WARN] tcmu_notify_lock_lost:222 
>>> rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
>>> 2022-12-12 10:36:09.071 33789 [INFO] alua_implicit_transition:570 
>>> rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
>>> 2022-12-12 10:36:10.109 33789 [WARN] tcmu_rbd_lock:762 
>>> rbd/mailbox.vmdk_junet_sata: Acquired exclusive lock.
>>> 2022-12-12 10:36:11.104 33789 [WARN] tcmu_notify_lock_lost:222 
>>> rbd/mailbox.vmdk_junet_sata: Async lock drop. Old state 1
>>> 2022-12-12 10:36:11.106 33789 [INFO] alua_implicit_transition:570 
>>> rbd/mailbox.vmdk_junet_sata: Starting lock acquisition operation.
>>>
>>> At the same time there are these log entries in ceph.audit.logs:
>>> 2022-12-12T10:36:06.731621+0100 mon.mon-k2-1 (mon.1) 3407851 : audit [INF] 
>>> from='client.? 10.100.8.55:0/2392201639' entity='client.admin' 
>>> cmd=[{"prefix": "osd blocklist", "blocklistop": "add", "addr": "10
>>> .100.8.56:0/1598475844"}]: dispatch
>>> 2022-12-12T10:36:06.731913+0100 mon.mon-e2-1 (mon.0) 783726 : audit [INF] 
>>> from='client.? ' entity='client.admin' cmd=[{"prefix": "osd blocklist", 
>>> "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}]
>>> : dispatch
>>> 2022-12-12T10:36:06.905082+0100 mon.mon-e2-1 (mon.0) 783727 : audit [INF] 
>>> from='client.? ' entity='client.admin' cmd='[{"prefix": "osd blocklist", 
>>> "blocklistop": "add", "addr": "10.100.8.56:0/1598475844"}
>>> ]': finished
>>>
>>> Can someone explain to me what is happening? Why are the gateways 
>>> blacklisting each other? All involved daemons are running version 16.2.10. 
>>> The ceph-iscsi gateways are running on Ubuntu 20.04 with the ceph-iscsi package from 
>>> the Ubuntu repo (all other packages came directly from 
>>> ceph.com)
>>>
>>>
>>> regards Felix
>>>
>>> -
>>> -
>>> Forschungszentrum Juelich GmbH
>>> 52425 Juelich
>>> Sitz der G

[ceph-users] Re: SATA SSD OSD behind PERC raid0

2021-11-25 Thread Joe Comeau
I will tell you of our experience
 
Dell PERC controllers with HDDs and separate Intel NVMe for journals etc.
 
At first, with the disks behind the controller with caching enabled, each set up as a 
single-drive RAID0, and the OSDs encrypted, everything was good.
 
When we upgraded to LVM, still encrypted and still RAID0 on the PERC, 
everything was fine for a while.
 
As soon as drives started to fail, it became incredibly difficult to match the 
failed drive shown in the iDRAC to the OSD; everything about it was difficult.
Remember, not all drives fail outright - some go into a state where they flap up and 
down, or cause so much noise on the server's bus that they take the whole server 
offline.
 
So we reconfigured (all / most OSD drives to not use the RAID0 cache) and did notice 
a slight drop in performance,
BUT we can now find and address the OSDs' hard drives in the iDRAC.
We also don't have to worry about the battery failing on the PERC (hopefully 
anyway).
 
So I would test your config before going into production.
I also don't know what info the iDRAC will give you on the SSDs, vs the OS tools.
 
Joe
 

>>>  11/25/2021 9:42 AM >>>
Hi,
I plan to use Samsung PM883 1.92TB as OSD with Raid0 behind PERC controller and 
Journal and Data will be on the same drive.
Does anyone have a similar setup? Any hints or tips would be appreciated.
BR
Max

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Fwd: Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-15 Thread Joe Comeau
We also run with Dell VLT switches (40 GB)
everything is active/active, so multiple paths as Andrew describes in
his config
 
Our config allows us:
   bring down one of the switches for upgrades
   bring down an iscsi gateway for patching
all the while at least one path is up and servicing
 
Thanks Joe


>>> Andrew Walker-Brown  6/15/2021 10:26 AM >>>
With an unstable link/port you could see the issues you describe.  Ping
doesn’t have the packet rate for you to necessarily have a packet in
transit at exactly the same time as the port fails temporarily.  Iperf
on the other hand could certainly show the issue, higher packet rate and
more likely to have packets in flight at the time of a link
fail...combined with packet loss/retries gives poor throughput.

Depending on what you want to happen, there are a number of tuning
options both on the switches and in Linux.  If you want the LAG to go down
if any link fails, then you should be able to configure this on the switches
and/or in Linux (minimum number of links = 2 if you have 2 links in the
LAG).

You can also tune the link monitoring, i.e. how frequently the links are
checked (e.g. miimon).  Bringing this value down from the default of
100ms may allow you to detect a link failure more quickly.  But you then
run the risk of detecting a transient failure that wouldn’t have
caused any issues, and the LAG becoming more unstable.
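
As a rough illustration of where those knobs live on the Linux side (the interface 
name and values here are placeholders, not recommendations - check your distro's 
bonding documentation before changing anything):

# current settings of an existing bond
cat /proc/net/bonding/bond0
cat /sys/class/net/bond0/bonding/miimon
cat /sys/class/net/bond0/bonding/min_links

# persistent settings belong in your network config; on an ifupdown-style system
# the relevant options look roughly like:
#   bond-mode 802.3ad
#   bond-miimon 50
#   bond-min-links 2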

Flapping/unstable links are the worst kind of situation.  Ideally you’d
pick that up quickly from monitoring/alerts and either fix immediately
or take the link down until you can fix it.

I run 2x10G from my hosts into separate switches (Dell S series – VLT
between switches).  Pulling a single interface has no impact on Ceph,
any packet loss is tiny and we’re not exceeding 10G bandwidth per host.

If you’re running 1G links and the LAG is already busy, a link failure
could be causing slow writes to the host, just down to
congestion...which then starts to impact the wider cluster based on how
Ceph works.

Just caveating the above with - I’m relatively new to Ceph myself

Sent from Mail for Windows 10

From: huxia...@horebdata.cn
Sent: 15 June 2021 17:52
To: Serkan Çoban
Cc: ceph-users
Subject: [ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG

When I pull out the cable, the bond works properly.

Does it mean that the port is somehow flapping? Ping still works,
but the iperf test yields very low results.





huxia...@horebdata.cn

From: Serkan Çoban
Date: 2021-06-15 18:47
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] Issues with Ceph network redundancy using L2 MC-LAG
Do you observe the same behaviour when you pull a cable?
Maybe a flapping port might cause this kind of behaviour; other than
that you shouldn't see any network disconnects.
Are you sure about the LACP configuration? What is the output of 'cat
/proc/net/bonding/bond0'?

On Tue, Jun 15, 2021 at 7:19 PM huxia...@horebdata.cn
 wrote:
>
> Dear Cephers,
>
> I encountered the following networking issue several times, and I
> wonder whether there is a solution for network HA.
>
> We build Ceph using L2 multi-chassis link aggregation groups (MC-LAG)
> to provide switch redundancy. On each host, we use 802.3ad (LACP)
> mode for NIC redundancy. However, we have observed several times that when a
> single network port fails - either the cable or the SFP+ optical module -
> the Ceph cluster is badly affected by the network problem, although in theory it
> should be able to tolerate it.
>
> Did I miss something important here? And how do we really achieve
> network HA in a Ceph cluster?
>
> best regards,
>
> Samuel
>
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io

> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Nautilus - not unmapping

2021-05-07 Thread Joe Comeau
Hi Mattias
 
thanks for the info
 
All OSDs active and clean
 
With help we found it is related to this we think:
https://tracker.ceph.com/issues/48946
 
osd_committed is better now.
I'm not sure that it is fixed, so we are watching closely.
 
currently
 ceph report |grep "osdmap_.*_committed"
report 3319096257
"osdmap_first_committed": 303919,
"osdmap_last_committed": 304671,
 
Thanks Joe
 
 
 
 
 


>>> Matthias Grandl  5/6/2021 10:39 PM >>>
Hi Joe,

are all PGs active+clean? If not, you will only get osdmap pruning, which
will try to keep only every 10th osdmap.
https://docs.ceph.com/en/latest/dev/mon-osdmap-prune/
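
(A quick way to check whether any PGs are in that state - illustrative:)

ceph health detail | grep -i remapped
ceph pg ls remapped | head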

If you have remapped PGs and need to urgently get rid of osdmaps, you can
try the upmap-remapped script to get to a pseudo clean state.

https://github.com/HeinleinSupport/cern-ceph-scripts/blob/master/tools/upmap/upmap-remapped.py


Matthias Grandl
Head of UX

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io

On Fri, May 7, 2021, 02:16 Joe Comeau  wrote:

>
> Nautilus cluster is not unmapping
>
> ceph 14.2.16
>
> ceph report |grep "osdmap_.*_committed"
> report 1175349142
>"osdmap_first_committed": 285562,
>"osdmap_last_committed": 304247,
> we've set osd_map_cache_size = 2
> but it is slowly growing to that difference as well
>
> OSD map first committed is not changing for some strange reason
>
> Cluster has been around and upgraded since either firefly or jewel
>
> I have seen a few others with this problem but no solution to it
> Any suggestions ?
>
>
> Thanks Joe
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Nautilus - not unmapping

2021-05-06 Thread Joe Comeau
 
Nautilus cluster is not unmapping
 
ceph 14.2.16
 
ceph report |grep "osdmap_.*_committed"
report 1175349142
"osdmap_first_committed": 285562,
"osdmap_last_committed": 304247,
we've set osd_map_cache_size = 2
but it is slowly growing to that difference as well
 
OSD map first committed is not changing for some strange reason
 
Cluster has been around and upgraded since either firefly or jewel
 
I have seen a few others with this problem but no solution to it
Any suggestions ?
 
 
Thanks Joe
 
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PG inconsistent with empty inconsistent objects

2021-01-26 Thread Joe Comeau
just issue the commands
 
ceph pg deep-scrub 17.1c2
this will deep-scrub that pg
 
ceph pg repair 17.7ff
this repairs the pg
 
 
 


>>> Richard Bade  1/26/2021 3:40 PM >>>
Hi Everyone,
I have also seen this "inconsistent" state with an empty list when you do
list-inconsistent-obj.

$ sudo ceph health detail
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent; 1
pgs not deep-scrubbed in time
OSD_SCRUB_ERRORS 1 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
pg 17.7ff is active+clean+inconsistent, acting
[232,242,34,280,266,21]
PG_NOT_DEEP_SCRUBBED 1 pgs not deep-scrubbed in time
pg 17.1c2 not deep-scrubbed since 2021-01-15 02:46:16.271811

$ sudo rados list-inconsistent-obj 17.7ff --format=json-pretty
{
"epoch": 183807,
"inconsistents": []
}

Usually these are caused by read errors on the disks, but I've checked
all OSD hosts that hold part of this PG and there are no SMART or dmesg
errors.
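
(For reference, the sort of per-host checks meant here - /dev/sdX is a placeholder 
for the OSD's backing disk:)

smartctl -a /dev/sdX | grep -iE 'reallocated|pending|uncorrect|error'
dmesg -T | grep -iE 'i/o error|medium error|blk_update_request'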

Rich

--
>
> Date: Sun, 17 Jan 2021 14:00:01 +0330
> From: Seena Fallah 
> Subject: [ceph-users] Re: PG inconsistent with empty inconsistent
>objects
> To: "Alexander E. Patrakov" 
> Cc: ceph-users 
> Message-ID:
>   

> Content-Type: text/plain; charset="UTF-8"
>
> It's for a long time ago and I don't have the `ceph health detail`
output!
>
> On Sat, Jan 16, 2021 at 9:42 PM Alexander E. Patrakov

> wrote:
>
> > For a start, please post the "ceph health detail" output.
> >
> > сб, 19 дек. 2020 г. в 23:48, Seena Fallah :
> > >
> > > Hi,
> > >
> > > I'm facing something strange! One of the PGs in my pool got inconsistent
> > > and when I run `rados list-inconsistent-obj $PG_ID --format=json-pretty`
> > > the `inconsistents` key was empty! What is this? Is it a bug in Ceph or..?
> > >
> > > Thanks.
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> >
> > --
> > Alexander E. Patrakov
> > CV: http://u.pc.cd/wT8otalK
> >
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph RBD iSCSI compatibility

2020-09-03 Thread Joe Comeau
Salsa
 
Again, the doc shows (and we have used) layering as the only enabled feature for
iSCSI.
Further down it gives you specific settings for the LUNs/images.
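
(Purely for illustration - the pool and image names below are made up - creating 
an image with just the layering feature looks like this:)

rbd create iscsi-pool/lun0 --size 2T --image-feature layering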
 
In our case we let vmware/veeam snapshot and make copies of our VMs
 
There is a new Beta of SES that bypasses the iscsi gateways for Windows
servers
We have not had a chance to look at it yet, but it looks very
interesting
 
Thanks Joe
 
 
 


>>> Salsa  9/3/2020 9:03 AM >>>
Joe,

sorry, I should have been clearer. The incompatible rbd features are
exclusive-lock, journaling, object-map and such.

The info comes from here:
https://documentation.suse.com/ses/6/html/ses-all/ceph-rbd.html

--
Salsa

Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Thursday, September 3, 2020 12:58 PM, Joe Comeau
 wrote:

> Here is a link for iSCSI/RBD implementation guide from SUSE for this
year for vmware (Hyper-v should be similar)
>
https://www.suse.com/media/guide/suse-enterprise-storage-implementation-guide-for-vmware-esxi-guide.pdf
>
> We've been running rbd/iscsi for 4 years
>
> Thanks Joe
>
> > > > Salsa sa...@protonmail.com 9/2/2020 3:08 PM >>>
>
> I just came across a Suse documentation stating that RBD features are
not iSCSI compatible. Since I had 2 cases of image corruption in this
scenario in 10 days I'm wondering if my setup is to blame.
>
> So question is if it is possible to provide disks to a Windows Server
2019 via iSCSI while using rbd-mirror to backup data to a second
cluster? I created all images with all features enabled. Is that
compatible?
>
>

>
> Salsa
>
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph RBD iSCSI compatibility

2020-09-03 Thread Joe Comeau
Here is a link for iSCSI/RBD implementation guide from SUSE for this year for 
vmware (Hyper-v should be similar)
https://www.suse.com/media/guide/suse-enterprise-storage-implementation-guide-for-vmware-esxi-guide.pdf
 
We've been running rbd/iscsi for 4 years 
 
Thanks Joe

>>> Salsa  9/2/2020 3:08 PM >>>
I just came across a Suse documentation stating that RBD features are not iSCSI 
compatible. Since I had 2 cases of image corruption in this scenario in 10 days 
I'm wondering if my setup is to blame.

So question is if it is possible to provide disks to a Windows Server 2019 via 
iSCSI while using rbd-mirror to backup data to a second cluster? I created all 
images with all features enabled. Is that compatible?

--
Salsa

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Question about ceph-balancer and OSD reweights

2020-02-28 Thread Joe Comeau
A while ago - before the ceph balancer - probably on Jewel,
we had a bunch of disks with different reweights to help control PG distribution.
We upgraded to Luminous.
All our disks are the same, so we set them all back to 1.0 and then let them fill 
accordingly.
 
Then we ran the balancer about 4-5 times, letting each run finish before the next - 
worked great - took a while too.
Note that when the balancer kicks off it can really move a lot of data and involve 
a lot of objects.
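
(If you want to find any leftover reweights first, something like this works - 
illustrative only, and verify the JSON field names on your release:)

ceph osd df --format json | jq -r '.nodes[] | select(.reweight != 1) | "osd.\(.id) \(.reweight)"'
# then, per OSD that shows up:
ceph osd reweight <osd-id> 1.0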
 
using it currently to help evacuate and redeploy hosts
 
HTH  Joe

>>> shubjero  2/28/2020 11:43 AM >>>
I talked to some guys on IRC about going back to the OSDs with a non-1.0
reweight and setting them to 1.0.

I went from a standard deviation of 2+ to 0.5.

Awesome.

On Wed, Feb 26, 2020 at 10:08 AM shubjero  wrote:
>
> Right, but should I be proactively returning any reweighted OSD's that
> are not 1.0 back to 1.0?
>
> On Wed, Feb 26, 2020 at 3:36 AM Konstantin Shalygin  wrote:
> >
> > On 2/26/20 3:40 AM, shubjero wrote:
> > > I'm running a Ceph Mimic cluster 13.2.6 and we use the ceph-balancer
> > > in upmap mode. This cluster is fairly old and pre-Mimic we used to set
> > > osd reweights to balance the standard deviation of the cluster. Since
> > > moving to Mimic about 9 months ago I enabled the ceph-balancer with
> > > upmap mode and let it do its thing but I did not think about setting
> > > the previously modified reweights back to 1.0 (not sure if this is
> > > fine or would have been a best practice?)
> > >
> > > Does the ceph-balancer in upmap mode manage the osd reweight
> > > dynamically? Just wondering if I need to proactively go back and set
> > > all non 1.0 reweights to 1.0.
> >
> > Balancer in upmap mode should always work on non-reweighted (i.e. reweight 1.0)
> > OSD's.
> >
> >
> >
> > k
> >
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: extract disk usage stats from running ceph cluster

2020-02-10 Thread Joe Comeau
try from admin node
 
ceph osd df
ceph osd status
thanks Joe
 

>>>  2/10/2020 10:44 AM >>>
Hello MJ,

Perhaps your PGs are unbalanced?

ceph osd df tree

Greetz
Mehmet 

Am 10. Februar 2020 14:58:25 MEZ schrieb lists :
>Hi,
>
>We would like to replace the current Seagate ST4000NM0034 HDDs in our 
>ceph cluster with SSDs, and before doing that, we would like to check out 
>the typical usage of our current drives over the last years, so we can 
>select the best (price/performance/endurance) SSD to replace them with.
>
>I am trying to extract this info from the fields "Blocks received from 
>initiator" / "Blocks sent to initiator", as these are the fields 
>smartctl gets from the Seagate disks. But the numbers seem strange, and 
>I would like to request feedback here.
>
>Three nodes, all equal, 8 OSDs per node, all 4TB ST4000NM0034 
>(filestore) HDDs with SSD-based journals:
>
>> root@node1:~# ceph osd crush tree
>> ID CLASS WEIGHT   TYPE NAME
>> -1  87.35376 root default
>> -2  29.11688 host node1
>>  0   hdd  3.64000 osd.0
>>  1   hdd  3.64000 osd.1
>>  2   hdd  3.63689 osd.2
>>  3   hdd  3.64000 osd.3
>> 12   hdd  3.64000 osd.12
>> 13   hdd  3.64000 osd.13
>> 14   hdd  3.64000 osd.14
>> 15   hdd  3.64000 osd.15
>> -3  29.12000 host node2
>>  4   hdd  3.64000 osd.4
>>  5   hdd  3.64000 osd.5
>>  6   hdd  3.64000 osd.6
>>  7   hdd  3.64000 osd.7
>> 16   hdd  3.64000 osd.16
>> 17   hdd  3.64000 osd.17
>> 18   hdd  3.64000 osd.18
>> 19   hdd  3.64000 osd.19
>> -4  29.11688 host node3
>>  8   hdd  3.64000 osd.8
>>  9   hdd  3.64000 osd.9
>> 10   hdd  3.64000 osd.10
>> 11   hdd  3.64000 osd.11
>> 20   hdd  3.64000 osd.20
>> 21   hdd  3.64000 osd.21
>> 22   hdd  3.64000 osd.22
>> 23   hdd  3.63689 osd.23
>
>We are looking at the numbers from smartctl, basing our calculations 
>on this output for each individual OSD:
>> Vendor (Seagate) cache information
>>   Blocks sent to initiator = 3783529066
>>   Blocks received from initiator = 3121186120
>>   Blocks read from cache and sent to initiator = 545427169
>>   Number of read and write commands whose size <= segment size = 93877358
>>   Number of read and write commands whose size > segment size = 2290879
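>
>A rough way to compute that read/write split straight from those two counters 
>(illustrative only; /dev/sdX is a placeholder and the exact field wording can 
>differ per vendor/firmware):
>
>> smartctl -a /dev/sdX | awk -F= '
>>   /Blocks sent to initiator/       {sent=$2}
>>   /Blocks received from initiator/ {recv=$2}
>>   END {tot=sent+recv;
>>        printf "read %.2f%%  write %.2f%%\n", 100*sent/tot, 100*recv/tot}'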
>
>I created the following spreadsheet:
>
>>          blocks sent     blocks received   total blocks
>>          to initiator    from initiator    calculated     read%    write%   aka
>> node1
>> osd0     905060564       1900663448        2805724012     32,26%   67,74%   sda
>> osd1     2270442418      3756215880        6026658298     37,67%   62,33%   sdb
>> osd2     3531938448      3940249192        7472187640     47,27%   52,73%   sdc
>> osd3     2824808123      3130655416        5955463539     47,43%   52,57%   sdd
>> osd12    1956722491      1294854032        3251576523     60,18%   39,82%   sdg
>> osd13    3410188306      1265443936        4675632242     72,94%   27,06%   sdh
>> osd14    3765454090      3115079112        6880533202     54,73%   45,27%   sdi
>> osd15    2272246730      2218847264        4491093994     50,59%   49,41%   sdj
>>
>> node2
>> osd4     3974937107      740853712         4715790819     84,29%   15,71%   sda
>> osd5     1181377668      2109150744        3290528412     35,90%   64,10%   sdb
>> osd6     1903438106      608869008         2512307114     75,76%   24,24%   sdc
>> osd7     3511170043      724345936         4235515979     82,90%   17,10%   sdd
>> osd16    2642731906      3981984640        6624716546     39,89%   60,11%   sdg
>> osd17    3994977805      3703856288        7698834093     51,89%   48,11%   sdh
>> osd18    3992157229      2096991672        6089148901     65,56%   34,44%   sdi
>> osd19    279766405       1053039640        1332806045     20,99%   79,01%   sdj
>>
>> node3
>> osd8     3711322586      234696960         3946019546     94,05%   5,95%    sda
>> osd9     1203912715      3132990000        4336902715     27,76%   72,24%   sdb
>> osd10    912356010       1681434416        2593790426     35,17%   64,83%   sdc
>> osd11    810488345       2626589896        3437078241     23,58%   76,42%   sdd
>> osd20    1506879946      2421596680        3928476626     38,36%   61,64%   sdg
>> osd21    2991526593      7525120           2999051713     99,75%   0,25%    sdh
>> osd22    29560337        3226114552        3255674889     0,91%    99