[ceph-users] Docs on Containerized Mon Maintenance

2021-06-15 Thread Phil Merricks
Hey folks,

I'm working through some basic ops drills and noticed what I think is an
inconsistency in the cephadm docs.  Some Googling suggests this is a
known issue, but I haven't found a clear direction for cooking up a solution
yet.

On a cluster with 5 mons, 2 were abruptly removed when their host OS
decided to do scheduled maintenance without asking first.  Those hosts only
ran mons (plus mds/crash/node-exporter), so I still have a 3-mon quorum
and the cluster is happy.

It's not clear to me how to add these hosts back in as mons, though.  The
troubleshooting docs describe bringing all mons down and then extracting a
monmap.  I tried this through various iterations: bringing everything down,
bringing one mon back up and entering its container; bringing everything
down and trying to run ceph-mon from a cephadm shell; and so on.  I either
got rocksdb lock errors, presumably because a mon was still running, or an
error that the path to the mon data didn't exist, presumably for the
opposite reason.

Is there guidance on the container-friendly way to perform the monmap
maintenance?
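
(For concreteness, this is roughly the sequence I was attempting, adapted
from the troubleshooting docs.  It's just a sketch with placeholder names,
so please correct anything that is off for containerized daemons; <fsid>,
<host> and <dead-mon-name> stand in for my cluster's values:

  # on each mon host, stop the mon:
  systemctl stop ceph-<fsid>@mon.<host>.service
  # enter one surviving mon's container environment with its data dir mounted:
  cephadm shell --name mon.<host>
  # inside that shell:
  ceph-mon -i <host> --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap
  monmaptool --rm <dead-mon-name> /tmp/monmap
  ceph-mon -i <host> --inject-monmap /tmp/monmap
)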

I did think that because I still have quorum, I could simply do ceph orch
apply mon label:mon instead, but I am nervous this might upset my remaining
mons.  Looking at the ceph orch ls output I see:

root@kida:/# ceph orch ls
NAME                       PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager                      1/1      7m ago     2h   count:1
crash                             5/5      9m ago     2h   *
grafana                           1/1      7m ago     2h   count:1
mds.media                         3/3      9m ago     2h   thebends;okcomputer;amnesiac
mgr                               2/2      9m ago     2h   count:2
mon                               3/5      9m ago     2h   label:mon
node-exporter                     5/5      9m ago     2h   *
osd.all-available-devices         5/10     9m ago     2h   *
prometheus                        1/1      7m ago     2h   count:1
root@kida:/#

So is it expecting 2 more mons, or has it autoscaled down cleverly?
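
(If the orchestrator route is the right one, I assume the commands involved
would be roughly these; just a sketch on my part, with the host names taken
from my own cluster:

  ceph orch host label add amnesiac mon
  ceph orch host label add kingoflimbs mon
  ceph orch apply mon label:mon

and since the placement already says label:mon, perhaps all that is really
needed is to restart the two stopped daemons, e.g. "ceph orch daemon restart
mon.amnesiac"?)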

Looking at ceph orch ps I see:
root@kida:/# ceph orch ps
NAME                         HOST         PORTS        STATUS         REFRESHED  AGE  VERSION  IMAGE ID      CONTAINER ID
alertmanager.kida            kida         *:9093,9094  running (2h)   8m ago     2h   0.20.0   0881eb8f169f  89c604455194
crash.amnesiac               amnesiac                  running (11h)  8m ago     11h  16.2.4   8d91d370c2b8  bff086c930db
crash.kida                   kida                      running (2h)   8m ago     2h   16.2.4   8d91d370c2b8  b0ac059be109
crash.kingoflimbs            kingoflimbs               running (13h)  8m ago     13h  16.2.4   8d91d370c2b8  b0955309a8b9
crash.okcomputer             okcomputer                running (2h)   10m ago    2h   16.2.4   8d91d370c2b8  a75cf65ef235
crash.thebends               thebends                  running (2h)   8m ago     2h   16.2.4   8d91d370c2b8  befe9c1015f3
grafana.kida                 kida         *:3000       running (2h)   8m ago     2h   6.7.4    ae5c36c3d3cd  f85747138299
mds.media.amnesiac.uujwlk    amnesiac                  running (11h)  8m ago     2h   16.2.4   8d91d370c2b8  512a2fcc0f97
mds.media.okcomputer.nednib  okcomputer                running (2h)   10m ago    2h   16.2.4   8d91d370c2b8  10c6244a9308
mds.media.thebends.pqsfeb    thebends                  running (2h)   8m ago     2h   16.2.4   8d91d370c2b8  c1b75831a973
mgr.kida.kchysa              kida         *:9283       running (2h)   8m ago     2h   16.2.4   8d91d370c2b8  602acc0d8df3
mgr.okcomputer.rjtrqw        okcomputer   *:8443,9283  running (2h)   10m ago    2h   16.2.4   8d91d370c2b8  605a8a25a604
mon.amnesiac                 amnesiac                  stopped        8m ago     2h
mon.kida                     kida                      running (2h)   8m ago     2h   16.2.4   8d91d370c2b8  a441563a978d
mon.kingoflimbs              kingoflimbs               stopped        8m ago     2h
mon.okcomputer               okcomputer                running (2h)   10m ago    2h   16.2.4   8d91d370c2b8  c4297efafe27
mon.thebends                 thebends                  running (2h)   8m ago     2h   16.2.4   8d91d370c2b8  e2394d5f152b
node-exporter.amnesiac       amnesiac     *:9100       running (11h)  8m ago     2h   0.18.1   e5a616e4b9cf  da3c69057c4f
node-exporter.kida           kida         *:9100       running (2h)   8m ago     2h   0.18.1   e5a616e4b9cf  5c9219a29257
node-exporter.kingoflimbs    kingoflimbs  *:9100       running (13h)  8m ago     2h   0.18.1   e5a616e4b9cf  c2236491fb6e
node-exporter.okcomputer     okcomputer   *:9100       running (2h)   10m ago    2h   0.18.1   e5a616e4b9cf  2e53a82eed32
node-exporter.thebends       thebends     *:9100       running (2h)   8m ago     2h   0.18.1   e5a616e4b9cf  def6bdd359d6
osd.0                        kida                      running (2h)   8m

[ceph-users] Re: Mon crash when client mounts CephFS

2021-06-15 Thread Phil Merricks
Thanks for the replies folks.

This one was resolved.  I wish I could tell you exactly what I changed to fix
it, but several undocumented changes were made to the deployment script
I'm using while I was distracted by something else.  Tearing down and
redeploying today does not seem to suffer from this particular issue.

I do have a new thing though, less concerning.  I'll start a new thread..

On Tue, 8 Jun 2021 at 12:48, Robert W. Eckert  wrote:

> When I had issues with the monitors, it was a permissions problem on the monitor folder
> under /var/lib/ceph//mon./store.db;
> make sure it is owned by the ceph user.
>
> My issues originated from a hardware problem: the memory needed 1.3 V, but
> the motherboard was only supplying 1.2 (the memory had the issue, the
> firmware said 1.2 V required, the sticker on the side said 1.3).  So I had a
> script that copied the store across and fixed the permissions.
>
> The other thing that helped a lot, compared to the crash logs, was to edit
> the unit.run file and remove the --rm parameter from the podman command.  That lets
> you see the container logs using "podman logs", which is a bit more detailed.
>
> When you do this, you will need to restore the parameter afterwards, and clean up
> the 'cid' and 'pid' files from /run/ceph-@mon..service-cid
> and /run/ceph-@mon..service-pid
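>
> Roughly, as a sketch (paths and unit names differ per cluster; <fsid> and
> <host> here are placeholders for your own values):
>
>   systemctl stop ceph-<fsid>@mon.<host>.service
>   vi /var/lib/ceph/<fsid>/mon.<host>/unit.run    # drop the --rm from the podman run line
>   systemctl start ceph-<fsid>@mon.<host>.service
>   podman ps                                      # find the mon container id
>   podman logs <container id>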
>
> My reference is Red Hat Enterprise Linux 8, so things may be a bit different
> on Ubuntu.
>
> If you get a message about the store.db files being off, it's easiest to
> stop the working node, copy them over, set the user id/group to ceph and
> start things up.
>
> Rob
>
> -Original Message-
> From: Phil Merricks 
> Sent: Tuesday, June 8, 2021 3:18 PM
> To: ceph-users 
> Subject: [ceph-users] Mon crash when client mounts CephFS
>
> Hey folks,
>
> I have deployed a 3 node dev cluster using cephadm.  Deployment went
> smoothly and all seems well.
>
> If I try to mount a CephFS from a client node, however, 2 of the 3 mons crash.
> I've begun picking through the logs to see what I can see, but so far,
> other than seeing the crash in the log itself, it's unclear what the cause
> of the crash is.
>
> Here's a log.  You can see where the crash is
> occurring around the line that begins with "Jun 08 18:56:04 okcomputer
> podman[790987]:"
>
> I would welcome any advice on either what the cause may be, or how I can
> advance the analysis of what's wrong.
>
> Best regards
>
> Phil
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] JSON output schema

2021-06-15 Thread Vladimir Prokofev
Good day.

I'm writing some code to parse output data for monitoring purposes.
The data is that of "ceph status -f json", "ceph df -f json", "ceph osd
perf -f json" and "ceph osd pool stats -f json".
I also need to support all major Ceph releases, from Jewel through
Pacific.

What I've stumbled upon is that:
 - keys in the JSON output are not present if there is no corresponding data.
For example, the key ['pgmap', 'read_bytes_sec'] will not be present in
"ceph status" output if there is no read activity in the cluster;
 - some keys changed between versions. For example, the ['health']['status'] key
is not present in Jewel but is available in all later versions;
vice versa, the key ['osdmap', 'osdmap'] is not present in Pacific but is in
all the previous versions.
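
To cope with missing keys I currently do defensive lookups with defaults,
e.g. with jq (just a sketch of the idea; the real parsing code does the
same thing):

  ceph status -f json | jq -r '.pgmap.read_bytes_sec // 0'
  ceph status -f json | jq -r '.health.status // .health.overall_status // "UNKNOWN"'

(the fallback to ['health']['overall_status'] is what seems to be the closest
equivalent on Jewel.)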

So I need a list of all possible keys for all Ceph releases. Any
ideas how this can be achieved? My only thought at the moment is to build a
"failing" cluster with all the possible states and pull reference data out of it.
Not only is this tedious work, since it requires each possible cluster
version, it is also error-prone.
Is there any publicly available JSON schema for output?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] How to orch apply single site rgw with custom front-end

2021-06-15 Thread Vladimir Brik

Hello

How can I use ceph orch apply to deploy single-site rgw 
daemons with a custom frontend configuration?


Basically, I have three servers in a DNS round-robin, each 
running a 15.2.12 rgw daemon with this configuration:
rgw_frontends = civetweb num_threads=5000 port=443s 
ssl_certificate=/etc/ceph/rgw.crt


I would like to deploy 16.2.4 rgw daemons, but I don't know 
how to configure them. When I used "ceph orch apply rgw 
 ", it created a new entry in the monitor 
configuration database instead of using the existing 
rgw_frontends entry.


I am guessing that I need to name the config db entry 
correctly, but I don't know what name to use. Currently I have

$ ceph config get client rgw_frontends
civetweb num_threads=5000 port=443s 
ssl_certificate=/etc/ceph/rgw.crt
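
For reference, what I was planning to try next is a Pacific service spec 
along these lines, applied with "ceph orch apply -i rgw.yaml". This is 
untested and the field names are from memory, so treat it as a sketch and 
please correct me (service_id, host names and the certificate are 
placeholders):

service_type: rgw
service_id: myrgw
placement:
  hosts:
    - host1
    - host2
    - host3
spec:
  rgw_frontend_port: 443
  ssl: true
  rgw_frontend_ssl_certificate: |
    -----BEGIN CERTIFICATE-----
    ...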


Can anybody help?


Thanks,

Vlad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph monitor won't start after Ubuntu update

2021-06-15 Thread Petr
Hello Ceph-users,

I've upgraded my Ubuntu server from 18.04.5 LTS to Ubuntu 20.04.2 LTS via 
'do-release-upgrade'. During that process the ceph packages were upgraded from 
Luminous to Octopus, and now the ceph-mon daemon (I have only one) won't start; 
the log error is:
"2021-06-15T20:23:41.843+ 7fbb55e9b540 -1 mon.target@-1(probing) e2 current 
monmap has recorded min_mon_release 12 (luminous) is >2 releases older than 
installed 15 (octopus);
you can only upgrade 2 releases at a time you should first upgrade to 13 
(mimic) or 14 (nautilus) stopping."

Is there any way to get the cluster running, or at least to get the data off the OSDs?

Will appreciate any help.
Thank you

-- 
Best regards,
Petr
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-15 Thread huxia...@horebdata.cn
I run 2x 10G on my hosts, and I would expect to tolerate one bond having one link down. 

From what you suggest, I will check link monitoring, to make sure a failing 
link is removed automatically, without the need to manually pull out the 
cable.

thanks and best regards,

samuel





huxia...@horebdata.cn
 
From: Andrew Walker-Brown
Date: 2021-06-15 19:26
To: huxia...@horebdata.cn; Serkan Çoban
CC: ceph-users
Subject: RE: [ceph-users] Re: Issues with Ceph network redundancy using L2 
MC-LAG
With an unstable link/port you could see the issues you describe.  Ping doesn’t 
have the packet rate for you to necessarily have a packet in transit at exactly 
the same time as the port fails temporarily.  Iperf on the other hand could 
certainly show the issue, higher packet rate and more likely to have packets in 
flight at the time of a link fail...combined with packet loss/retries gives 
poor throughput.
 
Depending on what you want to happen, there are a number of tuning options both 
on the switches and Linux.  If you want the LAG to be down if any link fails, 
the you should be able to config this on the switches and/or Linux  (minimum 
number of links = 2 if you have 2 links in the lag).  
 
You can also tune the link monitoring, how frequently the links are checked 
(e.g. miimon) etc.  Bringing this value down from the default of 100ms may 
allow you to detect a link failure more quickly.  But you then run into the 
chance if detecting a transient failure that wouldn’t have caused any 
issuesand the LAG becoming more unstable.
 
Flapping/unstable links are the worst kind of situation.  Ideally you’d pick 
that up quickly from monitoring/alerts and either fix immediately or take the 
link down until you can fix it.
 
I run 2x10G from my hosts into separate switches (Dell S series – VLT between 
switches).  Pulling a single interface has no impact on Ceph, any packet loss 
is tiny and we’re not exceeding 10G bandwidth per host.
 
If you’re running 1G links and the LAG is already busy, a link failure could be 
causing slow writes to the host, just down to congestion...which then starts to 
impact the wider cluster based on how Ceph works.
 
Just caveating the above with - I’m relatively new to Ceph myself
 
Sent from Mail for Windows 10
 
From: huxia...@horebdata.cn
Sent: 15 June 2021 17:52
To: Serkan Çoban
Cc: ceph-users
Subject: [ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG
 
When i pull out the cable, then the bond is working properly.

Does it mean that the port is somehow flapping? Ping can still work, but the 
iperf test yields very low results.





huxia...@horebdata.cn
 
From: Serkan Çoban
Date: 2021-06-15 18:47
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] Issues with Ceph network redundancy using L2 MC-LAG
Do you observe the same behaviour when you pull a cable?
Maybe a flapping port might cause this kind of behaviour, other than
that you should't see any network disconnects.
Are you sure about LACP configuration, what is the output of 'cat
/proc/net/bonding/bond0'
 
On Tue, Jun 15, 2021 at 7:19 PM huxia...@horebdata.cn
 wrote:
>
> Dear Cephers,
>
> I encountered the following networking issue several times, and i wonder 
> whether there is a solution for networking HA solution.
>
> We build ceph using L2 multi chassis link aggregation group (MC-LAG ) to 
> provide switch redundancy. On each host, we use 802.3ad, LACP
> mode for NIC redundancy. However, we observe several times, when a single 
> network port, either the cable, or the SFP+ optical module fails, Ceph 
> cluster  is badly affected by networking, although in theory it should be 
> able to tolerate.
>
> Did i miss something important here? and how to really achieve networking HA 
> in Ceph cluster?
>
> best regards,
>
> Samuel
>
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph PGs issues

2021-06-15 Thread Aly, Adel
Hi Reed,

Thank you for getting back to us.

We did indeed have several disk failures at the same time.

Regarding the OSD map, we have an OSD that failed and needed to be removed, but 
we didn't update the crush map.

The question here is: is it safe to update the crush map without affecting the 
availability of the data?

We can free up more space on the monitors if that will indeed help.

More information which can be helpful:

# ceph -v
ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)

# ceph health detail
https://pastebin.pl/view/2b8b337d

# ceph osd pool ls detail
pool 3 'cephfs-data' erasure size 12 min_size 11 crush_rule 1 object_hash 
rjenkins pg_num 3072 pgp_num 3072 last_change 370219 lfor 0/367599 flags 
hashpspool,ec_overwrites,selfmanaged_snaps stripe_width 40960 fast_read 1 
compression_algorithm snappy compression_mode force application cephfs
removed_snaps [2~7c]
pool 4 'cephfs-meta' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 1024 pgp_num 1024 last_change 370219 lfor 0/367414 flags 
hashpspool stripe_width 0 compression_algorithm none compression_mode none 
application cephfs

# ceph osd tree
https://pastebin.pl/view/eac56017

Our main struggle is that when we try to rsync data, the rsync process hangs because 
it encounters an inaccessible object.

Is there a way to take the incomplete PGs out so that we can copy the rest of the 
data smoothly, without having to restart the rsync process?

Kind regards,
adel

-Original Message-
From: Reed Dier 
Sent: Tuesday, June 15, 2021 4:21 PM
To: Aly, Adel 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] ceph PGs issues

Caution! External email. Do not open attachments or click links, unless this 
email comes from a known sender and you know the content is safe.

You have incomplete PGs, which means you have inactive data, because the data 
isn't there.

This will typically only happen when you have multiple concurrent disk 
failures, or something like that, so I think there is some missing info.

>1 osds exist in the crush map but not in the osdmap

This seems like a red flag to have an OSD in the crush map but not the osdmap.

>mons xyz01,xyz02 are low on available space

Your mons are probably filling up data running in the warn state.
This can be problematic for recovery.

I think you will be more likely to receive some useful suggestions by providing 
things like which version of ceph you are using ($ ceph -v), major events that 
caused this, poo ($ ceph osd pool ls detail) and osd  ($ ceph osd tree) 
topology, as well as maybe detailed health output ($ ceph health detail).

Given how much data some things may be, like the osd tree, you may want to 
paste to pastebin and link here.

Reed

> On Jun 15, 2021, at 2:48 AM, Aly, Adel  wrote:
>
> Dears,
>
> We have a ceph cluster with 4096 PGs out of which +100 PGs are not 
> active+clean.
>
> On top of the ceph cluster, we have a ceph FS, with 3 active MDS servers.
>
> It seems that we can’t get all the files out of it because of the affected 
> PGs.
>
> The object store has more than 400 million objects.
>
> When we do “rados -p cephfs-data ls”, the listing stops (hangs) after listing 
> +11 million objects.
>
> When we try to access an object which we can’t copy, the rados command hangs 
> forever:
>
> ls -I 
> 2199140525188
>
> printf "%x\n" 2199140525188
> 20006fd6484
>
> rados -p cephfs-data stat 20006fd6484. (hangs here)
>
> This is the current status of the ceph cluster:
>health: HEALTH_WARN
>1 MDSs report slow metadata IOs
>1 MDSs report slow requests
>1 MDSs behind on trimming
>1 osds exist in the crush map but not in the osdmap
>*Reduced data availability: 22 pgs inactive, 22 pgs incomplete*
>240324 slow ops, oldest one blocked for 391503 sec, daemons
> [osd.144,osd.159,osd.180,osd.184,osd.242,osd.271,osd.275,osd.278,osd.280,osd.332]...
>  h ave slow ops.
>mons xyz01,xyz02 are low on available space
>
>  services:
>mon: 4 daemons, quorum abc001,abc002,xyz02,xyz01
>mgr: abc002(active), standbys: xyz01, xyz02, abc001
>mds: cephfs-3/3/3 up  
> {0=xyz02=up:active,1=abc001=up:active,2=abc002=up:active}, 1 up:standby
>osd: 421 osds: 421 up, 421 in; 7 remapped pgs
>
>  data:
>pools:   2 pools, 4096 pgs
>objects: 403.4 M objects, 846 TiB
>usage:   1.2 PiB used, 1.4 PiB / 2.6 PiB avail
>pgs: 0.537% pgs not active
> 3968 active+clean
> 96   active+clean+scrubbing+deep+repair
> 15   incomplete
> 10   active+clean+scrubbing
> 7remapped+incomplete
>
>  io:
>client:   89 KiB/s rd, 13 KiB/s wr, 34 op/s rd, 1 op/s wr
>
> The 100+ PGs have been in this state for a long time already.
>
> Sometimes when we try to copy some files the rsync process hangs and we can’t 
> kill it and from the process stack, it seems to be hanging on ceph i/o 
> operation.
>

[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-06-15 Thread Ackermann, Christoph
Hi Dan,

Thanks for the hint, I'll try this tomorrow in a test bed first. This
evening I had to fix some Bareos client systems to get a quiet night's sleep. ;-)

Will give you feedback asap.

Best regards,
Christoph

Am Di., 15. Juni 2021 um 21:03 Uhr schrieb Dan van der Ster <
d...@vanderster.com>:

> Hi Christoph,
>
> What about the max osd? If "ceph osd getmaxosd" is not 76 on this
> cluster, then set it: `ceph osd setmaxosd 76`.
>
> -- dan
>
> On Tue, Jun 15, 2021 at 8:54 PM Ackermann, Christoph
>  wrote:
> >
> > Dan,
> >
> > sorry, we have no gaps in osd numbering:
> > isceph@ceph-deploy:~$ sudo ceph osd ls |wc -l; sudo ceph osd tree |
> sort -n -k1  |tail
> > 76
> > [..]
> >  73ssd0.28600  osd.73  up   1.0
> 1.0
> >  74ssd0.27689  osd.74  up   1.0
> 1.0
> >  75ssd0.28600  osd.75  up   1.0
> 1.0
> >
> > The (quite old) cluster is running v15.2.13 very well. :-)   OSDs
> running on top of  (newest) centos8.4 bare metal, mon/mds run on (bewest)
> Centos 7.9  VMs.  Problem just appears only with the newest Centos8 client
> libceph.
> >
> > Christoph
> >
> >
> >
> >
> >
> > Am Di., 15. Juni 2021 um 20:26 Uhr schrieb Dan van der Ster <
> d...@vanderster.com>:
> >>
> >> Replying to own mail...
> >>
> >> On Tue, Jun 15, 2021 at 7:54 PM Dan van der Ster 
> wrote:
> >> >
> >> > Hi Ilya,
> >> >
> >> > We're now hitting this on CentOS 8.4.
> >> >
> >> > The "setmaxosd" workaround fixed access to one of our clusters, but
> >> > isn't working for another, where we have gaps in the osd ids, e.g.
> >> >
> >> > # ceph osd getmaxosd
> >> > max_osd = 553 in epoch 691642
> >> > # ceph osd tree | sort -n -k1 | tail
> >> >  541   ssd   0.87299 osd.541up  1.0
> 1.0
> >> >  543   ssd   0.87299 osd.543up  1.0
> 1.0
> >> >  548   ssd   0.87299 osd.548up  1.0
> 1.0
> >> >  552   ssd   0.87299 osd.552up  1.0
> 1.0
> >> >
> >> > Is there another workaround for this?
> >>
> >> The following seems to have fixed this cluster:
> >>
> >> 1. Fill all gaps with: ceph osd new `uuid`
> >> ^^ after this, the cluster is still not mountable.
> >> 2. Purge all the gap osds: ceph osd purge 
> >>
> >> I filled/purged a couple hundred gap osds, and now the cluster can be
> mounted.
> >>
> >> Cheers!
> >>
> >> Dan
> >>
> >> P.S. The bugzilla is not public:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1972278
> >>
> >> >
> >> > Cheers, dan
> >> >
> >> >
> >> > On Mon, May 3, 2021 at 12:32 PM Ilya Dryomov 
> wrote:
> >> > >
> >> > > On Mon, May 3, 2021 at 12:27 PM Magnus Harlander 
> wrote:
> >> > > >
> >> > > > Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
> >> > > >
> >> > > > ceph osd setmaxosd 10
> >> > > >
> >> > > > Bingo! Mount works again.
> >> > > >
> >> > > > Vry strange things are going on here (-:
> >> > > >
> >> > > > Thanx a lot for now!! If I can help to track it down, please let
> me know.
> >> > >
> >> > > Good to know it helped!  I'll think about this some more and
> probably
> >> > > plan to patch the kernel client to be less stringent and not choke
> on
> >> > > this sort of misconfiguration.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Ilya
> >> > > ___
> >> > > ceph-users mailing list -- ceph-users@ceph.io
> >> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-06-15 Thread Dan van der Ster
Hi Christoph,

What about the max osd? If "ceph osd getmaxosd" is not 76 on this
cluster, then set it: `ceph osd setmaxosd 76`.

-- dan

On Tue, Jun 15, 2021 at 8:54 PM Ackermann, Christoph
 wrote:
>
> Dan,
>
> sorry, we have no gaps in osd numbering:
> isceph@ceph-deploy:~$ sudo ceph osd ls |wc -l; sudo ceph osd tree | sort -n 
> -k1  |tail
> 76
> [..]
>  73ssd0.28600  osd.73  up   1.0  
> 1.0
>  74ssd0.27689  osd.74  up   1.0  
> 1.0
>  75ssd0.28600  osd.75  up   1.0  
> 1.0
>
> The (quite old) cluster is running v15.2.13 very well. :-)   OSDs running on 
> top of  (newest) centos8.4 bare metal, mon/mds run on (bewest) Centos 7.9  
> VMs.  Problem just appears only with the newest Centos8 client libceph.
>
> Christoph
>
>
>
>
>
> Am Di., 15. Juni 2021 um 20:26 Uhr schrieb Dan van der Ster 
> :
>>
>> Replying to own mail...
>>
>> On Tue, Jun 15, 2021 at 7:54 PM Dan van der Ster  wrote:
>> >
>> > Hi Ilya,
>> >
>> > We're now hitting this on CentOS 8.4.
>> >
>> > The "setmaxosd" workaround fixed access to one of our clusters, but
>> > isn't working for another, where we have gaps in the osd ids, e.g.
>> >
>> > # ceph osd getmaxosd
>> > max_osd = 553 in epoch 691642
>> > # ceph osd tree | sort -n -k1 | tail
>> >  541   ssd   0.87299 osd.541up  1.0 1.0
>> >  543   ssd   0.87299 osd.543up  1.0 1.0
>> >  548   ssd   0.87299 osd.548up  1.0 1.0
>> >  552   ssd   0.87299 osd.552up  1.0 1.0
>> >
>> > Is there another workaround for this?
>>
>> The following seems to have fixed this cluster:
>>
>> 1. Fill all gaps with: ceph osd new `uuid`
>> ^^ after this, the cluster is still not mountable.
>> 2. Purge all the gap osds: ceph osd purge 
>>
>> I filled/purged a couple hundred gap osds, and now the cluster can be 
>> mounted.
>>
>> Cheers!
>>
>> Dan
>>
>> P.S. The bugzilla is not public:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1972278
>>
>> >
>> > Cheers, dan
>> >
>> >
>> > On Mon, May 3, 2021 at 12:32 PM Ilya Dryomov  wrote:
>> > >
>> > > On Mon, May 3, 2021 at 12:27 PM Magnus Harlander  
>> > > wrote:
>> > > >
>> > > > Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
>> > > >
>> > > > ceph osd setmaxosd 10
>> > > >
>> > > > Bingo! Mount works again.
>> > > >
>> > > > Vry strange things are going on here (-:
>> > > >
>> > > > Thanx a lot for now!! If I can help to track it down, please let me 
>> > > > know.
>> > >
>> > > Good to know it helped!  I'll think about this some more and probably
>> > > plan to patch the kernel client to be less stringent and not choke on
>> > > this sort of misconfiguration.
>> > >
>> > > Thanks,
>> > >
>> > > Ilya
>> > > ___
>> > > ceph-users mailing list -- ceph-users@ceph.io
>> > > To unsubscribe send an email to ceph-users-le...@ceph.io
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-06-15 Thread Ackermann, Christoph
Dan,

sorry, we have no gaps in osd numbering:
isceph@ceph-deploy:~$ sudo ceph osd ls |wc -l; sudo ceph osd tree | sort -n
-k1  |tail
76
[..]
 73ssd0.28600  osd.73  up   1.0
 1.0
 74ssd0.27689  osd.74  up   1.0
 1.0
 75ssd0.28600  osd.75  up   1.0
 1.0

The (quite old) cluster is running v15.2.13 very well. :-)   The OSDs run
on top of (newest) CentOS 8.4 bare metal; mon/mds run on (newest) CentOS
7.9 VMs.  The problem only appears with the newest CentOS 8 client
libceph.

Christoph





Am Di., 15. Juni 2021 um 20:26 Uhr schrieb Dan van der Ster <
d...@vanderster.com>:

> Replying to own mail...
>
> On Tue, Jun 15, 2021 at 7:54 PM Dan van der Ster 
> wrote:
> >
> > Hi Ilya,
> >
> > We're now hitting this on CentOS 8.4.
> >
> > The "setmaxosd" workaround fixed access to one of our clusters, but
> > isn't working for another, where we have gaps in the osd ids, e.g.
> >
> > # ceph osd getmaxosd
> > max_osd = 553 in epoch 691642
> > # ceph osd tree | sort -n -k1 | tail
> >  541   ssd   0.87299 osd.541up  1.0
> 1.0
> >  543   ssd   0.87299 osd.543up  1.0
> 1.0
> >  548   ssd   0.87299 osd.548up  1.0
> 1.0
> >  552   ssd   0.87299 osd.552up  1.0
> 1.0
> >
> > Is there another workaround for this?
>
> The following seems to have fixed this cluster:
>
> 1. Fill all gaps with: ceph osd new `uuid`
> ^^ after this, the cluster is still not mountable.
> 2. Purge all the gap osds: ceph osd purge 
>
> I filled/purged a couple hundred gap osds, and now the cluster can be
> mounted.
>
> Cheers!
>
> Dan
>
> P.S. The bugzilla is not public:
> https://bugzilla.redhat.com/show_bug.cgi?id=1972278
>
> >
> > Cheers, dan
> >
> >
> > On Mon, May 3, 2021 at 12:32 PM Ilya Dryomov  wrote:
> > >
> > > On Mon, May 3, 2021 at 12:27 PM Magnus Harlander 
> wrote:
> > > >
> > > > Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
> > > >
> > > > ceph osd setmaxosd 10
> > > >
> > > > Bingo! Mount works again.
> > > >
> > > > Vry strange things are going on here (-:
> > > >
> > > > Thanx a lot for now!! If I can help to track it down, please let me
> know.
> > >
> > > Good to know it helped!  I'll think about this some more and probably
> > > plan to patch the kernel client to be less stringent and not choke on
> > > this sort of misconfiguration.
> > >
> > > Thanks,
> > >
> > > Ilya
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-06-15 Thread Dan van der Ster
Replying to own mail...

On Tue, Jun 15, 2021 at 7:54 PM Dan van der Ster  wrote:
>
> Hi Ilya,
>
> We're now hitting this on CentOS 8.4.
>
> The "setmaxosd" workaround fixed access to one of our clusters, but
> isn't working for another, where we have gaps in the osd ids, e.g.
>
> # ceph osd getmaxosd
> max_osd = 553 in epoch 691642
> # ceph osd tree | sort -n -k1 | tail
>  541   ssd   0.87299 osd.541up  1.0 1.0
>  543   ssd   0.87299 osd.543up  1.0 1.0
>  548   ssd   0.87299 osd.548up  1.0 1.0
>  552   ssd   0.87299 osd.552up  1.0 1.0
>
> Is there another workaround for this?

The following seems to have fixed this cluster:

1. Fill all gaps with: ceph osd new `uuid`
^^ after this, the cluster is still not mountable.
2. Purge all the gap osds: ceph osd purge 

I filled/purged a couple hundred gap osds, and now the cluster can be mounted.
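
Roughly, in script form (a sketch only; note each id that "ceph osd new"
prints so you purge exactly the gap osds you just created and nothing else):

max=$(ceph osd getmaxosd | awk '{print $3}')
while [ "$(ceph osd ls | wc -l)" -lt "$max" ]; do
    ceph osd new "$(uuidgen)"    # prints the id it allocated
done
# then, for each id noted above:
# ceph osd purge <id> --yes-i-really-mean-it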

Cheers!

Dan

P.S. The bugzilla is not public:
https://bugzilla.redhat.com/show_bug.cgi?id=1972278

>
> Cheers, dan
>
>
> On Mon, May 3, 2021 at 12:32 PM Ilya Dryomov  wrote:
> >
> > On Mon, May 3, 2021 at 12:27 PM Magnus Harlander  wrote:
> > >
> > > Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
> > >
> > > ceph osd setmaxosd 10
> > >
> > > Bingo! Mount works again.
> > >
> > > Vry strange things are going on here (-:
> > >
> > > Thanx a lot for now!! If I can help to track it down, please let me know.
> >
> > Good to know it helped!  I'll think about this some more and probably
> > plan to patch the kernel client to be less stringent and not choke on
> > this sort of misconfiguration.
> >
> > Thanks,
> >
> > Ilya
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] problem using gwcli; package dependancy lockout

2021-06-15 Thread Philip Brown
I'm trying to update a Ceph Octopus install, to add an iSCSI gateway, using 
ceph-ansible, and gwcli won't run for me.


The ansible run went well.. but when I try to actually use gwcli, I get
(blahblah)
ImportError: No module named rados

which isn't too surprising, since "python-rados" is not installed. 
HOWEVER.

The ceph repos installed by ceph-ansible (5.0, the octopus release) are
http://download.ceph.com/ceph-iscsi/3/rpm/el7/noarch
which provides
  python3-rados

This supposedly "obsoletes" python-rados. Except it doesn't, because 
python-rados is for python2.
But even if it didn't, there is no python2 rados module provided in the 
ceph-iscsi repo or the main ceph repo.

and gwcli is still python 2.


So, what am I supposed to do now?
It seems like I need a python2 version of the rados module, but the ceph repos don't 
provide one, so they basically ship a broken gwcli?




--
Philip Brown| Sr. Linux System Administrator | Medata, Inc. 
5 Peters Canyon Rd Suite 250 
Irvine CA 92606 
Office 714.918.1310| Fax 714.918.1325 
pbr...@medata.com| www.medata.com
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-06-15 Thread Dan van der Ster
Hi Ilya,

We're now hitting this on CentOS 8.4.

The "setmaxosd" workaround fixed access to one of our clusters, but
isn't working for another, where we have gaps in the osd ids, e.g.

# ceph osd getmaxosd
max_osd = 553 in epoch 691642
# ceph osd tree | sort -n -k1 | tail
 541   ssd   0.87299 osd.541up  1.0 1.0
 543   ssd   0.87299 osd.543up  1.0 1.0
 548   ssd   0.87299 osd.548up  1.0 1.0
 552   ssd   0.87299 osd.552up  1.0 1.0

Is there another workaround for this?

Cheers, dan


On Mon, May 3, 2021 at 12:32 PM Ilya Dryomov  wrote:
>
> On Mon, May 3, 2021 at 12:27 PM Magnus Harlander  wrote:
> >
> > Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
> >
> > ceph osd setmaxosd 10
> >
> > Bingo! Mount works again.
> >
> > Vry strange things are going on here (-:
> >
> > Thanx a lot for now!! If I can help to track it down, please let me know.
>
> Good to know it helped!  I'll think about this some more and probably
> plan to patch the kernel client to be less stringent and not choke on
> this sort of misconfiguration.
>
> Thanks,
>
> Ilya
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-15 Thread Anthony D'Atri


> On Jun 15, 2021, at 10:26 AM, Andrew Walker-Brown  
> wrote:
> 
> With an unstable link/port you could see the issues you describe.  Ping 
> doesn’t have the packet rate for you to necessarily have a packet in transit 
> at exactly the same time as the port fails temporarily.  Iperf on the other 
> hand could certainly show the issue, higher packet rate and more likely to 
> have packets in flight at the time of a link fail...combined with packet 
> loss/retries gives poor throughput.
> 
> Depending on what you want to happen, there are a number of tuning options 
> both on the switches and Linux.  If you want the LAG to be down if any link 
> fails, the you should be able to config this on the switches and/or Linux  
> (minimum number of links = 2 if you have 2 links in the lag).

Or ensure that the links are active/active.

Some of the trickiest situations I’ve encountered are when a bond is configured 
for active/backup, and there’s a latent issue with the backup link.  Active 
goes down, and the bond is horqued.

Another is when the backup link has CRC errors that only show up on the switch 
side, or when a configuration error causes packets sent over one of the links 
to be blackholed.
> 
> 
> Flapping/unstable links are the worst kind of situation.  Ideally you’d pick 
> that up quickly from monitoring/alerts and either fix immediately or take the 
> link down until you can fix it.

This.

Flakiness on a cluster/replication network is one reason to favor not having 
one: it removes certain flappy situations, and OSDs are more likely to be up for real, 
or down hard. 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-15 Thread Andrew Walker-Brown
With an unstable link/port you could see the issues you describe.  Ping doesn’t 
have the packet rate for you to necessarily have a packet in transit at exactly 
the same time as the port fails temporarily.  Iperf on the other hand could 
certainly show the issue, higher packet rate and more likely to have packets in 
flight at the time of a link fail...combined with packet loss/retries gives 
poor throughput.

Depending on what you want to happen, there are a number of tuning options both 
on the switches and in Linux.  If you want the LAG to go down if any link fails, 
then you should be able to configure this on the switches and/or in Linux (minimum 
number of links = 2 if you have 2 links in the LAG).

You can also tune the link monitoring, i.e. how frequently the links are checked 
(e.g. miimon).  Bringing this value down from the default of 100ms may 
allow you to detect a link failure more quickly.  But you then run the risk of 
detecting a transient failure that wouldn't have caused any 
issues...and the LAG becoming more unstable.
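
For example, on RHEL-style systems the bonding options might look roughly 
like this (a sketch only; translate to netplan/systemd-networkd as needed, 
and adjust the values):

BONDING_OPTS="mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4 min_links=2"

With min_links=2 the bond goes down entirely if either leg fails; leave it at 
the default of 0 if you would rather keep running on a single leg.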

Flapping/unstable links are the worst kind of situation.  Ideally you’d pick 
that up quickly from monitoring/alerts and either fix immediately or take the 
link down until you can fix it.

I run 2x10G from my hosts into separate switches (Dell S series – VLT between 
switches).  Pulling a single interface has no impact on Ceph, any packet loss 
is tiny and we’re not exceeding 10G bandwidth per host.

If you’re running 1G links and the LAG is already busy, a link failure could be 
causing slow writes to the host, just down to congestion...which then starts to 
impact the wider cluster based on how Ceph works.

Just caveating the above with - I’m relatively new to Ceph myself

Sent from Mail for Windows 10

From: huxia...@horebdata.cn
Sent: 15 June 2021 17:52
To: Serkan Çoban
Cc: ceph-users
Subject: [ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG

When i pull out the cable, then the bond is working properly.

Does it mean that the port is somehow flapping? Ping can still work, but the 
iperf test yields very low results.





huxia...@horebdata.cn

From: Serkan Çoban
Date: 2021-06-15 18:47
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] Issues with Ceph network redundancy using L2 MC-LAG
Do you observe the same behaviour when you pull a cable?
Maybe a flapping port might cause this kind of behaviour, other than
that you should't see any network disconnects.
Are you sure about LACP configuration, what is the output of 'cat
/proc/net/bonding/bond0'

On Tue, Jun 15, 2021 at 7:19 PM huxia...@horebdata.cn
 wrote:
>
> Dear Cephers,
>
> I encountered the following networking issue several times, and i wonder 
> whether there is a solution for networking HA solution.
>
> We build ceph using L2 multi chassis link aggregation group (MC-LAG ) to 
> provide switch redundancy. On each host, we use 802.3ad, LACP
> mode for NIC redundancy. However, we observe several times, when a single 
> network port, either the cable, or the SFP+ optical module fails, Ceph 
> cluster  is badly affected by networking, although in theory it should be 
> able to tolerate.
>
> Did i miss something important here? and how to really achieve networking HA 
> in Ceph cluster?
>
> best regards,
>
> Samuel
>
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-15 Thread huxia...@horebdata.cn
When I pull out the cable, the bond works properly.

Does this mean that the port is somehow flapping? Ping still works, but the 
iperf test yields very low results.





huxia...@horebdata.cn
 
From: Serkan Çoban
Date: 2021-06-15 18:47
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] Issues with Ceph network redundancy using L2 MC-LAG
Do you observe the same behaviour when you pull a cable?
Maybe a flapping port might cause this kind of behaviour, other than
that you should't see any network disconnects.
Are you sure about LACP configuration, what is the output of 'cat
/proc/net/bonding/bond0'
 
On Tue, Jun 15, 2021 at 7:19 PM huxia...@horebdata.cn
 wrote:
>
> Dear Cephers,
>
> I encountered the following networking issue several times, and i wonder 
> whether there is a solution for networking HA solution.
>
> We build ceph using L2 multi chassis link aggregation group (MC-LAG ) to 
> provide switch redundancy. On each host, we use 802.3ad, LACP
> mode for NIC redundancy. However, we observe several times, when a single 
> network port, either the cable, or the SFP+ optical module fails, Ceph 
> cluster  is badly affected by networking, although in theory it should be 
> able to tolerate.
>
> Did i miss something important here? and how to really achieve networking HA 
> in Ceph cluster?
>
> best regards,
>
> Samuel
>
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Failover with 2 nodes

2021-06-15 Thread Jamie Fargen
This also sounds like a possible GlusterFS use case.

Regards,
-Jamie

On Tue, Jun 15, 2021 at 12:30 PM Burkhard Linke <
burkhard.li...@computational.bio.uni-giessen.de> wrote:

> Hi,
>
> On 15.06.21 16:15, Christoph Brüning wrote:
> > Hi,
> >
> > That's right!
> >
> > We're currently evaluating a similar setup with two identical HW nodes
> > (on two different sites), with OSD, MON and MDS each, and both nodes
> > have CephFS mounted.
> >
> > The goal is to build a minimal self-contained shared filesystem that
> > remains online during planned updates and can somehow survive should
> > disaster strike at one of the two sites.
>
>
> This sounds like a use case for DRBD, maybe with OCFS2 on top as
> cluster(ed) filesystem. Ceph is overkill, and not really suited for two
> hosts setups.
>
>
> Regards,
>
> Burkhard
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Jamie Fargen
Senior Consultant
jfar...@redhat.com
813-817-4430
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-15 Thread Serkan Çoban
Do you observe the same behaviour when you pull a cable?
A flapping port might cause this kind of behaviour; other than
that you shouldn't see any network disconnects.
Are you sure about the LACP configuration? What is the output of 'cat
/proc/net/bonding/bond0'?

On Tue, Jun 15, 2021 at 7:19 PM huxia...@horebdata.cn
 wrote:
>
> Dear Cephers,
>
> I encountered the following networking issue several times, and i wonder 
> whether there is a solution for networking HA solution.
>
> We build ceph using L2 multi chassis link aggregation group (MC-LAG ) to 
> provide switch redundancy. On each host, we use 802.3ad, LACP
> mode for NIC redundancy. However, we observe several times, when a single 
> network port, either the cable, or the SFP+ optical module fails, Ceph 
> cluster  is badly affected by networking, although in theory it should be 
> able to tolerate.
>
> Did i miss something important here? and how to really achieve networking HA 
> in Ceph cluster?
>
> best regards,
>
> Samuel
>
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-15 Thread huxia...@horebdata.cn
My big worry is that when a single link in a bond breaks, it can break in such a way 
that the whole bond stops working.

How do I make it fail over properly in such cases?


best regards,

samuel



huxia...@horebdata.cn
 
From: Anthony D'Atri
Date: 2021-06-15 18:22
To: huxia...@horebdata.cn
Subject: Re: [ceph-users] Issues with Ceph network redundancy using L2 MC-LAG
Which hash mode are you using on the hosts?  layer 3+4 ?  Are they set up 
active/active, or active/passive?
 
I often see suboptimal bonding configurations that result in most or all 
traffic going over only one link.
 
 
 
> On Jun 15, 2021, at 9:19 AM, huxia...@horebdata.cn wrote:
> 
> Dear Cephers,
> 
> I encountered the following networking issue several times, and i wonder 
> whether there is a solution for networking HA solution.
> 
> We build ceph using L2 multi chassis link aggregation group (MC-LAG ) to 
> provide switch redundancy. On each host, we use 802.3ad, LACP  
> mode for NIC redundancy. However, we observe several times, when a single 
> network port, either the cable, or the SFP+ optical module fails, Ceph 
> cluster  is badly affected by networking, although in theory it should be 
> able to tolerate.
> 
> Did i miss something important here? and how to really achieve networking HA 
> in Ceph cluster?
> 
> best regards,
> 
> Samuel
> 
> 
> 
> 
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
 
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Failover with 2 nodes

2021-06-15 Thread Burkhard Linke

Hi,

On 15.06.21 16:15, Christoph Brüning wrote:

Hi,

That's right!

We're currently evaluating a similar setup with two identical HW nodes 
(on two different sites), with OSD, MON and MDS each, and both nodes 
have CephFS mounted.


The goal is to build a minimal self-contained shared filesystem that 
remains online during planned updates and can somehow survive should 
disaster strike at one of the two sites.



This sounds like a use case for DRBD, maybe with OCFS2 on top as a 
cluster(ed) filesystem. Ceph is overkill, and not really suited to two-host 
setups.



Regards,

Burkhard

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph PGs issues

2021-06-15 Thread Reed Dier
Note: I am not entirely sure here, and would love other input from the ML about 
this, so take this with a grain of salt.

You don't show any unfound objects, which I think is excellent news as far as 
data loss.
>>96   active+clean+scrubbing+deep+repair
The deep scrub + repair seems auspicious, and also seems like a really heavy 
operation on those PGs.

I can't tell fully, but it looks like your EC profile is K+M=12. Which could be 
10+2, 9+3, or hopefully not 11+1.
That said, being on Mimic, I am thinking that you are more than likely running 
into this: 
https://docs.ceph.com/en/latest/rados/operations/erasure-code/#erasure-coded-pool-recovery
 

> Prior to Octopus, erasure coded pools required at least min_size shards to be 
> available, even if min_size is greater than K. (We generally recommend 
> min_size be K+2 or more to prevent loss of writes and data.) This 
> conservative decision was made out of an abundance of caution when designing 
> the new pool mode but also meant pools with lost OSDs but no data loss were 
> unable to recover and go active without manual intervention to change the 
> min_size.

I can't definitively say whether reducing the min_size will unlock the offline 
data, but I think it could.
As for what that value should be, I'm guessing just drop it by one and see if 
the PGs come out of their incomplete state.
After (hopeful) recovery, I would revert min_size back to the original 
value for safety.
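
For example (a sketch only, assuming the profile really is k=10, m=2 so that 
min_size is currently k+1=11; double-check against your erasure-code profile first):

$ ceph osd pool set cephfs-data min_size 10
$ # ...and once recovery completes:
$ ceph osd pool set cephfs-data min_size 11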

Something odd I did notice from the pastebin of ceph health detail,
> pg 3.e5 is remapped+incomplete, acting 
> [2147483647,2147483647,2147483647,2147483647,2147483647,278,2147483647,2147483647,273,2147483647,2147483647,2147483647]
> pg 3.14e is remapped+incomplete, acting 
> [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,271,2147483647,222,416,2147483647]
> pg 3.45e is remapped+incomplete, acting 
> [2147483647,2147483647,2147483647,2147483647,2147483647,377,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647]
> pg 3.4bc is remapped+incomplete, acting 
> [2147483647,280,2147483647,2147483647,2147483647,407,445,268,2147483647,2147483647,418,273]
> pg 3.7c6 is remapped+incomplete, acting 
> [2147483647,338,2147483647,2147483647,261,2147483647,2147483647,2147483647,416,415,337,2147483647]
> pg 3.8e8 is remapped+incomplete, acting 
> [2147483647,2147483647,2147483647,2147483647,360,418,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647]
>  
> pg 3.b5e is remapped+incomplete, acting 
> [2147483647,242,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,399,2147483647,2147483647]
>  

These 7 PGs are reporting a really large percentage of shards with no OSD 
found (2147483647 is the placeholder value CRUSH reports when it could not map 
an OSD for that shard).
https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-pg/#erasure-coded-pgs-are-not-active-clean

I think this could possibly relate to the bit below about osd.73 throwing off 
the crush map.
I'm sure someone with more experience may have a better understanding of what 
this implies.

As for osd.73, I would remove it from the crush map.
It existing in the crush map, while not being a valid OSD may be throwing off 
the crush mappings.
I think the first step I would take would be to
$ ceph osd crush remove osd.73
$ ceph osd rm osd.73

This should reweight the ceph003 host, and cause some data movement.

So, in summation,
I would kill off osd.73 first.
Then, after the resulting rebalancing, I would reduce the min_size to try to 
bring the PGs out of their incomplete state.

As I said, I'm not entirely sure, and would love a second opinion from someone, 
but if it were me in a vacuum, I think these would be my steps.

Reed

> On Jun 15, 2021, at 10:14 AM, Aly, Adel  wrote:
> 
> Hi Reed,
> 
> Thank you for getting back to us.
> 
> We had indeed several disk failures at the same time.
> 
> Regarding the OSD map, we have an OSD that failed and we needed to remove but 
> we didn't update the crushmap.
> 
> The question here, is it safe to update the OSD crushmap without affecting 
> the data available?
> 
> We can free up more space on the monitors if that will help indeed.
> 
> More information which can be helpful:
> 
> # ceph -v
> ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)
> 
> # ceph health detail
> https://pastebin.pl/view/2b8b337d
> 
> # ceph osd pool ls detail
> pool 3 'cephfs-data' erasure size 12 min_size 11 crush_rule 1 object_hash 
> rjenkins pg_num 3072 pgp_num 3072 last_change 370219 lfor 0/367599 flags 
> hashpspool,ec_overwrites,selfmanaged_snaps stripe_width 40960 fast_read 1 
> compression_algorithm snappy compression_mode force application cephfs
>removed_snaps [2~7c]
> pool 4 'cephfs-meta' replicated size 3 min_size 2 crush_rule 0 object_hash 
> rjenkins 

[ceph-users] Issues with Ceph network redundancy using L2 MC-LAG

2021-06-15 Thread huxia...@horebdata.cn
Dear Cephers,

I have encountered the following networking issue several times, and I wonder 
whether there is a solution for networking HA.

We build Ceph using an L2 multi-chassis link aggregation group (MC-LAG) to 
provide switch redundancy. On each host, we use 802.3ad (LACP) bonding 
mode for NIC redundancy. However, we have observed several times that when a single 
network port fails, either the cable or the SFP+ optical module, the Ceph cluster 
is badly affected by the network, although in theory it should be able to 
tolerate such a failure.

Did I miss something important here? And how do I really achieve networking HA in 
a Ceph cluster?

best regards,

Samuel




huxia...@horebdata.cn
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS mount fails after Centos 8.4 Upgrade

2021-06-15 Thread Dan van der Ster
Looks like this: https://tracker.ceph.com/issues/51112


On Tue, Jun 15, 2021 at 5:48 PM Ackermann, Christoph
 wrote:
>
> Hello all,
>
> after upgrading Centos clients to version 8.4 CephFS  ( Kernel
> 4.18.0-305.3.1.el8 ) mount did fail.  Message: *mount error 110 =
> Connection timed out*
> ..unfortunately the kernel log was flooded with zeros... :-(
>
> The monitor connection seems to be ok, but libceph said:
> kernel: libceph: corrupt full osdmap (-2) epoch off  x  of
> yyy
>
> After client VM restore with Kernel 4.18.0-240.22.1.el8_3.x86_64 everything
> runs well.
>
> Did someone recently upgrade clients to Centos 8.4?
>
> Best regards,
>
> Christoph
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] CephFS mount fails after Centos 8.4 Upgrade

2021-06-15 Thread Ackermann, Christoph
Hello all,

after upgrading CentOS clients to version 8.4 (kernel
4.18.0-305.3.1.el8), the CephFS mount fails with: *mount error 110 =
Connection timed out*
..unfortunately the kernel log was flooded with zeros... :-(

The monitor connection seems to be ok, but libceph says:
kernel: libceph: corrupt full osdmap (-2) epoch off  x  of
yyy

After restoring the client VM with kernel 4.18.0-240.22.1.el8_3.x86_64 everything
runs well.

Did someone recently upgrade clients to CentOS 8.4?

Best regards,

Christoph
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Strategy for add new osds

2021-06-15 Thread DHilsbos
Personally, when adding drives like this, I set noin (ceph osd set noin), and 
norebalance (ceph osd set norebalance).  Like your situation, we run smaller 
clusters; our largest cluster only has 18 OSDs.

That keeps the cluster from starting data moves until all new drives are in 
place.  Don't forget to unset these values (ceph osd unset noin, ceph osd unset 
norebalance).

There are also values you can tune to control whether client traffic or recovery 
traffic gets precedence while data is moving.
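
For example (a sketch; option names and defaults vary a bit by release, so 
adjust to taste):

ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
ceph config set osd osd_recovery_sleep_hdd 0.1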

Thank you,

Dominic L. Hilsbos, MBA 
Vice President - Information Technology 
Perform Air International Inc.
dhils...@performair.com 
www.PerformAir.com

-Original Message-
From: Kai Börnert [mailto:kai.boern...@posteo.de] 
Sent: Tuesday, June 15, 2021 8:20 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Strategy for add new osds

Hi,

as far as I understand it,

you get no real benefit from doing them one by one, as each OSD you add can 
cause a lot of data to be moved to a different OSD, even though you just 
rebalanced.

The algorithm determining the placement of PGs does not take the 
current/historic placement into account, so changing anything here 
can cause any amount of data to migrate, with each change.

Greetings,

Kai

On 6/15/21 5:06 PM, Jorge JP wrote:
> Hello,
>
> I have a ceph cluster with 5 nodes (1 hdd each node). I want to add 5 more 
> drives (hdd) to expand my cluster. What is the best strategy for this?
>
> I will add each drive in each node but is a good strategy add one drive and 
> wait to rebalance the data to new osd for add new osd? or maybe.. I should be 
> add the 5 drives without wait rebalancing and ceph rebalancing the data to 
> all new osd?
>
> Thank you.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Failover with 2 nodes

2021-06-15 Thread nORKy
Hi,

Thank you guys. I deployed a third monitor and failover works. Thanks again.
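
(In case it helps anyone searching later: on a cephadm-managed cluster the 
equivalent is roughly "ceph orch host add <tiebreaker-host>" followed by 
"ceph orch apply mon 3", or a placement listing the three hosts; the host 
name is a placeholder, and the steps differ if you deploy mons manually.)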

Le mar. 15 juin 2021 à 16:15, Christoph Brüning <
christoph.bruen...@uni-wuerzburg.de> a écrit :

> Hi,
>
> That's right!
>
> We're currently evaluating a similar setup with two identical HW nodes
> (on two different sites), with OSD, MON and MDS each, and both nodes
> have CephFS mounted.
>
> The goal is to build a minimal self-contained shared filesystem that
> remains online during planned updates and can somehow survive should
> disaster strike at one of the two sites.
>
> We added a third node (a small VM) running only a monitor to avoid
> exactly the described problem.
>
> Best,
> Christoph
>
>
>
> On 15/06/2021 15.32, Robert Sander wrote:
> > On 15.06.21 15:16, nORKy wrote:
> >
> >> Why is there no failover ??
> >
> > Because only one MON out of two is not in the majority to build a quorum.
> >
> > Regards
> >
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
>
> --
> Dr. Christoph Brüning
> Universität Würzburg
> HPC & DataManagement @ ct.qmat & RZUW
> Am Hubland
> D-97074 Würzburg
> Tel.: +49 931 31-80499
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Strategy for add new osds

2021-06-15 Thread Jorge JP
Hello,

I have a ceph cluster with 5 nodes (1 hdd each node). I want to add 5 more 
drives (hdd) to expand my cluster. What is the best strategy for this?

I will add one drive to each node, but is it a good strategy to add one drive, 
wait for the data to rebalance onto the new osd, and then add the next one? Or 
should I add all 5 drives without waiting for rebalancing, and let ceph rebalance 
the data onto all the new osds at once?

Thank you.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph PGs issues

2021-06-15 Thread Reed Dier
You have incomplete PGs, which means you have inactive data, because the data 
isn't there.

This will typically only happen when you have multiple concurrent disk 
failures, or something like that, so I think there is some missing info.

>1 osds exist in the crush map but not in the osdmap

This seems like a red flag to have an OSD in the crush map but not the osdmap.

>mons xyz01,xyz02 are low on available space

Your mons are probably accumulating data while the cluster sits in the WARN state.
This can be problematic for recovery.
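
If the mon stores have grown large, it may help to check their size and compact 
them; a sketch (the mon name comes from your health output, the path is just the 
typical location):

du -sh /var/lib/ceph/mon/*/store.db   # on each mon host
ceph tell mon.xyz01 compact           # compact that mon's store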

I think you will be more likely to receive useful suggestions by providing 
things like which version of ceph you are using ($ ceph -v), major events that 
caused this, pool ($ ceph osd pool ls detail) and osd ($ ceph osd tree) 
topology, as well as detailed health output ($ ceph health detail).

Given how large some of this output may be, like the osd tree, you may want to 
paste it to a pastebin and link it here.
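
Something along these lines would collect most of it in one go (a sketch; 
replace the PG id with one of your incomplete PGs):

ceph -v                      > version.txt
ceph health detail           > health-detail.txt
ceph osd pool ls detail      > pools.txt
ceph osd tree                > osd-tree.txt
ceph pg dump_stuck inactive  > stuck-pgs.txt
ceph pg 1.2f query           > pg-query.txt   # peering details for one incomplete PG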

Reed

> On Jun 15, 2021, at 2:48 AM, Aly, Adel  wrote:
> 
> Dears,
> 
> We have a ceph cluster with 4096 PGs, out of which 100+ PGs are not 
> active+clean.
> 
> On top of the ceph cluster, we have a ceph FS, with 3 active MDS servers.
> 
> It seems that we can’t get all the files out of it because of the affected 
> PGs.
> 
> The object store has more than 400 million objects.
> 
> When we do “rados -p cephfs-data ls”, the listing stops (hangs) after listing 
> 11+ million objects.
> 
> When we try to access an object which we can’t copy, the rados command hangs 
> forever:
> 
> ls -I 
> 2199140525188
> 
> printf "%x\n" 2199140525188
> 20006fd6484
> 
> rados -p cephfs-data stat 20006fd6484.
> (hangs here)
> 
> This is the current status of the ceph cluster:
>health: HEALTH_WARN
>1 MDSs report slow metadata IOs
>1 MDSs report slow requests
>1 MDSs behind on trimming
>1 osds exist in the crush map but not in the osdmap
>*Reduced data availability: 22 pgs inactive, 22 pgs incomplete*
>240324 slow ops, oldest one blocked for 391503 sec, daemons 
> [osd.144,osd.159,osd.180,osd.184,osd.242,osd.271,osd.275,osd.278,osd.280,osd.332]... have slow ops.
>mons xyz01,xyz02 are low on available space
> 
>  services:
>mon: 4 daemons, quorum abc001,abc002,xyz02,xyz01
>mgr: abc002(active), standbys: xyz01, xyz02, abc001
>mds: cephfs-3/3/3 up  
> {0=xyz02=up:active,1=abc001=up:active,2=abc002=up:active}, 1 up:standby
>osd: 421 osds: 421 up, 421 in; 7 remapped pgs
> 
>  data:
>pools:   2 pools, 4096 pgs
>objects: 403.4 M objects, 846 TiB
>usage:   1.2 PiB used, 1.4 PiB / 2.6 PiB avail
>pgs: 0.537% pgs not active
> 3968 active+clean
> 96   active+clean+scrubbing+deep+repair
> 15   incomplete
> 10   active+clean+scrubbing
> 7remapped+incomplete
> 
>  io:
>client:   89 KiB/s rd, 13 KiB/s wr, 34 op/s rd, 1 op/s wr
> 
> The 100+ PGs have been in this state for a long time already.
> 
> Sometimes when we try to copy some files, the rsync process hangs and we can’t 
> kill it; from the process stack, it seems to be hanging on a ceph I/O 
> operation.
> 
> # cat /proc/51795/stack
> [] ceph_mdsc_do_request+0xfd/0x280 [ceph]
> [] __ceph_do_getattr+0x9e/0x200 [ceph]
> [] ceph_getattr+0x28/0x100 [ceph]
> [] vfs_getattr+0x49/0x80
> [] vfs_fstatat+0x75/0xc0
> [] SYSC_newlstat+0x31/0x60
> [] SyS_newlstat+0xe/0x10
> [] system_call_fastpath+0x25/0x2a
> [] 0x
> 
> # cat /proc/51795/mem
> cat: /proc/51795/mem: Input/output error
> 
> Any idea on how to move forward with debugging and fixing this issue so we 
> can get the data out of the ceph FS?
> 
> Thank you in advance.
> 
> Kind regards,
> adel
> 
> This e-mail and the documents attached are confidential and intended solely 
> for the addressee; it may also be privileged. If you receive this e-mail in 
> error, please notify the sender immediately and destroy it. As its integrity 
> cannot be secured on the Internet, Atos’ liability cannot be triggered for 
> the message content. Although the sender endeavours to maintain a computer 
> virus-free network, the sender does not warrant that this transmission is 
> virus-free and will not be liable for any damages resulting from any virus 
> transmitted. On all offers and agreements under which Atos Nederland B.V. 
> supplies goods and/or services of whatever nature, the Terms of Delivery from 
> Atos Nederland B.V. exclusively apply. The Terms of Delivery shall be 
> promptly submitted to you on your request.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Failover with 2 nodes

2021-06-15 Thread Christoph Brüning

Hi,

That's right!

We're currently evaluating a similar setup with two identical HW nodes 
(on two different sites), with OSD, MON and MDS each, and both nodes 
have CephFS mounted.


The goal is to build a minimal self-contained shared filesystem that 
remains online during planned updates and can somehow survive should 
disaster strike at one of the two sites.


We added a third node (a small VM) running only a monitor to avoid 
exactly the described problem.


Best,
Christoph



On 15/06/2021 15.32, Robert Sander wrote:

On 15.06.21 15:16, nORKy wrote:


Why is there no failover ??


Because with only one MON out of two left, there is no majority to form a quorum.

Regards


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



--
Dr. Christoph Brüning
Universität Würzburg
HPC & DataManagement @ ct.qmat & RZUW
Am Hubland
D-97074 Würzburg
Tel.: +49 931 31-80499
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] NFS Ganesha ingress parameter not valid?

2021-06-15 Thread Oliver Weinmann

Dear All,



I have deployed the latest Ceph Pacific release in my lab and started to check 
out the new "stable" NFS Ganesha features. First of all, I'm a bit confused 
about which method to actually use to deploy the NFS cluster:



cephadm or ceph nfs cluster create?



I used "nfs cluster create" for now and noticed a minor problem in the docs.


https://docs.ceph.com/en/latest/cephfs/fs-nfs-exports/#cephfs-nfs


The command is stated as:

$ ceph nfs cluster create <clusterid> [<placement>] [--ingress --virtual-ip 
<ip>]

whereas it actually needs a type (cephfs) to be specified:

nfs cluster create <type> <clusterid> [<placement>] :  Create an NFS Cluster

Also, I can't manage to use the --ingress --virtual-ip parameters. Every time I 
try to use them I get this:

[root@cephboot ~]# ceph nfs cluster create cephfs 
ec9e031a-cd10-11eb-a3c3-005056b7db1f --ingress --virtual-ip 192.168.9.199
Invalid command: Unexpected argument '--ingress'
nfs cluster create <type> <clusterid> [<placement>] :  Create an NFS Cluster
Error EINVAL: invalid command

So I just deployed an NFS cluster without a VIP. Maybe I'm missing something?
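
In case it is useful: if the --ingress flag is simply not in your build yet, an 
alternative I would try is applying an ingress service spec directly with the 
orchestrator. This is only a sketch, under the assumption that your cephadm 
version already ships the ingress service type; the NFS service name, ports and 
VIP below are placeholders:

cat > nfs-ingress.yaml <<'EOF'
service_type: ingress
service_id: nfs.mynfs            # placeholder, use your NFS service name
placement:
  count: 1
spec:
  backend_service: nfs.mynfs     # the nfs service the ingress sits in front of
  frontend_port: 2049
  monitor_port: 9000
  virtual_ip: 192.168.9.199/24
EOF
ceph orch apply -i nfs-ingress.yaml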

What about this note in the docs:


"From Pacific, the nfs mgr module must be enabled prior to use."


I can't find any info on how to enable it. Maybe this is already the case?
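
My assumption is that this refers to the normal mgr module mechanism, in which 
case something like the following should show and, if necessary, enable it:

ceph mgr module ls | grep -i nfs   # check whether the nfs module is listed as enabled
ceph mgr module enable nfs         # enable it if it is not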

ceph nfs cluster create cephfs ec9e031a-cd10-11eb-a3c3-005056b7db1f "cephnode01"
This seems to be working fine. I managed to connect a CentOS 7 VM and I can 
access the NFS export just fine. Great stuff.



For testing, I tried to attach the same NFS export to a standalone ESXi 6.5 
server. This also works, but its disk space is shown as 0 bytes:





I'm not sure if this is supported or if I'm missing something. I could not find 
any clear info in the docs, only some Reddit posts where users mentioned that 
they were able to use it with VMware.



Thanks and Best Regards,

Oliver
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Failover with 2 nodes

2021-06-15 Thread mhnx
It's easy. The problem is that the OSDs are still marked up because there are
not enough down reporters (mon_osd_min_down_reporters), and because of that the
MDS gets stuck.

The solution is "mon_osd_min_down_reporters = 1".
With a "two node" cluster and "replicated 2" with "chooseleaf host",
the reporter count should be set to 1, but during a real malfunction this could
be a serious problem. In a lab environment you're fine.
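
For example, via the config database this could be set and verified roughly like 
so (that the monitors are the right target section is my assumption):

ceph config set mon mon_osd_min_down_reporters 1
ceph config get mon mon_osd_min_down_reporters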



On Tue, 15 Jun 2021 at 16:18, nORKy  wrote:

> Hi,
>
> I'm building a lab with virtual machines.
>
> I built a setup with only 2 nodes, 2 OSDs per node, and a third host that
> mounts CephFS via mount.cephfs.
> Each of the 2 ceph nodes runs mon + mgr + mds services and has the cephadm command.
>
> If I stop a node, all commands hang.
> I can't use the dashboard, can't use ceph -s or any other ceph command, and the
> cephfs mount on the third host stops responding too (e.g. to ls).
> Everything comes back when I power the stopped node back on.
>
> Why is there no failover ??
>
> Thank you
>
> 'Jof
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Failover with 2 nodes

2021-06-15 Thread Robert Sander
On 15.06.21 15:16, nORKy wrote:

> Why is there no failover ??

Because with only one MON out of two left, there is no majority to form a quorum.

Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 93818 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Failover with 2 nodes

2021-06-15 Thread nORKy
Hi,

I'm building a lab with virtual machines.

I built a setup with only 2 nodes, 2 OSDs per node, and a third host that
mounts CephFS via mount.cephfs.
Each of the 2 ceph nodes runs mon + mgr + mds services and has the cephadm command.

If I stop a node, all commands hang.
I can't use the dashboard, can't use ceph -s or any other ceph command, and the
cephfs mount on the third host stops responding too (e.g. to ls).
Everything comes back when I power the stopped node back on.

Why is there no failover ??

Thank you

'Jof
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Month June Schedule Now Available

2021-06-15 Thread Mike Perez
Hi everyone,

Here's today's schedule for Ceph Month:

9:00ET / 15:00 CEST Dashboard Update [Ernesto]
9:30 ET / 15:30 CEST [lightning] RBD latency with QD=1 bs=4k [Wido,
den Hollander]
9:40 ET / 15:40 CEST [lightning] From Open Source  to Open Ended in
Ceph with Lua [Yuval Lifshitz]

Full schedule: https://pad.ceph.com/p/ceph-month-june-2021
Meeting link: https://bluejeans.com/908675367

On Mon, Jun 14, 2021 at 6:50 AM Mike Perez  wrote:
>
> Hi everyone,
>
> In ten minutes, Ceph Month continues with the following schedule today:
>
> 10:00 ET / 16:00 CEST RBD update [Ilya Dryomov]
> 10:30 ET / 16:30 CEST 5 more ways to break your ceph cluster [Wout van 
> Heeswijk]
>
> Full schedule: https://pad.ceph.com/p/ceph-month-june-2021
> Meeting link: https://bluejeans.com/908675367
>
>
> On Fri, Jun 11, 2021 at 6:50 AM Mike Perez  wrote:
> >
> > Hi everyone,
> >
> > In ten minutes, join us for the next Ceph Month presentation on Intel
> > QLC SSD: Cost-Effective Ceph Deployments by Anthony D'Atri
> >
> > https://bluejeans.com/908675367
> > https://pad.ceph.com/p/ceph-month-june-2021
> >
> > On Fri, Jun 11, 2021 at 5:50 AM Mike Perez  wrote:
> > >
> > > Hi everyone,
> > >
> > > In ten minutes, join us for the next Ceph Month presentation on
> > > Performance Optimization for All Flash-based on aarch64 by Chunsong
> > > Feng
> > >
> > > https://pad.ceph.com/p/ceph-month-june-2021
> > > https://bluejeans.com/908675367
> > >
> > > On Thu, Jun 10, 2021 at 6:00 AM Mike Perez  wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > We're about to start Ceph Month 2021 with Casey Bodley giving a RGW 
> > > > update!
> > > >
> > > > Afterward we'll have two BoF discussions on:
> > > >
> > > > 9:30 ET / 15:30 CEST [BoF] Ceph in Research & Scientific Computing
> > > > [Kevin Hrpcek]
> > > >
> > > > 10:10 ET / 16:10 CEST [BoF] The go-ceph get together [John Mulligan]
> > > >
> > > > Join us now on the stream:
> > > >
> > > > https://bluejeans.com/908675367
> > > >
> > > > On Tue, Jun 1, 2021 at 6:50 AM Mike Perez  wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > In ten minutes, join us for the start of the Ceph Month June event!
> > > > > The schedule and meeting link can be found on this etherpad:
> > > > >
> > > > > https://pad.ceph.com/p/ceph-month-june-2021
> > > > >
> > > > > On Tue, May 25, 2021 at 11:56 AM Mike Perez  
> > > > > wrote:
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > The Ceph Month June schedule is now available:
> > > > > >
> > > > > > https://pad.ceph.com/p/ceph-month-june-2021
> > > > > >
> > > > > > We have great sessions from component updates, performance best
> > > > > > practices, Ceph on different architectures, BoF sessions to get more
> > > > > > involved with working groups in the community, and more! You may 
> > > > > > also
> > > > > > leave open discussion topics for the listed talks that we'll get to
> > > > > > each Q/A portion.
> > > > > >
> > > > > > I will provide the video stream link on this thread and etherpad 
> > > > > > once
> > > > > > it's available. You can also add the Ceph community calendar, which
> > > > > > will have the Ceph Month sessions prefixed with "Ceph Month" to get
> > > > > > local timezone conversions.
> > > > > >
> > > > > > https://calendar.google.com/calendar/embed?src=9ts9c7lt7u1vic2ijvvqqlfpo0%40group.calendar.google.com
> > > > > >
> > > > > > Thank you to our speakers for taking the time to share with us all 
> > > > > > the
> > > > > > latest best practices and usage with Ceph!
> > > > > >
> > > > > > --
> > > > > > Mike Perez
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph df: pool stored vs bytes_used -- raw or not?

2021-06-15 Thread Konstantin Shalygin
Filed https://tracker.ceph.com/issues/51223


k
> On 9 Jun 2021, at 13:20, Igor Fedotov  wrote:
> 
> Should we file another ticket for that?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph PGs issues

2021-06-15 Thread Aly, Adel
Dears,

We have a ceph cluster with 4096 PGs, out of which 100+ PGs are not active+clean.

On top of the ceph cluster, we have a ceph FS, with 3 active MDS servers.

It seems that we can’t get all the files out of it because of the affected PGs.

The object store has more than 400 million objects.

When we do “rados -p cephfs-data ls”, the listing stops (hangs) after listing 
11+ million objects.

When we try to access an object which we can’t copy, the rados command hangs 
forever:

ls -I 
2199140525188

printf "%x\n" 2199140525188
20006fd6484

rados -p cephfs-data stat 20006fd6484.
(hangs here)

This is the current status of the ceph cluster:
health: HEALTH_WARN
1 MDSs report slow metadata IOs
1 MDSs report slow requests
1 MDSs behind on trimming
1 osds exist in the crush map but not in the osdmap
*Reduced data availability: 22 pgs inactive, 22 pgs incomplete*
240324 slow ops, oldest one blocked for 391503 sec, daemons 
[osd.144,osd.159,osd.180,osd.184,osd.242,osd.271,osd.275,osd.278,osd.280,osd.332]... have slow ops.
mons xyz01,xyz02 are low on available space

  services:
mon: 4 daemons, quorum abc001,abc002,xyz02,xyz01
mgr: abc002(active), standbys: xyz01, xyz02, abc001
mds: cephfs-3/3/3 up  
{0=xyz02=up:active,1=abc001=up:active,2=abc002=up:active}, 1 up:standby
osd: 421 osds: 421 up, 421 in; 7 remapped pgs

  data:
pools:   2 pools, 4096 pgs
objects: 403.4 M objects, 846 TiB
usage:   1.2 PiB used, 1.4 PiB / 2.6 PiB avail
pgs: 0.537% pgs not active
 3968 active+clean
 96   active+clean+scrubbing+deep+repair
 15   incomplete
 10   active+clean+scrubbing
 7remapped+incomplete

  io:
client:   89 KiB/s rd, 13 KiB/s wr, 34 op/s rd, 1 op/s wr

The 100+ PGs have been in this state for a long time already.

Sometimes when we try to copy some files, the rsync process hangs and we can’t 
kill it; from the process stack, it seems to be hanging on a ceph I/O operation.

# cat /proc/51795/stack
[] ceph_mdsc_do_request+0xfd/0x280 [ceph]
[] __ceph_do_getattr+0x9e/0x200 [ceph]
[] ceph_getattr+0x28/0x100 [ceph]
[] vfs_getattr+0x49/0x80
[] vfs_fstatat+0x75/0xc0
[] SYSC_newlstat+0x31/0x60
[] SyS_newlstat+0xe/0x10
[] system_call_fastpath+0x25/0x2a
[] 0x

# cat /proc/51795/mem
cat: /proc/51795/mem: Input/output error

Any idea on how to move forward with debugging and fixing this issue so we can 
get the data out of the ceph FS?

Thank you in advance.

Kind regards,
adel

This e-mail and the documents attached are confidential and intended solely for 
the addressee; it may also be privileged. If you receive this e-mail in error, 
please notify the sender immediately and destroy it. As its integrity cannot be 
secured on the Internet, Atos’ liability cannot be triggered for the message 
content. Although the sender endeavours to maintain a computer virus-free 
network, the sender does not warrant that this transmission is virus-free and 
will not be liable for any damages resulting from any virus transmitted. On all 
offers and agreements under which Atos Nederland B.V. supplies goods and/or 
services of whatever nature, the Terms of Delivery from Atos Nederland B.V. 
exclusively apply. The Terms of Delivery shall be promptly submitted to you on 
your request.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Module 'devicehealth' has failed:

2021-06-15 Thread Torkil Svensgaard

Hi

Thanks, I guess this might have something to do with it:

"
Jun 15 09:44:22 dcn-ceph-01 bash[3278]: debug 
2021-06-15T09:44:22.507+ 7f704e4b3700 -1 mgr notify devicehealth.notify:
Jun 15 09:44:22 dcn-ceph-01 bash[3278]: debug 
2021-06-15T09:44:22.507+ 7f704e4b3700 -1 mgr notify Traceback (most 
recent call last):
Jun 15 09:44:22 dcn-ceph-01 bash[3278]:   File 
"/usr/share/ceph/mgr/devicehealth/module.py", line 229, in notify

Jun 15 09:44:22 dcn-ceph-01 bash[3278]: self.create_device_pool()
Jun 15 09:44:22 dcn-ceph-01 bash[3278]:   File 
"/usr/share/ceph/mgr/devicehealth/module.py", line 254, in 
create_device_pool

Jun 15 09:44:22 dcn-ceph-01 bash[3278]: assert r == 0
Jun 15 09:44:22 dcn-ceph-01 bash[3278]: AssertionError
"

Not sure why it would be creating a pool? I believe it used to work, and 
I have this pool:


"
# ceph osd dump | grep pool
pool 9 'device_health_metrics' replicated size 2 min_size 1 crush_rule 1 
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 
2630 flags hashpspool stripe_width 0 compression_algorithm snappy 
compression_mode aggressive application health_metrics

"

Mvh.

Torkil

On 15/06/2021 11.38, Sebastian Wagner wrote:

Hi Torkil,

you should see more information in the MGR log file.

Might be an idea to restart the MGR to get some recent logs.

On 15.06.21 at 09:41, Torkil Svensgaard wrote:

Hi

Looking at this error in v15.2.13:

"
[ERR] MGR_MODULE_ERROR: Module 'devicehealth' has failed:
    Module 'devicehealth' has failed:
"

It used to work. Since the module is always on I can't seem to restart 
it and I've found no clue as to why it failed. I've tried rebooting 
all hosts to no avail.


Suggestions?

Thanks,

Torkil


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Module 'devicehealth' has failed:

2021-06-15 Thread Sebastian Wagner

Hi Torkil,

you should see more information in the MGR log file.

Might be an idea to restart the MGR to get some recent logs.
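
With a cephadm deployment that could look roughly like this (the daemon name 
below is a placeholder, take the real one from the first command):

ceph orch ps --daemon-type mgr               # list the mgr daemons and their names
ceph orch daemon restart mgr.ceph01.abcdef   # restart the failed mgr daemon
cephadm logs --name mgr.ceph01.abcdef        # on that host: inspect its recent logs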

On 15.06.21 at 09:41, Torkil Svensgaard wrote:

Hi

Looking at this error in v15.2.13:

"
[ERR] MGR_MODULE_ERROR: Module 'devicehealth' has failed:
    Module 'devicehealth' has failed:
"

It used to work. Since the module is always on I can't seem to restart 
it and I've found no clue as to why it failed. I've tried rebooting 
all hosts to no avail.


Suggestions?

Thanks,

Torkil


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading ceph to latest version, skipping minor versions?

2021-06-15 Thread Janne Johansson
On Mon, 14 Jun 2021 at 22:48, Matt Larson  wrote:
>
> Looking at the documentation for (
> https://docs.ceph.com/en/latest/cephadm/upgrade/) - I have a question on
> whether you need to sequentially upgrade through each minor version, 15.2.1 ->
> 15.2.3 -> ... -> 15.2.XX?
>
> Can you safely upgrade by directly specifying the latest version from
> several minor versions behind?
>

Yes, you should always be able to jump to the latest minor release directly.
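
With cephadm the jump itself is a single command; a sketch (the target version 
here is only an example):

ceph orch upgrade start --ceph-version 15.2.13
ceph orch upgrade status   # follow the upgrade progress
ceph -s                    # overall cluster state while it runs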



-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Module 'devicehealth' has failed:

2021-06-15 Thread Torkil Svensgaard

Hi

Looking at this error in v15.2.13:

"
[ERR] MGR_MODULE_ERROR: Module 'devicehealth' has failed:
Module 'devicehealth' has failed:
"

It used to work. Since the module is always on I can't seem to restart 
it and I've found no clue as to why it failed. I've tried rebooting all 
hosts to no avail.


Suggestions?

Thanks,

Torkil


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io