[ceph-users] MON: global_init: error reading config file.

2020-12-11 Thread Oscar Segarra
Hi, In my environment I have a single node and I'm trying to run a ceph monitor as a docker container using a kv store. Version: Octopus (stable-5.0) 2020-12-12 00:24:28 /opt/ceph-container/bin/entrypoint.sh: STAYALIVE: container will not die if a command fails., 2020-12-12 00:24:28
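A minimal sketch of how the ceph/daemon image is usually pointed at a KV store, assuming etcd and the ceph-container KV_* environment variables; the IPs, network, and image tag are placeholders, not taken from the original report:

  # seed the KV store with default config (one-off)
  docker run --rm --net=host -e KV_TYPE=etcd -e KV_IP=192.168.0.20 -e KV_PORT=2379 \
    ceph/daemon:latest-octopus populate_kvstore
  # then start the monitor against the same KV store
  docker run -d --net=host \
    -e KV_TYPE=etcd -e KV_IP=192.168.0.20 -e KV_PORT=2379 \
    -e MON_IP=192.168.0.20 -e CEPH_PUBLIC_NETWORK=192.168.0.0/24 \
    ceph/daemon:latest-octopus mon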

[ceph-users] Re: Debian repo for ceph-iscsi

2020-12-11 Thread Reed Dier
I know this isn't what you asked for, but I do know that Canonical is building this package for focal and up. While not Buster, it could be a compromise to move things forward without huge plumbing changes between Debian and Ubuntu. You may also be able to hack and slash your way through

[ceph-users] diskprediction_local to be retired or fixed or??

2020-12-11 Thread Harry G. Coin
Any idea whether 'diskprediction_local' will ever work in containers? I'm running 15.2.7, which depends on scikit-learn 0.19.2, and that isn't in the container. It's been throwing that error for a year now on all the octopus container versions I tried. It used to be on the baremetal
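A quick way to confirm the missing dependency is to check the module state and the library version inside the mgr container; this is only a diagnostic sketch and the container name is a placeholder:

  # is the module enabled, and what does health say about it?
  ceph mgr module ls | grep -A2 diskprediction
  ceph health detail
  # check which scikit-learn (if any) ships in the mgr container
  docker exec <ceph-mgr-container> python3 -c 'import sklearn; print(sklearn.__version__)'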

[ceph-users] Re: CephFS max_file_size

2020-12-11 Thread Paul Mezzanini
From how I understand it, that setting is a rev-limiter to prevent users from creating HUGE sparse files and then wasting cluster resources firing off deletes. We have ours set to 32T and haven't seen any issues with large files. -- Paul Mezzanini Sr Systems Administrator / Engineer,

[ceph-users] Re: CephFS max_file_size

2020-12-11 Thread Adam Tygart
I've had this set to 16TiB for several years now. I've not seen any ill effects. -- Adam On Fri, Dec 11, 2020 at 12:56 PM Patrick Donnelly wrote: > > Hi Mark, > > On Fri, Dec 11, 2020 at 4:21 AM Mark Schouten wrote: > > There is a default limit of 1TiB for the max_file_size in CephFS. I

[ceph-users] Re: CephFS max_file_size

2020-12-11 Thread Patrick Donnelly
Hi Mark, On Fri, Dec 11, 2020 at 4:21 AM Mark Schouten wrote: > There is a default limit of 1TiB for the max_file_size in CephFS. I altered > that to 2TiB, but I now got a request for storing a file up to 7TiB. > > I'd expect the limit to be there for a reason, but what is the risk of >

[ceph-users] Debian repo for ceph-iscsi

2020-12-11 Thread Chris Palmer
I just went to setup an iscsi gateway on a Debian Buster / Octopus cluster and hit a brick wall with packages. I had perhaps naively assumed they were in with the rest. Now I understand that it can exist separately, but then so can RGW. I found some ceph-iscsi rpm builds for Centos, but

[ceph-users] Re: Incomplete PG due to primary OSD crashing during EC backfill - get_hash_info: Mismatch of total_chunk_size 0

2020-12-11 Thread Byrne, Thomas (STFC,RAL,SC)
After confirming that the corruption was limited to a single object, we deleted the object (first via radosgw-admin, and then via a rados rm), and restarted the new OSD in the set. The backfill has continued past the point of the original crash, so things are looking promising. I'm still
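For reference, the two-step removal described above generally looks like the following; the bucket, object, and pool names are placeholders, not the ones from this incident:

  # remove the object through RGW first
  radosgw-admin object rm --bucket=mybucket --object=path/to/object
  # then remove the leftover RADOS object directly
  rados -p default.rgw.buckets.data rm <rados-object-name>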

[ceph-users] Slow Replication on Campus

2020-12-11 Thread Vikas Rana
Hi Friends, We have 2 Ceph clusters on campus and we set up the second cluster as the DR solution. The images on the DR side are always behind the master. Ceph version: 12.2.11
VMWARE_LUN0:
  global_id: 23460954-6986-4961-9579-0f2a1e58e2b2
  state: up+replaying
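To quantify how far behind the DR images are, the usual starting point is the rbd-mirror status commands; a sketch, with the pool name 'rbd' assumed:

  # per-image replay state and description of the lag
  rbd mirror image status rbd/VMWARE_LUN0
  # overall health of the mirrored pool, including every image
  rbd mirror pool status rbd --verbose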

[ceph-users] Re: mgr's stop responding, dropping out of cluster with _check_auth_rotating

2020-12-11 Thread David Orman
No; as the responses we've seen in the mailing lists and on the bug report(s) indicated it fixed the situation, we didn't proceed down that path (it seemed highly probable it would resolve things). If it's of additional value, we can disable the module temporarily to see if the
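Temporarily disabling the module to test that theory would look roughly like this; 'prometheus' is only an assumption about which mgr module is involved:

  # disable the suspected mgr module
  ceph mgr module disable prometheus
  # watch whether the mgr daemons stay responsive, then re-enable it
  ceph mgr module enable prometheus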

[ceph-users] Re: mgr's stop responding, dropping out of cluster with _check_auth_rotating

2020-12-11 Thread Wido den Hollander
On 11/12/2020 00:12, David Orman wrote: Hi Janek, We realize this, we referenced that issue in our initial email. We do want the metrics exposed by Ceph internally, and would prefer to work towards a fix upstream. We appreciate the suggestion for a workaround, however! Again, we're happy to

[ceph-users] Re: Scrubbing - osd down

2020-12-11 Thread Igor Fedotov
Miroslav, On 12/11/2020 4:57 PM, Miroslav Boháč wrote: > Hi Igor, thank you. Yes you are right. It seems that the background removal is completed. You can inspect the "numpg_removing" performance counter to make sure it's been completed. > The correct way to fix it is "ceph-kvstore-tool
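A sketch of the checks being discussed: verify the perf counter, then run the offline compaction one OSD at a time (OSD id 12 and the data path are placeholders; a non-containerized deployment is assumed):

  # confirm background PG removal has finished on this OSD
  ceph daemon osd.12 perf dump | grep numpg_removing
  # offline compaction, one OSD at a time
  systemctl stop ceph-osd@12
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 compact
  systemctl start ceph-osd@12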

[ceph-users] Re: Scrubbing - osd down

2020-12-11 Thread Miroslav Boháč
Hi Igor, thank you. Yes, you are right. It seems that the background removal is completed. Is the correct way to fix it to run "ceph-kvstore-tool bluestore-kv compact" on all OSDs (one by one)? Regards, Miroslav On Fri, Dec 11, 2020 at 14:19, Igor Fedotov wrote: > Hi Miroslav, > > haven't you

[ceph-users] Re: Ceph benchmark tool (cbt)

2020-12-11 Thread Mark Nelson
This is what the "use_existing" flag is for (on by default). It short-circuits initialize(), which is what actually does the whole shutdown/creation/startup procedure. https://github.com/ceph/cbt/blob/master/cluster/ceph.py#L149-L151 That is invoked before shutdown() and make_osds():

[ceph-users] Re: Scrubbing - osd down

2020-12-11 Thread Igor Fedotov
Hi Miroslav, have you performed massive data removal (PG migration) recently? If so, you might want to apply manual DB compaction to your OSDs. The positive effect might be only temporary if background removals are still in progress, though. See https://tracker.ceph.com/issues/47044 and
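For the manual DB compaction mentioned here, the online variant via the admin socket is roughly the following (the OSD id is a placeholder):

  # trigger an online RocksDB compaction on a running OSD
  ceph daemon osd.12 compact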

[ceph-users] Re: Ceph benchmark tool (cbt)

2020-12-11 Thread Marc Roos
Just run the tool from a client that is not part of the Ceph nodes. Then it cannot do anything that you did not configure Ceph to allow it to do ;) Besides, you should never run software from 'unknown' sources in an environment where it can use 'admin' rights.
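The point about only granting the benchmark client what it needs can be enforced with a scoped keyring, for example (a sketch; the client name and pool are placeholders):

  # create a client that can only touch the benchmark pool
  ceph auth get-or-create client.bench \
    mon 'allow r' \
    osd 'allow rwx pool=cbt-bench' \
    -o /etc/ceph/ceph.client.bench.keyring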

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-11 Thread David Orman
Hi Ken, This seems to have fixed that issue. It exposed another: https://tracker.ceph.com/issues/39264 which is causing ceph-mgr to become entirely unresponsive across the cluster, but cheroot seems to be ok. David On Wed, Dec 9, 2020 at 12:25 PM David Orman wrote: > Ken, > > We have rebuilt

[ceph-users] CephFS max_file_size

2020-12-11 Thread Mark Schouten
Hi, There is a default limit of 1TiB for the max_file_size in CephFS. I altered that to 2TiB, but I now got a request for storing a file up to 7TiB. I'd expect the limit to be there for a reason, but what is the risk of setting that value to say 10TiB? -- Mark Schouten Tuxis, Ede,
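For reference, max_file_size is changed per filesystem and is given in bytes; a 10 TiB limit would be set roughly like this (the filesystem name 'cephfs' is a placeholder):

  # 10 TiB = 10 * 2^40 bytes
  ceph fs set cephfs max_file_size 10995116277760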

[ceph-users] Scrubbing - osd down

2020-12-11 Thread Miroslav Boháč
Hi, I have a problem with crashing OSD daemons in our Ceph 15.2.6 cluster. The problem was temporarily resolved by disabling scrub and deep-scrub. All PGs are active+clean. After a few days I tried to enable scrubbing again, but the problem persists. OSDs with high latencies, PGs laggy, osd not
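The temporary workaround described (stopping all scrubbing cluster-wide) is typically done with the noscrub flags; a sketch:

  # pause all scrubbing while debugging
  ceph osd set noscrub
  ceph osd set nodeep-scrub
  # re-enable once the OSDs are stable again
  ceph osd unset noscrub
  ceph osd unset nodeep-scrub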

[ceph-users] All ceph commands hangs - bad magic number in monitor log

2020-12-11 Thread Evrard Van Espen - Weather-Measures
Hello all, We have a production cluster running since weeks deployed on centos8 with |cephadm| (3 nodes with 6x12TB HDD each, one |mon| and one |mgr| on each node too). Today, the "master" node (the from on which I run all the setup commands) crashed so I rebooted the server. Now, all ceph