[ceph-users] ceph's UID/GID 65045 in conflict with a user's UID/GID in LDAP

2018-05-14 Thread Yoann Moulin
Hello, I'm facing an issue with ceph's UID/GID 65045 on an LDAP-enabled server. I have to install ceph-common to mount a cephfs filesystem, but ceph-common fails because a user with UID 65045 already exists, with a group also set to 65045. The server runs Ubuntu 16.04.4 LTS. > Setting up ceph-common (12.
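
A quick way to see which account already owns the conflicting ID before installing the package (a sketch; the UID/GID value is taken from the report above, everything else is generic):

    # Check whether UID/GID 65045 is already taken via NSS (local files or LDAP)
    getent passwd 65045
    getent group 65045
    # Check whether the entry is local or comes from LDAP
    grep ':65045:' /etc/passwd /etc/group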

Re: [ceph-users] Cephfs write fail when node goes down

2018-05-14 Thread Yan, Zheng
On Mon, May 14, 2018 at 5:37 PM, Josef Zelenka wrote: > Hi everyone, we've encountered an unusual thing in our setup (4 nodes, 48 OSDs, 3 monitors - ceph Jewel, Ubuntu 16.04 with kernel 4.4.0). Yesterday, we were doing a HW upgrade of the nodes, so they went down one by one - the cluster was

Re: [ceph-users] Cephfs write fail when node goes down

2018-05-14 Thread Paul Emmerich
Which kernel version are you using? If it's an older kernel, consider using the cephfs-fuse client instead. Paul 2018-05-14 11:37 GMT+02:00 Josef Zelenka: > Hi everyone, we've encountered an unusual thing in our setup (4 nodes, 48 OSDs, 3 monitors - ceph Jewel, Ubuntu 16.04 with kernel 4.4.0).
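
For anyone following along, a minimal sketch of mounting CephFS with the FUSE client instead of the kernel client (monitor address, client id and mount point are placeholders):

    # Userspace client from the ceph-fuse package
    sudo ceph-fuse --id admin -m mon1.example.com:6789 /mnt/cephfs
    # Unmount when done
    sudo fusermount -u /mnt/cephfs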

[ceph-users] nfs-ganesha 2.6 deb packages

2018-05-14 Thread Benjeman Meekhof
I see that luminous RPM packages are up at download.ceph.com for ganesha-ceph 2.6, but there is nothing in the Deb area. Any estimates on when we might see those packages? http://download.ceph.com/nfs-ganesha/deb-V2.6-stable/luminous/ thanks, Ben

Re: [ceph-users] a big cluster or several small

2018-05-14 Thread Paul Emmerich
Hi, don't do multiple clusters on the same server without containers; support for the cluster name stuff is deprecated and will probably be removed: https://github.com/ceph/ceph-deploy/pull/441 Also, I wouldn't split your cluster (yet?), ~300 OSDs is still quite small. But it depends on the exact

Re: [ceph-users] a big cluster or several small

2018-05-14 Thread João Paulo Sacchetto Ribeiro Bastos
Hello Marc, In my view that's exactly the main reason why people use Ceph: it gets more reliable the more nodes we put in the cluster. You should take a look at the documentation and try to make use of placement rules, erasure codes or whatever fits your needs. I'm still new to Ceph (been using for
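
As a rough illustration of the erasure-code suggestion (profile and pool names are made up, and the k/m values are only an example):

    # Define an erasure-code profile and create a pool that uses it
    ceph osd erasure-code-profile set example_profile k=4 m=2 crush-failure-domain=host
    ceph osd pool create ecpool 128 128 erasure example_profile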

Re: [ceph-users] a big cluster or several small

2018-05-14 Thread Michael Kuriger
The more servers you have in your cluster, the less impact a failure causes. Monitor your systems and keep them up to date. You can also isolate data with clever CRUSH rules and by creating multiple zones. Mike Kuriger From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
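
A minimal sketch of the CRUSH-rule idea, assuming a dedicated CRUSH root per zone already exists (all names here are placeholders):

    # Replicated rule restricted to one root, with host as the failure domain
    ceph osd crush rule create-replicated zone_a_rule zone_a host
    # Point a pool at that rule
    ceph osd pool set mypool crush_rule zone_a_rule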

Re: [ceph-users] a big cluster or several small

2018-05-14 Thread Jack
Well, I currently manage 27 nodes across 9 clusters. There are some burdens you should consider. The easiest is: "what do we do when two small clusters, which grow slowly, need more space?" With one cluster: buy a node, add it, done. With two clusters: buy two nodes, add them, done. This can be

[ceph-users] a big cluster or several small

2018-05-14 Thread Marc Boisis
Hello, Currently we have a 294-OSD (21 hosts / 3 racks) cluster with RBD clients only and a single pool (size=3). We want to divide this cluster into several to minimize the risk in case of failure/crash. For example, a cluster for mail, another for the file servers, a test cluster ... Do

Re: [ceph-users] slow requests are blocked

2018-05-14 Thread Grigory Murashov
Hello David! 2. I set it to 10/10. 3. Thanks, my problem was that I ran it on a host where the osd.15 daemon was not running. Could you please help me read the OSD logs? Here is a part of ceph.log: 2018-05-14 13:46:32.644323 mon.storage-ru1-osd1 mon.0 185.164.149.2:6789/0 553895 : cluster [INF] Cluster is now healt
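
For the record, two ways to raise the OSD debug level that avoid the wrong-host problem mentioned above (osd.15 is the example from this thread):

    # From any node with an admin keyring, over the network
    ceph tell osd.15 injectargs '--debug-osd 10/10'
    # Or locally via the admin socket, on the host that actually runs osd.15
    ceph daemon osd.15 config set debug_osd 10/10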

Re: [ceph-users] PG show inconsistent active+clean+inconsistent

2018-05-14 Thread David Turner
Just for clarification, the PG state is not the cause of the scrub errors. Something happened in your cluster that caused inconsistencies between copies of the data; the scrub noticed them, and the scrub errors are why the PG is flagged inconsistent, which does put the cluster in HEALTH_ERR. Anyway, j
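
A sketch of the usual inspection and repair steps for an inconsistent PG (2.1f is a placeholder; use the PG id reported by ceph health detail):

    # Find the affected PG(s)
    ceph health detail | grep inconsistent
    # Show which object copies disagree
    rados list-inconsistent-obj 2.1f --format=json-pretty
    # Repair once the cause is understood
    ceph pg repair 2.1f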

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-14 Thread Nick Fisk
Hi Wido, Are you trying this setting? /sys/devices/system/cpu/intel_pstate/min_perf_pct -Original Message- From: ceph-users On Behalf Of Wido den Hollander Sent: 14 May 2018 14:14 To: n...@fisk.me.uk; 'Blair Bethwaite' Cc: 'ceph-users' Subject: Re: [ceph-users] Intel Xeon Scalable a
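
For completeness, a sketch of what that setting looks like on a box running the intel_pstate driver (the value 100 is only illustrative):

    # Keep the P-state driver from requesting less than 100% of max performance
    echo 100 | sudo tee /sys/devices/system/cpu/intel_pstate/min_perf_pct
    cat /sys/devices/system/cpu/intel_pstate/min_perf_pct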

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-14 Thread John Hearns
Wido, I am going to put my rather large foot in it here. I am sure it is understood that Turbo mode will not keep all cores at the maximum frequency at any given time. There is a thermal envelope for the chip, and the chip works to keep the power dissipation within that envelope. From what I

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-14 Thread Webert de Souza Lima
On Sat, May 12, 2018 at 3:11 AM Alexandre DERUMIER wrote: > The documentation (luminous) says: mds cache size - Description: The number of inodes to cache. A value of 0 indicates an unlimited number. It is recommended to use mds_cache_memory_limit to limit the amount of memory t
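
A sketch of setting the memory-based limit instead of the inode count (the 4 GiB value and the MDS name are only examples):

    # In ceph.conf, [mds] section:
    #   mds_cache_memory_limit = 4294967296
    # Or at runtime via the admin socket on the MDS host:
    ceph daemon mds.$(hostname -s) config set mds_cache_memory_limit 4294967296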

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-14 Thread Wido den Hollander
On 05/01/2018 10:19 PM, Nick Fisk wrote: > 4.16 required? https://www.phoronix.com/scan.php?page=news_item&px=Skylake-X-P-State-Linux-4.16 I've been trying with the 4.16 kernel for the last few days, but still, it's not working. The CPUs keep clocking down to 800 MHz. I've set scaling_m
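
For comparison, a few read-only checks that show what the cores are actually doing (standard cpufreq sysfs paths, assuming a stock Ubuntu kernel):

    # Governor in use and current frequencies
    cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | sort | uniq -c
    grep MHz /proc/cpuinfo | sort | uniq -c
    # Min/max the driver is allowed to use (cpu0 shown)
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq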

Re: [ceph-users] RBD Cache and rbd-nbd

2018-05-14 Thread Jason Dillaman
On Mon, May 14, 2018 at 12:15 AM, Marc Schöchlin wrote: > Hello Jason, many thanks for your informative response! On 11.05.2018 at 17:02, Jason Dillaman wrote: >> I cannot speak for Xen, but in general IO to a block device will hit the pagecache unless the IO operation is flagged as di
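
For context, the librbd cache sits below the pagecache being discussed here and is configured client-side; a sketch of the relevant settings and an rbd-nbd mapping (pool/image names are placeholders):

    # ceph.conf on the client, [client] section:
    #   rbd cache = true
    #   rbd cache writethrough until flush = true
    # Map and unmap an image through rbd-nbd
    sudo rbd-nbd map rbd/myimage
    sudo rbd-nbd unmap /dev/nbd0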

[ceph-users] Cephfs write fail when node goes down

2018-05-14 Thread Josef Zelenka
Hi everyone, we've encountered an unusual thing in our setup (4 nodes, 48 OSDs, 3 monitors - ceph Jewel, Ubuntu 16.04 with kernel 4.4.0). Yesterday, we were doing a HW upgrade of the nodes, so they went down one by one - the cluster was in good shape during the upgrade, as we've done this numero
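
For future maintenance windows, the usual sketch is to stop the cluster from rebalancing while a node is deliberately down:

    # Before taking a node down for hardware work
    ceph osd set noout
    # ... do the maintenance, bring the node and its OSDs back ...
    ceph osd unset noout
    ceph -s   # confirm the cluster is healthy before touching the next node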

Re: [ceph-users] RBD Cache and rbd-nbd

2018-05-14 Thread Marc Schöchlin
Hello Jason, many thanks for your informative response! On 11.05.2018 at 17:02, Jason Dillaman wrote: > I cannot speak for Xen, but in general IO to a block device will hit the pagecache unless the IO operation is flagged as direct (e.g. O_DIRECT) to bypass the pagecache and directly send it

Re: [ceph-users] jewel to luminous upgrade, chooseleaf_vary_r and chooseleaf_stable

2018-05-14 Thread Dan van der Ster
Hi Adrian, Is there a strict reason why you *must* upgrade the tunables? It is normally OK to run with old (e.g. hammer) tunables on a luminous cluster. The crush placement won't be state of the art, but that's not a huge problem. We have a lot of data in a jewel cluster with hammer tunables. We
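
A quick, read-only way to see which tunables profile a cluster currently runs before deciding whether to change anything:

    ceph osd crush show-tunables
    # the output includes the active profile and the minimum client/kernel requirements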

[ceph-users] jewel to luminous upgrade, chooseleaf_vary_r and chooseleaf_stable

2018-05-14 Thread Adrian
Hi all, We recently upgraded our old ceph cluster to jewel (5x mon, 21x storage hosts with 9x 6 TB filestore OSDs and 3x SSDs with 3 journals on each) - mostly used for OpenStack compute/cinder. To get there we had to go with chooseleaf_vary_r = 4 in order to minimize client impact and save
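
For readers wanting to adjust a single tunable such as chooseleaf_vary_r step by step rather than jumping straight to a profile, the usual sketch is to edit the decompiled CRUSH map (expect data movement when re-injecting it):

    # Export, decompile, edit, recompile, and re-inject the CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit "tunable chooseleaf_vary_r" in crushmap.txt, e.g. from 4 down to 3
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new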