Re: [ceph-users] split brain case

2018-04-02 Thread Donny Davis
The only reason to stray from what is defined in the docs is if you have a very specific use case for the application of RAID or something else not defined. You are in good shape. Just follow the guidance in the docs and you shouldn't have any problems. Ceph is powerful and intelligent technology.

Re: [ceph-users] split brain case

2018-04-02 Thread ST Wong (ITSC)
I’m a newbie to Ceph ☺. When I go through the docs and some discussions, it seems one OSD per disk will perform better than one OSD per server on RAID. Is that correct? Thanks again. From: Donny Davis [mailto:do...@fortnebula.com] Sent: Tuesday, April 03, 2018 10:19 AM To: ST Wong (ITSC

Re: [ceph-users] ceph-fuse segfaults

2018-04-02 Thread Donny Davis
The kernel client in my experience was much better all around. I had an issue with file corruption (quite a long time ago) that could have been prevented if I had a proper UPS and was using the kernel client. On Mon, Apr 2, 2018 at 7:12 PM, Zhang Qiang wrote: > Thanks Patrick, > I should have

Re: [ceph-users] split brain case

2018-04-02 Thread Donny Davis
It would work fine either way. I was just curious how people are setting up Ceph in their environments. Usually when people say they have one OSD per server, they are using RAID for one reason or another. It's not really relevant to the question at hand, but thank you for satisfying my curiosity.

Re: [ceph-users] split brain case

2018-04-02 Thread ST Wong (ITSC)
There are multiple disks per server, and each disk will have one OSD. Is that okay? Thanks again. From: Donny Davis [mailto:do...@fortnebula.com] Sent: Tuesday, April 03, 2018 10:12 AM To: ST Wong (ITSC) Cc: Ronny Aasen; ceph-users@lists.ceph.com Subject: Re: [ceph-users] split brain case

Re: [ceph-users] ceph-fuse segfaults

2018-04-02 Thread Zhang Qiang
Thanks Patrick, I should have checked the tracker first. I'll try the kernel client and an upgrade to see if that resolves it. On 2 April 2018 at 22:29, Patrick Donnelly wrote: > Probably fixed by this: http://tracker.ceph.com/issues/17206 > > You need to upgrade your version of ceph-fuse. > > On Mon,

Re: [ceph-users] split brain case

2018-04-02 Thread Donny Davis
Do you only have one OSD per server? Not that it really matters... because all of the above is true in any case. Just curious. On Mon, Apr 2, 2018 at 6:40 PM, ST Wong (ITSC) wrote: > Hi, > > > > >how many servers are your osd's split over ? keep in mind that ceph's > default picks one osd

Re: [ceph-users] split brain case

2018-04-02 Thread ST Wong (ITSC)
Hi, > how many servers are your OSDs split over? keep in mind that ceph's default > picks one osd from each host. so you would need minimum 4 osd hosts in total > to be able to use 4+2 pools and with only 4 hosts you have no failure domain. > but 4 hosts is the minimum sane starting point for a r

Re: [ceph-users] Have an inconsistent PG, repair not working

2018-04-02 Thread Michael Sudnick
Hi Kjetil, I've tried to get the pg scrubbing/deep scrubbing and nothing seems to be happening. I've tried it a few times over the last few days. My cluster is recovering from a failed disk (which was probably the reason for the inconsistency). Do I need to wait for the cluster to heal before repa

Re: [ceph-users] Have an inconsistent PG, repair not working

2018-04-02 Thread Marc Roos
I have had this inconsistent pg for a long time on my test cluster as well, and have also tried pg repair among other things. Can I also get some help on this? [@c02 ~]# ceph health detail HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent OSD_SCRUB_ERRORS 1 scrub errors PG_DAMAGED Possible d

[ceph-users] Ceph performance falls as data accumulates

2018-04-02 Thread Robert Stanford
This is a known issue as far as I can tell; I've read about it several times. Ceph performs great (using radosgw), but as the OSDs fill up, performance falls sharply. I am down to half of the empty-cluster performance at about 50% disk usage. My questions are: does adding more OSDs / disks to the cluster

Re: [ceph-users] Have an inconsistent PG, repair not working

2018-04-02 Thread Kjetil Joergensen
Hi, scrub or deep-scrub the pg; that should, in theory, get list-inconsistent-obj to spit out what's wrong. Then mail that info to the list. -KJ On Sun, Apr 1, 2018 at 9:17 AM, Michael Sudnick wrote: > Hello, > > I have a small cluster with an inconsistent pg. I've tried ceph pg rep
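
For readers following along, here is a minimal sketch of the sequence Kjetil describes, driving the ceph and rados command-line tools from Python's subprocess module. The PG id is a hypothetical placeholder (take the real one from ceph health detail), and the JSON field names are the ones documented for list-inconsistent-obj, so verify them against your release:

    #!/usr/bin/env python
    # Sketch only: trigger a deep scrub, then inspect the inconsistency report.
    # Assumes the ceph and rados CLIs are installed and an admin keyring is available.
    import json
    import subprocess

    PG_ID = "2.1ab"  # hypothetical placeholder; take the real id from `ceph health detail`

    # Ask the primary OSD to deep-scrub the PG (the scrub itself runs asynchronously).
    subprocess.check_call(["ceph", "pg", "deep-scrub", PG_ID])

    # Once the scrub has completed, list what is actually inconsistent.
    report = json.loads(subprocess.check_output(
        ["rados", "list-inconsistent-obj", PG_ID, "--format=json"]))
    for obj in report.get("inconsistents", []):
        print(obj["object"]["name"], obj.get("errors"))

    # Only after reviewing the report:
    # subprocess.check_call(["ceph", "pg", "repair", PG_ID])

The deep scrub has to finish before list-inconsistent-obj has anything to report, which may also explain why a cluster busy with recovery (as in Michael's case) can appear to ignore the scrub request for a while.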

Re: [ceph-users] librados python pool alignment size write failures

2018-04-02 Thread Gregory Farnum
On Mon, Apr 2, 2018 at 8:21 AM Kevin Hrpcek wrote: > Hello, > > We use python librados bindings for object operations on our cluster. For > a long time we've been using 2 ec pools with k=4 m=1 and a fixed 4MB > read/write size with the python bindings. During preparations for migrating > all of o

Re: [ceph-users] Bluestore caching, flawed by design?

2018-04-02 Thread Mark Nelson
On 04/01/2018 07:59 PM, Christian Balzer wrote: Hello, firstly, Jack pretty much correctly correlated my issues to Mark's points, more below. On Sat, 31 Mar 2018 08:24:45 -0500 Mark Nelson wrote: On 03/29/2018 08:59 PM, Christian Balzer wrote: Hello, my crappy test cluster was rendered in

[ceph-users] librados python pool alignment size write failures

2018-04-02 Thread Kevin Hrpcek
Hello, We use python librados bindings for object operations on our cluster. For a long time we've been using 2 ec pools with k=4 m=1 and a fixed 4MB read/write size with the python bindings. During preparations for migrating all of our data to a k=6 m=2 pool we've discovered that ec pool ali
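
As an illustration of the fixed-chunk write pattern Kevin describes, here is a minimal sketch using the rados Python bindings. The pool name, object name, and input file are placeholders, and it assumes the 4 MB chunk size is a multiple of the EC pool's required alignment, which is exactly the interaction this thread is about:

    import rados

    CHUNK = 4 * 1024 * 1024  # fixed 4 MB write size, assumed to be a multiple of the pool alignment

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')  # assumes a standard conf/keyring setup
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('ecpool')  # hypothetical pool name
        try:
            with open('payload.bin', 'rb') as f:  # hypothetical input file
                offset = 0
                while True:
                    chunk = f.read(CHUNK)
                    if not chunk:
                        break
                    # Every full chunk lands on a chunk-aligned offset; only the
                    # final, short chunk may fall off that boundary, which is the
                    # sort of misaligned write being discussed in this thread.
                    ioctx.write('myobject', chunk, offset)
                    offset += len(chunk)
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

Roughly speaking, the required alignment is the stripe width (k times the stripe unit), so a 4 MB write size that divides evenly for k=4 need not do so for k=6, which appears to be what Kevin ran into.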

Re: [ceph-users] ceph-fuse segfaults

2018-04-02 Thread Patrick Donnelly
Probably fixed by this: http://tracker.ceph.com/issues/17206 You need to upgrade your version of ceph-fuse. On Mon, Apr 2, 2018 at 12:56 AM, Zhang Qiang wrote: > Hi, > > I'm using ceph-fuse 10.2.3 on CentOS 7.3.1611. ceph-fuse always > segfaults after running for some time. > > *** Caught signal

Re: [ceph-users] wal and db device on SSD partitions?

2018-04-02 Thread David Turner
Filestore had no such recommendation for size of journal per TB. The default was a flat 10GB partition. For bluestore, the recommendation has been 10GB per TB. However, with your setup, I would probably bump that up a bit more just to give you more room for your WAL/DB partition. While bluestor
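
For concreteness, a tiny sketch of the sizing arithmetic behind that rule of thumb; the headroom factor is an arbitrary stand-in for the "bump it up a bit" advice above, not an official recommendation:

    def suggested_db_size_gb(osd_size_tb, gb_per_tb=10, headroom=1.5):
        """Rough BlueStore DB/WAL partition size: 10 GB per TB of OSD,
        padded by an arbitrary headroom factor."""
        return osd_size_tb * gb_per_tb * headroom

    # e.g. a 10 TB spinner -> roughly 150 GB of SSD set aside for its DB/WAL
    print(suggested_db_size_gb(10))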

Re: [ceph-users] Cephfs and number of clients

2018-04-02 Thread David Turner
It depends on how you're mounting CephFS. If you're using ceph-fuse, I believe that you would see a performance increase from multiple mount points for each volume. How much and if that actually holds up for you in production would require some testing on your end. On Tue, Mar 20, 2018 at 6:53 A

Re: [ceph-users] multiple radosgw daemons per host, and performance

2018-04-02 Thread David Turner
Before I would start multiple daemons on the same host, I would look at running your RGW daemons in containers. That way you can allocate resources to them a little more elegantly and scale them as you need. On Mon, Mar 26, 2018 at 1:11 PM Robert Stanford wrote: > > When I am running at

Re: [ceph-users] Bluestore caching, flawed by design?

2018-04-02 Thread Simon Leinen
Christian Balzer writes: > On Mon, 2 Apr 2018 08:33:35 +0200 John Hearns wrote: >> Christian, you mention single socket systems for storage servers. >> I often thought that the Xeon-D would be ideal as a building block for >> storage servers >> https://www.intel.com/content/www/us/en/products/proce

[ceph-users] ceph-fuse segfaults

2018-04-02 Thread Zhang Qiang
Hi, I'm using ceph-fuse 10.2.3 on CentOS 7.3.1611. ceph-fuse always segfaults after running for some time. *** Caught signal (Segmentation fault) ** in thread 7f455d832700 thread_name:ceph-fuse ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b) 1: (()+0x2a442a) [0x7f457208e42a] 2

Re: [ceph-users] Bluestore caching, flawed by design?

2018-04-02 Thread Christian Balzer
Hello, On Mon, 2 Apr 2018 08:33:35 +0200 John Hearns wrote: > Christian, you mention single socket systems for storage servers. > I often thought that the Xeon-D would be ideal as a building block for > storage servers > https://www.intel.com/content/www/us/en/products/processors/xeon/d-processo