Re: [ceph-users] Slow Performance - Sequential IO

2020-01-17 Thread Christian Balzer
No high CPU load/interface saturation is noted when running > tests against the rbd. > > > > When testing with a 4K block size against an RBD on a dedicated metal test > host (same specs as other cluster nodes noted above) I get the following > (command similar to fi
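The fio command itself is cut off in this preview; a minimal 4K test of the kind described, run through fio's rbd engine, might look like the sketch below (pool, image and client names are placeholders, not taken from the thread, and the original poster may equally have run fio against a mapped krbd device):

  fio --name=rbd-4k-write --ioengine=rbd --clientname=admin --pool=rbd \
      --rbdname=testimg --rw=write --bs=4k --iodepth=32 --direct=1 \
      --runtime=60 --time_based --group_reporting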

Re: [ceph-users] Slow rbd read performance

2019-12-26 Thread Christian Balzer
at 164MiB/sec 41 > IOPS > osd.27: bench: wrote 1GiB in blocks of 4MiB in 7.00978 sec at 146MiB/sec 36 > IOPS > osd.32: bench: wrote 1GiB in blocks of 4MiB in 6.38438 sec at 160MiB/sec 40 > IOPS > > Thanks, > Mario > > > > On Tue, Dec 24, 2019 at 1:46 AM Chr
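For context, per-OSD figures in this form (1 GiB written in 4 MiB blocks) are what the built-in OSD bencher prints with its defaults; a quick sketch of reproducing them:

  ceph tell osd.27 bench                      # default: 1 GiB total, 4 MiB writes
  ceph tell osd.27 bench 1073741824 4194304   # same thing with the sizes spelled out

Keep in mind this only exercises the OSD's local data path, not the network or replication.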

Re: [ceph-users] Slow rbd read performance

2019-12-23 Thread Christian Balzer
5517.19 bytes/sec: > 45196784.26 (45MB/sec) => WHY JUST 45MB/sec? > > Since i ran those rbd benchmarks in ceph01, i guess the problem is not > related to my backup rbd mount at all? > > Thanks, > Mario > ___ > ceph-user
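The benchmark commands behind these numbers are truncated in this preview; a comparable read test can be approximated with rbd bench (pool/image names are placeholders, and results depend heavily on read-ahead and object size):

  rbd bench --io-type read --io-size 4M --io-threads 16 --io-total 1G --io-pattern seq rbd/backup-image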

Re: [ceph-users] SPAM in the ceph-users list

2019-11-12 Thread Christian Balzer
labor intensive and a nuisance for real users) as well as harsher ingress and egress (aka spamfiltering) controls you will find that all the domains spamvertized are now in the Spamhaus DBL. "host abbssm.edu.in.dbl.spamhaus.org" Pro tip for spammers: Don't get my attention, ever. Christi

Re: [ceph-users] Bluestore caching oddities, again

2019-08-04 Thread Christian Balzer
Hello, On Sun, 4 Aug 2019 06:34:46 -0500 Mark Nelson wrote: > On 8/4/19 6:09 AM, Paul Emmerich wrote: > > > On Sun, Aug 4, 2019 at 3:47 AM Christian Balzer wrote: > > > >> 2. Bluestore caching still broken > >> When writing data with the fios below, it

[ceph-users] Bluestore caching oddities, again

2019-08-03 Thread Christian Balzer
: IOPS=199, BW=797MiB/s (835MB/s)(32.0GiB/41130msec) with direct=1 read: IOPS=702, BW=2810MiB/s (2946MB/s)(32.0GiB/11662msec) Which is as fast as it gets with this setup. Comments? Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Mobile

Re: [ceph-users] HEALTH_ERR with a kitchen sink of problems: MDS damaged, readonly, and so forth

2019-07-24 Thread Christian Balzer
On Thu, 25 Jul 2019 13:49:22 +0900 Sangwhan Moon wrote: > osd: 39 osds: 39 up, 38 in You might want to find that 'out' OSD. -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Mobile Inc. ___ ceph-us

Re: [ceph-users] Package availability for Debian / Ubuntu

2019-05-16 Thread Christian Balzer
h package found > stretch has 1 Packages. No ceph package found > > If you want to re-run these tests, the attached hacky shell script does it. > > Regards, > > Matthew > > > > -- > The Wellcome Sanger Institute is operated by Genome Research > Limited, a char

Re: [ceph-users] Unexpected IOPS Ceph Benchmark Result

2019-04-21 Thread Christian Balzer
get different results when using > different pool image, but it isnt. It's like using 1 same performance. > Although we're really sure that we alreay separate the SSD and HDD pool and > crushmap. > > My question is : > > 1. Why i get same test results although i already test it w

Re: [ceph-users] how to judge the results? - rados bench comparison

2019-04-17 Thread Christian Balzer
On Wed, 17 Apr 2019 16:08:34 +0200 Lars Täuber wrote: > Wed, 17 Apr 2019 20:01:28 +0900 > Christian Balzer ==> Ceph Users : > > On Wed, 17 Apr 2019 11:22:08 +0200 Lars Täuber wrote: > > > > > Wed, 17 Apr 2019 10:47:32 +0200 > &

Re: [ceph-users] how to judge the results? - rados bench comparison

2019-04-17 Thread Christian Balzer
nt to reduce > > recovery speed anyways if you would run into that limit > > > > Paul > > > > Lars > _______ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listi

Re: [ceph-users] how to judge the results? - rados bench comparison

2019-04-17 Thread Christian Balzer
so if everything is in the same boat. So if your clients (or most of them at least) can be on 25Gb/s as well, that would be the best situation, with a non-split network. Christian > > > > > My 2 cents, > > > > Gr. Stefan > > > > Cheers, > Lars > __

Re: [ceph-users] How to reduce HDD OSD flapping due to rocksdb compacting event?

2019-04-11 Thread Christian Balzer
using object store), how busy those disks and CPUs are, etc. That kind of information will be invaluable for others here and likely the developers as well. Regards, Christian > Kind regards, > > Charles Alva > Sent from Gmail Mobile -- Christian Balzer Network/Syst

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-10 Thread Christian Balzer
Hello, On Wed, 10 Apr 2019 20:09:58 +0200 Paul Emmerich wrote: > On Wed, Apr 10, 2019 at 11:12 AM Christian Balzer wrote: > > > > > > Hello, > > > > Another thing that crossed my mind aside from failure probabilities caused > > by actual HDDs dying i

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-10 Thread Christian Balzer
week" situation like experienced with several people here, you're even more likely to wind up in trouble very fast. This is of course all something people do (or should know), I'm more wondering how to model it to correctly assess risks. Christian On Wed, 3 Apr 2019 10:28:09 +0900 Christian Ba

Re: [ceph-users] RGW: Reshard index of non-master zones in multi-site

2019-04-07 Thread Christian Balzer
e what documentation says, it's safe to > > run 'reshard stale-instances rm' on a multi-site setup. > > > > However it is quite telling if the author of this feature doesn't > > trust what they have written to work correctly. > > > > There are still thousands

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-02 Thread Christian Balzer
On Tue, 2 Apr 2019 19:04:28 +0900 Hector Martin wrote: > On 02/04/2019 18.27, Christian Balzer wrote: > > I did a quick peek at my test cluster (20 OSDs, 5 hosts) and a replica 2 > > pool with 1024 PGs. > > (20 choose 2) is 190, so you're never going to have more than tha

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-02 Thread Christian Balzer
Hello Hector, Firstly I'm so happy somebody actually replied. On Tue, 2 Apr 2019 16:43:10 +0900 Hector Martin wrote: > On 31/03/2019 17.56, Christian Balzer wrote: > > Am I correct that unlike with replication there isn't a maximum size > > of the critical path OSDs? &

Re: [ceph-users] Does Bluestore backed OSD detect bit rot immediately when reading or only when scrubbed?

2019-04-01 Thread Christian Balzer
What and how would happen in case erasure coded pool's data was found > to be damaged as well? > > -- > End of message. Next message? > _______ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Christian Balzer

Re: [ceph-users] Ceph block storage cluster limitations

2019-03-31 Thread Christian Balzer
d this. ^o^ Regards, Christian > > What is the known maximum cluster size that Ceph RBD has been deployed to? > > See above. > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinf

[ceph-users] Erasure Coding failure domain (again)

2019-03-31 Thread Christian Balzer
down to the same risk as a 3x replica pool. Feedback welcome. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http

Re: [ceph-users] Bluestore WAL/DB decisions

2019-03-28 Thread Christian Balzer
s the penalty for a too small DB on an SSD partition so severe that > it's not worth doing? > > Thanks, > Erik > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-

[ceph-users] Dealing with SATA resets and consequently slow ops

2019-03-26 Thread Christian Balzer
] ata5.00: configured for UDMA/133 [54954737.206140] ata5: EH complete --- -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http

Re: [ceph-users] Ceph cluster on AMD based system.

2019-03-05 Thread Christian Balzer
llo, > >>> > >>> > >>> I was thinking of using AMD based system for my new nvme based > >>> cluster. In particular I'm looking at > >>> https://www.supermicro.com/Aplus/system/1U/1113/AS-1113S-WN10RT.cfm > >>> and https://www.am

Re: [ceph-users] Prevent rebalancing in the same host?

2019-02-19 Thread Christian Balzer
uld also permanently set noout and nodown and live with the consequences and warning state. But of course everybody will (rightly) tell you that you need enough capacity to at the very least deal with a single OSD loss. Christian -- Christian Balzer Network/Systems Engineer ch...
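For reference, the flags mentioned are set and cleared like this; with them set permanently the cluster will sit in HEALTH_WARN, which is the warning state referred to above:

  ceph osd set noout
  ceph osd set nodown
  # and to revert once normal down/out handling is wanted again:
  ceph osd unset noout
  ceph osd unset nodown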

Re: [ceph-users] rados block on SSD - performance - how to tune and get insight?

2019-02-06 Thread Christian Balzer
ing on NVMe? > > Thanks a log. > > [0] > https://www.micron.com/-/media/client/global/documents/products/other-documents/micron_9200_max_ceph_12,-d-,2,-d-,8_luminous_bluestore_reference_architecture.pdf?la=en > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-26 Thread Christian Balzer
The Cluster is recovering and remapping fine, but still has some objects > >> to process. > >> > >> My question: May I just switch the server back on and in best case, the 24 > >> OSDs get back online and recovering will do the job without problems. > >>

Re: [ceph-users] disk controller failure

2018-12-13 Thread Christian Balzer
a small cluster (is there even enough space to rebalance a node worth of data?) things may be different. I always set "mon_osd_down_out_subtree_limit = host" (and monitor things of course) since I reckon a down node can often be brought back way faster than a full rebalance.
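In ceph.conf that recommendation looks like the snippet below (a sketch; it belongs in the mon or global section and the monitors need to pick it up):

  [mon]
      # don't auto-mark a whole host's OSDs out; a repaired/rebooted node is
      # usually back long before a full rebalance of its data would finish
      mon_osd_down_out_subtree_limit = host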

Re: [ceph-users] SLOW SSD's after moving to Bluestore

2018-12-11 Thread Christian Balzer
ail is prohibited. If you received > > this message in error, please contact the sender and destroy all > > copies of this email and any attachment(s). > > > > > > On Mon, Dec 10, 2018 at 8:57 PM Christian Balzer > <mailto:ch...@gol.com>> wrote: >

Re: [ceph-users] SLOW SSD's after moving to Bluestore

2018-12-10 Thread Christian Balzer
d visibly in the logs. > The systems were never powered off or anything during the conversion > from filestore to bluestore. > So anything mentioned as well as kernel changes don't apply. I shall point the bluestore devs at this then. >.> Christian -- Christian Balzer Network/Systems En

Re: [ceph-users] SLOW SSD's after moving to Bluestore

2018-12-10 Thread Christian Balzer
ed this > message in error, please contact the sender and destroy all copies of > this email and any attachment(s). > > > On Mon, Dec 10, 2018 at 8:57 PM Christian Balzer wrote: > > > > Hello, > > > > On Mon, 10 Dec 2018 20:43:40 -0500 Tyler Bishop wrote: &g

Re: [ceph-users] SLOW SSD's after moving to Bluestore

2018-12-10 Thread Christian Balzer
> Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s > > > avgrq-sz avgqu-sz await r_await w_await svctm %util > > > sda 0.00 0.00 0.00 3.50 0.00 17.00 > > > 9.71 0.00 1.29 0.00 1.29 1.14 0.40 > > &g

Re: [ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

2018-10-16 Thread Christian Balzer
Hello, On Tue, 16 Oct 2018 14:09:23 +0100 (BST) Andrei Mikhailovsky wrote: > Hi Christian, > > > - Original Message - > > From: "Christian Balzer" > > To: "ceph-users" > > Cc: "Andrei Mikhailovsky" > > Sen

Re: [ceph-users] Luminous with osd flapping, slow requests when deep scrubbing

2018-10-16 Thread Christian Balzer
b = 0 > > > Could you share experiences with deep scrubbing of bluestore osds? Are there > any options that I should set to make sure the osds are not flapping and the > client IO is still available? > > Thanks > > Andrei -- Christian Balzer Network/Syste

Re: [ceph-users] Bluestore vs. Filestore

2018-10-02 Thread Christian Balzer
> > * Bluestore should be the new and shiny future - right? > ** Total mem 1TB+ > > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -

Re: [ceph-users] Best practices for allocating memory to bluestore cache

2018-08-31 Thread Christian Balzer
message in error, please contact the sender and destroy all copies of this > > email and any attachment(s). > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >

Re: [ceph-users] Design a PetaByte scale CEPH object storage

2018-08-26 Thread Christian Balzer
e anathema and NVMe is likely not needed in your scenario, at least not for actual storage space. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] ceph auto repair. What is wrong?

2018-08-24 Thread Christian Balzer
, > > 74 pgs degraded, 74 pgs undersized > > > > And ceph does not try to repair pool. Why? > > How long did you wait? The default timeout is 600 seconds before > recovery starts. > > These OSDs are not marked as out yet. > > Wido > > > >
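The 600 seconds referred to here is the default of mon_osd_down_out_interval, i.e. how long an OSD may stay down before the monitors mark it out and recovery/backfill begins. A sketch for checking and, if desired, changing it:

  # on a monitor host (assuming the mon id matches the short hostname):
  ceph daemon mon.$(hostname -s) config get mon_osd_down_out_interval
  # runtime change; 300 is an arbitrary illustration value, not a recommendation from the thread
  ceph tell mon.* injectargs '--mon_osd_down_out_interval 300'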

Re: [ceph-users] Stability Issue with 52 OSD hosts

2018-08-22 Thread Christian Balzer
and graphing this data might work, too. My suspects would be deep scrubs and/or high IOPS spikes when this is happening, starving out OSD processes (CPU wise, RAM should be fine one supposes). Christian > Please help!!! > _______ > ceph-users mailing
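A quick way to test that suspicion while a stall is happening is to look for PGs in deep scrub and watch disk saturation at the same time, e.g.:

  ceph pg dump pgs_brief 2>/dev/null | grep -c 'scrubbing+deep'
  iostat -x 2     # on the OSD hosts; look for %util pegged near 100 on the data disks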

Re: [ceph-users] Public network faster than cluster network

2018-05-10 Thread Christian Balzer
h IOPS long before bandwidth becomes an issue. > thus, a 10GB network would be needed, right ? Maybe a dual gigabit port > bonded together could do the job. > A single gigabit link would be saturated by a single disk. > > Is my assumption correct ? > The biggest argument again

Re: [ceph-users] Public network faster than cluster network

2018-05-09 Thread Christian Balzer
Lastly, more often than not segregated networks are not needed, add unnecessary complexity and the resources spent on them would be better used to have just one fast and redundant network instead. Christian -- Christian Balzer Network/Systems Engineer ch...

Re: [ceph-users] Poor read performance.

2018-04-25 Thread Christian Balzer
Hello, On Wed, 25 Apr 2018 17:20:55 -0400 Jonathan Proulx wrote: > On Wed Apr 25 02:24:19 PDT 2018 Christian Balzer wrote: > > > Hello, > > > On Tue, 24 Apr 2018 12:52:55 -0400 Jonathan Proulx wrote: > > > > The performence I really care about is over

Re: [ceph-users] Poor read performance.

2018-04-25 Thread Christian Balzer
> so don't want to make direct comparisons. > It could be as easy as having lots of pagecache with filestore that helped dramatically with (repeated) reads. But w/o a quiescent cluster determining things might be difficult. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Questions regarding hardware design of an SSD only cluster

2018-04-24 Thread Christian Balzer
Hello, On Tue, 24 Apr 2018 11:39:33 +0200 Florian Florensa wrote: > 2018-04-24 3:24 GMT+02:00 Christian Balzer <ch...@gol.com>: > > Hello, > > > > Hi Christian, and thanks for your detailed answer. > > > On Mon, 23 Apr 2018 17:43:03 +0200 Florian Flore

Re: [ceph-users] Questions regarding hardware design of an SSD only cluster

2018-04-23 Thread Christian Balzer
rds, > > Florian > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten

Re: [ceph-users] Cluster unusable after 50% full, even with index sharding

2018-04-13 Thread Christian Balzer
that getting 50% full makes your cluster > unusable? Index sharding has seemed to not help at all (I did some > benchmarking, with 128 shards and then 256; same result each time.) > > Or are we out of luck? -- Christian Balzer Network/Systems Enginee

Re: [ceph-users] Bluestore caching, flawed by design?

2018-04-02 Thread Christian Balzer
v3 @ 3.50GHz (12/6 cores) Christian > I haven't been following these processors lately. Is anyone building CEPH > clusters using them > > On 2 April 2018 at 02:59, Christian Balzer <ch...@gol.com> wrote: > > > > > Hello, > > > > firstly, Jack pretty

Re: [ceph-users] Bluestore caching, flawed by design?

2018-04-01 Thread Christian Balzer
Hello, firstly, Jack pretty much correctly correlated my issues to Mark's points, more below. On Sat, 31 Mar 2018 08:24:45 -0500 Mark Nelson wrote: > On 03/29/2018 08:59 PM, Christian Balzer wrote: > > > Hello, > > > > my crappy test cluster was rendered inoperatio

[ceph-users] Bluestore caching, flawed by design?

2018-03-29 Thread Christian Balzer
-- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Fwd: Fwd: High IOWait Issue

2018-03-26 Thread Christian Balzer
figure things on either host or switch, or with a good modern switch not even buy that much in the latency department. Christian > > 2018-03-26 7:41 GMT+07:00 Christian Balzer <ch...@gol.com>: > > > > > Hello, > > > > in general and as reminder for others, the more

Re: [ceph-users] problem while removing images

2018-03-26 Thread Christian Balzer
> > Regards > > > *Thiago Gonzaga* > SaaSOps Software Architect > o. 1 (512) 2018-287 x2119 > Skype: thiago.gonzaga20 > [image: Aurea] > <http://www.aurea.com/?utm_source=email-signature_medium=email> -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Fwd: High IOWait Issue

2018-03-25 Thread Christian Balzer
r, IOwait on each ceph host is about 20%. > > > https://prnt.sc/ivne08 > > > > > > > > > Can you guy help me find the root cause of this issue, and how > > to eliminate this high iowait? > > > > > > T

Re: [ceph-users] Growing an SSD cluster with different disk sizes

2018-03-19 Thread Christian Balzer
. > Understood, thank you! > > *Mark Steffen* > *"Don't believe everything you read on the Internet." -Abraham Lincoln* > > > > On Mon, Mar 19, 2018 at 7:11 AM, Christian Balzer <ch...@gol.com> wrote: > > > > > Hello, > > >

Re: [ceph-users] Growing an SSD cluster with different disk sizes

2018-03-19 Thread Christian Balzer
he action, are they a) really twice as fast or b) is your load never going to be an issue anyway? Christian > I'm also using Luminous/Bluestore if it matters. > > Thanks in advance! > > *Mark Steffen* > *"Don't believe everything you read on the Internet." -Abraham Lincoln*

Re: [ceph-users] Disk write cache - safe?

2018-03-15 Thread Christian Balzer
to IT mode style exposure of the disks and still use their HW cache. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.cep

Re: [ceph-users] New Ceph cluster design

2018-03-13 Thread Christian Balzer
-sz await r_await w_await svctm %util sdb 0.03 83.09 7.07 303.24 746.64 5084.99 37.59 0.05 0.15 0.71 0.13 0.06 2.00 --- 300 write IOPS and 5MB/s for all that time. Christian -- Christian Balzer Network/Systems Engineer c

Re: [ceph-users] New Ceph-cluster and performance "questions"

2018-02-08 Thread Christian Balzer
Hello, On Thu, 8 Feb 2018 10:58:43 + Patrik Martinsson wrote: > Hi Christian, > > First of all, thanks for all the great answers and sorry for the late > reply. > You're welcome. > > On Tue, 2018-02-06 at 10:47 +0900, Christian Balzer wrote: > > Hello, >

Re: [ceph-users] Latency for the Public Network

2018-02-06 Thread Christian Balzer
Hello, On Tue, 6 Feb 2018 09:21:22 +0100 Tobias Kropf wrote: > On 02/06/2018 04:03 AM, Christian Balzer wrote: > > Hello, > > > > On Mon, 5 Feb 2018 22:04:00 +0100 Tobias Kropf wrote: > > > >> Hi ceph list, > >> > >> we have a hype

Re: [ceph-users] osd_recovery_max_chunk value

2018-02-06 Thread Christian Balzer
20 ,ie 7340032 ? > More like 4MB to match things up nicely in the binary world. Christian > Karun Josy > > On Tue, Feb 6, 2018 at 1:15 PM, Christian Balzer <ch...@gol.com> wrote: > > > On Tue, 6 Feb 2018 13:01:12 +0530 Karun Josy wrote: > > > > >

Re: [ceph-users] osd_recovery_max_chunk value

2018-02-06 Thread Christian Balzer
ph tell osd.* injectargs '--osd_recovery_sleep .1' > - > > > Karun Josy > > On Tue, Feb 6, 2018 at 1:15 PM, Christian Balzer <ch...@gol.com> wrote: > > > On Tue, 6 Feb 2018 13:01:12 +0530 Karun Josy wrote: > > > > > Hello, > > > &g

Re: [ceph-users] osd_recovery_max_chunk value

2018-02-05 Thread Christian Balzer
t you get when programmers write docs. The above is a left-shift operation, see for example: http://bit-calculator.com/bit-shift-calculator Now if shrinking that value is beneficial for reducing recovery load, that's for you to find out. Christian > &
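Spelled out, the shift notation from the docs versus the plain numbers looks like this in a shell:

  echo $((7 << 20))   # 7340032  (7 MiB, the value quoted above)
  echo $((4 << 20))   # 4194304  (4 MiB, for comparison)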

Re: [ceph-users] Latency for the Public Network

2018-02-05 Thread Christian Balzer
> Define terminal server, are we talking Windows Virtual Desktops with RDP? Windows is quite the hog when it comes to I/O. Regards, Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications

Re: [ceph-users] New Ceph-cluster and performance "questions"

2018-02-05 Thread Christian Balzer
> Is the above a good way of measuring our cluster, or is it better more > reliable ways of measuring it ? > See above. A fio test is definitely a closer thing to reality compared to OSD or RADOS benches. > Is there a way to calculate this "theoretically" (ie with with 6

Re: [ceph-users] Linux Meltdown (KPTI) fix and how it affects performance?

2018-01-11 Thread Christian Balzer
__ > >> ceph-users mailing list > >> ceph-users@lists.ceph.com > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > >> > > ___ > > ceph-users mailing list > > ceph-users@

Re: [ceph-users] Running Jewel and Luminous mixed for a longer period

2018-01-01 Thread Christian Balzer
hed hard (when it wants/needs to flush to HDD) it will overload things and doesn't honor I/O priorities as others have mentioned here. I'm using bcache for now because in my use case the issues above won't show up, but I'd be wary to use it with Ceph in a cluster where I don't control/know the IO patte

Re: [ceph-users] Many concurrent drive failures - How do I activate pgs?

2017-12-20 Thread Christian Balzer
inaccessible, SMART reporting barely that something was there. So one wonders what caused your SSDs to get their knickers in such a twist. Are the survivors showing any unusual signs in their SMART output? Of course what your vendor/Intel will have to say will also be of interest. ^o^ R

Re: [ceph-users] Ceph - SSD cluster

2017-11-21 Thread Christian Balzer
On Tue, 21 Nov 2017 11:34:51 +0100 Ronny Aasen wrote: > On 20. nov. 2017 23:06, Christian Balzer wrote: > > On Mon, 20 Nov 2017 15:53:31 +0100 Ansgar Jazdzewski wrote: > > > >> Hi *, > >> > >> just on note because we hit it, take a look on your discar

Re: [ceph-users] how to improve performance

2017-11-21 Thread Christian Balzer
On Tue, 21 Nov 2017 09:21:58 +0200 Rudi Ahlers wrote: > On Mon, Nov 20, 2017 at 2:36 PM, Christian Balzer <ch...@gol.com> wrote: > > > On Mon, 20 Nov 2017 14:02:30 +0200 Rudi Ahlers wrote: > > > > > We're planning on installing 12X Virtual Machines with some

Re: [ceph-users] how to test journal?

2017-11-21 Thread Christian Balzer
nt to monitor a ceph cluster this way. -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] how to improve performance

2017-11-20 Thread Christian Balzer
On Tue, 21 Nov 2017 10:35:57 +1100 Nigel Williams wrote: > On 21 November 2017 at 10:07, Christian Balzer <ch...@gol.com> wrote: > > On Tue, 21 Nov 2017 10:00:28 +1100 Nigel Williams wrote: > >> Is there something in the specifications that gives them away as SS

Re: [ceph-users] how to improve performance

2017-11-20 Thread Christian Balzer
On Tue, 21 Nov 2017 10:00:28 +1100 Nigel Williams wrote: > On 20 November 2017 at 23:36, Christian Balzer <ch...@gol.com> wrote: > > On Mon, 20 Nov 2017 14:02:30 +0200 Rudi Ahlers wrote: > >> The SATA drives are ST8000NM0055-1RM112 > >> > > Note that th

Re: [ceph-users] Switch to replica 3

2017-11-20 Thread Christian Balzer
196 1.7 osd.196 up 1.0 1.0 > > 197 1.7 osd.197 up 1.0 1.0 > > 198 1.7 osd.198 up 1.0 1.0 > > 199 1.7 osd.199 up 1.0 1.0 > > 200

Re: [ceph-users] Ceph - SSD cluster

2017-11-20 Thread Christian Balzer
s mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] how to improve performance

2017-11-20 Thread Christian Balzer
eating. And if you're looking for a less invasive procedure, docs and the ML archive, but AFAIK there is nothing but re-creation at this time. Christian > > On Mon, Nov 20, 2017 at 1:44 PM, Christian Balzer <ch...@gol.com> wrote: > > > On Mon, 20 Nov 2017 12:38:55 +0200 Rudi Ahlers

Re: [ceph-users] how to improve performance

2017-11-20 Thread Christian Balzer
| grep /dev/sdf | grep osd > /dev/sdc1 ceph data, active, cluster ceph, osd.9, block /dev/sdc2, > block.db /dev/sdf1 > /dev/sdd1 ceph data, active, cluster ceph, osd.10, block /dev/sdd2, > block.db /dev/sdf2 > > > > I see now /dev/sda doesn't have a jour

Re: [ceph-users] Switch to replica 3

2017-11-20 Thread Christian Balzer
> ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications _

Re: [ceph-users] rocksdb: Corruption: missing start of fragmented record

2017-11-01 Thread Christian Balzer
prit is just the above rocksdb > error again. > > Q: Is there some way in which I can tell rockdb to truncate or delete / > skip the respective log entries? Or can I get access to rocksdb('s > files) in some other way to just manipulate it or delete corrupted WAL > files manual

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-20 Thread Christian Balzer
d controllers > > and we just tracked down 10 of our nodes that had >100ms await pretty much > > always were the only 10 nodes in the cluster with failed batteries on the > > raid controllers. > > > > On Thu, Oct 19, 2017, 8:15 PM Christian Balz

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-19 Thread Christian Balzer
al clean up time :20.166559 > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Wed, Oct 18, 2017 at 8:51 AM, Maged Mokhtar <mmokh...@petasan.org> > >>>>>> wrote: >

Re: [ceph-users] killing ceph-disk [was Re: ceph-volume: migration and disk partition support]

2017-10-16 Thread Christian Balzer
lf. > If the OS is on a RAID1 the chances of things being lost entirely are reduced very much, so moving OSDs to another host becomes a trivial exercise one would assume. But yeah, this sounds fine to me, as it's extremely flexible. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] list admin issues

2017-10-15 Thread Christian Balzer
of these and you're well on your way out. The default mailman settings and logic require 5 bounces to trigger unsubscription and 7 days of NO bounces to reset the counter. Christian On Mon, 16 Oct 2017 12:23:25 +0900 Christian Balzer wrote: > On Mon, 16 Oct 2017 14:15:22 +1100 Blair Bethwa
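Those thresholds map to Mailman 2.1's per-list bounce settings; a list admin can dump them with config_list (install path and list name are illustrative):

  /usr/lib/mailman/bin/config_list -o - ceph-users | grep '^bounce_'
  # bounce_score_threshold = 5.0   -> score at which delivery gets disabled
  # bounce_info_stale_after = 7    -> days without bounces before the score resets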

Re: [ceph-users] list admin issues

2017-10-15 Thread Christian Balzer
s (in Mailman lingo, it “scrubs” the messages of attachments). However, Mailman also includes links to the original attachments that the recipient can click on. --- Christian > Cheers, > > On 16 October 2017 at 13:54, Christian Balzer <ch...@gol.com> wrote: > > > > He

Re: [ceph-users] list admin issues

2017-10-15 Thread Christian Balzer
ning on roughly a monthly basis. > > Thing is I have no idea what the bounce is or where it is coming from. > I've tried emailing ceph-users-ow...@lists.ceph.com and the contact > listed in Mailman (l...@redhat.com) to get more info but haven't > received any response despite several

Re: [ceph-users] min_size & hybrid OSD latency

2017-10-10 Thread Christian Balzer
__ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications _

Re: [ceph-users] killing ceph-disk [was Re: ceph-volume: migration and disk partition support]

2017-10-09 Thread Christian Balzer
ministic behavior. Since there never was an (non-disruptive) upgrade process from non-GPT based OSDs to GPT based ones, I wonder what changed minds here. Not that the GPT based users won't appreciate it. Christian > sage > ___ > ceph-users m

Re: [ceph-users] Ceph cache pool full

2017-10-06 Thread Christian Balzer
29 osd.30 up 1.0 1.0 > >> > 31 hdd 7.27829 osd.31 up 1.0 1.0 > >> > 32 hdd 7.27829 osd.32 up 1.0 1.0 > >> > 33 hdd 7.27829 osd.33 up 1.0 1.0 >

Re: [ceph-users] Ceph cache pool full

2017-10-06 Thread Christian Balzer
> nodelete: false > > nopgchange: false > > nosizechange: false > > write_fadvise_dontneed: false > > noscrub: false > > nodeep-scrub: false > > hit_set_type: bloom > > hit_set_period: 14400 > > hit_set_count: 12 > > hit_set_fpp: 0.05 > >
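For reference, those hit_set values are per-pool settings and can be read back or adjusted with the usual pool commands (the pool name below is a placeholder):

  ceph osd pool get cache-pool all | grep hit_set
  ceph osd pool set cache-pool hit_set_type bloom
  ceph osd pool set cache-pool hit_set_period 14400
  ceph osd pool set cache-pool hit_set_count 12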

Re: [ceph-users] Ceph cache pool full

2017-10-05 Thread Christian Balzer
3 > > 0723953 10541k > > > > total_objects 355409 > > total_used 2847G > > total_avail 262T > > total_space 265T > > > > However, the data pool is completely empty! So it seems that data has only > > been written t

Re: [ceph-users] osd max scrubs not honored?

2017-09-28 Thread Christian Balzer
-09-26 13:54:44.199792 0.ec > >> > >> What’s going on here? > >> > >> Why isn’t the limit on scrubs being honored? > >> > >> It would also be great if scrub I/O were surfaced in “ceph status” the > >> way recovery I/O is, especially since it can have such a significant > >> impact on client operations. > >> > >> Thanks! > >> ___ > >> ceph-users mailing list > >> ceph-users@lists.ceph.com > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > >> > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Ceph cluster with SSDs

2017-09-12 Thread Christian Balzer
mer models again. > >>> > > >>> > Samsung also makes DC grade SSDs and NVMEs, as Adrian pointed out. > >>> > > >>> >> Btw, if we split this SSD with multiple OSD (for ex: 1 SSD with 4 or 2 > >>> >>

Re: [ceph-users] PCIe journal benefit for SSD OSDs

2017-09-07 Thread Christian Balzer
Hello, On Thu, 7 Sep 2017 08:03:31 +0200 Stefan Priebe - Profihost AG wrote: > Hello, > Am 07.09.2017 um 03:53 schrieb Christian Balzer: > > > > Hello, > > > > On Wed, 6 Sep 2017 09:09:54 -0400 Alex Gorbachev wrote: > > > >> We are planning a Je

Re: [ceph-users] PCIe journal benefit for SSD OSDs

2017-09-06 Thread Christian Balzer
le here who have actually done this, hopefully some will speak up. Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Rakuten Communications ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ce

Re: [ceph-users] Bad IO performance CephFS vs. NFS for block size 4k/128k

2017-09-04 Thread Christian Balzer
at for good IO performance only data with blocksize > 128k (I > guess > 1M) should be used. > Can anybody confirm this? > > THX > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/cep

Re: [ceph-users] [SSD NVM FOR JOURNAL] Performance issues

2017-08-24 Thread Christian Balzer
Hello, On Thu, 24 Aug 2017 14:49:24 -0300 Guilherme Steinmüller wrote: > Hello Christian. > > First of all, thanks for your considerations, I really appreciate it. > > 2017-08-23 21:34 GMT-03:00 Christian Balzer <ch...@gol.com>: > > > > > Hello, > &

Re: [ceph-users] [SSD NVM FOR JOURNAL] Performance issues

2017-08-23 Thread Christian Balzer
oughput of the journal. > How busy are your NVMe journals during that test on the Dells and HPs respectively? Same for the HDDs. Again, run longer, larger tests to get something that will actually register, also atop with shorter intervals. Christian > > Any clue about what is missing or

Re: [ceph-users] Ceph cluster with SSDs

2017-08-23 Thread Christian Balzer
On Wed, 23 Aug 2017 16:48:12 +0530 M Ranga Swami Reddy wrote: > On Mon, Aug 21, 2017 at 5:37 PM, Christian Balzer <ch...@gol.com> wrote: > > On Mon, 21 Aug 2017 17:13:10 +0530 M Ranga Swami Reddy wrote: > > > >> Thank you. > >> Here I have NVMes from I

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-23 Thread Christian Balzer
ian > Thanks, > Nick > > On Tue, Aug 22, 2017 at 6:56 PM, Christian Balzer <ch...@gol.com> wrote: > > > > > Hello, > > > > On Tue, 22 Aug 2017 16:51:47 +0800 Nick Tan wrote: > > > > > Hi Christian, > > > > >

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-22 Thread Christian Balzer
> > > > > > And > > > I don't have enough hardware to setup a test cluster of any significant > > > size to run some actual testing. > > > > > You may want to set up something to get a feeling for CephFS, if it's > > right for you or if something else

Re: [ceph-users] Cache tier unevictable objects

2017-08-22 Thread Christian Balzer
olumes listed in the cache pool, but the objects didn't change at > all, the total number was also still 39. For the rbd_header objects I > don't even know how to identify their "owner", is there a way? > > Has anyone a hint what else I could check or is it reasonable to >

Re: [ceph-users] pros/cons of multiple OSD's per host

2017-08-21 Thread Christian Balzer
> I don't have enough hardware to setup a test cluster of any significant > size to run some actual testing. > You may want to set up something to get a feeling for CephFS, if it's right for you or if something else on top of RBD may be more suitable. C
