[ceph-users] Re: backup ceph

2018-09-18 Thread Tomasz Kuzemko
Hello, a colleague of mine gave a presentation at FOSDEM about how we (OVH) do RBD backups. You might find it interesting: https://archive.fosdem.org/2018/schedule/event/backup_ceph_at_scale/ -- Tomasz Kuzemko tomasz.kuze...@corp.ovh.com
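
(For readers who only want the general shape of snapshot-based RBD backups rather than the full talk, a minimal sketch of the generic snapshot + export-diff pattern follows. Pool, image and snapshot names are illustrative, and this is not necessarily the exact mechanism OVH uses.)

    # first run: create a snapshot and export everything up to it
    rbd snap create mypool/myimage@bak-2018-09-17
    rbd export-diff mypool/myimage@bak-2018-09-17 /backups/myimage-2018-09-17.diff
    # later runs: only ship the delta between two snapshots
    rbd snap create mypool/myimage@bak-2018-09-18
    rbd export-diff --from-snap bak-2018-09-17 \
        mypool/myimage@bak-2018-09-18 /backups/myimage-2018-09-18.diff
    # restore by replaying the diff files in order with "rbd import-diff"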

[ceph-users] Re: New Ceph community manager: Mike Perez

2018-08-30 Thread Tomasz Kuzemko
Welcome Mike! You're the perfect person for this role! -- Tomasz Kuzemko tomasz.kuze...@corp.ovh.com From: ceph-users on behalf of Sage Weil Sent: Wednesday, 29 August 2018 03:13 To: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com;

[ceph-users] Re: pgs incomplete and inactive

2018-08-27 Thread Tomasz Kuzemko
Hello Josef, I would suggest setting up a bigger disk (if not a physical one, then maybe an LVM volume built from 2 smaller disks) and cloning the OSD data dir to the new disk (remember about extended attributes!), then trying to bring the OSD back into the cluster. -- Tomasz Kuzemko tomasz.kuze...@corp.ovh.com
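
(A minimal sketch of that cloning step, assuming a filestore OSD on XFS; the OSD id, device and mount point below are illustrative. The key point is copying with extended attributes and hard links preserved.)

    systemctl stop ceph-osd@87               # stop the OSD before copying
    mkfs.xfs /dev/vg0/osd-big                # the bigger device (or an LVM volume spanning 2 disks)
    mount /dev/vg0/osd-big /mnt/osd-new
    # -a keeps owners/permissions/times, -X keeps extended attributes, -H keeps hard links
    rsync -aXH /var/lib/ceph/osd/ceph-87/ /mnt/osd-new/
    umount /mnt/osd-new
    # remount the new device at /var/lib/ceph/osd/ceph-87 and start the OSD again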

Re: [ceph-users] pgs stuck unclean

2017-02-16 Thread Tomasz Kuzemko
If the PG cannot be queried, I would bet on the OSD message throttler. Check with "ceph --admin-daemon PATH_TO_ADMIN_SOCK perf dump" on each OSD holding this PG whether the message throttler's current value equals its max. If it does, increase the max value in ceph.conf and restart the OSD.
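
(A sketch of that check; the OSD id and socket path are illustrative, and jq is used only for readability. In the perf dump output each throttle-* section has a "val" and a "max" field - a throttler is saturated when val has reached max.)

    ceph --admin-daemon /var/run/ceph/ceph-osd.87.asok perf dump \
      | jq 'with_entries(select(.key | startswith("throttle-")))'
    # look for sections such as throttle-msgr_dispatch_throttler-* where "val" == "max"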

Re: [ceph-users] bcache vs flashcache vs cache tiering

2017-02-14 Thread Tomasz Kuzemko
mance increase. -- Tomasz Kuzemko tomasz.kuze...@corp.ovh.com

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-23 Thread Tomasz Kuzemko

Re: [ceph-users] Ceph and container

2016-11-15 Thread Tomasz Kuzemko

Re: [ceph-users] Can we drop ubuntu 14.04 (trusty) for kraken and luminous?

2016-11-14 Thread Tomasz Kuzemko

Re: [ceph-users] unfound objects blocking cluster, need help!

2016-10-02 Thread Tomasz Kuzemko
anyone who might run into the same problem. 2016-10-01 14:27 GMT+02:00 Tomasz Kuzemko : > Hi, > > I have a production cluster on which 1 OSD on a failing disk was slowing > the whole cluster down. I removed the OSD (osd.87) like usual in such case > but this time it resulted in 17 un
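
(The snippet above does not show which fix was ultimately used; the commands below are just the usual ones for inspecting and, as a last resort, clearing unfound objects. The pg id is illustrative.)

    ceph health detail | grep unfound
    ceph pg 1.2f3 list_unfound     # which objects are unfound and which OSDs might still hold them
    ceph pg 1.2f3 query            # shows the "might_have_unfound" peers still being probed
    # last resort, only if the data is truly gone and the loss is acceptable:
    ceph pg 1.2f3 mark_unfound_lost revert    # or "delete" instead of "revert"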

[ceph-users] unfound objects blocking cluster, need help!

2016-10-01 Thread Tomasz Kuzemko
d85016af6bd7879ef272ca5639/raw/d6fceb9acd206b75c3ce59c60bcd55a47dea7acd/osd-dump ceph health detail: https://gist.github.com/anonymous/ddb27863ecd416748ebd7ebbc036e438/raw/59ef1582960e011f10cbdbd4ccee509419b95d4e/health-detail -- Regards, Tomasz Kuzemko tom...@kuzemko.net

Re: [ceph-users] rbd pool:replica size choose: 2 vs 3

2016-09-23 Thread Tomasz Kuzemko
be > numbers. I never found some sort of calculator which can say "Oh you get > this hardware? Then a repl size of x y z is what you need." > HTH a bit. Regards, Götz

Re: [ceph-users] Fwd: lost power. monitors died. Cephx errors now

2016-08-10 Thread Tomasz Kuzemko

Re: [ceph-users] CEPH Replication

2016-07-01 Thread Tomasz Kuzemko

Re: [ceph-users] Double OSD failure (won't start) any recovery options?

2016-06-30 Thread Tomasz Kuzemko
l persists. > - I have been at HEALTH_OK every day, but overnight scrubbing has been > uncovering problematic pgs I've had to repair every single night so > far. This morning was when it went beyond my ability to repair.

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Tomasz Kuzemko
; I have read many times the post "incomplete pgs, oh my" > I think my case is different. > The broken disk is completely broken. > So how can I simply mark incomplete pgs as complete? > Should I stop ceph before?
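
(To the question quoted above: one commonly cited answer - not necessarily what was replied in this thread - is ceph-objectstore-tool's mark-complete operation, run against the stopped OSD that holds the most recent copy of the PG. The sketch below is an illustration with made-up ids and paths, not a recommendation: export the PG first, and on filestore you may also need --journal-path.)

    systemctl stop ceph-osd@12
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --pgid 3.1a --op export --file /root/pg-3.1a.export     # back up the PG first
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --pgid 3.1a --op mark-complete
    systemctl start ceph-osd@12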

Re: [ceph-users] Another cluster completely hang

2016-06-29 Thread Tomasz Kuzemko
ual machine >>> can boot >>> because ceph has stopped i/o. >>> >>> I can accept to lose some data, but not ALL data! >>> Can you help me please? >>> Thanks, >>> Mario

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Tomasz Kuzemko
Hi, my team did some benchmarks in the past to answer this question. I don't have the results at hand, but the conclusion was that it depends on how many disks/OSDs you have in a single host: above 9, there was more benefit from more cores than from higher GHz (6-core 3.5 GHz vs 10-core 2.4 GHz, AFAIR). -- T

Re: [ceph-users] Scrubbing question

2015-11-26 Thread Tomasz Kuzemko
ECC will not be able to recover the data, but it will always be able to detect that the data is corrupted. AFAIK under Linux this results in an immediate halt of the system, so it would not be able to report bad checksum data during deep-scrub. -- Tomasz Kuzemko tomasz.kuze...@corp.ovh.com

Re: [ceph-users] Scrubbing question

2015-11-26 Thread Tomasz Kuzemko
Hi, I have also seen inconsistent PGs despite md5 being the same on all objects, however all my hardware uses ECC RAM, which as I understand should prevent this type of error. To be clear - in your case, were you using an ECC or non-ECC module? -- Tomasz Kuzemko tomasz.kuze...@ovh.net

Re: [ceph-users] Upgrade to hammer, crush tuneables issue

2015-11-26 Thread Tomasz Kuzemko
ine will probably take effect only for PGs on which backfill has not yet started, which can explain why you did not see an immediate effect of changing these on the fly. -- Tomasz Kuzemko tom...@kuzemko.net

Re: [ceph-users] Upgrade to hammer, crush tuneables issue

2015-11-25 Thread Tomasz Kuzemko
filestore_split_multiple": "2", > "filestore_update_to": "1000", > "filestore_blackhole": "false", > "filestore_fd_cache_size": "128

Re: [ceph-users] Improving Performance with more OSD's?

2014-12-29 Thread Tomasz Kuzemko
On Sun, Dec 28, 2014 at 02:49:08PM +0900, Christian Balzer wrote: > You really, really want size 3 and a third node for both performance > (reads) and redundancy. How does it benefit read performance? I thought all reads are made only from the active primary OSD. -- Tomasz Kuzemko tomas

Re: [ceph-users] IO Hang on rbd

2014-12-15 Thread Tomasz Kuzemko
Try lowering "filestore max sync interval" and "filestore min sync interval". It looks like during the hang, data is flushed from some overly large buffer. If this does not help, you can monitor perf stats on the OSDs to see if some queue is unusually large. -- Tomasz
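
(A quick way to test that theory at runtime before persisting anything in ceph.conf; the values below are only examples.)

    ceph tell osd.* injectargs '--filestore_max_sync_interval 1 --filestore_min_sync_interval 0.1'
    # if the hangs disappear, set the same options in the [osd] section of ceph.conf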

Re: [ceph-users] unable to repair PG

2014-12-11 Thread Tomasz Kuzemko
Be very careful with running "ceph pg repair". Have a look at this thread: http://thread.gmane.org/gmane.comp.file-systems.ceph.user/15185 -- Tomasz Kuzemko tomasz.kuze...@ovh.net On Thu, Dec 11, 2014 at 10:57:22AM +, Luis Periquito wrote: > Hi, > > I've stoppe

Re: [ceph-users] Tool or any command to inject metadata/data corruption on rbd

2014-12-04 Thread Tomasz Kuzemko
For metadata corruption you would have to modify the object file's extended attributes (with xattr, for example). -- Tomasz Kuzemko tomasz.kuze...@ovh.net On Thu, Dec 04, 2014 at 02:26:56PM +0100, Sebastien Han wrote: > AFAIK there is no tool to do this. > You simply rm object or dd a new
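
(A sketch of injecting such metadata corruption on a filestore OSD; the object path is illustrative and the attribute names vary by version, so list them first. Only do this on a test cluster.)

    OBJ=/var/lib/ceph/osd/ceph-0/current/1.2f_head/<object-file>   # pick a real object file
    getfattr -d -m - -e hex "$OBJ"                 # list the ceph xattrs actually present
    setfattr -n user.ceph._ -v 0xdeadbeef "$OBJ"   # overwrite one of them with garbage
    # a subsequent deep-scrub of the PG should now flag the inconsistency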

Re: [ceph-users] Problems with pgs incomplete

2014-12-01 Thread Tomasz Kuzemko
.intent-log' replicated size 3 min_size 2 crush_ruleset 0 > object_hash rjenkins pg_num 8 pgp_num 8 last_change 213 flags hashpspool > stripe_width 0 > pool 15 '' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins > pg_num 8 pgp_num 8 last_change 238 flags hashpspool stripe_width 0 > Some of your pools have size = 3. > -- > With regards, > Stanislav Butkeev -- Tomasz Kuzemko tomasz.kuze...@ovh.net

Re: [ceph-users] What is the state of filestore sloppy CRC?

2014-11-25 Thread Tomasz Kuzemko
On Tue, Nov 25, 2014 at 07:10:26AM -0800, Sage Weil wrote: > On Tue, 25 Nov 2014, Tomasz Kuzemko wrote: > > Hello, > > as far as I can tell, Ceph does not make any guarantee that reads from an > > object return what was actually written to it. In other words, it does not >

Re: [ceph-users] What is the state of filestore sloppy CRC?

2014-11-25 Thread Tomasz Kuzemko
se it in production? Are there any >> considerations one should make before enabling it? Is it safe to >> enable it on an existing cluster? >> -- >> Tomasz Kuzemko >> tom...@kuzemko.net

[ceph-users] What is the state of filestore sloppy CRC?

2014-11-25 Thread Tomasz Kuzemko
it's merged since Emperor. Getting back to my actual question - what is the state of "filestore sloppy crc"? Does anyone actually use it in production? Are there any considerations one should make before enabling it? Is it safe to enable it on an existing cluster?
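
(For reference, turning it on is just a ceph.conf fragment like the one below. "filestore sloppy crc" is the option discussed in this thread; the block-size knob and its value are my assumption - check the config reference for your release before relying on them - and note that checksums only start covering data written after the option is enabled.)

    [osd]
        filestore sloppy crc = true
        # assumed companion option and value, verify for your version:
        filestore sloppy crc block size = 65536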