Re: [ceph-users] Omap issues - metadata creating too many

2019-01-03 Thread Josef Zelenka
…this happens even with BlueStore. Is there anything we can do to clean up the omap manually? Josef. On 18/12/2018 23:19, J. Eric Ivancich wrote: On 12/17/18 9:18 AM, Josef Zelenka wrote: Hi everyone, I'm running a Luminous 12.2.5 cluster with 6 hosts on Ubuntu 16.04 - 12 HDDs for data each, plus 2…
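
A hedged sketch for checking where the omap keys actually live before trimming anything - the pool name default.rgw.usage and the trim dates below are assumptions, not taken from the thread:

  # count omap keys per object in a suspect RGW metadata/log pool
  for obj in $(rados -p default.rgw.usage ls); do
      echo "$obj: $(rados -p default.rgw.usage listomapkeys "$obj" | wc -l) keys"
  done
  # if the usage log is the culprit and is not needed, it can be trimmed
  radosgw-admin usage trim --start-date=2018-01-01 --end-date=2018-12-01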

[ceph-users] Omap issues - metadata creating too many

2018-12-17 Thread Josef Zelenka
…caused by some data influx. It seems like some kind of a bug to me, to be honest, but I'm not certain - has anyone else seen this behavior with their radosgw? Thanks a lot, Josef Zelenka, Cloudevelops

Re: [ceph-users] pgs incomplete and inactive

2018-08-27 Thread Josef Zelenka
…delete a whole PG on the full OSD's file system to make space (preferably one that is already recovered and active+clean even without the dead OSD). Paul. 2018-08-27 10:44 GMT+02:00 Josef Zelenka: Hi, I've had a very ugly thing happen to me over the weekend. Some of my OSDs in a root that handles…
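
One possible shape of Paul's suggestion, sketched with ceph-objectstore-tool - the OSD id, journal path and PG id are placeholders, the OSD daemon must be stopped, and taking an export before removing anything is strongly advisable:

  # back up the PG from the full, stopped OSD, then remove it to free space
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --journal-path /var/lib/ceph/osd/ceph-12/journal \
      --pgid 7.1a --op export --file /root/pg.7.1a.export
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --journal-path /var/lib/ceph/osd/ceph-12/journal \
      --pgid 7.1a --op remove        # recent releases also require --force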

[ceph-users] pgs incomplete and inactive

2018-08-27 Thread Josef Zelenka
Hi, I've had a very ugly thing happen to me over the weekend. Some of my OSDs in a root that handles metadata pools overflowed to 100% disk usage due to omap size (even though I had a 97% full ratio, which is odd) and refused to start. There were some PGs on those OSDs that went away with them.

Re: [ceph-users] OSD had suicide timed out

2018-08-09 Thread Josef Zelenka
…shortage), then you need to find out why else osd.5, etc. could not contact it. On Wed, Aug 8, 2018 at 6:47 PM, Josef Zelenka wrote: Checked the system load on the host with the OSD that is currently suiciding and it's fine, however I can see noticeably higher IO (around 700), though that seems…

Re: [ceph-users] OSD had suicide timed out

2018-08-08 Thread Josef Zelenka
…show you some message debugging which may help. On Tue, Aug 7, 2018 at 10:34 PM, Josef Zelenka wrote: To follow up, I did some further digging with debug_osd=20/20 and it appears as if there's no traffic to the OSD, even though it comes UP for the cluster (this started happening on another…

Re: [ceph-users] OSD had suicide timed out

2018-08-07 Thread Josef Zelenka
…I found this thread in the Ceph mailing list (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018956.html), but I'm not sure if this is the same thing (albeit it's the same error), as I don't use S3 ACLs/expiration in my cluster (if it's set to a default, I'm not aware…

[ceph-users] OSD had suicide timed out

2018-08-06 Thread Josef Zelenka
Hi, I'm running a cluster on Luminous (12.2.5), Ubuntu 16.04 - the configuration is 3 nodes, 6 drives each (though I have encountered this on a different cluster with similar hardware, only the drives were HDD instead of SSD - same usage). I have recently seen a bug(?) where one of the OSDs suddenly…
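
If the timeouts turn out to be load-related rather than a hung disk, one stopgap while investigating (a hedged suggestion, not from this thread, and not a fix for the root cause) is to raise the op-thread timeouts on the affected OSD:

  # values are examples; osd.11 is a placeholder
  ceph tell osd.11 injectargs '--osd_op_thread_timeout 90 --osd_op_thread_suicide_timeout 300'
  # then watch for slow ops while the OSD is under load
  ceph daemon osd.11 dump_historic_ops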

Re: [ceph-users] Best way to replace OSD

2018-08-06 Thread Josef Zelenka
Hi, our procedure is usually (assuming the cluster was OK before the failure, with 2 replicas as the CRUSH rule): 1. Stop the OSD process (to keep it from coming up and down and putting load on the cluster). 2. Wait for the "reweight" to come to 0 (happens after 5 min, I think - it can be set manually, but I…
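
A minimal sketch of that procedure as commands - osd.7 is a placeholder, and this assumes the cluster can rebalance with the disk gone:

  systemctl stop ceph-osd@7          # 1. stop the failing OSD
  ceph osd out 7                     # 2. mark it out and let backfill run
  ceph -s                            #    wait until all PGs are active+clean again
  ceph osd crush remove osd.7        # 3. drop it from the CRUSH map
  ceph auth del osd.7
  ceph osd rm 7
  # finally zap the replacement disk and create a new OSD (ceph-disk / ceph-volume)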

[ceph-users] Erasure coded pools - overhead, data distribution

2018-07-26 Thread Josef Zelenka
[truncated 'ceph osd df' output - last OSD row: 5588G size, 1455M used, 5587G avail, 0.03 %USE, 0 PGs; TOTAL: 579T size, 346T used, 232T avail, 59.80 %USE; MIN/MAX VAR: 0/1.55, STDDEV: 23.04] Thanks in advance for any help, I find it very hard to wrap my head around this. Josef Zelenka, Cloudevelops
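
For reference, the raw overhead of an erasure-coded pool works out to (k+m)/k - e.g. k=3, m=2 stores roughly 1.67 bytes raw per byte of data. A hedged way to check which profile a pool is really using (pool and profile names are placeholders):

  ceph osd pool get my-ec-pool erasure_code_profile
  ceph osd erasure-code-profile get my-ec-profile    # prints k, m, plugin, crush-failure-domain
  ceph df detail                                     # per-pool usage to sanity-check against (k+m)/k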

Re: [ceph-users] NFS-ganesha with RGW

2018-05-30 Thread Josef Zelenka
…ganesha.nfsd to fail to bind to a port is that a Linux kernel nfsd is already running - can you make sure that's not the case; meanwhile you -do- need rpcbind to be running. Matt. On Wed, May 30, 2018 at 6:03 AM, Josef Zelenka wrote: Hi everyone, I'm currently trying to set up an NFS-ganesha instance that…
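
A quick way to check both of the conditions Matt mentions - the service names are as packaged on Ubuntu 16.04 and are an assumption:

  systemctl status nfs-kernel-server    # the kernel NFS server must not hold port 2049
  systemctl stop nfs-kernel-server      # stop it if it is running
  ss -tlnp | grep 2049                  # confirm nothing is still listening
  systemctl start rpcbind               # ganesha.nfsd needs rpcbind to register
  systemctl restart nfs-ganesha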

[ceph-users] NFS-ganesha with RGW

2018-05-30 Thread Josef Zelenka
Hi everyone, I'm currently trying to set up an NFS-ganesha instance that mounts RGW storage, however I'm not successful in this. I'm running Ceph Luminous 12.2.4 and Ubuntu 16.04. I tried compiling ganesha from source (latest version), however I didn't manage to get the mount running with that,…

[ceph-users] Issues with RBD when rebooting

2018-05-25 Thread Josef Zelenka
Hi, we are running a Jewel cluster (54 OSDs, six nodes, Ubuntu 16.04) that serves as a backend for OpenStack (Newton) VMs. Today we had to reboot one of the nodes (replicated pool, x2) and some of our VMs oopsed with issues with their FS (mainly database VMs, PostgreSQL) - is there a reason for…
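
For planned reboots, the usual precaution (noted here as a hedged aside - it does not by itself explain filesystem errors inside the guests) is to stop the cluster from rebalancing while the node is down:

  ceph osd set noout       # before rebooting the node
  # reboot, wait for the node's OSDs to come back up and in
  ceph osd unset noout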

Re: [ceph-users] Cephfs write fail when node goes down

2018-05-15 Thread Josef Zelenka
…/05/18 02:57, Yan, Zheng wrote: On Mon, May 14, 2018 at 5:37 PM, Josef Zelenka <josef.zele...@cloudevelops.com> wrote: Hi everyone, we've encountered an unusual thing in our setup (4 nodes, 48 OSDs, 3 monitors - Ceph Jewel, Ubuntu 16.04 with kernel 4.4.0). Yesterday, we were doing a HW upgrade…

[ceph-users] Cephfs write fail when node goes down

2018-05-14 Thread Josef Zelenka
Hi everyone, we've encountered an unusual thing in our setup (4 nodes, 48 OSDs, 3 monitors - Ceph Jewel, Ubuntu 16.04 with kernel 4.4.0). Yesterday, we were doing a HW upgrade of the nodes, so they went down one by one - the cluster was in good shape during the upgrade, as we've done this…

[ceph-users] RGW multisite sync issues

2018-04-06 Thread Josef Zelenka
…who knows what might be wrong? I can supply any needed info. Thanks, Josef Zelenka
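
For anyone hitting something similar, the usual first diagnostics on the secondary zone look like this (the source-zone name is a placeholder):

  radosgw-admin sync status                                  # overall metadata/data sync state
  radosgw-admin data sync status --source-zone=master-zone   # per-zone data sync detail
  radosgw-admin sync error list                              # persisted sync errors, if any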

Re: [ceph-users] Radosgw halts writes during recovery, recovery info issues

2018-03-26 Thread Josef Zelenka
Forgot to mention - we are running Jewel, 10.2.10. On 26/03/18 11:30, Josef Zelenka wrote: Hi everyone, I'm currently fighting an issue in a cluster we have for a customer. It's used for a lot of small files (113M currently) that are pulled via radosgw. We have 3 nodes, 24 OSDs in total…
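
A hedged set of knobs for keeping recovery from starving radosgw traffic on Jewel - the values are illustrative, not recommendations from this thread:

  ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  ceph tell 'osd.*' injectargs '--osd-recovery-op-priority 1 --osd-client-op-priority 63'
  ceph tell 'osd.*' injectargs '--osd-recovery-sleep 0.1'   # throttle recovery reads/writes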

Re: [ceph-users] Mapping faulty pg to file on cephfs

2018-02-13 Thread Josef Zelenka
Oh, sorry, forgot to mention - this cluster is running Jewel :( On 13/02/18 12:10, John Spray wrote: On Tue, Feb 13, 2018 at 10:38 AM, Josef Zelenka <josef.zele...@cloudevelops.com> wrote: Hi everyone, one of the clusters we are running for a client recently had a power outage, it's currently…

[ceph-users] Mapping faulty pg to file on cephfs

2018-02-13 Thread Josef Zelenka
…Josef Zelenka
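
One hedged way to map objects in a problematic PG back to CephFS paths (the pool name, object name, and mount point below are made-up examples): data objects in the CephFS data pool are named <inode-hex>.<block-number>, so the hex inode can be converted to a decimal inode number and looked up with find.

  ceph osd map cephfs_data 10000001a2b.00000000   # confirm which PG/OSDs an object maps to
  printf '%d\n' 0x10000001a2b                     # hex inode -> decimal inode number
  find /mnt/cephfs -inum 1099511634475            # locate the file that owns that inode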

[ceph-users] Inconsistent PG - failed to pick suitable auth object

2018-01-29 Thread Josef Zelenka
…the same size on both primary and secondary, and even copying the identical object from the secondary to the primary, but nothing seems to work. Any pointers regarding this? Thanks, Josef Zelenka
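
For reference, on Jewel the inconsistency details and a repair attempt are usually driven with the following (the PG id is a placeholder; repair may still fail if no authoritative copy can be picked):

  rados list-inconsistent-obj 11.2f --format=json-pretty   # which shard/size/checksum differs
  ceph pg repair 11.2f                                     # ask the primary to repair from an authoritative copy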

Re: [ceph-users] how to get bucket or object's ACL?

2018-01-29 Thread Josef Zelenka
Hi, this should be possible via the s3cmd tool: s3cmd info s3:/// (e.g. s3cmd info s3://PP-2015-Tut/). Here is more info - https://kunallillaney.github.io/s3cmd-tutorial/. I have successfully used this tool in the past for ACL management, so I hope it's gonna work for you too. JZ. On 29/01/18…
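
A hedged illustration of what ACL management with s3cmd looks like in practice - the bucket, object, and user names are placeholders:

  s3cmd info s3://my-bucket/thumb.jpg                              # show current ACL grants
  s3cmd setacl s3://my-bucket/thumb.jpg --acl-grant=read:someuser  # grant read to one user
  s3cmd setacl s3://my-bucket/thumb.jpg --acl-public               # or make it world-readable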

Re: [ceph-users] Cluster crash - FAILED assert(interval.last > last)

2018-01-11 Thread Josef Zelenka
I have posted logs/strace from our OSDs with details to a ticket in the Ceph bug tracker - see here: http://tracker.ceph.com/issues/21142. You can see where exactly the OSDs crash, etc.; this can be of help if someone decides to debug it. JZ. On 10/01/18 22:05, Josef Zelenka wrote: Hi, today…

Re: [ceph-users] How to speed up backfill

2018-01-10 Thread Josef Zelenka
From: Josef Zelenka <josef.zele...@cloudevelops.com> Sent: 2018-01-11 04:53 Subject: Re: [ceph-users] How to speed up backfill To: "shadow_lin" <shadow_...@163.com> Cc: Hi, I had the same issue a few days…

Re: [ceph-users] How to speed up backfill

2018-01-10 Thread Josef Zelenka
On 10/01/18 21:53, Josef Zelenka wrote: Hi, I had the same issue a few days back. I tried playing around with these two: ceph tell 'osd.*' injectargs '--osd-max-backfills <n>' and ceph tell 'osd.*' injectargs '--osd-recovery-max-active <n>', and it helped greatly (increased our recovery speed 20x…
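
For the record, with concrete (purely illustrative) values filled in, those two lines look like:

  ceph tell 'osd.*' injectargs '--osd-max-backfills 8'
  ceph tell 'osd.*' injectargs '--osd-recovery-max-active 8'
  # drop back to conservative values (e.g. 1 and 3) once the backfill has finished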

[ceph-users] Cluster crash - FAILED assert(interval.last > last)

2018-01-10 Thread Josef Zelenka
…however I can't find any info regarding this or how to fix it. Did someone here also encounter it? We're running Luminous on Ubuntu 16.04. Thanks, Josef Zelenka, Cloudevelops

[ceph-users] determining the source of io in the cluster

2017-12-18 Thread Josef Zelenka
…or advice is greatly appreciated. Thanks, Josef Zelenka, Cloudevelops
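
Hedged starting points for tracking down where client IO is coming from - the OSD id is a placeholder:

  ceph osd pool stats                    # per-pool client and recovery IO rates
  ceph daemon osd.3 ops                  # ops currently in flight on one OSD, with client addresses
  ceph daemon osd.3 dump_historic_ops    # recent slow ops on that OSD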

[ceph-users] A new SSD for journals - everything sucks?

2017-10-11 Thread Josef Zelenka
…/knowledge about some good price/performance SSDs for Ceph journaling? I can also share the complete benchmarking data my coworker made, if someone is interested. Thanks, Josef Zelenka
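
The usual quick screen for journal suitability is single-job, queue-depth-1 synchronous 4k writes; a hedged fio invocation (the device path is a placeholder and the test overwrites data on it):

  fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 \
      --runtime=60 --time_based --group_reporting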

Re: [ceph-users] Large amount of files - cephfs?

2017-09-29 Thread Josef Zelenka
Hi everyone, thanks for the advice - we talked it over and we're going to test it out with CephFS first. Object storage is a possibility if it misbehaves. Hopefully it will go well :) On 28/09/17 08:20, Henrik Korkuc wrote: On 17-09-27 14:57, Josef Zelenka wrote: Hi, we are currently working…

[ceph-users] Large amount of files - cephfs?

2017-09-27 Thread Josef Zelenka
Hi, we are currently working on a Ceph solution for one of our customers. They run a file-hosting service and need to store approximately 100 million pictures (thumbnails). Their current code works with FTP, which they use as storage. We thought we could use CephFS for this, but I am…

[ceph-users] RADOSGW S3 api ACLs

2017-02-16 Thread Josef Zelenka
…only by a specific user. Currently I was able to set the ACLs I want on existing files, but I want them to be set up in a way that applies automatically, i.e. to the entire bucket. Can anyone shed some light on ACLs in the S3 API and RGW? Thanks, Josef Zelenka, Cloudevelops
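
A hedged sketch of applying a grant across everything already in a bucket (the names are placeholders). Note that S3 ACLs are per object, so newly uploaded files still need the grant applied at upload time or covered by a bucket policy:

  s3cmd setacl s3://my-bucket --acl-grant=read:someuser --recursive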