Re: [ceph-users] Mimic upgrade failure

2018-09-18 Thread KEVIN MICHAEL HRPCEK
Sage, Unfortunately the mon election problem came back yesterday, and it makes it really hard to keep the cluster healthy. A brief unexpected network outage sent the cluster into a frenzy, and when I had it 95% healthy the mons started their nonstop re-elections. In the previous

Re: [ceph-users] network architecture questions

2018-09-18 Thread Erik McCormick
On Tue, Sep 18, 2018, 7:56 PM solarflow99 wrote: > thanks for the replies, I don't know that cephFS clients go through the MONs, they reach the OSDs directly. When I mentioned NFS, I meant NFS clients (i.e. not cephFS clients). This should have been pretty straightforward. Anyone doing

[ceph-users] [RGWRados]librados: Objecter returned from getxattrs r=-36

2018-09-18 Thread fatkun chan
ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable). I have a file with a long name; when I cat the file through the minio client, this error shows: librados: Objecter returned from getxattrs r=-36. The log comes from radosgw: 2018-09-15 03:38:24.763109 7f833c0ed700 2 req
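For reference, r=-36 maps to errno 36 (ENAMETOOLONG, "File name too long"), which fits the long-name theory; a quick local check, nothing Ceph-specific:

    python3 -c 'import os, errno; print(errno.errorcode[36], os.strerror(36))'

On Linux this should print "ENAMETOOLONG File name too long".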

Re: [ceph-users] backup ceph

2018-09-18 Thread ST Wong (ITSC)
Hi, Thanks for your help. > I assume that you are speaking of rbd only Yes, as we just started studying Ceph, we are only aware of backing up RBD. Will there be other areas that need backup? Sorry for my ignorance. > Taking snapshot of rbd volumes and keeping all of them on the cluster is fine

Re: [ceph-users] network architecture questions

2018-09-18 Thread Jean-Charles Lopez
They don’t go through the MONs for IOs, but they need access to the MONs over the public network for authentication and to receive the cluster map. JC While moving. Excuse unintended typos. > On Sep 18, 2018, at 17:51, Jean-Charles Lopez wrote: > > Hi > > You deploy 3 MONs on a production

Re: [ceph-users] network architecture questions

2018-09-18 Thread solarflow99
thanks for the replies, I don't know that cephFS clients go through the MONs, they reach the OSDs directly. When I mentioned NFS, I meant NFS clients (ie. not cephFS clients) This should have been pretty straight forward. Anyone doing HA on the MONs? How do you mount the cephFS shares, surely

[ceph-users] total_used statistic incorrect

2018-09-18 Thread Mike Cave
Greetings, I’ve recently run into an issue with my new Mimic deploy. I created some pools and volumes and did some general testing. In total, there was about 21 TiB used. Once testing was completed, I deleted the pools and thus thought I had deleted the data. However, the ‘total_used’
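Pool deletion reclaims space asynchronously as the PGs and their objects are removed, so a sketch of what to watch while waiting (nothing here is specific to this cluster):

    ceph df        # per-pool and total usage
    ceph osd df    # per-OSD utilisation; total_used should drift down over time

If the figure never drops, that points at something other than delayed PG removal.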

Re: [ceph-users] lost osd while migrating EC pool to device-class crush rules

2018-09-18 Thread Graham Allan
On 09/17/2018 04:33 PM, Gregory Farnum wrote: On Mon, Sep 17, 2018 at 8:21 AM Graham Allan wrote: Looking back through history it seems that I *did* override the min_size for this pool; however, I didn't reduce it - it used to have min_size 2! That made no
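For anyone following along, min_size on a pool can be inspected and adjusted like this (the pool name is a placeholder; for an EC pool, k+1 is the commonly recommended floor):

    ceph osd pool get <pool> min_size
    ceph osd pool set <pool> min_size <k+1>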

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread David Turner
Here's the [1] full log from the time the OSD was started to the end of the crash dump. These logs are so hard to parse. Is there anything useful in them? I did confirm that all perms were set correctly and that the superblock was changed to rocksdb before the first time I attempted to start

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread Pavan Rallabhandi
I meant the stack trace hints that the superblock still has leveldb in it; have you verified that already? On 9/18/18, 5:27 PM, "Pavan Rallabhandi" wrote: You should be able to set them under the global section and that reminds me, since you are on Luminous already, I guess those values

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread Pavan Rallabhandi
You should be able to set them under the global section, and that reminds me: since you are on Luminous already, I guess those values are already the default; you can verify from the admin socket of any OSD. But the stack trace didn’t hint that the superblock on the OSD is still considering
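A minimal sketch of that admin-socket check (osd.0 is just an example id):

    ceph daemon osd.0 config get filestore_omap_backend
    ceph daemon osd.0 config get filestore_rocksdb_options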

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread David Turner
Are those settings fine to set globally even if not all OSDs on a node have rocksdb as the backend? Or will I need to convert all OSDs on a node at the same time? On Tue, Sep 18, 2018 at 5:02 PM Pavan Rallabhandi <prallabha...@walmartlabs.com> wrote: > The steps that were outlined for

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread Pavan Rallabhandi
The steps that were outlined for conversion are correct; have you tried setting some of the relevant ceph conf values too: filestore_rocksdb_options = "max_background_compactions=8;compaction_readahead_size=2097152;compression=kNoCompression" filestore_omap_backend = rocksdb Thanks, -Pavan.
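As a sketch, those two options as they would look in ceph.conf (an [osd] section works as well as [global] if you only want them applied to OSDs):

    [global]
    filestore_omap_backend = rocksdb
    filestore_rocksdb_options = "max_background_compactions=8;compaction_readahead_size=2097152;compression=kNoCompression"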

[ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread David Turner
I've finally learned enough about the OSD backend to track down this issue to what I believe is the root cause. LevelDB compaction is the common thread every time we move data around our cluster. I've ruled out PG subfolder splitting, EC doesn't seem to be the root cause of this, and it is cluster

Re: [ceph-users] network architecture questions

2018-09-18 Thread Jonathan D. Proulx
On Tue, Sep 18, 2018 at 12:33:21PM -0700, solarflow99 wrote: :Hi, anyone able to answer these few questions? I'm not using CephFS but for RBD (my primary use case) clients also access OSDs directly. I use separate cluster and public networks mainly so replication bandwidth and client bandwidth

Re: [ceph-users] network architecture questions

2018-09-18 Thread Paul Emmerich
I would almost never separate the cluster and backend networks; it usually creates more problems than it solves. Paul 2018-09-18 21:37 GMT+02:00 Jean-Charles Lopez: >> On Sep 17, 2018, at 16:13, solarflow99 wrote: >> >> Hi, I read through the various documentation and had a few questions: >>

[ceph-users] (no subject)

2018-09-18 Thread Kevin Olbrich
Hi! is the compressible hint / incompressible hint supported on qemu+kvm? http://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/ If not, only aggressive would work in this case for rbd, right? Kind regards Kevin

Re: [ceph-users] network architecture questions

2018-09-18 Thread Jean-Charles Lopez
> On Sep 17, 2018, at 16:13, solarflow99 wrote: > > Hi, I read through the various documentation and had a few questions: > > - From what I understand cephFS clients reach the OSDs directly, does the > cluster network need to be opened up as a public network? Client traffic only goes over the
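As a sketch of that split (subnets are placeholders): client and MON traffic rides the public network, while OSD replication and recovery can use the cluster network:

    [global]
    public network  = 192.168.1.0/24   # MONs and clients
    cluster network = 192.168.2.0/24   # OSD replication / recovery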

Re: [ceph-users] network architecture questions

2018-09-18 Thread solarflow99
Hi, anyone able to answer these few questions? On Mon, Sep 17, 2018 at 4:13 PM solarflow99 wrote: > Hi, I read through the various documentation and had a few questions: > > - From what I understand cephFS clients reach the OSDs directly, does the > cluster network need to be opened up as a

Re: [ceph-users] No fix for 0x6706be76 CRCs ?

2018-09-18 Thread Alfredo Daniel Rezinovsky
I started seeing this after the upgrade to Bionic. I had Xenial with LTS kernels (4.13) without problems. I will try to change to the Ubuntu 4.13 kernel and wait for the logs. Thanks On 18/09/18 16:27, Paul Emmerich wrote: Yeah, it's very likely a kernel bug (that no one managed to reduce to a simpler test

Re: [ceph-users] No fix for 0x6706be76 CRCs ?

2018-09-18 Thread Paul Emmerich
Yeah, it's very likely a kernel bug (that no one managed to reduce to a simpler test case or even to reproduce it reliably with reasonable effort on a test system). 4.9 and earlier aren't affected as far as we can tell; we only encountered this after upgrading. But I think Bionic ships with a

Re: [ceph-users] No fix for 0x6706be76 CRCs ?

2018-09-18 Thread Alfredo Daniel Rezinovsky
MOMENT!!! "Some kernels (4.9+) sometimes fail to return data when reading from a block device under memory pressure." I didn't know that was the problem. Can't I just downgrade the kernel? Are there known working versions, or do they just need to be prior to 4.9? On 18/09/18 16:19, Paul Emmerich wrote:

Re: [ceph-users] No fix for 0x6706be76 CRCs ?

2018-09-18 Thread Paul Emmerich
We built a work-around here: https://github.com/ceph/ceph/pull/23273 It hasn't been backported, but we'll ship 13.2.2 in our Debian packages for the croit OS image. Paul 2018-09-18 21:10 GMT+02:00 Alfredo Daniel Rezinovsky: > Changed all my hardware. Now I have plenty of free ram. swap

[ceph-users] https://ceph-storage.slack.com

2018-09-18 Thread Alfredo Daniel Rezinovsky
Can anyone add me to this Slack? My email is alfrenov...@gmail.com. Thanks. -- Alfredo Daniel Rezinovsky Director de Tecnologías de Información y Comunicaciones Facultad de Ingeniería - Universidad Nacional de Cuyo

[ceph-users] No fix for 0x6706be76 CRCs ?

2018-09-18 Thread Alfredo Daniel Rezinovsky
Changed all my hardware. Now I have plenty of free RAM, swap is never needed, iowait is low, and still: 7fdbbb73e700 -1 bluestore(/var/lib/ceph/osd/ceph-6) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x1e000, got 0x6706be76, expected 0x85a3fefe, device location [0x25ac04be000~1000],

Re: [ceph-users] Slow requests from bluestore osds

2018-09-18 Thread Augusto Rodrigues
I solved my slow requests by increasing the size of block.db. Calculate 4% per stored TB, and preferably host the DB on NVMe.
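A rough worked example of that rule of thumb (figures purely illustrative):

    10 TB stored per OSD x 4% = ~400 GB block.db per OSD (ideally on NVMe)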

[ceph-users] Odp.: backup ceph

2018-09-18 Thread Tomasz Kuzemko
Hello, a colleague of mine gave a presentation at FOSDEM about how we (OVH) do RBD backups. You might find it interesting: https://archive.fosdem.org/2018/schedule/event/backup_ceph_at_scale/ -- Tomasz Kuzemko tomasz.kuze...@corp.ovh.com

Re: [ceph-users] backup ceph

2018-09-18 Thread ceph
Hi, I assume that you are speaking of rbd only. Taking snapshots of rbd volumes and keeping all of them on the cluster is fine. However, this is not a backup: a snapshot is only a backup if it is exported off-site. On 09/18/2018 11:54 AM, ST Wong (ITSC) wrote: > Hi, > > We're newbie to Ceph. Besides
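A minimal sketch of the "export it off-site" part, assuming an initial snapshot already exists and that pool, image, and host names are placeholders:

    rbd snap create rbd/vm1@2018-09-18
    rbd export-diff --from-snap 2018-09-17 rbd/vm1@2018-09-18 - \
      | ssh backuphost 'rbd import-diff - backup/vm1'

For the very first run you would omit --from-snap (or do a full rbd export).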

Re: [ceph-users] Dashboard Object Gateway

2018-09-18 Thread Hendrik Peyerl
Hi Lenz, Would you mind sharing the commands that you used to configure the RGW connection details? The RGW node was installed as documented: ceph-deploy install --rgw $SERVERNAME ceph-deploy rgw create $SERVERNAME After the installation the server was only listening on 0.0.0.0. I then

Re: [ceph-users] Dashboard Object Gateway

2018-09-18 Thread Lenz Grimmer
Hi Hendrik, On 09/18/2018 12:57 PM, Hendrik Peyerl wrote: > we just deployed an Object Gateway to our CEPH Cluster via ceph-deploy > in an IPv6 only Mimic Cluster. To make sure the RGW listens on IPv6 we > set the following config: > rgw_frontends = civetweb port=[::]:7480 > > We now tried to

[ceph-users] bluestore compression enabled but no data compressed

2018-09-18 Thread Frank Schilder
I seem to have a problem getting bluestore compression to do anything. I followed the documentation and enabled bluestore compression on various pools by executing "ceph osd pool set compression_mode aggressive". Unfortunately, it seems like no data is compressed at all. As an example, below
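A couple of ways to check whether anything is actually being compressed (the pool name and osd id are examples; the bluestore compression perf counters should become non-zero once compression kicks in):

    ceph osd pool get <pool> compression_mode
    ceph daemon osd.0 perf dump | grep -i compress

Note that compression only applies to data written after the mode was set; existing objects are not rewritten.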

[ceph-users] Dashboard Object Gateway

2018-09-18 Thread Hendrik Peyerl
Hello all, we just deployed an Object Gateway to our CEPH Cluster via ceph-deploy in an IPv6 only Mimic Cluster. To make sure the RGW listens on IPv6 we set the following config: rgw_frontends = civetweb port=[::]:7480 We now tried to enable the dashboard functionality for said gateway but
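For the dashboard side, Mimic expects the RGW API endpoint and credentials to be set explicitly, along the lines of the following (all values are placeholders); whether it accepts a bare IPv6 literal as the host is exactly the kind of thing worth testing here:

    ceph dashboard set-rgw-api-access-key <access-key>
    ceph dashboard set-rgw-api-secret-key <secret-key>
    ceph dashboard set-rgw-api-host <rgw-host-or-ipv6-address>
    ceph dashboard set-rgw-api-port 7480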

Re: [ceph-users] mount cephfs without tiering

2018-09-18 Thread Konstantin Shalygin
I have cephfs with tiering. Does anyone know if it's possible to mount the file system so that the tiering is not used? I.e. I want to mount cephfs on the backup server without tiering and on the samba server with tiering. Is that possible?

[ceph-users] backup ceph

2018-09-18 Thread ST Wong (ITSC)
Hi, We're newbies to Ceph. Besides using incremental snapshots with RBD to back up data from one Ceph cluster to another running Ceph cluster, or using backup tools like backy2, will there be any recommended way to back up Ceph data? Someone here suggested taking snapshots of RBD daily and

[ceph-users] radosgw bucket stats vs s3cmd du

2018-09-18 Thread Luis Periquito
Hi all, I have a couple of very big s3 buckets that store temporary data. We keep writing files to the buckets, which are then read and deleted; they serve as temporary storage. We're writing (and deleting) circa 1TB of data daily in each of those buckets, and their size has been mostly
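Two things worth comparing when bucket accounting and client-side du disagree (the bucket name is a placeholder): the bucket index stats and the RGW garbage-collection queue, since space from deleted objects is only reclaimed after GC runs:

    radosgw-admin bucket stats --bucket=<name>
    radosgw-admin gc list --include-all | head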