Re: [ceph-users] RGW multisite sync data sync shard stuck

2017-06-05 Thread Andreas Calminder
Hello, I'm using Ceph jewel (10.2.7) and, as far as I know, the jewel multisite setup (multiple zones) as described here http://docs.ceph.com/docs/master/radosgw/multisite/, with two ceph clusters, one in each site. Stretching clusters over multiple sites is seldom/never worth the hassle in
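A minimal sketch of how the sync state can be inspected from the secondary zone with Jewel-era radosgw-admin (the zone name is a placeholder, not anything from this thread):

    # overall metadata/data sync state as seen from this zone
    radosgw-admin sync status
    # per-shard detail for data sync from a specific source zone
    radosgw-admin data sync status --source-zone=us-east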

Re: [ceph-users] Write back mode Cache-tier behavior

2017-06-05 Thread TYLin
Hi Christian, Thanks for your quick reply. > On Jun 5, 2017, at 2:01 PM, Christian Balzer wrote: > > > Hello, > > On Mon, 5 Jun 2017 12:25:25 +0800 TYLin wrote: > >> Hi all, >> >> We’re using cache-tier with write-back mode but the write throughput is not >> as good as we expect. > > Num

[ceph-users] Migrate from AWS to Ceph

2017-06-05 Thread ankit malik
Hello, we have noSQL installed on AWS S3, we would like to migrate to our ceph cluster (private cloud). We do have a question related to that: 1. Is it a supported scenario (AWS->Ceph)? 2. If yes, how to do that? Has someone tried it before? (if not noSQL, any other application is also fine, just
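RGW speaks an S3-compatible API, so generic S3 tooling can usually copy buckets across; a hedged sketch with rclone (endpoint, remote names, bucket names and credentials are all placeholders, and the exact rclone options should be checked against its documentation):

    # ~/.config/rclone/rclone.conf: one remote for AWS, one for the RGW endpoint
    [aws]
    type = s3
    provider = AWS
    access_key_id = AWS_KEY
    secret_access_key = AWS_SECRET

    [cephrgw]
    type = s3
    provider = Ceph
    access_key_id = RGW_KEY
    secret_access_key = RGW_SECRET
    endpoint = http://rgw.example.com:7480

    # copy a bucket from AWS S3 into the Ceph cluster
    rclone sync aws:mybucket cephrgw:mybucket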

Re: [ceph-users] Write back mode Cache-tier behavior

2017-06-05 Thread Christian Balzer
On Mon, 5 Jun 2017 15:32:00 +0800 TYLin wrote: > Hi Christian, > > Thanks for your quick reply. > > > > On Jun 5, 2017, at 2:01 PM, Christian Balzer wrote: > > > > > > Hello, > > > > On Mon, 5 Jun 2017 12:25:25 +0800 TYLin wrote: > > > >> Hi all, > >> > >> We’re using cache-tier with wri

[ceph-users] handling different disk sizes

2017-06-05 Thread Félix Barbeira
Hi, We have a small cluster for radosgw use only. It has three nodes, with 3 osds each. Each node has different disk sizes: node01 : 3x8TB node02 : 3x2TB node03 : 3x3TB I thought that the weight handles the amount of data that every osd receives. In this case for example the node with the 8TB dis

[ceph-users] Hard disk bad manipulation: journal corruption and stale pgs

2017-06-05 Thread Zigor Ozamiz
Hi everyone, Due to two big beginner's mistakes handling and recovering a hard disk, we have reached a situation in which the system tells us that the journal of an osd is corrupted. 2017-05-30 17:59:21.318644 7fa90757a8c0 1 journal _open /dev/disk/by-id/ata-INTEL_SSDSC2BA200G4_BTHV5281013C20

Re: [ceph-users] OSD crash loop - FAILED assert(recovery_info.oi.snaps.size())

2017-06-05 Thread Stephen M. Anthony ( Faculty/Staff - Ctr for Innovation in Teach & )
Using rbd ls -l poolname to list all images and their snapshots, then purging the snapshots of each image with rbd snap purge poolname/imagename, and finally reweighting each flapping OSD to 0.0 resolved this issue. -Steve On 2017-06-02 14:15, Steve Anthony wrote: I'm seeing this again on two
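A minimal sketch of that sequence (pool, image and OSD ids are placeholders):

    # list images and their snapshots in the pool
    rbd ls -l poolname
    # drop all snapshots of one image
    rbd snap purge poolname/imagename
    # drain a flapping OSD by reweighting it to zero (either the crush or the override reweight could be meant)
    ceph osd crush reweight osd.12 0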

Re: [ceph-users] handling different disk sizes

2017-06-05 Thread Christian Balzer
Hello, On Mon, 5 Jun 2017 13:54:02 +0200 Félix Barbeira wrote: > Hi, > > We have a small cluster for radosgw use only. It has three nodes, with 3 > osds each. Each node has different disk sizes: > There's your answer, staring you r

Re: [ceph-users] handling different disk sizes

2017-06-05 Thread Loic Dachary
Hi Félix, Could you please send me the output of the "ceph report" command (privately, the output is likely too big for the list) ? I suspect what you're seeing is because the smaller disks have more PGs than they should for the default.rgw.buckets.data pool. With the output of "ceph report" an
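To see whether the smaller disks really do carry more PGs than their weight suggests, a quick check alongside the report Loic asked for (a sketch; exact columns vary by release):

    # per-OSD weight, utilisation and PG count
    ceph osd df tree
    # the raw cluster state requested above
    ceph report > report.json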

Re: [ceph-users] RGW lifecycle not expiring objects

2017-06-05 Thread Daniel Gryniewicz
Kraken has lifecycle, Jewel does not. Daniel On 06/04/2017 07:16 PM, ceph.nov...@habmalnefrage.de wrote: grrr... sorry && and again as text :| Gesendet: Montag, 05. Juni 2017 um 01:12 Uhr Von: ceph.nov...@habmalnefrage.de An: "Yehuda Sadeh-Weinraub" Cc: "ceph-users@lists.ceph.com" , ceph-d

Re: [ceph-users] handling different disk sizes

2017-06-05 Thread Loic Dachary
On 06/05/2017 02:48 PM, Christian Balzer wrote: > > Hello, > > On Mon, 5 Jun 2017 13:54:02 +0200 Félix Barbeira wrote: > >> Hi, >> >> We have a small cluster for radosgw use only. It has three nodes, with 3 >> osds each. Each node

Re: [ceph-users] handling different disk sizes

2017-06-05 Thread David Turner
If you want to resolve your issue without purchasing another node, you should move one disk of each size into each server. This process will be quite painful as you'll need to actually move the disks in the crush map to be under a different host and then all of your data will move around, but then
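A hedged sketch of what "moving a disk in the crush map" looks like (OSD id, weight and host names are placeholders; the physical disk still has to be re-installed under the new host as well):

    # check the current placement
    ceph osd tree
    # relocate osd.3 under node02 in the CRUSH hierarchy, keeping its weight (~1.81 for a 2TB disk)
    ceph osd crush set osd.3 1.81 root=default host=node02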

[ceph-users] BUG: Bad page state in process ceph-osd pfn:111ce00

2017-06-05 Thread Alex Gorbachev
Hello, we have received this today after months of running without any issues. ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90) Running Ubuntu 14.04 with kernel 4.10.2-041002-generic Jun 5 11:08:36 roc02r-sca040 kernel: [7126162.348529] BUG: Bad page state in process ceph-osd pf

Re: [ceph-users] Hard disk bad manipulation: journal corruption and stale pgs

2017-06-05 Thread koukou73gr
Is your min-size at least 2? Is it just one OSD affected? If yes, and if it is only the journal that is corrupt while the actual OSD store is intact (although now lagging behind in writes), and you do have healthy copies of its PGs elsewhere (hence the min-size requirement), you could resolve this situ
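If the journal alone is lost and healthy replicas exist, one hedged way out is to recreate the journal and let recovery reconcile the lagging copies (any unflushed writes are gone, which is exactly why the healthy replicas matter); a sketch, with the OSD id as a placeholder:

    # with the OSD stopped, create a fresh, empty journal for it
    ceph-osd -i 7 --mkjournal
    # start it again and let peering/recovery repair it from the other copies
    systemctl start ceph-osd@7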

Re: [ceph-users] Bug report: unexpected behavior when executing Lua object class

2017-06-05 Thread Noah Watkins
I haven't taken the time to really grok why the limitation exists (e.g. I'd be interested to know if it's fundamental). There is a comment here: https://github.com/ceph/ceph/blob/master/src/osd/PrimaryLogPG.cc#L3221 - Noah On Sat, Jun 3, 2017 at 8:18 PM, Zheyuan Chen wrote: >> Unfortunately,

Re: [ceph-users] Bug report: unexpected behavior when executing Lua object class

2017-06-05 Thread Gregory Farnum
On Mon, Jun 5, 2017 at 10:43 AM Noah Watkins wrote: > I haven't taken the time to really grok why the limitation exists > (e.g. I'd be interested to know if it's fundamental). There is a > comment here: > > https://github.com/ceph/ceph/blob/master/src/osd/PrimaryLogPG.cc#L3221 We need to sen

Re: [ceph-users] RGW lifecycle not expiring objects

2017-06-05 Thread Ben Hines
FWIW lifecycle is working for us. I did have to research to find the appropriate lc config file settings, the documentation for which is found in a git pull request (waiting for another release?) rather than on the Ceph docs site. https://github.com/ceph/ceph/pull/13990 Try these: debug rgw = 20
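For reference, the knobs that pull request describes end up as ordinary ceph.conf options on the rgw nodes; a sketch (section name and values are placeholders, the debug interval is a testing aid only, and the option names should be verified against https://github.com/ceph/ceph/pull/13990 before relying on them):

    [client.rgw.gateway1]
    debug rgw = 20
    # window in which the lifecycle thread is allowed to run (default 00:00-06:00)
    rgw lifecycle work time = 00:00-23:59
    # testing aid: scale expiration "days" down to this many seconds so rules fire quickly
    rgw lc debug interval = 10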

Re: [ceph-users] PG Stuck EC Pool

2017-06-05 Thread Gregory Farnum
It looks to me like this is related to http://tracker.ceph.com/issues/18162. You might see if they came up with good resolution steps, and it looks like David is working on it in master but hasn't finished it yet. -Greg On Sat, Jun 3, 2017 at 2:47 AM Ashley Merrick wrote: > Attaching with loggi

Re: [ceph-users] Recovering PGs from Dead OSD disk

2017-06-05 Thread Gregory Farnum
On Sat, Jun 3, 2017 at 6:17 AM James Horner wrote: > Hi All > > Thanks in advance for any help, I was wondering if anyone can help me with > a pickle I have gotten myself into! > > I was in the process of adding OSD's to my small cluster (6 OSDs) and the > disk died halfway through, unfortunately I had

Re: [ceph-users] handling different disk sizes

2017-06-05 Thread Christian Wuerdig
Yet another option is to change the failure domain to OSD instead of host (this avoids having to move disks around and will probably meet your initial expectations). It means your cluster will become unavailable when you lose a host, until you fix it, though. OTOH you probably don't have too much leeway an
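The change amounts to a CRUSH rule whose chooseleaf step uses type osd rather than type host; a hedged sketch in pre-Luminous rule syntax (rule name and ruleset id are placeholders):

    rule replicated_osd {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type osd
        step emit
    }

The same rule can be created with ceph osd crush rule create-simple replicated_osd default osd, and then assigned to the pool via its crush_ruleset property.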

Re: [ceph-users] BUG: Bad page state in process ceph-osd pfn:111ce00

2017-06-05 Thread Brad Hubbard
This is a kernel error. The ceph userspace code is extremely unlikely to be responsible. Regardless, this needs to be debugged by whoever supports this kernel as a first step. A google search for "page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set" shows similar bugs which may give some guid

[ceph-users] First monthly Ceph on ARM call tomorrow

2017-06-05 Thread Patrick McGarry
Hey cephers, Just a reminder, tomorrow at 11a EDT we will be hosting the first "Ceph on ARM" call as a follow-on action to the last Ceph Tech Talk [0]. If you would like to join us, please connect to the following URL: https://bluejeans.com/217597658/ If you have any questions, please don't hes

Re: [ceph-users] Write back mode Cache-tier behavior

2017-06-05 Thread TYLin
> On Jun 5, 2017, at 6:47 PM, Christian Balzer wrote: > > Personally I avoid odd numbered releases, but my needs for stability > and low update frequency seem to be far off the scale for "normal" Ceph > users. > > W/o precise numbers of files and the size of your SSDs (which type?) it is > hard

Re: [ceph-users] Write back mode Cache-tier behavior

2017-06-05 Thread Webert de Souza Lima
I'd like to add that, from all the tests I did, the writing of new files only goes directly to the cache tier if you set hit set count = 0. On Mon, 5 Jun 2017 23:26, TYLin wrote: > On Jun 5, 2017, at 6:47 PM, Christian Balzer wrote: > > Personally I avoid odd numbered releases, but my needs
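For context, hit sets are pool properties on the cache pool; a sketch of the relevant settings (pool name and values are placeholders, and actual promotion behaviour also depends on the recency settings):

    # no hit sets at all: writes of new objects land in the cache tier directly
    ceph osd pool set cachepool hit_set_count 0
    # the usual companions when hit sets are in use instead
    ceph osd pool set cachepool hit_set_type bloom
    ceph osd pool set cachepool hit_set_period 3600
    ceph osd pool set cachepool min_write_recency_for_promote 1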

[ceph-users] Kraken bluestore compression

2017-06-05 Thread Daniel K
Hi, I see several mentions that compression is available in Kraken for bluestore OSDs, however, I can find almost nothing in the documentation that indicates how to use it. I've found: - http://docs.ceph.com/docs/master/radosgw/compression/ - http://ceph.com/releases/v11-2-0-kraken-released/ I'm
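From what I could piece together, compression is driven by bluestore config options (per-pool compression properties may or may not be exposed in Kraken, so this sketch sticks to ceph.conf, and the values should be double-checked against the release notes):

    [osd]
    # one of: none, snappy, zlib, or lz4/zstd depending on how the packages were built
    bluestore compression algorithm = snappy
    # one of: none, passive, aggressive, force
    bluestore compression mode = aggressive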

Re: [ceph-users] Write back mode Cache-tier behavior

2017-06-05 Thread Christian Balzer
On Tue, 6 Jun 2017 10:25:38 +0800 TYLin wrote: > > On Jun 5, 2017, at 6:47 PM, Christian Balzer wrote: > > > > Personally I avoid odd numbered releases, but my needs for stability > > and low update frequency seem to be far off the scale for "normal" Ceph > > users. > > > > W/o precise numbers

Re: [ceph-users] Write back mode Cache-tier behavior

2017-06-05 Thread Christian Balzer
Hello, On Tue, 06 Jun 2017 02:35:25 + Webert de Souza Lima wrote: > I'd like to add that, from all tests I did, the writing of new files only > go directly to the cache tier if you set hit set count = 0. > Yes, that also depends on the settings of course. (which we don't know, as they never

[ceph-users] design guidance

2017-06-05 Thread Daniel K
I've built 'my-first-ceph-cluster' with two of the 4-node, 12-drive Supermicro servers and dual 10Gb interfaces (one cluster, one public). I now have 9x 36-drive Supermicro StorageServers made available to me, each with dual 10Gb and a single Mellanox IB/40G NIC. No 1G interfaces except IPMI. 2x 6-c

Re: [ceph-users] design guidance

2017-06-05 Thread Christian Balzer
Hello, lots of similar questions in the past, google is your friend. On Mon, 5 Jun 2017 23:59:07 -0400 Daniel K wrote: > I've built 'my-first-ceph-cluster' with two of the 4-node, 12-drive > Supermicro servers and dual 10Gb interfaces (one cluster, one public) > > I now have 9x 36-drive supermi

Re: [ceph-users] design guidance

2017-06-05 Thread Adrian Saul
> > Early usage will be CephFS, exported via NFS and mounted on ESXi 5.5 > > and > > 6.0 hosts(migrating from a VMWare environment), later to transition to > > qemu/kvm/libvirt using native RBD mapping. I tested iscsi using lio > > and saw much worse performance with the first cluster, so it seems