Re: [ceph-users] unsubscribe

2019-07-12 Thread Brian Topping
It’s in the mail headers on every email: mailto:ceph-users-requ...@lists.ceph.com?subject=unsubscribe > On Jul 12, 2019, at 5:00 PM, Robert Stanford wrote: > > unsubscribe > ___ > ceph-users mailing list > ceph-users@lists.ceph.com >

Re: [ceph-users] Weird behaviour of ceph-deploy

2019-06-17 Thread Brian Topping
I don’t have an answer for you, but it’s going to help others if you show: the versions of all nodes involved and the multi-master configuration; confirmation of forward and reverse DNS and SSH / remote sudo, since you are using ceph-deploy; and the specific steps that did not behave properly. > On Jun 17, 2019, at 6:29 AM,
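
A rough sketch of the checks I mean, run from the admin node (hostnames and the user are placeholders):

    # forward and reverse DNS should agree for every node
    host mon1.example.com
    host 10.0.0.11
    # passwordless SSH and non-interactive sudo, as ceph-deploy expects
    ssh cephuser@mon1.example.com 'sudo whoami'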

Re: [ceph-users] pool migration for cephfs?

2019-05-15 Thread Brian Topping
Lars, I just got done doing this after generating about a dozen CephFS subtrees for different Kubernetes clients. tl;dr: there is no way for files to move between filesystem formats (i.e. CephFS -> RBD) without copying them. If you are doing the same thing, there may be some relevance for you
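
For a CephFS-to-CephFS pool move, one sketch of the approach (pool, filesystem and path names are made up): add a new data pool to the filesystem, point a fresh directory at it with a file layout, then copy into it.

    ceph osd pool create cephfs_data_new 32
    ceph fs add_data_pool cephfs cephfs_data_new
    mkdir /mnt/cephfs/newtree
    setfattr -n ceph.dir.layout.pool -v cephfs_data_new /mnt/cephfs/newtree
    # new files under newtree land in the new pool; existing files still have to be copied
    cp -a /mnt/cephfs/oldtree/. /mnt/cephfs/newtree/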

Re: [ceph-users] PG stuck peering - OSD cephx: verify_authorizer key problem

2019-04-26 Thread Brian Topping
> On Apr 26, 2019, at 1:50 PM, Gregory Farnum wrote: > > Hmm yeah, it's probably not using UTC. (Despite it being good > practice, it's actually not an easy default to adhere to.) cephx > requires synchronized clocks and probably the same timezone (though I > can't swear to that.) Apps don’t
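
As a sanity check on the clock angle, the mons report skew themselves; something like this on each node (chrony assumed, ntpq works too):

    ceph time-sync-status      # mon view of clock skew
    chronyc tracking           # local NTP state
    timedatectl                # confirms the timezone the node thinks it is in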

Re: [ceph-users] SOLVED: Multi-site replication speed

2019-04-20 Thread Brian Topping
https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/415#note_16192610 > On Apr 19, 2019, at 10:21 PM, Brian Topping wrote: > > Hi Casey, > > I set up a completely fresh cluster on a new VM host.. everything is fresh > fresh fresh. I feel like it installed cleanly and be

Re: [ceph-users] Multi-site replication speed

2019-04-19 Thread Brian Topping
Hi Casey, I set up a completely fresh cluster on a new VM host. Everything is fresh fresh fresh. I feel like it installed cleanly, and because there is practically zero latency and unlimited bandwidth between peer VMs, this is a better place to experiment. The behavior is the same as the other

Re: [ceph-users] Are there any statistics available on how most production ceph clusters are being used?

2019-04-19 Thread Brian Topping
> On Apr 19, 2019, at 10:59 AM, Janne Johansson wrote: > > May the most significant bit of your life be positive. Marc, my favorite thing about open source software is it has a 100% money back satisfaction guarantee: If you are not completely satisfied, you can have an instant refund, just

Re: [ceph-users] Multi-site replication speed

2019-04-18 Thread Brian Topping
On Apr 16, 2019, at 08:38, Casey Bodley wrote: > > Hi Brian, > > On 4/16/19 1:57 AM, Brian Topping wrote: >>> On Apr 15, 2019, at 5:18 PM, Brian Topping wrote: >>> >>> If I am correct, how do I trigger the full sy

Re: [ceph-users] Multi-site replication speed

2019-04-15 Thread Brian Topping
> On Apr 15, 2019, at 5:18 PM, Brian Topping wrote: > > If I am correct, how do I trigger the full sync? Apologies for the noise on this thread. I came to discover the `radosgw-admin [meta]data sync init` command. That’s gotten me with something that looked like this for seve
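
For anyone who lands here later, the shape of what I ran on the secondary zone was roughly this (a sketch; the source zone name is a placeholder):

    radosgw-admin metadata sync init
    radosgw-admin data sync init --source-zone=<primary-zone>
    # restart the local radosgw so the sync threads restart, then watch:
    radosgw-admin sync status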

Re: [ceph-users] Multi-site replication speed

2019-04-15 Thread Brian Topping
I’m starting to wonder if I actually have things configured and working correctly, and the light traffic I am seeing is just that of an incremental replication. That would make sense; the cluster being replicated does not have a lot of traffic on it yet. Obviously, without the full replication, the

Re: [ceph-users] Multi-site replication speed

2019-04-14 Thread Brian Topping
> On Apr 14, 2019, at 2:08 PM, Brian Topping wrote: > > Every so often I might see the link running at 20 Mbits/sec, but it’s not > consistent. It’s probably going to take a very long time at this rate, if > ever. What can I do? Correction: I was looking at statistics

[ceph-users] Multi-site replication speed

2019-04-14 Thread Brian Topping
Hi all! I’m finally running with Ceph multi-site per http://docs.ceph.com/docs/nautilus/radosgw/multisite/ , woo hoo! I wanted to confirm that the process can be slow. It’s been a couple of hours since the sync started and `radosgw-admin
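
For anyone following along, sync progress on a setup like this is usually watched with something like (bucket name is a placeholder):

    radosgw-admin sync status
    # per-bucket detail, if needed:
    radosgw-admin bucket sync status --bucket=<bucket>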

Re: [ceph-users] 1/3 mon not working after upgrade to Nautilus

2019-03-25 Thread Brian Topping
Did you check port access from other nodes? My guess is a forgotten firewall re-emerged on that node after reboot. Sent from my iPhone > On Mar 25, 2019, at 07:26, Clausen, Jörn wrote: > > Hi again! > >> moment, one of my three MONs (the then active one) fell out of the > > "active one"
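
A sketch of the kind of check I mean, from one of the healthy nodes (hostname is a placeholder; 3300 is the new msgr2 port in Nautilus, 6789 the classic one):

    nc -zv mon3.example.com 6789
    nc -zv mon3.example.com 3300
    # and on the suspect node itself:
    firewall-cmd --list-all
    ss -tlnp | grep ceph-mon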

Re: [ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook

2019-02-19 Thread Brian Topping
> On Feb 19, 2019, at 3:30 PM, Vitaliy Filippov wrote: > > In our russian-speaking Ceph chat we swear "ceph inside kuber" people all the > time because they often do not understand in what state their cluster is at > all Agreed 100%. This is a really good way to lock yourself out of your data

Re: [ceph-users] Downsizing a cephfs pool

2019-02-08 Thread Brian Topping
> > (resending because the previous reply wound up off-list) > > On 09/02/2019 10.39, Brian Topping wrote: >> Thanks again to Jan, Burkhard, Marc and Hector for responses on this. To >> review, I am removing OSDs from a small cluster and running up against >>

Re: [ceph-users] Downsizing a cephfs pool

2019-02-08 Thread Brian Topping
Thanks again to Jan, Burkhard, Marc and Hector for responses on this. To review, I am removing OSDs from a small cluster and running up against the “too many PGs per OSD” problem due to lack of clarity. Here’s a summary of what I have collected on it: The CephFS data pool can’t be changed, only
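
For anyone hitting the same wall, the numbers involved come straight out of:

    ceph osd pool ls detail    # pg_num per pool
    ceph osd df tree           # the PGS column shows PGs per OSD
    ceph fs ls                 # which pools the filesystem is tied to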

Re: [ceph-users] Downsizing a cephfs pool

2019-02-08 Thread Brian Topping
Thanks Marc and Burkhard. I think what I am learning is that it’s best (and perhaps the only way) to copy between filesystems with cpio, due to the “fs metadata in first pool” problem. FWIW, the mimic docs still describe how to create a differently named cluster on the same hardware.
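
A minimal sketch of the cpio pass I mean, assuming both the old and new filesystems are mounted side by side (paths are placeholders):

    cd /mnt/cephfs_old
    find . -depth -print0 | cpio --null -pdm /mnt/cephfs_new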

Re: [ceph-users] Downsizing a cephfs pool

2019-02-08 Thread Brian Topping
always creating pools starting with 8 PGs and when I know I am at > what I want in production I can always increase the pg count. > > > > -Original Message- > From: Brian Topping [mailto:brian.topp...@gmail.com] > Sent: 08 February 2019 05:30 > To: Ceph Users > Subj
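
(For the archive, what Marc describes maps to something like this; pool name and counts are only examples:)

    ceph osd pool create mypool 8 8
    # later, once the production workload is understood:
    ceph osd pool set mypool pg_num 64
    ceph osd pool set mypool pgp_num 64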

[ceph-users] Downsizing a cephfs pool

2019-02-07 Thread Brian Topping
Hi all, I created a problem when moving data to Ceph and I would be grateful for some guidance before I do something dumb. I started with the 4x 6TB source disks that came together as a single XFS filesystem via software RAID. The goal is to have the same data on a cephfs volume, but with

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-26 Thread Brian Topping
I went through this as I reformatted all the OSDs with a much smaller cluster last weekend. When turning nodes back on, PGs would sometimes move, only to move back, prolonging the operation and system stress. What I took away is that it causes the least overall system stress to have the OSD tree back to
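
What worked for me, roughly (a sketch; clear the flags once everything is active+clean again):

    ceph osd set noout
    ceph osd set norebalance
    # ...bring the node and its OSDs back up, wait for peering to settle...
    ceph osd unset norebalance
    ceph osd unset noout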

Re: [ceph-users] Problem with OSDs

2019-01-21 Thread Brian Topping
> On Jan 21, 2019, at 6:47 AM, Alfredo Deza wrote: > > When creating an OSD, ceph-volume will capture the ID and the FSID and > use these to create a systemd unit. When the system boots, it queries > LVM for devices that match that ID/FSID information. Thanks Alfredo, I see that now. The name
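
For anyone else digging into this, the pieces Alfredo describes can be seen with something like (the unit name encodes the OSD ID and FSID):

    ceph-volume lvm list
    systemctl list-units 'ceph-volume@*'
    # units look like: ceph-volume@lvm-0-<osd-fsid>.service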

Re: [ceph-users] quick questions about a 5-node homelab setup

2019-01-21 Thread Brian Topping
> On Jan 18, 2019, at 3:48 AM, Eugen Leitl wrote: > > > (Crossposting this from Reddit /r/ceph , since likely to have more technical > audience present here). > > I've scrounged up 5 old Atom Supermicro nodes and would like to run them > 365/7 for limited production as RBD with Bluestore

[ceph-users] Problem with OSDs

2019-01-20 Thread Brian Topping
Hi all, looks like I might have pooched something. Between the two nodes I have, I moved all the PGs to one machine, reformatted the other machine, rebuilt that machine, and moved the PGs back. In both cases, I did this by taking the OSDs on the machine being moved from “out” and waiting for
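
The drain step I describe was roughly this (a sketch; OSD IDs are examples, and I waited for active+clean before touching hardware):

    ceph osd out 0
    ceph osd out 1
    watch ceph -s
    ceph osd safe-to-destroy osd.0    # confirms the data has fully drained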

Re: [ceph-users] Boot volume on OSD device

2019-01-19 Thread Brian Topping
> On Jan 18, 2019, at 10:58 AM, Hector Martin wrote: > > Just to add a related experience: you still need 1.0 metadata (that's > the 1.x variant at the end of the partition, like 0.9.0) for an > mdadm-backed EFI system partition if you boot using UEFI. This generally > works well, except on some
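
A sketch of what Hector describes for a mirrored EFI system partition (devices are placeholders; the point of 1.0 metadata is that it sits at the end of the partition, so firmware still sees a plain FAT filesystem):

    mdadm --create /dev/md0 --level=1 --raid-devices=2 --metadata=1.0 /dev/sda1 /dev/sdb1
    mkfs.vfat -F32 /dev/md0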

Re: [ceph-users] Today's DocuBetter meeting topic is... SEO

2019-01-18 Thread Brian Topping
Hi Noah! With an eye toward improving documentation and community, two things come to mind: 1. I didn’t know about this meeting or I would have done my very best to enlist my roommate, who probably could have answered these questions very quickly. I do know there’s something to do with the

Re: [ceph-users] Boot volume on OSD device

2019-01-18 Thread Brian Topping
> On Jan 18, 2019, at 4:29 AM, Hector Martin wrote: > > On 12/01/2019 15:07, Brian Topping wrote: >> I’m a little nervous that BlueStore assumes it owns the partition table and >> will not be happy that a couple of primary partitions have been used. Will >> this be

Re: [ceph-users] Offsite replication scenario

2019-01-16 Thread Brian Topping
> On Jan 16, 2019, at 12:08 PM, Anthony Verevkin wrote: > > I would definitely see huge value in going to 3 MONs here (and btw 2 on-site > MGR and 2 on-site MDS) > However 350Kbps is quite low and MONs may be latency sensitive, so I suggest > you do heavy QoS if you want to use that link for

Re: [ceph-users] /var/lib/ceph/mon/ceph-{node}/store.db on mon nodes

2019-01-16 Thread Brian Topping
ote: > > > >> On 1/16/19 10:36 AM, Matthew Vernon wrote: >> Hi, >> >>> On 16/01/2019 09:02, Brian Topping wrote: >>> >>> I’m looking at writes to a fragile SSD on a mon node, >>> /var/lib/ceph/mon/ceph-{node}/store.db is the big

[ceph-users] /var/lib/ceph/mon/ceph-{node}/store.db on mon nodes

2019-01-16 Thread Brian Topping
I’m looking at writes to a fragile SSD on a mon node; /var/lib/ceph/mon/ceph-{node}/store.db is the big offender at the moment. Is it required to be on a physical disk or can it be in tmpfs? One of the log files has paxos strings, so I’m guessing it has to be on disk for a panic recovery? Are
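
For scale, the store size and write load I’m describing can be sized up with something like this (assuming the mon id matches the short hostname):

    du -sh /var/lib/ceph/mon/ceph-$(hostname -s)/store.db
    iostat -x 5                              # watch the SSD's write columns
    ceph tell mon.$(hostname -s) compact     # shrinks the store, but does not reduce the write rate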

Re: [ceph-users] Offsite replication scenario

2019-01-14 Thread Brian Topping
Ah! Makes perfect sense now. Thanks!! Sent from my iPhone > On Jan 14, 2019, at 12:30, Gregory Farnum wrote: > >> On Fri, Jan 11, 2019 at 10:07 PM Brian Topping >> wrote: >> Hi all, >> >> I have a simple two-node Ceph cluster that I’m comfortable wi

[ceph-users] Boot volume on OSD device

2019-01-11 Thread Brian Topping
Question about OSD sizes: I have two cluster nodes, each with 4x 800GiB SLC SSD using BlueStore. They boot from SATADOM so the OSDs are data-only, but the MLC SATADOM have terrible reliability and the SLC are way overpriced for this application. Can I carve off 64GiB from one of the four
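
What I have in mind looks something like this on the shared device (a sketch only; device names and sizes are placeholders, and ceph-volume will take an LV carved from the remainder):

    sgdisk -n 1:0:+64G -t 1:8300 /dev/sdb    # small boot/OS partition
    sgdisk -n 2:0:0 -t 2:8e00 /dev/sdb       # the rest for the OSD
    pvcreate /dev/sdb2 && vgcreate ceph-sdb /dev/sdb2
    lvcreate -l 100%FREE -n osd0 ceph-sdb
    ceph-volume lvm create --data ceph-sdb/osd0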

[ceph-users] Offsite replication scenario

2019-01-11 Thread Brian Topping
Hi all, I have a simple two-node Ceph cluster that I’m comfortable with the care and feeding of. Both nodes are in a single rack and captured in the attached dump: two nodes, only one mon, all pools size 2. Due to physical limitations, the primary location can’t move past two nodes at