In this context, I am looking to
enable and disable mirroring on specific RBD images and RGW buckets as the
client workload is migrated from accessing the old cluster to accessing the
new.
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdh...@binghamton.edu
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
as it was draining it.
So did I miss something here? What is the best way to proceed? I
understand that it would be mayhem to mark 8 of 72 OSDs out and then turn
backfill/rebalance/recover back on. But it seems like there should be a
better way.
Suggestions?
Thanks.
-Dave
to get to container-based Reef, but
I need to keep a stable cluster throughout.
Any advice or reassurance much appreciated.
Thanks.
-Dave
This
would be a long and painful process - decommission a node, move it, move
some data, decommission another node - and I don't know what effect it
would have on external references to our object store.
Please advise.
Thanks.
-Dave
out simultaneously?
Thanks.
-Dave
On Fri, Aug 4, 2023 at 10:16 AM Dave Holland wrote:
> On Fri, Aug 04, 2023 at 09:44:57AM -0400, Dave Hall wrote:
> > My inclination is to mark these 3 OSDs 'OUT' before they crash
> completel
, if it would be better to do them one per day or something,
I'd rather be safe.
I also assume that I should wait for the rebalance to complete before I
initiate the replacement procedure.
Your thoughts?
Thanks.
-Dave
is not large, there is an increased chance that more
than one scrub will want to read the same OSD. Scheduling nightmare if
the number of simultaneous scrubs is low and client traffic is given
priority.
-Dave
had to run another 'pg repair' after the
object repair.
Since then all is good.
-Dave
On Sun, Oct 3, 2021 at 1:09 PM 胡 玮文 wrote:
>
> > On Oct 4, 2021 at 00:53, Michael Thomas wrote:
> >
> > I recently started getting i
for increasing osd_scrub_max_preemptions just enough to balance
between scrub progress and client responsiveness?
Or perhaps there are other scrub attributes that should be tuned instead?
Thanks.
-Dave
of
these 29 slow ops.
Can anybody suggest a path forward?
Thanks.
-Dave
of the more rarely used metadata on the HDD but having it on
> flash certainly is nice.
>
>
> Mark
>
>
> On 6/3/21 5:18 PM, Dave Hall wrote:
Anthony,
I had recently found a reference in the Ceph docs that indicated something
like 40GB per TB for WAL+DB space. For a 12TB HDD that comes out to
480GB. If this is no longer the guideline I'd be glad to save a couple
dollars.
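For what it's worth, the arithmetic behind that figure looks like this (a sketch of the 40 GB per TB rule of thumb referenced above, i.e. roughly 4% of the data device; this is illustrative, not an official Ceph calculator):

```python
# Sketch of the WAL+DB sizing rule of thumb quoted above:
# 40 GB of flash per TB of HDD (~4% of the data device).

def wal_db_gb(hdd_capacity_tb: float, gb_per_tb: float = 40.0) -> float:
    """Suggested WAL+DB allocation in GB for one OSD data drive."""
    return hdd_capacity_tb * gb_per_tb

print(wal_db_gb(12))  # 12 TB HDD -> 480.0 GB
```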
-Dave
drives (still
Enterprise). For Ceph, will the switch to SATA carry a performance
difference that I should be concerned about?
Thanks.
-Dave
ceph-block-b1fea172-71a4-463e-a3e3-8cdcc1bc7b79: autoactivation failed.
-Dave
people (developers) can produce something that a
large number of people (storage administrators, or 'users') will want to
use.
Please remember the ratio of users (cluster administrators) to developers
and don't lose sight of the users in working to ease and simplify
development.
-Dave
system to a new NFS server and new storage I was able to directly rsync
each user in parallel. I filled up a 10Gb pipe and copied the whole FS in
an hour.
Typing in a hurry. If my explanation is confusing, please don't hesitate
to ask me to explain better.
-Dave
deadlines are missed, so it's not a
steady march to zero.
Please feel free to comment. I'd be glad to know if I'm on the right track
as we expect the cluster to double in size over the next 12 to 18 months.
Thanks.
-Dave
with a 500TB production cluster I am
asking for guidance from this list.
BTW, my cluster is currently running Nautilus 14.2.6 (stock Debian
packages).
Thank you.
-Dave
The systemd log messages were similar to those reported by Radoslav. A
Google search led me to the link above. The suggested addition to the
kernel command line fixed the issue.
-Dave
On Thu, Apr 15, 2021 at 4:07 AM Eneko Lacunza wrote
-Dave
On Wed, Apr 14, 2021 at 12:51 PM Radoslav Milanov <
radoslav.mila...@gmail.com> wrote:
> Hello,
>
> Cluster is 3 nodes Debian 10. Started cephadm upgrade on healthy 15.2.10
> cluster. Managers were upgraded fine then
any of the ceph-volume stuff that seems to
be failing after the OSDs are configured?
Or maybe I just have something odd in my inventory file. I'd be glad to
share - either in this list or off line.
Thanks.
-Dave
practices
for Nautilus?
1) I couldn't find how to set this in Nautilus.
2) I found a mailing list post from August 2019 that talked about EC pools
and using a multiple of k * 4M.
Any insight, or a pointer to the right part of the docs would be greatly
appreciated.
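For reference, the k * 4M arithmetic from that 2019 post works out like this for an EC 8+2 pool (a sketch; the 4 MiB stripe unit is the default that post assumes):

```python
# Sketch of the "multiple of k * 4M" sizing rule for EC pools.
# An object in an EC k+m pool is striped across k data chunks, so
# object sizes that are a multiple of k * 4 MiB fill whole stripes
# instead of leaving a short final stripe on the OSDs.
K = 8                  # data chunks in an EC 8+2 pool
CHUNK = 4 * 1024 ** 2  # assumed 4 MiB default stripe unit

stripe_bytes = K * CHUNK
print(stripe_bytes // 2 ** 20)  # -> 32 (MiB per full stripe)
```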
Thanks.
-Dave
the best
course of action - should I just mark it back in? Or should I destroy and
rebuild it. If clearing it in the way I have, in combination with updating
to 14.2.16, will prevent it from misbehaving, why go through the trouble of
destroying and rebuilding?
Thanks.
-Dave
suspend autoscaling it would be required to
modify the setting for each pool and then to modify it back afterward.
Thoughts?
Thanks
-Dave
On Mon, Mar 29, 2021 at 1:44 PM Anthony D'Atri
wrote:
> Yes the PG autoscalar has a
but enabled for this
particular pool. Also, I see that the target PG count is lower than the
current.
I guess you learn something new every day.
-Dave
On Mon, Mar 29, 2021 at 7:52 AM Eugen Block
is
slowly eating itself, and that I'm about to lose 200TB of data. It's also
possible to imagine that this is all due to the gradual optimization of the
pools.
Note that the primary pool is an EC 8 + 2 containing about 124TB.
Thanks.
-Dave
questions about MDS in the near term, but I haven't searched the
docs yet.
Thanks.
-Dave
be more like 480GB of NVMe.
Thanks.
-Dave
ied, right?
Thanks.
-Dave
parts. Networking is
frequently an afterthought. In this case node-level traffic management -
weighted fair queueing - could make all the difference.
-Dave
On Tue, Mar 16, 2021 at 4:20 AM Burkhard Linke <
burkhard
?)
-Dave
On 3/15/2021 12:48 PM, Andrew Walker-Brown wrote:
Dave
That’s the way our cluster is setup. It’s relatively small, 5 hosts, 12 osd’s.
Each host has 2x10G with LACP to the switches. We’ve vlan’d public/private
networks.
Making
suddenly changed.
Maybe this is a crazy idea, or maybe it's really cool. Your thoughts?
Thanks.
-Dave
Reed,
Thank you. This seems like a very well thought approach. Your note about
the balancer and the auto_scaler seem quite relevant as well. I'll give it
a try when I add my next two nodes.
-Dave
On Thu, Mar 11, 2021 at 5:53 PM Reed Dier wrote:
>
will take out
at least 2 OSDs. Because of this it seems potentially worthwhile to go
through the trouble of defining failure domain = nvme to assure maximum
resilience.
-Dave
On Thu, Mar 11, 2021
On Thu, Mar 11, 2021 at 1:28 PM Christian Wuerdig <
christian.wuer...@gmail.com> wrote:
> For EC 8+2 you can get away with 5 hosts by ensuring each host gets 2
> shards similar to t
ing 6 OSD nodes
and 48 HDDs) are NVMe write exhaustion and HDD failures. Since we have
multiple OSDs sharing a single NVMe device it occurs to me that we might
want to get Ceph to 'map' against that. In a way, NVMe devices are our
'nodes' at the current size of our cluster.
-Dave
domains, resulting in protection against NVMe failure.
Please let me know if this is worth pursuing.
Thanks.
-Dave
e NVMe.
Is this correct? Is it also possible to lay these out as basic logical
partitions?
Second question: How do I decide whether I need WAL, DB, or both?
Third question: Once I answer the above WAL/DB question, what are the
guidelines for sizing them?
Thanks.
-Dave
I have been told that Rocky Linux is a fork of CentOS that will be what
CentOS used to be before this all happened. I'm not sure how that figures
in here, but it's worth knowing.
-Dave
On Wed, Mar 3, 2021 at 12:41 PM Drew Weaver
On Tue, Mar 2, 2021 at 4:06 AM David Caro wrote:
> On 03/01 21:41, Dave Hall wrote:
> > Hello,
> >
> > I've had a look at the instructions for clean shutdown given at
> > https://ceph.io/planet/how-to-do-a-ceph-c
contains 200TB of a
researcher's data that has taken a year to collect, so caution is needed.
Thanks.
-Dave
bare-metal.
I think I saw that Cephadm will only deploy container-based clusters.
Is this a hint that bare-metal is going away in the long run?
Thanks.
-Dave
be necessary to
allocate 300GB for DB per OSD.
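That 300 GB figure matches the RocksDB level-size argument often cited on this list: with an assumed ~0.3 GB base level and the default 10x multiplier, only DB volumes of roughly 3, 30, or 300+ GB are fully used before the next level spills onto the HDD. A rough sketch of that arithmetic (the base size and multiplier here are assumptions about default bluestore RocksDB tuning, not measured values):

```python
# Cumulative flash needed to hold RocksDB levels L1..L4, assuming a
# ~0.3 GB base level and a 10x size multiplier per level. A DB
# partition below a boundary leaves the next level on the HDD, which
# is why ~3 / ~30 / ~300 GB are the commonly quoted useful sizes.
base_gb, multiplier = 0.3, 10

cumulative, boundaries = 0.0, []
for level in range(4):
    cumulative += base_gb * multiplier ** level
    boundaries.append(round(cumulative, 1))

print(boundaries)  # -> [0.3, 3.3, 33.3, 333.3]
```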
-Dave
On Mon, Nov 16, 2020 at 12:41 AM Zhenshi Zhou wrote:
> well, the warning message disappeared after I executed "ceph tell osd.63
> compact".
>
> Zhenshi Zhou wrote on Nov 16, 2020
-domain
= OSD, data loss may be possible due to the failure of a shared SSD/NVMe.
Maybe for a small cluster it's important to suggest placing WAL/DB on the
HDD and using SSD/NVMe only for the journal?
-Dave
it
to work.
Hope this helps.
-Dave
On Mon, Nov 9, 2020 at 9:08 AM Frédéric Nass
wrote:
> Hi Luis,
>
> Thanks for your help. Sorry I forgot about the kernel details. This is
> lat
I
also have 150GB left on my mirrored boot drive. I could un-mirror part
of this and get 300GB of SATA SSD.
Thoughts?
-Dave
On 10/23/2020 6:00 AM, Eneko Lacunza wrote:
Hi Dave,
On 22/10/20 at 19:43, Dave Hall wrote:
On 22/10/20
253:15 0 124G 0 lvm
(BTW, Nautilus
Brian, Eneko,
BTW, the Tyan LFF chassis we've been using has 12 x 3.5" bays in front
and 2 x 2.5" SATA bays in back. We've been using 240GB SSDs in the rear
bays for mirrored boot drives, so any NVMe we add is exclusively for OSD
support.
-Dave
Eneko,
On 10/22/2020 11:14 AM, Eneko Lacunza wrote:
Hi Dave,
On 22/10/20 at 16:48, Dave Hall wrote:
Hello,
(BTW, Nautilus 14.2.7 on Debian non-container.)
We're about to purchase more OSD nodes for our cluster, but I have a
couple questions about hardware choices. Our original nodes
thought I might have seen some comments about cutting large drives
into multiple OSDs - could that be?
Thanks.
-Dave
this yet, but there are at least some
discussions for MySQL.
-Dave
On 10/19/2020 10:49 PM, Brian Topping wrote:
Another option is to let PosgreSQL do the replication with local storage. There
are great reasons for Ceph, but databases optimize for this kind
-Dave
into the install package. Once I copied
the file over to /lib/systemd/system everything started working again.
If I had to guess it was either ceph-volume@.service, or more likely -
based on timestamps on my one of my OSD servers, ceph-osd@.service.
Hope this helps.
-Dave
I'm running EC 8+2 with 'failure domain OSD' on a 3 node cluster with 24
OSDs. Until one has 10s of nodes it pretty much has to be failure domain
OSD.
The documentation lists certain other important settings which it took time
to find. Most important are recommendations to have a small
getting sent to the wrong TCP
connection. Hard to imagine this happening, but it could.)
-Dave
On 5/29/2020 2:45 PM, Anthony D'Atri wrote:
I’m pretty sure I’ve seen that happen with QFX5100 switches
On 5/29/2020 6:29 AM, Paul Emmerich wrote:
Please do not apply any optimization without benchmarking *before* and
*after* in a somewhat realistic scenario.
No, iperf is likely not a realistic setup
of 9000.
It would be interesting to see the iperf tests repeated with
corresponding buffer sizing. I will perform this experiment as soon as
I complete some day-job tasks.
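As a back-of-envelope sketch of what those tests should show (my own arithmetic, with assumed per-frame overheads, not benchmark results): jumbo frames buy only a few percent of theoretical throughput, which fits the "smallish benefit" observation earlier in the thread.

```python
# Back-of-envelope Ethernet efficiency for standard vs jumbo frames.
# Assumed overheads: 38 bytes on the wire per frame (preamble, MAC
# header, FCS, inter-frame gap) plus 40 bytes of IPv4 + TCP headers.
WIRE_OVERHEAD = 38
IP_TCP_HEADERS = 40

def tcp_efficiency(mtu: int) -> float:
    """Fraction of line rate available to TCP payload at a given MTU."""
    return (mtu - IP_TCP_HEADERS) / (mtu + WIRE_OVERHEAD)

for mtu in (1500, 9000):
    print(mtu, round(tcp_efficiency(mtu) * 100, 1))
# 1500 -> ~94.9% of line rate; 9000 -> ~99.1%
```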
-Dave
On 5/27/2020 6:51
ody is interested.
-Dave
On 5/24/2020 12:29 PM, Martin Verges wrote:
Just save yourself the trouble. You won't have any real benefit from MTU
9000. It has some smallish, but it is not worth the effort, problems, and
loss of reliability for most environments.
If you want to move your production traffic to Jumbo Frames, change the
appropriate routes to MTU 8192 on all systems. Then test test test.
Lastly, change your network configuration on any effected nodes so the
increased MTU will be reinstated after every reboot.
-Dave
an optimal configuration I'm next
going to ask if I can adjust it without having to wipe and reinitialize
every OSD.
Thanks.
-Dave
On 5/6/2020 2:20 AM, lin yunfan wrote:
Is there a way to get the block,block.db,block.wal path and size
they need.
-Dave
On 5/5/2020 10:42 AM, Igor Fedotov wrote:
Hi Dave,
wouldn't this help (particularly "Viewing runtime settings" section):
https://docs.ceph.com/docs/nautilus/rados/configuration
and, while I have found documentation on how
to configure, reconfigure, and repair a BlueStore OSD, I haven't found
anything on how to query the current configuration.
Could anybody point me to a command or link to documentation on this?
Thanks.
-Dave
memory?
Also, is there something that needs to be tweaked to prevent the MDSs
from accumulating so much memory?
Thanks.
-Dave
,
but then the lookups got even slower.
-Dave
On 3/23/2020 12:21 AM, Liviu Sas wrote:
Hi Dave,
Thank you for the answer.
Unfortunately the issue is that ceph uses the wrong source IP address,
and sends the traffic on the wrong interface
-Dave
On 3/22/2020 8:03 PM, Liviu Sas wrote:
Hello,
While testing our ceph cluster setup, I noticed a possible issue with the
cluster/public network configuration being ignored for TCP session
initiation.
Looks like the daemons (mon/mgr/mds
to create a new FS and copy
the data over?
Thanks.
-Dave
/DNS for the active mgr. When an active mgr fails
over (or goes down), the new active mgr enables this floating IP and
starts responding to mgr requests. Any passive mgrs would, of course,
turn this IP off to assure that requests go to the active mgr.
Just a thought...
-Dave
as a criterion.
Does anybody have anything further to add that would help clarify this?
Thanks.
-Dave
On 2/10/20 1:26 PM, Gregory Farnum wrote:
On Mon, Feb 10, 2020 at 12:29 AM Håkan T Johansson wrote:
On Mon, 10 Feb 2020, Gregory Farnum wrote:
On Sun, Feb 9, 2020
the cluster re-balance and reconstruct
the lost data until the failed OSD was replaced.
Does this make sense? Or is it just wishful thinking.
Thanks.
-Dave
b-0/db-0 --osd-data
/var/lib/ceph/osd/ceph-24/ --osd-uuid 6441f236-8694-46b9-9c6a-bf82af89765d
--setuser ceph --setgroup ceph
root@ceph01:~#
On 1/29/2020 3:15 AM, Jan Fajerski wrote:
On Tue, Jan 28, 202
On 1/28/2020 3:05 AM, Jan Fajerski wrote:
On Mon, Jan 27, 2020 at 03:23:55PM -0500, Dave Hall wrote:
All,
I've just spent a significant amount of time unsuccessfully chasing
the _read_fsid unparsable uuid error on Debian 10 / Natilus 14.2.6.
Since this is a brand new cluster, last night
bytes, 23437770752 sectors
Disk /dev/sdj: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
I'd send the output of ceph-volume inventory on Luminous, but I'm
getting -->: KeyError: 'human_readable_size'.
Please let me know if I can provide any further information.
Thanks.
-Dave