> Hi Everyone,
>
> I'm putting together an HDD cluster with an EC pool dedicated to the backup
> environment. Traffic via S3. Version 18.2, 7 OSD nodes, 12 * 12TB HDD +
> 1 NVMe each,
QLC, man. QLC. That said, I hope you're going to use that single NVMe SSD for
at least the index pool. Is
Why not? The hardware architecture doesn't matter.
> On May 25, 2024, at 07:35, filip Mutterer wrote:
>
> Is this known to be working:
>
> Setting up the Ceph Cluster with ARM and then use the storage with X86
> Machines for example LXC, Docker and KVM?
>
> Is this possible?
>
> Greetings
>
> filip
>
I'm interested in these responses. Early this year a certain someone related
having good results by deploying an RGW on every cluster node. This was when
we were experiencing ballooning memory usage conflicting with K8s limits when
running 3. So on the cluster in question we now run 25.
> I think it is his lab so maybe it is a test setup for production.
Home production?
>
> I don't think it matters too much with scrubbing, it is not like it is related
> to how long you were offline. It will scrub just as much being 1 month
> offline as being 6 months offline.
>
>>
>> If
If you have a single node arguably ZFS would be a better choice.
> On May 21, 2024, at 14:53, adam.ther wrote:
>
> Hello,
>
> To save on power in my home lab can I have a single node CEPH cluster sit
> idle and powered off for 3 months at a time then boot only to refresh
> backups? Or will
>
>
> I have additional questions,
> We use 13 disks (3.2TB NVMe) per server and allocate one OSD to each disk. In
> other words 1 node has 13 OSDs.
> Do you think this is inefficient?
> Is it better to create more OSD by creating LV on the disk?
Not with the most recent Ceph releases. I
> On May 20, 2024, at 2:24 PM, Matthew Vernon wrote:
>
> Hi,
>
> Thanks for your help!
>
> On 20/05/2024 18:13, Anthony D'Atri wrote:
>
>> You do that with the CRUSH rule, not with osd_crush_chooseleaf_type. Set
>> that back to the default value
>
>>> This has left me with a single sad pg:
>>> [WRN] PG_AVAILABILITY: Reduced data availability: 1 pg inactive
>>>pg 1.0 is stuck inactive for 33m, current state unknown, last acting []
>>>
>> .mgr pool perhaps.
>
> I think so
>
>>> ceph osd tree shows that CRUSH picked up my racks OK,
> On May 20, 2024, at 12:21 PM, Matthew Vernon wrote:
>
> Hi,
>
> I'm probably Doing It Wrong here, but. My hosts are in racks, and I wanted
> ceph to use that information from the get-go, so I tried to achieve this
> during bootstrap.
>
> This has left me with a single sad pg:
> [WRN]
If using jumbo frames, also ensure that they're consistently enabled on all OS
instances and network devices.
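A quick end-to-end check, assuming a 9000-byte MTU (peer host is a placeholder):
ping -M do -s 8972 -c 3 <peer-host>
-M do forbids fragmentation and 8972 is 9000 minus 28 bytes of IP/ICMP headers, so if the replies come back, jumbo frames survive the whole path.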
> On May 16, 2024, at 09:30, Frank Schilder wrote:
>
> This is a long shot: if you are using octopus, you might be hit by this
> pglog-dup problem:
>
https://docs.ceph.com/en/latest/start/os-recommendations/#platforms
You might want to go to 20.04, then to Reef, then to 22.04
> On May 13, 2024, at 12:22, Nima AbolhassanBeigi
> wrote:
>
> The ceph version is 16.2.13 pacific.
> It's deployed using ceph-ansible. (release branch stable-6.0)
I halfway suspect that something akin to the speculation in
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7MWAHAY7NCJK2DHEGO6MO4SWTLPTXQMD/
is going on.
Below are reservations reported by a random OSD that serves (mostly) an EC RGW
bucket pool. This is with the mclock
>> For our customers we are still disabling mclock and using wpq. Might be
>> worth trying.
>>
>>
> Could you please elaborate a bit on the issue(s) preventing the
> use of mClock. Is this specific to only the slow backfill rate and/or other
> issue?
>
> This feedback would help prioritize
Do you see *keys* aka omap traffic? Especially if you have RGW set up?
> On Apr 24, 2024, at 15:37, David Orman wrote:
>
> Did you ever figure out what was happening here?
>
> David
>
> On Mon, May 29, 2023, at 07:16, Hector Martin wrote:
>> On 29/05/2023
> On Apr 23, 2024, at 12:24, Maged Mokhtar wrote:
>
> For nvme:HDD ratio, yes you can go for 1:10, or if you have extra slots you
> can use 1:5 using smaller capacity/cheaper nvmes, this will reduce the impact
> of nvme failures.
On occasion I've seen a suggestion to mirror the fast
Sounds like an opportunity for you to submit an expansive code PR to implement
it.
> On Apr 23, 2024, at 04:28, Marc wrote:
>
>> I have removed dual-stack-mode-related information from the documentation
>> on the assumption that dual-stack mode was planned but never fully
>> implemented.
>>
me-series DB and watch
both for drives nearing EOL and their burn rates.
>
> On Sun, Apr 21, 2024 at 11:06 PM Anthony D'Atri
> wrote:
>>
>> A deep archive cluster benefits from NVMe too. You can use QLC up to 60TB
>> in size, 32 of those in one RU makes for a cluste
Vendor lock-in only benefits vendors. You’ll pay outrageously for support /
maint then your gear goes EOL and you’re trolling eBay for parts.
With Ceph you use commodity servers, you can swap 100% of the hardware without
taking downtime with servers and drives of your choice. And you get
A deep archive cluster benefits from NVMe too. You can use QLC up to 60TB in
size, 32 of those in one RU makes for a cluster that doesn’t take up the whole
DC.
> On Apr 21, 2024, at 5:42 AM, Darren Soothill wrote:
>
> Hi Niklaus,
>
> Lots of questions here but let me try and get through
,
> Malte
>
>> On 21.04.24 04:14, Anthony D'Atri wrote:
>> The party line is to jump no more than 2 major releases at once.
>> So that would be Octopus (15) to Quincy (17) to Reef (18).
>> Squid (19) is due out soon, so you may want to pause at Quincy until Squid
>>
The party line is to jump no more than 2 major releases at once.
So that would be Octopus (15) to Quincy (17) to Reef (18).
Squid (19) is due out soon, so you may want to pause at Quincy until Squid is
released and has some runtime and maybe 19.2.1, then go straight to Squid from
Quincy to
Look for unlinked but open files, it may not be Ceph at fault. Suboptimal
logrotate rules can cause this. lsof, fsck -n, etc.
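For example, to spot files that were deleted but are still held open, something like this works on most Linux hosts:
lsof +L1
That lists open files with a link count below 1, i.e. unlinked; restarting or HUPing the daemon holding them usually releases the space.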
> On Apr 19, 2024, at 05:54, Sake Ceph wrote:
>
> Hi Matthew,
>
> Cephadm doesn't clean up old container images, at least with Quincy. After an
> upgrade we run the
This is a ymmv thing, it depends on one's workload.
>
> However, we have some questions about this and are looking for some guidance
> and advice.
>
> The first one is about the expected benefits. Before we undergo the efforts
> involved in the transition, we are wondering if it is even
If you're using SATA/SAS SSDs I would aim for 150-200 PGs per OSD as shown by
`ceph osd df`.
If NVMe, 200-300 unless you're starved for RAM.
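To see where you stand and nudge a pool toward a target, roughly (pool name and pg_num value are placeholders):
ceph osd df                     # PGS column shows PG replicas per OSD
ceph osd pool autoscale-status
ceph osd pool set rbdpool pg_num 2048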
> On Apr 15, 2024, at 07:07, Mitsumasa KONDO wrote:
>
> Hi Menguy-san,
>
> Thank you for your reply. Users who use large IO with tiny volumes are a
>
If you're using an Icinga active check that just looks for
SMART overall-health self-assessment test result: PASSED
then it's not doing much for you. That bivalue status can be shown for a drive
that is decidedly an ex-parrot. Gotta look at specific attributes, which is
thorny since they
One can up the ratios temporarily but it's all too easy to forget to reduce
them later, or think that it's okay to run all the time with reduced headroom.
Until a host blows up and you don't have enough space to recover into.
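If you do raise them temporarily, it's roughly the following, with example values only, and remember to set them back to 0.85/0.90/0.95 afterwards:
ceph osd set-nearfull-ratio 0.88
ceph osd set-backfillfull-ratio 0.92
ceph osd set-full-ratio 0.96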
> On Apr 12, 2024, at 05:01, Frédéric Nass
> wrote:
>
>
> Oh, and
My understanding is that omap and EC are incompatible, though.
> On Apr 8, 2024, at 09:46, David Orman wrote:
>
> I would suggest that you might consider EC vs. replication for index data,
> and the latency implications. There's more than just the nvme vs. rotational
> discussion to
ISTR that the Ceph slow op threshold defaults to 30 or 32 seconds. Naturally
an op over the threshold often means there are more below the reporting
threshold.
120s I think is the default Linux op timeout.
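The knob in question is osd_op_complaint_time; to check or adjust it, something like:
ceph config get osd osd_op_complaint_time
ceph config set osd osd_op_complaint_time 30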
> On Apr 6, 2024, at 10:53 AM, David C. wrote:
>
> Hi,
>
> Do slow ops impact
A bucket may contain objects spread across multiple storage classes, and AIUI
the head object is always in the default storage class, so I'm not sure
*exactly* what you're after here.
> On Apr 4, 2024, at 17:09, Ondřej Kukla wrote:
>
> Hello,
>
> I’m playing around with Storage classes in
> Istvan Szabo
> Staff Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com<mailto:istvan.sz...@agoda.com>
> -------
>
>
>
> _
Network RTT?
> On Apr 4, 2024, at 03:44, Noah Elias Feldt wrote:
>
> Hello,
> I have a question about a setting for RBD.
> How exactly does "rbd_read_from_replica_policy" with the value "localize"
> work?
> According to the RBD documentation, read operations will be sent to the
> closest OSD
t all, but IMO RGW
> index/usage(/log/gc?) pools are always better off using asynchronous
> recovery.
>
> Josh
>
> On Wed, Apr 3, 2024 at 1:48 PM Anthony D'Atri wrote:
>>
>> We currently have in src/common/options/global.yaml.in
>>
>> - name: osd_async_
Depending on your Ceph release you might need to enable rbdstats.
Are you after provisioned, allocated, or both sizes? Do you have object-map
and fast-diff enabled? They speed up `rbd du` massively.
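A rough sketch of checking and enabling them (pool/image names are placeholders; object-map also needs exclusive-lock on the image):
rbd info rbdpool/vm-disk-1 | grep features
rbd feature enable rbdpool/vm-disk-1 object-map fast-diff
rbd object-map rebuild rbdpool/vm-disk-1
rbd du rbdpool/vm-disk-1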
> On Apr 3, 2024, at 00:26, Szabo, Istvan (Agoda)
> wrote:
>
> Hi,
>
> Trying to pull out
We currently have in src/common/options/global.yaml.in
- name: osd_async_recovery_min_cost
type: uint
level: advanced
desc: A mixture measure of number of current log entries difference and
historical
missing objects, above which we switch to use asynchronous recovery when
Do these RBD volumes have a full feature set? I would think that fast-diff and
objectmap would speed this.
> On Apr 2, 2024, at 00:36, Henry lol wrote:
>
> I'm not sure, but it seems that read and write operations are
> performed for all objects in rbd.
> If so, is there any method to apply
> a001s017.bpygfm(active, since 13M), standbys: a001s016.ctmoay
Looks like you just had an mgr failover? Could be that the secondary mgr
hasn't caught up with current events.
Yes, you will need to create datacenter buckets and move your host buckets
under them.
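Roughly like this, with placeholder names:
ceph osd crush add-bucket dc1 datacenter
ceph osd crush add-bucket dc2 datacenter
ceph osd crush move dc1 root=default
ceph osd crush move dc2 root=default
ceph osd crush move host1 datacenter=dc1
Then point your CRUSH rule(s) at the datacenter level as the failure domain.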
> On Mar 26, 2024, at 09:18, ronny.lippold wrote:
>
> hi there, need some help please.
>
> we are planning to replace our rbd-mirror setup and go to stretch mode.
> the goal is, to have the cluster in 2
First try "ceph osd down 89"
> On Mar 25, 2024, at 15:37, Alexander E. Patrakov wrote:
>
> On Mon, Mar 25, 2024 at 7:37 PM Torkil Svensgaard wrote:
>>
>>
>>
>> On 24/03/2024 01:14, Torkil Svensgaard wrote:
>>> On 24-03-2024 00:31, Alexander E. Patrakov wrote:
Hi Torkil,
>>>
>>> Hi
I fear this will raise controversy, but in 2024 what’s the value in
perpetuating an interface from early 1980s BITnet batch operating systems?
> On Mar 23, 2024, at 5:45 AM, Janne Johansson wrote:
>
>> Sure! I think Wido just did it all unofficially, but afaik we've lost
>> all of those
Perhaps emitting an extremely low value could have value for identifying a
compromised drive?
> On Mar 22, 2024, at 12:49, Michel Jouvin
> wrote:
>
> Frédéric,
>
> We arrived at the same conclusions! I agree that an insane low value would be
> a good addition: the idea would be that the
n this forum people recommend upgrading "M3CR046"
> https://forums.unraid.net/topic/134954-warning-crucial-mx500-ssds-world-of-pain-stay-away-from-these/
> But actually in my ud cluster all the drives are "M3CR045" and have lower
> latency. I'm really confused.
>
https://askubuntu.com/questions/1454997/how-to-stop-sys-from-changing-usb-ssd-provisioning-mode-from-unmap-to-full-in-ub
How to stop sys from changing USB SSD provisioning_mode from unmap to full in
Ubuntu 22.04?
?
> On Mar 22, 2024, at 09:36, Özkan Göksu wrote:
>
> Hello!
>
>
> 37  ssd  18.19040  1.0  18 TiB   13 TiB   13 TiB  13 GiB  53 GiB  5.0 TiB  72.78  1.21  179  up
> 43  ssd  18.19040  1.0  18 TiB  8.9 TiB  8.8 TiB  17 GiB  23 GiB  9.3 TiB  48.71  0.81  178  up
> TOTAL  873 TiB  527 TiB  525 Ti
Grep through the ls output for ‘rados bench’ leftovers, it’s easy to leave them
behind.
> On Mar 20, 2024, at 5:28 PM, Igor Fedotov wrote:
>
> Hi Thorne,
>
> unfortunately I'm unaware of any tools high level enough to easily map files
> to rados objects without deep understanding how this
> ult.rgw.jv-comm-pool.non-ec   64  32      0 B        0      0 B  0     61 TiB
> default.rgw.jv-va-pool.data   65  32  4.8 TiB   22.17M   14 TiB  7.28  61 TiB
> default.rgw.jv-va-pool.index  66  32   38 GiB      401  113 GiB  0.06  61 TiB
> default.rg
> On Mar 20, 2024, at 14:42, Michael Worsham
> wrote:
>
> Is there an easy way to poll a Ceph cluster to see how much space is available
`ceph df`
The exporter has percentages per pool as well.
> and how much space is available per bucket?
Are you using RGW quotas?
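For per-bucket numbers radosgw-admin is the usual tool, e.g. (bucket name is a placeholder):
radosgw-admin bucket stats --bucket=mybucket
radosgw-admin bucket stats          # all buckets; can be slow with many of them
Per-bucket usage shows up under "usage" in the JSON output.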
>
> Looking for a
Suggest issuing an explicit deep scrub against one of the subject PGs, see if
it takes.
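Something like (PG id is a placeholder):
ceph pg deep-scrub 1.2f
ceph pg dump pgs | grep scrubbing     # see whether it actually starts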
> On Mar 20, 2024, at 8:20 AM, Michel Jouvin
> wrote:
>
> Hi,
>
> We have a Reef cluster that started to complain a couple of weeks ago about
> ~20 PGs (over 10K) not scrubbed/deep-scrubbed in time.
> Those files are VM disk images, and they're under constant heavy use, so yes-
> there/is/ constant severe write load against this disk.
Why are you using CephFS for an RBD application?
NVMe (hope those are enterprise not client) drives aren't likely to suffer the
same bottlenecks as HDDs or even SATA SSDs. And a 2:1 size ratio isn't the
largest I've seen.
So I would just use all 108 OSDs as a single device class and spread the pools
across all of them. That way you won't
I think heartbeats will failover to the public network if the private doesn't
work -- may not have always done that.
>> Hi
>> Cephadm Reef 18.2.0.
>> We would like to remove our cluster_network without stopping the cluster and
>> without having to route between the networks.
>> global
> On Feb 28, 2024, at 17:55, Joel Davidow wrote:
>
> Current situation
> -
> We have three Ceph clusters that were originally built via cephadm on octopus
> and later upgraded to pacific. All osds are HDD (will be moving to wal+db on
> SSD) and were resharded after the
I don't see these in the config dump.
I think you might have to apply them to `global` for them to take effect, not
just `osd`, FWIW.
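i.e. roughly, with example values:
ceph config set global osd_deep_scrub_interval 1209600
ceph config set global osd_max_scrubs 2
ceph config dump | grep scrub        # confirm which section the values landed in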
> I have tried various settings, like osd_deep_scrub_interval, osd_max_scrubs,
> mds_max_scrub_ops_in_progress etc.
> All those get ignored.
the right how many PG replicas are on each OSD.
> On Mar 5, 2024, at 14:50, Nikolaos Dandoulakis wrote:
>
> Hi Anthony,
>
> I should have said, it’s replicated (3)
>
> Best,
> Nick
>
> Sent from my phone, apologies for any typos!
> From: Anthony D'Atri
> Sen
Replicated or EC?
> On Mar 5, 2024, at 14:09, Nikolaos Dandoulakis wrote:
>
> Hi all,
>
> Pretty sure not the first time you see a thread like this.
>
> Our cluster consists of 12 nodes/153 OSDs/1.2 PiB used, 708 TiB /1.9 PiB avail
>
> The data pool is 2048 pgs big exactly the same number as
* Try applying the settings to global so that mons/mgrs get them.
* Set your shallow scrub settings back to the default. Shallow scrubs take
very few resources
* Set your randomize_ratio back to the default, you’re just bunching them up
* Set the load threshold back to the default, I can’t
> I think the short answer is "because you have so wildly varying sizes
> both for drives and hosts".
Arguably OP's OSDs *are* balanced in that their PGs are roughly in line with
their sizes, but indeed the size disparity is problematic in some ways.
Notably, the 500GB OSD should just be
> On Mar 2, 2024, at 10:37 AM, Erich Weiler wrote:
>
> Hi Y'all,
>
> We have a new ceph cluster online that looks like this:
>
> md-01 : monitor, manager, mds
> md-02 : monitor, manager, mds
> md-03 : monitor, manager
> store-01 : twenty 30TB NVMe OSDs
> store-02 : twenty 30TB NVMe OSDs
>
>
I have a number of drives in my fleet with old firmware that seems to have
discard / TRIM bugs, as in the drives get bricked.
Much worse is that since they're on legacy RAID HBAs, many of them can't be
updated.
ymmv.
> On Mar 1, 2024, at 13:15, Igor Fedotov wrote:
>
> I played with this
>
> I'm designing a new Ceph storage from scratch and I want to increase CephFS
> speed and decrease latency.
> Usually I always build (WAL+DB on NVME with Sas-Sata SSD's)
Just go with pure-NVMe servers. NVMe SSDs shouldn't cost much if anything more
than the few remaining SATA or
> Low space hindering backfill (add storage if this doesn't resolve
> itself): 21 pgs backfill_toofull
^^^ Ceph even told you what you need to do ;)
If your have recovery taking place and the numbers of misplaced objects and
*full PGs/pools keeps decreasing, then yes wait.
As for
Your recovery is stuck because there are no OSDs that have enough space to
accept data.
Your second OSD host appears to only have 9 OSDs currently, so you should be
able to add a 10TB OSD there without removing anything.
That will enable data to move to all three of your 10TB OSDs.
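With cephadm that's typically a one-liner, e.g. (host and device are placeholders):
ceph orch daemon add osd osd-host-2:/dev/sdj
ceph osd df tree      # confirm the new OSD came up and watch the backfill headroom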
> On Feb
You aren’t going to be able to finish recovery without having somewhere to
recover TO.
> On Feb 24, 2024, at 10:33 AM, nguyenvand...@baoviet.com.vn wrote:
>
> Thank you, Sir. But i think i ll wait for PG BACKFILLFULL finish, my boss is
> very angry now and will not allow me to add one more
You also might want to increase mon_max_pg_per_osd since you have a wide spread
of OSD sizes.
Default is 250. Set it to 1000.
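Roughly:
ceph config set global mon_max_pg_per_osd 1000
Setting it at the global level means the mons and mgr both pick it up.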
> On Feb 24, 2024, at 10:30 AM, Anthony D'Atri wrote:
>
> Add a 10tb HDD to the third node as I suggested, that will help your cluster.
>
>
>
Add a 10tb HDD to the third node as I suggested, that will help your cluster.
> On Feb 24, 2024, at 10:29 AM, nguyenvand...@baoviet.com.vn wrote:
>
> I will correct some small things:
>
> we have 6 nodes, 3 OSD nodes and 3 gateway nodes (which run RGW, MDS and NFS
> services)
> you are correct, 2/3
# ceph osd dump | grep ratio
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
Read the four sections here:
https://docs.ceph.com/en/quincy/rados/operations/health-checks/#osd-out-of-order-full
> On Feb 24, 2024, at 10:12 AM, nguyenvand...@baoviet.com.vn wrote:
>
> Hi Mr Anthony,
There ya go.
You have 4 hosts, one of which appears to be down and have a single OSD that is
so small as to not be useful. Whatever cephgw03 is, it looks like a mistake.
OSDs much smaller than, say, 1TB often aren’t very useful.
Your pools appear to be replicated, size=3.
So each of your
>
> 2) It looks like you might have an interesting crush map. Allegedly you have
> 41TiB of space but you can’t finish recovering; you have lots of PGs stuck as
> their destination is too full. Are you running homogenous hardware or do you
> have different drive sizes? Are all the weights set
> you can sometimes find really good older drives like Intel P4510s on ebay
> for reasonable prices. Just watch out for how much write wear they have on
> them.
Also be sure to update to the latest firmware before use, then issue a Secure
Erase.
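With nvme-cli that's roughly the following; the device is a placeholder and this destroys all data on it:
nvme format /dev/nvme0n1 --ses=1      # user-data erase; --ses=2 is a crypto erase where supported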
>
>>
>> This situation will permit some rules to be relaxed (even if they are not
>> ok at first).
>> Likewise, there are already situations like lazyio that make some
>> exceptions to standard procedures.
>>
>>
>> Remembering: it's just a suggestion.
>
It would be better if this feature could make a replica at a second time on
> selected pool.
> Thanks.
> Rafael.
>
>
>
> De: "Anthony D'Atri"
> Enviada: 2024/02/01 15:00:59
> Para: quag...@bol.com.br
> Cc: ceph-users@ceph.io
> Assunto: [ceph-users] Re:
>> After wrangling with this myself, both with 17.2.7 and to an extent with
>> 17.2.5, I'd like to follow up here and ask:
>> Those who have experienced this, were the affected PGs
>> * Part of an EC pool?
>> * Part of an HDD pool?
>> * Both?
>
> Both in my case, EC is 4+2 jerasure blaum_roth
After wrangling with this myself, both with 17.2.7 and to an extent with
17.2.5, I'd like to follow up here and ask:
Those who have experienced this, were the affected PGs
* Part of an EC pool?
* Part of an HDD pool?
* Both?
>
> You don't say anything about the Ceph version you are running.
Notably, the tiebreaker should be in a third location.
> On Feb 14, 2024, at 05:16, Peter Sabaini wrote:
>
> On 14.02.24 06:59, Vladimir Sigunov wrote:
>> Hi Ronny,
>> This is a good starting point for your design.
>> https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
>>
>> My
rives were automatically added as OSD, and the Cluster was returning to
> normal state. Currently, the degraded PGs are recovering,
>
> Thank you
>
>> Anthony D'Atri wrote:
>> You probably have the H330 HBA, rebadged LSI. You can
You probably have the H330 HBA, rebadged LSI. You can set the “mode” or
“personality” using storcli / perccli. You might need to remove the VDs from
them too.
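From memory it's something along these lines with perccli/storcli; exact syntax varies by controller generation, so treat it as a sketch and expect data on the VDs to be lost:
perccli /c0/vall show
perccli /c0/vall delete
perccli /c0 set jbod=on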
> On Feb 12, 2024, at 7:53 PM, sa...@dcl-online.com wrote:
>
> Hello,
>
> I have a Ceph cluster created by orchestrator Cephadm.
ag for that as it does not contain md5 in case of multipart
> upload.
>
> Michal
>
>
> On 2/9/24 13:53, Anthony D'Atri wrote:
>> You could use Lua scripting perhaps to do this at ingest, but I'm very
>> curious about scrubs -- you have them turned off
You could use Lua scripting perhaps to do this at ingest, but I'm very curious
about scrubs -- you have them turned off completely?
> On Feb 9, 2024, at 04:18, Michal Strnad wrote:
>
> Hi all!
>
> In the context of a repository-type project, we need to address a situation
> where we cannot
> Hi everyone,
>
> I saw the bluestore can separate block.db, block.wal.
> In my case, I'd like to apply hybrid device which uses SSD, HDD to improve
> the small data write performance.
> but I don't have enough SSD to cover block.db and block.wal.
> so I think it can impact performance even
or pre-shard it in advance for the eventual size of the bucket.
Recent releases have a feature that does this automatically, if it's enabled.
My command of these dynamics is limited, so others on the list may be able to
chime in with refinements.
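For reference, a sketch of a manual reshard (bucket name and shard count are placeholders):
radosgw-admin bucket reshard --bucket=mybucket --num-shards=199
radosgw-admin reshard status --bucket=mybucket
The automatic variant is dynamic resharding, governed by the rgw_dynamic_resharding option.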
>
> Thanks for the help already!
>
>
> I have recently onboarded new OSDs into my Ceph Cluster. Previously, I had
> 44 OSDs of 1.7TiB each and was using it for about a year. About 1 year ago,
> we onboarded an additional 20 OSDs of 14TiB each.
That's a big difference in size. I suggest increasing mon_max_pg_per_osd to
1000 --
Anything on HDDs yields suboptimal performance.
> On Feb 4, 2024, at 13:42, Niklas Hambüchen wrote:
>
> https://docs.ceph.com/en/reef/cephfs/createfs/ says:
>
>> The data pool used to create the file system is the “default” data pool and
>> the location for storing all inode backtrace
I’ve done the pg import dance a couple of times. It was very slow but did work
ultimately.
Depending on the situation, if there is one valid copy available one can enable
recovery by temporarily setting min_size on the pool to 1, reverting it once
recovery completes.
You run with 1
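i.e. roughly, with a placeholder pool name, and don't forget the revert:
ceph osd pool set rbdpool min_size 1
# ... wait for recovery to complete ...
ceph osd pool set rbdpool min_size 2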
The slashes don’t mean much if anything to Ceph. Buckets are not hierarchical
filesystems.
You speak of millions of files. How many millions?
How big are they? Very small objects stress any object system. Very large
objects may be multi part uploads that stage to slow media or otherwise
You adjusted osd_memory_target? Higher than the default 4GB?
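For reference, bumping it looks roughly like this (value is just an example, in bytes):
ceph config set osd osd_memory_target 8589934592
ceph config get osd osd_memory_target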
>
>
> Another thing that we've found is that rocksdb can become quite slow if it
> doesn't have enough memory for internal caches. As our cluster usage has
> grown, we've needed to increase OSD memory in accordance with bucket
suggestion.
> If this type of functionality is not interesting, it is ok.
>
>
>
> Rafael.
>
>
> De: "Anthony D'Atri"
> Enviada: 2024/02/01 12:10:30
> Para: quag...@bol.com.br
> Cc: ceph-users@ceph.io
> Assunto: [ceph-users] Re: Performance improvement
that like 40 TIMES better density with SSDs.
> However, I don't think it's interesting to lose the functionality of the
> replicas.
> I'm just suggesting another way to increase performance without losing
> the functionality of replicas.
>
>
> Rafael.
>
>
>
I’ve heard conflicting asserts on whether the write returns once min_size
shards have been persisted, or all of them.
> On Jan 31, 2024, at 2:58 PM, Can Özyurt wrote:
>
> I never tried this myself but "min_size = 1" should do what you want to
> achieve.
Would you be willing to accept the risk of data loss?
> On Jan 31, 2024, at 2:48 PM, quag...@bol.com.br wrote:
>
> Hello everybody,
> I would like to make a suggestion for improving performance in Ceph
> architecture.
> I don't know if this group would be the best place or if my
>
> so .. in a PG there are no "file data" but pieces of "file data"?
Yes. Chapter 8 may help here, but be warned, it’s pretty dense and may confuse
more than help.
The foundation layer of Ceph is RADOS — services including block (RBD), file
(CephFS), and object (RGW) storage are built on
>
>>> so it depends on failure domain .. but with host failure domain, if there
>>> is space on some other OSDs
>>> will the missing OSDs be "healed" on the available space on some other OSDs?
>> Yes, if you have enough hosts. When using 3x replication it is thus
>> advantageous to have at
>> Oh! so the device class is more like an arbitrary label not a immutable
>> defined property!
>> looking at
>> https://docs.ceph.com/en/reef/rados/operations/crush-map/#device-classes
>> this is not specified …
"By default, OSDs automatically set their class at startup to hdd, ssd, or
>
> Just a continuation of this mail, Could you help me out to understand the ceph
> df output. PFA the screenshot with this mail.
No idea what PFA means, but attachments usually don’t make it through on
mailing lists. Paste text instead.
> 1. Raw storage is 180 TB
The sum of OSD total
>
> First a all, thanks a lot for for info and taking time to help
> a beginner :)
Don’t mention it. This is a community, it’s what we do. Next year you’ll
help someone else.
>>>
> Oh! so the device class is more like an arbitrary label not a immutable
> defined property!
> looking at
I’ve seen C-states impact mons by dropping a bunch of packets — on nodes that
were lightly utilized so they transitioned a lot. Curiously both CPU and NIC
generation seemed to be factors, as it only happened on one cluster out of a
dozen or so.
If by SSD you mean SAS/SATA SSDs, then the
>
>>> forth), so this is why "ceph df" will tell you a pool has X free
>>> space, where X is "smallest free space on the OSDs on which this pool
>>> lies, times the number of OSDs".
To be even more precise, this depends on the failure domain. With the typical
"rack" failure domain, say you
Conventional wisdom is that with recent Ceph releases there is no longer a
clear advantage to this.
> On Jan 17, 2024, at 11:56, Peter Sabaini wrote:
>
> One thing that I've heard people do but haven't done personally with fast
> NVMes (not familiar with the IronWolf so not sure if they
> Also in our favour is that the users of the cluster we are currently
> intending for this have established a practice of storing large objects.
That definitely is in your favor.
> but it remains to be seen how 60x 22TB behaves in practice.
Be sure you don't get SMR drives.
> and it's
>
> NVMe SSDs shouldn’t cost significantly more than SATA SSDs. Hint: certain
> tier-one chassis manufacturers mark both the fsck up. You can get a better
> warranty and pricing by buying drives from a VAR.
>
> We stopped buying “Vendor FW” drives a long time ago.
Groovy.
by “RBD for cloud”, do you mean VM / container general-purposes volumes on
which a filesystem is usually built? Or large archive / backup volumes that
are read and written sequentially without much concern for latency or
throughput?
How many of those ultra-dense chassis in a cluster? Are all