[ceph-users] How to deploy Ceph with SSDs?

2021-05-10 Thread codignotto
I'm deploying 6 Ceph servers with 128 GB of memory each, 12 SSDs of 1 TB in
each server, and 10 Gb network cards connected to 10 Gb switch ports. I'm
following this documentation:

https://docs.ceph.com/en/octopus/cephadm/install/

But I don't know if this is the best way to get the most out of the disks.
I will use it with RBD only and hand it to a Proxmox cluster. Do you
have any more complete documentation? Any tuning tips for getting the best
performance out of the SSDs?
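
For context, after bootstrapping with cephadm I was planning to create the
OSDs and the RBD pool roughly like this (just a sketch based on the cephadm
docs workflow; the pool name is a placeholder and the autoscaler picks the
PG count):

    ceph orch apply osd --all-available-devices
    ceph osd pool create rbd-proxmox
    rbd pool init rbd-proxmox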

Many Tks


[ceph-users] Re: v16.2.2 Pacific released

2021-05-10 Thread Mike Perez
Hi Norman,

Here's the correct link.

https://docs.ceph.com/en/latest/install/get-packages/

On Fri, May 7, 2021 at 9:04 PM kefu chai  wrote:
>
> On Sat, May 8, 2021 at 10:42 AM Norman.Kern  wrote:
> >
> > Hi David,
> >
> > The web page is missing: 
> > https://docs.ceph.com/en/latest/docs/master/install/get-packages/
>
> probably we should just replace "http" with "https" in
>
> > > * For packages, see http://docs.ceph.com/docs/master/install/get-packages/
>
> >
> >
> > [ASCII art of the docs.ceph.com 404 page: "SORRY - This page does not exist yet."]
> >
> > On 2021/5/6 at 12:51 AM, David Galloway wrote:
> > > This is the second backport release in the Pacific stable series. For a
> > > detailed release notes with links & changelog please refer to the
> > > official blog entry at https://ceph.io/releases/v16-2-2-pacific-released
> > >
> > > Notable Changes
> > > ---
> > > * Cephadm now supports an *ingress* service type that provides load
> > > balancing and HA (via haproxy and keepalived on a virtual IP) for RGW
> > > service.  The experimental *rgw-ha* service has been removed.
> > >
> > > Getting Ceph
> > > 
> > > * Git at git://github.com/ceph/ceph.git
> > > * Tarball at http://download.ceph.com/tarballs/ceph-16.2.2.tar.gz
> > > * For packages, see http://docs.ceph.com/docs/master/install/get-packages/
> > > * Release git sha1: e8f22dde28889481f4dda2beb8a07788204821d3
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>
> --
> Regards
> Kefu Chai
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
Mike Perez


[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-05-10 Thread 特木勒
Hi Istvan:

Thanks for your help.

After we rewrote all the objects in the buckets, the sync seems to work
again.

We are using this command to rewrite all the objects in a specific bucket:
`radosgw-admin bucket rewrite --bucket=BUCKET_NAME --min-rewrite-size 0`

You can try to run this on one bucket and see if it helps you fix the
problem.
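
For completeness, this is roughly how we drove it and checked the result
afterwards (a sketch; the bucket name is a placeholder):

    radosgw-admin bucket rewrite --bucket=BUCKET_NAME --min-rewrite-size 0
    radosgw-admin bucket sync status --bucket=BUCKET_NAME   # per-bucket view
    radosgw-admin sync status                               # zone-wide view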

Thank you~

Szabo, Istvan (Agoda)  wrote on Monday, May 10, 2021 at 12:16 PM:

> So how is your multisite setup going at the moment? Seems like with this
> rewrite you’ve moved further than me 😊 Is it working properly now? If
> yes, what are the steps to make it work? Where is the magic 😊 ?
>
>
>
> Istvan Szabo
> Senior Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> ---
>
>
>
> *From:* 特木勒 
> *Sent:* Thursday, May 6, 2021 11:27 AM
> *To:* Jean-Sebastien Landry 
> *Cc:* Szabo, Istvan (Agoda) ; ceph-users@ceph.io;
> Amit Ghadge 
> *Subject:* Re: [ceph-users] Re: [Suspicious newsletter] RGW: Multiple
> Site does not sync olds data
>
>
>
> Hi Jean:
>
>
>
> Thanks for your info.
>
>
>
> Unfortunately I checked the secondary cluster and no objects had been
> synced. The only way I have is to force a rewrite of the objects for whole buckets.
>
>
>
> I have tried to set up multiple site between Nautilus and octopus. It
> works pretty well. But after I upgrade primary cluster to octopus, we have
> this issue. :(
>
>
>
> Here is the issue: https://tracker.ceph.com/issues/49542#change-193975
>
>
>
> Thanks
>
>
>
> Jean-Sebastien Landry  wrote on Tuesday, April 27, 2021 at 7:52 PM:
>
> Hi, I hit the same errors when doing multisite sync between Luminous and
> Octopus, but what I found is that my sync errors were mainly on old
> multipart and shadow objects, at the "rados level" if I might say
> (leftovers from Luminous bugs).
>
> So check at the "user level", using s3cmd/awscli and the objects' md5;
> you will probably find that you're pretty much in sync. Hopefully.
>
> Cheers!
>
> On 4/25/21 11:29 PM, 特木勒 wrote:
> > [Externe UL*]
> >
> > Another problem I noticed for a new bucket: the first object in the bucket
> > will not be synced; the sync will start with the second object. I tried to
> > fix the index on the bucket and manually rerun bucket sync, but the first
> > object still does not sync to the secondary cluster.
> >
> > Do you have any ideas for this issue?
> >
> > Thanks
> >
> > 特木勒  wrote on Monday, April 26, 2021 at 11:16 AM:
> >
> >> Hi Istvan:
> >>
> >> Thanks Amit's suggestion.
> >>
> >> I followed his suggestion to fix bucket index and re-do sync on buckets,
> >> but it still did not work for me.
> >>
> >> Then I tried to use bucket rewrite command to rewrite all the objects in
> >> buckets and it works for me. I think the reason is there's something
> wrong
> >> with bucket index and rewrite has rebuilt the index.
> >>
> >> Here's the command I use:
> >> `sudo radosgw-admin bucket rewrite -b BUCKET-NAME --min-rewrite-size 0`
> >>
> >> Maybe you can try this to fix the sync issues.
> >>
> >> @Amit Ghadge  Thanks for your suggestions. Without
> >> your suggestions, I will not notice something wrong with index part.
> >>
> >> Thanks :)
> >>
> >> Szabo, Istvan (Agoda)  wrote on Monday, April 26, 2021 at 9:57 AM:
> >>
> >>> Hi,
> >>>
> >>>
> >>>
> >>> No, doesn’t work, now we will write our own sync app for ceph, I gave
> up.
> >>>
> >>>
> >>>
> >>> Istvan Szabo
> >>> Senior Infrastructure Engineer
> >>> ---
> >>> Agoda Services Co., Ltd.
> >>> e: istvan.sz...@agoda.com
> >>> ---
> >>>
> >>>
> >>>
> >>> *From:* 特木勒 
> >>> *Sent:* Friday, April 23, 2021 7:50 PM
> >>> *To:* Szabo, Istvan (Agoda) 
> >>> *Cc:* ceph-users@ceph.io
> >>> *Subject:* Re: [Suspicious newsletter] [ceph-users] RGW: Multiple Site
> >>> does not sync olds data
> >>>
> >>>
> >>>
> >>> Hi Istvan:
> >>>
> >>>
> >>>
> >>> We just upgraded whole cluster to 15.2.10 and the multiple site still
> >>> cannot sync whole objects to secondary cluster. 🙁
> >>>
> >>>
> >>>
> >>> Do you have any suggestions on this? And I open another issues in ceph
> >>> tracker site:
> >>>
> >>>
> https://tracker.ceph.com/issues/50474
> >>>
> >>>
> >>>
> >>> Hope someone could go to check this issue.
> >>>
> >>>
> >>>
> >>> Thanks
> >>>
> >>>
> >>>
> >>> 特木勒 wrote on Monday, March 22, 2021 at 9:08 PM:
> >>>
> >>> Thank you~
> >>>
> >>>
> >>>
> >>> I will try to upgrade cluster too. Seem like this is the only way for
> >>> now. 😭
> >>>
> >>>
> >>>
> >>> I will let you know once I complete testing. :)
> >>>
> >>>
> >>>
> >>> Have a good day
>

[ceph-users] Re: Stuck OSD service specification - can't remove

2021-05-10 Thread David Orman
This turns out to be worse than we thought. We attempted another Ceph
upgrade (15.2.10->16.2.3) on another cluster, and have run into this
again. We're seeing strange behavior with the OSD specifications,
which also have a count that is #OSDs + #hosts, so for example, on a
504 OSD cluster (21 nodes of 24 OSDs), we see:

osd.osd_spec    504/525    6s    *

It never deletes, and we cannot apply a specification over it (we
attempt, and it stays in deleting state - and a --export does not show
any specification).

On 15.2.10 we didn't have this problem, it appears new in 16.2.x. We
are using 16.2.3.
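
For reference, these are roughly the commands we are cycling through while
trying to replace it (a sketch; the spec file is the label-based one quoted
below):

    ceph orch ls osd --export
    ceph orch rm osd.osd_spec
    ceph orch apply -i osd_spec.yaml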

Thanks,
David


On Fri, May 7, 2021 at 9:06 AM David Orman  wrote:
>
> Hi,
>
> I'm not attempting to remove the OSDs, but instead the
> service/placement specification. I want the OSDs/data to persist.
> --force did not work on the service, as noted in the original email.
>
> Thank you,
> David
>
> On Fri, May 7, 2021 at 1:36 AM mabi  wrote:
> >
> > Hi David,
> >
> > I had a similar issue yesterday where I wanted to remove an OSD on an OSD
> > node which had 2 OSDs, so for that I used the "ceph orch osd rm" command, which
> > completed successfully. But after rebooting that OSD node I saw it was still
> > trying to start the systemd service for that OSD, and one CPU core was 100%
> > busy trying to do a "crun delete", which I suppose here is trying to delete
> > an image or container. So what I did here was to kill this process, and I
> > also had to run the following command:
> >
> > ceph orch daemon rm osd.3 --force
> >
> > After that everything was fine again. This is a Ceph 15.2.11 cluster on 
> > Ubuntu 20.04 and podman.
> >
> > Hope that helps.
> >
> > ‐‐‐ Original Message ‐‐‐
> > On Friday, May 7, 2021 1:24 AM, David Orman  wrote:
> >
> > > Has anybody run into a 'stuck' OSD service specification? I've tried
> > > to delete it, but it's stuck in 'deleting' state, and has been for
> > > quite some time (even prior to upgrade, on 15.2.x). This is on 16.2.3:
> > >
> > > NAME PORTS RUNNING REFRESHED AGE PLACEMENT
> > > osd.osd_spec 504/525  12m label:osd
> > > root@ceph01:/# ceph orch rm osd.osd_spec
> > > Removed service osd.osd_spec
> > >
> > > From active monitor:
> > >
> > > debug 2021-05-06T23:14:48.909+ 7f17d310b700 0
> > > log_channel(cephadm) log [INF] : Remove service osd.osd_spec
> > >
> > > Yet in ls, it's still there, same as above. --export on it:
> > >
> > > root@ceph01:/# ceph orch ls osd.osd_spec --export
> > > service_type: osd
> > > service_id: osd_spec
> > > service_name: osd.osd_spec
> > > placement: {}
> > > unmanaged: true
> > > spec:
> > >   filter_logic: AND
> > >   objectstore: bluestore
> > >
> > > We've tried --force, as well, with no luck.
> > >
> > > To be clear, the --export even prior to delete looks nothing like the
> > > actual service specification we're using, even after I re-apply it, so
> > > something seems 'bugged'. Here's the OSD specification we're applying:
> > >
> > > service_type: osd
> > > service_id: osd_spec
> > > placement:
> > >   label: "osd"
> > > data_devices:
> > >   rotational: 1
> > > db_devices:
> > >   rotational: 0
> > > db_slots: 12
> > >
> > > I would appreciate any insight into how to clear this up (without
> > > removing the actual OSDs, we're just wanting to apply the updated
> > > service specification - we used to use host placement rules and are
> > > switching to label-based).
> > >
> > > Thanks,
> > > David
> > >
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >


[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-10 Thread Boris Behrens
Hi guys,

does anyone have an idea?
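
In case it helps, this is roughly how I am timing the call while raising the
RGW debug level (a sketch; the uid is a throwaway test user):

    time radosgw-admin user create --uid=test-debug-user \
         --display-name=test-debug-user --debug-rgw=20 --debug-ms=1 \
         2>&1 | tee /tmp/rgw-user-create.log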

On Wed, May 5, 2021 at 4:16 PM Boris Behrens  wrote:

> Hi,
> since a couple of days we experience a strange slowness on some
> radosgw-admin operations.
> What is the best way to debug this?
>
> For example creating a user takes over 20s.
> [root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user
> --display-name=test-bb-user
> 2021-05-05 14:08:14.297 7f6942286840  1 robust_notify: If at first you
> don't succeed: (110) Connection timed out
> 2021-05-05 14:08:14.297 7f6942286840  0 ERROR: failed to distribute cache
> for eu-central-1.rgw.users.uid:test-bb-user
> 2021-05-05 14:08:24.335 7f6942286840  1 robust_notify: If at first you
> don't succeed: (110) Connection timed out
> 2021-05-05 14:08:24.335 7f6942286840  0 ERROR: failed to distribute cache
> for eu-central-1.rgw.users.keys:
> {
> "user_id": "test-bb-user",
> "display_name": "test-bb-user",
>
> }
> real 0m20.557s
> user 0m0.087s
> sys 0m0.030s
>
> First I thought that rados operations might be slow, but adding and
> deleting objects in rados are fast as usual (at least from my perspective).
> Also uploading to buckets is fine.
>
> We changed some things and I think it might have to do with this:
> * We have a HAProxy that distributes via leastconn between the 3 radosgw's
> (this did not change)
> * We had three daemons running with the same name "eu-central-1" (on the
> 3 radosgw's)
> * Because this might have led to our data duplication problem, we have
> split that up so now the daemons are named per host (eu-central-1-s3db1,
> eu-central-1-s3db2, eu-central-1-s3db3)
> * We also added dedicated rgw daemons for garbage collection, because the
> current one were not able to keep up.
> * So basically ceph status went from "rgw: 1 daemon active (eu-central-1)"
> to "rgw: 14 daemons active (eu-central-1-s3db1, eu-central-1-s3db2,
> eu-central-1-s3db3, gc-s3db12, gc-s3db13...)
>
>
> Cheers
>  Boris
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.


[ceph-users] Re: Building ceph clusters with 8TB SSD drives?

2021-05-10 Thread Erik Lindahl
Hi Matt,

Yes, we've experimented a bit with consumer SSDs, and also done some
benchmarks.

The main reason for SSDs is typically to improve IOPS for small writes,
since even HDDs will usually give you quite good aggregated bandwidth as
long as you have enough of them - but for high-IOPS usage most (all)
consumer SSDs we have tested perform badly in Ceph, in particular for
writes.

The reason for this is that Ceph requires sync writes, and since consumer
SSDs (and now even some cheap datacenter ones) don't have capacitors for
power-loss-protection they cannot use the volatile caches that give them
(semi-fake) good performance on desktops. If that sounds bad, you should be
even more careful if you shop around until you find a cheap drive that
performs well - because there have historically been consumer drives that
lie and acknowledge a sync even if the data is just in volatile memory
rather than safe :-)
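
(If you want to measure this yourself, the usual quick check is a single-job
4k sync-write run with fio - a sketch below; point it at a scratch file or an
empty, disposable device, since writing to a raw device destroys data:)

    fio --name=sync-write-test --filename=/path/to/testfile --size=4G \
        --direct=1 --sync=1 --rw=write --bs=4k --iodepth=1 --numjobs=1 \
        --runtime=60 --time_based --group_reporting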

The Samsung PM883 is one relatively cheap drive that we've been quite happy with -
at least if your application is not highly write-intensive. If it's
write-intensive you might need the longer-endurance SM883 (or similar from
other vendors).

Now, having said that, we have had pretty decent experience with a way to
partly cheat around these limitations: since we have a good dozen large
servers with mixed HDDs, we also have 2-3 NVMe Samsung PM983 M.2 drives per
server on PCIe cards for the DB/WAL for these OSDs. It seems to work
remarkably well to do this for consumer SSDs too, i.e. let each 4TB el
cheapo SATA SSD (we used Samsung 860) use a ~100GB DB/WAL partition on an
NVMe drive. This gives very nice low latencies in rados benchmarks,
although they are still ~50% higher than with proper enterprise SSDs.
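
(For reference, on a non-cephadm deployment this kind of layout can be
created in one go with ceph-volume - a sketch, device paths are placeholders;
add --report first to preview what it would do:)

    ceph-volume lvm batch --bluestore /dev/sdb /dev/sdc /dev/sdd /dev/sde \
        --db-devices /dev/nvme0n1 --block-db-size 100G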


Caveats:

- Think about balancing IOPS. If you have 10 SSD OSDs sharing a single NVMe
WAL device you will likely be limited by the NVMe IOPS instead.
- If the NVMe drive dies, all the corresponding OSDs die.
- This might work for read-intensive applications, but if you try it for
write-intensive applications you will wear out the consumer SSDs (check
their write endurance).
- When doing rados benchmarks, you will still see latency/bandwidth go
up/down and periodically throttle to almost zero for consumer SSDs,
presumably because they are busy flushing some sort of intermediate storage.


In comparison, even the relatively cheap pm883 "just works" at constant
high bandwidth close to the bus limit, and the latency is a constant low
fraction of a millisecond in ceph.

In summary, while somewhat possible, I simply don't think it's worth the
hassle/risk/complex setup with consumer drives (and god knows I can be a
cheap bastard at times ;-), but if you absolutely have to I would at least
avoid the absolutely cheapest QVO models (note that the QVO models have a
sustained bandwidth of only 80-160MB/s - that's like a magnetic spinner!) -
and if you don't put the WAL on a better device I predict you'll regret it
once you start doing benchmarks in RADOS.

Cheers,

Erik


On Fri, May 7, 2021 at 10:11 PM Matt Larson  wrote:

> Is anyone trying Ceph clusters containing larger (4-8TB) SSD drives?
>
> 8TB SSDs are described here (
>
> https://www.anandtech.com/show/16136/qlc-8tb-ssd-review-samsung-870-qvo-sabrent-rocket-q
> ) and make use of QLC NAND flash memory to reach those costs and capacities.
> Currently, the 8TB Samsung 870 SSD is $800/ea at some online retail stores.
>
> SATA form-factor SSDs can reach read/write rates of 560/520 MB/s, which, while
> not as fast as NVMe drives, is still several times faster than 7200 RPM drives.
> SSDs now appear to have much lower failure rates than HDDs in 2021 (
>
> https://www.techspot.com/news/89590-backblaze-latest-storage-reliability-figures-add-ssd-boot.html
> ).
>
> Are there any major caveats to considering working with larger SSDs for
> data pools?
>
> Thanks,
>   Matt
>
> --
> Matt Larson, PhD
> Madison, WI  53705 U.S.A.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Erik Lindahl 
Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm
University
Science for Life Laboratory, Box 1031, 17121 Solna, Sweden


[ceph-users] Re: Upgrade tips from Luminous to Nautilus?

2021-05-10 Thread Mark Schouten
On Thu, Apr 29, 2021 at 10:58:15AM +0200, Mark Schouten wrote:
> We've done our fair share of Ceph cluster upgrades since Hammer, and
> have not seen much problems with them. I'm now at the point that I have
> to upgrade a rather large cluster running Luminous and I would like to
> hear from other users if they have experiences with issues I can expect
> so that I can anticipate on them beforehand.


Thanks for the replies! 

Just one question though. Step one for me was to lower max_mds to one.
Documentation seems to suggest that the cluster automagically moves the
MDSes beyond the first to a standby state. However, nothing really happens.

root@osdnode01:~# ceph fs get dadup_pmrb | grep max_mds
max_mds 1

I still have three active ranks. Do I simply restart two of the MDSes
and force max_mds down to one daemon, or is there a nicer way to move two
MDSes from active to standby?
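
(For reference, what I was planning to run, assuming Luminous still wants the
extra ranks deactivated explicitly after lowering max_mds - just a sketch:)

    ceph fs set dadup_pmrb max_mds 1
    ceph mds deactivate dadup_pmrb:2
    ceph mds deactivate dadup_pmrb:1
    ceph fs get dadup_pmrb | grep max_mds   # confirm, and watch the ranks drain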

Thanks again!

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl


[ceph-users] Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

2021-05-10 Thread David Orman
Hi,

We are seeing the mgr attempt to apply our OSD spec on the various
hosts, then block. When we investigate, we see the mgr has executed
cephadm calls like so, which are blocking:

root 1522444  0.0  0.0 102740 23216 ?S17:32   0:00
 \_ /usr/bin/python3
/var/lib/ceph/X/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90
--image 
docker.io/ceph/ceph@sha256:694ba9cdcbe6cb7d25ab14b34113c42c2d1af18d4c79c7ba4d1f62cf43d145fe
ceph-volume --fsid X -- lvm list --format json

This occurs on all hosts in the cluster, following
starting/restarting/failing over a manager. It's blocking an
in-progress upgrade post-manager updates on one cluster, currently.

Looking at the cephadm logs on the host(s) in question, we see the
last entry appears to be truncated, like:

2021-05-10 17:32:06,471 INFO /usr/bin/podman:
"ceph.db_uuid": "1n2f5v-EEgO-1Kn6-hQd2-v5QF-AN9o-XPkL6b",
2021-05-10 17:32:06,471 INFO /usr/bin/podman:
"ceph.encrypted": "0",
2021-05-10 17:32:06,471 INFO /usr/bin/podman:
"ceph.osd_fsid": "",
2021-05-10 17:32:06,471 INFO /usr/bin/podman:
"ceph.osd_id": "205",
2021-05-10 17:32:06,471 INFO /usr/bin/podman:
"ceph.osdspec_affinity": "osd_spec",
2021-05-10 17:32:06,471 INFO /usr/bin/podman:
"ceph.type": "block",

The previous entry looks like this:

2021-05-10 17:32:06,469 INFO /usr/bin/podman:
"ceph.db_uuid": "TMTPD5-MLqp-06O2-raqp-S8o5-TfRG-hbFmpu",
2021-05-10 17:32:06,469 INFO /usr/bin/podman:
"ceph.encrypted": "0",
2021-05-10 17:32:06,469 INFO /usr/bin/podman:
"ceph.osd_fsid": "",
2021-05-10 17:32:06,469 INFO /usr/bin/podman:
"ceph.osd_id": "195",
2021-05-10 17:32:06,470 INFO /usr/bin/podman:
"ceph.osdspec_affinity": "osd_spec",
2021-05-10 17:32:06,470 INFO /usr/bin/podman:
"ceph.type": "block",
2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.vdo": "0"
2021-05-10 17:32:06,470 INFO /usr/bin/podman: },
2021-05-10 17:32:06,470 INFO /usr/bin/podman: "type": "block",
2021-05-10 17:32:06,470 INFO /usr/bin/podman: "vg_name":
"ceph-ffd1a4a7-316c-4c85-acde-06459e26f2c4"
2021-05-10 17:32:06,470 INFO /usr/bin/podman: }
2021-05-10 17:32:06,470 INFO /usr/bin/podman: ],

We'd like to get to the bottom of this, please let us know what other
information we can provide.
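
For reference, the call the mgr is issuing can be run by hand on an affected
host like this (a sketch; the fsid is elided as above):

    cephadm ceph-volume --fsid <FSID> -- lvm list --format json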

Thank you,
David


[ceph-users] Re: [Suspicious newsletter] Building ceph clusters with 8TB SSD drives?

2021-05-10 Thread Szabo, Istvan (Agoda)
We are using 15 TB SSDs in our object store.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

-Original Message-
From: Matt Larson 
Sent: Saturday, May 8, 2021 3:11 AM
To: ceph-users 
Subject: [Suspicious newsletter] [ceph-users] Building ceph clusters with 8TB 
SSD drives?

Is anyone trying Ceph clusters containing larger (4-8TB) SSD drives?

8TB SSDs are described here (
https://www.anandtech.com/show/16136/qlc-8tb-ssd-review-samsung-870-qvo-sabrent-rocket-q
) and make use of QLC NAND flash memory to reach those costs and capacities.
Currently, the 8TB Samsung 870 SSD is $800/ea at some online retail stores.

SATA form-factor SSDs can reach read/write rates of 560/520 MB/s, which, while not as
fast as NVMe drives, is still several times faster than 7200 RPM drives.
SSDs now appear to have much lower failure rates than HDDs in 2021 (
https://www.techspot.com/news/89590-backblaze-latest-storage-reliability-figures-add-ssd-boot.html
).

Are there any major caveats to considering working with larger SSDs for data 
pools?

Thanks,
  Matt

--
Matt Larson, PhD
Madison, WI  53705 U.S.A.




[ceph-users] Which EC-code for 6 servers?

2021-05-10 Thread Szabo, Istvan (Agoda)
Hi,

I'm thinking of 2:2 so I can tolerate the loss of 2 hosts, but if I just want to
tolerate the loss of 1 host, which one is better, 3:2 or 4:1?
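
(For reference, whichever k+m wins, I would create the profile with host as
the failure domain - a rough sketch, shown with 2:2 just as an example; pool
name and PG count are placeholders:)

    ceph osd erasure-code-profile set ec-2-2 k=2 m=2 crush-failure-domain=host
    ceph osd pool create ecpool 128 128 erasure ec-2-2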

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---





[ceph-users] Re: Host crash undetected by ceph health check

2021-05-10 Thread Frank Schilder
I reproduced the problem today by taking down the ceph cluster network 
interface on a host, cutting off all ceph communication at once. What I observe 
is that IO gets stuck, but OSDs are not marked down. Instead, operations like 
the one below get stuck in the MON leader and a MON slow ops warning is shown. 
I thought that OSDs get marked down after a few missed heartbeats, but no such 
thing seems to happen. The cluster is mimic 13.2.10.
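
For reference, these are the kinds of things I am checking on the MON side (a sketch):

    ceph osd dump | grep flags     # make sure nodown/noout are not set
    ceph daemon mon.$(hostname -s) config show | \
        egrep 'mon_osd_min_down_reporters|mon_osd_reporter_subtree_level|osd_heartbeat_grace'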

What is the expected behaviour and am I seeing something unexpected?

Thanks for any help!

{
"description": "osd_failure(failed timeout osd.503 
192.168.32.74:6830/7639 for 66sec e468459 v468459)",
"initiated_at": "2021-05-10 14:54:06.206619",
"age": 116.134646,
"duration": 88.051377,
"type_data": {
"events": [
{
"time": "2021-05-10 14:54:06.206619",
"event": "initiated"
},
{
"time": "2021-05-10 14:54:06.206619",
"event": "header_read"
},
{
"time": "0.00",
"event": "throttled"
},
{
"time": "0.00",
"event": "all_read"
},
{
"time": "0.00",
"event": "dispatched"
},
{
"time": "2021-05-10 14:54:06.211701",
"event": "mon:_ms_dispatch"
},
{
"time": "2021-05-10 14:54:06.211701",
"event": "mon:dispatch_op"
},
{
"time": "2021-05-10 14:54:06.211701",
"event": "psvc:dispatch"
},
{
"time": "2021-05-10 14:54:06.211709",
"event": "osdmap:preprocess_query"
},
{
"time": "2021-05-10 14:54:06.211709",
"event": "osdmap:preprocess_failure"
},
{
"time": "2021-05-10 14:54:06.211717",
"event": "osdmap:prepare_update"
},
{
"time": "2021-05-10 14:54:06.211718",
"event": "osdmap:prepare_failure"
},
{
"time": "2021-05-10 14:54:06.211732",
"event": "no_reply: send routed request"
},
{
"time": "2021-05-10 14:55:34.257996",
"event": "no_reply: send routed request"
},
{
"time": "2021-05-10 14:55:34.257996",
"event": "done"
}
],
"info": {
"seq": 34455802,
"src_is_mon": false,
"source": "osd.373 192.168.32.73:6806/7244",
"forwarded_to_leader": false
}
}
}

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Frank Schilder 
Sent: 07 May 2021 22:06:38
To: ceph-users@ceph.io
Subject: [ceph-users] Host crash undetected by ceph health check

Dear cephers,

today it seems I observed an impossible event for the first time: an OSD host 
crashed, but the ceph health monitoring did not recognise the crash. Not a 
single OSD was marked down and IO simply stopped, waiting for the crashed OSDs 
to respond. All that was reported was slow ops, slow meta data IO, MDS behind 
on trimming, but no OSD fail. I have rebooted these machines a lot of times and 
have never seen the health check fail to recognise that instantly. The only 
difference I see is that these were clean shut-downs, not crashes (I believe 
the OSDs mark themselves as down).

For debugging this problem, can anyone provide me with a pointer when this 
could be the result of a misconfiguration?

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-05-10 Thread Szabo, Istvan (Agoda)
So how is your multisite setup going at the moment? Seems like with this 
rewrite you’ve moved further than me 😊 Is it working properly now? If yes, what 
are the steps to make it work? Where is the magic 😊 ?

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: 特木勒 
Sent: Thursday, May 6, 2021 11:27 AM
To: Jean-Sebastien Landry 
Cc: Szabo, Istvan (Agoda) ; ceph-users@ceph.io; Amit 
Ghadge 
Subject: Re: [ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does 
not sync olds data

Hi Jean:

Thanks for your info.

Unfortunately I checked the secondary cluster and no objects had been synced. 
The only way I have is to force a rewrite of the objects for whole buckets.

I have tried to set up multiple site between Nautilus and octopus. It works 
pretty well. But after I upgrade primary cluster to octopus, we have this 
issue. :(

Here is the issue: https://tracker.ceph.com/issues/49542#change-193975

Thanks

Jean-Sebastien Landry <jean-sebastien.landr...@ulaval.ca> wrote on Tuesday, April 27, 2021 at 7:52 PM:
Hi, I hit the same errors when doing multisite sync between Luminous and
Octopus, but what I found is that my sync errors were mainly on old
multipart and shadow objects, at the "rados level" if I might say
(leftovers from Luminous bugs).

So check at the "user level", using s3cmd/awscli and the objects' md5;
you will probably find that you're pretty much in sync. Hopefully.

Cheers!

On 4/25/21 11:29 PM, 特木勒 wrote:
> [Externe UL*]
>
> Another problem I notice for a new bucket, the first object in the bucket
> will not be sync. the sync will start with the second object. I tried to
> fix the index on the bucket and manually rerun bucket sync, but the first
> object still does not sync with secondary cluster.
>
> Do you have any ideas for this issue?
>
> Thanks
>
> 特木勒 mailto:twl...@gmail.com>> 于2021年4月26日周一 上午11:16写道:
>
>> Hi Istvan:
>>
>> Thanks Amit's suggestion.
>>
>> I followed his suggestion to fix bucket index and re-do sync on buckets,
>> but it still did not work for me.
>>
>> Then I tried to use bucket rewrite command to rewrite all the objects in
>> buckets and it works for me. I think the reason is there's something wrong
>> with bucket index and rewrite has rebuilt the index.
>>
>> Here's the command I use:
>> `sudo radosgw-admin bucket rewrite -b BUCKET-NAME --min-rewrite-size 0`
>>
>> Maybe you can try this to fix the sync issues.
>>
>> @Amit Ghadge mailto:amitg@gmail.com>> Thanks for 
>> your suggestions. Without
>> your suggestions, I will not notice something wrong with index part.
>>
>> Thanks :)
>>
>> Szabo, Istvan (Agoda) 
>> mailto:istvan.sz...@agoda.com>> 于2021年4月26日周一 
>> 上午9:57写道:
>>
>>> Hi,
>>>
>>>
>>>
>>> No, doesn’t work, now we will write our own sync app for ceph, I gave up.
>>>
>>>
>>>
>>> Istvan Szabo
>>> Senior Infrastructure Engineer
>>> ---
>>> Agoda Services Co., Ltd.
>>> e: istvan.sz...@agoda.com
>>> ---
>>>
>>>
>>>
>>> *From:* 特木勒 mailto:twl...@gmail.com>>
>>> *Sent:* Friday, April 23, 2021 7:50 PM
>>> *To:* Szabo, Istvan (Agoda) 
>>> mailto:istvan.sz...@agoda.com>>
>>> *Cc:* ceph-users@ceph.io
>>> *Subject:* Re: [Suspicious newsletter] [ceph-users] RGW: Multiple Site
>>> does not sync olds data
>>>
>>>
>>>
>>> Hi Istvan:
>>>
>>>
>>>
>>> We just upgraded whole cluster to 15.2.10 and the multiple site still
>>> cannot sync whole objects to secondary cluster. 🙁
>>>
>>>
>>>
>>> Do you have any suggestions on this? And I open another issues in ceph
>>> tracker site:
>>>
>>> https://tracker.ceph.com/issues/50474
>>>
>>>
>>>
>>> Hope someone could go to check this issue.
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> 特木勒 mailto:twl...@gmail.com>>于2021年3月22日 周一下午9:08写道:
>>>
>>> Thank you~
>>>
>>>
>>>
>>> I will try to upgrade cluster too. Seem like this is the only way for
>>> now. 😭
>>>
>>>
>>>
>>> I will let you know once I complete testing. :)
>>>
>>>
>>>
>>> Have a good day
>>>
>>>
>>>
>>> Szabo, Istvan (Agoda) 
>>> mailto:istvan.sz...@agoda.com>>于2021年3月22日 
>>> 周一下午3:38写道:
>>>
> >>> Yeah, doesn't work. Last week they fixed my problem ticket which caused
> >>> the crashes, and the crashes had stopped the replication. I'll give it a try
> >>> this week again after the update; if the daemon doesn't crash, maybe it will
> >>> work, because when the crash hadn't happened, the data was synced. Fingers
> >>> crosse

[ceph-users] Re: Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

2021-05-10 Thread David Orman
I think I may have found the issue:

https://tracker.ceph.com/issues/50526
It seems it may be fixed in: https://github.com/ceph/ceph/pull/41045

I hope this can be prioritized as an urgent fix as it's broken
upgrades on clusters of a relatively normal size (14 nodes, 24x OSDs,
2x NVME for DB/WAL w/ 12 OSDs per NVME), even when new OSDs are not
being deployed, as it still tries to apply the OSD specification.

On Mon, May 10, 2021 at 4:03 PM David Orman  wrote:
>
> Hi,
>
> We are seeing the mgr attempt to apply our OSD spec on the various
> hosts, then block. When we investigate, we see the mgr has executed
> cephadm calls like so, which are blocking:
>
> root 1522444  0.0  0.0 102740 23216 ?S17:32   0:00
>  \_ /usr/bin/python3
> /var/lib/ceph/X/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90
> --image 
> docker.io/ceph/ceph@sha256:694ba9cdcbe6cb7d25ab14b34113c42c2d1af18d4c79c7ba4d1f62cf43d145fe
> ceph-volume --fsid X -- lvm list --format json
>
> This occurs on all hosts in the cluster, following
> starting/restarting/failing over a manager. It's blocking an
> in-progress upgrade post-manager updates on one cluster, currently.
>
> Looking at the cephadm logs on the host(s) in question, we see the
> last entry appears to be truncated, like:
>
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> "ceph.db_uuid": "1n2f5v-EEgO-1Kn6-hQd2-v5QF-AN9o-XPkL6b",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> "ceph.encrypted": "0",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> "ceph.osd_fsid": "",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> "ceph.osd_id": "205",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> "ceph.osdspec_affinity": "osd_spec",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> "ceph.type": "block",
>
> The previous entry looks like this:
>
> 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> "ceph.db_uuid": "TMTPD5-MLqp-06O2-raqp-S8o5-TfRG-hbFmpu",
> 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> "ceph.encrypted": "0",
> 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> "ceph.osd_fsid": "",
> 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> "ceph.osd_id": "195",
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman:
> "ceph.osdspec_affinity": "osd_spec",
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman:
> "ceph.type": "block",
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.vdo": "0"
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: },
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "type": "block",
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "vg_name":
> "ceph-ffd1a4a7-316c-4c85-acde-06459e26f2c4"
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: }
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: ],
>
> We'd like to get to the bottom of this, please let us know what other
> information we can provide.
>
> Thank you,
> David


[ceph-users] Re: Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

2021-05-10 Thread Sage Weil
The root cause is a bug in conmon.  If you can upgrade to >= 2.0.26
this will also fix the problem.  What version are you using?  The
kubic repos currently have 2.0.27.  See
https://build.opensuse.org/project/show/devel:kubic:libcontainers:stable
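
(A quick way to check which conmon each host has - a sketch; the package
query depends on the distro:)

    conmon --version
    rpm -q conmon 2>/dev/null || dpkg -s conmon 2>/dev/null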

We'll make sure the next release has the verbosity workaround!

sage

On Mon, May 10, 2021 at 5:47 PM David Orman  wrote:
>
> I think I may have found the issue:
>
> https://tracker.ceph.com/issues/50526
> It seems it may be fixed in: https://github.com/ceph/ceph/pull/41045
>
> I hope this can be prioritized as an urgent fix as it's broken
> upgrades on clusters of a relatively normal size (14 nodes, 24x OSDs,
> 2x NVME for DB/WAL w/ 12 OSDs per NVME), even when new OSDs are not
> being deployed, as it still tries to apply the OSD specification.
>
> On Mon, May 10, 2021 at 4:03 PM David Orman  wrote:
> >
> > Hi,
> >
> > We are seeing the mgr attempt to apply our OSD spec on the various
> > hosts, then block. When we investigate, we see the mgr has executed
> > cephadm calls like so, which are blocking:
> >
> > root 1522444  0.0  0.0 102740 23216 ?S17:32   0:00
> >  \_ /usr/bin/python3
> > /var/lib/ceph/X/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90
> > --image 
> > docker.io/ceph/ceph@sha256:694ba9cdcbe6cb7d25ab14b34113c42c2d1af18d4c79c7ba4d1f62cf43d145fe
> > ceph-volume --fsid X -- lvm list --format json
> >
> > This occurs on all hosts in the cluster, following
> > starting/restarting/failing over a manager. It's blocking an
> > in-progress upgrade post-manager updates on one cluster, currently.
> >
> > Looking at the cephadm logs on the host(s) in question, we see the
> > last entry appears to be truncated, like:
> >
> > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > "ceph.db_uuid": "1n2f5v-EEgO-1Kn6-hQd2-v5QF-AN9o-XPkL6b",
> > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > "ceph.encrypted": "0",
> > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > "ceph.osd_fsid": "",
> > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > "ceph.osd_id": "205",
> > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > "ceph.osdspec_affinity": "osd_spec",
> > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > "ceph.type": "block",
> >
> > The previous entry looks like this:
> >
> > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > "ceph.db_uuid": "TMTPD5-MLqp-06O2-raqp-S8o5-TfRG-hbFmpu",
> > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > "ceph.encrypted": "0",
> > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > "ceph.osd_fsid": "",
> > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > "ceph.osd_id": "195",
> > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:
> > "ceph.osdspec_affinity": "osd_spec",
> > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:
> > "ceph.type": "block",
> > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.vdo": 
> > "0"
> > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: },
> > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "type": "block",
> > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "vg_name":
> > "ceph-ffd1a4a7-316c-4c85-acde-06459e26f2c4"
> > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: }
> > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: ],
> >
> > We'd like to get to the bottom of this, please let us know what other
> > information we can provide.
> >
> > Thank you,
> > David
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

2021-05-10 Thread David Orman
Hi Sage,

We've got 2.0.27 installed. I restarted all the manager pods, just in
case, and I have the same behavior afterwards.

David

On Mon, May 10, 2021 at 6:53 PM Sage Weil  wrote:
>
> The root cause is a bug in conmon.  If you can upgrade to >= 2.0.26
> this will also fix the problem.  What version are you using?  The
> kubic repos currently have 2.0.27.  See
> https://build.opensuse.org/project/show/devel:kubic:libcontainers:stable
>
> We'll make sure the next release has the verbosity workaround!
>
> sage
>
> On Mon, May 10, 2021 at 5:47 PM David Orman  wrote:
> >
> > I think I may have found the issue:
> >
> > https://tracker.ceph.com/issues/50526
> > It seems it may be fixed in: https://github.com/ceph/ceph/pull/41045
> >
> > I hope this can be prioritized as an urgent fix as it's broken
> > upgrades on clusters of a relatively normal size (14 nodes, 24x OSDs,
> > 2x NVME for DB/WAL w/ 12 OSDs per NVME), even when new OSDs are not
> > being deployed, as it still tries to apply the OSD specification.
> >
> > On Mon, May 10, 2021 at 4:03 PM David Orman  wrote:
> > >
> > > Hi,
> > >
> > > We are seeing the mgr attempt to apply our OSD spec on the various
> > > hosts, then block. When we investigate, we see the mgr has executed
> > > cephadm calls like so, which are blocking:
> > >
> > > root 1522444  0.0  0.0 102740 23216 ?S17:32   0:00
> > >  \_ /usr/bin/python3
> > > /var/lib/ceph/X/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90
> > > --image 
> > > docker.io/ceph/ceph@sha256:694ba9cdcbe6cb7d25ab14b34113c42c2d1af18d4c79c7ba4d1f62cf43d145fe
> > > ceph-volume --fsid X -- lvm list --format json
> > >
> > > This occurs on all hosts in the cluster, following
> > > starting/restarting/failing over a manager. It's blocking an
> > > in-progress upgrade post-manager updates on one cluster, currently.
> > >
> > > Looking at the cephadm logs on the host(s) in question, we see the
> > > last entry appears to be truncated, like:
> > >
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.db_uuid": "1n2f5v-EEgO-1Kn6-hQd2-v5QF-AN9o-XPkL6b",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.encrypted": "0",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.osd_fsid": "",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.osd_id": "205",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.osdspec_affinity": "osd_spec",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.type": "block",
> > >
> > > The previous entry looks like this:
> > >
> > > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > > "ceph.db_uuid": "TMTPD5-MLqp-06O2-raqp-S8o5-TfRG-hbFmpu",
> > > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > > "ceph.encrypted": "0",
> > > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > > "ceph.osd_fsid": "",
> > > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > > "ceph.osd_id": "195",
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:
> > > "ceph.osdspec_affinity": "osd_spec",
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:
> > > "ceph.type": "block",
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.vdo": 
> > > "0"
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: },
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "type": "block",
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "vg_name":
> > > "ceph-ffd1a4a7-316c-4c85-acde-06459e26f2c4"
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: }
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman: ],
> > >
> > > We'd like to get to the bottom of this, please let us know what other
> > > information we can provide.
> > >
> > > Thank you,
> > > David
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Write Ops on CephFS Increasing exponentially

2021-05-10 Thread Patrick Donnelly
Hi Kyle,

On Thu, May 6, 2021 at 7:56 AM Kyle Dean  wrote:
>
> Hi, hoping someone could help me get to the bottom of this particular issue 
> I'm having.
>
> I have ceph octopus installed using ceph-ansible.
>
> Currently, I have 3 MDS servers running, and one client connected to the 
> active MDS. I'm currently storing a very large encrypted container on the 
> CephFS file system, 8TB worth, and I'm writing data into it from the client 
> host.
>
> Recently I have noticed a severe impact on performance, and the time taken to
> process files within the container has increased from 1 minute to 11
> minutes.
>
> In the ceph dashboard, when I take a look at the performance tab on the file
> system page, the Write Ops are increasing exponentially over time.
>
> At the end of April, around the 22nd, I had 49 Write Ops on the performance
> page for the MDS daemons. This is now at 266467 Write Ops and increasing.
>
> Also the client requests have gone from 14 to 67 to 117 and are now at 283.
>
> Would someone be able to help me make sense of why the performance has
> decreased and what is going on with the client requests and write operations?

I suggest you look at the "perf dump" statistics from the MDS (via
ceph tell or the admin socket) over a period of time to get an idea of what
operations it's performing. It's probable your workload changed
somehow and that is the cause.
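
For example, something along these lines gives a rough picture of which
counters are moving (a sketch; the MDS name is a placeholder, and the admin
socket form has to be run on the host where that MDS runs):

    ceph daemon mds.<name> perf dump > /tmp/mds-perf-1.json
    sleep 60
    ceph daemon mds.<name> perf dump > /tmp/mds-perf-2.json
    diff /tmp/mds-perf-1.json /tmp/mds-perf-2.json | less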

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D


[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-10 Thread Boris Behrens
Hi Amit,

I just pinged the mons from every system and they are all available.

On Mon, May 10, 2021 at 9:18 PM Amit Ghadge  wrote:

> We have seen slowness due to one of the mgr services being unreachable; maybe here
> it is different. You can check the monmap / ceph.conf mon entries and then verify
> that all nodes can be pinged successfully.
>
>
> -AmitG
>
>
> On Tue, 11 May 2021 at 12:12 AM, Boris Behrens  wrote:
>
>> Hi guys,
>>
>> does someone got any idea?
>>
>> Am Mi., 5. Mai 2021 um 16:16 Uhr schrieb Boris Behrens :
>>
>> > Hi,
>> > since a couple of days we experience a strange slowness on some
>> > radosgw-admin operations.
>> > What is the best way to debug this?
>> >
>> > For example creating a user takes over 20s.
>> > [root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user
>> > --display-name=test-bb-user
>> > 2021-05-05 14:08:14.297 7f6942286840  1 robust_notify: If at first you
>> > don't succeed: (110) Connection timed out
>> > 2021-05-05 14:08:14.297 7f6942286840  0 ERROR: failed to distribute
>> cache
>> > for eu-central-1.rgw.users.uid:test-bb-user
>> > 2021-05-05 14:08:24.335 7f6942286840  1 robust_notify: If at first you
>> > don't succeed: (110) Connection timed out
>> > 2021-05-05 14:08:24.335 7f6942286840  0 ERROR: failed to distribute
>> cache
>> > for eu-central-1.rgw.users.keys:
>> > {
>> > "user_id": "test-bb-user",
>> > "display_name": "test-bb-user",
>> >
>> > }
>> > real 0m20.557s
>> > user 0m0.087s
>> > sys 0m0.030s
>> >
>> > First I thought that rados operations might be slow, but adding and
>> > deleting objects in rados are fast as usual (at least from my
>> perspective).
>> > Also uploading to buckets is fine.
>> >
>> > We changed some things and I think it might have to do with this:
>> > * We have a HAProxy that distributes via leastconn between the 3
>> radosgw's
>> > (this did not change)
>> > * We had three times a daemon with the name "eu-central-1" running (on
>> the
>> > 3 radosgw's)
>> > * Because this might have led to our data duplication problem, we have
>> > split that up so now the daemons are named per host (eu-central-1-s3db1,
>> > eu-central-1-s3db2, eu-central-1-s3db3)
>> > * We also added dedicated rgw daemons for garbage collection, because
>> the
>> > current one were not able to keep up.
>> > * So basically ceph status went from "rgw: 1 daemon active
>> (eu-central-1)"
>> > to "rgw: 14 daemons active (eu-central-1-s3db1, eu-central-1-s3db2,
>> > eu-central-1-s3db3, gc-s3db12, gc-s3db13...)
>> >
>> >
>> > Cheers
>> >  Boris
>> >
>>
>>
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
>> groüen Saal.
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.


[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-05-10 Thread Szabo, Istvan (Agoda)
Ok, will be challenging with an 800 millions object bucket 😃 But I might give a 
try.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: 特木勒 
Sent: Monday, May 10, 2021 6:53 PM
To: Szabo, Istvan (Agoda) 
Cc: Jean-Sebastien Landry ; 
ceph-users@ceph.io; Amit Ghadge 
Subject: Re: [ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does 
not sync olds data

Hi Istvan:

Thanks for your help.

After we rewrote all the objects in the buckets, the sync seems to work again.

We are using this command to rewrite all the objects in a specific bucket:
`radosgw-admin bucket rewrite --bucket=BUCKET_NAME --min-rewrite-size 0`

You can try to run this on one bucket and see if it helps you fix the 
problem.

Thank you~

Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote on Monday, May 10, 2021 at 12:16 PM:
So how is your multisite setup going at the moment? Seems like with this 
rewrite you’ve moved further than me 😊 Is it working properly now? If yes, what 
are the steps to make it work? Where is the magic 😊 ?

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: 特木勒 mailto:twl...@gmail.com>>
Sent: Thursday, May 6, 2021 11:27 AM
To: Jean-Sebastien Landry 
mailto:jean-sebastien.landr...@ulaval.ca>>
Cc: Szabo, Istvan (Agoda) 
mailto:istvan.sz...@agoda.com>>; 
ceph-users@ceph.io; Amit Ghadge 
mailto:amitg@gmail.com>>
Subject: Re: [ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does 
not sync olds data

Hi Jean:

Thanks for your info.

Unfortunately I check the secondary cluster and non-objects had been synced. 
The only way I have is to force rewrite objects for whole buckets.

I have tried to set up multiple site between Nautilus and octopus. It works 
pretty well. But after I upgrade primary cluster to octopus, we have this 
issue. :(

Here is the issue: https://tracker.ceph.com/issues/49542#change-193975

Thanks

Jean-Sebastien Landry 
mailto:jean-sebastien.landr...@ulaval.ca>> 
于2021年4月27日周二 下午7:52写道:
Hi, I hit the same errors when doing multisite sync between luminous and
octopus, but what I founded is that my sync errors was mainly on old
multipart and shadow objects, at the "rados level" if I might say.
(leftovers from luminous bugs)

So check at the "user level", using s3cmd/awscli and the objects md5,
you will probably find that your pretty much in sync. Hopefully.

Cheers!

On 4/25/21 11:29 PM, 特木勒 wrote:
> [Externe UL*]
>
> Another problem I notice for a new bucket, the first object in the bucket
> will not be sync. the sync will start with the second object. I tried to
> fix the index on the bucket and manually rerun bucket sync, but the first
> object still does not sync with secondary cluster.
>
> Do you have any ideas for this issue?
>
> Thanks
>
> 特木勒 mailto:twl...@gmail.com>> 于2021年4月26日周一 上午11:16写道:
>
>> Hi Istvan:
>>
>> Thanks Amit's suggestion.
>>
>> I followed his suggestion to fix bucket index and re-do sync on buckets,
>> but it still did not work for me.
>>
>> Then I tried to use bucket rewrite command to rewrite all the objects in
>> buckets and it works for me. I think the reason is there's something wrong
>> with bucket index and rewrite has rebuilt the index.
>>
>> Here's the command I use:
>> `sudo radosgw-admin bucket rewrite -b BUCKET-NAME --min-rewrite-size 0`
>>
>> Maybe you can try this to fix the sync issues.
>>
>> @Amit Ghadge mailto:amitg@gmail.com>> Thanks for 
>> your suggestions. Without
>> your suggestions, I will not notice something wrong with index part.
>>
>> Thanks :)
>>
>> Szabo, Istvan (Agoda) 
>> mailto:istvan.sz...@agoda.com>> 于2021年4月26日周一 
>> 上午9:57写道:
>>
>>> Hi,
>>>
>>>
>>>
>>> No, doesn’t work, now we will write our own sync app for ceph, I gave up.
>>>
>>>
>>>
>>> Istvan Szabo
>>> Senior Infrastructure Engineer
>>> ---
>>> Agoda Services Co., Ltd.
>>> e: istvan.sz...@agoda.com
>>> ---
>>>
>>>
>>>
>>> *From:* 特木勒 mailto:twl...@gmail.com>>
>>> *Sent:* Friday, April 23, 2021 7:50 PM
>>> *To:* Szabo, Istvan (Agoda) 
>>> mailto:istvan.sz...@agoda.com>>
>>> *Cc:* ceph-users@ceph.io
>>> *Subject:* Re: [Suspicious newsletter] [ceph-users] RGW: Multiple Site
>>> does not sync olds data
>>>
>>>
>>>
>>> Hi Istvan:
>>>
>>>
>>>
>>> We just upgraded whole cluster to 15.2.10 and the multiple site still
>>> cannot sync whole objects to secondary cluster. 🙁
>>>
>>>
>>>
>>> Do you have any suggestions on this? And I open another issues in ceph
>>> tracker site:
>>>
> >>> https://tracker.ceph.com/issues/50474