[ceph-users] rgw/civetweb log verbosity level
Hi there,

I have a question about rgw/civetweb log settings. Currently, rgw/civetweb prints 3 lines of logs at loglevel 1 (high priority) for each HTTP request, like the following:

$ tail /var/log/ceph/ceph-client.rgw.node-1.log
2018-11-28 11:52:45.339229 7fbf2d693700 1 == starting new request req=0x7fbf2d68d190 =
2018-11-28 11:52:45.341961 7fbf2d693700 1 == req done req=0x7fbf2d68d190 op status=0 http_status=200 ==
2018-11-28 11:52:45.341993 7fbf2d693700 1 civetweb: 0x558f0433: 127.0.0.1 - - [28/Nov/2018:11:48:10 +0800] "HEAD /swift/v1/images.xxx.com/8801234/BFAB307D-F5FE-4BC6-9449-E854944A460F_160_180.jpg HTTP/1.1" 1 0 - goswift/1.0

The above 3 lines occupy roughly 0.5KB of space on average, varying a little with the lengths of bucket names and object names. The problem is that when requests are intensive, this consumes a huge amount of space. For example, 4 million requests (on a single RGW node) result in about 2GB, which takes only ~6 hours to accumulate on a cluster node in a busy period (a large part may be HEAD requests).

When troubleshooting, I usually need to turn the loglevel up to 5, 10 or even higher to check the detailed logs, but most of the log space is occupied by the above access logs (level 1), which don't provide much information.

My question is: is there a way to configure Ceph to skip those logs? E.g. only print logs with verbosity in a specified range (not supported, according to my investigation). Or, are there any suggested ways of turning on more logs for debugging?

Best Regards
Arthur Chiao

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
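[Editor's note] The space estimate in the message above can be sanity-checked with a quick calculation (assuming ~0.5 KB of level-1 log output per request, as measured by the poster):

```python
# Rough RGW access-log volume estimate for level-1 logging.
requests = 4_000_000          # requests on a single RGW node (~6 busy hours)
bytes_per_request = 512       # ~0.5 KB across the 3 log lines per request

total_bytes = requests * bytes_per_request
print(f"{total_bytes / 10**9:.2f} GB")  # ~2 GB, matching the observation
```

At ~170 requests/second this log rate alone writes roughly 85 KB/s, which is why raising the debug level for troubleshooting gets drowned out by access-log noise.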
Re: [ceph-users] OSD won't start after moving to a new node with ceph 12.2.10
This is *probably* unrelated to the upgrade, as it's complaining about data corruption at a very early stage (earlier than the bug that would trigger related to the 12.2.9 issues), so this might just be a coincidence with a bad disk.

That being said: you are running a 12.2.9 OSD, and you probably should not upgrade to 12.2.10, especially while a backfill is running.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

Am Di., 27. Nov. 2018 um 23:04 Uhr schrieb Cassiano Pilipavicius:
>
> Hi, I am facing a problem where an OSD won't start after moving to a new
> node with 12.2.10 (the old one has 12.2.8).
>
> One node of my cluster failed and I tried to move 3 OSDs to a new node.
> 2 of the 3 OSDs have started and are running fine at the moment
> (backfilling is still in place), but one of the OSDs just doesn't start,
> with the following error in the logs (writing mostly to try to find out
> whether this is a bug or whether I have done something wrong):
>
> 2018-11-27 19:44:38.013454 7fba0d35fd80 -1
> bluestore(/var/lib/ceph/osd/ceph-1) _verify_csum bad crc32c/0x1000
> checksum at blob offset 0x0, got 0xb1a184d1, expected 0xb682fc52, device
> location [0x1~1000], logical extent 0x0~1000, object
> #-1:7b3f43c4:::osd_superblock:0#
> 2018-11-27 19:44:38.013501 7fba0d35fd80 -1 osd.1 0 OSD::init() : unable
> to read osd superblock
> 2018-11-27 19:44:38.013511 7fba0d35fd80 1
> bluestore(/var/lib/ceph/osd/ceph-1) umount
> 2018-11-27 19:44:38.065478 7fba0d35fd80 1 stupidalloc 0x0x55ebb04c3f80
> shutdown
> 2018-11-27 19:44:38.077261 7fba0d35fd80 1 freelist shutdown
> 2018-11-27 19:44:38.077316 7fba0d35fd80 4 rocksdb:
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/rocksdb/db/db_impl.cc:217]
> Shutdown: canceling all background work
> 2018-11-27 19:44:38.077982 7fba0d35fd80 4 rocksdb:
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/rocksdb/db/db_impl.cc:343]
> Shutdown complete
> 2018-11-27 19:44:38.107923 7fba0d35fd80 1 bluefs umount
> 2018-11-27 19:44:38.108248 7fba0d35fd80 1 stupidalloc 0x0x55ebb01cddc0
> shutdown
> 2018-11-27 19:44:38.108302 7fba0d35fd80 1 bdev(0x55ebb01cf800
> /var/lib/ceph/osd/ceph-1/block) close
> 2018-11-27 19:44:38.362984 7fba0d35fd80 1 bdev(0x55ebb01cf600
> /var/lib/ceph/osd/ceph-1/block) close
> 2018-11-27 19:44:38.470791 7fba0d35fd80 -1 ** ERROR: osd init failed:
> (22) Invalid argument
>
> My cluster has too many mixed versions; I hadn't realized that the
> versions changed when running a yum update, and right now I have the
> following situation (ceph versions):
>
> {
>     "mon": {
>         "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 1,
>         "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 2
>     },
>     "mgr": {
>         "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 1
>     },
>     "osd": {
>         "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
>         "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 18,
>         "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 27,
>         "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 1
>     },
>     "mds": {},
>     "overall": {
>         "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
>         "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 20,
>         "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 29,
>         "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 1
>     }
> }
>
> Is there an easy way to get the OSD working again? I am thinking about
> waiting for the backfill/recovery to finish and then upgrading all nodes
> to 12.2.10, and if the OSD doesn't come up, recreating the OSD.
>
> Regards,
> Cassiano Pilipavicius.
[ceph-users] OSD won't start after moving to a new node with ceph 12.2.10
Hi, I am facing a problem where an OSD won't start after moving to a new node with 12.2.10 (the old one has 12.2.8).

One node of my cluster failed and I tried to move 3 OSDs to a new node. 2 of the 3 OSDs have started and are running fine at the moment (backfilling is still in place), but one of the OSDs just doesn't start, with the following error in the logs (writing mostly to try to find out whether this is a bug or whether I have done something wrong):

2018-11-27 19:44:38.013454 7fba0d35fd80 -1 bluestore(/var/lib/ceph/osd/ceph-1) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xb1a184d1, expected 0xb682fc52, device location [0x1~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
2018-11-27 19:44:38.013501 7fba0d35fd80 -1 osd.1 0 OSD::init() : unable to read osd superblock
2018-11-27 19:44:38.013511 7fba0d35fd80 1 bluestore(/var/lib/ceph/osd/ceph-1) umount
2018-11-27 19:44:38.065478 7fba0d35fd80 1 stupidalloc 0x0x55ebb04c3f80 shutdown
2018-11-27 19:44:38.077261 7fba0d35fd80 1 freelist shutdown
2018-11-27 19:44:38.077316 7fba0d35fd80 4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
2018-11-27 19:44:38.077982 7fba0d35fd80 4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/rocksdb/db/db_impl.cc:343] Shutdown complete
2018-11-27 19:44:38.107923 7fba0d35fd80 1 bluefs umount
2018-11-27 19:44:38.108248 7fba0d35fd80 1 stupidalloc 0x0x55ebb01cddc0 shutdown
2018-11-27 19:44:38.108302 7fba0d35fd80 1 bdev(0x55ebb01cf800 /var/lib/ceph/osd/ceph-1/block) close
2018-11-27 19:44:38.362984 7fba0d35fd80 1 bdev(0x55ebb01cf600 /var/lib/ceph/osd/ceph-1/block) close
2018-11-27 19:44:38.470791 7fba0d35fd80 -1 ** ERROR: osd init failed: (22) Invalid argument

My cluster has too many mixed versions; I hadn't realized that the versions changed when running a yum update, and right now I have the following situation (ceph versions):

{
    "mon": {
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 1,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 2
    },
    "mgr": {
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 1
    },
    "osd": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 18,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 27,
        "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 1
    },
    "mds": {},
    "overall": {
        "ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)": 2,
        "ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable)": 20,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 29,
        "ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)": 1
    }
}

Is there an easy way to get the OSD working again? I am thinking about waiting for the backfill/recovery to finish and then upgrading all nodes to 12.2.10, and if the OSD doesn't come up, recreating the OSD.

Regards,
Cassiano Pilipavicius.
Re: [ceph-users] Luminous v12.2.10 released
On 11/27/18 12:11 PM, Josh Durgin wrote:
> 13.2.3 will have a similar revert, so if you are running anything other
> than 12.2.9 or 13.2.2 you can go directly to 13.2.3.

Correction: I misremembered here: we're not reverting these patches for 13.2.3, so 12.2.9 users can upgrade to 13.2.2 or later, but other luminous users should avoid 13.2.2 or later for the time being, unless they can accept some downtime during the upgrade.

See http://tracker.ceph.com/issues/36686#note-6 for more detail.

Josh
Re: [ceph-users] Luminous v12.2.10 released
On 11/27/18 12:00 PM, Robert Sander wrote:
> Am 27.11.18 um 15:50 schrieb Abhishek Lekshmanan:
>> As mentioned above if you've successfully upgraded to v12.2.9 DO NOT
>> upgrade to v12.2.10 until the linked tracker issue has been fixed.
>
> What about clusters currently running 12.2.9 (because this was the
> version in the repos when they got installed / last upgraded) where new
> nodes are scheduled to be set up?
>
> Can the new nodes be installed with 12.2.10 and run with the other
> 12.2.9 nodes? Should the new nodes be pinned to 12.2.9?

To be safe, pin them to 12.2.9 until we have a safe upgrade path in a future luminous release. Alternately, you can restart them all at once on 12.2.10 if you don't mind a short loss of availability.

Josh
Re: [ceph-users] Luminous v12.2.10 released
On 11/27/18 9:40 AM, Graham Allan wrote:
> On 11/27/2018 08:50 AM, Abhishek Lekshmanan wrote:
>> We're happy to announce the tenth bug fix release of the Luminous
>> v12.2.x long term stable release series. The previous release, v12.2.9,
>> introduced the PG hard-limit patches which were found to cause an issue
>> in certain upgrade scenarios, and this release was expedited to revert
>> those patches. If you already successfully upgraded to v12.2.9, you
>> should **not** upgrade to v12.2.10, but rather **wait** for a release in
>> which http://tracker.ceph.com/issues/36686 is addressed. All other users
>> are encouraged to upgrade to this release.
>
> I wonder if you can comment on upgrade policy for a mixed cluster - e.g.
> where the majority is running 12.2.8 but a handful of newly-added osd
> nodes were installed with 12.2.9. Should the 12.2.8 nodes be upgraded to
> 12.2.10 (this does sound like it should have no negative effects) and
> just the 12.2.9 nodes kept to wait for a future release - or wait on all?

I'd suggest upgrading everything to 12.2.10. If you aren't hitting crashes already with this mixed 12.2.9 + 12.2.8 cluster, a further upgrade shouldn't cause any issues.

Josh
Re: [ceph-users] Luminous v12.2.10 released
On 11/27/18 8:26 AM, Simon Ironside wrote:
> On 27/11/2018 14:50, Abhishek Lekshmanan wrote:
>> We're happy to announce the tenth bug fix release of the Luminous
>> v12.2.x long term stable release series. The previous release, v12.2.9,
>> introduced the PG hard-limit patches which were found to cause an issue
>> in certain upgrade scenarios, and this release was expedited to revert
>> those patches. If you already successfully upgraded to v12.2.9, you
>> should **not** upgrade to v12.2.10, but rather **wait** for a release in
>> which http://tracker.ceph.com/issues/36686 is addressed. All other users
>> are encouraged to upgrade to this release.
>
> Is it safe for v12.2.9 users to upgrade to v13.2.2 Mimic?
> http://tracker.ceph.com/issues/36686 suggests a similar revert might be
> on the cards for v13.2.3 so I'm not sure.

Yes, 13.2.2 has the same pg hard limit code as 12.2.9, so that upgrade is safe. The only danger is running a mixed-version cluster where some of the osds have the pg hard limit code and others do not.

13.2.3 will have a similar revert, so if you are running anything other than 12.2.9 or 13.2.2 you can go directly to 13.2.3.

Josh
Re: [ceph-users] Luminous v12.2.10 released
Am 27.11.18 um 15:50 schrieb Abhishek Lekshmanan:
> As mentioned above if you've successfully upgraded to v12.2.9 DO NOT
> upgrade to v12.2.10 until the linked tracker issue has been fixed.

What about clusters currently running 12.2.9 (because this was the version in the repos when they got installed / last upgraded) where new nodes are scheduled to be set up?

Can the new nodes be installed with 12.2.10 and run with the other 12.2.9 nodes? Should the new nodes be pinned to 12.2.9?

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de
Tel: 030-405051-43
Fax: 030-405051-19
Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin
[ceph-users] RGW Swift metadata dropped when S3 bucket versioning enabled
Hi,

I'm running into an issue with the RadosGW Swift API when S3 bucket versioning is enabled. It looks like it silently drops any metadata sent with the "X-Object-Meta-foo" header (see example below). This is observed on a Luminous 12.2.8 cluster.

Is that a normal thing? Am I misconfiguring something here?

With S3 bucket versioning OFF:

$ openstack object set --property foo=bar test test.dat
$ os object show test test.dat
+----------------+----------------------------------+
| Field          | Value                            |
+----------------+----------------------------------+
| account        | v1                               |
| container      | test                             |
| content-length | 507904                           |
| content-type   | binary/octet-stream              |
| etag           | 03e8a398f343ade4e1e1d7c81a66e400 |
| last-modified  | Tue, 27 Nov 2018 13:53:54 GMT    |
| object         | test.dat                         |
| properties     | Foo='bar'                        |  <= Metadata is here
+----------------+----------------------------------+

With S3 bucket versioning ON:

$ openstack object set --property foo=bar test test2.dat
$ openstack object show test test2.dat
+----------------+----------------------------------+
| Field          | Value                            |
+----------------+----------------------------------+
| account        | v1                               |
| container      | test                             |
| content-length | 507904                           |
| content-type   | binary/octet-stream              |
| etag           | 03e8a398f343ade4e1e1d7c81a66e400 |
| last-modified  | Tue, 27 Nov 2018 13:56:50 GMT    |
| object         | test2.dat                        |  <= Metadata is absent
+----------------+----------------------------------+

Cheers,
/ Maxime
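[Editor's note] For readers unfamiliar with the header naming involved: the openstack CLI's `--property foo=bar` reaches the Swift API as an `X-Object-Meta-*` request header. A minimal sketch of that mapping (the helper function is illustrative, not part of any client library):

```python
def swift_meta_headers(properties):
    """Map CLI-style property pairs to Swift X-Object-Meta-* request headers.

    Swift exposes object metadata via headers of the form
    X-Object-Meta-<Name>; RGW is expected to store and echo these back.
    """
    return {f"X-Object-Meta-{key.title()}": value
            for key, value in properties.items()}

headers = swift_meta_headers({"foo": "bar"})
print(headers)  # {'X-Object-Meta-Foo': 'bar'}
```

In the report above, RGW accepts such a header but (with S3 versioning enabled on the bucket) does not return it on a subsequent show, which is why the `properties` row disappears from the second table.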
Re: [ceph-users] Poor ceph cluster performance
And this exact problem was one of the reasons why we migrated everything to PXE boot, where the OS runs from RAM. That kind of failure is just the worst to debug... Also, 1 GB of RAM is cheaper than a separate OS disk.

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

Am Di., 27. Nov. 2018 um 19:22 Uhr schrieb Cody:
>
> Hi everyone,
>
> Many, many thanks to all of you!
>
> The root cause was a failed OS drive on one storage node. The
> server was responsive to ping, but unable to log in. After a reboot via
> IPMI, the docker daemon failed to start due to I/O errors and dmesg
> complained about the failing OS disk. I failed to catch the problem
> initially since 'ceph -s' kept showing HEALTH and the cluster was
> "functional" despite the slow performance.
>
> I really appreciate all the tips and advice received from you all and
> learned a lot. I will carry your advice (e.g. using bluestore,
> enterprise ssd/hdd, separating public and cluster traffic, etc.) into
> my next PoC round.
>
> Thank you very much!
>
> Best regards,
> Cody
>
> On Tue, Nov 27, 2018 at 6:31 AM Vitaliy Filippov wrote:
> >
> > > CPU: 2 x E5-2603 @1.8GHz
> > > RAM: 16GB
> > > Network: 1G port shared for Ceph public and cluster traffics
> > > Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> > > OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)
> >
> > 0.84 MB/s sequential write is impossibly bad, it's not normal with any
> > kind of devices and even with 1G network, you probably have some kind of
> > problem in your setup - maybe the network RTT is very high or maybe osd or
> > mon nodes are shared with other running tasks and overloaded or maybe your
> > disks are already dead... :))
> >
> > > As I moved on to test block devices, I got a following error message:
> > >
> > > # rbd map image01 --pool testbench --name client.admin
> >
> > You don't need to map it to run benchmarks, use `fio --ioengine=rbd`
> > (however you'll still need /etc/ceph/ceph.client.admin.keyring)
> >
> > --
> > With best regards,
> >    Vitaliy Filippov
Re: [ceph-users] RGW performance with lots of objects
Hi Robert,

"Solved" is probably a strong word. I'd say that things have improved. Bluestore in general tends to handle large numbers of objects better than filestore does, for several reasons, including that it doesn't suffer from pg directory splitting (though RocksDB compaction can become a bottleneck with very large DBs and heavy metadata traffic). Bluestore also has less overhead for OMAP operations, and so far we've generally seen higher OMAP performance (i.e. how bucket indexes are currently stored). The bucket index sharding of course helps too.

One counter-argument is that bluestore uses the key-value DB a lot more aggressively than filestore does, and that could have an impact on bucket indexes hosted on the same OSDs as user objects. This gets sort of complicated, though, and may primarily be an issue if all of your OSDs are backed by NVMe and sustaining very high write traffic.

Ultimately I suspect that if you ran the same 500+ million object single-bucket test, a modern bluestore deployment would probably be faster than what you saw pre-luminous with filestore. Whether or not it's acceptable is a different question. For example, I've noticed in past tests that delete performance improved dramatically when objects were spread across a higher number of buckets. Probably the best course of action will be to run tests and diagnose the behavior to see if it's going to meet your needs.

Thanks,
Mark

On 11/27/18 12:10 PM, Robert Stanford wrote:
> In the old days when I first installed Ceph with RGW the performance
> would be very slow after storing 500+ million objects in my buckets.
> With Luminous and index sharding is this still a problem or is this an
> old problem that has been solved?
>
> Regards
> R
Re: [ceph-users] Poor ceph cluster performance
Hi everyone,

Many, many thanks to all of you!

The root cause was a failed OS drive on one storage node. The server was responsive to ping, but we were unable to log in. After a reboot via IPMI, the docker daemon failed to start due to I/O errors and dmesg complained about the failing OS disk. I failed to catch the problem initially since 'ceph -s' kept showing HEALTH and the cluster was "functional" despite the slow performance.

I really appreciate all the tips and advice received from you all and learned a lot. I will carry your advice (e.g. using bluestore, enterprise ssd/hdd, separating public and cluster traffic, etc.) into my next PoC round.

Thank you very much!

Best regards,
Cody

On Tue, Nov 27, 2018 at 6:31 AM Vitaliy Filippov wrote:
>
> > CPU: 2 x E5-2603 @1.8GHz
> > RAM: 16GB
> > Network: 1G port shared for Ceph public and cluster traffics
> > Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> > OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)
>
> 0.84 MB/s sequential write is impossibly bad, it's not normal with any
> kind of devices and even with 1G network, you probably have some kind of
> problem in your setup - maybe the network RTT is very high or maybe osd or
> mon nodes are shared with other running tasks and overloaded or maybe your
> disks are already dead... :))
>
> > As I moved on to test block devices, I got a following error message:
> >
> > # rbd map image01 --pool testbench --name client.admin
>
> You don't need to map it to run benchmarks, use `fio --ioengine=rbd`
> (however you'll still need /etc/ceph/ceph.client.admin.keyring)
>
> --
> With best regards,
>    Vitaliy Filippov
[ceph-users] RGW performance with lots of objects
In the old days, when I first installed Ceph with RGW, performance would become very slow after storing 500+ million objects in my buckets. With Luminous and index sharding, is this still a problem, or is this an old problem that has been solved?

Regards
R
Re: [ceph-users] Luminous v12.2.10 released
On 11/27/2018 08:50 AM, Abhishek Lekshmanan wrote:
> We're happy to announce the tenth bug fix release of the Luminous
> v12.2.x long term stable release series. The previous release, v12.2.9,
> introduced the PG hard-limit patches which were found to cause an issue
> in certain upgrade scenarios, and this release was expedited to revert
> those patches. If you already successfully upgraded to v12.2.9, you
> should **not** upgrade to v12.2.10, but rather **wait** for a release in
> which http://tracker.ceph.com/issues/36686 is addressed. All other users
> are encouraged to upgrade to this release.

I wonder if you can comment on upgrade policy for a mixed cluster - e.g. where the majority is running 12.2.8 but a handful of newly-added osd nodes were installed with 12.2.9. Should the 12.2.8 nodes be upgraded to 12.2.10 (this does sound like it should have no negative effects) and just the 12.2.9 nodes kept to wait for a future release - or wait on all?

Thanks,
Graham
--
Graham Allan
Minnesota Supercomputing Institute - g...@umn.edu
[ceph-users] Ceph IO stability issues
Hi,

We're currently, progressively, pushing a Ceph Mimic cluster into production and we've noticed a fairly strange behaviour. We use Ceph as a storage backend for Openstack block devices. We've deployed a few VMs on this backend to test the waters. These VMs are practically empty, with only the regular cpanel services running on them and no actual website set up.

We notice that about twice in a span of about 5 minutes, iowait will jump to ~10% without any VM-side explanation: no specific service is taking any more io bandwidth than usual. I must also add that the speed of the cluster is excellent. It's really more of a stability issue that bothers me here. I see the jump in iowait as the VM being unable to read or write on the ceph cluster for a second or so. I've considered that it could be the deep scrub operations, but those seem to complete in 0.1 second, as there's practically no data to scrub.

The cluster pool configuration is as such:

- RBD on an erasure-coded pool (a replicated metadata pool and an erasure-coded data pool) with overwrites enabled
- The data pool size is k=6 m=2, so 8, with 1024 PGs
- The metadata pool size is 3, with 64 PGs

Of course, this is running on bluestore.

As for the hardware, the config is as follows:

- 10 hosts
- 9 OSDs per host
- Each OSD is an Intel DC S3510
- CPUs are dual E5-2680v2 (40 threads total @2.8GHz)
- Each host has 128 GB of ram
- Network is 2x bonded 10gbps, 1 for storage, 1 for replication

I understand that I will eventually hit a speed block because of either the CPUs or the network, but maximum speed is not my current concern here and can be upgraded when needed.

I've been wondering, could these hiccups be caused by data caching at the client level? If so, what could I do to fix this?

Jean-Philippe Méthot
Openstack system administrator / Administrateur système Openstack
PlanetHoster inc.
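[Editor's note] As a side note for readers, the k=6, m=2 profile described above implies the following raw-space cost (simple arithmetic, not taken from the thread):

```python
# Erasure-coding space overhead for the data pool described above.
k, m = 6, 2                        # data chunks and coding chunks

raw_per_logical = (k + m) / k      # raw bytes written per logical byte
usable_fraction = k / (k + m)      # fraction of raw capacity that is usable

print(f"overhead: {raw_per_logical:.2f}x, usable: {usable_fraction:.0%}")
# overhead: 1.33x, usable: 75%
```

Each logical write also fans out to k+m = 8 OSDs, which is one reason small-write latency on EC pools is more sensitive to any per-OSD stall than on replicated pools.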
Re: [ceph-users] Luminous v12.2.10 released
On 27/11/2018 14:50, Abhishek Lekshmanan wrote:
> We're happy to announce the tenth bug fix release of the Luminous
> v12.2.x long term stable release series. The previous release, v12.2.9,
> introduced the PG hard-limit patches which were found to cause an issue
> in certain upgrade scenarios, and this release was expedited to revert
> those patches. If you already successfully upgraded to v12.2.9, you
> should **not** upgrade to v12.2.10, but rather **wait** for a release in
> which http://tracker.ceph.com/issues/36686 is addressed. All other users
> are encouraged to upgrade to this release.

Is it safe for v12.2.9 users to upgrade to v13.2.2 Mimic? http://tracker.ceph.com/issues/36686 suggests a similar revert might be on the cards for v13.2.3, so I'm not sure.

Thanks,
Simon
[ceph-users] Luminous v12.2.10 released
We're happy to announce the tenth bug fix release of the Luminous v12.2.x long term stable release series. The previous release, v12.2.9, introduced the PG hard-limit patches, which were found to cause an issue in certain upgrade scenarios, and this release was expedited to revert those patches. If you already successfully upgraded to v12.2.9, you should **not** upgrade to v12.2.10, but rather **wait** for a release in which http://tracker.ceph.com/issues/36686 is addressed. All other users are encouraged to upgrade to this release.

Notable Changes
---------------

* This release reverts the PG hard-limit patches added in v12.2.9, in which a partial upgrade during a recovery/backfill can cause the osds on the previous version to fail with assert(trim_to <= info.last_complete). The workaround for users is to upgrade and restart all OSDs to a version with the pg hard limit, or only upgrade when all PGs are active+clean. See also: http://tracker.ceph.com/issues/36686

  As mentioned above, if you've successfully upgraded to v12.2.9 DO NOT upgrade to v12.2.10 until the linked tracker issue has been fixed.

* The bluestore_cache_* options are no longer needed. They are replaced by osd_memory_target, defaulting to 4GB. BlueStore will expand and contract its cache to attempt to stay within this limit. Users upgrading should note this is a higher default than the previous bluestore_cache_size of 1GB, so OSDs using BlueStore will use more memory by default. For more details, see the BlueStore docs[1].

For the complete release notes with changelog, please check out the release blog entry at:
http://ceph.com/releases/v12-2-10-luminous-released

Getting ceph:
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-12.2.10.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 177915764b752804194937482a39e95e0ca3de94

[1]: http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#cache-size

--
Abhishek Lekshmanan
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
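[Editor's note] A hypothetical ceph.conf fragment for the new cache behaviour described in the release notes above (the value shown is simply the stated 4GB default spelled out in bytes; adjust per host, this is an illustrative sketch, not a recommendation from the announcement):

```
[osd]
# Replaces the old bluestore_cache_* options in 12.2.10: BlueStore
# expands and contracts its cache to try to stay within this limit.
# 4294967296 bytes = 4 GB, the new default (up from the old 1 GB
# bluestore_cache_size default, so expect higher OSD memory use).
osd_memory_target = 4294967296
```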
[ceph-users] CEPH DR RBD Mount
Hi There,

We are replicating a 100TB RBD image to a DR site. Replication works fine:

rbd --cluster cephdr mirror pool status nfs --verbose
health: OK
images: 1 total
    1 replaying

dir_research:
  global_id:   11e9cbb9-ce83-4e5e-a7fb-472af866ca2d
  state:       up+replaying
  description: replaying, master_position=[object_number=591701, tag_tid=1, entry_tid=902879873], mirror_position=[object_number=446354, tag_tid=1, entry_tid=727653146], entries_behind_master=175226727
  last_update: 2018-11-14 16:17:23

We then use nbd to map the RBD image at the DR site, but when we try to mount it, we get:

# mount /dev/nbd2 /mnt
mount: block device /dev/nbd2 is write-protected, mounting read-only
mount: /dev/nbd2: can't read superblock

We are using 12.2.8. Any help will be greatly appreciated.

Thanks,
-Vikas
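[Editor's note] One avenue worth checking, as an assumption based on general rbd-mirror behaviour rather than anything confirmed in this thread: a non-primary image that is still replaying is not writable, and with entries_behind_master > 0 the replica may not yet be crash-consistent, which can explain an unreadable superblock. A sketch of how a planned failover is typically sequenced before mounting at the DR site:

```
# On the primary site: demote the image (or confirm the primary is down)
rbd --cluster ceph mirror image demote nfs/dir_research

# On the DR site: promote the replica once replay has caught up,
# then map and mount it
rbd --cluster cephdr mirror image promote nfs/dir_research
rbd-nbd --cluster cephdr map nfs/dir_research
mount /dev/nbd0 /mnt
```

The device path and image spec above reuse names from the message; whether this matches the poster's intended workflow (failover test vs. read-only inspection) is not stated in the thread.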
Re: [ceph-users] Poor ceph cluster performance
> CPU: 2 x E5-2603 @1.8GHz
> RAM: 16GB
> Network: 1G port shared for Ceph public and cluster traffics
> Journaling device: 1 x 120GB SSD (SATA3, consumer grade)
> OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade)

0.84 MB/s sequential write is impossibly bad; it's not normal with any kind of devices, even with a 1G network. You probably have some kind of problem in your setup - maybe the network RTT is very high, or maybe the osd or mon nodes are shared with other running tasks and overloaded, or maybe your disks are already dead... :))

> As I moved on to test block devices, I got a following error message:
>
> # rbd map image01 --pool testbench --name client.admin

You don't need to map it to run benchmarks, use `fio --ioengine=rbd` (however you'll still need /etc/ceph/ceph.client.admin.keyring)

--
With best regards,
   Vitaliy Filippov
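[Editor's note] A minimal fio job file for the rbd engine suggested above might look like the following sketch. The pool and image names reuse the ones from the thread (`testbench`, `image01`); the remaining values are illustrative and assume fio was built with rbd support:

```
; rbd.fio -- benchmark an RBD image directly, without kernel mapping
[global]
ioengine=rbd
clientname=admin
pool=testbench
rbdname=image01
direct=1
time_based=1
runtime=60

[seq-write]
rw=write
bs=4M
iodepth=4
```

Run with `fio rbd.fio`; as noted in the reply, /etc/ceph/ceph.client.admin.keyring must still be readable by the user running fio.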
[ceph-users] Libvirt snapshot rollback still has 'new' data
I just rolled back a snapshot, and when I started the (Windows) VM, I noticed a software update I had installed after this snapshot was still there. What am I doing wrong, such that libvirt is not reading the rolled-back snapshot (but is using something from a cache)?
Re: [ceph-users] Poor ceph cluster performance
Hi, Most likely the issue is with your consumer grade journal ssd. Run this to your ssd to check if it performs: fio --filename= --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test On Tue, Nov 27, 2018 at 2:06 AM Cody wrote: > > Hello, > > I have a Ceph cluster deployed together with OpenStack using TripleO. > While the Ceph cluster shows a healthy status, its performance is > painfully slow. After eliminating a possibility of network issues, I > have zeroed in on the Ceph cluster itself, but have no experience in > further debugging and tunning. > > The Ceph OSD part of the cluster uses 3 identical servers with the > following specifications: > > CPU: 2 x E5-2603 @1.8GHz > RAM: 16GB > Network: 1G port shared for Ceph public and cluster traffics > Journaling device: 1 x 120GB SSD (SATA3, consumer grade) > OSD device: 2 x 2TB 7200rpm spindle (SATA3, consumer grade) > > This is not beefy enough in any way, but I am running for PoC only, > with minimum utilization. > > Ceph-mon and ceph-mgr daemons are hosted on the OpenStack Controller > nodes. Ceph-ansible version is 3.1 and is using Filestore with > non-colocated scenario (1 SSD for every 2 OSDs). Connection speed > among Controllers, Computes, and OSD nodes can reach ~900Mbps tested > using iperf. 
>
> I followed the Red Hat Ceph 3 benchmarking procedure [1] and received
> the following results:
>
> Write Test:
>
> Total time run:         80.313004
> Total writes made:      17
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     0.846687
> Stddev Bandwidth:       0.320051
> Max bandwidth (MB/sec): 2
> Min bandwidth (MB/sec): 0
> Average IOPS:           0
> Stddev IOPS:            0
> Max IOPS:               0
> Min IOPS:               0
> Average Latency(s):     66.6582
> Stddev Latency(s):      15.5529
> Max latency(s):         80.3122
> Min latency(s):         29.7059
>
> Sequential Read Test:
>
> Total time run:       25.951049
> Total reads made:     17
> Read size:            4194304
> Object size:          4194304
> Bandwidth (MB/sec):   2.62032
> Average IOPS:         0
> Stddev IOPS:          0
> Max IOPS:             1
> Min IOPS:             0
> Average Latency(s):   24.4129
> Max latency(s):       25.9492
> Min latency(s):       0.117732
>
> Random Read Test:
>
> Total time run:       66.355433
> Total reads made:     46
> Read size:            4194304
> Object size:          4194304
> Bandwidth (MB/sec):   2.77295
> Average IOPS:         0
> Stddev IOPS:          3
> Max IOPS:             27
> Min IOPS:             0
> Average Latency(s):   21.4531
> Max latency(s):       66.1885
> Min latency(s):       0.0395266
>
> Apparently, the results are pathetic...
>
> As I moved on to test block devices, I got the following error message:
>
> # rbd map image01 --pool testbench --name client.admin
> rbd: failed to add secret 'client.admin' to kernel
>
> Any suggestions on the above error and/or debugging would be greatly
> appreciated!
>
> Thank you very much to all.
>
> Cody
>
> [1] https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#benchmarking_performance
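[Editor's note: the write-test figures quoted above are internally consistent: 17 objects of 4 MB written over 80.313 s works out to the reported ~0.85 MB/s. A quick sanity check of the arithmetic:]

```shell
# Sanity-check the rados bench write result: 17 writes x 4 MB in 80.313004 s
awk 'BEGIN { printf "%.4f MB/sec\n", 17 * 4 / 80.313004 }'
# prints 0.8467 MB/sec, matching the reported 0.846687 MB/sec
```

So the tool is reporting honestly; the bottleneck is in the cluster itself, which is why the replies focus on the journal SSD and the network.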
Re: [ceph-users] pre-split causing slow requests when rebuild osd ?
If you are re-creating or adding the OSD anyways: consider using Bluestore for the new ones, it performs *so much* better. Especially in scenarios like these. Running a mixed configuration is no problem in our experience.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

Am Di., 27. Nov. 2018 um 08:50 Uhr schrieb hnuzhoulin2 :
>
> Hi, guys
>
> I have a 42-node cluster, and I created the pool using expected_num_objects to
> pre-split filestore dirs.
>
> Today I rebuilt an OSD because of a disk error; it caused many slow
> requests. The filestore logs look like below:
>
> 2018-11-26 16:49:41.003336 7f2dad075700 10 filestore(/home/ceph/var/lib/osd/ceph-4) create_collection /home/ceph/var/lib/osd/ceph-4/current/388.433_head = 0
> 2018-11-26 16:49:41.003479 7f2dad075700 10 filestore(/home/ceph/var/lib/osd/ceph-4) create_collection /home/ceph/var/lib/osd/ceph-4/current/388.433_TEMP = 0
> 2018-11-26 16:49:41.003570 7f2dad075700 10 filestore(/home/ceph/var/lib/osd/ceph-4) _set_replay_guard 33.0.0
> 2018-11-26 16:49:41.003591 7f2dad876700 5 filestore(/home/ceph/var/lib/osd/ceph-4) _journaled_ahead 0x55e054382300 seq 81 osr(388.2bd 0x55e053ed9280) [Transaction(0x55e06d304680)]
> 2018-11-26 16:49:41.003603 7f2dad876700 5 filestore(/home/ceph/var/lib/osd/ceph-4) queue_op 0x55e054382300 seq 81 osr(388.2bd 0x55e053ed9280) 1079089 bytes (queue has 50 ops and 15513428 bytes)
> 2018-11-26 16:49:41.003608 7f2dad876700 10 filestore(/home/ceph/var/lib/osd/ceph-4) queueing ondisk 0x55e06cc83f80
> 2018-11-26 16:49:41.024714 7f2d9d055700 5 filestore(/home/ceph/var/lib/osd/ceph-4) queue_transactions existing 0x55e053a5d1e0 osr(388.f2a 0x55e053ed92e0)
> 2018-11-26 16:49:41.166512 7f2dac874700 10 filestore oid: #388:c940head# not skipping op, *spos 32.0.1
> 2018-11-26 16:49:41.166522 7f2dac874700 10 filestore  > header.spos 0.0.0
> 2018-11-26 16:49:41.170670 7f2dac874700 10 filestore oid: #388:c940head# not skipping op, *spos 32.0.2
> 2018-11-26 16:49:41.170680 7f2dac874700 10 filestore  > header.spos 0.0.0
> 2018-11-26 16:49:41.183259 7f2dac874700 10 filestore(/home/ceph/var/lib/osd/ceph-4) _do_op 0x55e05ddb3480 seq 32 r = 0, finisher 0x55e051d122e0 0
> 2018-11-26 16:49:41.187211 7f2dac874700 10 filestore(/home/ceph/var/lib/osd/ceph-4) _finish_op 0x55e05ddb3480 seq 32 osr(388.293 0x55e053ed84b0)/0x55e053ed84b0 lat 47.804533
> 2018-11-26 16:49:41.187232 7f2dac874700 5 filestore(/home/ceph/var/lib/osd/ceph-4) _do_op 0x55e052113e60 seq 34 osr(388.2d94 0x55e053ed91c0)/0x55e053ed91c0 start
> 2018-11-26 16:49:41.187236 7f2dac874700 10 filestore(/home/ceph/var/lib/osd/ceph-4) _do_transaction on 0x55e05e022140
> 2018-11-26 16:49:41.187239 7f2da4864700 5 filestore(/home/ceph/var/lib/osd/ceph-4) queue_transactions (writeahead) 82 [Transaction(0x55e0559e6d80)]
>
> It looks like it is very slow when creating PG dirs like:
> /home/ceph/var/lib/osd/ceph-4/current/388.433
>
> At the start of the service, while the OSD is not yet up, it works well: no
> slow requests, and the PG dirs are being created.
> But once the OSD state is up, slow requests come in while the PG dirs are
> still being created.
>
> When I disable the config "filestore merge threshold = -10" in the ceph.conf,
> the rebuild process works well and the PG dirs are created very fast. Then I
> see dir splits in the log:
>
> 2018-11-26 19:16:56.406276 7f768b189700 1 _created [8,F,8] has 593 objects, starting split.
> 2018-11-26 19:16:56.977392 7f768b189700 1 _created [8,F,8] split completed.
> 2018-11-26 19:16:57.032567 7f768b189700 1 _created [8,F,8,6] has 594 objects, starting split.
> 2018-11-26 19:16:57.814694 7f768b189700 1 _created [8,F,8,6] split completed.
>
> So, how can I configure things so that all PG dirs are created before the OSD
> state is up? Or is there another solution?
>
> Thanks.
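[Editor's note: for reference, the pre-split behaviour discussed above is driven by the filestore split/merge settings in ceph.conf. The fragment below is a sketch; the merge threshold value is taken from the thread, while the split multiple is an illustrative assumption, not the poster's actual setting.]

```ini
[osd]
# A negative merge threshold makes pool creation pre-split collection dirs
# (together with expected_num_objects) and disables dir merging (from the thread).
filestore merge threshold = -10
# Split point scales with the merge threshold; 2 here is an illustrative value.
filestore split multiple = 2
```

Whether dirs are pre-split at pool creation or split lazily during backfill is exactly the trade-off the slow requests above expose, which is also why the reply recommends moving to Bluestore, where this directory-splitting mechanism does not exist.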