[ceph-users] cephfs triggers warnings "tar: file changed as we read it"
Hello,

I'm evaluating CephFS on a cluster of virtual machines. I'm using Infernalis (9.2.0) on Debian Jessie as both client and server. I'm trying to get some performance numbers for operations like tar/untar on trees like the Linux kernel source, and I'm hitting an issue where tar emits the warning "file changed as we read it". I reproduced the problem with just the Documentation directory:

root@admin:/mnt/cephfs# rm -rf Documentation ; tar xf linux-documentation.tar.bz2 ; sync ; tar c Documentation > /dev/null
tar: Documentation/parisc: file changed as we read it
tar: Documentation/pcmcia: file changed as we read it
tar: Documentation/phy: file changed as we read it
tar: Documentation/platform: file changed as we read it
tar: Documentation/power: file changed as we read it
tar: Documentation/powerpc: file changed as we read it
tar: Documentation/pps: file changed as we read it
tar: Documentation/prctl: file changed as we read it
tar: Documentation/pti: file changed as we read it
tar: Documentation/ptp: file changed as we read it
tar: Documentation/rapidio: file changed as we read it
tar: Documentation/s390: file changed as we read it
tar: Documentation/scheduler: file changed as we read it
tar: Documentation/scsi: file changed as we read it
tar: Documentation/security: file changed as we read it
tar: Documentation/w1/slaves: file changed as we read it
tar: Documentation/watchdog: file changed as we read it
tar: Documentation/x86: file changed as we read it
tar: Documentation/zh_CN: file changed as we read it
tar: Documentation: file changed as we read it

If I wait between the two commands, the number of warnings is reduced, but they are not eliminated:

root@admin:/mnt/cephfs# rm -rf Documentation ; tar xf linux-documentation.tar.bz2 ; sleep 10 ; tar c Documentation > /dev/null
tar: Documentation/virtual: file changed as we read it
tar: Documentation/w1: file changed as we read it
tar: Documentation/watchdog: file changed as we read it
tar: Documentation/x86: file changed as we read it
tar: Documentation/zh_CN: file changed as we read it
tar: Documentation: file changed as we read it

root@admin:/mnt/cephfs# rm -rf Documentation ; tar xf linux-documentation.tar.bz2 ; sleep 120 ; tar c Documentation > /dev/null
tar: Documentation: file changed as we read it

I'm sure no other client process is modifying the files. I see this problem with both the FUSE client and the kernel client (the version shipped in Jessie).
By doing a "stat", I see some meta-data are changed: root@admin:/mnt/cephfs# rm -rf Documentation ; tar xf linux-documentation.tar.bz2 ;stat Documentation; tar -c Documentation>/dev/null ; stat Documentation File: ‘Documentation’ Size: 14740322Blocks: 1 IO Block: 4096 directory Device: 23h/35d Inode: 1099511913288 Links: 1 Access: (0770/drwxrwx---) Uid: ( 1000/ ) Gid: ( 1000/ ) Access: 2016-01-15 16:51:40.882143334 + Modify: 2015-05-12 09:34:49.0 +0100 Change: 2016-01-15 16:52:31.745684502 + Birth: - tar: Documentation/scheduler: file changed as we read it tar: Documentation/scsi: file changed as we read it tar: Documentation/zh_CN/arm64: file changed as we read it tar: Documentation/zh_CN/filesystems: file changed as we read it tar: Documentation/zh_CN/video4linux: file changed as we read it tar: Documentation: file changed as we read it File: ‘Documentation’ Size: 15088573Blocks: 1 IO Block: 4096 directory Device: 23h/35d Inode: 1099511913288 Links: 1 Access: (0770/drwxrwx---) Uid: ( 1000/ ) Gid: ( 1000/ ) Access: 2016-01-15 16:51:40.882143334 + Modify: 2015-05-12 09:34:49.0 +0100 Change: 2016-01-15 16:52:31.745684502 + Birth: - I know it's possible to silence this warning with a tar option but, I don't want to worry about that in every commands of every script, it changes the tar output. And above all, I don't find that very clean. Do you know any settings that guarantees that all pending async writes are terminated when a client opens a file ? Regards Thomas HAMEL ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
Hello Yehuda,

Here it is:

radosgw-admin object stat --bucket="noaa-nexrad-l2" --object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
{
    "name": "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
    "size": 7147520,
    "policy": {
        "acl": {
            "acl_user_map": [
                {
                    "user": "b05f707271774dbd89674a0736c9406e",
                    "acl": 15
                }
            ],
            "acl_group_map": [
                {
                    "group": 1,
                    "acl": 1
                }
            ],
            "grant_map": [
                {
                    "id": "",
                    "grant": {
                        "type": {
                            "type": 2
                        },
                        "id": "",
                        "email": "",
                        "permission": {
                            "flags": 1
                        },
                        "name": "",
                        "group": 1
                    }
                },
                {
                    "id": "b05f707271774dbd89674a0736c9406e",
                    "grant": {
                        "type": {
                            "type": 0
                        },
                        "id": "b05f707271774dbd89674a0736c9406e",
                        "email": "",
                        "permission": {
                            "flags": 15
                        },
                        "name": "noaa-commons",
                        "group": 0
                    }
                }
            ]
        },
        "owner": {
            "id": "b05f707271774dbd89674a0736c9406e",
            "display_name": "noaa-commons"
        }
    },
    "etag": "b91b6f1650350965c5434c547b3c38ff-1\u",
    "tag": "_cWrvEa914Gy1AeyzIhRlUdp1wJnek3E\u",
    "manifest": {
        "objs": [],
        "obj_size": 7147520,
        "explicit_objs": "false",
        "head_obj": {
            "bucket": {
                "name": "noaa-nexrad-l2",
                "pool": ".rgw.buckets",
                "data_extra_pool": ".rgw.buckets.extra",
                "index_pool": ".rgw.buckets.index",
                "marker": "default.384153.1",
                "bucket_id": "default.384153.1"
            },
            "key": "",
            "ns": "",
            "object": "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
            "instance": ""
        },
        "head_size": 0,
        "max_head_size": 0,
        "prefix": "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",
        "tail_bucket": {
            "name": "noaa-nexrad-l2",
            "pool": ".rgw.buckets",
            "data_extra_pool": ".rgw.buckets.extra",
            "index_pool": ".rgw.buckets.index",
            "marker": "default.384153.1",
            "bucket_id": "default.384153.1"
        },
        "rules": [
            {
                "key": 0,
                "val": {
                    "start_part_num": 1,
                    "start_ofs": 0,
                    "part_size": 0,
                    "stripe_max_size": 4194304,
                    "override_prefix": ""
                }
            }
        ]
    },
    "attrs": {}
}

On 1/15/16 11:17 AM, Yehuda Sadeh-Weinraub wrote:
> radosgw-admin object stat --bucket=<bucket> --object=<object>
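For readers following along: with "explicit_objs": "false", RGW derives the RADOS names of the tail objects from the "prefix" field above. A hedged sketch of how to stat the expected pieces directly, using the __multipart_/__shadow_ naming that appears later in this thread (the naming is reconstructed from the manifest and is not guaranteed across RGW versions):

    bucket_id="default.384153.1"
    prefix="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD"
    # first 4 MiB stripe of part 1, then its shadow continuation
    rados -p .rgw.buckets stat "${bucket_id}__multipart_${prefix}.1"
    rados -p .rgw.buckets stat "${bucket_id}__shadow_${prefix}.1_1"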
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
On Thu, Jan 14, 2016 at 10:51 PM, seapasu...@uchicago.edu wrote:
> It looks like the gateway is experiencing a similar race condition to what
> we reported before.
>
> The rados object has a size of 0 bytes, but the bucket index shows the
> object listed, and the object metadata shows a size of 7147520 bytes.
>
> I have a lot of logs, but I don't think any of them have the full data from
> the upload of this object.
>
> I thought this bug was fixed back in firefly/giant:
>
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg19971.html
>
> --
>
> root@kg34-33:/srv/nfs/griffin_temp# rados -p .rgw.buckets stat default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar
> .rgw.buckets/default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar mtime 1446672570, size 0
>
> --
>
> SError: [Errno 2] No such file or directory:
> '/srv/nfs/griffin_tempnoaa-nexrad-l2/2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar'
>
> In [13]: print(key.size)
> 7147520
>
> We are currently using 0.94.5, and the files were uploaded to hammer as well:
>
> lacadmin@kh28-10:~$ ceph --version
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> lacadmin@kh28-10:~$ radosgw --version
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
>
> The cluster is health_ok and was ok during the upload. I need to confirm
> with the person who uploaded the data, but I think they did it with s3cmd.
> Has anyone seen this before? I think I need to file a bug :-(

What does 'radosgw-admin object stat --bucket=<bucket> --object=<object>' show?

Yehuda
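As a side note, one way to gauge how widespread this is: sweep the bucket listing and attempt a GET for each listed key, flagging the ones that fail. A rough sketch with s3cmd, which this thread already uses (assumes s3cmd is configured against this gateway; the prefix is just an example):

    s3cmd ls -r s3://noaa-nexrad-l2/2015/01/01/ | awk '{print $4}' |
    while read -r key; do
        # a key that is listed but cannot be read is in the broken state above
        s3cmd get --force "$key" /dev/null >/dev/null 2>&1 || echo "unreadable: $key"
    done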
Re: [ceph-users] Odd single VM ceph error
On Thu, 14 Jan 2016, Robert LeBlanc wrote:
> We have a single VM that is acting odd. We had 7 SSD OSDs (out of 40) go
> down over a period of about 12 hours. These are a cache tier and have size
> 4, min_size 2. I'm not able to make heads or tails of the error and hoped
> someone here could help.
>
> 2016-01-14 23:09:54.559121 osd.136 [ERR] 13.503 copy from
> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 to
> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 data digest
> 0x92bc163c != source 0x8fe2d0a9
>
> The PG fully recovered, then the error was:
>
> 2016-01-15 00:39:25.321469 osd.12 [ERR] 13.503 copy from
> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 to
> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 data digest
> 0x92bc163c != source 0x8fe2d0a9
>
> A deep scrub of the PG comes back clean, and a hash of the files on all
> OSDs matches. The file system on this VM keeps going read-only.
>
> The OSD file system is EXT4 and this is 0.94.5.

You're using cache tiering, I take it? I think the error is in the base
tier, while the PG mentioned is the cache tier.

sage
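A quick way to check that theory is to compare the object's copy in each tier directly; a sketch, where the pool names are placeholders for the cache and base pools behind pool 13, and the object name should be taken from the log (the name quoted there may be truncated):

    rados -p <cache-pool> stat <rbd_data object>   # copy in the cache tier
    rados -p <base-pool> stat <rbd_data object>    # copy in the base tier
    ceph osd map <base-pool> <rbd_data object>     # which base-tier PG holds it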
Re: [ceph-users] Infernalis upgrade breaks when journal on separate partition
If you are not booting from the GPT disk, you don't need the EFI partition
(or any special boot partition). The required backup GPT table is usually
put at the end of the disk, where there is usually some free space anyway.
It has been a long time since I've converted from MBR to GPT, but it didn't
require any resizing that I remember. I'd test it in a VM or similar to
make sure you understand the process. You will also have to manually add
the Ceph journal UUID to the partition after the conversion for it all to
work automatically.

- Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

On Thu, Jan 14, 2016 at 9:00 PM, Stuart Longland wrote:
> On 12/01/16 01:22, Stillwell, Bryan wrote:
>>>> Well, it seems I spoke too soon. Not sure what logic the udev rules use
>>>> to identify ceph journals, but it doesn't seem to pick up on the
>>>> journals in our case, as after a reboot those partitions are owned by
>>>> root:disk with permissions 0660.
>>
>> This is handled by the UUIDs of the GPT partitions, and since you're using
>> MS-DOS partition tables it won't work correctly. I would recommend
>> switching to GPT partition tables if you can.
>
> I'm not comfortable with switching from MS-DOS to GPT disklabels on a
> running production server, and nothing in the Ceph docs at the time
> mentioned that GPT was a requirement on the journal disks.
>
> Switching to GPT isn't toggling an option: it'd require resizing
> partitions (some of these are xfs, which can't be shrunk to my knowledge)
> and moving stuff around to allow for an additional EFI boot partition,
> possibly changes to boot firmware settings and bootloaders too.
>
> I'll look into a udev rule for our particular case and see how I go.
> --
> Stuart Longland - Systems Engineer, VRT Systems, Milton QLD 4064
> http://www.vrt.com.au
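For completeness, the "Ceph Journal UUID" Robert mentions is the GPT partition type GUID 45b0969e-9b03-4f30-b4c6-b4b80ceff106, which the ceph udev rules match on. After converting to GPT, tagging the journal partition would look roughly like this (partition number and device are placeholders; try it on a scratch disk first):

    # tag partition 2 on /dev/sdX with the Ceph journal type GUID
    sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdX
    partprobe /dev/sdX   # re-read the partition table so udev re-triggers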
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu wrote:
> Hello Yehuda,
>
> Here it is:
>
> radosgw-admin object stat --bucket="noaa-nexrad-l2" --object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
>
> [snip: full manifest output quoted in the previous message]
>
>         "head_size": 0,
>         "max_head_size": 0,
>         "prefix": "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",

Try running:

$ rados -p .rgw.buckets ls | grep pcu5Hz6

Yehuda

> [snip: remainder of the quoted manifest]
Re: [ceph-users] Odd single VM ceph error
Exactly the problem. Thanks for the pointer in the right direction.

- Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

On Fri, Jan 15, 2016 at 10:28 AM, Sage Weil wrote:
> On Thu, 14 Jan 2016, Robert LeBlanc wrote:
>> We have a single VM that is acting odd. We had 7 SSD OSDs (out of 40) go
>> down over a period of about 12 hours. These are a cache tier and have size
>> 4, min_size 2. I'm not able to make heads or tails of the error and hoped
>> someone here could help.
>>
>> [snip: error details quoted in full earlier in the thread]
>
> You're using cache tiering, I take it? I think the error is in the base
> tier, while the PG mentioned is the cache tier.
>
> sage
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
lacadmin@kh28-10:~$

Nothing was found. That said, when I run the command with another object's prefix snippet:

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'wksHvto'
default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1_1
default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1

On 1/15/16 12:05 PM, Yehuda Sadeh-Weinraub wrote:
> On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu wrote:
>> Hello Yehuda,
>>
>> Here it is:
>>
>> [snip: manifest output quoted in full earlier in the thread]
>>
>>         "prefix": "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",
>
> Try running:
>
> $ rados -p .rgw.buckets ls | grep pcu5Hz6
>
> Yehuda
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
That's interesting, and might point at the underlying issue that caused
it. Could be a racing upload that somehow ended up with the wrong object
head. The 'multipart' object should be 4M in size, and the 'shadow' one
should have the remainder of the data. You can run 'rados stat -p
.rgw.buckets <object>' to validate that. If that's the case, you can copy
these to the expected object names:

$ src_uploadid=wksHvto9gRgHUJbhm_TZPXJTZUPXLT2
$ dest_uploadid=pcu5Hz6foFXjlSxBat22D8YMcHlQOBD

$ rados -p .rgw.buckets cp default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_uploadid}.1 default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_uploadid}.1

$ rados -p .rgw.buckets cp default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_uploadid}.1_1 default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_uploadid}.1_1

Yehuda

On Fri, Jan 15, 2016 at 1:02 PM, seapasu...@uchicago.edu wrote:
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
> lacadmin@kh28-10:~$
>
> Nothing was found. That said, when I run the command with another object's
> prefix snippet:
>
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'wksHvto'
> default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1_1
> default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1
>
> On 1/15/16 12:05 PM, Yehuda Sadeh-Weinraub wrote:
>> [snip: manifest output and grep suggestion quoted in full earlier in the thread]
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
Sorry, I am a bit confused. The successful listing that I provided is for
a different object of the same size, to show that I could indeed get a
listing. Are you saying to copy the working object over the missing
object? Sorry for the confusion.

On 1/15/16 3:20 PM, Yehuda Sadeh-Weinraub wrote:
> That's interesting, and might point at the underlying issue that caused
> it. Could be a racing upload that somehow ended up with the wrong object
> head. The 'multipart' object should be 4M in size, and the 'shadow' one
> should have the remainder of the data. You can run 'rados stat -p
> .rgw.buckets <object>' to validate that. If that's the case, you can copy
> these to the expected object names:
>
> [snip: rados cp commands quoted in full in the previous message]
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
Ah, I see. Misread that; the object names were very similar. No, don't
copy it. You can try to grep for the specific object name and see if
there are pieces of it lying around under a different upload id.

Yehuda

On Fri, Jan 15, 2016 at 1:44 PM, seapasu...@uchicago.edu wrote:
> Sorry, I am a bit confused. The successful listing that I provided is for
> a different object of the same size, to show that I could indeed get a
> listing. Are you saying to copy the working object over the missing
> object? Sorry for the confusion.
>
> [snip: earlier messages quoted in full above]
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
Sorry for the confusion:

When I grep for the prefix of the missing object,
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",
I am not able to find any chunks of the object:

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
lacadmin@kh28-10:~$

The only piece of the object that I can seem to find is the original one I posted:

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959'
default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar

And when we stat this object, it is 0 bytes, as shown earlier:

lacadmin@kh28-10:~$ rados -p .rgw.buckets stat 'default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar'
.rgw.buckets/default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar mtime 2015-11-04 15:29:30.00, size 0

Sorry again for the confusion.

On 1/15/16 3:58 PM, Yehuda Sadeh-Weinraub wrote:
> Ah, I see. Misread that; the object names were very similar. No, don't
> copy it. You can try to grep for the specific object name and see if
> there are pieces of it lying around under a different upload id.
>
> Yehuda
>
> [snip: earlier messages quoted in full above]
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
The head object of a multipart object has 0 size, so that's expected.
What's missing is the tail of the object. I don't suppose you have any
logs from when the object was uploaded?

Yehuda

On Fri, Jan 15, 2016 at 2:12 PM, seapasu...@uchicago.edu wrote:
> Sorry for the confusion:
>
> When I grep for the prefix of the missing object, I am not able to find
> any chunks of it:
>
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
> lacadmin@kh28-10:~$
>
> The only piece of the object that I can seem to find is the original one I posted:
>
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959'
> default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar
>
> And when we stat this object, it is 0 bytes, as shown earlier:
>
> lacadmin@kh28-10:~$ rados -p .rgw.buckets stat 'default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar'
> .rgw.buckets/default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar mtime 2015-11-04 15:29:30.00, size 0
>
> Sorry again for the confusion.
>
> [snip: earlier messages quoted in full above]
Re: [ceph-users] Inconsistent PG / Impossible deep-scrub
Finally, after hitting corruption with the MDS, I had no choice but to try
to manually repair the PG. Following the procedure in the Ceph blog post
at http://ceph.com/planet/ceph-manually-repair-object/ I was able to get
the PG back to active+clean. "ceph pg repair" still wasn't working, and an
automatic deep-scrub had confirmed the object was still faulty.

On Fri, Dec 18, 2015 at 10:42 AM, Jérôme Poulin wrote:
> Good day everyone,
>
> I currently manage a Ceph cluster running Firefly 0.80.10. We had some
> maintenance which implied stopping OSDs and starting them back again. This
> caused one of the hard drives to notice it had a bad sector, and Ceph then
> marked the PG as inconsistent.
>
> After repairing the physical issue, I went and tried "ceph pg repair",
> with no action; then I tried "ceph pg deep-scrub", still no action.
>
> I checked the log of each OSD holding the PG and confirmed that nothing
> was logged: no repair, no deep-scrub. After trying to deep-scrub other
> PGs manually, I confirmed that my requests were being completely ignored.
>
> The only flag set is noout, since this cluster is too small, but automatic
> deep-scrubs are working and are logged both in ceph.log and the OSD log.
>
> I tried restarting the monitor in charge so a new leader would be elected,
> and restarted each OSD holding the inconsistent PG, with no success.
>
> I also tried to fix the defective object myself in case that was hanging
> something; now the object has the same checksum on each OSD.
>
> Is there a way to ask the OSD directly to deep-scrub without going through
> the monitor? Is there a known issue with these commands getting ignored?
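For readers landing here later, the procedure in that blog post is, in outline (OSD id, PG id, and object name are placeholders; paths assume FileStore defaults, and on Firefly the sysvinit service names apply):

    ceph pg map <pgid>                       # find the acting OSDs
    service ceph stop osd.<id>               # on the OSD holding the bad copy
    ceph-osd -i <id> --flush-journal
    # the object file may sit under hashed DIR_* subdirectories of the PG dir
    find /var/lib/ceph/osd/ceph-<id>/current/<pgid>_head -name '<object>*'
    mv <found file> /root/                   # move the bad copy aside
    service ceph start osd.<id>
    ceph pg repair <pgid>                    # re-replicates from a good copy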
Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.
I have looked all over and I do not see any explicit mention of
"NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959" in the logs, nor do I
see a timestamp from November 4th, although I do see log rotations dating
back to October 15th. I don't think it's possible that it wasn't logged,
so I am going through the bucket logs via 'radosgw-admin log show
--object', and I found the following:

{
    "bucket": "noaa-nexrad-l2",
    "time": "2015-11-04 21:29:27.346509Z",
    "time_local": "2015-11-04 15:29:27.346509",
    "remote_addr": "",
    "object_owner": "b05f707271774dbd89674a0736c9406e",
    "user": "b05f707271774dbd89674a0736c9406e",
    "operation": "PUT",
    "uri": "\/noaa-nexrad-l2\/2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
    "http_status": "200",
    "error_code": "",
    "bytes_sent": 19,
    "bytes_received": 0,
    "object_size": 0,
    "total_time": 142640400,
    "user_agent": "Boto\/2.38.0 Python\/2.7.7 Linux\/2.6.32-573.7.1.el6.x86_64",
    "referrer": ""
}

Does this help at all? The total_time seems exceptionally high. Would it
be possible that there is a timeout issue, where the PUT request started a
multipart upload with the correct header and then timed out, but the
radosgw took the data anyway? I am surprised the radosgw returned a 200,
let alone placed the key in the bucket listing. That said, here is the
entry for another object (a different one) that also 404s:

{
    "bucket": "noaa-nexrad-l2",
    "time": "2015-11-05 04:50:42.606838Z",
    "time_local": "2015-11-04 22:50:42.606838",
    "remote_addr": "",
    "object_owner": "b05f707271774dbd89674a0736c9406e",
    "user": "b05f707271774dbd89674a0736c9406e",
    "operation": "PUT",
    "uri": "\/noaa-nexrad-l2\/2015\/02\/25\/KVBX\/NWS_NEXRAD_NXL2DP_KVBX_2015022516_20150225165959.tar",
    "http_status": "200",
    "error_code": "",
    "bytes_sent": 19,
    "bytes_received": 0,
    "object_size": 0,
    "total_time": 0,
    "user_agent": "Boto\/2.38.0 Python\/2.7.7 Linux\/2.6.32-573.7.1.el6.x86_64",
    "referrer": ""
}

And this one fails with a 404 as well. Does this help at all? Here is a
log entry for a successful object (again a different one) as well, just
in case:

{
    "bucket": "noaa-nexrad-l2",
    "time": "2015-11-04 21:16:44.148603Z",
    "time_local": "2015-11-04 15:16:44.148603",
    "remote_addr": "",
    "object_owner": "b05f707271774dbd89674a0736c9406e",
    "user": "b05f707271774dbd89674a0736c9406e",
    "operation": "PUT",
    "uri": "\/noaa-nexrad-l2\/2015\/01\/01\/KAKQ\/NWS_NEXRAD_NXL2DP_KAKQ_2015010108_20150101085959.tar",
    "http_status": "200",
    "error_code": "",
    "bytes_sent": 19,
    "bytes_received": 0,
    "object_size": 0,
    "total_time": 0,
    "user_agent": "Boto\/2.38.0 Python\/2.7.7 Linux\/2.6.32-573.7.1.el6.x86_64",
    "referrer": ""
}

So I am guessing these are not pertinent, as the failing and successful
entries look nearly identical. Unfortunately I do not have any
client.radosgw logs for the failed files, for some reason. Is there
anything else I can do to troubleshoot this issue? In the end, the
radosgw should never have listed these files, as they never completed
successfully, right?

On 1/15/16 4:36 PM, Yehuda Sadeh-Weinraub wrote:
> The head object of a multipart object has 0 size, so that's expected.
> What's missing is the tail of the object. I don't suppose you have any
> logs from when the object was uploaded?
>
> Yehuda
>
> [snip: earlier messages quoted in full above]
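One more avenue for sweeping the logs: the entries above carry enough fields to search for other PUTs with the same signature across all the bucket log objects. A rough sketch, where the field names are taken from the entries quoted above but the jq path uses recursive descent because the exact nesting of radosgw-admin's output is a guess (assumes jq is installed):

    for log in $(radosgw-admin log list | jq -r '.[]'); do
        radosgw-admin log show --object="$log" \
        | jq -r '.. | objects
                 | select(.operation? == "PUT" and .object_size? == 0)
                 | [.time_local, .total_time, .uri] | @tsv'
    done

As noted above, a successful PUT can also log object_size 0, so the abnormal total_time may be the only useful discriminator in that output.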
[ceph-users] OSDs are down, don't know why
Hello,

I'm setting up a small test instance of Ceph, and I'm running into a
situation where the OSDs are shown as down, but I don't know why.

Connectivity seems to be working. The OSD hosts are able to communicate
with the MON hosts; running "ceph status" and "ceph osd in" from an OSD
host works fine, but with a HEALTH_WARN that I have 2 osds: 0 up, 2 in.
Both the OSD and MON daemons seem to be running fine. Network
connectivity seems to be okay: I can nc from the OSD to port 6789 on the
MON, and from the MON to ports 6800-6803 on the OSD (I have constrained
the ms bind port min/max config options so that the OSDs will use only
these ports). Neither the OSD nor the MON logs show anything that seems
unusual, nor why the OSD is marked as being down. Furthermore, using
tcpdump I've watched network traffic between the OSD and the MON, and it
seems that the OSD is sending heartbeats and getting an ack from the MON.
So I'm definitely not sure why the MON thinks the OSD is down.

Some questions:
- How does the MON determine if an OSD is down?
- Is there a way to get the MON to report on why an OSD is down, e.g. no heartbeat?
- Is there any need to open ports other than TCP 6789 and 6800-6803?
- Any other suggestions?

This is ceph 0.94 on Debian Jessie.

Best,
Jeff
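One note that may help here: OSD liveness is judged mainly from OSD-to-OSD heartbeats (peers report failures to the monitors, plus a monitor-side timeout if an OSD never checks in), not from OSD-to-MON traffic, so with two OSDs it is worth confirming they can reach each other's ports as well. A few checks that usually narrow this down (osd.0 as an example; the admin-socket command must run on the OSD's own host):

    ceph osd dump | grep '^osd'      # addresses and up/down state the mons recorded
    ceph daemon osd.0 status         # the OSD's own view, via the admin socket
    ceph tell osd.0 version          # round-trips through the cluster to the OSD
    ceph health detail               # often says why an OSD is marked down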
Re: [ceph-users] Infernalis upgrade breaks when journal on separate partition
In my case one server was also non-GPT installed, and in
/usr/sbin/ceph-disk I added this line after line 1926:

    os.chmod(os.path.join(path, 'journal'), 0777)

I know it's very ugly and shouldn't be done in production, but I had no
time to search for the proper way to fix this.

Regards
Michał Chybowski
Tiktalik.com

On 15.01.2016 at 05:00, Stuart Longland wrote:
> On 12/01/16 01:22, Stillwell, Bryan wrote:
>> [snip: quoted in full in an earlier message]
>
> I'm not comfortable with switching from MS-DOS to GPT disklabels on a
> running production server, and nothing in the Ceph docs at the time
> mentioned that GPT was a requirement on the journal disks.
>
> Switching to GPT isn't toggling an option: it'd require resizing
> partitions (some of these are xfs, which can't be shrunk to my knowledge)
> and moving stuff around to allow for an additional EFI boot partition,
> possibly changes to boot firmware settings and bootloaders too.
>
> I'll look into a udev rule for our particular case and see how I go.
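On Stuart's udev idea: a local rule pinning ownership of the specific journal partitions is a less invasive fix than patching ceph-disk. A sketch, where the device names are placeholders for your journal partitions (Infernalis daemons run as ceph:ceph):

    # /etc/udev/rules.d/99-ceph-journal-local.rules
    ACTION=="add|change", KERNEL=="sdb2", OWNER="ceph", GROUP="ceph", MODE="0660"
    ACTION=="add|change", KERNEL=="sdc2", OWNER="ceph", GROUP="ceph", MODE="0660"

After creating the file, 'udevadm control --reload' and 'udevadm trigger' (or a reboot) should apply it.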