[ceph-users] cephfs triggers warnings "tar: file changed as we read it"

2016-01-15 Thread HMLTH

Hello,

I'm evaluating cephfs on a cluster of virtual machines. I'm using  
Infernalis (9.2.0) on Debian Jessie as both client and server.


I'm trying to get some performance numbers for operations like  
tar/untar on things like the Linux kernel tree. I have an issue where tar  
displays this warning: 'file changed as we read it'. I reproduced the  
problem with just the Documentation directory.


root@admin:/mnt/cephfs# rm -rf Documentation ; tar xf  
linux-documentation.tar.bz2 ; sync ; tar c Documentation>/dev/null ;

tar: Documentation/parisc: file changed as we read it
tar: Documentation/pcmcia: file changed as we read it
tar: Documentation/phy: file changed as we read it
tar: Documentation/platform: file changed as we read it
tar: Documentation/power: file changed as we read it
tar: Documentation/powerpc: file changed as we read it
tar: Documentation/pps: file changed as we read it
tar: Documentation/prctl: file changed as we read it
tar: Documentation/pti: file changed as we read it
tar: Documentation/ptp: file changed as we read it
tar: Documentation/rapidio: file changed as we read it
tar: Documentation/s390: file changed as we read it
tar: Documentation/scheduler: file changed as we read it
tar: Documentation/scsi: file changed as we read it
tar: Documentation/security: file changed as we read it
tar: Documentation/w1/slaves: file changed as we read it
tar: Documentation/watchdog: file changed as we read it
tar: Documentation/x86: file changed as we read it
tar: Documentation/zh_CN: file changed as we read it
tar: Documentation: file changed as we read it

If I wait between the two commands, the errors are reduced but not eliminated:

root@admin:/mnt/cephfs# rm -rf Documentation ; tar xf  
linux-documentation.tar.bz2 ; sleep 10 ; tar c Documentation>/dev/null ;

tar: Documentation/virtual: file changed as we read it
tar: Documentation/w1: file changed as we read it
tar: Documentation/watchdog: file changed as we read it
tar: Documentation/x86: file changed as we read it
tar: Documentation/zh_CN: file changed as we read it
tar: Documentation: file changed as we read it


root@admin:/mnt/cephfs# rm -rf Documentation ; tar xf  
linux-documentation.tar.bz2 ; sleep 120 ; tar c  
Documentation>/dev/null ;

tar: Documentation: file changed as we read it

I'm sure no other client or process is modifying the files. I have  
this problem both with the fuse client and with the kernel client (the  
version in Jessie).


By doing a "stat", I can see that some metadata is changed:
root@admin:/mnt/cephfs# rm -rf Documentation ; tar xf  
linux-documentation.tar.bz2 ;stat Documentation; tar -c  
Documentation>/dev/null ; stat Documentation

  File: ‘Documentation’
  Size: 14740322  Blocks: 1          IO Block: 4096   directory
Device: 23h/35d Inode: 1099511913288  Links: 1
Access: (0770/drwxrwx---)  Uid: ( 1000/  )   Gid: ( 1000/  )
Access: 2016-01-15 16:51:40.882143334 +
Modify: 2015-05-12 09:34:49.0 +0100
Change: 2016-01-15 16:52:31.745684502 +
 Birth: -
tar: Documentation/scheduler: file changed as we read it
tar: Documentation/scsi: file changed as we read it
tar: Documentation/zh_CN/arm64: file changed as we read it
tar: Documentation/zh_CN/filesystems: file changed as we read it
tar: Documentation/zh_CN/video4linux: file changed as we read it
tar: Documentation: file changed as we read it
  File: ‘Documentation’
  Size: 15088573  Blocks: 1          IO Block: 4096   directory
Device: 23h/35d Inode: 1099511913288  Links: 1
Access: (0770/drwxrwx---)  Uid: ( 1000/  )   Gid: ( 1000/  )
Access: 2016-01-15 16:51:40.882143334 +
Modify: 2015-05-12 09:34:49.0 +0100
Change: 2016-01-15 16:52:31.745684502 +
 Birth: -

I know it's possible to silence this warning with a tar option, but I  
don't want to worry about that in every command of every script, and it  
changes the tar output. And above all, I don't find that very clean.
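
For reference, recent GNU tar can silence this specific warning per
invocation (just a workaround sketch, not a fix for the underlying
behaviour; tar still exits non-zero when it detects a change):

tar --warning=no-file-changed -c Documentation > /dev/null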


Do you know of any setting that guarantees that all pending async writes  
are completed when a client opens a file?


Regards

Thomas HAMEL

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread seapasu...@uchicago.edu

Hello Yehuda,

Here it is::

radosgw-admin object stat --bucket="noaa-nexrad-l2" 
--object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"

{
"name": 
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",

"size": 7147520,
"policy": {
"acl": {
"acl_user_map": [
{
"user": "b05f707271774dbd89674a0736c9406e",
"acl": 15
}
],
"acl_group_map": [
{
"group": 1,
"acl": 1
}
],
"grant_map": [
{
"id": "",
"grant": {
"type": {
"type": 2
},
"id": "",
"email": "",
"permission": {
"flags": 1
},
"name": "",
"group": 1
}
},
{
"id": "b05f707271774dbd89674a0736c9406e",
"grant": {
"type": {
"type": 0
},
"id": "b05f707271774dbd89674a0736c9406e",
"email": "",
"permission": {
"flags": 15
},
"name": "noaa-commons",
"group": 0
}
}
]
},
"owner": {
"id": "b05f707271774dbd89674a0736c9406e",
"display_name": "noaa-commons"
}
},
"etag": "b91b6f1650350965c5434c547b3c38ff-1\u",
"tag": "_cWrvEa914Gy1AeyzIhRlUdp1wJnek3E\u",
"manifest": {
"objs": [],
"obj_size": 7147520,
"explicit_objs": "false",
"head_obj": {
"bucket": {
"name": "noaa-nexrad-l2",
"pool": ".rgw.buckets",
"data_extra_pool": ".rgw.buckets.extra",
"index_pool": ".rgw.buckets.index",
"marker": "default.384153.1",
"bucket_id": "default.384153.1"
},
"key": "",
"ns": "",
"object": 
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",

"instance": ""
},
"head_size": 0,
"max_head_size": 0,
"prefix": 
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",

"tail_bucket": {
"name": "noaa-nexrad-l2",
"pool": ".rgw.buckets",
"data_extra_pool": ".rgw.buckets.extra",
"index_pool": ".rgw.buckets.index",
"marker": "default.384153.1",
"bucket_id": "default.384153.1"
},
"rules": [
{
"key": 0,
"val": {
"start_part_num": 1,
"start_ofs": 0,
"part_size": 0,
"stripe_max_size": 4194304,
"override_prefix": ""
}
}
]
},
"attrs": {}
}

On 1/15/16 11:17 AM, Yehuda Sadeh-Weinraub wrote:

radosgw-admin object stat --bucket=<bucket> --object=<object>


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread Yehuda Sadeh-Weinraub
On Thu, Jan 14, 2016 at 10:51 PM, seapasu...@uchicago.edu
 wrote:
> It looks like the gateway is experiencing a similar race condition to what
> we reported before.
>
> The rados object has a size of 0 bytes but the bucket index shows the object
> listed and the object metadata shows a size of
> 7147520 bytes.
>
> I have a lot of logs but I don't think any of them have the full data from
> the upload of this object.
>
> I thought this bug was fixed back in firefly/giant
>
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg19971.html
>
> --
>
> root@kg34-33:/srv/nfs/griffin_temp# rados -p .rgw.buckets stat
> default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar
> ..rgw.buckets/default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar
> mtime 1446672570, size 0
>
> --
>
> SError: [Errno 2] No such file or directory:
> '/srv/nfs/griffin_tempnoaa-nexrad-l2/2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar'
>
> In [13]: print(key.size)
> 7147520
>
> We are currently using 0.94.5 and the files were uploaded to hammer as well
>
> lacadmin@kh28-10:~$ ceph --version
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> lacadmin@kh28-10:~$ radosgw --version
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
>
>
> The cluster is health_ok and was ok during the upload. I need to confirm
> with the person who uploaded the data but I think they did it with s3cmd.
> Has anyone seen this before? I think I need to file a bug :-(
>

What does 'radosgw-admin object stat --bucket=<bucket> --object=<object>' show?

Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Odd single VM ceph error

2016-01-15 Thread Sage Weil
On Thu, 14 Jan 2016, Robert LeBlanc wrote:
> We have a single VM that is acting odd. We had 7 SSD OSDs (out of 40) go
> down over a period of about 12 hours. These are a cache tier and have size
> 4, min_size 2. I'm not able to make heads or tails of the error and hoped
> someone here could help.
> 
> 2016-01-14 23:09:54.559121 osd.136 [ERR] 13.503 copy from
> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 to
> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 data digest
> 0x92bc163c != source 0x8fe2d0a9
> 
> The PG fully recovered then the error was
> 
> 2016-01-15 00:39:25.321469 osd.12 [ERR] 13.503 copy from
> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 to
> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 data digest
> 0x92bc163c != source 0x8fe2d0a9
> 
> A deep scrub of the PG comes back clean and a hash of the files on all OSDs
> match. The file system on this vm keeps going read only.
> 
> The osd file system is EXT4 and this is 0.94.5.

You're using cache tiering I take it?  I think the error is in the base 
tier, while the PG mentioned is the cache tier.
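
A sketch of one way to check that (the base pool name and the full object
name are placeholders here): 'ceph osd map' reports which PG an object maps
to in a given pool, so you can find the matching base-tier PG and deep-scrub it:

ceph osd map <base-pool> rbd_data.48a6325f5e3f87.<suffix>
ceph pg deep-scrub <base-tier-pgid>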

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Infernalis upgrade breaks when journal on separate partition

2016-01-15 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

If you are not booting from the GPT disk, you don't need the EFI
partition (or any special boot partition). The required backup GPT
(secondary partition table) is usually put at the end of the disk, where
there is usually some free space anyway. It has been a long time since
I've converted from MBR to GPT, but it didn't require any resizing that
I remember. I'd test it in a VM or similar to make sure you understand
the process. You will also have to manually add the Ceph journal
partition type GUID to the partition after the conversion for it to all
work automatically.
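
A sketch of what that last step looks like with sgdisk (partition number
and device are placeholders; the GUID is the well-known Ceph journal
partition type code):

sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
partprobe /dev/sdb   # or reboot, so udev re-evaluates the partition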
- 
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Jan 14, 2016 at 9:00 PM, Stuart Longland  wrote:
> On 12/01/16 01:22, Stillwell, Bryan wrote:
>>> Well, it seems I spoke too soon.  Not sure what logic the udev rules use
>>> to identify ceph journals, but it doesn't seem to pick up on the
>>> journals in our case as after a reboot, those partitions are owned by
>>> root:disk with permissions 0660.
>
>> This is handled by the UUIDs of the GPT partitions, and since you're using
>> MS-DOS partition tables it won't work correctly.  I would recommend 
>> switching to
>> GPT partition tables if you can.
>
> I'm not comfortable with switching from MS-DOS to GPT disklabels on a
> running production server, and nothing in the Ceph docs at the time
> mentioned that GPT was a requirement on the journal disks.
>
> Switching to GPT isn't toggling an option, it'd require resizing
> partitions (some of these are xfs; can't be shrunk to my knowledge) and
> moving stuff around to allow for an additional EFI boot partition,
> possibly changes to boot firmware settings and bootloaders too.
>
> I'll look into a udev rule for our particular case and see how I go.
> --
>  _ ___ Stuart Longland - Systems Engineer
> \  /|_) |   T: +61 7 3535 9619
>  \/ | \ | 38b Douglas StreetF: +61 7 3535 9699
>SYSTEMSMilton QLD 4064   http://www.vrt.com.au
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-BEGIN PGP SIGNATURE-
Version: Mailvelope v1.3.3
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJWmUrVCRDmVDuy+mK58QAA4VkQAIttGy8fn8zqrSNGnykb
A/FI2WjObejmDREjRc3DVFqrBOL7faSyTvf636N/y0TYF1J5VLJu1Zf4C/2R
aCxm4lDzxvRWnWW935GFtZ9YCwzA9KyR10DJJf1TaQW7nq1a+UYPTiTX9kk6
fnA9X7AICpLDrxNX+/2sYfZAASDBQjslW4qFIJ5CW5JhRVi8CggnUF5wPvf1
R3r7u/tlGJEU3pktTnSix8mzBjJSKpOiHFNikkrj/Md6+pNVgmAfpvA7cj3R
gkazGxL5TiqdwQQ0OVusv18VrL9Vir4tyA0BOam95DEQ8QaZ4PodRQsj5RUd
KD+MjpIbj1OtRriVrTUrlTtLuMq26g8yVlR6HttwFIMkANUu0kso1p2C3NzF
1ETjx+JpcX+Zn/o3gx4AgYC/YJ97y4LRIdTfJR/3P1UnZnkGsyOadFqi6qTC
l6mgzhjURxLUTu7XOnJn28aJ0Ql+gH1HuFUojskiahppE+B5K9KMAVZjbDAN
j7aOspD9oZydu20XkwDD6jnKRmC82tYk58A5PMdYfN9tbgItmlwg51NGV0Tk
n8EoW6mT8zfK3aOqBYSDsmymyL8b6vB/uK05pJz+w9ut1SozaOWmKRPiGifN
JvOeROVCpoF8FoW6J0GJTbRMYIhvxew4VMe3hjxy2OyCXxkay4bINY2kPllD
mOnH
=8a8l
-END PGP SIGNATURE-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread Yehuda Sadeh-Weinraub
On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu
 wrote:
> Hello Yehuda,
>
> Here it is::
>
> radosgw-admin object stat --bucket="noaa-nexrad-l2"
> --object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
> {
> "name":
> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
> "size": 7147520,
> "policy": {
> "acl": {
> "acl_user_map": [
> {
> "user": "b05f707271774dbd89674a0736c9406e",
> "acl": 15
> }
> ],
> "acl_group_map": [
> {
> "group": 1,
> "acl": 1
> }
> ],
> "grant_map": [
> {
> "id": "",
> "grant": {
> "type": {
> "type": 2
> },
> "id": "",
> "email": "",
> "permission": {
> "flags": 1
> },
> "name": "",
> "group": 1
> }
> },
> {
> "id": "b05f707271774dbd89674a0736c9406e",
> "grant": {
> "type": {
> "type": 0
> },
> "id": "b05f707271774dbd89674a0736c9406e",
> "email": "",
> "permission": {
> "flags": 15
> },
> "name": "noaa-commons",
> "group": 0
> }
> }
> ]
> },
> "owner": {
> "id": "b05f707271774dbd89674a0736c9406e",
> "display_name": "noaa-commons"
> }
> },
> "etag": "b91b6f1650350965c5434c547b3c38ff-1\u",
> "tag": "_cWrvEa914Gy1AeyzIhRlUdp1wJnek3E\u",
> "manifest": {
> "objs": [],
> "obj_size": 7147520,
> "explicit_objs": "false",
> "head_obj": {
> "bucket": {
> "name": "noaa-nexrad-l2",
> "pool": ".rgw.buckets",
> "data_extra_pool": ".rgw.buckets.extra",
> "index_pool": ".rgw.buckets.index",
> "marker": "default.384153.1",
> "bucket_id": "default.384153.1"
> },
> "key": "",
> "ns": "",
> "object":
> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
> "instance": ""
> },
> "head_size": 0,
> "max_head_size": 0,
> "prefix":
> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",

Try running:
$ rados -p .rgw.buckets ls | grep pcu5Hz6

Yehuda


> "tail_bucket": {
> "name": "noaa-nexrad-l2",
> "pool": ".rgw.buckets",
> "data_extra_pool": ".rgw.buckets.extra",
> "index_pool": ".rgw.buckets.index",
> "marker": "default.384153.1",
> "bucket_id": "default.384153.1"
> },
> "rules": [
> {
> "key": 0,
> "val": {
> "start_part_num": 1,
> "start_ofs": 0,
> "part_size": 0,
> "stripe_max_size": 4194304,
> "override_prefix": ""
> }
> }
> ]
> },
> "attrs": {}
>
> }
>
> On 1/15/16 11:17 AM, Yehuda Sadeh-Weinraub wrote:
>>
>> radosgw-admin object stat --bucket=<bucket> --object=<object>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Odd single VM ceph error

2016-01-15 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Exactly the problem. Thanks for the point in the right direction.
- 
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Fri, Jan 15, 2016 at 10:28 AM, Sage Weil  wrote:
> On Thu, 14 Jan 2016, Robert LeBlanc wrote:
>> We have a single VM that is acting odd. We had 7 SSD OSDs (out of 40) go
>> down over a period of about 12 hours. These are a cache tier and have size
>> 4, min_size 2. I'm not able to make heads or tails of the error and hoped
>> someone here could help.
>>
>> 2016-01-14 23:09:54.559121 osd.136 [ERR] 13.503 copy from
>> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 to
>> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 data digest
>> 0x92bc163c != source 0x8fe2d0a9
>>
>> The PG fully recovered then the error was
>>
>> 2016-01-15 00:39:25.321469 osd.12 [ERR] 13.503 copy from
>> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 to
>> f8bedd03/rbd_data.48a6325f5e3f87.683d/head//13 data digest
>> 0x92bc163c != source 0x8fe2d0a9
>>
>> A deep scrub of the PG comes back clean and a hash of the files on all OSDs
>> match. The file system on this vm keeps going read only.
>>
>> The osd file system is EXT4 and this is 0.94.5.
>
> You're using cache tiering I take it?  I think the error is in the base
> tier, while the PG mentioned is the cache tier.
>
> sage

-BEGIN PGP SIGNATURE-
Version: Mailvelope v1.3.3
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJWmUlxCRDmVDuy+mK58QAAUVIP/0B7/0eMpV1lcEhXXLCE
aNNBNSMwXevD7uRRVoqoUnjgqZExttMPNZhFij0MF5ztvJ88lycfb+1hClHk
Z8agN+APhCVKFpbKm0bGf47EPs+BLYlTl8r7fmA2uIWcG0XO4+eZtjngsJON
zZiM2ZE9URJ6GSvquCAADS1XDumAkve0CaJV0c6MOgj4nP1ktDiX0CzjdMF0
fpPc1hUZGAnCEevGGGEQmVcqXDZnUK3OsY2WCZiMCAXBiTQmNMVvJehM1rEg
Ss/kC+VIyqzevPN+/r0STUve7UrkVXII19la5fBItSQt8btSjpEkzFFU1WmI
8ehX5KJwyKGH4NIrPCqjg+TGuUlTkkDO9LcEoj19x9I+Tzuyu/bSayo8FBvT
9SYkW7TzfF0JL+ed4fYCv7kK2OJBPR6uQZ8ABcLRXSbmtZa9oVKgHgTwvbgD
jT+smWg9JB7bGrx7PkRYDQVpgr62Nbau/CFoUP1FWI8bVhStM3LQQQot8x7X
pN9vFwAOwn5yvWLYi+7qmYYLyoOzkg8Ib7d1E3QfOzS86O3mCp14u++x1QBc
5YthsSxfvfs+fNmMKnCrz5YGGcCSntQgJRvOlQl8Oyp29xHVYTP1nqZSXhVi
jC94GBmnX+3Z26JmnybVfZnELDbk9wsp2R0tN1Ai4JxpY0PUq/672PuAZDaO
McAz
=G1Lk
-END PGP SIGNATURE-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread seapasu...@uchicago.edu

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
lacadmin@kh28-10:~$

Nothing was found. That said when I run the command with another prefix 
snippet::

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'wksHvto'
default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1_1
default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1



On 1/15/16 12:05 PM, Yehuda Sadeh-Weinraub wrote:

On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu
 wrote:

Hello Yehuda,

Here it is::

radosgw-admin object stat --bucket="noaa-nexrad-l2"
--object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
{
 "name":
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
 "size": 7147520,
 "policy": {
 "acl": {
 "acl_user_map": [
 {
 "user": "b05f707271774dbd89674a0736c9406e",
 "acl": 15
 }
 ],
 "acl_group_map": [
 {
 "group": 1,
 "acl": 1
 }
 ],
 "grant_map": [
 {
 "id": "",
 "grant": {
 "type": {
 "type": 2
 },
 "id": "",
 "email": "",
 "permission": {
 "flags": 1
 },
 "name": "",
 "group": 1
 }
 },
 {
 "id": "b05f707271774dbd89674a0736c9406e",
 "grant": {
 "type": {
 "type": 0
 },
 "id": "b05f707271774dbd89674a0736c9406e",
 "email": "",
 "permission": {
 "flags": 15
 },
 "name": "noaa-commons",
 "group": 0
 }
 }
 ]
 },
 "owner": {
 "id": "b05f707271774dbd89674a0736c9406e",
 "display_name": "noaa-commons"
 }
 },
 "etag": "b91b6f1650350965c5434c547b3c38ff-1\u",
 "tag": "_cWrvEa914Gy1AeyzIhRlUdp1wJnek3E\u",
 "manifest": {
 "objs": [],
 "obj_size": 7147520,
 "explicit_objs": "false",
 "head_obj": {
 "bucket": {
 "name": "noaa-nexrad-l2",
 "pool": ".rgw.buckets",
 "data_extra_pool": ".rgw.buckets.extra",
 "index_pool": ".rgw.buckets.index",
 "marker": "default.384153.1",
 "bucket_id": "default.384153.1"
 },
 "key": "",
 "ns": "",
 "object":
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
 "instance": ""
 },
 "head_size": 0,
 "max_head_size": 0,
 "prefix":
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",

Try running:
$ rados -p .rgw.buckets ls | grep pcu5Hz6

Yehuda



 "tail_bucket": {
 "name": "noaa-nexrad-l2",
 "pool": ".rgw.buckets",
 "data_extra_pool": ".rgw.buckets.extra",
 "index_pool": ".rgw.buckets.index",
 "marker": "default.384153.1",
 "bucket_id": "default.384153.1"
 },
 "rules": [
 {
 "key": 0,
 "val": {
 "start_part_num": 1,
 "start_ofs": 0,
 "part_size": 0,
 "stripe_max_size": 4194304,
 "override_prefix": ""
 }
 }
 ]
 },
 "attrs": {}

}

On 1/15/16 11:17 AM, Yehuda Sadeh-Weinraub wrote:

radosgw-admin object stat --bucket=<bucket> --object=<object>




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread Yehuda Sadeh-Weinraub
That's interesting, and might point at the underlying issue that
caused it. Could be a racing upload that somehow ended up with the
wrong object head. The 'multipart' object should be 4M in size, and
the 'shadow' one should have the remainder of the data. You can run
'rados stat -p .rgw.buckets <obj>' to validate that. If that's the
case, you can copy these to the expected object names:

$ src_uploadid=wksHvto9gRgHUJbhm_TZPXJTZUPXLT2
$ dest_uploadid=pcu5Hz6foFXjlSxBat22D8YMcHlQOBD

$ rados -p .rgw.buckets cp
default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_uploadid}.1
default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_uploadid}.1

$ rados -p .rgw.buckets cp
default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_uploadid}.1_1
default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_uploadid}.1_1

Yehuda


On Fri, Jan 15, 2016 at 1:02 PM, seapasu...@uchicago.edu
 wrote:
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
> lacadmin@kh28-10:~$
>
> Nothing was found. That said when I run the command with another prefix
> snippet::
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'wksHvto'
> default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1_1
> default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1
>
>
>
>
> On 1/15/16 12:05 PM, Yehuda Sadeh-Weinraub wrote:
>>
>> On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu
>>  wrote:
>>>
>>> Hello Yehuda,
>>>
>>> Here it is::
>>>
>>> radosgw-admin object stat --bucket="noaa-nexrad-l2"
>>>
>>> --object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
>>> {
>>>  "name":
>>>
>>> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
>>>  "size": 7147520,
>>>  "policy": {
>>>  "acl": {
>>>  "acl_user_map": [
>>>  {
>>>  "user": "b05f707271774dbd89674a0736c9406e",
>>>  "acl": 15
>>>  }
>>>  ],
>>>  "acl_group_map": [
>>>  {
>>>  "group": 1,
>>>  "acl": 1
>>>  }
>>>  ],
>>>  "grant_map": [
>>>  {
>>>  "id": "",
>>>  "grant": {
>>>  "type": {
>>>  "type": 2
>>>  },
>>>  "id": "",
>>>  "email": "",
>>>  "permission": {
>>>  "flags": 1
>>>  },
>>>  "name": "",
>>>  "group": 1
>>>  }
>>>  },
>>>  {
>>>  "id": "b05f707271774dbd89674a0736c9406e",
>>>  "grant": {
>>>  "type": {
>>>  "type": 0
>>>  },
>>>  "id": "b05f707271774dbd89674a0736c9406e",
>>>  "email": "",
>>>  "permission": {
>>>  "flags": 15
>>>  },
>>>  "name": "noaa-commons",
>>>  "group": 0
>>>  }
>>>  }
>>>  ]
>>>  },
>>>  "owner": {
>>>  "id": "b05f707271774dbd89674a0736c9406e",
>>>  "display_name": "noaa-commons"
>>>  }
>>>  },
>>>  "etag": "b91b6f1650350965c5434c547b3c38ff-1\u",
>>>  "tag": "_cWrvEa914Gy1AeyzIhRlUdp1wJnek3E\u",
>>>  "manifest": {
>>>  "objs": [],
>>>  "obj_size": 7147520,
>>>  "explicit_objs": "false",
>>>  "head_obj": {
>>>  "bucket": {
>>>  "name": "noaa-nexrad-l2",
>>>  "pool": ".rgw.buckets",
>>>  "data_extra_pool": ".rgw.buckets.extra",
>>>  "index_pool": ".rgw.buckets.index",
>>>  "marker": "default.384153.1",
>>>  "bucket_id": "default.384153.1"
>>>  },
>>>  "key": "",
>>>  "ns": "",
>>>  "object":
>>>
>>> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
>>>  "instance": ""
>>>  },
>>>  "head_size": 0,
>>>  "max_head_size": 0,
>>>  "prefix":
>>>
>>> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",
>>
>> 

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread seapasu...@uchicago.edu
Sorry I am a bit confused. The successful list that I provided is from a 
different object of the same size to show that I could indeed get a 
list. Are you saying to copy the working object to the missing object? 
Sorry for the confusion.


On 1/15/16 3:20 PM, Yehuda Sadeh-Weinraub wrote:

That's interesting, and might point at the underlying issue that
caused it. Could be a racing upload that somehow ended up with the
wrong object head. The 'multipart' object should be 4M in size, and
the 'shadow' one should have the remainder of the data. You can run
'rados stat -p .rgw.buckets <obj>' to validate that. If that's the
case, you can copy these to the expected object names:

$ src_uploadid=wksHvto9gRgHUJbhm_TZPXJTZUPXLT2
$ dest_uploadid=pcu5Hz6foFXjlSxBat22D8YMcHlQOBD

$ rados -p .rgw.buckets cp
default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_uploadid}.1
default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_uploadid}.1

$ rados -p .rgw.buckets cp
default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_upload_id}.1_1
default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_upload_id}.1_1

Yehuda


On Fri, Jan 15, 2016 at 1:02 PM, seapasu...@uchicago.edu
 wrote:

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
lacadmin@kh28-10:~$

Nothing was found. That said when I run the command with another prefix
snippet::
lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'wksHvto'
default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1_1
default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1




On 1/15/16 12:05 PM, Yehuda Sadeh-Weinraub wrote:

On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu
 wrote:

Hello Yehuda,

Here it is::

radosgw-admin object stat --bucket="noaa-nexrad-l2"

--object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
{
  "name":

"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
  "size": 7147520,
  "policy": {
  "acl": {
  "acl_user_map": [
  {
  "user": "b05f707271774dbd89674a0736c9406e",
  "acl": 15
  }
  ],
  "acl_group_map": [
  {
  "group": 1,
  "acl": 1
  }
  ],
  "grant_map": [
  {
  "id": "",
  "grant": {
  "type": {
  "type": 2
  },
  "id": "",
  "email": "",
  "permission": {
  "flags": 1
  },
  "name": "",
  "group": 1
  }
  },
  {
  "id": "b05f707271774dbd89674a0736c9406e",
  "grant": {
  "type": {
  "type": 0
  },
  "id": "b05f707271774dbd89674a0736c9406e",
  "email": "",
  "permission": {
  "flags": 15
  },
  "name": "noaa-commons",
  "group": 0
  }
  }
  ]
  },
  "owner": {
  "id": "b05f707271774dbd89674a0736c9406e",
  "display_name": "noaa-commons"
  }
  },
  "etag": "b91b6f1650350965c5434c547b3c38ff-1\u",
  "tag": "_cWrvEa914Gy1AeyzIhRlUdp1wJnek3E\u",
  "manifest": {
  "objs": [],
  "obj_size": 7147520,
  "explicit_objs": "false",
  "head_obj": {
  "bucket": {
  "name": "noaa-nexrad-l2",
  "pool": ".rgw.buckets",
  "data_extra_pool": ".rgw.buckets.extra",
  "index_pool": ".rgw.buckets.index",
  "marker": "default.384153.1",
  "bucket_id": "default.384153.1"
  },
  "key": "",
  "ns": "",
  "object":

"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
  "instance": ""
  },
  "head_size": 0,
  "max_head_size": 0,
  "prefix":

"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD",

Try running:
$ rados -p .rgw.buckets ls | grep pcu5Hz6

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread Yehuda Sadeh-Weinraub
Ah, I see. Misread that and the object names were very similar. No,
don't copy it. You can try to grep for the specific object name and
see if there are pieces of it lying around under a different upload
id.

Yehuda

On Fri, Jan 15, 2016 at 1:44 PM, seapasu...@uchicago.edu
 wrote:
> Sorry I am a bit confused. The successful list that I provided is from a
> different object of the same size to show that I could indeed get a list.
> Are you saying to copy the working object to the missing object? Sorry for
> the confusion.
>
>
> On 1/15/16 3:20 PM, Yehuda Sadeh-Weinraub wrote:
>>
>> That's interesting, and might point at the underlying issue that
>> caused it. Could be a racing upload that somehow ended up with the
>> wrong object head. The 'multipart' object should be 4M in size, and
>> the 'shadow' one should have the remainder of the data. You can run
>> 'rados stat -p .rgw.buckets <obj>' to validate that. If that's the
>> case, you can copy these to the expected object names:
>>
>> $ src_uploadid=wksHvto9gRgHUJbhm_TZPXJTZUPXLT2
>> $ dest_uploadid=pcu5Hz6foFXjlSxBat22D8YMcHlQOBD
>>
>> $ rados -p .rgw.buckets cp
>>
>> default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_uploadid}.1
>>
>> default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_uploadid}.1
>>
>> $ rados -p .rgw.buckets cp
>>
>> default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_upload_id}.1_1
>>
>> default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_upload_id}.1_1
>>
>> Yehuda
>>
>>
>> On Fri, Jan 15, 2016 at 1:02 PM, seapasu...@uchicago.edu
>>  wrote:
>>>
>>> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
>>> lacadmin@kh28-10:~$
>>>
>>> Nothing was found. That said when I run the command with another prefix
>>> snippet::
>>> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'wksHvto'
>>>
>>> default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1_1
>>>
>>> default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1
>>>
>>>
>>>
>>>
>>> On 1/15/16 12:05 PM, Yehuda Sadeh-Weinraub wrote:

 On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu
  wrote:
>
> Hello Yehuda,
>
> Here it is::
>
> radosgw-admin object stat --bucket="noaa-nexrad-l2"
>
>
> --object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
> {
>   "name":
>
>
> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
>   "size": 7147520,
>   "policy": {
>   "acl": {
>   "acl_user_map": [
>   {
>   "user": "b05f707271774dbd89674a0736c9406e",
>   "acl": 15
>   }
>   ],
>   "acl_group_map": [
>   {
>   "group": 1,
>   "acl": 1
>   }
>   ],
>   "grant_map": [
>   {
>   "id": "",
>   "grant": {
>   "type": {
>   "type": 2
>   },
>   "id": "",
>   "email": "",
>   "permission": {
>   "flags": 1
>   },
>   "name": "",
>   "group": 1
>   }
>   },
>   {
>   "id": "b05f707271774dbd89674a0736c9406e",
>   "grant": {
>   "type": {
>   "type": 0
>   },
>   "id": "b05f707271774dbd89674a0736c9406e",
>   "email": "",
>   "permission": {
>   "flags": 15
>   },
>   "name": "noaa-commons",
>   "group": 0
>   }
>   }
>   ]
>   },
>   "owner": {
>   "id": "b05f707271774dbd89674a0736c9406e",
>   "display_name": "noaa-commons"
>   }
>   },
>   "etag": "b91b6f1650350965c5434c547b3c38ff-1\u",
>   "tag": "_cWrvEa914Gy1AeyzIhRlUdp1wJnek3E\u",
>   

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread seapasu...@uchicago.edu

Sorry for the confusion::

When I grepped for the prefix of the missing object::
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD"

I am not able to find any chunks of the object::

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
lacadmin@kh28-10:~$

The only piece of the object that I can seem to find is the original one 
I posted::
lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 
'NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959'

default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar

And when we stat this object it is 0 bytes, as shown earlier::
lacadmin@kh28-10:~$ rados -p .rgw.buckets stat 
'default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar'
.rgw.buckets/default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar 
mtime 2015-11-04 15:29:30.00, size 0


Sorry again for the confusion.


On 1/15/16 3:58 PM, Yehuda Sadeh-Weinraub wrote:

Ah, I see. Misread that and the object names were very similar. No,
don't copy it. You can try to grep for the specific object name and
see if there are pieces of it lying around under a different upload
id.

Yehuda

On Fri, Jan 15, 2016 at 1:44 PM, seapasu...@uchicago.edu
 wrote:

Sorry I am a bit confused. The successful list that I provided is from a
different object of the same size to show that I could indeed get a list.
Are you saying to copy the working object to the missing object? Sorry for
the confusion.


On 1/15/16 3:20 PM, Yehuda Sadeh-Weinraub wrote:

That's interesting, and might point at the underlying issue that
caused it. Could be a racing upload that somehow ended up with the
wrong object head. The 'multipart' object should be 4M in size, and
the 'shadow' one should have the remainder of the data. You can run
'rados stat -p .rgw.buckets <obj>' to validate that. If that's the
case, you can copy these to the expected object names:

$ src_uploadid=wksHvto9gRgHUJbhm_TZPXJTZUPXLT2
$ dest_uploadid=pcu5Hz6foFXjlSxBat22D8YMcHlQOBD

$ rados -p .rgw.buckets cp

default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_uploadid}.1

default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_uploadid}.1

$ rados -p .rgw.buckets cp

default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_upload_id}.1_1

default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_upload_id}.1_1

Yehuda


On Fri, Jan 15, 2016 at 1:02 PM, seapasu...@uchicago.edu
 wrote:

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
lacadmin@kh28-10:~$

Nothing was found. That said when I run the command with another prefix
snippet::
lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'wksHvto'

default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1_1

default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1




On 1/15/16 12:05 PM, Yehuda Sadeh-Weinraub wrote:

On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu
 wrote:

Hello Yehuda,

Here it is::

radosgw-admin object stat --bucket="noaa-nexrad-l2"


--object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
{
   "name":


"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
   "size": 7147520,
   "policy": {
   "acl": {
   "acl_user_map": [
   {
   "user": "b05f707271774dbd89674a0736c9406e",
   "acl": 15
   }
   ],
   "acl_group_map": [
   {
   "group": 1,
   "acl": 1
   }
   ],
   "grant_map": [
   {
   "id": "",
   "grant": {
   "type": {
   "type": 2
   },
   "id": "",
   "email": "",
   "permission": {
   "flags": 1
   },
   "name": "",
   "group": 1
   }
   },
   {
   "id": "b05f707271774dbd89674a0736c9406e",
   "grant": {
   "type": {
   "type": 0
   },
   "id": "b05f707271774dbd89674a0736c9406e",
   "email": "",
  

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread Yehuda Sadeh-Weinraub
The head object of a multipart object has 0 size, so it's expected.
What's missing is the tail of the object. I don't assume you have any
logs from when the object was uploaded?

Yehuda

On Fri, Jan 15, 2016 at 2:12 PM, seapasu...@uchicago.edu
 wrote:
> Sorry for the confusion::
>
> When I grepped for the prefix of the missing object::
> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD"
>
> I am not able to find any chunks of the object::
>
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
> lacadmin@kh28-10:~$
>
> The only piece of the object that I can seem to find is the original one I
> posted::
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep
> 'NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959'
> default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar
>
> And when we stat this object is is 0 bytes as shown earlier::
> lacadmin@kh28-10:~$ rados -p .rgw.buckets stat
> 'default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar'
> .rgw.buckets/default.384153.1_2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar
> mtime 2015-11-04 15:29:30.00, size 0
>
> Sorry again for the confusion.
>
>
>
> On 1/15/16 3:58 PM, Yehuda Sadeh-Weinraub wrote:
>>
>> Ah, I see. Misread that and the object names were very similar. No,
>> don't copy it. You can try to grep for the specific object name and
>> see if there are pieces of it lying around under a different upload
>> id.
>>
>> Yehuda
>>
>> On Fri, Jan 15, 2016 at 1:44 PM, seapasu...@uchicago.edu
>>  wrote:
>>>
>>> Sorry I am a bit confused. The successful list that I provided is from a
>>> different object of the same size to show that I could indeed get a list.
>>> Are you saying to copy the working object to the missing object? Sorry
>>> for
>>> the confusion.
>>>
>>>
>>> On 1/15/16 3:20 PM, Yehuda Sadeh-Weinraub wrote:

 That's interesting, and might point at the underlying issue that
 caused it. Could be a racing upload that somehow ended up with the
 wrong object head. The 'multipart' object should be 4M in size, and
 the 'shadow' one should have the remainder of the data. You can run
 'rados stat -p .rgw.buckets <obj>' to validate that. If that's the
 case, you can copy these to the expected object names:

 $ src_uploadid=wksHvto9gRgHUJbhm_TZPXJTZUPXLT2
 $ dest_uploadid=pcu5Hz6foFXjlSxBat22D8YMcHlQOBD

 $ rados -p .rgw.buckets cp


 default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_uploadid}.1


 default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_uploadid}.1

 $ rados -p .rgw.buckets cp


 default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${src_upload_id}.1_1


 default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~${dest_upload_id}.1_1

 Yehuda


 On Fri, Jan 15, 2016 at 1:02 PM, seapasu...@uchicago.edu
  wrote:
>
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
> lacadmin@kh28-10:~$
>
> Nothing was found. That said when I run the command with another prefix
> snippet::
> lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'wksHvto'
>
>
> default.384153.1__shadow_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1_1
>
>
> default.384153.1__multipart_2015/01/01/KABR/NWS_NEXRAD_NXL2DP_KABR_2015010113_20150101135959.tar.2~wksHvto9gRgHUJbhm_TZPXJTZUPXLT2.1
>
>
>
>
> On 1/15/16 12:05 PM, Yehuda Sadeh-Weinraub wrote:
>>
>> On Fri, Jan 15, 2016 at 9:36 AM, seapasu...@uchicago.edu
>>  wrote:
>>>
>>> Hello Yehuda,
>>>
>>> Here it is::
>>>
>>> radosgw-admin object stat --bucket="noaa-nexrad-l2"
>>>
>>>
>>>
>>> --object="2015/01/01/PAKC/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar"
>>> {
>>>"name":
>>>
>>>
>>>
>>> "2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",
>>>"size": 7147520,
>>>"policy": {
>>>"acl": {
>>>"acl_user_map": [
>>>{
>>>"user": "b05f707271774dbd89674a0736c9406e",
>>>"acl": 15
>>>}
>>>],
>>>"acl_group_map": [
>>>{
>>>"group": 1,
>>>"acl": 1
>>>}
>>>],
>>>

Re: [ceph-users] Inconsistent PG / Impossible deep-scrub

2016-01-15 Thread Jérôme Poulin
Finally, after hitting corruption with the MDS I had no choice but to try to
manually repair the PG.

Following the procedure in the Ceph blog post at
http://ceph.com/planet/ceph-manually-repair-object/ I was able to get the
PG back to active+clean; ceph pg repair still wasn't working, and an
automatic deep-scrub had confirmed the object was still faulty.
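
For anyone hitting this later, that procedure boils down to roughly the
following (a sketch only; the OSD id, PG id and object name are
placeholders, and the service commands depend on your init system):

ceph health detail                 # identify the inconsistent PG and its acting OSDs
stop ceph-osd id=12                # stop the OSD holding the bad copy
ceph-osd -i 12 --flush-journal     # flush its journal before touching the store
mv /var/lib/ceph/osd/ceph-12/current/<pgid>_head/<bad-object> ~/
start ceph-osd id=12
ceph pg repair <pgid>              # repair then restores the object from a good replica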

On Fri, Dec 18, 2015 at 10:42 AM, Jérôme Poulin 
wrote:

> Good day everyone,
>
> I currently manage a Ceph cluster running Firefly 0.80.10. We had some
> maintenance which involved stopping OSDs and starting them back again. This
> caused one of the hard drives to notice it had a bad sector, and Ceph then
> marked the PG as inconsistent.
>
> After repairing the physical issue, I went and tried ceph pg repair: no
> action. Then I tried ceph pg deep-scrub, still no action.
>
> I verified the log of each OSD which had the PG and confirmed that nothing
> was logged, no repair, no deep-scrub. After trying to deep-scrub other PGs
> manually, I confirmed that my requests were being completely ignored.
>
> The only flag set is noout since this cluster is too small, but automatic
> deep-scrubs are working and are logged both in ceph.log and the OSD log.
>
> I tried restarting the monitor in charge to elect a new one and restart
> each affected OSD for the inconsistent PG with no success.
>
> I also tried to fix the defective object myself in case it was hanging
> something, now the object has the same checksum on each OSD.
>
> Is there a way to ask the OSD directly to deep-scrub without using the
> monitor? Is there a known issue about commands getting ignored?
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-15 Thread seapasu...@uchicago.edu
I have looked all over and I do not see any explicit mention of 
"NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959" in the logs, nor 
do I see a timestamp from November 4th, although I do see log rotations 
dating back to October 15th. I don't think it's possible that it wasn't 
logged, so I am going through the bucket logs from the 'radosgw-admin log 
show --object' side, and I found the following::


4604932 {
4604933 "bucket": "noaa-nexrad-l2",
4604934 "time": "2015-11-04 21:29:27.346509Z",
4604935 "time_local": "2015-11-04 15:29:27.346509",
4604936 "remote_addr": "",
4604937 "object_owner": "b05f707271774dbd89674a0736c9406e",
4604938 "user": "b05f707271774dbd89674a0736c9406e",
4604939 "operation": "PUT",
4604940 "uri": 
"\/noaa-nexrad-l2\/2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar",

4604941 "http_status": "200",
4604942 "error_code": "",
4604943 "bytes_sent": 19,
4604944 "bytes_received": 0,
4604945 "object_size": 0,
4604946 "total_time": 142640400,
4604947 "user_agent": "Boto\/2.38.0 Python\/2.7.7 
Linux\/2.6.32-573.7.1.el6.x86_64",

4604948 "referrer": ""
4604949 }

Does this help at all? The total time seems exceptionally high. Would it 
be possible that there is a timeout issue where the put request started 
a multipart upload with the correct header and then timed out, but the 
radosgw took the data anyway?


I am surprised the radosgw returned a 200 let alone placed the key in 
the bucket listing.



That said here is another object (different object) that 404s:
1650873 {
1650874 "bucket": "noaa-nexrad-l2",
1650875 "time": "2015-11-05 04:50:42.606838Z",
1650876 "time_local": "2015-11-04 22:50:42.606838",
1650877 "remote_addr": "",
1650878 "object_owner": "b05f707271774dbd89674a0736c9406e",
1650879 "user": "b05f707271774dbd89674a0736c9406e",
1650880 "operation": "PUT",
1650881 "uri": 
"\/noaa-nexrad-l2\/2015\/02\/25\/KVBX\/NWS_NEXRAD_NXL2DP_KVBX_2015022516_20150225165959.tar",

1650882 "http_status": "200",
1650883 "error_code": "",
1650884 "bytes_sent": 19,
1650885 "bytes_received": 0,
1650886 "object_size": 0,
1650887 "total_time": 0,
1650888 "user_agent": "Boto\/2.38.0 Python\/2.7.7 
Linux\/2.6.32-573.7.1.el6.x86_64",

1650889 "referrer": ""
1650890 }

And this one fails with a 404 as well. Does this help at all? Here is a 
successful object (different object) log entry as well just in case::


17462367 {
17462368 "bucket": "noaa-nexrad-l2",
17462369 "time": "2015-11-04 21:16:44.148603Z",
17462370 "time_local": "2015-11-04 15:16:44.148603",
17462371 "remote_addr": "",
17462372 "object_owner": "b05f707271774dbd89674a0736c9406e",
17462373 "user": "b05f707271774dbd89674a0736c9406e",
17462374 "operation": "PUT",
17462375 "uri": 
"\/noaa-nexrad-l2\/2015\/01\/01\/KAKQ\/NWS_NEXRAD_NXL2DP_KAKQ_2015010108_20150101085959.tar",

17462376 "http_status": "200",
17462377 "error_code": "",
17462378 "bytes_sent": 19,
17462379 "bytes_received": 0,
17462380 "object_size": 0,
17462381 "total_time": 0,
17462382 "user_agent": "Boto\/2.38.0 Python\/2.7.7 
Linux\/2.6.32-573.7.1.el6.x86_64",

17462383 "referrer": ""
17462384 }

So I am guessing these are not pertinent, as they look nearly identical. 
Unfortunately I do not have any client.radosgw logs to show for the 
failed files, for some reason. Is there anything else I can do to 
troubleshoot this issue? In the end the radosgw should never have listed 
these files, as they never completed successfully, right?
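
In case it helps anyone reproduce the check, this is roughly how the broken
keys show up from the client side with boto (a sketch only, not the exact
script used here; the endpoint, credentials and prefix are placeholders):

from boto.exception import S3ResponseError
from boto.s3.connection import S3Connection, OrdinaryCallingFormat

conn = S3Connection(aws_access_key_id='KEY', aws_secret_access_key='SECRET',
                    host='rgw.example.com', is_secure=False,
                    calling_format=OrdinaryCallingFormat())
bucket = conn.get_bucket('noaa-nexrad-l2')

# Keys that appear in the listing but whose data cannot actually be fetched
for key in bucket.list(prefix='2015/01/01/PAKC/'):
    try:
        key.get_contents_to_filename('/dev/null')  # force a full GET of the tail
    except S3ResponseError as e:
        if e.status == 404:
            print('listed but unreadable: %s (listed size %d)' % (key.name, key.size))

# Multipart uploads that were started but never completed or aborted
for up in bucket.list_multipart_uploads():
    print('incomplete upload: %s id=%s' % (up.key_name, up.id))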






On 1/15/16 4:36 PM, Yehuda Sadeh-Weinraub wrote:

The head object of a multipart object has 0 size, so it's expected.
What's missing is the tail of the object. I don't assume you have any
logs from when the object was uploaded?

Yehuda

On Fri, Jan 15, 2016 at 2:12 PM, seapasu...@uchicago.edu
 wrote:

Sorry for the confusion::

When I grepped for the prefix of the missing object::
"2015\/01\/01\/PAKC\/NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959.tar.2~pcu5Hz6foFXjlSxBat22D8YMcHlQOBD"

I am not able to find any chunks of the object::

lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep 'pcu5Hz6'
lacadmin@kh28-10:~$

The only piece of the object that I can seem to find is the original one I
posted::
lacadmin@kh28-10:~$ rados -p .rgw.buckets ls | grep
'NWS_NEXRAD_NXL2DP_PAKC_2015010111_20150101115959'

[ceph-users] OSDs are down, don't know why

2016-01-15 Thread Jeff Epstein

Hello,

I'm setting up a small test instance of ceph and I'm running into a 
situation where the OSDs are being shown as down, but I don't know why.


Connectivity seems to be working. The OSD hosts are able to communicate 
with the MON hosts; running "ceph status" and "ceph osd in" from an OSD 
host works fine, but with a HEALTH_WARN that I have 2 osds: 0 up, 2 in. 
Both the OSD and MON daemons seem to be running fine. Network 
connectivity seems to be okay: I can nc from the OSD to port 6789 on the 
MON, and from the MON to port 6800-6803 on the OSD (I have constrained 
the ms bind port min/max config options so that the OSDs will use only 
these ports). Neither OSD nor MON logs show anything that seems unusual, 
nor why the OSD is marked as being down.
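
For reference, the constraint described above would look roughly like this 
in ceph.conf (the values here are just inferred from the ports mentioned; 
note that each OSD daemon binds several ports, so the range needs to cover 
all daemons on a host):

[osd]
    ms bind port min = 6800
    ms bind port max = 6803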


Furthermore, using tcpdump i've watched network traffic between the OSD 
and the MON, and it seems that the OSD is sending heartbeats and getting 
an ack from the MON. So I'm definitely not sure why the MON thinks the 
OSD is down.


Some questions:
- How does the MON determine if the OSD is down?
- Is there a way to get the MON to report on why an OSD is down, e.g. no 
heartbeat?

- Is there any need to open ports other than TCP 6789 and 6800-6803?
- Any other suggestions?

ceph 0.94 on Debian Jessie

Best,
Jeff
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Infernalis upgrade breaks when journal on separate partition

2016-01-15 Thread Michał Chybowski
In my case one server was also installed without GPT, and in 
/usr/sbin/ceph-disk I've added the line:

os.chmod(os.path.join(path, 'journal'), 0777) after line 1926.

I know that it's very ugly and shouldn't be done in production, but I 
had no time to search for the proper way to fix this.
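
For what it's worth, a udev rule along these lines is another stopgap until 
the partitions carry the proper GPT type codes (the device name is a 
placeholder; on Infernalis the daemons run as the ceph user):

# /etc/udev/rules.d/90-ceph-journal.rules  (example path)
KERNEL=="sdb2", SUBSYSTEM=="block", OWNER="ceph", GROUP="ceph", MODE="0660"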


Regards
Michał Chybowski
Tiktalik.com

W dniu 15.01.2016 o 05:00, Stuart Longland pisze:

On 12/01/16 01:22, Stillwell, Bryan wrote:

Well, it seems I spoke too soon.  Not sure what logic the udev rules use
to identify ceph journals, but it doesn't seem to pick up on the
journals in our case as after a reboot, those partitions are owned by
root:disk with permissions 0660.

This is handled by the UUIDs of the GPT partitions, and since you're using
MS-DOS partition tables it won't work correctly.  I would recommend switching to
GPT partition tables if you can.

I'm not comfortable with switching from MS-DOS to GPT disklabels on a
running production server, and nothing in the Ceph docs at the time
mentioned that GPT was a requirement on the journal disks.

Switching to GPT isn't toggling an option, it'd require resizing
partitions (some of these are xfs; can't be shrunk to my knowledge) and
moving stuff around to allow for an additional EFI boot partition,
possibly changes to boot firmware settings and bootloaders too.

I'll look into a udev rule for our particular case and see how I go.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com