Dear All


We are trying to remove old multipart uploads but are running into trouble 
because some of them have null characters in their names:


rados -p zh-1.rgw.buckets.index rmomapkey 
.dir.cb1594b3-a782-49d0-a19f-68cd48870a63.81880353.1.0 
'_multipart_MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_DiskImage/Disk_4f8130ff-fef5-4b0f-b25e-c6b8b3dba9bf/Volume_NTFS_5b4f5274-9107-4386-93d9-e7f31193805a$/20201218230243/0.cbrevision.525Sr39KY5yVbD_w9ipOXSXsQ95YUnC.25'

rados -p zh-1.rgw.buckets.index rmomapkey 
.dir.cb1594b3-a782-49d0-a19f-68cd48870a63.81880353.1.0 $(echo -ne 
'_multipart_MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_DiskImage/Disk_4f8130ff-fef5-4b0f-b25e-c6b8b3dba9bf/Volume_NTFS_5b4f5274-9107-4386-93d9-e7f31193805a$/20201218230243/0.cbrevision.525Sr39KY5yVbD_w9ipOXSXsQ95YUnC\0.25')
-bash: warning: command substitution: ignored null byte in input

rados -p zh-1.rgw.buckets.index listomapkeys 
.dir.cb1594b3-a782-49d0-a19f-68cd48870a63.81880353.1.0 | grep -a 
'_multipart_MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_DiskImage/Disk_4f8130ff-fef5-4b0f-b25e-c6b8b3dba9bf/Volume_NTFS_5b4f5274-9107-4386-93d9-e7f31193805a$/20201218230243/0.cbrevision.525Sr39KY5yVbD_w9ipOXSXsQ95YUnC'
 | cat -A
_multipart_MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_DiskImage/Disk_4f8130ff-fef5-4b0f-b25e-c6b8b3dba9bf/Volume_NTFS_5b4f5274-9107-4386-93d9-e7f31193805a$/20201218230243/0.cbrevision.525Sr39KY5yVbD_w9ipOXSXsQ95YUnC^@.25$
 # <= not deleted !

This does not work because the null character is stripped off.
Any idea how to proceed?
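
A minimal sketch of one possible direction, not a verified fix: a NUL byte cannot travel through a shell variable, a command substitution, or a command-line argument, so the key would have to reach rados through a file instead. The shortened key below is a hypothetical stand-in, and the commented-out `--omap-key-file` option is an assumption that should be checked against `rados --help` on the installed version.

```shell
# Hypothetical, shortened stand-in for the real index key.
key_prefix='_multipart_example/0.cbrevision.UPLOADID'
key_file="$(mktemp)"

# printf writes the NUL byte straight into the file; it never enters a variable.
printf '%s\0%s' "$key_prefix" '.25' > "$key_file"

# Reading it back through command substitution loses the NUL again
# (bash additionally prints the "ignored null byte" warning).
via_subst="$(cat "$key_file")"
len_subst=${#via_subst}
len_file=$(wc -c < "$key_file" | tr -d ' ')
echo "substitution: $len_subst bytes, file: $len_file bytes"

# Inspect the tail of the key; the NUL shows up as \0 in od's output.
tail -c 4 "$key_file" | od -c

# Assumption: this rados build accepts --omap-key-file (not verified here).
# rados -p zh-1.rgw.buckets.index rmomapkey \
#   .dir.cb1594b3-a782-49d0-a19f-68cd48870a63.81880353.1.0 \
#   --omap-key-file "$key_file"
```

The one-byte difference between the two lengths is the NUL that the shell drops, which is why the key has to stay in a file from creation until it is handed to the tool.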

This bucket was created on luminous. But this specific object was created after 
our upgrade to nautilus.
Apparently some bug has inserted NullChars into MPU object names, between the 
upload ID and the suffix.
Output from 'radosgw-admin bi list' (see the \u0000 NullChars):
  {
    "type": "plain",
    "idx": 
"_multipart_MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_DiskImage/Disk_4f8130ff-fef5-4b0f-b25e-c6b8b3dba9bf/Volume_NTFS_5b4f5274-9107-4386-93d9-e7f31193805a$/20201218230243/0.cbrevision.525Sr39KY5yVbD_w9ipOXSXsQ95YUnC\u0000.25",
    "entry": {
      "name": 
"_multipart_MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_DiskImage/Disk_4f8130ff-fef5-4b0f-b25e-c6b8b3dba9bf/Volume_NTFS_5b4f5274-9107-4386-93d9-e7f31193805a$/20201218230243/0.cbrevision.525Sr39KY5yVbD_w9ipOXSXsQ95YUnC\u0000.25",
      "instance": "",
      "ver": {
        "pool": 6,
        "epoch": 852938
      },
      "locator": "",
      "exists": "true",
      "meta": {
        "category": 1,
        "size": 157286400,
        "mtime": "2020-12-25 23:39:20.019898Z",
        "etag": "a126c2f0d439c44176a5d07bd5841575",
        "storage_class": "",
        "owner": "40eb21a9092c4948bcf94386f6042f94",
        "owner_display_name": "amsler1",
        "content_type": "",
        "accounted_size": 157286400,
        "user_data": "",
        "appendable": "false"
      },
      "tag": "_vMx_4vu-E5nWf7kCHJIQCFPGEHRiUAG",
      "flags": 0,
      "pending_map": [],
      "versioned_epoch": 0
    }
  },
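
To get a rough count of affected index entries, the JSON escape can be searched for as plain text. This is only a sketch: it assumes that `radosgw-admin bi list` prints each "idx" value on a single line with NULs escaped as `\u0000`, as in the excerpt above, and the sample file is made-up data.

```shell
# Made-up sample standing in for: radosgw-admin bi list --bucket="$bucket"
sample="$(mktemp)"
cat > "$sample" <<'EOF'
[
  { "type": "plain", "idx": "_multipart_a.UPLOADID\u0000.25" },
  { "type": "plain", "idx": "_multipart_b.UPLOADID.26" },
  { "type": "plain", "idx": "ordinary-object" }
]
EOF

# -F treats \u0000 as six literal characters, not as a regex or an escape.
affected=$(grep -F '"idx"' "$sample" | grep -c -F '\u0000')
total=$(grep -c -F '"idx"' "$sample")
echo "index entries with an escaped NUL: $affected of $total"
```

Filtering on the "idx" lines first avoids counting the same entry twice, since the "name" field repeats the key.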


On the same bucket, we also see NullChars at the end of some Etags when 
using 'radosgw-admin bucket list --bucket', but not with 'radosgw-admin object 
stat':

object='MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_HV/BSSRV01.Aerztehaus-allschwil.ch/BSSRV05/D0F970B6-DB86-48AF-AA68-946D4642E2A6.xml:/20191103183115/D0F970B6-DB86-48AF-AA68-946D4642E2A6.xml'

radosgw-admin object stat --bucket="$bucket" --object="$object" | jq -c '{name, 
size, etag, tag, obj_size: .manifest.obj_size, 
marker:.manifest.tail_placement.bucket.marker, 
bucket_id:.manifest.tail_placement.bucket.bucket_id}' | cat -A
{"name":"MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_HV/BSSRV01.Aerztehaus-allschwil.ch/BSSRV05/D0F970B6-DB86-48AF-AA68-946D4642E2A6.xml:/20191103183115/D0F970B6-DB86-48AF-AA68-946D4642E2A6.xml","size":39372,"etag":"0e73d594032900acb74d3f06b230aeb9","tag":"_xhNKxuWrfxDO5XfYs8Llq8vLTUYqtmm","obj_size":39372,"marker":"cb1594b3-a782-49d0-a19f-68cd48870a63.19334234.139","bucket_id":"cb1594b3-a782-49d0-a19f-68cd48870a63.20382694.169"}$
 # <= no NullChar
radosgw-admin bucket list --bucket "${bucket}" --allow-unordered --max-entries 
20000000 | jq -c 'sort_by(.bucket) | .[] | {name, accounted_size: 
.meta.accounted_size, etag: .meta.etag}' | fgrep -a 
0e73d594032900acb74d3f06b230aeb9 | cat -A
{"name":"MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_HV/BSSRV01.Aerztehaus-allschwil.ch/BSSRV05/D0F970B6-DB86-48AF-AA68-946D4642E2A6.xml:/20191103183115/D0F970B6-DB86-48AF-AA68-946D4642E2A6.xml","accounted_size":39372,"etag":"0e73d594032900acb74d3f06b230aeb9\u0000"}$
 # <= NullChar at the end of the etag

rados -p zh-1.rgw.buckets.data stat 
'cb1594b3-a782-49d0-a19f-68cd48870a63.19334234.139_'"$object"
zh-1.rgw.buckets.data/cb1594b3-a782-49d0-a19f-68cd48870a63.19334234.139_MBS-35a9b79c-f27d-44f2-804f-472ef0520816/CBB_BSSRV01/CBB_HV/BSSRV01.Aerztehaus-allschwil.ch/BSSRV05/D0F970B6-DB86-48AF-AA68-946D4642E2A6.xml:/20191103183115/D0F970B6-DB86-48AF-AA68-946D4642E2A6.xml
 mtime 2020-04-21 14:21:27.000000, size 39372
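
As a small sanity check, the two etags can be compared after dropping the trailing NUL escape. This is a sketch that works on the JSON-escaped text form shown above (as printed by `jq -c`), not on raw bytes; the variable names are illustrative.

```shell
# Values copied from the two outputs above (JSON-escaped form).
etag_stat='0e73d594032900acb74d3f06b230aeb9'
etag_list='0e73d594032900acb74d3f06b230aeb9\u0000'

# Drop any run of trailing \u0000 escapes before comparing.
etag_norm=$(printf '%s\n' "$etag_list" | sed 's/\(\\u0000\)*$//')

if [ "$etag_norm" = "$etag_stat" ]; then
  verdict="same etag once trailing NUL escapes are removed"
else
  verdict="etags differ beyond the trailing NUL"
fi
echo "$verdict"
```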

This bucket was causing multi-site rgw sync to crash every minute when using 
rgw_sync_obj_etag_verify = true.
These Etag NullChars may be the cause of this bug:

  *   https://tracker.ceph.com/issues/49955

It may also be related to:

  *   https://tracker.ceph.com/issues/23939


So we would be glad to know how to remove these NullChars from the Etags and 
how to remove the MPUs with NullChars in their object names.
Both of these issues seem to be the cause of a lot of weird behavior:

  1.  rgw sync crashes (with rgw_sync_obj_etag_verify = true)
  2.  radosgw-admin bucket sync status --bucket "$bucket" --source-zone 
ch-zh1-az2 => reports "bucket is caught up with source" even though most of the 
objects are missing
  3.  radosgw-admin bucket list --bucket "$bucket" --allow-unordered 
--max-entries 99000000 => returns an incomplete list
  4.  radosgw-admin bucket stats --bucket "$bucket" => returns a wrong number of 
objects and a wrong utilized size

The only reliable output is from bi list:

  *   radosgw-admin bi list --bucket=$bucket | jq -cr 'map(select(.type == "plain" or .type == "instance") | .entry)'

Do you know whether the following commands may help and are safe in a multi-site setup?

  *   radosgw-admin bucket check --bucket $bucket --fix --check-objects
  *   radosgw-admin bucket rewrite --bucket $bucket --min-rewrite-size 0


Or does a dedicated tool need to be developed to deal with these 
NullChars?

Many thanks in advance.



Cheers

Francois Scheurer


--


EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich

tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheu...@everyware.ch
web: http://www.everyware.ch
