[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-07-27 Thread Robin H. Johnson
On Mon, Jul 27, 2020 at 08:02:23PM +0200, Mariusz Gronczewski wrote:
> Hi,
> 
> I've got a problem on Octopus (15.2.3, debian packages) install, bucket
> S3 index shows a file:
> 
> s3cmd ls s3://upvid/255/38355 --recursive
> 2020-07-27 17:48  50584342
> 
> s3://upvid/255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4
> 
> radosgw-admin bi list also shows it
> 
> {
> "type": "plain",
> "idx":
> "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
> "entry": { "name":
> "255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
> "instance": "", "ver": {
> "pool": 11,
> "epoch": 853842
> },
> "locator": "",
> "exists": "true",
> "meta": {
> "category": 1,
> "size": 50584342,
> "mtime": "2020-07-27T17:48:27.203008Z",
> "etag": "2b31cc8ce8b1fb92a5f65034f2d12581-7",
> "storage_class": "",
> "owner": "filmweb-app",
> "owner_display_name": "filmweb app user",
> "content_type": "",
> "accounted_size": 50584342,
> "user_data": "",
> "appendable": "false"
> },
> "tag": "_3ubjaztglHXfZr05wZCFCPzebQf-ZFP",
> "flags": 0,
> "pending_map": [],
> "versioned_epoch": 0
> }
> },
> 
> but trying to download it via curl (I've set permissions to public) only gets me
Does the RADOS object for this still exist?

try:
radosgw-admin object stat --bucket ... --object 
'255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4'

If that doesn't return anything, then the backing object is gone, and you have a
stale index entry that can in most cases be cleaned up with a bucket check.
For cases where that doesn't fix it, my recommended way to fix it is to
write a new 0-byte object to the same name, then delete it.
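
For completeness, a rough sketch of both approaches, using the bucket/object
from the report (the zero-length local file for the overwrite workaround is
just an assumption on my part):

radosgw-admin bucket check --bucket=upvid --check-objects --fix

touch empty
s3cmd put empty s3://upvid/255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4
s3cmd del s3://upvid/255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4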

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


signature.asc
Description: PGP signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cluster became unresponsive: e5 handle_auth_request failed to assign global_id

2020-07-27 Thread Dino Godor
Well, as I just looked up, port 6800 is not a monitor port, so I wouldn't
look there.


Can you use the ceph command from another mon?

Also, maybe the user you're running as can't access the admin keyring - as far as I
remember that led to infinitely hanging commands on my test cluster
(but that was Nautilus, I don't know if that has changed) - or maybe you used to
fire the commands from the folder you used to deploy and didn't set this
machine up as an admin host.
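
A quick sanity check for both ideas might look like this (default paths
assumed, adjust to your setup):

ls -l /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring
ceph -s --connect-timeout 10   # fails fast instead of hanging forever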


Just some thoughts.


On 27.07.20 16:28, Илья Борисович Волошин wrote:

Here are all the active ports on mon1 (with the exception of sshd and ntpd):

# netstat -npl
Proto Recv-Q Send-Q Local Address   Foreign Address State
 PID/Program name
tcp0  0 :3300  0.0.0.0:*   LISTEN
  1582/ceph-mon
tcp0  0 :6789  0.0.0.0:*   LISTEN
  1582/ceph-mon
tcp6   0  0 :::9093 :::*LISTEN
  908/alertmanager
tcp6   0  0 :::9094 :::*LISTEN
  908/alertmanager
tcp6   0  0 :::9095 :::*LISTEN
  896/prometheus
tcp6   0  0 :::9100 :::*LISTEN
  906/node_exporter
tcp6   0  0 :::3000 :::*LISTEN
  882/grafana-server
udp6   0  0 :::9094 :::*
  908/alertmanager

I've tried telnet from mon1 host, can connect to 3300 and 6789:

# telnet  3300
Trying ...
Connected to .
Escape character is '^]'.
ceph v2

# telnet  6789
Trying ...
Connected to .
Escape character is '^]'.
ceph v027QQ

6800 and 6801 refuse connection:

# telnet  6800
Trying ...
telnet: Unable to connect to remote host: Connection refused

I don't see any errors in the log related to failures to bind... and all
CEPH systemd services are running as far as I can tell:

# systemctl list-units -a | grep ceph
   ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@alertmanager.mon1.service
 loadedactive   running   Ceph
alertmanager.mon1 for e30397f0-cc32-11ea-8c8e-000c29469cd5
   ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@crash.mon1.service
  loadedactive   running   Ceph crash.mon1
for e30397f0-cc32-11ea-8c8e-000c29469cd5
   ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@grafana.mon1.service
  loadedactive   running   Ceph grafana.mon1
for e30397f0-cc32-11ea-8c8e-000c29469cd5
   ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@mgr.mon1.peevkl.service
 loadedactive   running   Ceph
mgr.mon1.peevkl for e30397f0-cc32-11ea-8c8e-000c29469cd5
   ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@mon.mon1.service
  loadedactive   running   Ceph mon.mon1 for
e30397f0-cc32-11ea-8c8e-000c29469cd5
   ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@node-exporter.mon1.service
  loadedactive   running   Ceph
node-exporter.mon1 for e30397f0-cc32-11ea-8c8e-000c29469cd5
   ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@prometheus.mon1.service
 loadedactive   running   Ceph
prometheus.mon1 for e30397f0-cc32-11ea-8c8e-000c29469cd5
   system-ceph\x2de30397f0\x2dcc32\x2d11ea\x2d8c8e\x2d000c29469cd5.slice
 loadedactive   active
  system-ceph\x2de30397f0\x2dcc32\x2d11ea\x2d8c8e\x2d000c29469cd5.slice
   ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5.target
  loadedactive   activeCeph cluster
e30397f0-cc32-11ea-8c8e-000c29469cd5
   ceph.target
 loadedactive   activeAll Ceph clusters
and services

Here are currently active docker images:

# docker ps
CONTAINER IDIMAGECOMMAND
  CREATED STATUS  PORTS   NAMES
dfd8dbeccf1eceph/ceph:v15"/usr/bin/ceph-mgr -…"
41 minutes ago  Up 41 minutes
ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-mgr.mon1.peevkl
9452d1db7ffbceph/ceph:v15"/usr/bin/ceph-mon -…"   3
hours ago Up 3 hours
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-mon.mon1
703ec4a43824prom/prometheus:v2.18.1  "/bin/prometheus --c…"   3
hours ago Up 3 hours
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-prometheus.mon1
d816ec5e645fceph/ceph:v15"/usr/bin/ceph-crash…"   3
hours ago Up 3 hours
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-crash.mon1
38d283ba6424ceph/ceph-grafana:latest "/bin/sh -c 'grafana…"   3
hours ago Up 3 hours
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-grafana.mon1
cc119ec8f09aprom/node-exporter:v0.18.1   "/bin/node_exporter …"   3
hours ago Up 3 hours
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-node-exporter.mon1
aa1d339c4100prom/alertmanager:v0.20.0"/bin/alertmanager -…"   3
hours ago Up 3 hours
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-alertmanager.mon1

[ceph-users] Re: rbd-nbd stuck request

2020-07-27 Thread Jason Dillaman
On Mon, Jul 27, 2020 at 3:08 PM Herbert Alexander Faleiros
 wrote:
>
> Hi,
>
> On Fri, Jul 24, 2020 at 12:37:38PM -0400, Jason Dillaman wrote:
> > On Fri, Jul 24, 2020 at 10:45 AM Herbert Alexander Faleiros
> >  wrote:
> > >
> > > On Fri, Jul 24, 2020 at 07:28:07PM +0500, Alexander E. Patrakov wrote:
> > > > On Fri, Jul 24, 2020 at 6:01 PM Herbert Alexander Faleiros
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > is there any way to fix it instead of a reboot?
> > > > >
> > > > > [128632.995249] block nbd0: Possible stuck request b14a04af: 
> > > > > control (read@2097152,4096B). Runtime 9540 seconds
> > > > > [128663.718993] block nbd0: Possible stuck request b14a04af: 
> > > > > control (read@2097152,4096B). Runtime 9570 seconds
> > > > > [128694.434774] block nbd0: Possible stuck request b14a04af: 
> > > > > control (read@2097152,4096B). Runtime 9600 seconds
> > > > > [128725.154515] block nbd0: Possible stuck request b14a04af: 
> > > > > control (read@2097152,4096B). Runtime 9630 seconds
> > > > >
> > > > > # ceph -v
> > > > > ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) 
> > > > > luminous (stable)
> > > > >
> > > > > # rbd-nbd list-mapped
> > > > > #
> > > > >
> > > > > # uname -r
> > > > > 5.4.52-050452-generic
> > > >
> > > > Not enough data to troubleshoot this. Is the rbd-nbd process running?
> > > >
> > > > I.e.:
> > > >
> > > > # cat /proc/partitions
> > > > # ps axww | grep nbd
> > >
> > > no nbd on /proc/partitions, ps shows only:
> > >
> > > root  192324  0.0  0.0  0 0 ?I<   07:12   0:00 
> > > [knbd0-recv]
> >
> > You can restart the "rbd-nbd" daemon by running "rbd-nbd map --device
> > /dev/nbd0 "
>
> works, but when I try to unmap, the command blocks the terminal and
> never ends (I cannot even kill it). Same with nbd-client.
>
> Detail: only happens with journaling enabled.

Luminous is EOL -- any chance you can reproduce using an Octopus
"rbd-nbd" client?

> --
> Herbert
>


-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-nbd stuck request

2020-07-27 Thread Herbert Alexander Faleiros
Hi,

On Fri, Jul 24, 2020 at 12:37:38PM -0400, Jason Dillaman wrote:
> On Fri, Jul 24, 2020 at 10:45 AM Herbert Alexander Faleiros
>  wrote:
> >
> > On Fri, Jul 24, 2020 at 07:28:07PM +0500, Alexander E. Patrakov wrote:
> > > On Fri, Jul 24, 2020 at 6:01 PM Herbert Alexander Faleiros
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > is there any way to fix it instead of a reboot?
> > > >
> > > > [128632.995249] block nbd0: Possible stuck request b14a04af: 
> > > > control (read@2097152,4096B). Runtime 9540 seconds
> > > > [128663.718993] block nbd0: Possible stuck request b14a04af: 
> > > > control (read@2097152,4096B). Runtime 9570 seconds
> > > > [128694.434774] block nbd0: Possible stuck request b14a04af: 
> > > > control (read@2097152,4096B). Runtime 9600 seconds
> > > > [128725.154515] block nbd0: Possible stuck request b14a04af: 
> > > > control (read@2097152,4096B). Runtime 9630 seconds
> > > >
> > > > # ceph -v
> > > > ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) 
> > > > luminous (stable)
> > > >
> > > > # rbd-nbd list-mapped
> > > > #
> > > >
> > > > # uname -r
> > > > 5.4.52-050452-generic
> > >
> > > Not enough data to troubleshoot this. Is the rbd-nbd process running?
> > >
> > > I.e.:
> > >
> > > # cat /proc/partitions
> > > # ps axww | grep nbd
> >
> > no nbd on /proc/partitions, ps shows only:
> >
> > root  192324  0.0  0.0  0 0 ?I<   07:12   0:00 
> > [knbd0-recv]
> 
> You can restart the "rbd-nbd" daemon by running "rbd-nbd map --device
> /dev/nbd0 "

works, but when I try to unmap, the command blocks the terminal and
never ends (I cannot even kill it). Same with nbd-client.

Detail: only happens with journaling enabled.
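
For reference, the journaling feature mentioned here can be inspected and, if
needed, toggled roughly like this (pool/image names are placeholders, not from
this report):

rbd info rbd/myimage | grep features
rbd journal info --pool rbd --image myimage
rbd feature disable rbd/myimage journaling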

--
Herbert
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] NoSuchKey on key that is visible in s3 list/radosgw bk

2020-07-27 Thread Mariusz Gronczewski
Hi,

I've got a problem on Octopus (15.2.3, debian packages) install, bucket
S3 index shows a file:

s3cmd ls s3://upvid/255/38355 --recursive
2020-07-27 17:48  50584342

s3://upvid/255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4

radosgw-admin bi list also shows it

{
"type": "plain",
"idx":
"255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
"entry": { "name":
"255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4",
"instance": "", "ver": {
"pool": 11,
"epoch": 853842
},
"locator": "",
"exists": "true",
"meta": {
"category": 1,
"size": 50584342,
"mtime": "2020-07-27T17:48:27.203008Z",
"etag": "2b31cc8ce8b1fb92a5f65034f2d12581-7",
"storage_class": "",
"owner": "filmweb-app",
"owner_display_name": "filmweb app user",
"content_type": "",
"accounted_size": 50584342,
"user_data": "",
"appendable": "false"
},
"tag": "_3ubjaztglHXfZr05wZCFCPzebQf-ZFP",
"flags": 0,
"pending_map": [],
"versioned_epoch": 0
}
},

but trying to download it via curl (I've set permissions to public) only gets me


<Error><Code>NoSuchKey</Code><BucketName>upvid</BucketName><RequestId>txe716d-005f1f14cb-e478a-pl-war1</RequestId><HostId>e478a-pl-war1-pl</HostId></Error>

(files that genuinely don't exist return "access denied" in the same context)

same with other tools:

$ s3cmd get 
s3://upvid/255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4 
/tmp
download: 
's3://upvid/255/38355/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4'
 -> '/tmp/juz_nie_zyjesz_sezon_2___oficjalny_zwiastun___netflix_mp4'  [1 of 1]
ERROR: S3 error: 404 (NoSuchKey)

cluster health is OK
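
One way to check whether the backing RADOS objects (head and multipart tails)
are still there is roughly the following; the data pool name below is the
default one and may differ in this setup, and listing a big pool can take a
while:

radosgw-admin metadata get bucket:upvid
rados -p default.rgw.buckets.data ls | grep '255/38355/juz_nie_zyjesz' | head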

Any ideas what is happening here ?

-- 
Mariusz Gronczewski, Administrator

Efigence S. A.
ul. Wołoska 9a, 02-583 Warszawa
T:   [+48] 22 380 13 13
NOC: [+48] 22 380 10 20
E: ad...@efigence.com
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mimic: much more raw used than reported

2020-07-27 Thread Frank Schilder
Hi Igor,

thanks for your answer. I was thinking about that, but as far as I understood,
hitting this bug actually requires a partial rewrite to happen. However, these
are disk images in storage servers with basically static files, many of which are
very large (15GB). Therefore, I believe, the vast majority of objects are
written to only once and should not be affected by the amplification bug.

Is there any way to confirm or rule that out, or to check how much amplification is
happening?

I'm wondering if I might be observing something else. Since "ceph osd df tree"
does report the actual utilization and I have only one pool on these OSDs,
there is no problem with accounting allocated storage to a pool. I know it's all
used by this one pool. I'm more wondering whether it's not the known amplification
but something else (at least partly) that plays a role here.

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Igor Fedotov 
Sent: 27 July 2020 12:54:02
To: Frank Schilder; ceph-users
Subject: Re: [ceph-users] mimic: much more raw used than reported

Hi Frank,

you might be being hit by https://tracker.ceph.com/issues/44213

In short the root causes are  significant space overhead due to high
bluestore allocation unit (64K) and EC overwrite design.

This is fixed for upcoming Pacific release by using 4K alloc unit but it
is unlikely to be backported to earlier releases due to its complexity.
To say nothing about the need for OSD redeployment. Hence please expect
no fix for mimic.


And your raw usage reports might still be not that good since mimic
lacks per-pool stats collection https://github.com/ceph/ceph/pull/19454.
I.e. your actual raw space usage is higher than reported. To estimate
proper raw usage one can use bluestore perf counters (namely
bluestore_stored and bluestore_allocated). Summing bluestore_allocated
over all involved OSDs will give actual RAW usage. Summing
bluestore_stored will provide actual data volume after EC processing,
i.e. presumably it should be around 158TiB.
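
A rough sketch of that summation, assuming jq is available and that it is run
on each OSD host for the OSD ids local to it (the ids below are placeholders):

total_alloc=0; total_stored=0
for osd in 84 145 156 168; do   # OSDs backing the pool that live on this host
  read a s < <(ceph daemon osd.$osd perf dump \
               | jq -r '.bluestore | "\(.bluestore_allocated) \(.bluestore_stored)"')
  total_alloc=$((total_alloc + a)); total_stored=$((total_stored + s))
done
echo "allocated: $((total_alloc / 2**40)) TiB, stored: $((total_stored / 2**40)) TiB"

Repeating this per host and adding the per-host sums gives the cluster-wide figures.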


Thanks,

Igor

On 7/26/2020 8:43 PM, Frank Schilder wrote:
> Dear fellow cephers,
>
> I observe a weird problem on our mimic-13.2.8 cluster. We have an EC RBD pool
> backed by HDDs. These disks are not in any other pool. I noticed that the 
> total capacity (=USED+MAX AVAIL) reported by "ceph df detail" has shrunk 
> recently from 300TiB to 200TiB. Part but by no means all of this can be 
> explained by imbalance of the data distribution.
>
> When I compare the output of "ceph df detail" and "ceph osd df tree", I find 
> 69TiB raw capacity used but not accounted for; see calculations below. These 
> 69TiB raw are equivalent to 20% usable capacity and I really need it back. 
> Together with the imbalance, we lose about 30% capacity.
>
> What is using these extra 69TiB and how can I get it back?
>
>
> Some findings:
>
> These are the 5 largest images in the pool, accounting for a total of 97TiB 
> out of 119TiB usage:
>
> # rbd du :
> NAMEPROVISIONED   USED
> one-133  25 TiB 14 TiB
> NAMEPROVISIONEDUSED
> one-153@222  40 TiB  14 TiB
> one-153@228  40 TiB 357 GiB
> one-153@235  40 TiB 797 GiB
> one-153@241  40 TiB 509 GiB
> one-153@242  40 TiB  43 GiB
> one-153@243  40 TiB  16 MiB
> one-153@244  40 TiB  16 MiB
> one-153@245  40 TiB 324 MiB
> one-153@246  40 TiB 276 MiB
> one-153@247  40 TiB  96 MiB
> one-153@248  40 TiB 138 GiB
> one-153@249  40 TiB 1.8 GiB
> one-153@250  40 TiB 0 B
> one-153  40 TiB 204 MiB
>   40 TiB  16 TiB
> NAME   PROVISIONEDUSED
> one-391@3   40 TiB 432 MiB
> one-391@9   40 TiB  26 GiB
> one-391@15  40 TiB  90 GiB
> one-391@16  40 TiB 0 B
> one-391@17  40 TiB 0 B
> one-391@18  40 TiB 0 B
> one-391@19  40 TiB 0 B
> one-391@20  40 TiB 3.5 TiB
> one-391@21  40 TiB 5.4 TiB
> one-391@22  40 TiB 5.8 TiB
> one-391@23  40 TiB 8.4 TiB
> one-391@24  40 TiB 1.4 TiB
> one-391 40 TiB 2.2 TiB
>  40 TiB  27 TiB
> NAME   PROVISIONEDUSED
> one-394@3   70 TiB 1.4 TiB
> one-394@9   70 TiB 2.5 TiB
> one-394@15  70 TiB  20 GiB
> one-394@16  70 TiB 0 B
> one-394@17  70 TiB 0 B
> one-394@18  70 TiB 0 B
> one-394@19  70 TiB 383 GiB
> one-394@20  70 TiB 3.3 TiB
> one-394@21  70 TiB 5.0 TiB
> one-394@22  70 TiB 5.0 TiB
> one-394@23  70 TiB 9.0 TiB
> one-394@24  70 TiB 1.6 TiB
> one-394 70 TiB 2.5 TiB
>  70 TiB  31 TiB
> NAMEPROVISIONEDUSED
> one-434  25 TiB 9.1 TiB
>
> The large 70TiB images one-391 and one-394 are currently copied to with ca. 
> 5TiB per day.
>
> Output of "ceph df detail" with some columns removed:
>
> NAME ID USED%USED MAX AVAIL OBJECTS   
>RAW USED
> sr-rbd-data-one-hdd  11 119 TiB 58.4584 

[ceph-users] Re: Cluster became unresponsive: e5 handle_auth_request failed to assign global_id

2020-07-27 Thread Илья Борисович Волошин
Here are all the active ports on mon1 (with the exception of sshd and ntpd):

# netstat -npl
Proto Recv-Q Send-Q Local Address   Foreign Address State
PID/Program name
tcp0  0 :3300  0.0.0.0:*   LISTEN
 1582/ceph-mon
tcp0  0 :6789  0.0.0.0:*   LISTEN
 1582/ceph-mon
tcp6   0  0 :::9093 :::*LISTEN
 908/alertmanager
tcp6   0  0 :::9094 :::*LISTEN
 908/alertmanager
tcp6   0  0 :::9095 :::*LISTEN
 896/prometheus
tcp6   0  0 :::9100 :::*LISTEN
 906/node_exporter
tcp6   0  0 :::3000 :::*LISTEN
 882/grafana-server
udp6   0  0 :::9094 :::*
 908/alertmanager

I've tried telnet from mon1 host, can connect to 3300 and 6789:

# telnet  3300
Trying ...
Connected to .
Escape character is '^]'.
ceph v2

# telnet  6789
Trying ...
Connected to .
Escape character is '^]'.
ceph v027QQ

6800 and 6801 refuse connection:

# telnet  6800
Trying ...
telnet: Unable to connect to remote host: Connection refused

I don't see any errors in the log related to failures to bind... and all
CEPH systemd services are running as far as I can tell:

# systemctl list-units -a | grep ceph
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@alertmanager.mon1.service
loadedactive   running   Ceph
alertmanager.mon1 for e30397f0-cc32-11ea-8c8e-000c29469cd5
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@crash.mon1.service
 loadedactive   running   Ceph crash.mon1
for e30397f0-cc32-11ea-8c8e-000c29469cd5
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@grafana.mon1.service
 loadedactive   running   Ceph grafana.mon1
for e30397f0-cc32-11ea-8c8e-000c29469cd5
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@mgr.mon1.peevkl.service
loadedactive   running   Ceph
mgr.mon1.peevkl for e30397f0-cc32-11ea-8c8e-000c29469cd5
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@mon.mon1.service
 loadedactive   running   Ceph mon.mon1 for
e30397f0-cc32-11ea-8c8e-000c29469cd5
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@node-exporter.mon1.service
 loadedactive   running   Ceph
node-exporter.mon1 for e30397f0-cc32-11ea-8c8e-000c29469cd5
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5@prometheus.mon1.service
loadedactive   running   Ceph
prometheus.mon1 for e30397f0-cc32-11ea-8c8e-000c29469cd5
  system-ceph\x2de30397f0\x2dcc32\x2d11ea\x2d8c8e\x2d000c29469cd5.slice
loadedactive   active
 system-ceph\x2de30397f0\x2dcc32\x2d11ea\x2d8c8e\x2d000c29469cd5.slice
  ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5.target
 loadedactive   activeCeph cluster
e30397f0-cc32-11ea-8c8e-000c29469cd5
  ceph.target
loadedactive   activeAll Ceph clusters
and services

Here are currently active docker images:

# docker ps
CONTAINER IDIMAGECOMMAND
 CREATED STATUS  PORTS   NAMES
dfd8dbeccf1eceph/ceph:v15"/usr/bin/ceph-mgr -…"
41 minutes ago  Up 41 minutes
ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-mgr.mon1.peevkl
9452d1db7ffbceph/ceph:v15"/usr/bin/ceph-mon -…"   3
hours ago Up 3 hours
 ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-mon.mon1
703ec4a43824prom/prometheus:v2.18.1  "/bin/prometheus --c…"   3
hours ago Up 3 hours
 ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-prometheus.mon1
d816ec5e645fceph/ceph:v15"/usr/bin/ceph-crash…"   3
hours ago Up 3 hours
 ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-crash.mon1
38d283ba6424ceph/ceph-grafana:latest "/bin/sh -c 'grafana…"   3
hours ago Up 3 hours
 ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-grafana.mon1
cc119ec8f09aprom/node-exporter:v0.18.1   "/bin/node_exporter …"   3
hours ago Up 3 hours
 ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-node-exporter.mon1
aa1d339c4100prom/alertmanager:v0.20.0"/bin/alertmanager -…"   3
hours ago Up 3 hours
 ceph-e30397f0-cc32-11ea-8c8e-000c29469cd5-alertmanager.mon1

iptables is active; I tried setting all chain policies to ACCEPT (didn't
help). The rules are as follows:

0 0 CEPH   tcp  --  *  *   0.0.0.0/0
0.0.0.0/0tcp dpt:6789
 5060  303K CEPH   tcp  --  *  *   0.0.0.0/0
0.0.0.0/0multiport dports 6800:7300

Chain CEPH includes addresses for monitors and OSDs.
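
For reference, the msgr v2 port 3300 does not appear in the rules above; a
minimal rule set covering both monitor protocols plus the OSD/MGR range might
look like this (illustrative only, the CEPH chain may already cover it):

iptables -A INPUT -p tcp -m multiport --dports 3300,6789 -j ACCEPT
iptables -A INPUT -p tcp -m multiport --dports 6800:7300 -j ACCEPT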

Mon, 27 Jul 2020 at 17:07, Dino Godor :

> Hi,
>
> have you tried to locally connect to the ports with netcat (or telnet)?
>
> Is the 

[ceph-users] Re: Fwd: BlueFS assertion ceph_assert(h->file->fnode.ino != 1)

2020-07-27 Thread Igor Fedotov

Hi Alexei,

just left a comment in the ticket...


Thanks,

Igor

On 7/25/2020 3:31 PM, Aleksei Zakharov wrote:

Hi all,
I wonder if someone else faced the issue described on the tracker: 
https://tracker.ceph.com/issues/45519
We thought that this problem was caused by high OSD fragmentation, until
today: now even OSDs with a fragmentation rating < 0.3 are affected. We don't
use a separate DB/WAL partition in this setup, and lines like these shortly
before the failure:
2020-07-25 11:08:22.961 7f6f489d5700  1 bluefs _allocate failed to 
allocate 0x33dd4c5 on bdev 1, free 0x2bc; fallback to bdev 2
2020-07-25 11:08:22.961 7f6f489d5700  1 bluefs _allocate unable to 
allocate 0x33dd4c5 on bdev 2, free 0x; fallback to 
slow device expander

look suspicious to us.
We use 4KiB bluefs and bluestore block sizes and store objects of ~1KiB size,
and it looks like this makes the issue reproduce much more frequently. But, as
far as I can see on the tracker / telegram channels, different people run into
it from time to time, for example: https://paste.ubuntu.com/p/GDCXDrnrtX/
(telegram link https://t.me/ceph_users/376)
Was anyone able to identify the root cause and/or find a workaround for it?
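
As a side note, the fragmentation rating mentioned above can be read from the
OSD admin socket roughly like this, assuming a release that has the allocator
scoring commands (the OSD id is illustrative):

ceph daemon osd.12 bluestore allocator score block
ceph daemon osd.12 bluestore allocator dump block | head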
BTW, ceph would be a nice small-object storage, showing 300-500 usec latency,
if not for this issue and this one: https://tracker.ceph.com/issues/45765

--
Regards,
Aleksei Zakharov

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cluster became unresponsive: e5 handle_auth_request failed to assign global_id

2020-07-27 Thread Dino Godor

Hi,

have you tried to locally connect to the ports with netcat (or telnet)?

Is the process listening ? (something like netstat -4ln or the current 
equivalent thereof)


Is the old (or new) firewall maybe still running?


On 27.07.20 16:00, Илья Борисович Волошин wrote:

Hello,

I've created an Octopus 15.2.4 cluster with 3 monitors and 3 OSDs (6 hosts
in total, all ESXi VMs). It lived through a couple of reboots without
problem, then I've reconfigured the main host a bit:
set iptables-legacy as current option in update-alternatives (this is a
Debian10 system), applied a basic ruleset of iptables and restarted docker.

After that the cluster became unresponsive (any ceph command hangs
indefinitely). I can use admin socket to manipulate config though. Setting
debug_ms to 5 I see this in the logs (timestamps cut for readability):

7f4096f41700  5 --2- [v2::3300/0,v1::6789/0] >>
[v2::3300/0,v1::6789/0] conn(0x55c21b975800
0x55c21ab45180 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rx=0 tx=
0).send_message enqueueing message m=0x55c21bd84a00 type=67 mon_probe(probe
e30397f0-cc32-11ea-8c8e-000c29469cd5 name mon1 mon_release octopus) v7
7f4098744700  1 --  >>
[v2::6800/561959008,v1::6801/561959008]
conn(0x55c21b974400 msgr2=0x55c21ab45600 unknown :-1 s=STATE_CONNECTING_RE
l=0).process reconnect failed to v2:81.200.2
.152:6800/561959008
7f4098744700  2 --  >>
[v2::6800/561959008,v1::6801/561959008]
conn(0x55c21b974400 msgr2=0x55c21ab45600 unknown :-1 s=STATE_CONNECTING_RE
l=0).process connection refused!

and this:

7f4098744700  2 --2- [v2::3300/0,v1::6789/0] >>
  conn(0x55c21ba38c00 0x55c21bcc5a80 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0
l=1 rx=0 tx=0)._fault on lossy channel, failing
7f4098744700  1 --2- [v2::3300/0,v1::6789/0] >>
  conn(0x55c21ba38c00 0x55c21bcc5a80 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0
l=1 rx=0 tx=0).stop
7f4098744700  5 --2- [v2::3300/0,v1::6789/0] >>
  conn(0x55c21ba38c00 0x55c21bcc5a80 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0
l=1 rx=0 tx=0).reset_recv_state
7f4098744700  5 --2- [v2::3300/0,v1::6789/0] >>
  conn(0x55c21ba38c00 0x55c21bcc5a80 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0
l=1 rx=0 tx=0).reset_security
7f409373a700  1 --2- [v2::3300/0,v1::6789/0] >>
  conn(0x55c21c0d2800 0x55c21bcc3f80 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0
tx=0).accept
7f4098744700  1 --2- [v2::3300/0,v1::6789/0] >>
  conn(0x55c21c0d2800 0x55c21bcc3f80 unknown :-1 s=BANNER_ACCEPTING pgs=0
cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
7f4098744700  5 --2- [v2::3300/0,v1::6789/0] >>
  conn(0x55c21c0d2800 0x55c21bcc3f80 unknown :-1 s=HELLO_ACCEPTING pgs=0
cs=0 l=0 rx=0 tx=0).handle_hello received hello: peer_type=8
peer_addr_for_me=v2::3300/0
7f4098744700  5 --2- [v2::3300/0,v1::6789/0] >>
  conn(0x55c21c0d2800 0x55c21bcc3f80 unknown :-1 s=HELLO_ACCEPTING pgs=0
cs=0 l=0 rx=0 tx=0).handle_hello getsockname says I am :3300 when
talking to v2::49012/0
7f4098744700  1 mon.mon1@0(probing) e5 handle_auth_request failed to assign
global_id

Config (the result of ceph --admin-daemon
/run/ceph/e30397f0-cc32-11ea-8c8e-000c29469cd5/ceph-mon.mon1.asok config
show):
https://pastebin.com/kifMXs9H

I can connect to ports 3300 and 6789 with telnet; 6800 and 6801 return
'process connection refused'

Setting all iptables policies to ACCEPT didn't change anything.

Where should I start digging to fix this problem? I'd like to at least
understand why this happened before putting the cluster into production.
Any help is appreciated.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Cluster became unresponsive: e5 handle_auth_request failed to assign global_id

2020-07-27 Thread Илья Борисович Волошин
Hello,

I've created an Octopus 15.2.4 cluster with 3 monitors and 3 OSDs (6 hosts
in total, all ESXi VMs). It lived through a couple of reboots without
problem, then I've reconfigured the main host a bit:
set iptables-legacy as current option in update-alternatives (this is a
Debian10 system), applied a basic ruleset of iptables and restarted docker.

After that the cluster became unresponsive (any ceph command hangs
indefinitely). I can use admin socket to manipulate config though. Setting
debug_ms to 5 I see this in the logs (timestamps cut for readability):

7f4096f41700  5 --2- [v2::3300/0,v1::6789/0] >>
[v2::3300/0,v1::6789/0] conn(0x55c21b975800
0x55c21ab45180 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rx=0 tx=
0).send_message enqueueing message m=0x55c21bd84a00 type=67 mon_probe(probe
e30397f0-cc32-11ea-8c8e-000c29469cd5 name mon1 mon_release octopus) v7
7f4098744700  1 --  >>
[v2::6800/561959008,v1::6801/561959008]
conn(0x55c21b974400 msgr2=0x55c21ab45600 unknown :-1 s=STATE_CONNECTING_RE
l=0).process reconnect failed to v2:81.200.2
.152:6800/561959008
7f4098744700  2 --  >>
[v2::6800/561959008,v1::6801/561959008]
conn(0x55c21b974400 msgr2=0x55c21ab45600 unknown :-1 s=STATE_CONNECTING_RE
l=0).process connection refused!

and this:

7f4098744700  2 --2- [v2::3300/0,v1::6789/0] >>
 conn(0x55c21ba38c00 0x55c21bcc5a80 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0
l=1 rx=0 tx=0)._fault on lossy channel, failing
7f4098744700  1 --2- [v2::3300/0,v1::6789/0] >>
 conn(0x55c21ba38c00 0x55c21bcc5a80 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0
l=1 rx=0 tx=0).stop
7f4098744700  5 --2- [v2::3300/0,v1::6789/0] >>
 conn(0x55c21ba38c00 0x55c21bcc5a80 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0
l=1 rx=0 tx=0).reset_recv_state
7f4098744700  5 --2- [v2::3300/0,v1::6789/0] >>
 conn(0x55c21ba38c00 0x55c21bcc5a80 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0
l=1 rx=0 tx=0).reset_security
7f409373a700  1 --2- [v2::3300/0,v1::6789/0] >>
 conn(0x55c21c0d2800 0x55c21bcc3f80 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0
tx=0).accept
7f4098744700  1 --2- [v2::3300/0,v1::6789/0] >>
 conn(0x55c21c0d2800 0x55c21bcc3f80 unknown :-1 s=BANNER_ACCEPTING pgs=0
cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
7f4098744700  5 --2- [v2::3300/0,v1::6789/0] >>
 conn(0x55c21c0d2800 0x55c21bcc3f80 unknown :-1 s=HELLO_ACCEPTING pgs=0
cs=0 l=0 rx=0 tx=0).handle_hello received hello: peer_type=8
peer_addr_for_me=v2::3300/0
7f4098744700  5 --2- [v2::3300/0,v1::6789/0] >>
 conn(0x55c21c0d2800 0x55c21bcc3f80 unknown :-1 s=HELLO_ACCEPTING pgs=0
cs=0 l=0 rx=0 tx=0).handle_hello getsockname says I am :3300 when
talking to v2::49012/0
7f4098744700  1 mon.mon1@0(probing) e5 handle_auth_request failed to assign
global_id

Config (the result of ceph --admin-daemon
/run/ceph/e30397f0-cc32-11ea-8c8e-000c29469cd5/ceph-mon.mon1.asok config
show):
https://pastebin.com/kifMXs9H

I can connect to ports 3300 and 6789 with telnet; 6800 and 6801 return
'process connection refused'

Setting all iptables policies to ACCEPT didn't change anything.

Where should I start digging to fix this problem? I'd like to at least
understand why this happened before putting the cluster into production.
Any help is appreciated.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] snaptrim blocks IO on ceph nautilus

2020-07-27 Thread Manuel Lausch
Hi,

for some days now I have been trying to debug a problem with snaptrimming under
Nautilus.

I have a cluster running Nautilus (v14.2.10), 44 nodes with 24 OSDs of 14 TB each.
I create a snapshot every day and keep them for 7 days.

Every time the old snapshot is deleted I get bad IO performance and blocked
requests for several seconds until the snaptrim is done.
Settings like snaptrim_sleep and osd_pg_max_concurrent_snap_trims don't affect
this behavior.

In the debug_osd 10/10 log I see the following:

2020-07-27 11:45:49.976 7fd8b8404700 10 osd.411 22457 dequeue_op 0x557886edda20 
prio 196 cost 0 latency 0.019545 osd_repop_reply(client.22731418.0:615257 3.636 
e22457/22372) v2 pg pg[3.636( v 22457'100855 (21737'97756,22457'100855] 
local-lis/les=22372/22374 n=27762 ec=2842/2839 lis/c 22372/22372 les/c/f 
22374/22374/0 22372/22372/22343) [411,36,956,763] r=0 lpr=22372 
luod=22457'100854 crt=22457'100855 lcod 22457'100853 mlcod 22457'100853 
active+clean+snaptrim_wait trimq=[1d~1]]
2020-07-27 11:45:49.976 7fd8b8404700 10 osd.411 22457 dequeue_op 0x557886edda20 
finish
2020-07-27 11:45:49.976 7fd8b8404700 10 osd.411 22457 dequeue_op 0x557886edc2c0 
prio 127 cost 0 latency 0.043165 MOSDScrubReserve(2.2645 RELEASE e22457) v1 pg 
pg[2.2645( empty local-lis/les=22359/22364 n=0 ec=2403/2403 lis/c 22359/22359 
les/c/f 22364/22367/0 22359/22359/22359) [379,411,884,975] r=1 lpr=22359 
crt=0'0 active mbc={}]
2020-07-27 11:45:49.976 7fd8b8404700 10 osd.411 22457 dequeue_op 0x557886edc2c0 
finish
2020-07-27 11:45:50.039 7fd8b8404700 10 osd.411 pg_epoch: 22457 pg[3.278e( v 
22457'99491 (21594'96426,22457'99491] local-lis/les=22359/22362 n=27669 
ec=2859/2839 lis/c 22359/22359 les/c/f 22362/22365/0 22359/22359/22343) 
[411,379,848,924] r=0 lpr=22359 crt=22457'99491 lcod 22457'99489 mlcod 
22457'99489 active+clean+snaptrim trimq=[1d~1]] snap_trimmer posting
2020-07-27 11:45:57.801 7fd8b8404700 10 osd.411 pg_epoch: 22457 pg[3.278e( v 
22457'99493 (21594'96426,22457'99493] local-lis/les=22359/22362 n=27669 
ec=2859/2839 lis/c 22359/22359 les/c/f 22362/22365/0 22359/22359/22343) 
[411,379,848,924] r=0 lpr=22359 luod=22457'99491 crt=22457'99493 lcod 
22457'99489 mlcod 22457'99489 active+clean+snaptrim trimq=[1d~1]] snap_trimmer 
complete
2020-07-27 11:45:57.801 7fd8b8404700 10 osd.411 22457 dequeue_op 0x557880ac3760 
prio 127 cost 663 latency 7.761823 osd_repop(osd.217.0:3025 3.1ca5 
e22457/22378) v2 pg pg[3.1ca5( v 22457'100370 (21716'97357,22457'100370] 
local-lis/les=22378/22379 n=27532 ec=2855/2839 lis/c 22378/22378 les/c/f 
22379/22379/0 22378/22378/22378) [217,411,551,1055] r=1 lpr=22378 luod=0'0 
lua=22294'16 crt=22457'100370 lcod 22457'100369 active mbc={}]
2020-07-27 11:45:57.801 7fd8b8404700 10 osd.411 22457 dequeue_op 0x557880ac3760 
finish
2020-07-27 11:45:57.801 7fd8b8404700 10 osd.411 22457 dequeue_op 0x5578813e1e40 
prio 127 cost 0 latency 7.494296 MOSDScrubReserve(2.37e2 REQUEST e22457) v1 pg 
pg[2.37e2( empty local-lis/les=22355/22356 n=0 ec=2412/2412 lis/c 22355/22355 
les/c/f 22356/22356/0 22355/22355/22355) [245,411,834,768] r=1 lpr=22355 
crt=0'0 active mbc={}]
2020-07-27 11:45:57.801 7fd8b8404700 10 osd.411 22457 dequeue_op 0x5578813e1e40 
finish

The dequeueing of ops works without pauses until the „snap_trimmer posting“ and
„snap_trimmer complete“ log lines. In this example that task takes about 7
seconds, and the operations dequeued after it now have a latency of roughly
that long.

I tried to drill down into this in the code (developers, please chime in here).
It seems that the PG is locked for every operation.
The snap_trimmer posting and complete messages come from „osd/PrimaryLogPG.cc“
around line 4700. This indicates to me that the process of deleting a snapshot
object can sometimes take quite some time.

After further poking around, I see that the method
„SnapMapper::get_next_objects_to_trim“ in „osd/SnapMapper.cc“ takes several
seconds to finish. I followed this further into „common/map_cacher.hpp“, line
94: „int r = driver->get_next(key, );“
From there I lost the trail.

The slowness does not hit all OSDs at the same time. Sometimes a few OSDs are
affected, sometimes some others. Restarting an OSD does not help.
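
For reference, the knobs and the per-OSD inspection mentioned above can be done
at runtime roughly like this (OSD id and values are illustrative, not a
recommendation):

ceph pg dump pgs_brief 2>/dev/null | grep snaptrim      # PGs trimming or queued
ceph daemon osd.411 dump_ops_in_flight                  # on the OSD's host
ceph daemon osd.411 dump_historic_slow_ops
ceph tell 'osd.*' injectargs '--osd_snap_trim_sleep 2 --osd_pg_max_concurrent_snap_trims 1'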

With Luminous and filestore, snapshot deletion was not an issue at all.
With Nautilus and bluestore this is not acceptable for my use case.

I don't know yet whether this is a bluestore-specific problem or some general
issue.
I wonder a bit why there seem to be no others who have this problem.


Regards
Manuel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Push config to all hosts

2020-07-27 Thread Ricardo Marques
Hi Cem,

Since https://github.com/ceph/ceph/pull/35576 you will be able to tell cephadm
to keep your `/etc/ceph/ceph.conf` updated on all hosts by running:

# ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf true

But this feature was not released yet, so you will have to wait for v15.2.5.
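
Until then, a manual equivalent could be something along these lines (host
names are placeholders):

ceph config generate-minimal-conf > /etc/ceph/ceph.conf
for h in host1 host2 host3; do
    scp /etc/ceph/ceph.conf root@$h:/etc/ceph/ceph.conf
done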


Ricardo Marques


From: Cem Zafer 
Sent: Monday, June 29, 2020 6:37 AM
To: ceph-users@ceph.io 
Subject: [ceph-users] Push config to all hosts

Hi,
What is the best method(s) to push ceph.conf to all hosts in octopus (15.x)?
Thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mimic: much more raw used than reported

2020-07-27 Thread Igor Fedotov

Frank,

suggest to start with perf counter analysis as per the second part of my 
previous email...



Thanks,

Igor

On 7/27/2020 2:30 PM, Frank Schilder wrote:

Hi Igor,

thanks for your answer. I was thinking about that, but as far as I understood, 
to hit this bug actually requires a partial rewrite to happen. However, these 
are disk images in storage servers with basically static files, many of which are
very large (15GB). Therefore, I believe, the vast majority of objects are
written to only once and should not be affected by the amplification bug.

Is there any way to confirm or rule that out, or to check how much amplification is
happening?

I'm wondering if I might be observing something else. Since "ceph osd df tree" 
does report the actual utilization and I have only one pool on these OSDs, there is no 
problem with accounting allocated storage to a pool. I know it's all used by this one
pool. I'm more wondering whether it's not the known amplification but something else (at least
partly) that plays a role here.

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Igor Fedotov 
Sent: 27 July 2020 12:54:02
To: Frank Schilder; ceph-users
Subject: Re: [ceph-users] mimic: much more raw used than reported

Hi Frank,

you might be being hit by https://tracker.ceph.com/issues/44213

In short the root causes are  significant space overhead due to high
bluestore allocation unit (64K) and EC overwrite design.

This is fixed for upcoming Pacific release by using 4K alloc unit but it
is unlikely to be backported to earlier releases due to its complexity.
To say nothing about the need for OSD redeployment. Hence please expect
no fix for mimic.


And your raw usage reports might still be not that good since mimic
lacks per-pool stats collection https://github.com/ceph/ceph/pull/19454.
I.e. your actual raw space usage is higher than reported. To estimate
proper raw usage one can use bluestore perf counters (namely
bluestore_stored and bluestore_allocated). Summing bluestore_allocated
over all involved OSDs will give actual RAW usage. Summing
bluestore_stored will provide actual data volume after EC processing,
i.e. presumably it should be around 158TiB.


Thanks,

Igor

On 7/26/2020 8:43 PM, Frank Schilder wrote:

Dear fellow cephers,

I observe a weird problem on our mimic-13.2.8 cluster. We have an EC RBD pool backed by
HDDs. These disks are not in any other pool. I noticed that the total capacity (=USED+MAX 
AVAIL) reported by "ceph df detail" has shrunk recently from 300TiB to 200TiB. 
Part but by no means all of this can be explained by imbalance of the data distribution.

When I compare the output of "ceph df detail" and "ceph osd df tree", I find 
69TiB raw capacity used but not accounted for; see calculations below. These 69TiB raw are 
equivalent to 20% usable capacity and I really need it back. Together with the imbalance, we lose
about 30% capacity.

What is using these extra 69TiB and how can I get it back?


Some findings:

These are the 5 largest images in the pool, accounting for a total of 97TiB out 
of 119TiB usage:

# rbd du :
NAMEPROVISIONED   USED
one-133  25 TiB 14 TiB
NAMEPROVISIONEDUSED
one-153@222  40 TiB  14 TiB
one-153@228  40 TiB 357 GiB
one-153@235  40 TiB 797 GiB
one-153@241  40 TiB 509 GiB
one-153@242  40 TiB  43 GiB
one-153@243  40 TiB  16 MiB
one-153@244  40 TiB  16 MiB
one-153@245  40 TiB 324 MiB
one-153@246  40 TiB 276 MiB
one-153@247  40 TiB  96 MiB
one-153@248  40 TiB 138 GiB
one-153@249  40 TiB 1.8 GiB
one-153@250  40 TiB 0 B
one-153  40 TiB 204 MiB
  40 TiB  16 TiB
NAME   PROVISIONEDUSED
one-391@3   40 TiB 432 MiB
one-391@9   40 TiB  26 GiB
one-391@15  40 TiB  90 GiB
one-391@16  40 TiB 0 B
one-391@17  40 TiB 0 B
one-391@18  40 TiB 0 B
one-391@19  40 TiB 0 B
one-391@20  40 TiB 3.5 TiB
one-391@21  40 TiB 5.4 TiB
one-391@22  40 TiB 5.8 TiB
one-391@23  40 TiB 8.4 TiB
one-391@24  40 TiB 1.4 TiB
one-391 40 TiB 2.2 TiB
 40 TiB  27 TiB
NAME   PROVISIONEDUSED
one-394@3   70 TiB 1.4 TiB
one-394@9   70 TiB 2.5 TiB
one-394@15  70 TiB  20 GiB
one-394@16  70 TiB 0 B
one-394@17  70 TiB 0 B
one-394@18  70 TiB 0 B
one-394@19  70 TiB 383 GiB
one-394@20  70 TiB 3.3 TiB
one-394@21  70 TiB 5.0 TiB
one-394@22  70 TiB 5.0 TiB
one-394@23  70 TiB 9.0 TiB
one-394@24  70 TiB 1.6 TiB
one-394 70 TiB 2.5 TiB
 70 TiB  31 TiB
NAMEPROVISIONEDUSED
one-434  25 TiB 9.1 TiB

The large 70TiB images one-391 and one-394 are currently copied to with ca. 
5TiB per day.

Output of "ceph df detail" with some columns removed:

NAME ID USED%USED MAX AVAIL OBJECTS 
 RAW USED
sr-rbd-data-one-hdd  11 119 TiB 58.45  

[ceph-users] Re: please help me fix iSCSI Targets not available

2020-07-27 Thread Ricardo Marques
Hi David, which ceph version are you using?

From: David Thuong 
Sent: Wednesday, July 22, 2020 10:45 AM
To: ceph-users@ceph.io 
Subject: [ceph-users] please help me fix iSCSI Targets not available

iSCSI Targets not available
Please consult the documentation on how to configure and enable the iSCSI 
Targets management functionality.
Available information:
There are no gateways defined

Any idea how to enable it? Thanks so much.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mimic: much more raw used than reported

2020-07-27 Thread Igor Fedotov

Hi Frank,

you might be being hit by https://tracker.ceph.com/issues/44213

In short the root causes are  significant space overhead due to high 
bluestore allocation unit (64K) and EC overwrite design.


This is fixed for upcoming Pacific release by using 4K alloc unit but it 
is unlikely to be backported to earlier releases due to its complexity. 
To say nothing about the need for OSD redeployment. Hence please expect 
no fix for mimic.



And your raw usage reports might still be not that good since mimic 
lacks per-pool stats collection https://github.com/ceph/ceph/pull/19454. 
I.e. your actual raw space usage is higher than reported. To estimate 
proper raw usage one can use bluestore perf counters (namely 
bluestore_stored and bluestore_allocated). Summing bluestore_allocated 
over all involved OSDs will give actual RAW usage. Summing 
bluestore_stored will provide actual data volume after EC processing, 
i.e. presumably it should be around 158TiB.



Thanks,

Igor

On 7/26/2020 8:43 PM, Frank Schilder wrote:

Dear fellow cephers,

I observe a weird problem on our mimic-13.2.8 cluster. We have an EC RBD pool backed by
HDDs. These disks are not in any other pool. I noticed that the total capacity (=USED+MAX 
AVAIL) reported by "ceph df detail" has shrunk recently from 300TiB to 200TiB. 
Part but by no means all of this can be explained by imbalance of the data distribution.

When I compare the output of "ceph df detail" and "ceph osd df tree", I find 
69TiB raw capacity used but not accounted for; see calculations below. These 69TiB raw are 
equivalent to 20% usable capacity and I really need it back. Together with the imbalance, we lose
about 30% capacity.

What is using these extra 69TiB and how can I get it back?


Some findings:

These are the 5 largest images in the pool, accounting for a total of 97TiB out 
of 119TiB usage:

# rbd du :
NAMEPROVISIONED   USED
one-133  25 TiB 14 TiB
NAMEPROVISIONEDUSED
one-153@222  40 TiB  14 TiB
one-153@228  40 TiB 357 GiB
one-153@235  40 TiB 797 GiB
one-153@241  40 TiB 509 GiB
one-153@242  40 TiB  43 GiB
one-153@243  40 TiB  16 MiB
one-153@244  40 TiB  16 MiB
one-153@245  40 TiB 324 MiB
one-153@246  40 TiB 276 MiB
one-153@247  40 TiB  96 MiB
one-153@248  40 TiB 138 GiB
one-153@249  40 TiB 1.8 GiB
one-153@250  40 TiB 0 B
one-153  40 TiB 204 MiB
  40 TiB  16 TiB
NAME   PROVISIONEDUSED
one-391@3   40 TiB 432 MiB
one-391@9   40 TiB  26 GiB
one-391@15  40 TiB  90 GiB
one-391@16  40 TiB 0 B
one-391@17  40 TiB 0 B
one-391@18  40 TiB 0 B
one-391@19  40 TiB 0 B
one-391@20  40 TiB 3.5 TiB
one-391@21  40 TiB 5.4 TiB
one-391@22  40 TiB 5.8 TiB
one-391@23  40 TiB 8.4 TiB
one-391@24  40 TiB 1.4 TiB
one-391 40 TiB 2.2 TiB
 40 TiB  27 TiB
NAME   PROVISIONEDUSED
one-394@3   70 TiB 1.4 TiB
one-394@9   70 TiB 2.5 TiB
one-394@15  70 TiB  20 GiB
one-394@16  70 TiB 0 B
one-394@17  70 TiB 0 B
one-394@18  70 TiB 0 B
one-394@19  70 TiB 383 GiB
one-394@20  70 TiB 3.3 TiB
one-394@21  70 TiB 5.0 TiB
one-394@22  70 TiB 5.0 TiB
one-394@23  70 TiB 9.0 TiB
one-394@24  70 TiB 1.6 TiB
one-394 70 TiB 2.5 TiB
 70 TiB  31 TiB
NAMEPROVISIONEDUSED
one-434  25 TiB 9.1 TiB

The large 70TiB images one-391 and one-394 are currently copied to with ca. 
5TiB per day.

Output of "ceph df detail" with some columns removed:

NAME                 ID  USED     %USED  MAX AVAIL  OBJECTS   RAW USED
sr-rbd-data-one-hdd  11  119 TiB  58.45  84 TiB     31286554  158 TiB

Pool is EC 6+2.
USED is correct: 31286554*4MiB=119TiB.
RAW USED is correct: 119*8/6=158TiB.
Most of this data is freshly copied onto large RBD images.
Compression is enabled on this pool (aggressive,snappy).

However, when looking at "ceph osd df tree", I get

The combined raw capacity of OSDs backing this pool is 406.8TiB (sum over SIZE).
Summing up column USE over all OSDs gives 227.5TiB.

This gives a difference of 69TiB (=227-158) that is not accounted for.

Here is the output of "ceph osd df tree", limited to the drives backing the pool:

ID   CLASSWEIGHT REWEIGHT SIZEUSE DATAOMAPMETA 
AVAIL   %USE  VAR  PGS TYPE NAME
   84  hdd8.90999  1.0 8.9 TiB 5.0 TiB 5.0 TiB 180 MiB   16 GiB 3.9 
TiB 56.43 1.72 103 osd.84
  145  hdd8.90999  1.0 8.9 TiB 4.6 TiB 4.6 TiB 144 MiB   14 GiB 4.3 
TiB 51.37 1.57  87 osd.145
  156  hdd8.90999  1.0 8.9 TiB 5.2 TiB 5.1 TiB 173 MiB   16 GiB 3.8 
TiB 57.91 1.77 100 osd.156
  168  hdd8.90999  1.0 8.9 TiB 5.0 TiB 5.0 TiB 164 MiB   16 GiB 3.9 
TiB 56.31 1.72  98 osd.168
  181  hdd8.90999  1.0 8.9 TiB 5.5 TiB 5.4 TiB 121 MiB   17 GiB 3.5 
TiB 

[ceph-users] cache tier dirty status

2020-07-27 Thread Budai Laszlo
Hello all,

is there a way to interrogate a cache tier pool about the number of dirty 
objects/bytes that it contains?

Thank you,
Laszlo
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Reinitialize rgw garbage collector

2020-07-27 Thread Michael Bisig
 Hi all,

I have a question about the garbage collector within RGWs. We run Nautilus
14.2.8 and we have 32 garbage objects in the gc pool with a total of 39 GB of
garbage that needs to be processed.
When we run,

  radosgw-admin gc process --include-all

objects are processed but most of them are not actually deleted. This can be
checked by using --debug-rgw=5 on the command and stat'ing the objects that are
reported as processed. The monitoring also doesn't show a huge amount of
objects being deleted by the gc, so I assume that it doesn't actually delete
the objects. Might this be due to a renewed time stamp? (not sure about this)
Has anybody had similar issues with removing a large amount of garbage, and is
there a way to get the gc to delete the objects?
Most of the objects within the gc list are __multipart__ objects. Are they
processed differently from single-part objects, e.g. are all the multiparts
collected before the deletion actually happens, or how is this implemented?
The garbage keeps increasing and the gc cannot process it, which scares us a
bit. Also, we cannot bypass the gc because the bucket is still in use.

I also thought about reinitializing the GC in order to get an up-to-date list
of garbage (some entries shown by `radosgw-admin gc list --include-all` are
over a month old). Is there a way to make this happen, and how safe is it?
I thought about exporting the omap objects from the gc pool (as a backup) and
deleting the objects within the pool (or renaming the pool).
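
For reference, the gc entries live as omap on the gc.0 .. gc.31 objects in the
zone's log pool under the "gc" namespace, so inspecting or backing them up
could look roughly like this (the pool name is the default and may differ):

rados -p default.rgw.log --namespace gc ls
rados -p default.rgw.log --namespace gc listomapkeys gc.0
rados -p default.rgw.log --namespace gc listomapvals gc.0 > gc.0.omap.backup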

I appreciate any input and thank you in advance.

Regards,
Michael

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io