Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-08 Thread Brad Hubbard
On Fri, Oct 4, 2019 at 6:09 PM Marc Roos  wrote:
>
>  >
>  >Try something like the following on each OSD that holds a copy of
>  >rbd_data.1f114174b0dc51.0974 and see what output you get.
>  >Note that you can drop the bluestore flag if they are not bluestore
>  >osds and you will need the osd stopped at the time (set noout). Also
>  >note, snapids are displayed in hexadecimal in the output (but then '4'
>  >is '4', so not a big issue here).
>  >
>  >$ ceph-objectstore-tool --type bluestore --data-path
>  >/var/lib/ceph/osd/ceph-XX/ --pgid 17.36 --op list
>  >rbd_data.1f114174b0dc51.0974
>
> I got these results
>
> osd.7
> Error getting attr on : 17.36_head,#-19:6c00:::scrub_17.36:head#, (61) No data available
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]

Ah, so of course the problem is the snapshot is missing. You may need
to try something like the following on each of those osds.

$ ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-XX/ \
    --pgid 17.36 \
    '{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}' \
    remove-clone-metadata 4
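
Roughly, the whole sequence per osd would look something like this (osd.7 used
as an example; adjust the osd id, data path and systemd unit name to match your
deployment, and the pg repair at the end is only needed once all copies agree):

$ ceph osd set noout
$ systemctl stop ceph-osd@7
$ ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-7/ \
    --pgid 17.36 \
    '{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}' \
    remove-clone-metadata 4
$ systemctl start ceph-osd@7
$ ceph osd unset noout
$ ceph pg repair 17.36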

>
> osd.12
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
> osd.29
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
>
>  >
>  >The likely issue here is the primary believes snapshot 4 is gone but
>  >there is still data and/or metadata on one of the replicas which is
>  >confusing the issue. If that is the case you can use the
>  >ceph-objectstore-tool to delete the relevant snapshot(s).
>  >



-- 
Cheers,
Brad



[ceph-users] Space reclamation after rgw pool removal

2019-10-08 Thread George Shuklin

Hello.

I've created an rgw installation and uploaded about 60M files into a
single bucket. Removing them looked like it would be a long adventure, so I
"ceph osd pool rm'ed" both default.rgw.data and default.rgw.index.


Now I have this:

# rados lspools
.rgw.root
default.rgw.control
default.rgw.meta
default.rgw.log

(same as for ceph osd pool ls)

but ceph -s shows:

    pools:   6 pools, 256 pgs

Moreover, the TOTAL line of ceph osd df shows: (raw) 5.5 TiB, (use) 3.6 TiB,
(data) 3.4 TiB, (omap) 35 GiB, (meta) 86 GiB, (avail) 1.9 TiB, (%use) 65.36.


I tried to force a deep scrub on all OSDs but this didn't help.

Currently I have only a few tiny bits of data in all the other pools, and I
don't understand where the space has gone.


The installation is a fresh Nautilus, BlueStore over HDD.


A few questions:

1. What is this space called? Lost? Not yet garbage-collected? Cached?

2. Is it normal for lspools to report a different number of pools than the
total number shown by ceph -s?

3. Where can I continue debugging this?

4. (And of course) how do I fix this?

Thanks!



[ceph-users] Ceph Negative Objects Number

2019-10-08 Thread Lazuardi Nasution
Hi,

I get the following weird negative object counts on the tiering pools. Why is
this happening? How do I get back to normal?

Best regards,

[root@management-a ~]# ceph df detail
GLOBAL:
    SIZE  AVAIL  RAW USED  %RAW USED  OBJECTS
    446T  184T   261T      58.62      22092k
POOLS:
    NAME             ID  CATEGORY  QUOTA OBJECTS  QUOTA BYTES  USED    %USED  MAX AVAIL  OBJECTS    DIRTY   READ    WRITE   RAW USED
    rbd              1   -         N/A            N/A          0       0      25838G     0          0       0       1       0
    volumes          2   -         N/A            N/A          82647G  76.18  25838G     21177891   20681k  5897M   2447M   242T
    images           3   -         N/A            N/A          3683G   12.48  25838G     705881     689k    37844k  10630k  11049G
    backups          4   -         N/A            N/A          0       0      25838G     0          0       0       0       0
    vms              5   -         N/A            N/A          3003G   10.41  25838G     772845     754k    623M    812M    9010G
    rbd_tiering      11  -         N/A            N/A          333     0      3492G      4          0       1       2       999
    volumes_tiering  12  -         N/A            N/A          9761M   0      3492G      -1233338           2340M   1982M   0
    images_tiering   13  -         N/A            N/A          293k    0      3492G      129        0       34642k  3600k   880k
    backups_tiering  14  -         N/A            N/A          83      0      3492G      1          0       2       2       249
    vms_tiering      15  -         N/A            N/A          2758M   0      3492G      -32567116          31942M  2875M   0


Re: [ceph-users] cephfs 1 large omap objects

2019-10-08 Thread Paul Emmerich
Hi,

the default for this warning changed recently (see other similar
threads on the mailing list); it was 2 million before 14.2.3.

I don't think the new default of 200k is a good choice, so increasing
it is a reasonable work-around.
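
If you don't want to wait for the next scheduled deep scrub after raising it,
something along these lines should clear the warning (the threshold value and
pg id are placeholders; the pg id comes from the "Large omap object found"
lines in the cluster log):

$ ceph config set osd osd_deep_scrub_large_omap_object_key_threshold <new-threshold>
$ ceph config get osd osd_deep_scrub_large_omap_object_key_threshold
$ ceph pg deep-scrub <pgid>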

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Mon, Oct 7, 2019 at 3:37 AM Nigel Williams
 wrote:
>
> I've adjusted the threshold:
>
> ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 35
>
> Colleague suggested that this will take effect on the next deep-scrub.
>
> Is the default of 200,000 too small? Will this be adjusted in future
> releases, or is it meant to be adjusted in some use-cases?


Re: [ceph-users] ceph stats on the logs

2019-10-08 Thread Eugen Block

Hi,

there is also /var/log/ceph/ceph.log on the MONs; it has the stats
you're asking for. Does this answer your question?
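
If they are being written there, something along these lines should show them
(just a plain grep, nothing ceph-specific):

$ grep 'pgmap v' /var/log/ceph/ceph.log | tail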


Regards,
Eugen


Zitat von nokia ceph :


Hi Team,

With default log settings, the ceph stats are logged like:
cluster [INF] pgmap v30410386: 8192 pgs: 8192 active+clean; 445 TB data,
1339 TB used, 852 TB / 2191 TB avail; 188 kB/s rd, 217 MB/s wr, 1618 op/s
 Jewel: in the mon logs
 Nautilus: in the mgr logs
 Luminous: not able to view similar logs in either the mon or mgr logs. What
log level needs to be set to get these stats into the logs?

Thanks,
Muthu






[ceph-users] ceph stats on the logs

2019-10-08 Thread nokia ceph
Hi Team,

With default log settings, the ceph stats are logged like:
cluster [INF] pgmap v30410386: 8192 pgs: 8192 active+clean; 445 TB data,
1339 TB used, 852 TB / 2191 TB avail; 188 kB/s rd, 217 MB/s wr, 1618 op/s
 Jewel: in the mon logs
 Nautilus: in the mgr logs
 Luminous: not able to view similar logs in either the mon or mgr logs. What
log level needs to be set to get these stats into the logs?

Thanks,
Muthu


Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-08 Thread Philippe D'Anjou
Hi, unfortunately it's a single mon, because we had a major outage on this cluster
and it's just being used to copy off data now. We weren't able to add more mons
because once a second mon was added it crashed the first one (there's a bug
tracker ticket).
I still have the old rocksdb files from before I ran a repair on it, but it had the
rocksdb corruption issue (not sure why that happened, it had run fine for 2 months
up to now).
Any options? I mean everything still works, data is accessible, RBDs run, only the
cephfs mount is obviously not working. For the short amount of time the mon stays
up it reports no issues and all commands run fine.

On Monday, October 7, 2019, 21:59:20 (EEST), Gregory Farnum wrote:

 On Sun, Oct 6, 2019 at 1:08 AM Philippe D'Anjou
 wrote:
>
> I had to use rocksdb repair tool before because the rocksdb files got 
> corrupted, for another reason (another bug possibly). Maybe that is why now 
> it crash loops, although it ran fine for a day.

Yeah looks like it lost a bit of data. :/

> What is meant with "turn it off and rebuild from remainder"?

If only one monitor is crashing, you can remove it from the quorum,
zap all the disks, and add it back so that it recovers from its
healthy peers.
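
Roughly, that would look something like the following (mon id, paths and keyring
handling are placeholders; adjust to however your mons were deployed):

$ ceph mon remove <mon-id>                        # on a healthy node, drop the bad mon from the map
$ rm -rf /var/lib/ceph/mon/ceph-<mon-id>          # on the bad mon's host, wipe its store
$ ceph mon getmap -o /tmp/monmap
$ ceph auth get mon. -o /tmp/mon-keyring
$ ceph-mon -i <mon-id> --mkfs --monmap /tmp/monmap --keyring /tmp/mon-keyring
$ chown -R ceph:ceph /var/lib/ceph/mon/ceph-<mon-id>
$ systemctl start ceph-mon@<mon-id>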
-Greg

>
> On Saturday, October 5, 2019, 02:03:44 (EEST), Gregory Farnum wrote:
>
>
> Hmm, that assert means the monitor tried to grab an OSDMap it had on
> disk but it didn't work. (In particular, a "pinned" full map which we
> kept around after trimming the others to save on disk space.)
>
> That *could* be a bug where we didn't have the pinned map and should
> have (or incorrectly thought we should have), but this code was in
> Mimic as well as Nautilus and I haven't seen similar reports. So it
> could also mean that something bad happened to the monitor's disk or
> Rocksdb store. Can you turn it off and rebuild from the remainder, or
> do they all exhibit this bug?
>
>
> On Fri, Oct 4, 2019 at 5:44 AM Philippe D'Anjou
>  wrote:
> >
> > Hi,
> > our mon is acting up all of a sudden and dying in crash loop with the 
> > following:
> >
> >
> > 2019-10-04 14:00:24.339583 lease_expire=0.00 has v0 lc 4549352
> >    -3> 2019-10-04 14:00:24.335 7f6e5d461700  5 
> >mon.km-fsn-1-dc4-m1-797678@0(leader).paxos(paxos active c 4548623..4549352) 
> >is_readable = 1 - now=2019-10-04 14:00:24.339620 lease_expire=0.00 has 
> >v0 lc 4549352
> >    -2> 2019-10-04 14:00:24.343 7f6e5d461700 -1 
> >mon.km-fsn-1-dc4-m1-797678@0(leader).osd e257349 get_full_from_pinned_map 
> >closest pinned map ver 252615 not available! error: (2) No such file or 
> >directory
> >    -1> 2019-10-04 14:00:24.343 7f6e5d461700 -1 
> >/build/ceph-14.2.4/src/mon/OSDMonitor.cc: In function 'int 
> >OSDMonitor::get_full_from_pinned_map(version_t, ceph::bufferlist&)' thread 
> >7f6e5d461700 time 2019-10-04 14:00:24.347580
> > /build/ceph-14.2.4/src/mon/OSDMonitor.cc: 3932: FAILED ceph_assert(err == 0)
> >
> >  ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus 
> >(stable)
> >  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> >const*)+0x152) [0x7f6e68eb064e]
> >  2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, 
> >char const*, ...)+0) [0x7f6e68eb0829]
> >  3: (OSDMonitor::get_full_from_pinned_map(unsigned long, 
> >ceph::buffer::v14_2_0::list&)+0x80b) [0x72802b]
> >  4: (OSDMonitor::get_version_full(unsigned long, unsigned long, 
> >ceph::buffer::v14_2_0::list&)+0x3d2) [0x728c82]
> >  5: 
> >(OSDMonitor::encode_trim_extra(std::shared_ptr, 
> >unsigned long)+0x8c) [0x717c3c]
> >  6: (PaxosService::maybe_trim()+0x473) [0x707443]
> >  7: (Monitor::tick()+0xa9) [0x5ecf39]
> >  8: (C_MonContext::finish(int)+0x39) [0x5c3f29]
> >  9: (Context::complete(int)+0x9) [0x6070d9]
> >  10: (SafeTimer::timer_thread()+0x190) [0x7f6e68f45580]
> >  11: (SafeTimerThread::entry()+0xd) [0x7f6e68f46e4d]
> >  12: (()+0x76ba) [0x7f6e67cab6ba]
> >  13: (clone()+0x6d) [0x7f6e674d441d]
> >
> >      0> 2019-10-04 14:00:24.347 7f6e5d461700 -1 *** Caught signal (Aborted) 
> >**
> >  in thread 7f6e5d461700 thread_name:safe_timer
> >
> >  ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus 
> >(stable)
> >  1: (()+0x11390) [0x7f6e67cb5390]
> >  2: (gsignal()+0x38) [0x7f6e67402428]
> >  3: (abort()+0x16a) [0x7f6e6740402a]
> >  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> >const*)+0x1a3) [0x7f6e68eb069f]
> >  5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, 
> >char const*, ...)+0) [0x7f6e68eb0829]
> >  6: (OSDMonitor::get_full_from_pinned_map(unsigned long, 
> >ceph::buffer::v14_2_0::list&)+0x80b) [0x72802b]
> >  7: (OSDMonitor::get_version_full(unsigned long, unsigned long, 
> >ceph::buffer::v14_2_0::list&)+0x3d2) [0x728c82]
> >  8: 
> >(OSDMonitor::encode_trim_extra(std::shared_ptr, 
> >unsigned long)+0x8c) [0x717c3c]
> >  9: (PaxosService::maybe_trim()+0x473) [0x707443]
> >  10: (Monitor::tick()+0xa9)