[ceph-users] backfill start all of sudden

2018-10-08 Thread Chen Allen
Hi there,

Has anyone experienced the following? Two of our OSD servers went down. After
bringing the two servers back up, I brought their 52 OSDs in with a weight of
just 0.05, but this caused a huge backfilling load: I saw many blocked requests
and a number of pgs stuck inactive, and some servers were impacted. I stopped
the backfilling by setting the nobackfill flag, and everything went back to
normal.
But the strangest thing happened two hours later: the backfilling suddenly
started again despite the nobackfill flag being set, again causing many blocked
requests, so we had to reweight the 52 OSDs to 0 to stabilize the storage.
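(For reference, roughly the knobs involved above, as a sketch: whether the
reweighting was done with 'ceph osd crush reweight' or 'ceph osd reweight' in
our case, the idea is the same, and the OSD id below is just an example.)

ceph osd set nobackfill                 # pause backfilling cluster-wide
ceph osd unset nobackfill               # resume it
ceph osd crush reweight osd.123 0.05    # bring an OSD in at a small weight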

I'm not sure why the backfill started again. If anyone has any idea, please
comment.

Thanks so much.
Allen
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDSs still core dumping

2018-10-08 Thread Yan, Zheng
On Tue, Oct 9, 2018 at 5:39 AM Alfredo Daniel Rezinovsky
 wrote:
>
> It seems my purge_queue journal is damaged. Even if I reset it, it stays
> damaged.
>
> What does "inotablev mismatch" mean?
>
>
> 2018-10-08 16:40:03.144 7f05b6099700 -1 log_channel(cluster) log [ERR] :
> journal replay inotablev mismatch 1 -> 42160
> /build/ceph-13.2.1/src/mds/journal.cc: In function 'void
> EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread
> 7f05b6099700 time 2018-10-08 16:40:03.150639
> /build/ceph-13.2.1/src/mds/journal.cc: 1572: FAILED
> assert(g_conf->mds_wipe_sessions)
> 2018-10-08 16:40:03.144 7f05b6099700 -1 log_channel(cluster) log [ERR] :
> journal replay sessionmap v 20302542 -(1|2) > table 0
>   ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic
> (stable)
>   1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x102) [0x7f05c649ff32]
>   2: (()+0x26c0f7) [0x7f05c64a00f7]
>   3: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x5f4b)
> [0x5557a384706b]
>   4: (EUpdate::replay(MDSRank*)+0x39) [0x5557a38485a9]
>   5: (MDLog::_replay_thread()+0x864) [0x5557a37f0c24]
>   6: (MDLog::ReplayThread::entry()+0xd) [0x5557a3594c0d]
>   7: (()+0x76db) [0x7f05c5dac6db]
>   8: (clone()+0x3f) [0x7f05c4f9288f]
>   NOTE: a copy of the executable, or `objdump -rdS ` is
> needed to interpret this.
> 2018-10-08 16:40:03.148 7f05b6099700 -1
> /build/ceph-13.2.1/src/mds/journal.cc: In function 'void
> EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread
> 7f05b6099700 time 2018-10-08 16:40:03.150639
> /build/ceph-13.2.1/src/mds/journal.cc: 1572: FAILED
> assert(g_conf->mds_wipe_sessions)
>
>   ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic
> (stable)
>   1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x102) [0x7f05c649ff32]
>   2: (()+0x26c0f7) [0x7f05c64a00f7]
>   3: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x5f4b)
> [0x5557a384706b]
>   4: (EUpdate::replay(MDSRank*)+0x39) [0x5557a38485a9]
>   5: (MDLog::_replay_thread()+0x864) [0x5557a37f0c24]
>   6: (MDLog::ReplayThread::entry()+0xd) [0x5557a3594c0d]
>   7: (()+0x76db) [0x7f05c5dac6db]
>   8: (clone()+0x3f) [0x7f05c4f9288f]
>   NOTE: a copy of the executable, or `objdump -rdS ` is
> needed to interpret this.
>

It looks like you reset the inotable and session table, but did not reset the journal.

>
>
> Is there a way to import an empty journal?
>

'cephfs-journal-tool journal reset'.
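(A minimal sketch: export a backup before resetting; the flags for selecting a
specific journal or rank vary between releases, so check
cephfs-journal-tool --help first.)

cephfs-journal-tool journal export backup.bin
cephfs-journal-tool journal reset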


> I need the data, even if it's read only.
>
>
>
>
>
> --
> Alfrenovsky
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] list admin issues

2018-10-08 Thread Alex Gorbachev
On Mon, Oct 8, 2018 at 7:48 AM Elias Abacioglu
 wrote:
>
> If it's attachments causing this, perhaps forbid attachments? Force people to 
> use pastebin / imgur type of services?
>
> /E
>
> On Mon, Oct 8, 2018 at 1:33 PM Martin Palma  wrote:
>>
>> Same here also on Gmail with G Suite.
>> On Mon, Oct 8, 2018 at 12:31 AM Paul Emmerich  wrote:
>> >
>> > I'm also seeing this once every few months or so on Gmail with G Suite.
>> >
>> > Paul
>> > Am So., 7. Okt. 2018 um 08:18 Uhr schrieb Joshua Chen
>> > :
>> > >
>> > > I also got removed once, got another warning once (need to re-enable).
>> > >
>> > > Cheers
>> > > Joshua
>> > >
>> > >
>> > > On Sun, Oct 7, 2018 at 5:38 AM Svante Karlsson  
>> > > wrote:
>> > >>
>> > >> I'm also getting removed but not only from ceph. I subscribe 
>> > >> d...@kafka.apache.org list and the same thing happens there.
>> > >>
>> > >> Den lör 6 okt. 2018 kl 23:24 skrev Jeff Smith :
>> > >>>
>> > >>> I have been removed twice.
>> > >>> On Sat, Oct 6, 2018 at 7:07 AM Elias Abacioglu
>> > >>>  wrote:
>> > >>> >
>> > >>> > Hi,
>> > >>> >
>> > >>> > I'm bumping this old thread cause it's getting annoying. My 
>> > >>> > membership get disabled twice a month.
>> > >>> > Between my two Gmail accounts I'm in more than 25 mailing lists and 
>> > >>> > I see this behavior only here. Why is only ceph-users only affected? 
>> > >>> > Maybe Christian was on to something, is this intentional?
>> > >>> > Reality is that there is a lot of ceph-users with Gmail accounts, 
>> > >>> > perhaps it wouldn't be so bad to actually trying to figure this one 
>> > >>> > out?
>> > >>> >
>> > >>> > So can the maintainers of this list please investigate what actually 
>> > >>> > gets bounced? Look at my address if you want.
>> > >>> > I got disabled 20181006, 20180927, 20180916, 20180725, 20180718 most 
>> > >>> > recently.
>> > >>> > Please help!
>> > >>> >
>> > >>> > Thanks,
>> > >>> > Elias
>> > >>> >
>> > >>> > On Mon, Oct 16, 2017 at 5:41 AM Christian Balzer  
>> > >>> > wrote:
>> > >>> >>
>> > >>> >>
>> > >>> >> Most mails to this ML score low or negatively with SpamAssassin, 
>> > >>> >> however
>> > >>> >> once in a while (this is a recent one) we get relatively high 
>> > >>> >> scores.
>> > >>> >> Note that the forged bits are false positives, but the SA is up to 
>> > >>> >> date and
>> > >>> >> google will have similar checks:
>> > >>> >> ---
>> > >>> >> X-Spam-Status: No, score=3.9 required=10.0 tests=BAYES_00,DCC_CHECK,
>> > >>> >>  
>> > >>> >> FORGED_MUA_MOZILLA,FORGED_YAHOO_RCVD,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
>> > >>> >>  
>> > >>> >> HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MIME_HTML_MOSTLY,RCVD_IN_MSPIKE_H4,
>> > >>> >>  RCVD_IN_MSPIKE_WL,RDNS_NONE,T_DKIM_INVALID shortcircuit=no 
>> > >>> >> autolearn=no
>> > >>> >> ---
>> > >>> >>
>> > >>> >> Between attachment mails and some of these and you're well on your 
>> > >>> >> way out.
>> > >>> >>
>> > >>> >> The default mailman settings and logic require 5 bounces to trigger
>> > >>> >> unsubscription and 7 days of NO bounces to reset the counter.
>> > >>> >>
>> > >>> >> Christian
>> > >>> >>
>> > >>> >> On Mon, 16 Oct 2017 12:23:25 +0900 Christian Balzer wrote:
>> > >>> >>
>> > >>> >> > On Mon, 16 Oct 2017 14:15:22 +1100 Blair Bethwaite wrote:
>> > >>> >> >
>> > >>> >> > > Thanks Christian,
>> > >>> >> > >
>> > >>> >> > > You're no doubt on the right track, but I'd really like to 
>> > >>> >> > > figure out
>> > >>> >> > > what it is at my end - I'm unlikely to be the only person 
>> > >>> >> > > subscribed
>> > >>> >> > > to ceph-users via a gmail account.
>> > >>> >> > >
>> > >>> >> > > Re. attachments, I'm surprised mailman would be allowing them 
>> > >>> >> > > in the
>> > >>> >> > > first place, and even so gmail's attachment requirements are 
>> > >>> >> > > less
>> > >>> >> > > strict than most corporate email setups (those that don't 
>> > >>> >> > > already use
>> > >>> >> > > a cloud provider).
>> > >>> >> > >
>> > >>> >> > Mailman doesn't do anything with this by default AFAIK, but see 
>> > >>> >> > below.
>> > >>> >> > Strict is fine if you're in control, corporate mail can be hell, 
>> > >>> >> > doubly so
>> > >>> >> > if on M$ cloud.
>> > >>> >> >
>> > >>> >> > > This started happening earlier in the year after I turned off 
>> > >>> >> > > digest
>> > >>> >> > > mode. I also have a paid google domain, maybe I'll try setting
>> > >>> >> > > delivery to that address and seeing if anything changes...
>> > >>> >> > >
>> > >>> >> > Don't think google domain is handled differently, but what do I 
>> > >>> >> > know.
>> > >>> >> >
>> > >>> >> > Though the digest bit confirms my suspicion about attachments:
>> > >>> >> > ---
>> > >>> >> > When a subscriber chooses to receive plain text daily “digests” 
>> > >>> >> > of list
>> > >>> >> > messages, Mailman sends the digest messages without any original
>> > >>> >> > attachments (in Mailman lingo, it “scrubs” the messages of 
>> > >>> >> > attachments).
>> > >>> >> > Howev

Re: [ceph-users] MDSs still core dumping

2018-10-08 Thread Sergey Malinin
I was able to start MDS 13.2.1 after I had imported the journal, run
recover_dentries, reset the journal, reset the session table, and done
'ceph fs reset'. However, I got about 1000 errors in the log (bad backtrace,
loaded dup inode, etc.) and it eventually failed on
assert(stray_in->inode.nlink >= 1) right after becoming 'active'.
I'm running scan_links to give it another try.
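(For reference, roughly the sequence described above, as a sketch based on the
upstream CephFS disaster-recovery steps; the filesystem name is a placeholder.)

cephfs-journal-tool journal import backup.bin      # or 'journal export' to back it up first
cephfs-journal-tool event recover_dentries summary
cephfs-journal-tool journal reset
cephfs-table-tool all reset session
ceph fs reset <fsname> --yes-i-really-mean-it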


> On 8.10.2018, at 23:43, Alfredo Daniel Rezinovsky  
> wrote:
> 
> 
> 
> On 08/10/18 17:41, Sergey Malinin wrote:
>> 
>>> On 8.10.2018, at 23:23, Alfredo Daniel Rezinovsky >> > wrote:
>>> 
>>> I need the data, even if it's read only.
>> 
>> After full data scan you should have been able to boot mds 13.2.2 and mount 
>> the fs.
> The problem started with the upgrade to 13.2.2. I downgraded to 13.2.1 and
> did what Yan Zheng told me.
> 
> The MDS reports problems with the journals, and even after resetting the
> journals the MDS won't start.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] daahboard

2018-10-08 Thread solarflow99
OK, thanks for the clarification. I had assumed Ansible was supposed to take
care of all that; I've got it working now.


On Mon, Oct 8, 2018 at 3:07 PM Jonas Jelten  wrote:

> You need to add or generate a certificate, without it the dashboard
> doesn't start.
> The procedure is described in the documentation.
>
> -- JJ
>
> On 09/10/2018 00.05, solarflow99 wrote:
> > seems like it did, yet I don't see anything listening on the port it
> should be for dashboard.
> >
> > # ceph mgr module ls
> > {
> > "enabled_modules": [
> > "dashboard",
> > "status"
> > ],
> >
> >
> >
> > # ceph status
> >   cluster:
> > id: d36fd17c-174e-40d6-95b9-86bdd196b7d2
> > health: HEALTH_OK
> >
> >   services:
> > mon: 3 daemons, quorum cephmgr101,cephmgr102,cephmgr103
> > mgr: cephmgr103(active), standbys: cephmgr102, cephmgr101
> > mds: cephfs-1/1/1 up  {0=cephmgr103=up:active}, 2 up:standby
> > osd: 3 osds: 3 up, 3 in
> >
> >   data:
> > pools:   3 pools, 192 pgs
> > objects: 2.02 k objects, 41 MiB
> > usage:   6.5 GiB used, 86 GiB / 93 GiB avail
> > pgs: 192 active+clean
> >
> >
> >
> > # netstat -tlpn | grep ceph
> > tcp    0    0 172.20.3.23:6789    0.0.0.0:*    LISTEN    8422/ceph-mon
> > tcp    0    0 172.20.3.23:6800    0.0.0.0:*    LISTEN    21250/ceph-mds
> > tcp    0    0 172.20.3.23:6801    0.0.0.0:*    LISTEN    16562/ceph-mgr
> >
> >
> > On Mon, Oct 8, 2018 at 2:48 AM John Spray  jsp...@redhat.com>> wrote:
> >
> > Assuming that ansible is correctly running "ceph mgr module enable
> > dashboard", then the next place to look is in "ceph status" (any
> > errors?) and "ceph mgr module ls" (any reports of the module unable
> to
> > run?)
> >
> > John
> > On Sat, Oct 6, 2018 at 1:53 AM solarflow99  > wrote:
> > >
> > > I enabled the dashboard module in ansible but I don't see ceph-mgr
> listening on a port for it.  Is there something
> > else I missed?
> > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com 
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD fails to startup with bluestore "direct_read_unaligned (5) Input/output error"

2018-10-08 Thread Alexandre Gosset
Hi,

We are experiencing a recurrent problem with some OSDs that fail to
start up after a crash. This time it happened to an OSD during recovery, and
it is a very annoying bug, because recovery takes time (we increase the crush
weight in small increments) and the only way I have found to fix it is to run
ceph-volume lvm zap /var/lib/ceph/osd/ceph-$i/block, remove the OSD, and
create it again, which means starting the recovery all over again.
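(Roughly the workaround described above, as a sketch; the OSD id and the
replacement data device are placeholders.)

systemctl stop ceph-osd@40
ceph-volume lvm zap /var/lib/ceph/osd/ceph-40/block
ceph osd purge 40 --yes-i-really-mean-it
ceph-volume lvm create --data /dev/sdX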

Is there another way to fix this ?

# /usr/bin/ceph-osd -d -i 40 --pid-file /var/run/ceph/osd.40.pid -c
/etc/ceph/ceph.conf --cluster ceph --setuser ceph --setgroup ceph
2018-10-09 00:33:23.025913 7f91c7ae2e00  0 set uid:gid to 64045:64045
(ceph:ceph)
2018-10-09 00:33:23.025934 7f91c7ae2e00  0 ceph version 12.2.8
(ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable), process
ceph-osd, pid 322330
starting osd.40 at - osd_data /var/lib/ceph/osd/ceph-40
/var/lib/ceph/osd/ceph-40/journal
2018-10-09 00:33:23.203930 7f91c7ae2e00  0 load: jerasure load: lrc
load: isa
2018-10-09 00:33:23.204022 7f91c7ae2e00  1 bdev create path
/var/lib/ceph/osd/ceph-40/block type kernel
2018-10-09 00:33:23.204032 7f91c7ae2e00  1 bdev(0x558d31b68b40
/var/lib/ceph/osd/ceph-40/block) open path /var/lib/ceph/osd/ceph-40/block
2018-10-09 00:33:23.204336 7f91c7ae2e00  1 bdev(0x558d31b68b40
/var/lib/ceph/osd/ceph-40/block) open size 1990292758528 (0x1cf66b16000,
1.81TiB) block_size 4096 (4KiB) rotational
2018-10-09 00:33:23.204766 7f91c7ae2e00  1
bluestore(/var/lib/ceph/osd/ceph-40) _set_cache_sizes max 0.5 < ratio 0.99
2018-10-09 00:33:23.204790 7f91c7ae2e00  1
bluestore(/var/lib/ceph/osd/ceph-40) _set_cache_sizes cache_size
1073741824 meta 0.5 kv 0.5 data 0
2018-10-09 00:33:23.204796 7f91c7ae2e00  1 bdev(0x558d31b68b40
/var/lib/ceph/osd/ceph-40/block) close
2018-10-09 00:33:23.499876 7f91c7ae2e00  1
bluestore(/var/lib/ceph/osd/ceph-40) _mount path /var/lib/ceph/osd/ceph-40
2018-10-09 00:33:23.500355 7f91c7ae2e00  1 bdev create path
/var/lib/ceph/osd/ceph-40/block type kernel
2018-10-09 00:33:23.500368 7f91c7ae2e00  1 bdev(0x558d31b68d80
/var/lib/ceph/osd/ceph-40/block) open path /var/lib/ceph/osd/ceph-40/block
2018-10-09 00:33:23.500583 7f91c7ae2e00  1 bdev(0x558d31b68d80
/var/lib/ceph/osd/ceph-40/block) open size 1990292758528 (0x1cf66b16000,
1.81TiB) block_size 4096 (4KiB) rotational
2018-10-09 00:33:23.500884 7f91c7ae2e00  1
bluestore(/var/lib/ceph/osd/ceph-40) _set_cache_sizes max 0.5 < ratio 0.99
2018-10-09 00:33:23.500900 7f91c7ae2e00  1
bluestore(/var/lib/ceph/osd/ceph-40) _set_cache_sizes cache_size
1073741824 meta 0.5 kv 0.5 data 0
2018-10-09 00:33:23.501051 7f91c7ae2e00  1 bdev create path
/var/lib/ceph/osd/ceph-40/block type kernel
2018-10-09 00:33:23.501061 7f91c7ae2e00  1 bdev(0x558d31b69200
/var/lib/ceph/osd/ceph-40/block) open path /var/lib/ceph/osd/ceph-40/block
2018-10-09 00:33:23.501263 7f91c7ae2e00  1 bdev(0x558d31b69200
/var/lib/ceph/osd/ceph-40/block) open size 1990292758528 (0x1cf66b16000,
1.81TiB) block_size 4096 (4KiB) rotational
2018-10-09 00:33:23.501277 7f91c7ae2e00  1 bluefs add_block_device bdev
1 path /var/lib/ceph/osd/ceph-40/block size 1.81TiB
2018-10-09 00:33:23.501310 7f91c7ae2e00  1 bluefs mount
2018-10-09 00:33:23.589465 7f91c7ae2e00  0  set rocksdb option
compaction_readahead_size = 2097152
2018-10-09 00:33:23.589482 7f91c7ae2e00  0  set rocksdb option
compression = kNoCompression
2018-10-09 00:33:23.589487 7f91c7ae2e00  0  set rocksdb option
max_write_buffer_number = 4
2018-10-09 00:33:23.589490 7f91c7ae2e00  0  set rocksdb option
min_write_buffer_number_to_merge = 1
2018-10-09 00:33:23.589494 7f91c7ae2e00  0  set rocksdb option
recycle_log_file_num = 4
2018-10-09 00:33:23.589496 7f91c7ae2e00  0  set rocksdb option
writable_file_max_buffer_size = 0
2018-10-09 00:33:23.589500 7f91c7ae2e00  0  set rocksdb option
write_buffer_size = 268435456
2018-10-09 00:33:23.589517 7f91c7ae2e00  0  set rocksdb option
compaction_readahead_size = 2097152
2018-10-09 00:33:23.589520 7f91c7ae2e00  0  set rocksdb option
compression = kNoCompression
2018-10-09 00:33:23.589523 7f91c7ae2e00  0  set rocksdb option
max_write_buffer_number = 4
2018-10-09 00:33:23.589526 7f91c7ae2e00  0  set rocksdb option
min_write_buffer_number_to_merge = 1
2018-10-09 00:33:23.589528 7f91c7ae2e00  0  set rocksdb option
recycle_log_file_num = 4
2018-10-09 00:33:23.589531 7f91c7ae2e00  0  set rocksdb option
writable_file_max_buffer_size = 0
2018-10-09 00:33:23.589533 7f91c7ae2e00  0  set rocksdb option
write_buffer_size = 268435456
2018-10-09 00:33:23.589656 7f91c7ae2e00  4 rocksdb: RocksDB version: 5.4.0

2018-10-09 00:33:23.589667 7f91c7ae2e00  4 rocksdb: Git sha
rocksdb_build_git_sha:@0@
2018-10-09 00:33:23.589670 7f91c7ae2e00  4 rocksdb: Compile date Aug 30 2018
2018-10-09 00:33:23.589671 7f91c7ae2e00  4 rocksdb: DB SUMMARY

2018-10-09 00:33:23.589706 7f91c7ae2e00  4 rocksdb: CURRENT file:  CURRENT

2018-10-09 00:33:23.589707 7f91c7ae2e00  4 rocksdb: IDENTITY file:  IDENTITY

2018-10-09 00:33:23.589709 7f91c7ae2e00  4 ro

Re: [ceph-users] MDSs still core dumping

2018-10-08 Thread Sergey Malinin
... and cephfs-table-tool  reset session ?

> On 9.10.2018, at 01:32, Sergey Malinin  wrote:
> 
> Have you tried to recover dentries and then reset the journal?
> 
> 
>> On 8.10.2018, at 23:43, Alfredo Daniel Rezinovsky > > wrote:
>> 
>> 
>> 
>> On 08/10/18 17:41, Sergey Malinin wrote:
>>> 
 On 8.10.2018, at 23:23, Alfredo Daniel Rezinovsky >>> > wrote:
 
 I need the data, even if it's read only.
>>> 
>>> After full data scan you should have been able to boot mds 13.2.2 and mount 
>>> the fs.
>> The problem started with the upgrade to 13.2.2. I downgraded to 13.2.1 and
>> did what Yan Zheng told me.
>> 
>> The MDS reports problems with the journals, and even after resetting the
>> journals the MDS won't start.
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDSs still core dumping

2018-10-08 Thread Sergey Malinin
Have you tried to recover dentries and then reset the journal?


> On 8.10.2018, at 23:43, Alfredo Daniel Rezinovsky  
> wrote:
> 
> 
> 
> On 08/10/18 17:41, Sergey Malinin wrote:
>> 
>>> On 8.10.2018, at 23:23, Alfredo Daniel Rezinovsky >> > wrote:
>>> 
>>> I need the data, even if it's read only.
>> 
>> After full data scan you should have been able to boot mds 13.2.2 and mount 
>> the fs.
> The problem started with the upgrade to 13.2.2. I downgraded to 13.2.1 and
> did what Yan Zheng told me.
> 
> The MDS reports problems with the journals, and even after resetting the
> journals the MDS won't start.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] list admin issues

2018-10-08 Thread Gerhard W. Recher
@all and list admins,


These problems with Gmail addresses, or domains operated under Google's
control, are caused by Google refusing high-volume senders!

I operated a security list and have been faced with exactly the same problems.


Google is totally ignorant; we got no real answers to our complaints!


Just my 2 cents

Am 8. Oktober 2018 23:52:20 MESZ schrieb Paul Emmerich :
>You don't get removed for sending to the mailing list, you get removed
>because the mailing list servers fails to deliver mail to you.
>Am Mo., 8. Okt. 2018 um 23:22 Uhr schrieb Jeff Smith
>:
>>
>> I just got dumped again.  I have not sent any attechments/images.
>> On Mon, Oct 8, 2018 at 5:48 AM Elias Abacioglu
>>  wrote:
>> >
>> > If it's attachments causing this, perhaps forbid attachments? Force
>people to use pastebin / imgur type of services?
>> >
>> > /E
>> >
>> > On Mon, Oct 8, 2018 at 1:33 PM Martin Palma 
>wrote:
>> >>
>> >> Same here also on Gmail with G Suite.
>> >> On Mon, Oct 8, 2018 at 12:31 AM Paul Emmerich
> wrote:
>> >> >
>> >> > I'm also seeing this once every few months or so on Gmail with G
>Suite.
>> >> >
>> >> > Paul
>> >> > Am So., 7. Okt. 2018 um 08:18 Uhr schrieb Joshua Chen
>> >> > :
>> >> > >
>> >> > > I also got removed once, got another warning once (need to
>re-enable).
>> >> > >
>> >> > > Cheers
>> >> > > Joshua
>> >> > >
>> >> > >
>> >> > > On Sun, Oct 7, 2018 at 5:38 AM Svante Karlsson
> wrote:
>> >> > >>
>> >> > >> I'm also getting removed but not only from ceph. I subscribe
>d...@kafka.apache.org list and the same thing happens there.
>> >> > >>
>> >> > >> Den lör 6 okt. 2018 kl 23:24 skrev Jeff Smith
>:
>> >> > >>>
>> >> > >>> I have been removed twice.
>> >> > >>> On Sat, Oct 6, 2018 at 7:07 AM Elias Abacioglu
>> >> > >>>  wrote:
>> >> > >>> >
>> >> > >>> > Hi,
>> >> > >>> >
>> >> > >>> > I'm bumping this old thread cause it's getting annoying.
>My membership get disabled twice a month.
>> >> > >>> > Between my two Gmail accounts I'm in more than 25 mailing
>lists and I see this behavior only here. Why is only ceph-users only
>affected? Maybe Christian was on to something, is this intentional?
>> >> > >>> > Reality is that there is a lot of ceph-users with Gmail
>accounts, perhaps it wouldn't be so bad to actually trying to figure
>this one out?
>> >> > >>> >
>> >> > >>> > So can the maintainers of this list please investigate
>what actually gets bounced? Look at my address if you want.
>> >> > >>> > I got disabled 20181006, 20180927, 20180916, 20180725,
>20180718 most recently.
>> >> > >>> > Please help!
>> >> > >>> >
>> >> > >>> > Thanks,
>> >> > >>> > Elias
>> >> > >>> >
>> >> > >>> > On Mon, Oct 16, 2017 at 5:41 AM Christian Balzer
> wrote:
>> >> > >>> >>
>> >> > >>> >>
>> >> > >>> >> Most mails to this ML score low or negatively with
>SpamAssassin, however
>> >> > >>> >> once in a while (this is a recent one) we get relatively
>high scores.
>> >> > >>> >> Note that the forged bits are false positives, but the SA
>is up to date and
>> >> > >>> >> google will have similar checks:
>> >> > >>> >> ---
>> >> > >>> >> X-Spam-Status: No, score=3.9 required=10.0
>tests=BAYES_00,DCC_CHECK,
>> >> > >>> >> 
>FORGED_MUA_MOZILLA,FORGED_YAHOO_RCVD,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
>> >> > >>> >> 
>HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MIME_HTML_MOSTLY,RCVD_IN_MSPIKE_H4,
>> >> > >>> >>  RCVD_IN_MSPIKE_WL,RDNS_NONE,T_DKIM_INVALID
>shortcircuit=no autolearn=no
>> >> > >>> >> ---
>> >> > >>> >>
>> >> > >>> >> Between attachment mails and some of these and you're
>well on your way out.
>> >> > >>> >>
>> >> > >>> >> The default mailman settings and logic require 5 bounces
>to trigger
>> >> > >>> >> unsubscription and 7 days of NO bounces to reset the
>counter.
>> >> > >>> >>
>> >> > >>> >> Christian
>> >> > >>> >>
>> >> > >>> >> On Mon, 16 Oct 2017 12:23:25 +0900 Christian Balzer
>wrote:
>> >> > >>> >>
>> >> > >>> >> > On Mon, 16 Oct 2017 14:15:22 +1100 Blair Bethwaite
>wrote:
>> >> > >>> >> >
>> >> > >>> >> > > Thanks Christian,
>> >> > >>> >> > >
>> >> > >>> >> > > You're no doubt on the right track, but I'd really
>like to figure out
>> >> > >>> >> > > what it is at my end - I'm unlikely to be the only
>person subscribed
>> >> > >>> >> > > to ceph-users via a gmail account.
>> >> > >>> >> > >
>> >> > >>> >> > > Re. attachments, I'm surprised mailman would be
>allowing them in the
>> >> > >>> >> > > first place, and even so gmail's attachment
>requirements are less
>> >> > >>> >> > > strict than most corporate email setups (those that
>don't already use
>> >> > >>> >> > > a cloud provider).
>> >> > >>> >> > >
>> >> > >>> >> > Mailman doesn't do anything with this by default AFAIK,
>but see below.
>> >> > >>> >> > Strict is fine if you're in control, corporate mail can
>be hell, doubly so
>> >> > >>> >> > if on M$ cloud.
>> >> > >>> >> >
>> >> > >>> >> > > This started happening earlier in the year after I
>turned off digest
>> >> > >>> >> > > mode. I also have a paid google domain, maybe I'll
>try set

Re: [ceph-users] daahboard

2018-10-08 Thread Jonas Jelten
You need to add or generate a certificate; without it the dashboard doesn't
start.
The procedure is described in the documentation.
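(A minimal sketch, assuming the self-signed-certificate helper shipped with
the mimic dashboard; restarting the module makes it pick up the new cert.)

$ ceph dashboard create-self-signed-cert
$ ceph mgr module disable dashboard
$ ceph mgr module enable dashboard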

-- JJ

On 09/10/2018 00.05, solarflow99 wrote:
> seems like it did, yet I don't see anything listening on the port it should 
> be for dashboard.
> 
> # ceph mgr module ls
> {
>     "enabled_modules": [
>     "dashboard",
>     "status"
>     ],
> 
> 
> 
> # ceph status
>   cluster:
>     id: d36fd17c-174e-40d6-95b9-86bdd196b7d2
>     health: HEALTH_OK
> 
>   services:
>     mon: 3 daemons, quorum cephmgr101,cephmgr102,cephmgr103
>     mgr: cephmgr103(active), standbys: cephmgr102, cephmgr101
>     mds: cephfs-1/1/1 up  {0=cephmgr103=up:active}, 2 up:standby
>     osd: 3 osds: 3 up, 3 in
> 
>   data:
>     pools:   3 pools, 192 pgs
>     objects: 2.02 k objects, 41 MiB
>     usage:   6.5 GiB used, 86 GiB / 93 GiB avail
>     pgs: 192 active+clean
> 
> 
> 
> # netstat -tlpn | grep ceph
> tcp    0    0 172.20.3.23:6789    0.0.0.0:*    LISTEN    8422/ceph-mon
> tcp    0    0 172.20.3.23:6800    0.0.0.0:*    LISTEN    21250/ceph-mds
> tcp    0    0 172.20.3.23:6801    0.0.0.0:*    LISTEN    16562/ceph-mgr
> 
> 
> On Mon, Oct 8, 2018 at 2:48 AM John Spray  > wrote:
> 
> Assuming that ansible is correctly running "ceph mgr module enable
> dashboard", then the next place to look is in "ceph status" (any
> errors?) and "ceph mgr module ls" (any reports of the module unable to
> run?)
> 
> John
> On Sat, Oct 6, 2018 at 1:53 AM solarflow99  > wrote:
> >
> > I enabled the dashboard module in ansible but I don't see ceph-mgr 
> listening on a port for it.  Is there something
> else I missed?
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com 
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] daahboard

2018-10-08 Thread solarflow99
seems like it did, yet I don't see anything listening on the port it should
be for dashboard.

# ceph mgr module ls
{
"enabled_modules": [
"dashboard",
"status"
],



# ceph status
  cluster:
id: d36fd17c-174e-40d6-95b9-86bdd196b7d2
health: HEALTH_OK

  services:
mon: 3 daemons, quorum cephmgr101,cephmgr102,cephmgr103
mgr: cephmgr103(active), standbys: cephmgr102, cephmgr101
mds: cephfs-1/1/1 up  {0=cephmgr103=up:active}, 2 up:standby
osd: 3 osds: 3 up, 3 in

  data:
pools:   3 pools, 192 pgs
objects: 2.02 k objects, 41 MiB
usage:   6.5 GiB used, 86 GiB / 93 GiB avail
pgs: 192 active+clean



# netstat -tlpn | grep ceph
tcp    0    0 172.20.3.23:6789    0.0.0.0:*    LISTEN    8422/ceph-mon
tcp    0    0 172.20.3.23:6800    0.0.0.0:*    LISTEN    21250/ceph-mds
tcp    0    0 172.20.3.23:6801    0.0.0.0:*    LISTEN    16562/ceph-mgr


On Mon, Oct 8, 2018 at 2:48 AM John Spray  wrote:

> Assuming that ansible is correctly running "ceph mgr module enable
> dashboard", then the next place to look is in "ceph status" (any
> errors?) and "ceph mgr module ls" (any reports of the module unable to
> run?)
>
> John
> On Sat, Oct 6, 2018 at 1:53 AM solarflow99  wrote:
> >
> > I enabled the dashboard module in ansible but I don't see ceph-mgr
> listening on a port for it.  Is there something else I missed?
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2018-10-08 Thread Patrick Donnelly
On Thu, Oct 4, 2018 at 3:58 PM Stefan Kooman  wrote:
> A couple of hours later we hit the same issue. We restarted with
> debug_mds=20 and debug_journaler=20 on the standby-replay node. Eight
> hours later (an hour ago) we hit the same issue. We captured ~ 4.7 GB of
> logging I skipped to the end of the log file just before the
> "hearbeat_map" messages start:
>
> 2018-10-04 23:23:53.144644 7f415ebf4700 20 mds.0.locker  client.17079146 
> pending pAsLsXsFscr allowed pAsLsXsFscr wanted pFscr
> 2018-10-04 23:23:53.144645 7f415ebf4700 10 mds.0.locker eval done
> 2018-10-04 23:23:55.088542 7f415bbee700 10 mds.beacon.mds2 _send up:active 
> seq 5021
> 2018-10-04 23:23:59.088602 7f415bbee700 10 mds.beacon.mds2 _send up:active 
> seq 5022
> 2018-10-04 23:24:03.088688 7f415bbee700 10 mds.beacon.mds2 _send up:active 
> seq 5023
> 2018-10-04 23:24:07.088775 7f415bbee700 10 mds.beacon.mds2 _send up:active 
> seq 5024
> 2018-10-04 23:24:11.088867 7f415bbee700  1 heartbeat_map is_healthy 'MDSRank' 
> had timed out after 15
> 2018-10-04 23:24:11.088871 7f415bbee700  1 mds.beacon.mds2 _send skipping 
> beacon, heartbeat map not healthy
>
> As far as I can see just normal behaviour.
>
> The big question is: what is happening when the mds start logging the 
> hearbeat_map messages?
> Why does it log "heartbeat_map is_healthy", just to log .04 seconds later 
> it's not healthy?
>
> Ceph version: 12.2.8 on all nodes (mon, osd, mds)
> mds: one active / one standby-replay
>
> The system was not under any kind of resource pressure: plenty of CPU, RAM
> available. Metrics all look normal up to the moment things go into a deadlock
> (so it seems).

Thanks for the detailed notes. It looks like the MDS is stuck
somewhere it's not even outputting any log messages. If possible, it'd
be helpful to get a coredump (e.g. by sending SIGQUIT to the MDS) or,
if you're comfortable with gdb, a backtrace of any threads that look
suspicious (e.g. not waiting on a futex) including `info threads`.
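(For example, a sketch of grabbing thread backtraces from the running MDS with
gdb; the process name is assumed to be ceph-mds.)

$ gdb -p $(pidof ceph-mds)
(gdb) info threads
(gdb) thread apply all bt        # capture this output
(gdb) detach
(gdb) quit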
-- 
Patrick Donnelly
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-08 Thread Graham Allan
I'm still trying to find a way to reactivate this one pg which is 
incomplete. There are a lot of periods in its history, the result of a 
combination of a peering storm a couple of weeks ago and min_size being 
set too low for safety. At this point I think there is no chance of 
bringing back the full set of most recent osds, so I'd like to 
understand the process to roll back to an earlier period, no matter how 
long ago.


I understood the process is to set 
osd_find_best_info_ignore_history_les=1 for the primary osd, so 
something like:


ceph tell osd.448 injectargs --osd_find_best_info_ignore_history_les=1

then set that osd down to make it re-peer. But whenever I have tried 
this the osd never becomes active again. Possibly I have misunderstood 
or am doing something else wrong...
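(Roughly the full cycle I have been attempting, as a sketch, with osd.448
being the current primary:)

ceph tell osd.448 injectargs '--osd_find_best_info_ignore_history_les=1'
ceph osd down 448          # force the pg to re-peer
# ...wait for peering, re-check 'ceph pg 70.82d query'...
ceph tell osd.448 injectargs '--osd_find_best_info_ignore_history_les=0'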


Output from pg query is here, if it adds any insight...


https://gist.githubusercontent.com/gtallan/e72b4461fb315983ae9a62cbbcd851d4/raw/0d30ceb315dd5567cb05fd0dc3e2e2c4975d8c01/pg70.b1c-query.txt


(Out of curiosity, is there any way to relate the first and last numbers 
in an interval to an actual timestamp?)


Thanks,

Graham

On 10/03/2018 12:18 PM, Graham Allan wrote:
Following on from my previous adventure with recovering pgs in the face 
of failed OSDs, I now have my EC 4+2 pool operating with min_size=5, 
which is as things should be.


However I have one pg which is stuck in state remapped+incomplete 
because it has only 4 out of 6 osds running, and I have been unable to 
bring the missing two back into service.



PG_AVAILABILITY Reduced data availability: 1 pg inactive, 1 pg incomplete
    pg 70.82d is remapped+incomplete, acting 
[2147483647,2147483647,190,448,61,315] (reducing pool 
.rgw.buckets.ec42 min_size from 5 may help; search ceph.com/docs for 
'incomplete')


I don't think I want to do anything with min_size as that would make all 
other pgs vulnerable to running dangerously undersized (unless there is 
any way to force that state for only a single pg). It seems to me that 
with 4/6 osds available, it should maybe be possible to force ceph to 
select one or two new osds to rebalance this pg to?


ceph pg query gives me (snippet):


    "down_osds_we_would_probe": [
    98,
    233,
    238,
    239
    ],
    "peering_blocked_by": [],
    "peering_blocked_by_detail": [
    {
    "detail": "peering_blocked_by_history_les_bound"
    }
    ]


Of these, osd 98 appears to have a corrupt xfs filesystem

osd 239 was the original osd to hold a shard of this pg but would not 
keep running, exiting with:


/build/ceph-12.2.7/src/osd/ECBackend.cc: 619: FAILED 
assert(pop.data.length() == 
sinfo.aligned_logical_offset_to_chunk_offset( 
after_progress.data_recovered_to - 
op.recovery_progress.data_recovered_to))


osds 233 and 238 were otherwise evacuated (weight 0) osds to which I 
imported the pg shard from osd 239 (using ceph-objectstore-tool). After 
which they crash with the same assert. More specifically they seem to 
crash in the same way each time the pg becomes active and starts to 
backfill, on the same object:


    -9> 2018-10-03 11:30:28.174586 7f94ce9c4700  5 osd.233 pg_epoch: 
704441 pg[70.82ds1( v 704329'703106 (586066'698574,704329'703106] 
local-lis/les=704439/704440 n=102585 ec=21494/21494 lis/c 
704439/588565 les/c/f 704440/588566/0 68066
6/704439/704439) 
[820,761,105,789,562,485]/[2147483647,233,190,448,61,315]p233(1) r=1 
lpr=704439 pi=[21494,704439)/4 rops=1 
bft=105(2),485(5),562(4),761(1),789(3),820(0) crt=704329'703106 lcod 
0'0 mlcod 0'0 active+undersized+remapped+ba
ckfilling] backfill_pos is 
70:b415ca14:::default.630943.7__shadow_Barley_GC_Project%2fBarley_GC_Project%2fRawdata%2fReads%2fCZOA.6150.7.38741.TGCTGG.fastq.gz.2~Vn8g0rMwpVY8eaW83TDzJ2mczLXAl3z.3_24:head 

    -8> 2018-10-03 11:30:28.174887 7f94ce9c4700  1 -- 
10.31.0.1:6854/2210291 --> 10.31.0.1:6854/2210291 -- 
MOSDECSubOpReadReply(70.82ds1 704441/704439 ECSubReadReply(tid=1, 
attrs_read=0)) v2 -- 0x7f9500472280 con 0
    -7> 2018-10-03 11:30:28.174902 7f94db9de700  1 -- 
10.31.0.1:6854/2210291 <== osd.233 10.31.0.1:6854/2210291 0  
MOSDECSubOpReadReply(70.82ds1 704441/704439 ECSubReadReply(tid=1, 
attrs_read=0)) v2  0+0+0 (0 0 0) 0x7f9500472280

 con 0x7f94fb72b000
    -6> 2018-10-03 11:30:28.176267 7f94ead66700  5 -- 
10.31.0.1:6854/2210291 >> 10.31.0.4:6880/2181727 conn(0x7f94ff2a6000 
:-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=946 cs=1 l=0). 
rx osd.61 seq 9 0x7f9500472500 MOSDECSubOpRe

adReply(70.82ds1 704441/704439 ECSubReadReply(tid=1, attrs_read=0)) v2
    -5> 2018-10-03 11:30:28.176281 7f94ead66700  1 -- 
10.31.0.1:6854/2210291 <== osd.61 10.31.0.4:6880/2181727 9  
MOSDECSubOpReadReply(70.82ds1 704441/704439 ECSubReadReply(tid=1, 
attrs_read=0)) v2  786745+0+0 (875698380 0 0) 0x

7f9500472500 con 0x7f94ff2a6000
    -4> 2018-10-03 11:30:28.177723 7f94ead66700  5 --

Re: [ceph-users] advised needed for different projects design

2018-10-08 Thread Paul Emmerich
One cephfs filesystem, one directory per project, quotas in cephfs,
exported via NFS ganesha.
Unless you have lots of really small files; in that case you might want to
consider RBD (where HA is more annoying to handle).
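
For the per-project capacities, a minimal sketch using CephFS quota xattrs
(this assumes a quota-enforcing client such as ceph-fuse or a recent kernel;
the path and size are made up):

# cap projectA's directory at 50 TB
setfattr -n ceph.quota.max_bytes -v 54975581388800 /mnt/cephfs/projectA
getfattr -n ceph.quota.max_bytes /mnt/cephfs/projectA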

Paul
Am Mo., 8. Okt. 2018 um 22:45 Uhr schrieb Joshua Chen
:
>
> Hello all,
>   When planning for my institute's need, I would like to seek for design 
> suggestions from you for my special situation:
>
> 1, I will support many projects, currently they are all nfs servers (and 
> those nfs servers serve their clients respectively). For example nfsA (for 
> clients belong to projectA); nfsB, nfsC,,,
>
> 2, For the institute's total capacity (currently 200TB), I would like nfsA, 
> nfsB, nfsC,,, to only see their individual assigned capacities, for example, 
> nfsA only get 50TB at her /export/nfsdata, nfsB only see 140TB, nfsC only 
> 10TB,,,
>
> 3, my question is, what would be the good choice to provide storage to those 
> nfs servers?
>
> RBD? is rbd good for hundreds of TB size for a single block device for a nfs 
> server?
>
> cephFS? this seems good solution for me that the nfs server could mount 
> cephfs and share them over nfs. But how could I make different project (nfsA 
> nfsB nfsC) 'see' or 'mount' part of the total 200TB capacity, there should be 
> many small cephfs(es) and each one has it's own given smaller capacity.
>
> Rados? I don't have much experience on this ? is rados suitable for this 
> multi project servers' need?
>
>
> Thanks in advance
>
> Cheers
> Joshua
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] list admin issues

2018-10-08 Thread Paul Emmerich
You don't get removed for sending to the mailing list; you get removed
because the mailing list server fails to deliver mail to you.
Am Mo., 8. Okt. 2018 um 23:22 Uhr schrieb Jeff Smith :
>
> I just got dumped again.  I have not sent any attechments/images.
> On Mon, Oct 8, 2018 at 5:48 AM Elias Abacioglu
>  wrote:
> >
> > If it's attachments causing this, perhaps forbid attachments? Force people 
> > to use pastebin / imgur type of services?
> >
> > /E
> >
> > On Mon, Oct 8, 2018 at 1:33 PM Martin Palma  wrote:
> >>
> >> Same here also on Gmail with G Suite.
> >> On Mon, Oct 8, 2018 at 12:31 AM Paul Emmerich  
> >> wrote:
> >> >
> >> > I'm also seeing this once every few months or so on Gmail with G Suite.
> >> >
> >> > Paul
> >> > Am So., 7. Okt. 2018 um 08:18 Uhr schrieb Joshua Chen
> >> > :
> >> > >
> >> > > I also got removed once, got another warning once (need to re-enable).
> >> > >
> >> > > Cheers
> >> > > Joshua
> >> > >
> >> > >
> >> > > On Sun, Oct 7, 2018 at 5:38 AM Svante Karlsson 
> >> > >  wrote:
> >> > >>
> >> > >> I'm also getting removed but not only from ceph. I subscribe 
> >> > >> d...@kafka.apache.org list and the same thing happens there.
> >> > >>
> >> > >> Den lör 6 okt. 2018 kl 23:24 skrev Jeff Smith :
> >> > >>>
> >> > >>> I have been removed twice.
> >> > >>> On Sat, Oct 6, 2018 at 7:07 AM Elias Abacioglu
> >> > >>>  wrote:
> >> > >>> >
> >> > >>> > Hi,
> >> > >>> >
> >> > >>> > I'm bumping this old thread cause it's getting annoying. My 
> >> > >>> > membership get disabled twice a month.
> >> > >>> > Between my two Gmail accounts I'm in more than 25 mailing lists 
> >> > >>> > and I see this behavior only here. Why is only ceph-users only 
> >> > >>> > affected? Maybe Christian was on to something, is this intentional?
> >> > >>> > Reality is that there is a lot of ceph-users with Gmail accounts, 
> >> > >>> > perhaps it wouldn't be so bad to actually trying to figure this 
> >> > >>> > one out?
> >> > >>> >
> >> > >>> > So can the maintainers of this list please investigate what 
> >> > >>> > actually gets bounced? Look at my address if you want.
> >> > >>> > I got disabled 20181006, 20180927, 20180916, 20180725, 20180718 
> >> > >>> > most recently.
> >> > >>> > Please help!
> >> > >>> >
> >> > >>> > Thanks,
> >> > >>> > Elias
> >> > >>> >
> >> > >>> > On Mon, Oct 16, 2017 at 5:41 AM Christian Balzer  
> >> > >>> > wrote:
> >> > >>> >>
> >> > >>> >>
> >> > >>> >> Most mails to this ML score low or negatively with SpamAssassin, 
> >> > >>> >> however
> >> > >>> >> once in a while (this is a recent one) we get relatively high 
> >> > >>> >> scores.
> >> > >>> >> Note that the forged bits are false positives, but the SA is up 
> >> > >>> >> to date and
> >> > >>> >> google will have similar checks:
> >> > >>> >> ---
> >> > >>> >> X-Spam-Status: No, score=3.9 required=10.0 
> >> > >>> >> tests=BAYES_00,DCC_CHECK,
> >> > >>> >>  
> >> > >>> >> FORGED_MUA_MOZILLA,FORGED_YAHOO_RCVD,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
> >> > >>> >>  
> >> > >>> >> HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MIME_HTML_MOSTLY,RCVD_IN_MSPIKE_H4,
> >> > >>> >>  RCVD_IN_MSPIKE_WL,RDNS_NONE,T_DKIM_INVALID shortcircuit=no 
> >> > >>> >> autolearn=no
> >> > >>> >> ---
> >> > >>> >>
> >> > >>> >> Between attachment mails and some of these and you're well on 
> >> > >>> >> your way out.
> >> > >>> >>
> >> > >>> >> The default mailman settings and logic require 5 bounces to 
> >> > >>> >> trigger
> >> > >>> >> unsubscription and 7 days of NO bounces to reset the counter.
> >> > >>> >>
> >> > >>> >> Christian
> >> > >>> >>
> >> > >>> >> On Mon, 16 Oct 2017 12:23:25 +0900 Christian Balzer wrote:
> >> > >>> >>
> >> > >>> >> > On Mon, 16 Oct 2017 14:15:22 +1100 Blair Bethwaite wrote:
> >> > >>> >> >
> >> > >>> >> > > Thanks Christian,
> >> > >>> >> > >
> >> > >>> >> > > You're no doubt on the right track, but I'd really like to 
> >> > >>> >> > > figure out
> >> > >>> >> > > what it is at my end - I'm unlikely to be the only person 
> >> > >>> >> > > subscribed
> >> > >>> >> > > to ceph-users via a gmail account.
> >> > >>> >> > >
> >> > >>> >> > > Re. attachments, I'm surprised mailman would be allowing them 
> >> > >>> >> > > in the
> >> > >>> >> > > first place, and even so gmail's attachment requirements are 
> >> > >>> >> > > less
> >> > >>> >> > > strict than most corporate email setups (those that don't 
> >> > >>> >> > > already use
> >> > >>> >> > > a cloud provider).
> >> > >>> >> > >
> >> > >>> >> > Mailman doesn't do anything with this by default AFAIK, but see 
> >> > >>> >> > below.
> >> > >>> >> > Strict is fine if you're in control, corporate mail can be 
> >> > >>> >> > hell, doubly so
> >> > >>> >> > if on M$ cloud.
> >> > >>> >> >
> >> > >>> >> > > This started happening earlier in the year after I turned off 
> >> > >>> >> > > digest
> >> > >>> >> > > mode. I also have a paid google domain, maybe I'll try setting
> >> > >>> >> > > delivery to that address and seeing if anything changes..

Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Paul Emmerich
Yeah, it's usually hanging in some low-level LVM tool (lvs, usually).
They unfortunately like to get stuck indefinitely on some hardware
failures, and there isn't really anything that can be done about it.
But we've found that it's far more reliable to just call lvs ourselves
instead of relying on ceph-volume lvm list when trying to detect OSDs
in a server; I'm not sure what else it does that sometimes just hangs.

One thing I've learned from working with *a lot* of different hardware
from basically all vendors: every command can hang when you get a disk
that died in some bad way. Sometimes there are workarounds, such as
invoking the necessary tools once for every disk instead of once for
all disks.


Paul

Am Mo., 8. Okt. 2018 um 23:34 Uhr schrieb Alfredo Deza :
>
> On Mon, Oct 8, 2018 at 5:04 PM Paul Emmerich  wrote:
> >
> > ceph-volume unfortunately doesn't handle completely hanging IOs too
> > well compared to ceph-disk.
>
> Not sure I follow, would you mind expanding on what you mean by
> "ceph-volume unfortunately doesn't handle completely hanging IOs" ?
>
> ceph-volume just provisions the OSD, nothing else. If LVM is hanging,
> there is nothing we could do there, just like ceph-disk wouldn't be
> able to do anything if the partitioning
> tool would hang.
>
>
>
> > It needs to read actual data from each
> > disk and it'll just hang completely if any of the disks doesn't
> > respond.
> >
> > The low-level command to get the information from LVM is:
> >
> > lvs -o lv_tags
> >
> > this allows you to map a LV to an OSD id.
> >
> >
> > Paul
> > Am Mo., 8. Okt. 2018 um 12:09 Uhr schrieb Kevin Olbrich :
> > >
> > > Hi!
> > >
> > > Yes, thank you. At least on one node this works, the other node just 
> > > freezes but this might by caused by a bad disk that I try to find.
> > >
> > > Kevin
> > >
> > > Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander 
> > > :
> > >>
> > >> Hi,
> > >>
> > >> $ ceph-volume lvm list
> > >>
> > >> Does that work for you?
> > >>
> > >> Wido
> > >>
> > >> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
> > >> > Hi!
> > >> >
> > >> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
> > >> > Before I migrated from filestore with simple-mode to bluestore with 
> > >> > lvm,
> > >> > I was able to find the raw disk with "df".
> > >> > Now, I need to go from LVM LV to PV to disk every time I need to
> > >> > check/smartctl a disk.
> > >> >
> > >> > Kevin
> > >> >
> > >> >
> > >> > ___
> > >> > ceph-users mailing list
> > >> > ceph-users@lists.ceph.com
> > >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >> >
> > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> > www.croit.io
> > Tel: +49 89 1896585 90
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Alfredo Deza
On Mon, Oct 8, 2018 at 5:04 PM Paul Emmerich  wrote:
>
> ceph-volume unfortunately doesn't handle completely hanging IOs too
> well compared to ceph-disk.

Not sure I follow, would you mind expanding on what you mean by
"ceph-volume unfortunately doesn't handle completely hanging IOs" ?

ceph-volume just provisions the OSD, nothing else. If LVM is hanging,
there is nothing we can do there, just as ceph-disk wouldn't be
able to do anything if the partitioning tool hung.



> It needs to read actual data from each
> disk and it'll just hang completely if any of the disks doesn't
> respond.
>
> The low-level command to get the information from LVM is:
>
> lvs -o lv_tags
>
> this allows you to map a LV to an OSD id.
>
>
> Paul
> Am Mo., 8. Okt. 2018 um 12:09 Uhr schrieb Kevin Olbrich :
> >
> > Hi!
> >
> > Yes, thank you. At least on one node this works, the other node just 
> > freezes but this might by caused by a bad disk that I try to find.
> >
> > Kevin
> >
> > Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander 
> > :
> >>
> >> Hi,
> >>
> >> $ ceph-volume lvm list
> >>
> >> Does that work for you?
> >>
> >> Wido
> >>
> >> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
> >> > Hi!
> >> >
> >> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
> >> > Before I migrated from filestore with simple-mode to bluestore with lvm,
> >> > I was able to find the raw disk with "df".
> >> > Now, I need to go from LVM LV to PV to disk every time I need to
> >> > check/smartctl a disk.
> >> >
> >> > Kevin
> >> >
> >> >
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Kevin Olbrich
Hi Jakub,

"ceph osd metadata X" this is perfect! This also lists multipath devices
which I was looking for!
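(For example, a sketch with the output trimmed; the OSD id is arbitrary:)

$ ceph osd metadata 40 | grep -E 'devices|dev_node'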

Kevin


Am Mo., 8. Okt. 2018 um 21:16 Uhr schrieb Jakub Jaszewski <
jaszewski.ja...@gmail.com>:

> Hi Kevin,
> Have you tried ceph osd metadata OSDid ?
>
> Jakub
>
> pon., 8 paź 2018, 19:32 użytkownik Alfredo Deza 
> napisał:
>
>> On Mon, Oct 8, 2018 at 6:09 AM Kevin Olbrich  wrote:
>> >
>> > Hi!
>> >
>> > Yes, thank you. At least on one node this works, the other node just
>> freezes but this might by caused by a bad disk that I try to find.
>>
>> If it is freezing, you could maybe try running the command where it
>> freezes? (ceph-volume will log it to the terminal)
>>
>>
>> >
>> > Kevin
>> >
>> > Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander <
>> w...@42on.com>:
>> >>
>> >> Hi,
>> >>
>> >> $ ceph-volume lvm list
>> >>
>> >> Does that work for you?
>> >>
>> >> Wido
>> >>
>> >> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
>> >> > Hi!
>> >> >
>> >> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
>> >> > Before I migrated from filestore with simple-mode to bluestore with
>> lvm,
>> >> > I was able to find the raw disk with "df".
>> >> > Now, I need to go from LVM LV to PV to disk every time I need to
>> >> > check/smartctl a disk.
>> >> >
>> >> > Kevin
>> >> >
>> >> >
>> >> > ___
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] can I define buckets in a multi-zone config that are exempted from replication?

2018-10-08 Thread Casey Bodley



On 10/08/2018 03:45 PM, Christian Rice wrote:


Just getting started here, but I am setting up a three-zone realm, 
each with a pair of S3 object gateways, Luminous on Debian.  I’m 
wondering if there’s a straightforward way to exempt some buckets from 
replicating to other zones?  The idea being that there might be data that 
pertains to a specific zone and that, perhaps due to licensing or other more 
trivial technical reasons, shouldn't be transported off site.


Documentation at http://docs.ceph.com/docs/luminous/radosgw/s3/bucketops/ 
suggests “A bucket can be constrained to a region by providing 
LocationConstraint during a PUT request.”  Is this applicable to my 
multi-zone realm?


TIA,

Christian



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Hi Christian,

A 'region' in radosgw corresponds to the zonegroup, so 
LocationConstraint isn't quite what you want. You can disable sync on a 
single bucket by running this command on the master zone:


$ radosgw-admin bucket sync disable --bucket=bucketname
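(The companion subcommands, for reference, as a sketch with the same
placeholder bucket name:)

$ radosgw-admin bucket sync status --bucket=bucketname
$ radosgw-admin bucket sync enable --bucket=bucketname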
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] list admin issues

2018-10-08 Thread Jeff Smith
I just got dumped again.  I have not sent any attachments/images.
On Mon, Oct 8, 2018 at 5:48 AM Elias Abacioglu
 wrote:
>
> If it's attachments causing this, perhaps forbid attachments? Force people to 
> use pastebin / imgur type of services?
>
> /E
>
> On Mon, Oct 8, 2018 at 1:33 PM Martin Palma  wrote:
>>
>> Same here also on Gmail with G Suite.
>> On Mon, Oct 8, 2018 at 12:31 AM Paul Emmerich  wrote:
>> >
>> > I'm also seeing this once every few months or so on Gmail with G Suite.
>> >
>> > Paul
>> > Am So., 7. Okt. 2018 um 08:18 Uhr schrieb Joshua Chen
>> > :
>> > >
>> > > I also got removed once, got another warning once (need to re-enable).
>> > >
>> > > Cheers
>> > > Joshua
>> > >
>> > >
>> > > On Sun, Oct 7, 2018 at 5:38 AM Svante Karlsson  
>> > > wrote:
>> > >>
>> > >> I'm also getting removed but not only from ceph. I subscribe 
>> > >> d...@kafka.apache.org list and the same thing happens there.
>> > >>
>> > >> Den lör 6 okt. 2018 kl 23:24 skrev Jeff Smith :
>> > >>>
>> > >>> I have been removed twice.
>> > >>> On Sat, Oct 6, 2018 at 7:07 AM Elias Abacioglu
>> > >>>  wrote:
>> > >>> >
>> > >>> > Hi,
>> > >>> >
>> > >>> > I'm bumping this old thread cause it's getting annoying. My 
>> > >>> > membership get disabled twice a month.
>> > >>> > Between my two Gmail accounts I'm in more than 25 mailing lists and 
>> > >>> > I see this behavior only here. Why is only ceph-users only affected? 
>> > >>> > Maybe Christian was on to something, is this intentional?
>> > >>> > Reality is that there is a lot of ceph-users with Gmail accounts, 
>> > >>> > perhaps it wouldn't be so bad to actually trying to figure this one 
>> > >>> > out?
>> > >>> >
>> > >>> > So can the maintainers of this list please investigate what actually 
>> > >>> > gets bounced? Look at my address if you want.
>> > >>> > I got disabled 20181006, 20180927, 20180916, 20180725, 20180718 most 
>> > >>> > recently.
>> > >>> > Please help!
>> > >>> >
>> > >>> > Thanks,
>> > >>> > Elias
>> > >>> >
>> > >>> > On Mon, Oct 16, 2017 at 5:41 AM Christian Balzer  
>> > >>> > wrote:
>> > >>> >>
>> > >>> >>
>> > >>> >> Most mails to this ML score low or negatively with SpamAssassin, 
>> > >>> >> however
>> > >>> >> once in a while (this is a recent one) we get relatively high 
>> > >>> >> scores.
>> > >>> >> Note that the forged bits are false positives, but the SA is up to 
>> > >>> >> date and
>> > >>> >> google will have similar checks:
>> > >>> >> ---
>> > >>> >> X-Spam-Status: No, score=3.9 required=10.0 tests=BAYES_00,DCC_CHECK,
>> > >>> >>  
>> > >>> >> FORGED_MUA_MOZILLA,FORGED_YAHOO_RCVD,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
>> > >>> >>  
>> > >>> >> HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MIME_HTML_MOSTLY,RCVD_IN_MSPIKE_H4,
>> > >>> >>  RCVD_IN_MSPIKE_WL,RDNS_NONE,T_DKIM_INVALID shortcircuit=no 
>> > >>> >> autolearn=no
>> > >>> >> ---
>> > >>> >>
>> > >>> >> Between attachment mails and some of these and you're well on your 
>> > >>> >> way out.
>> > >>> >>
>> > >>> >> The default mailman settings and logic require 5 bounces to trigger
>> > >>> >> unsubscription and 7 days of NO bounces to reset the counter.
>> > >>> >>
>> > >>> >> Christian
>> > >>> >>
>> > >>> >> On Mon, 16 Oct 2017 12:23:25 +0900 Christian Balzer wrote:
>> > >>> >>
>> > >>> >> > On Mon, 16 Oct 2017 14:15:22 +1100 Blair Bethwaite wrote:
>> > >>> >> >
>> > >>> >> > > Thanks Christian,
>> > >>> >> > >
>> > >>> >> > > You're no doubt on the right track, but I'd really like to 
>> > >>> >> > > figure out
>> > >>> >> > > what it is at my end - I'm unlikely to be the only person 
>> > >>> >> > > subscribed
>> > >>> >> > > to ceph-users via a gmail account.
>> > >>> >> > >
>> > >>> >> > > Re. attachments, I'm surprised mailman would be allowing them 
>> > >>> >> > > in the
>> > >>> >> > > first place, and even so gmail's attachment requirements are 
>> > >>> >> > > less
>> > >>> >> > > strict than most corporate email setups (those that don't 
>> > >>> >> > > already use
>> > >>> >> > > a cloud provider).
>> > >>> >> > >
>> > >>> >> > Mailman doesn't do anything with this by default AFAIK, but see 
>> > >>> >> > below.
>> > >>> >> > Strict is fine if you're in control, corporate mail can be hell, 
>> > >>> >> > doubly so
>> > >>> >> > if on M$ cloud.
>> > >>> >> >
>> > >>> >> > > This started happening earlier in the year after I turned off 
>> > >>> >> > > digest
>> > >>> >> > > mode. I also have a paid google domain, maybe I'll try setting
>> > >>> >> > > delivery to that address and seeing if anything changes...
>> > >>> >> > >
>> > >>> >> > Don't think google domain is handled differently, but what do I 
>> > >>> >> > know.
>> > >>> >> >
>> > >>> >> > Though the digest bit confirms my suspicion about attachments:
>> > >>> >> > ---
>> > >>> >> > When a subscriber chooses to receive plain text daily “digests” 
>> > >>> >> > of list
>> > >>> >> > messages, Mailman sends the digest messages without any original
>> > >>> >> > attachments (in Mailman lingo, it “scrubs

Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Paul Emmerich
ceph-volume unfortunately doesn't handle completely hanging IOs too
well compared to ceph-disk. It needs to read actual data from each
disk and it'll just hang completely if any of the disks doesn't
respond.

The low-level command to get the information from LVM is:

lvs -o lv_tags

this allows you to map a LV to an OSD id.
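(For example, a sketch; the OSD id here is just an illustration:)

$ lvs -o lv_name,vg_name,lv_tags | grep 'ceph.osd_id=40'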


Paul
Am Mo., 8. Okt. 2018 um 12:09 Uhr schrieb Kevin Olbrich :
>
> Hi!
>
> Yes, thank you. At least on one node this works, the other node just freezes 
> but this might by caused by a bad disk that I try to find.
>
> Kevin
>
> Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander :
>>
>> Hi,
>>
>> $ ceph-volume lvm list
>>
>> Does that work for you?
>>
>> Wido
>>
>> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
>> > Hi!
>> >
>> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
>> > Before I migrated from filestore with simple-mode to bluestore with lvm,
>> > I was able to find the raw disk with "df".
>> > Now, I need to go from LVM LV to PV to disk every time I need to
>> > check/smartctl a disk.
>> >
>> > Kevin
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] advice needed for different projects design

2018-10-08 Thread Joshua Chen
Hello all,
  When planning for my institute's needs, I would like to seek design
suggestions from you for my particular situation:

1. I will support many projects; currently they are all NFS servers (and
those NFS servers each serve their own clients). For example nfsA (for
clients belonging to projectA), nfsB, nfsC, and so on.

2. Out of the institute's total capacity (currently 200TB), I would like nfsA,
nfsB, nfsC, etc. to only see their individually assigned capacities; for
example, nfsA only gets 50TB at its /export/nfsdata, nfsB only sees 140TB,
nfsC only 10TB, and so on.

3. My question is: what would be a good choice to provide storage to
those NFS servers?

RBD? Is RBD suitable for a single block device of hundreds of TB backing
an NFS server?

CephFS? This seems like a good solution for me: the NFS servers could mount
CephFS and re-export it over NFS. But how could I make the different projects
(nfsA, nfsB, nfsC) 'see' or 'mount' only part of the total 200TB capacity?
There would have to be many small CephFS trees, each with its own smaller
assigned capacity.
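
If I understand the docs right, per-directory quotas would let me carve out
those smaller capacities from one CephFS, roughly like this (assuming
/mnt/cephfs/projectA is the directory that nfsA exports; please correct me if
this is not the intended way):

  setfattr -n ceph.quota.max_bytes -v 50000000000000 /mnt/cephfs/projectA   # ~50TB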

Rados? I don't have much experience on this ? is rados suitable for this
multi project servers' need?


Thanks in advance

Cheers
Joshua
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDSs still core dumping

2018-10-08 Thread Alfredo Daniel Rezinovsky



On 08/10/18 17:41, Sergey Malinin wrote:


On 8.10.2018, at 23:23, Alfredo Daniel Rezinovsky <alfrenov...@gmail.com> wrote:


I need the data, even if it's read only.


After full data scan you should have been able to boot mds 13.2.2 and 
mount the fs.
The problem started with the upgrade to 13.2.2. I downgraded to 13.2.1
and did what Yan Zheng told me.


The MDS reports problems with the journals, and even after resetting the
journals the MDS won't start.
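
By resetting I mean roughly the following, in case the exact syntax matters
(correct me if these are not the right invocations):

  cephfs-journal-tool --journal=mdlog journal reset
  cephfs-journal-tool --journal=purge_queue journal reset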
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDSs still core dumping

2018-10-08 Thread Sergey Malinin

> On 8.10.2018, at 23:23, Alfredo Daniel Rezinovsky  
> wrote:
> 
> I need the data, even if it's read only.

After full data scan you should have been able to boot mds 13.2.2 and mount the 
fs.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] MDSs still core dumping

2018-10-08 Thread Alfredo Daniel Rezinovsky
It seems my purge_queue journal is damaged. Even if I reset it keeps 
damaged.


What means inotablev mismatch ?


2018-10-08 16:40:03.144 7f05b6099700 -1 log_channel(cluster) log [ERR] : 
journal replay inotablev mismatch 1 -> 42160
/build/ceph-13.2.1/src/mds/journal.cc: In function 'void 
EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread 
7f05b6099700 time 2018-10-08 16:40:03.150639
/build/ceph-13.2.1/src/mds/journal.cc: 1572: FAILED 
assert(g_conf->mds_wipe_sessions)
2018-10-08 16:40:03.144 7f05b6099700 -1 log_channel(cluster) log [ERR] : 
journal replay sessionmap v 20302542 -(1|2) > table 0
 ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic 
(stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x102) [0x7f05c649ff32]

 2: (()+0x26c0f7) [0x7f05c64a00f7]
 3: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x5f4b) 
[0x5557a384706b]

 4: (EUpdate::replay(MDSRank*)+0x39) [0x5557a38485a9]
 5: (MDLog::_replay_thread()+0x864) [0x5557a37f0c24]
 6: (MDLog::ReplayThread::entry()+0xd) [0x5557a3594c0d]
 7: (()+0x76db) [0x7f05c5dac6db]
 8: (clone()+0x3f) [0x7f05c4f9288f]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.
2018-10-08 16:40:03.148 7f05b6099700 -1 
/build/ceph-13.2.1/src/mds/journal.cc: In function 'void 
EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread 
7f05b6099700 time 2018-10-08 16:40:03.150639
/build/ceph-13.2.1/src/mds/journal.cc: 1572: FAILED 
assert(g_conf->mds_wipe_sessions)


 ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic 
(stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x102) [0x7f05c649ff32]

 2: (()+0x26c0f7) [0x7f05c64a00f7]
 3: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x5f4b) 
[0x5557a384706b]

 4: (EUpdate::replay(MDSRank*)+0x39) [0x5557a38485a9]
 5: (MDLog::_replay_thread()+0x864) [0x5557a37f0c24]
 6: (MDLog::ReplayThread::entry()+0xd) [0x5557a3594c0d]
 7: (()+0x76db) [0x7f05c5dac6db]
 8: (clone()+0x3f) [0x7f05c4f9288f]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.




There's a way to import an empty journal?

I need the data, even if it's read only.





--
Alfrenovsky

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2018-10-08 Thread Stefan Kooman
Quoting Stefan Kooman (ste...@bit.nl):
 
> > From what you've described here, it's most likely that the MDS is trying to
> > read something out of RADOS which is taking a long time, and which we
> > didn't expect to cause a slow down. You can check via the admin socket to
> > see if there are outstanding Objecter requests or ops_in_flight to get a
> > clue.

I double-checked the load of the OSDs, MONs and MDSs, but that's all
normal. I would expect "slow requests" first, before hitting a timeout of
some sort.
What would cause an MDS to spin and consume 100% CPU if a request were
slow?
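
For the record, I take it the admin socket checks would be something like
the following (with our mds name filled in)?

  ceph daemon mds.<name> objecter_requests
  ceph daemon mds.<name> dump_ops_in_flight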

Thanks,

Stefan

-- 
| BIT BV  http://www.bit.nl/Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] can I define buckets in a multi-zone config that are exempted from replication?

2018-10-08 Thread Christian Rice
Just getting started here, but I am setting up a three-zone realm, each zone with a
pair of S3 object gateways, Luminous on Debian.  I'm wondering if there's a
straightforward way to exempt some buckets from replicating to other zones?
The idea being that there might be data that pertains to a specific zone and that,
perhaps due to licensing or other more mundane technical reasons, shouldn't be
transported off site.

Documentation at http://docs.ceph.com/docs/luminous/radosgw/s3/bucketops/ 
suggests “A bucket can be constrained to a region by providing 
LocationConstraint during a PUT request.”  Is this applicable to my multi-zone 
realm?
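
For instance, I imagine creating such a zone-local bucket would look roughly
like this with awscli (untested, and I'm guessing at the zonegroup:placement
value to use for LocationConstraint):

  aws --endpoint-url http://my-local-rgw.example.com s3api create-bucket \
    --bucket local-only-data \
    --create-bucket-configuration LocationConstraint=us:us-east-placement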

TIA,
Christian
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Alfredo Daniel Rezinovsky



On 08/10/18 11:47, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 9:46 PM Alfredo Daniel Rezinovsky
 wrote:



On 08/10/18 10:20, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 9:07 PM Alfredo Daniel Rezinovsky
 wrote:


On 08/10/18 09:45, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 6:40 PM Alfredo Daniel Rezinovsky
 wrote:

On 08/10/18 07:06, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 5:43 PM Sergey Malinin  wrote:

On 8.10.2018, at 12:37, Yan, Zheng  wrote:

On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin  wrote:

What additional steps need to be taken in order to (try to) regain access to 
the fs providing that I backed up metadata pool, created alternate metadata 
pool and ran scan_extents, scan_links, scan_inodes, and somewhat recursive 
scrub.
After that I only mounted the fs read-only to backup the data.
Would anything even work if I had mds journal and purge queue truncated?


did you backed up whole metadata pool?  did you make any modification
to the original metadata pool? If you did, what modifications?

I backed up both journal and purge queue and used cephfs-journal-tool to 
recover dentries, then reset journal and purge queue on original metadata pool.

You can try restoring original journal and purge queue, then downgrade
mds to 13.2.1.   Journal objects names are 20x., purge queue
objects names are 50x.x.

I have already done a scan_extents and am doing a scan_inodes. Do I need to
finish with the scan_links?

I'm on 13.2.2. Do I finish the scan_links and then downgrade?

I have a backup done with "cephfs-journal-tool journal export
backup.bin". I think I don't have the purge queue.

Can I reset the purge queue journal? Can I import an empty file?


It's better to restore journal to original metadata pool and reset
purge queue to empty, then try starting mds. Reset the purge queue
will leave some objects in orphan states. But we can handle them
later.

Regards
Yan, Zheng

Let's see...

"cephfs-journal-tool journal import  backup.bin" will restore the whole
metadata ?
That's what "journal" means?


It just restores the journal. If you only reset original fs' journal
and purge queue (run scan_foo commands with alternate metadata pool).
It's highly likely restoring the journal will bring your fs back.




So I can stop cephfs-data-scan, run the import, downgrade, and then
reset the purge queue?


you said you have already run scan_extents and scan_inodes. what
cephfs-data-scan command is running?

Already ran (without alternate metadata)

time cephfs-data-scan scan_extents cephfs_data # 10 hours

time cephfs-data-scan scan_inodes cephfs_data # running 3 hours
with a warning:
7fddd8f64ec0 -1 datascan.inject_with_backtrace: Dentry
0x0x1db852b/dovecot.index already exists but points to 0x0x1000134f97f

Still not run:

time cephfs-data-scan scan_links


you have modified metatdata pool. I suggest you to run scan_links.
After it finishes, reset session table and try restarting mds. If mds
start successfully, run 'ceph daemon mds.x scrub_path / recursive
repair'. (don't let client mount before it finishes)

Good luck


The MDS is still not starting (with 13.2.1) after a full data scan, journal
import, and purge queue journal reset.


starting mds.storage-02 at -
/build/ceph-13.2.1/src/mds/journal.cc: In function 'void 
EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread 
7f36f98be700 time 2018-10-08 16:21:18.804623
/build/ceph-13.2.1/src/mds/journal.cc: 1572: FAILED 
assert(g_conf->mds_wipe_sessions)
2018-10-08 16:21:18.802 7f36f98be700 -1 log_channel(cluster) log [ERR] : 
journal replay inotablev mismatch 1 -> 42160
2018-10-08 16:21:18.802 7f36f98be700 -1 log_channel(cluster) log [ERR] : 
journal replay sessionmap v 20302542 -(1|2) > table 0
 ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic 
(stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x102) [0x7f3709cc4f32]

 2: (()+0x26c0f7) [0x7f3709cc50f7]
 3: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x5f4b) 
[0x5616f5fee06b]

 4: (EUpdate::replay(MDSRank*)+0x39) [0x5616f5fef5a9]
 5: (MDLog::_replay_thread()+0x864) [0x5616f5f97c24]
 6: (MDLog::ReplayThread::entry()+0xd) [0x5616f5d3bc0d]
 7: (()+0x76db) [0x7f37095d16db]
 8: (clone()+0x3f) [0x7f37087b788f]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.
2018-10-08 16:21:18.802 7f36f98be700 -1 
/build/ceph-13.2.1/src/mds/journal.cc: In function 'void 
EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread 
7f36f98be700 time

2018-10-08 16:21:18.804623
/build/ceph-13.2.1/src/mds/journal.cc: 1572: FAILED 
assert(g_conf->mds_wipe_sessions)


core dumped.




After importing the original journal, run 'ceph mds repaired
fs_name:damaged_rank', then try restarting the mds. Check if the mds can
start.


Please remind me of the commands:
I've been 3 days without sleep, and I don't want to break it further.


sorry for that.

I updated on friday, broke a golden rule: "READ ONLY FRIDAY". My fault.

Thanks





Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Jakub Jaszewski
Hi Kevin,
Have you tried ceph osd metadata OSDid ?
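
e.g. something like this should show the backing device directly (field names
from memory, so double-check them):

  ceph osd metadata 12 | grep -E '"devices"|partition_path'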

Jakub

pon., 8 paź 2018, 19:32 użytkownik Alfredo Deza  napisał:

> On Mon, Oct 8, 2018 at 6:09 AM Kevin Olbrich  wrote:
> >
> > Hi!
> >
> > Yes, thank you. At least on one node this works; the other node just
> freezes, but this might be caused by a bad disk that I am trying to find.
>
> If it is freezing, you could maybe try running the command where it
> freezes? (ceph-volume will log it to the terminal)
>
>
> >
> > Kevin
> >
> > Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander <
> w...@42on.com>:
> >>
> >> Hi,
> >>
> >> $ ceph-volume lvm list
> >>
> >> Does that work for you?
> >>
> >> Wido
> >>
> >> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
> >> > Hi!
> >> >
> >> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
> >> > Before I migrated from filestore with simple-mode to bluestore with
> lvm,
> >> > I was able to find the raw disk with "df".
> >> > Now, I need to go from LVM LV to PV to disk every time I need to
> >> > check/smartctl a disk.
> >> >
> >> > Kevin
> >> >
> >> >
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Alfredo Deza
On Mon, Oct 8, 2018 at 6:09 AM Kevin Olbrich  wrote:
>
> Hi!
>
> Yes, thank you. At least on one node this works; the other node just freezes,
> but this might be caused by a bad disk that I am trying to find.

If it is freezing, you could maybe try running the command where it
freezes? (ceph-volume will log it to the terminal)
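
The same calls should also end up in the ceph-volume log (if I remember the
default path right), so tailing it while the command hangs should show the
exact lvs/blkid call that is stuck:

  tail -f /var/log/ceph/ceph-volume.log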


>
> Kevin
>
> Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander :
>>
>> Hi,
>>
>> $ ceph-volume lvm list
>>
>> Does that work for you?
>>
>> Wido
>>
>> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
>> > Hi!
>> >
>> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
>> > Before I migrated from filestore with simple-mode to bluestore with lvm,
>> > I was able to find the raw disk with "df".
>> > Now, I need to go from LVM LV to PV to disk every time I need to
>> > check/smartctl a disk.
>> >
>> > Kevin
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Don't upgrade to 13.2.2 if you use cephfs

2018-10-08 Thread Patrick Donnelly
+ceph-announce

On Sun, Oct 7, 2018 at 7:30 PM Yan, Zheng  wrote:
> There is a bug in v13.2.2 mds, which causes decoding purge queue to
> fail. If mds is already in damaged state, please downgrade mds to
> 13.2.1, then run 'ceph mds repaired fs_name:damaged_rank' .
>
> Sorry for all the trouble I caused.
> Yan, Zheng

This issue is being tracked here: http://tracker.ceph.com/issues/36346

The problem was caused by a backport of the wrong commit which
unfortunately was not caught. The backport was not done to Luminous;
only Mimic 13.2.2 is affected. New deployments on 13.2.2 are also
affected but do not require immediate action. A procedure for handling
upgrades of fresh deployments from 13.2.2 to 13.2.3 will be included
in the release notes for 13.2.3.
-- 
Patrick Donnelly
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd ls operation not permitted

2018-10-08 Thread Jason Dillaman
On Mon, Oct 8, 2018 at 11:33 AM  wrote:
>
> Thanks, changing rxw to rwx solved the problem. But again, it is
> strange.  I am issuing the rbd command against the ssdvolumes pool and
> not ssdvolumes-13. And why does "allow *" on the mon solves the problem.
> I am a bit lost :-)
>
> --
> This does work
> --
> caps: [mon] allow *
> caps: [osd] allow *
> $ rbd ls -p ssdvolumes --id openstack
> volume-e61ec087-e654-471b-975f-f72b753a3bb0
> $
>
>
> --
> This does NOT work
> --
> caps: [mon] allow r
> caps: [osd] allow class-read object_prefix rbd_children, allow rwx
> pool=ssdvolumes, allow rxw pool=ssdvolumes-13, allow rwx
> pool=sasvolumes-13, allow rwx pool=sasvolumes, allow rwx pool=vms, allow
> rwx pool=images
> $ rbd ls -p ssdvolumes --id openstack
> rbd: list: (1) Operation not permitted
> $
>
>
> --
> This does work
> --
> caps: [mon] allow r
> caps: [osd] allow class-read object_prefix rbd_children, allow rwx
> pool=ssdvolumes, allow rwx pool=ssdvolumes-13, allow rwx
> pool=sasvolumes-13, allow rwx pool=sasvolumes, allow rwx pool=vms, allow
> rwx pool=images
> $ rbd ls -p ssdvolumes --id openstack
> volume-e61ec087-e654-471b-975f-f72b753a3bb0
> $
>
>
> Strange thing is, with an older rbd (like we use in Openstack Ocata) we
> don't see this behavior.

I unsuccessfully tried to re-create this using a Jewel v10.2.7 build
(MON, OSD, and client) but I received the expected "Operation not
permitted" due to the corrupt OSD caps. Starting with Jewel v10.2.11,
the monitor will now at least prevent you from setting corrupt caps on
a user.

>
> On 08-10-2018 17:04, Jason Dillaman wrote:
> > On Mon, Oct 8, 2018 at 10:20 AM  wrote:
> >>
> >> On a Ceph Monitor:
> >> # ceph auth get client.openstack | grep caps
> >> exported keyring for client.openstack
> >> caps mon = "allow r"
> >> caps osd = "allow class-read object_prefix rbd_children, allow
> >> rwx
> >> pool=ssdvolumes, allow rxw pool=ssdvolumes-13, allow rwx
> >> pool=sasvolumes-13, allow rwx pool=sasvolumes, allow rwx pool=vms,
> >> allow
> >> rwx pool=images"
> >> #
> >
> > By chance, is your issue really that your OpenStack 13 cluster cannot
> > access the pool named "ssdvolumes-13"? I ask because you have a typo
> > on your "rwx" cap (you have "rxw" instead).
> >
> >>
> >> On the problematic Openstack cluster:
> >> $ ceph auth get client.openstack --id openstack | grep caps
> >> Error EACCES: access denied
> >> $
> >>
> >>
> >> When I change "caps: [mon] allow r" to "caps: [mon] allow *" the
> >> problem
> >> disappears.
> >>
> >>
> >> On 08-10-2018 16:06, Jason Dillaman wrote:
> >> > Can you run "ceph auth get client.openstack | grep caps"?
> >> >
> >> > On Mon, Oct 8, 2018 at 10:03 AM  wrote:
> >> >>
> >> >> The result of your command:
> >> >>
> >> >> $ rbd ls --debug-rbd=20 -p ssdvolumes --id openstack
> >> >> 2018-10-08 13:42:17.386505 7f604933fd40 20 librbd: list 0x7fff5b25cc30
> >> >> rbd: list: (1) Operation not permitted
> >> >> $
> >> >>
> >> >> Thanks!
> >> >> Sinan
> >> >>
> >> >> On 08-10-2018 15:37, Jason Dillaman wrote:
> >> >> > On Mon, Oct 8, 2018 at 9:24 AM  wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> I am running a Ceph cluster (Jewel, ceph version 10.2.10-17.el7cp).
> >> >> >>
> >> >> >>
> >> >> >> I also have 2 OpenStack clusters (Ocata (v12) and Pike (v13)).
> >> >> >>
> >> >> >> When I perform a "rbd ls -p  --id openstack" on the OpenStack
> >> >> >> Ocata cluster it works fine, when I perform the same command on the
> >> >> >> OpenStack Pike cluster I am getting an "operation not permitted".
> >> >> >>
> >> >> >>
> >> >> >> OpenStack Ocata (where it does work fine):
> >> >> >> $ rbd -v
> >> >> >> ceph version 10.2.7-48.el7cp
> >> >> >> (cf7751bcd460c757e596d3ee2991884e13c37b96)
> >> >> >> $ rpm -qa | grep rbd
> >> >> >> python-rbd-10.2.7-48.el7cp.x86_64
> >> >> >> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.6.x86_64
> >> >> >> librbd1-10.2.7-48.el7cp.x86_64
> >> >> >> rbd-mirror-10.2.7-48.el7cp.x86_64
> >> >> >> $
> >> >> >>
> >> >> >> OpenStack Pike (where it doesn't work, operation not permitted):
> >> >> >> $ rbd -v
> >> >> >> ceph version 12.2.4-10.el7cp
> >> >> >> (03fd19535b3701f3322c68b5f424335d6fc8dd66)
> >> >> >> luminous (stable)
> >> >> >> $ rpm -qa | grep rbd
> >> >> >> rbd-mirror-12.2.4-10.el7cp.x86_64
> >> >> >> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.5.x86_64
> >> >> >> librbd1-12.2.4-10.el7cp.x86_64
> >> >> >> python-rbd-12.2.4-10.el7cp.x86_64
> >> >> >> $
> >> >> >
> >> >> > Can you run "rbd --debug-rbd=20 ls -p  --id openstack" and
> >> >> > pastebin the resulting logs?
> >> >> >
> >> >> >>
> >> >> >> Both clusters are using the same Ceph client key, same Ceph
> >> >> >> configuration file.
> >> >> >>
> >> >> >> The only difference is the version of rbd.
> >> >> >>
> >> >> >> Is this expected behavior?
> >> >> >>
> >> >> >>
> >> >> >> Thanks!
> >> >> >> Sinan
> >> >> >> ___
> >> >> >> ceph-users mailing list
> >> >> >> ceph-users@li

Re: [ceph-users] Don't upgrade to 13.2.2 if you use cephfs

2018-10-08 Thread Alex Litvak


This would be a question I have had since Zheng posted the problem.  I
recently purged a brand new cluster because I needed to change the default
WAL/DB settings on all OSDs in a collocated scenario.  I decided to jump
to 13.2.2 rather than upgrade from 13.2.1.  Now I wonder if I am still
in trouble.


Also, shouldn't the message to users be less subtle (... a fix is coming
...), as this seems to be a production-affecting issue for some?


On 10/8/2018 11:18 AM, Paul Emmerich wrote:

Does this only affect upgraded CephFS deployments? A fresh 13.2.2
should work fine if I'm interpreting this bug correctly?

Paul

Am Mo., 8. Okt. 2018 um 11:53 Uhr schrieb Daniel Carrasco
:




El lun., 8 oct. 2018 5:44, Yan, Zheng  escribió:


On Mon, Oct 8, 2018 at 11:34 AM Daniel Carrasco  wrote:


I've got several problems on 12.2.8 too. All my standby MDS uses a lot of 
memory (while active uses normal memory), and I'm receiving a lot of slow MDS 
messages (causing the webpage to freeze and fail until MDS are restarted)... 
Finally I had to copy the entire site to DRBD and use NFS to solve all 
problems...



was standby-replay enabled?



I've tried both and I've seen more or less the same behavior, maybe less when it's
not in replay mode.

Anyway, we've deactivated CephFS for now there. I'll try with older versions on 
a test environment




El lun., 8 oct. 2018 a las 5:21, Alex Litvak () 
escribió:


How is this not an emergency announcement?  Also I wonder if I can
downgrade at all ?  I am using ceph with docker deployed with
ceph-ansible.  I wonder if I should push downgrade or basically wait for
the fix.  I believe, a fix needs to be provided.

Thank you,

On 10/7/2018 9:30 PM, Yan, Zheng wrote:

There is a bug in v13.2.2 mds, which causes decoding purge queue to
fail. If mds is already in damaged state, please downgrade mds to
13.2.1, then run 'ceph mds repaired fs_name:damaged_rank' .

Sorry for all the trouble I caused.
Yan, Zheng




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
_

   Daniel Carrasco Marín
   Ingeniería para la Innovación i2TIC, S.L.
   Tlf:  +34 911 12 32 84 Ext: 223
   www.i2tic.com
_
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Don't upgrade to 13.2.2 if you use cephfs

2018-10-08 Thread Paul Emmerich
Does this only affect upgraded CephFS deployments? A fresh 13.2.2
should work fine if I'm interpreting this bug correctly?

Paul

Am Mo., 8. Okt. 2018 um 11:53 Uhr schrieb Daniel Carrasco
:
>
>
>
> El lun., 8 oct. 2018 5:44, Yan, Zheng  escribió:
>>
>> On Mon, Oct 8, 2018 at 11:34 AM Daniel Carrasco  wrote:
>> >
>> > I've got several problems on 12.2.8 too. All my standby MDS uses a lot of 
>> > memory (while active uses normal memory), and I'm receiving a lot of slow 
>> > MDS messages (causing the webpage to freeze and fail until MDS are 
>> > restarted)... Finally I had to copy the entire site to DRBD and use NFS to 
>> > solve all problems...
>> >
>>
>> was standby-replay enabled?
>
>
> I've tried both and I've seen more less the same behavior, maybe less when is 
> not in replay mode.
>
> Anyway, we've deactivated CephFS for now there. I'll try with older versions 
> on a test environment
>
>>
>> > El lun., 8 oct. 2018 a las 5:21, Alex Litvak 
>> > () escribió:
>> >>
>> >> How is this not an emergency announcement?  Also I wonder if I can
>> >> downgrade at all ?  I am using ceph with docker deployed with
>> >> ceph-ansible.  I wonder if I should push downgrade or basically wait for
>> >> the fix.  I believe, a fix needs to be provided.
>> >>
>> >> Thank you,
>> >>
>> >> On 10/7/2018 9:30 PM, Yan, Zheng wrote:
>> >> > There is a bug in v13.2.2 mds, which causes decoding purge queue to
>> >> > fail. If mds is already in damaged state, please downgrade mds to
>> >> > 13.2.1, then run 'ceph mds repaired fs_name:damaged_rank' .
>> >> >
>> >> > Sorry for all the trouble I caused.
>> >> > Yan, Zheng
>> >> >
>> >>
>> >>
>> >> ___
>> >> ceph-users mailing list
>> >> ceph-users@lists.ceph.com
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >
>> >
>> > --
>> > _
>> >
>> >   Daniel Carrasco Marín
>> >   Ingeniería para la Innovación i2TIC, S.L.
>> >   Tlf:  +34 911 12 32 84 Ext: 223
>> >   www.i2tic.com
>> > _
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd ls operation not permitted

2018-10-08 Thread sinan
Thanks, changing rxw to rwx solved the problem. But again, it is 
strange.  I am issuing the rbd command against the ssdvolumes pool and 
not ssdvolumes-13. And why does "allow *" on the mon solve the problem?
I am a bit lost :-)


--
This does work
--
caps: [mon] allow *
caps: [osd] allow *
$ rbd ls -p ssdvolumes --id openstack
volume-e61ec087-e654-471b-975f-f72b753a3bb0
$


--
This does NOT work
--
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
pool=ssdvolumes, allow rxw pool=ssdvolumes-13, allow rwx 
pool=sasvolumes-13, allow rwx pool=sasvolumes, allow rwx pool=vms, allow 
rwx pool=images

$ rbd ls -p ssdvolumes --id openstack
rbd: list: (1) Operation not permitted
$


--
This does work
--
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
pool=ssdvolumes, allow rwx pool=ssdvolumes-13, allow rwx 
pool=sasvolumes-13, allow rwx pool=sasvolumes, allow rwx pool=vms, allow 
rwx pool=images

$ rbd ls -p ssdvolumes --id openstack
volume-e61ec087-e654-471b-975f-f72b753a3bb0
$


Strange thing is, with an older rbd (like we use in Openstack Ocata) we 
don't see this behavior.




On 08-10-2018 17:04, Jason Dillaman wrote:

On Mon, Oct 8, 2018 at 10:20 AM  wrote:


On a Ceph Monitor:
# ceph auth get client.openstack | grep caps
exported keyring for client.openstack
caps mon = "allow r"
caps osd = "allow class-read object_prefix rbd_children, allow 
rwx

pool=ssdvolumes, allow rxw pool=ssdvolumes-13, allow rwx
pool=sasvolumes-13, allow rwx pool=sasvolumes, allow rwx pool=vms, 
allow

rwx pool=images"
#


By chance, is your issue really that your OpenStack 13 cluster cannot
access the pool named "ssdvolumes-13"? I ask because you have a typo
on your "rwx" cap (you have "rxw" instead).



On the problematic Openstack cluster:
$ ceph auth get client.openstack --id openstack | grep caps
Error EACCES: access denied
$


When I change "caps: [mon] allow r" to "caps: [mon] allow *" the 
problem

disappears.


On 08-10-2018 16:06, Jason Dillaman wrote:
> Can you run "ceph auth get client.openstack | grep caps"?
>
> On Mon, Oct 8, 2018 at 10:03 AM  wrote:
>>
>> The result of your command:
>>
>> $ rbd ls --debug-rbd=20 -p ssdvolumes --id openstack
>> 2018-10-08 13:42:17.386505 7f604933fd40 20 librbd: list 0x7fff5b25cc30
>> rbd: list: (1) Operation not permitted
>> $
>>
>> Thanks!
>> Sinan
>>
>> On 08-10-2018 15:37, Jason Dillaman wrote:
>> > On Mon, Oct 8, 2018 at 9:24 AM  wrote:
>> >>
>> >> Hi,
>> >>
>> >> I am running a Ceph cluster (Jewel, ceph version 10.2.10-17.el7cp).
>> >>
>> >>
>> >> I also have 2 OpenStack clusters (Ocata (v12) and Pike (v13)).
>> >>
>> >> When I perform a "rbd ls -p  --id openstack" on the OpenStack
>> >> Ocata cluster it works fine, when I perform the same command on the
>> >> OpenStack Pike cluster I am getting an "operation not permitted".
>> >>
>> >>
>> >> OpenStack Ocata (where it does work fine):
>> >> $ rbd -v
>> >> ceph version 10.2.7-48.el7cp
>> >> (cf7751bcd460c757e596d3ee2991884e13c37b96)
>> >> $ rpm -qa | grep rbd
>> >> python-rbd-10.2.7-48.el7cp.x86_64
>> >> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.6.x86_64
>> >> librbd1-10.2.7-48.el7cp.x86_64
>> >> rbd-mirror-10.2.7-48.el7cp.x86_64
>> >> $
>> >>
>> >> OpenStack Pike (where it doesn't work, operation not permitted):
>> >> $ rbd -v
>> >> ceph version 12.2.4-10.el7cp
>> >> (03fd19535b3701f3322c68b5f424335d6fc8dd66)
>> >> luminous (stable)
>> >> $ rpm -qa | grep rbd
>> >> rbd-mirror-12.2.4-10.el7cp.x86_64
>> >> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.5.x86_64
>> >> librbd1-12.2.4-10.el7cp.x86_64
>> >> python-rbd-12.2.4-10.el7cp.x86_64
>> >> $
>> >
>> > Can you run "rbd --debug-rbd=20 ls -p  --id openstack" and
>> > pastebin the resulting logs?
>> >
>> >>
>> >> Both clusters are using the same Ceph client key, same Ceph
>> >> configuration file.
>> >>
>> >> The only difference is the version of rbd.
>> >>
>> >> Is this expected behavior?
>> >>
>> >>
>> >> Thanks!
>> >> Sinan
>> >> ___
>> >> ceph-users mailing list
>> >> ceph-users@lists.ceph.com
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mons are using a lot of disk space and has a lot of old osd maps

2018-10-08 Thread Aleksei Zakharov
As i can see, all pg's are active+clean:

~# ceph -s
  cluster:
id: d168189f-6105-4223-b244-f59842404076
health: HEALTH_WARN
noout,nodeep-scrub flag(s) set
mons 1,2,3,4,5 are using a lot of disk space
 
  services:
mon: 5 daemons, quorum 1,2,3,4,5
mgr: api1(active), standbys: api2
osd: 832 osds: 791 up, 790 in
 flags noout,nodeep-scrub
 
  data:
pools:   10 pools, 52336 pgs
objects: 47.78M objects, 238TiB
usage:   854TiB used, 1.28PiB / 2.12PiB avail
pgs: 52336 active+clean
 
  io:
client:   929MiB/s rd, 1.16GiB/s wr, 31.85kop/s rd, 36.19kop/s wr


08.10.2018, 22:11, "Wido den Hollander" :
> On 10/08/2018 05:04 PM, Aleksei Zakharov wrote:
>>  Hi all,
>>
>>  We've upgraded our cluster from jewel to luminous and re-created monitors 
>> using rocksdb.
>>  Now we see, that mon's are using a lot of disk space and used space only 
>> grows. It is about 17GB for now. It was ~13GB when we used leveldb and jewel 
>> release.
>>
>>  When we added new osd's we saw that it downloads from monitors a lot of 
>> data. It was ~15GiB few days ago and it is ~18GiB today.
>>  One of the osd's we created uses filestore and it looks like old osd maps 
>> are not removed:
>>
>>  ~# find /var/lib/ceph/osd/ceph-224/current/meta/ | wc -l
>>  73590
>>
>>  I've tried to run manual compaction (ceph tell mon.NUM compact) but it 
>> doesn't help.
>>
>>  So, how to stop this growth of data on monitors?
>
> What is the status of Ceph? Can you post the output of:
>
> $ ceph -s
>
> MONs do not trim their database if one or more PGs aren't active+clean.
>
> Wido
>
>>  --
>>  Regards,
>>  Aleksei Zakharov
>>
>>  ___
>>  ceph-users mailing list
>>  ceph-users@lists.ceph.com
>>  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Regards,
Aleksei Zakharov

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mons are using a lot of disk space and has a lot of old osd maps

2018-10-08 Thread Wido den Hollander


On 10/08/2018 05:04 PM, Aleksei Zakharov wrote:
> Hi all,
> 
> We've upgraded our cluster from jewel to luminous and re-created monitors 
> using rocksdb.
> Now we see, that mon's are using a lot of disk space and used space only 
> grows. It is about 17GB for now. It was ~13GB when we used leveldb and jewel 
> release.
> 
> When we added new osd's we saw that it downloads from monitors a lot of data. 
> It was ~15GiB few days ago and it is ~18GiB today.
> One of the osd's we created uses filestore and it looks like old osd maps are 
> not removed:
> 
> ~# find /var/lib/ceph/osd/ceph-224/current/meta/ | wc -l
> 73590
> 
> I've tried to run manual compaction (ceph tell mon.NUM compact) but it 
> doesn't help.
> 
> So, how to stop this growth of data on monitors?
> 

What is the status of Ceph? Can you post the output of:

$ ceph -s

MONs do not trim their database if one or more PGs aren't active+clean.

Wido

> -- 
> Regards,
> Aleksei Zakharov
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Mons are using a lot of disk space and has a lot of old osd maps

2018-10-08 Thread Aleksei Zakharov
Hi all,

We've upgraded our cluster from Jewel to Luminous and re-created the monitors
using RocksDB.
Now we see that the mons are using a lot of disk space and the used space only
grows. It is about 17GB for now. It was ~13GB when we used LevelDB and the
Jewel release.

When we added new OSDs we saw that they download a lot of data from the
monitors. It was ~15GiB a few days ago and it is ~18GiB today.
One of the OSDs we created uses filestore, and it looks like old osdmaps are
not removed:

~# find /var/lib/ceph/osd/ceph-224/current/meta/ | wc -l
73590

I've tried to run manual compaction (ceph tell mon.NUM compact) but it doesn't 
help.

So, how to stop this growth of data on monitors?
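
If it helps, I can also check how many osdmap epochs the mons are still
holding, I guess with something like:

  ceph report 2>/dev/null | grep -E 'osdmap_(first|last)_committed'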

-- 
Regards,
Aleksei Zakharov

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd ls operation not permitted

2018-10-08 Thread Jason Dillaman
On Mon, Oct 8, 2018 at 10:20 AM  wrote:
>
> On a Ceph Monitor:
> # ceph auth get client.openstack | grep caps
> exported keyring for client.openstack
> caps mon = "allow r"
> caps osd = "allow class-read object_prefix rbd_children, allow rwx
> pool=ssdvolumes, allow rxw pool=ssdvolumes-13, allow rwx
> pool=sasvolumes-13, allow rwx pool=sasvolumes, allow rwx pool=vms, allow
> rwx pool=images"
> #

By chance, is your issue really that your OpenStack 13 cluster cannot
access the pool named "ssdvolumes-13"? I ask because you have a typo
on your "rwx" cap (you have "rxw" instead).

>
> On the problematic Openstack cluster:
> $ ceph auth get client.openstack --id openstack | grep caps
> Error EACCES: access denied
> $
>
>
> When I change "caps: [mon] allow r" to "caps: [mon] allow *" the problem
> disappears.
>
>
> On 08-10-2018 16:06, Jason Dillaman wrote:
> > Can you run "ceph auth get client.openstack | grep caps"?
> >
> > On Mon, Oct 8, 2018 at 10:03 AM  wrote:
> >>
> >> The result of your command:
> >>
> >> $ rbd ls --debug-rbd=20 -p ssdvolumes --id openstack
> >> 2018-10-08 13:42:17.386505 7f604933fd40 20 librbd: list 0x7fff5b25cc30
> >> rbd: list: (1) Operation not permitted
> >> $
> >>
> >> Thanks!
> >> Sinan
> >>
> >> On 08-10-2018 15:37, Jason Dillaman wrote:
> >> > On Mon, Oct 8, 2018 at 9:24 AM  wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I am running a Ceph cluster (Jewel, ceph version 10.2.10-17.el7cp).
> >> >>
> >> >>
> >> >> I also have 2 OpenStack clusters (Ocata (v12) and Pike (v13)).
> >> >>
> >> >> When I perform a "rbd ls -p  --id openstack" on the OpenStack
> >> >> Ocata cluster it works fine, when I perform the same command on the
> >> >> OpenStack Pike cluster I am getting an "operation not permitted".
> >> >>
> >> >>
> >> >> OpenStack Ocata (where it does work fine):
> >> >> $ rbd -v
> >> >> ceph version 10.2.7-48.el7cp
> >> >> (cf7751bcd460c757e596d3ee2991884e13c37b96)
> >> >> $ rpm -qa | grep rbd
> >> >> python-rbd-10.2.7-48.el7cp.x86_64
> >> >> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.6.x86_64
> >> >> librbd1-10.2.7-48.el7cp.x86_64
> >> >> rbd-mirror-10.2.7-48.el7cp.x86_64
> >> >> $
> >> >>
> >> >> OpenStack Pike (where it doesn't work, operation not permitted):
> >> >> $ rbd -v
> >> >> ceph version 12.2.4-10.el7cp
> >> >> (03fd19535b3701f3322c68b5f424335d6fc8dd66)
> >> >> luminous (stable)
> >> >> $ rpm -qa | grep rbd
> >> >> rbd-mirror-12.2.4-10.el7cp.x86_64
> >> >> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.5.x86_64
> >> >> librbd1-12.2.4-10.el7cp.x86_64
> >> >> python-rbd-12.2.4-10.el7cp.x86_64
> >> >> $
> >> >
> >> > Can you run "rbd --debug-rbd=20 ls -p  --id openstack" and
> >> > pastebin the resulting logs?
> >> >
> >> >>
> >> >> Both clusters are using the same Ceph client key, same Ceph
> >> >> configuration file.
> >> >>
> >> >> The only difference is the version of rbd.
> >> >>
> >> >> Is this expected behavior?
> >> >>
> >> >>
> >> >> Thanks!
> >> >> Sinan
> >> >> ___
> >> >> ceph-users mailing list
> >> >> ceph-users@lists.ceph.com
> >> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Yan, Zheng
On Mon, Oct 8, 2018 at 9:46 PM Alfredo Daniel Rezinovsky
 wrote:
>
>
>
> On 08/10/18 10:20, Yan, Zheng wrote:
> > On Mon, Oct 8, 2018 at 9:07 PM Alfredo Daniel Rezinovsky
> >  wrote:
> >>
> >>
> >> On 08/10/18 09:45, Yan, Zheng wrote:
> >>> On Mon, Oct 8, 2018 at 6:40 PM Alfredo Daniel Rezinovsky
> >>>  wrote:
>  On 08/10/18 07:06, Yan, Zheng wrote:
> > On Mon, Oct 8, 2018 at 5:43 PM Sergey Malinin  wrote:
> >>> On 8.10.2018, at 12:37, Yan, Zheng  wrote:
> >>>
> >>> On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin  
> >>> wrote:
>  What additional steps need to be taken in order to (try to) regain 
>  access to the fs providing that I backed up metadata pool, created 
>  alternate metadata pool and ran scan_extents, scan_links, 
>  scan_inodes, and somewhat recursive scrub.
>  After that I only mounted the fs read-only to backup the data.
>  Would anything even work if I had mds journal and purge queue 
>  truncated?
> 
> >>> did you backed up whole metadata pool?  did you make any modification
> >>> to the original metadata pool? If you did, what modifications?
> >> I backed up both journal and purge queue and used cephfs-journal-tool 
> >> to recover dentries, then reset journal and purge queue on original 
> >> metadata pool.
> > You can try restoring original journal and purge queue, then downgrade
> > mds to 13.2.1.   Journal objects names are 20x., purge queue
> > objects names are 50x.x.
>  I'm already done a scan_extents and doing a scan_inodes, Do i need to
>  finish with the scan_links?
> 
>  I'm with 13.2.2. DO I finish the scan_links and then downgrade?
> 
>  I have a backup done with "cephfs-journal-tool journal export
>  backup.bin". I think I don't have the pugue queue
> 
>  can I reset the purgue-queue journal?, Can I import an empty file
> 
> >>> It's better to restore journal to original metadata pool and reset
> >>> purge queue to empty, then try starting mds. Reset the purge queue
> >>> will leave some objects in orphan states. But we can handle them
> >>> later.
> >>>
> >>> Regards
> >>> Yan, Zheng
> >> Let's see...
> >>
> >> "cephfs-journal-tool journal import  backup.bin" will restore the whole
> >> metadata ?
> >> That's what "journal" means?
> >>
> > It just restores the journal. If you only reset original fs' journal
> > and purge queue (run scan_foo commands with alternate metadata pool).
> > It's highly likely restoring the journal will bring your fs back.
> >
> >
> >
> >> So I can stopt  cephfs-data-scan, run the import, downgrade, and then
> >> reset the purge queue?
> >>
> > you said you have already run scan_extents and scan_inodes. what
> > cephfs-data-scan command is running?
> Already ran (without alternate metadata)
>
> time cephfs-data-scan scan_extents cephfs_data # 10 hours
>
> time cephfs-data-scan scan_inodes cephfs_data # running 3 hours
> with a warning:
> 7fddd8f64ec0 -1 datascan.inject_with_backtrace: Dentry
> 0x0x1db852b/dovecot.index already exists but points to 0x0x1000134f97f
>
> Still not run:
>
> time cephfs-data-scan scan_links
>

you have modified metatdata pool. I suggest you to run scan_links.
After it finishes, reset session table and try restarting mds. If mds
start successfully, run 'ceph daemon mds.x scrub_path / recursive
repair'. (don't let client mount before it finishes)

Good luck
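
(To spell out the session table reset, I mean something like:

  cephfs-table-tool all reset session

adjust the rank spec if you only want to touch a single rank.)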




>
> > After 'import original journal'.  run 'ceph mds repaired
> > fs_name:damaged_rank', then try restarting mds. Check if mds can
> > start.
> >
> >> Please remember me the commands:
> >> I've been 3 days without sleep, and I don't wanna to broke it more.
> >>
> > sorry for that.
> I updated on friday, broke a golden rule: "READ ONLY FRIDAY". My fault.
> >> Thanks
> >>
> >>
> >>
>  What do I do with the journals?
> 
> >> Before proceeding to alternate metadata pool recovery I was able to 
> >> start MDS but it soon failed throwing lots of 'loaded dup inode' 
> >> errors, not sure if that involved changing anything in the pool.
> >> I have left the original metadata pool untouched sine then.
> >>
> >>
> >>> Yan, Zheng
> >>>
> > On 8.10.2018, at 05:15, Yan, Zheng  wrote:
> >
> > Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and
> > marking mds repaird can resolve this.
> >
> > Yan, Zheng
> > On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin  
> > wrote:
> >> Update:
> >> I discovered http://tracker.ceph.com/issues/24236 and 
> >> https://github.com/ceph/ceph/pull/22146
> >> Make sure that it is not relevant in your case before proceeding 
> >> to operations that modify on-disk data.
> >>
> >>
> >> On 6.10.2018, at 03:17, Sergey Malinin  wrote:
> >>
> >> I ended up rescanning the entire fs using 

[ceph-users] rados gateway http compression

2018-10-08 Thread Jin Mao
I'd like to compare the performance of storing compressed data and
decompressing at the client vs. storing uncompressed data directly. However, a
series of tests with and without the "Accept-Encoding: gzip" header using curl
(hitting the same rgw server) does not seem to show any difference.
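
Roughly, each test looks like this, comparing the downloaded size with and
without the header (host, bucket and object names are placeholders):

  curl -s -o /dev/null -w '%{size_download}\n' http://rgw.example.com/mybucket/mykey
  curl -s -o /dev/null -w '%{size_download}\n' -H 'Accept-Encoding: gzip' http://rgw.example.com/mybucket/mykey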

The only compression-related doc is about compression at the data store, not
at the HTTP layer. Can anyone say whether the gateway's HTTP frontend supports
compression?

Thank you.

Jin.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd ls operation not permitted

2018-10-08 Thread sinan

On a Ceph Monitor:
# ceph auth get client.openstack | grep caps
exported keyring for client.openstack
caps mon = "allow r"
	caps osd = "allow class-read object_prefix rbd_children, allow rwx 
pool=ssdvolumes, allow rxw pool=ssdvolumes-13, allow rwx 
pool=sasvolumes-13, allow rwx pool=sasvolumes, allow rwx pool=vms, allow 
rwx pool=images"

#


On the problematic Openstack cluster:
$ ceph auth get client.openstack --id openstack | grep caps
Error EACCES: access denied
$


When I change "caps: [mon] allow r" to "caps: [mon] allow *" the problem 
disappears.



On 08-10-2018 16:06, Jason Dillaman wrote:

Can you run "ceph auth get client.openstack | grep caps"?

On Mon, Oct 8, 2018 at 10:03 AM  wrote:


The result of your command:

$ rbd ls --debug-rbd=20 -p ssdvolumes --id openstack
2018-10-08 13:42:17.386505 7f604933fd40 20 librbd: list 0x7fff5b25cc30
rbd: list: (1) Operation not permitted
$

Thanks!
Sinan

On 08-10-2018 15:37, Jason Dillaman wrote:
> On Mon, Oct 8, 2018 at 9:24 AM  wrote:
>>
>> Hi,
>>
>> I am running a Ceph cluster (Jewel, ceph version 10.2.10-17.el7cp).
>>
>>
>> I also have 2 OpenStack clusters (Ocata (v12) and Pike (v13)).
>>
>> When I perform a "rbd ls -p  --id openstack" on the OpenStack
>> Ocata cluster it works fine, when I perform the same command on the
>> OpenStack Pike cluster I am getting an "operation not permitted".
>>
>>
>> OpenStack Ocata (where it does work fine):
>> $ rbd -v
>> ceph version 10.2.7-48.el7cp
>> (cf7751bcd460c757e596d3ee2991884e13c37b96)
>> $ rpm -qa | grep rbd
>> python-rbd-10.2.7-48.el7cp.x86_64
>> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.6.x86_64
>> librbd1-10.2.7-48.el7cp.x86_64
>> rbd-mirror-10.2.7-48.el7cp.x86_64
>> $
>>
>> OpenStack Pike (where it doesn't work, operation not permitted):
>> $ rbd -v
>> ceph version 12.2.4-10.el7cp
>> (03fd19535b3701f3322c68b5f424335d6fc8dd66)
>> luminous (stable)
>> $ rpm -qa | grep rbd
>> rbd-mirror-12.2.4-10.el7cp.x86_64
>> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.5.x86_64
>> librbd1-12.2.4-10.el7cp.x86_64
>> python-rbd-12.2.4-10.el7cp.x86_64
>> $
>
> Can you run "rbd --debug-rbd=20 ls -p  --id openstack" and
> pastebin the resulting logs?
>
>>
>> Both clusters are using the same Ceph client key, same Ceph
>> configuration file.
>>
>> The only difference is the version of rbd.
>>
>> Is this expected behavior?
>>
>>
>> Thanks!
>> Sinan
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd ls operation not permitted

2018-10-08 Thread Jason Dillaman
Can you run "ceph auth get client.openstack | grep caps"?

On Mon, Oct 8, 2018 at 10:03 AM  wrote:
>
> The result of your command:
>
> $ rbd ls --debug-rbd=20 -p ssdvolumes --id openstack
> 2018-10-08 13:42:17.386505 7f604933fd40 20 librbd: list 0x7fff5b25cc30
> rbd: list: (1) Operation not permitted
> $
>
> Thanks!
> Sinan
>
> On 08-10-2018 15:37, Jason Dillaman wrote:
> > On Mon, Oct 8, 2018 at 9:24 AM  wrote:
> >>
> >> Hi,
> >>
> >> I am running a Ceph cluster (Jewel, ceph version 10.2.10-17.el7cp).
> >>
> >>
> >> I also have 2 OpenStack clusters (Ocata (v12) and Pike (v13)).
> >>
> >> When I perform a "rbd ls -p  --id openstack" on the OpenStack
> >> Ocata cluster it works fine, when I perform the same command on the
> >> OpenStack Pike cluster I am getting an "operation not permitted".
> >>
> >>
> >> OpenStack Ocata (where it does work fine):
> >> $ rbd -v
> >> ceph version 10.2.7-48.el7cp
> >> (cf7751bcd460c757e596d3ee2991884e13c37b96)
> >> $ rpm -qa | grep rbd
> >> python-rbd-10.2.7-48.el7cp.x86_64
> >> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.6.x86_64
> >> librbd1-10.2.7-48.el7cp.x86_64
> >> rbd-mirror-10.2.7-48.el7cp.x86_64
> >> $
> >>
> >> OpenStack Pike (where it doesn't work, operation not permitted):
> >> $ rbd -v
> >> ceph version 12.2.4-10.el7cp
> >> (03fd19535b3701f3322c68b5f424335d6fc8dd66)
> >> luminous (stable)
> >> $ rpm -qa | grep rbd
> >> rbd-mirror-12.2.4-10.el7cp.x86_64
> >> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.5.x86_64
> >> librbd1-12.2.4-10.el7cp.x86_64
> >> python-rbd-12.2.4-10.el7cp.x86_64
> >> $
> >
> > Can you run "rbd --debug-rbd=20 ls -p  --id openstack" and
> > pastebin the resulting logs?
> >
> >>
> >> Both clusters are using the same Ceph client key, same Ceph
> >> configuration file.
> >>
> >> The only difference is the version of rbd.
> >>
> >> Is this expected behavior?
> >>
> >>
> >> Thanks!
> >> Sinan
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Alfredo Daniel Rezinovsky




On 08/10/18 10:20, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 9:07 PM Alfredo Daniel Rezinovsky
 wrote:



On 08/10/18 09:45, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 6:40 PM Alfredo Daniel Rezinovsky
 wrote:

On 08/10/18 07:06, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 5:43 PM Sergey Malinin  wrote:

On 8.10.2018, at 12:37, Yan, Zheng  wrote:

On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin  wrote:

What additional steps need to be taken in order to (try to) regain access to 
the fs providing that I backed up metadata pool, created alternate metadata 
pool and ran scan_extents, scan_links, scan_inodes, and somewhat recursive 
scrub.
After that I only mounted the fs read-only to backup the data.
Would anything even work if I had mds journal and purge queue truncated?


did you backed up whole metadata pool?  did you make any modification
to the original metadata pool? If you did, what modifications?

I backed up both journal and purge queue and used cephfs-journal-tool to 
recover dentries, then reset journal and purge queue on original metadata pool.

You can try restoring original journal and purge queue, then downgrade
mds to 13.2.1.   Journal objects names are 20x., purge queue
objects names are 50x.x.

I have already done a scan_extents and am doing a scan_inodes. Do I need to
finish with the scan_links?

I'm on 13.2.2. Do I finish the scan_links and then downgrade?

I have a backup done with "cephfs-journal-tool journal export
backup.bin". I think I don't have the purge queue.

Can I reset the purge queue journal? Can I import an empty file?


It's better to restore journal to original metadata pool and reset
purge queue to empty, then try starting mds. Reset the purge queue
will leave some objects in orphan states. But we can handle them
later.

Regards
Yan, Zheng

Let's see...

"cephfs-journal-tool journal import  backup.bin" will restore the whole
metadata ?
That's what "journal" means?


It just restores the journal. If you only reset original fs' journal
and purge queue (run scan_foo commands with alternate metadata pool).
It's highly likely restoring the journal will bring your fs back.




So I can stop cephfs-data-scan, run the import, downgrade, and then
reset the purge queue?


you said you have already run scan_extents and scan_inodes. what
cephfs-data-scan command is running?

Already ran (without alternate metadata)

time cephfs-data-scan scan_extents cephfs_data # 10 hours

time cephfs-data-scan scan_inodes cephfs_data # running 3 hours
with a warning:
7fddd8f64ec0 -1 datascan.inject_with_backtrace: Dentry 
0x0x1db852b/dovecot.index already exists but points to 0x0x1000134f97f


Still not run:

time cephfs-data-scan scan_links



After importing the original journal, run 'ceph mds repaired
fs_name:damaged_rank', then try restarting the mds. Check if the mds can
start.


Please remind me of the commands:
I've been 3 days without sleep, and I don't want to break it further.


sorry for that.

I updated on friday, broke a golden rule: "READ ONLY FRIDAY". My fault.

Thanks




What do I do with the journals?


Before proceeding to alternate metadata pool recovery I was able to start MDS 
but it soon failed throwing lots of 'loaded dup inode' errors, not sure if that 
involved changing anything in the pool.
I have left the original metadata pool untouched sine then.



Yan, Zheng


On 8.10.2018, at 05:15, Yan, Zheng  wrote:

Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and
marking mds repaird can resolve this.

Yan, Zheng
On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin  wrote:

Update:
I discovered http://tracker.ceph.com/issues/24236 and 
https://github.com/ceph/ceph/pull/22146
Make sure that it is not relevant in your case before proceeding to operations 
that modify on-disk data.


On 6.10.2018, at 03:17, Sergey Malinin  wrote:

I ended up rescanning the entire fs using alternate metadata pool approach as 
in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
The process has not completed yet because during the recovery our cluster
encountered another problem with OSDs that I got fixed yesterday (thanks to 
Igor Fedotov @ SUSE).
The first stage (scan_extents) completed in 84 hours (120M objects in data pool 
on 8 hdd OSDs on 4 hosts). The second (scan_inodes) was interrupted by OSDs 
failure so I have no timing stats but it seems to be runing 2-3 times faster 
than extents scan.
As to root cause -- in my case I recall that during upgrade I had forgotten to 
restart 3 OSDs, one of which was holding metadata pool contents, before 
restarting MDS daemons and that seemed to had an impact on MDS journal 
corruption, because when I restarted those OSDs, MDS was able to start up but 
soon failed throwing lots of 'loaded dup inode' errors.


On 6.10.2018, at 00:41, Alfredo Daniel Rezinovsky  wrote:

Same problem...

# cephfs-journal-tool --journal=purge_queue journal inspect
2018-10-05 18:37:10.704 7f01f60a9bc0 -1 Missing object 500.016c
Overall journal integrity:

Re: [ceph-users] rbd ls operation not permitted

2018-10-08 Thread sinan

The result of your command:

$ rbd ls --debug-rbd=20 -p ssdvolumes --id openstack
2018-10-08 13:42:17.386505 7f604933fd40 20 librbd: list 0x7fff5b25cc30
rbd: list: (1) Operation not permitted
$

Thanks!
Sinan

On 08-10-2018 15:37, Jason Dillaman wrote:

On Mon, Oct 8, 2018 at 9:24 AM  wrote:


Hi,

I am running a Ceph cluster (Jewel, ceph version 10.2.10-17.el7cp).


I also have 2 OpenStack clusters (Ocata (v12) and Pike (v13)).

When I perform a "rbd ls -p  --id openstack" on the OpenStack
Ocata cluster it works fine, when I perform the same command on the
OpenStack Pike cluster I am getting an "operation not permitted".


OpenStack Ocata (where it does work fine):
$ rbd -v
ceph version 10.2.7-48.el7cp 
(cf7751bcd460c757e596d3ee2991884e13c37b96)

$ rpm -qa | grep rbd
python-rbd-10.2.7-48.el7cp.x86_64
libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.6.x86_64
librbd1-10.2.7-48.el7cp.x86_64
rbd-mirror-10.2.7-48.el7cp.x86_64
$

OpenStack Pike (where it doesn't work, operation not permitted):
$ rbd -v
ceph version 12.2.4-10.el7cp 
(03fd19535b3701f3322c68b5f424335d6fc8dd66)

luminous (stable)
$ rpm -qa | grep rbd
rbd-mirror-12.2.4-10.el7cp.x86_64
libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.5.x86_64
librbd1-12.2.4-10.el7cp.x86_64
python-rbd-12.2.4-10.el7cp.x86_64
$


Can you run "rbd --debug-rbd=20 ls -p  --id openstack" and
pastebin the resulting logs?



Both clusters are using the same Ceph client key, same Ceph
configuration file.

The only difference is the version of rbd.

Is this expected behavior?


Thanks!
Sinan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd ls operation not permitted

2018-10-08 Thread Jason Dillaman
On Mon, Oct 8, 2018 at 9:24 AM  wrote:
>
> Hi,
>
> I am running a Ceph cluster (Jewel, ceph version 10.2.10-17.el7cp).
>
>
> I also have 2 OpenStack clusters (Ocata (v12) and Pike (v13)).
>
> When I perform a "rbd ls -p  --id openstack" on the OpenStack
> Ocata cluster it works fine, when I perform the same command on the
> OpenStack Pike cluster I am getting an "operation not permitted".
>
>
> OpenStack Ocata (where it does work fine):
> $ rbd -v
> ceph version 10.2.7-48.el7cp (cf7751bcd460c757e596d3ee2991884e13c37b96)
> $ rpm -qa | grep rbd
> python-rbd-10.2.7-48.el7cp.x86_64
> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.6.x86_64
> librbd1-10.2.7-48.el7cp.x86_64
> rbd-mirror-10.2.7-48.el7cp.x86_64
> $
>
> OpenStack Pike (where it doesn't work, operation not permitted):
> $ rbd -v
> ceph version 12.2.4-10.el7cp (03fd19535b3701f3322c68b5f424335d6fc8dd66)
> luminous (stable)
> $ rpm -qa | grep rbd
> rbd-mirror-12.2.4-10.el7cp.x86_64
> libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.5.x86_64
> librbd1-12.2.4-10.el7cp.x86_64
> python-rbd-12.2.4-10.el7cp.x86_64
> $

Can you run "rbd --debug-rbd=20 ls -p  --id openstack" and
pastebin the resulting logs?

>
> Both clusters are using the same Ceph client key, same Ceph
> configuration file.
>
> The only difference is the version of rbd.
>
> Is this expected behavior?
>
>
> Thanks!
> Sinan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Alfredo Daniel Rezinovsky



On 08/10/18 10:32, Sergey Malinin wrote:


On 8.10.2018, at 16:07, Alfredo Daniel Rezinovsky 
mailto:alfrenov...@gmail.com>> wrote:


So I can stopt  cephfs-data-scan, run the import, downgrade, and then 
reset the purge queue?


I suggest that you backup metadata pool so that in case of failure you 
can continue with data scan from where you stopped.
I've read somewhere that backup must be done using rados export rather 
that cppool in order to keep omapkeys.



Too late for that. I tried with cppool and it failed...



Please remember me the commands:
I've been 3 days without sleep, and I don't wanna to broke it more.


Lucky man, I've been struggling with it for almost 2 weeks.



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Sergey Malinin

> On 8.10.2018, at 16:07, Alfredo Daniel Rezinovsky  
> wrote:
> 
> So I can stopt  cephfs-data-scan, run the import, downgrade, and then reset 
> the purge queue?

I suggest that you back up the metadata pool so that in case of failure you can 
continue with the data scan from where you stopped.
I've read somewhere that the backup must be done using rados export rather than 
cppool in order to keep omap keys.
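
In other words, something along these lines (pool and file names are examples;
check the exact rados export syntax on your version before relying on it):

# rados -p cephfs_metadata export /backup/cephfs_metadata.export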

> 
> Please remember me the commands:
> I've been 3 days without sleep, and I don't wanna to broke it more.

Lucky man, I've been struggling with it for almost 2 weeks.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rbd ls operation not permitted

2018-10-08 Thread sinan

Hi,

I am running a Ceph cluster (Jewel, ceph version 10.2.10-17.el7cp).


I also have 2 OpenStack clusters (Ocata (v12) and Pike (v13)).

When I perform a "rbd ls -p  --id openstack" on the OpenStack 
Ocata cluster it works fine, when I perform the same command on the 
OpenStack Pike cluster I am getting an "operation not permitted".



OpenStack Ocata (where it does work fine):
$ rbd -v
ceph version 10.2.7-48.el7cp (cf7751bcd460c757e596d3ee2991884e13c37b96)
$ rpm -qa | grep rbd
python-rbd-10.2.7-48.el7cp.x86_64
libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.6.x86_64
librbd1-10.2.7-48.el7cp.x86_64
rbd-mirror-10.2.7-48.el7cp.x86_64
$

OpenStack Pike (where it doesn't work, operation not permitted):
$ rbd -v
ceph version 12.2.4-10.el7cp (03fd19535b3701f3322c68b5f424335d6fc8dd66) 
luminous (stable)

$ rpm -qa | grep rbd
rbd-mirror-12.2.4-10.el7cp.x86_64
libvirt-daemon-driver-storage-rbd-3.9.0-14.el7_5.5.x86_64
librbd1-12.2.4-10.el7cp.x86_64
python-rbd-12.2.4-10.el7cp.x86_64
$


Both clusters are using the same Ceph client key, same Ceph 
configuration file.


The only difference is the version of rbd.

Is this expected behavior?


Thanks!
Sinan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Yan, Zheng
On Mon, Oct 8, 2018 at 9:07 PM Alfredo Daniel Rezinovsky
 wrote:
>
>
>
> On 08/10/18 09:45, Yan, Zheng wrote:
> > On Mon, Oct 8, 2018 at 6:40 PM Alfredo Daniel Rezinovsky
> >  wrote:
> >> On 08/10/18 07:06, Yan, Zheng wrote:
> >>> On Mon, Oct 8, 2018 at 5:43 PM Sergey Malinin  wrote:
> 
> > On 8.10.2018, at 12:37, Yan, Zheng  wrote:
> >
> > On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin  wrote:
> >> What additional steps need to be taken in order to (try to) regain 
> >> access to the fs providing that I backed up metadata pool, created 
> >> alternate metadata pool and ran scan_extents, scan_links, scan_inodes, 
> >> and somewhat recursive scrub.
> >> After that I only mounted the fs read-only to backup the data.
> >> Would anything even work if I had mds journal and purge queue 
> >> truncated?
> >>
> > did you backed up whole metadata pool?  did you make any modification
> > to the original metadata pool? If you did, what modifications?
>  I backed up both journal and purge queue and used cephfs-journal-tool to 
>  recover dentries, then reset journal and purge queue on original 
>  metadata pool.
> >>> You can try restoring original journal and purge queue, then downgrade
> >>> mds to 13.2.1.   Journal objects names are 20x., purge queue
> >>> objects names are 50x.x.
> >> I'm already done a scan_extents and doing a scan_inodes, Do i need to
> >> finish with the scan_links?
> >>
> >> I'm with 13.2.2. DO I finish the scan_links and then downgrade?
> >>
> >> I have a backup done with "cephfs-journal-tool journal export
> >> backup.bin". I think I don't have the pugue queue
> >>
> >> can I reset the purgue-queue journal?, Can I import an empty file
> >>
> > It's better to restore journal to original metadata pool and reset
> > purge queue to empty, then try starting mds. Reset the purge queue
> > will leave some objects in orphan states. But we can handle them
> > later.
> >
> > Regards
> > Yan, Zheng
>
> Let's see...
>
> "cephfs-journal-tool journal import  backup.bin" will restore the whole
> metadata ?
> That's what "journal" means?
>

It just restores the journal. If you only reset the original fs's journal
and purge queue (and ran the scan_foo commands against an alternate metadata
pool), it's highly likely that restoring the journal will bring your fs back.



> So I can stopt  cephfs-data-scan, run the import, downgrade, and then
> reset the purge queue?
>
You said you have already run scan_extents and scan_inodes. Which
cephfs-data-scan command is running now?

After importing the original journal, run 'ceph mds repaired
fs_name:damaged_rank', then try restarting the mds and check whether it can
start.
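
Spelled out with the tools already used in this thread, the sequence would look
roughly like this (fs name, rank and unit name are placeholders; add
--rank=<fs_name>:0 if your cephfs-journal-tool build asks for it):

# downgrade the MDS packages to 13.2.1 first, then:
# cephfs-journal-tool journal import backup.bin
# cephfs-journal-tool --journal=purge_queue journal reset
# ceph mds repaired <fs_name>:0
# systemctl restart ceph-mds@<host>    # then watch 'ceph -w' and the MDS log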

> Please remember me the commands:
> I've been 3 days without sleep, and I don't wanna to broke it more.
>

sorry for that.

> Thanks
>
>
>
> >> What do I do with the journals?
> >>
>  Before proceeding to alternate metadata pool recovery I was able to 
>  start MDS but it soon failed throwing lots of 'loaded dup inode' errors, 
>  not sure if that involved changing anything in the pool.
>  I have left the original metadata pool untouched sine then.
> 
> 
> > Yan, Zheng
> >
> >>> On 8.10.2018, at 05:15, Yan, Zheng  wrote:
> >>>
> >>> Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and
> >>> marking mds repaird can resolve this.
> >>>
> >>> Yan, Zheng
> >>> On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin  
> >>> wrote:
>  Update:
>  I discovered http://tracker.ceph.com/issues/24236 and 
>  https://github.com/ceph/ceph/pull/22146
>  Make sure that it is not relevant in your case before proceeding to 
>  operations that modify on-disk data.
> 
> 
>  On 6.10.2018, at 03:17, Sergey Malinin  wrote:
> 
>  I ended up rescanning the entire fs using alternate metadata pool 
>  approach as in 
>  http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
>  The process has not competed yet because during the recovery our 
>  cluster encountered another problem with OSDs that I got fixed 
>  yesterday (thanks to Igor Fedotov @ SUSE).
>  The first stage (scan_extents) completed in 84 hours (120M objects 
>  in data pool on 8 hdd OSDs on 4 hosts). The second (scan_inodes) was 
>  interrupted by OSDs failure so I have no timing stats but it seems 
>  to be runing 2-3 times faster than extents scan.
>  As to root cause -- in my case I recall that during upgrade I had 
>  forgotten to restart 3 OSDs, one of which was holding metadata pool 
>  contents, before restarting MDS daemons and that seemed to had an 
>  impact on MDS journal corruption, because when I restarted those 
>  OSDs, MDS was able to start up but soon failed throwing lots of 
>  'loaded dup inode' errors.
> 
> 
>  

Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Alfredo Daniel Rezinovsky



On 08/10/18 09:45, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 6:40 PM Alfredo Daniel Rezinovsky
 wrote:

On 08/10/18 07:06, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 5:43 PM Sergey Malinin  wrote:



On 8.10.2018, at 12:37, Yan, Zheng  wrote:

On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin  wrote:

What additional steps need to be taken in order to (try to) regain access to 
the fs providing that I backed up metadata pool, created alternate metadata 
pool and ran scan_extents, scan_links, scan_inodes, and somewhat recursive 
scrub.
After that I only mounted the fs read-only to backup the data.
Would anything even work if I had mds journal and purge queue truncated?


did you backed up whole metadata pool?  did you make any modification
to the original metadata pool? If you did, what modifications?

I backed up both journal and purge queue and used cephfs-journal-tool to 
recover dentries, then reset journal and purge queue on original metadata pool.

You can try restoring original journal and purge queue, then downgrade
mds to 13.2.1.   Journal objects names are 20x., purge queue
objects names are 50x.x.

I've already done a scan_extents and am doing a scan_inodes. Do I need to
finish with the scan_links?

I'm on 13.2.2. Do I finish the scan_links and then downgrade?

I have a backup done with "cephfs-journal-tool journal export
backup.bin". I think I don't have the purge queue.

Can I reset the purge-queue journal? Can I import an empty file?


It's better to restore journal to original metadata pool and reset
purge queue to empty, then try starting mds. Reset the purge queue
will leave some objects in orphan states. But we can handle them
later.

Regards
Yan, Zheng


Let's see...

"cephfs-journal-tool journal import  backup.bin" will restore the whole 
metadata ?

That's what "journal" means?

So I can stop cephfs-data-scan, run the import, downgrade, and then 
reset the purge queue?


Please remind me of the commands:
I've been 3 days without sleep, and I don't want to break it further.

Thanks




What do I do with the journals?


Before proceeding to alternate metadata pool recovery I was able to start MDS 
but it soon failed throwing lots of 'loaded dup inode' errors, not sure if that 
involved changing anything in the pool.
I have left the original metadata pool untouched sine then.



Yan, Zheng


On 8.10.2018, at 05:15, Yan, Zheng  wrote:

Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and
marking mds repaird can resolve this.

Yan, Zheng
On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin  wrote:

Update:
I discovered http://tracker.ceph.com/issues/24236 and 
https://github.com/ceph/ceph/pull/22146
Make sure that it is not relevant in your case before proceeding to operations 
that modify on-disk data.


On 6.10.2018, at 03:17, Sergey Malinin  wrote:

I ended up rescanning the entire fs using alternate metadata pool approach as 
in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
The process has not competed yet because during the recovery our cluster 
encountered another problem with OSDs that I got fixed yesterday (thanks to 
Igor Fedotov @ SUSE).
The first stage (scan_extents) completed in 84 hours (120M objects in data pool 
on 8 hdd OSDs on 4 hosts). The second (scan_inodes) was interrupted by OSDs 
failure so I have no timing stats but it seems to be runing 2-3 times faster 
than extents scan.
As to root cause -- in my case I recall that during upgrade I had forgotten to 
restart 3 OSDs, one of which was holding metadata pool contents, before 
restarting MDS daemons and that seemed to had an impact on MDS journal 
corruption, because when I restarted those OSDs, MDS was able to start up but 
soon failed throwing lots of 'loaded dup inode' errors.


On 6.10.2018, at 00:41, Alfredo Daniel Rezinovsky  wrote:

Same problem...

# cephfs-journal-tool --journal=purge_queue journal inspect
2018-10-05 18:37:10.704 7f01f60a9bc0 -1 Missing object 500.016c
Overall journal integrity: DAMAGED
Objects missing:
0x16c
Corrupt regions:
0x5b00-

Just after upgrade to 13.2.2

Did you fixed it?


On 26/09/18 13:05, Sergey Malinin wrote:

Hello,
Followed standard upgrade procedure to upgrade from 13.2.1 to 13.2.2.
After upgrade MDS cluster is down, mds rank 0 and purge_queue journal are 
damaged. Resetting purge_queue does not seem to work well as journal still 
appears to be damaged.
Can anybody help?

mds log:

-789> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.mds2 Updating MDS map to 
version 586 from mon.2
-788> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map i am now 
mds.0.583
-787> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map state 
change up:rejoin --> up:active
-786> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 recovery_done -- 
successful recovery!

   -38> 2018-09-26 18:42:32.707 7f70f28a7700 -1 mds.0.purge_queue _consume: 
Decode error at read_pos=0x322ec6636
   -37> 2018-09-26 18:42:32.707 7f70f28a770

Re: [ceph-users] Ceph version upgrade with Juju

2018-10-08 Thread James Page
Hi Fabio

On Thu, 4 Oct 2018 at 23:02 Fabio Abreu  wrote:

> Hi Cephers,
>
> I have  a little  doubt about the migration of Jewel version in the MAAS /
> JUJU implementation scenario .
>
> Could someone has the same experience in production environment?
>

> I am asking this because we mapping all challenges of this scenario.
>

The ceph-mon and ceph-osd charms for Juju support upgrading between Ceph
releases; the upgrade is a rolling upgrade, orchestrated through keys
managed via the Ceph MON cluster itself.

The upgrade triggers are based around the Ubuntu Cloud Archive pockets that
also host Ceph (see [0]), so its possible to online upgrade using the
charms from:

  firefly -> hammer
  hammer -> jewel
  jewel -> luminous

we're working on the upgrade from luminous to mimic at the moment (but that
should not be a heavy lift so should land real-soon-now).

If you are currently deployed on Xenial (which shipped with Jewel), you can
upgrade directly to Luminous by re-configuring the charms:

  juju config ceph-mon source=cloud:xenial-pike

once that has completed (juju status output will tell you when):

  juju config ceph-osd source=cloud:xenial-pike

I'd recommend testing the upgrade in some sort of pre-production testing
environment first!
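
One way to keep an eye on it (the last command assumes the cluster is already on
Luminous or newer):

  juju status        # workload status/messages show the rolling upgrade progress
  ceph versions      # confirms every mon/osd daemon reports the new release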

HTH

James

(ceph-* charm contributor and user)

[0] https://github.com/openstack/charms.ceph/blob/master/ceph/utils.py#L2539
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs poor performance

2018-10-08 Thread Tomasz Płaza

On 08.10.2018 at 10:29, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 3:38 PM Tomasz Płaza  wrote:


On 08.10.2018 at 09:21, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 1:54 PM Tomasz Płaza  wrote:

Hi,

Can someone please help me, how do I improve performance on our CephFS
cluster?

System in use is: Centos 7.5 with ceph 12.2.7.
The hardware in use are as follows:
3xMON/MGR:
1xIntel(R) Xeon(R) Bronze 3106
16GB RAM
2xSSD for system
1Gbe NIC

2xMDS:
2xIntel(R) Xeon(R) Bronze 3106
64GB RAM
2xSSD for system
10Gbe NIC

6xOSD:
1xIntel(R) Xeon(R) Silver 4108
2xSSD for system
6xHGST HUS726060ALE610 SATA HDD's
1xINTEL SSDSC2BB150G7 for osd db`s (10G partitions) rest for OSD to
place cephfs_metadata
10Gbe NIC

pools (default crush rule aware of device class):
rbd with 1024 pg crush rule replicated_hdd
cephfs_data with 256 pg crush rule replicated_hdd
cephfs_metadata with 32 pg crush rule replicated_ssd

test done by fio: fio --randrepeat=1 --ioengine=libaio --direct=1
--gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k
--iodepth=64 --size=1G --readwrite=randrw --rwmixread=75


kernel version? maybe cephfs driver in your kernel does not support
AIO (--iodepth is 1 effectively)

Yan, Zheng

Kernel is 3.10.0-862.9.1.el7.x86_64 (I can update it to
3.10.0-862.14.4.el7), but I do not know how to check AIO support in the kernel
driver, or whether it is even relevant, because I mounted it with ceph-fuse -n
client.cephfs -k /etc/ceph/ceph.client.cephfs.keyring -m
192.168.10.1:6789 /mnt/cephfs

Tom Płaza


please try kernel mount.

The kernel mount did the trick; performance is now 3298/1101. Thank you.

Tom Płaza
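
For the archives, the kernel client mount corresponding to the ceph-fuse command
above looks roughly like this (the secretfile is assumed to contain just the
base64 key for client.cephfs):

# mount -t ceph 192.168.10.1:6789:/ /mnt/cephfs -o name=cephfs,secretfile=/etc/ceph/cephfs.secret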

shows IOPS write/read performance as follows:
rbd 3663/1223
cephfs (fuse) 205/68 (which is a little lower than the raw performance of one
hdd used in the cluster)

Everything is connected to one Cisco 10Gbe switch.
Please help.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com






___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] list admin issues

2018-10-08 Thread Elias Abacioglu
If it's attachments causing this, perhaps forbid attachments? Force people
to use pastebin / imgur type of services?

/E

On Mon, Oct 8, 2018 at 1:33 PM Martin Palma  wrote:

> Same here also on Gmail with G Suite.
> On Mon, Oct 8, 2018 at 12:31 AM Paul Emmerich 
> wrote:
> >
> > I'm also seeing this once every few months or so on Gmail with G Suite.
> >
> > Paul
> > Am So., 7. Okt. 2018 um 08:18 Uhr schrieb Joshua Chen
> > :
> > >
> > > I also got removed once, got another warning once (need to re-enable).
> > >
> > > Cheers
> > > Joshua
> > >
> > >
> > > On Sun, Oct 7, 2018 at 5:38 AM Svante Karlsson 
> wrote:
> > >>
> > >> I'm also getting removed but not only from ceph. I subscribe
> d...@kafka.apache.org list and the same thing happens there.
> > >>
> > >> Den lör 6 okt. 2018 kl 23:24 skrev Jeff Smith  >:
> > >>>
> > >>> I have been removed twice.
> > >>> On Sat, Oct 6, 2018 at 7:07 AM Elias Abacioglu
> > >>>  wrote:
> > >>> >
> > >>> > Hi,
> > >>> >
> > >>> > I'm bumping this old thread cause it's getting annoying. My
> membership get disabled twice a month.
> > >>> > Between my two Gmail accounts I'm in more than 25 mailing lists
> and I see this behavior only here. Why is only ceph-users only affected?
> Maybe Christian was on to something, is this intentional?
> > >>> > Reality is that there is a lot of ceph-users with Gmail accounts,
> perhaps it wouldn't be so bad to actually trying to figure this one out?
> > >>> >
> > >>> > So can the maintainers of this list please investigate what
> actually gets bounced? Look at my address if you want.
> > >>> > I got disabled 20181006, 20180927, 20180916, 20180725, 20180718
> most recently.
> > >>> > Please help!
> > >>> >
> > >>> > Thanks,
> > >>> > Elias
> > >>> >
> > >>> > On Mon, Oct 16, 2017 at 5:41 AM Christian Balzer 
> wrote:
> > >>> >>
> > >>> >>
> > >>> >> Most mails to this ML score low or negatively with SpamAssassin,
> however
> > >>> >> once in a while (this is a recent one) we get relatively high
> scores.
> > >>> >> Note that the forged bits are false positives, but the SA is up
> to date and
> > >>> >> google will have similar checks:
> > >>> >> ---
> > >>> >> X-Spam-Status: No, score=3.9 required=10.0
> tests=BAYES_00,DCC_CHECK,
> > >>> >>
> FORGED_MUA_MOZILLA,FORGED_YAHOO_RCVD,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
> > >>> >>
> HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MIME_HTML_MOSTLY,RCVD_IN_MSPIKE_H4,
> > >>> >>  RCVD_IN_MSPIKE_WL,RDNS_NONE,T_DKIM_INVALID shortcircuit=no
> autolearn=no
> > >>> >> ---
> > >>> >>
> > >>> >> Between attachment mails and some of these and you're well on
> your way out.
> > >>> >>
> > >>> >> The default mailman settings and logic require 5 bounces to
> trigger
> > >>> >> unsubscription and 7 days of NO bounces to reset the counter.
> > >>> >>
> > >>> >> Christian
> > >>> >>
> > >>> >> On Mon, 16 Oct 2017 12:23:25 +0900 Christian Balzer wrote:
> > >>> >>
> > >>> >> > On Mon, 16 Oct 2017 14:15:22 +1100 Blair Bethwaite wrote:
> > >>> >> >
> > >>> >> > > Thanks Christian,
> > >>> >> > >
> > >>> >> > > You're no doubt on the right track, but I'd really like to
> figure out
> > >>> >> > > what it is at my end - I'm unlikely to be the only person
> subscribed
> > >>> >> > > to ceph-users via a gmail account.
> > >>> >> > >
> > >>> >> > > Re. attachments, I'm surprised mailman would be allowing them
> in the
> > >>> >> > > first place, and even so gmail's attachment requirements are
> less
> > >>> >> > > strict than most corporate email setups (those that don't
> already use
> > >>> >> > > a cloud provider).
> > >>> >> > >
> > >>> >> > Mailman doesn't do anything with this by default AFAIK, but see
> below.
> > >>> >> > Strict is fine if you're in control, corporate mail can be
> hell, doubly so
> > >>> >> > if on M$ cloud.
> > >>> >> >
> > >>> >> > > This started happening earlier in the year after I turned off
> digest
> > >>> >> > > mode. I also have a paid google domain, maybe I'll try setting
> > >>> >> > > delivery to that address and seeing if anything changes...
> > >>> >> > >
> > >>> >> > Don't think google domain is handled differently, but what do I
> know.
> > >>> >> >
> > >>> >> > Though the digest bit confirms my suspicion about attachments:
> > >>> >> > ---
> > >>> >> > When a subscriber chooses to receive plain text daily “digests”
> of list
> > >>> >> > messages, Mailman sends the digest messages without any original
> > >>> >> > attachments (in Mailman lingo, it “scrubs” the messages of
> attachments).
> > >>> >> > However, Mailman also includes links to the original
> attachments that the
> > >>> >> > recipient can click on.
> > >>> >> > ---
> > >>> >> >
> > >>> >> > Christian
> > >>> >> >
> > >>> >> > > Cheers,
> > >>> >> > >
> > >>> >> > > On 16 October 2017 at 13:54, Christian Balzer 
> wrote:
> > >>> >> > > >
> > >>> >> > > > Hello,
> > >>> >> > > >
> > >>> >> > > > You're on gmail.
> > >>> >> > > >
> > >>> >> > > > Aside from various potential false positives with regards
> to spam my b

Re: [ceph-users] list admin issues

2018-10-08 Thread Martin Palma
Same here also on Gmail with G Suite.
On Mon, Oct 8, 2018 at 12:31 AM Paul Emmerich  wrote:
>
> I'm also seeing this once every few months or so on Gmail with G Suite.
>
> Paul
> Am So., 7. Okt. 2018 um 08:18 Uhr schrieb Joshua Chen
> :
> >
> > I also got removed once, got another warning once (need to re-enable).
> >
> > Cheers
> > Joshua
> >
> >
> > On Sun, Oct 7, 2018 at 5:38 AM Svante Karlsson  
> > wrote:
> >>
> >> I'm also getting removed but not only from ceph. I subscribe 
> >> d...@kafka.apache.org list and the same thing happens there.
> >>
> >> Den lör 6 okt. 2018 kl 23:24 skrev Jeff Smith :
> >>>
> >>> I have been removed twice.
> >>> On Sat, Oct 6, 2018 at 7:07 AM Elias Abacioglu
> >>>  wrote:
> >>> >
> >>> > Hi,
> >>> >
> >>> > I'm bumping this old thread cause it's getting annoying. My membership 
> >>> > get disabled twice a month.
> >>> > Between my two Gmail accounts I'm in more than 25 mailing lists and I 
> >>> > see this behavior only here. Why is only ceph-users only affected? 
> >>> > Maybe Christian was on to something, is this intentional?
> >>> > Reality is that there is a lot of ceph-users with Gmail accounts, 
> >>> > perhaps it wouldn't be so bad to actually trying to figure this one out?
> >>> >
> >>> > So can the maintainers of this list please investigate what actually 
> >>> > gets bounced? Look at my address if you want.
> >>> > I got disabled 20181006, 20180927, 20180916, 20180725, 20180718 most 
> >>> > recently.
> >>> > Please help!
> >>> >
> >>> > Thanks,
> >>> > Elias
> >>> >
> >>> > On Mon, Oct 16, 2017 at 5:41 AM Christian Balzer  wrote:
> >>> >>
> >>> >>
> >>> >> Most mails to this ML score low or negatively with SpamAssassin, 
> >>> >> however
> >>> >> once in a while (this is a recent one) we get relatively high scores.
> >>> >> Note that the forged bits are false positives, but the SA is up to 
> >>> >> date and
> >>> >> google will have similar checks:
> >>> >> ---
> >>> >> X-Spam-Status: No, score=3.9 required=10.0 tests=BAYES_00,DCC_CHECK,
> >>> >>  
> >>> >> FORGED_MUA_MOZILLA,FORGED_YAHOO_RCVD,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
> >>> >>  
> >>> >> HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MIME_HTML_MOSTLY,RCVD_IN_MSPIKE_H4,
> >>> >>  RCVD_IN_MSPIKE_WL,RDNS_NONE,T_DKIM_INVALID shortcircuit=no 
> >>> >> autolearn=no
> >>> >> ---
> >>> >>
> >>> >> Between attachment mails and some of these and you're well on your way 
> >>> >> out.
> >>> >>
> >>> >> The default mailman settings and logic require 5 bounces to trigger
> >>> >> unsubscription and 7 days of NO bounces to reset the counter.
> >>> >>
> >>> >> Christian
> >>> >>
> >>> >> On Mon, 16 Oct 2017 12:23:25 +0900 Christian Balzer wrote:
> >>> >>
> >>> >> > On Mon, 16 Oct 2017 14:15:22 +1100 Blair Bethwaite wrote:
> >>> >> >
> >>> >> > > Thanks Christian,
> >>> >> > >
> >>> >> > > You're no doubt on the right track, but I'd really like to figure 
> >>> >> > > out
> >>> >> > > what it is at my end - I'm unlikely to be the only person 
> >>> >> > > subscribed
> >>> >> > > to ceph-users via a gmail account.
> >>> >> > >
> >>> >> > > Re. attachments, I'm surprised mailman would be allowing them in 
> >>> >> > > the
> >>> >> > > first place, and even so gmail's attachment requirements are less
> >>> >> > > strict than most corporate email setups (those that don't already 
> >>> >> > > use
> >>> >> > > a cloud provider).
> >>> >> > >
> >>> >> > Mailman doesn't do anything with this by default AFAIK, but see 
> >>> >> > below.
> >>> >> > Strict is fine if you're in control, corporate mail can be hell, 
> >>> >> > doubly so
> >>> >> > if on M$ cloud.
> >>> >> >
> >>> >> > > This started happening earlier in the year after I turned off 
> >>> >> > > digest
> >>> >> > > mode. I also have a paid google domain, maybe I'll try setting
> >>> >> > > delivery to that address and seeing if anything changes...
> >>> >> > >
> >>> >> > Don't think google domain is handled differently, but what do I know.
> >>> >> >
> >>> >> > Though the digest bit confirms my suspicion about attachments:
> >>> >> > ---
> >>> >> > When a subscriber chooses to receive plain text daily “digests” of 
> >>> >> > list
> >>> >> > messages, Mailman sends the digest messages without any original
> >>> >> > attachments (in Mailman lingo, it “scrubs” the messages of 
> >>> >> > attachments).
> >>> >> > However, Mailman also includes links to the original attachments 
> >>> >> > that the
> >>> >> > recipient can click on.
> >>> >> > ---
> >>> >> >
> >>> >> > Christian
> >>> >> >
> >>> >> > > Cheers,
> >>> >> > >
> >>> >> > > On 16 October 2017 at 13:54, Christian Balzer  
> >>> >> > > wrote:
> >>> >> > > >
> >>> >> > > > Hello,
> >>> >> > > >
> >>> >> > > > You're on gmail.
> >>> >> > > >
> >>> >> > > > Aside from various potential false positives with regards to 
> >>> >> > > > spam my bet
> >>> >> > > > is that gmail's known dislike for attachments is the cause of 
> >>> >> > > > these
> >>> >> > > > bounces and that setting is beyond your control

Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Kevin Olbrich
Hi!

Yes, thank you. At least on one node this works; the other node just
freezes, but this might be caused by a bad disk that I am trying to find.
Kevin
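
When ceph-volume itself hangs on a sick disk, the same mapping can be followed by
hand; a sketch with osd.29 as an example id:

# readlink /var/lib/ceph/osd/ceph-29/block          # -> the LV device node (/dev/<vg>/<lv>)
# lvs -o +devices,lv_tags | grep 'ceph.osd_id=29'   # LV -> PV/raw device, plus ceph's lv tags
# smartctl -a /dev/sdX                              # whatever device the previous step showed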

Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander :

> Hi,
>
> $ ceph-volume lvm list
>
> Does that work for you?
>
> Wido
>
> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
> > Hi!
> >
> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
> > Before I migrated from filestore with simple-mode to bluestore with lvm,
> > I was able to find the raw disk with "df".
> > Now, I need to go from LVM LV to PV to disk every time I need to
> > check/smartctl a disk.
> >
> > Kevin
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Yan, Zheng
On Mon, Oct 8, 2018 at 5:43 PM Sergey Malinin  wrote:
>
>
>
> > On 8.10.2018, at 12:37, Yan, Zheng  wrote:
> >
> > On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin  wrote:
> >>
> >> What additional steps need to be taken in order to (try to) regain access 
> >> to the fs providing that I backed up metadata pool, created alternate 
> >> metadata pool and ran scan_extents, scan_links, scan_inodes, and somewhat 
> >> recursive scrub.
> >> After that I only mounted the fs read-only to backup the data.
> >> Would anything even work if I had mds journal and purge queue truncated?
> >>
> >
> > did you backed up whole metadata pool?  did you make any modification
> > to the original metadata pool? If you did, what modifications?
>
> I backed up both journal and purge queue and used cephfs-journal-tool to 
> recover dentries, then reset journal and purge queue on original metadata 
> pool.

You can try restoring the original journal and purge queue, then downgrade
mds to 13.2.1. Journal object names are 20x., purge queue
object names are 50x.x.

> Before proceeding to alternate metadata pool recovery I was able to start MDS 
> but it soon failed throwing lots of 'loaded dup inode' errors, not sure if 
> that involved changing anything in the pool.
> I have left the original metadata pool untouched sine then.
>
>
> >
> > Yan, Zheng
> >
> >>
> >>> On 8.10.2018, at 05:15, Yan, Zheng  wrote:
> >>>
> >>> Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and
> >>> marking mds repaird can resolve this.
> >>>
> >>> Yan, Zheng
> >>> On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin  wrote:
> 
>  Update:
>  I discovered http://tracker.ceph.com/issues/24236 and 
>  https://github.com/ceph/ceph/pull/22146
>  Make sure that it is not relevant in your case before proceeding to 
>  operations that modify on-disk data.
> 
> 
>  On 6.10.2018, at 03:17, Sergey Malinin  wrote:
> 
>  I ended up rescanning the entire fs using alternate metadata pool 
>  approach as in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
>  The process has not competed yet because during the recovery our cluster 
>  encountered another problem with OSDs that I got fixed yesterday (thanks 
>  to Igor Fedotov @ SUSE).
>  The first stage (scan_extents) completed in 84 hours (120M objects in 
>  data pool on 8 hdd OSDs on 4 hosts). The second (scan_inodes) was 
>  interrupted by OSDs failure so I have no timing stats but it seems to be 
>  runing 2-3 times faster than extents scan.
>  As to root cause -- in my case I recall that during upgrade I had 
>  forgotten to restart 3 OSDs, one of which was holding metadata pool 
>  contents, before restarting MDS daemons and that seemed to had an impact 
>  on MDS journal corruption, because when I restarted those OSDs, MDS was 
>  able to start up but soon failed throwing lots of 'loaded dup inode' 
>  errors.
> 
> 
>  On 6.10.2018, at 00:41, Alfredo Daniel Rezinovsky 
>   wrote:
> 
>  Same problem...
> 
>  # cephfs-journal-tool --journal=purge_queue journal inspect
>  2018-10-05 18:37:10.704 7f01f60a9bc0 -1 Missing object 500.016c
>  Overall journal integrity: DAMAGED
>  Objects missing:
>  0x16c
>  Corrupt regions:
>  0x5b00-
> 
>  Just after upgrade to 13.2.2
> 
>  Did you fixed it?
> 
> 
>  On 26/09/18 13:05, Sergey Malinin wrote:
> 
>  Hello,
>  Followed standard upgrade procedure to upgrade from 13.2.1 to 13.2.2.
>  After upgrade MDS cluster is down, mds rank 0 and purge_queue journal 
>  are damaged. Resetting purge_queue does not seem to work well as journal 
>  still appears to be damaged.
>  Can anybody help?
> 
>  mds log:
> 
>  -789> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.mds2 Updating MDS map 
>  to version 586 from mon.2
>  -788> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map i 
>  am now mds.0.583
>  -787> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map 
>  state change up:rejoin --> up:active
>  -786> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 recovery_done -- 
>  successful recovery!
>  
>   -38> 2018-09-26 18:42:32.707 7f70f28a7700 -1 mds.0.purge_queue 
>  _consume: Decode error at read_pos=0x322ec6636
>   -37> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 
>  set_want_state: up:active -> down:damaged
>   -36> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 _send 
>  down:damaged seq 137
>   -35> 2018-09-26 18:42:32.707 7f70f28a7700 10 monclient: 
>  _send_mon_message to mon.ceph3 at mon:6789/0
>   -34> 2018-09-26 18:42:32.707 7f70f28a7700  1 -- mds:6800/e4cc09cf --> 
>  mon:6789/0 -- mdsbeacon(14c72/mds2 down:damaged seq 137 v24a) v7 -- 
>  0x563b321ad480 con 0
>  
>    -3> 

Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Wido den Hollander
Hi,

$ ceph-volume lvm list

Does that work for you?

Wido

On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
> Hi!
> 
> Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
> Before I migrated from filestore with simple-mode to bluestore with lvm,
> I was able to find the raw disk with "df".
> Now, I need to go from LVM LV to PV to disk every time I need to
> check/smartctl a disk.
> 
> Kevin
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Kevin Olbrich
Hi!

Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
Before I migrated from filestore with simple-mode to bluestore with lvm, I
was able to find the raw disk with "df".
Now, I need to go from LVM LV to PV to disk every time I need to
check/smartctl a disk.

Kevin
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Don't upgrade to 13.2.2 if you use cephfs

2018-10-08 Thread Daniel Carrasco
El lun., 8 oct. 2018 5:44, Yan, Zheng  escribió:

> On Mon, Oct 8, 2018 at 11:34 AM Daniel Carrasco 
> wrote:
> >
> > I've got several problems on 12.2.8 too. All my standby MDS uses a lot
> of memory (while active uses normal memory), and I'm receiving a lot of
> slow MDS messages (causing the webpage to freeze and fail until MDS are
> restarted)... Finally I had to copy the entire site to DRBD and use NFS to
> solve all problems...
> >
>
> was standby-replay enabled?
>

I've tried both and I've seen more or less the same behavior, maybe less when
it is not in replay mode.

Anyway, we've deactivated CephFS there for now. I'll try older versions in a
test environment.


> > El lun., 8 oct. 2018 a las 5:21, Alex Litvak (<
> alexander.v.lit...@gmail.com>) escribió:
> >>
> >> How is this not an emergency announcement?  Also I wonder if I can
> >> downgrade at all ?  I am using ceph with docker deployed with
> >> ceph-ansible.  I wonder if I should push downgrade or basically wait for
> >> the fix.  I believe, a fix needs to be provided.
> >>
> >> Thank you,
> >>
> >> On 10/7/2018 9:30 PM, Yan, Zheng wrote:
> >> > There is a bug in v13.2.2 mds, which causes decoding purge queue to
> >> > fail. If mds is already in damaged state, please downgrade mds to
> >> > 13.2.1, then run 'ceph mds repaired fs_name:damaged_rank' .
> >> >
> >> > Sorry for all the trouble I caused.
> >> > Yan, Zheng
> >> >
> >>
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> > --
> > _
> >
> >   Daniel Carrasco Marín
> >   Ingeniería para la Innovación i2TIC, S.L.
> >   Tlf:  +34 911 12 32 84 Ext: 223
> >   www.i2tic.com
> > _
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds_cache_memory_limit value

2018-10-08 Thread John Spray
On Fri, Oct 5, 2018 at 9:33 AM Hervé Ballans
 wrote:
>
> Hi all,
>
> I have just configured a new value for 'mds_cache_memory_limit'. The output 
> message tells "not observed, change may require restart".
> So I'm not really sure, has the new value been taken into account directly or 
> do I have to restart the mds daemons on each MDS node ?

That one is handled at runtime, shouldn't need a restart.  The command
line's output is a little bit misleading, as we currently just lack
the internal knowledge to say whether a given config setting requires
a restart or not.

John
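
A quick way to confirm the value the running daemon is actually using (run on the
host where that MDS lives; the daemon name is an example):

$ sudo ceph daemon mds.mon0 config get mds_cache_memory_limit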

> $ sudo ceph tell mds.* injectargs '--mds_cache_memory_limit 17179869184';
> 2018-10-04 16:25:11.692131 7f3012ffd700  0 client.2226325 ms_handle_reset on 
> IP1:6804/2649460488
> 2018-10-04 16:25:11.714746 7f3013fff700  0 client.4154799 ms_handle_reset on 
> IP1:6804/2649460488
> mds.mon1: mds_cache_memory_limit = '17179869184' (not observed, change may 
> require restart)
> 2018-10-04 16:25:11.725028 7f3012ffd700  0 client.4154802 ms_handle_reset on 
> IP0:6805/997393445
> 2018-10-04 16:25:11.748790 7f3013fff700  0 client.4154805 ms_handle_reset on 
> IP0:6805/997393445
> mds.mon0: mds_cache_memory_limit = '17179869184' (not observed, change may 
> require restart)
> 2018-10-04 16:25:11.760127 7f3012ffd700  0 client.2226334 ms_handle_reset on 
> IP2:6801/2590484227
> 2018-10-04 16:25:11.787951 7f3013fff700  0 client.2226337 ms_handle_reset on 
> IP2:6801/2590484227
> mds.mon2: mds_cache_memory_limit = '17179869184' (not observed, change may 
> require restart)
>
> Thanks,
> Hervé
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] daahboard

2018-10-08 Thread John Spray
Assuming that ansible is correctly running "ceph mgr module enable
dashboard", then the next place to look is in "ceph status" (any
errors?) and "ceph mgr module ls" (any reports of the module unable to
run?)

John
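
Roughly, something like this (the listening address and port depend on the
release and any dashboard settings):

$ ceph mgr module enable dashboard
$ ceph mgr module ls      # 'dashboard' should be listed under enabled_modules
$ ceph mgr services       # shows the URL the active mgr is serving, if the module came up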
On Sat, Oct 6, 2018 at 1:53 AM solarflow99  wrote:
>
> I enabled the dashboard module in ansible but I don't see ceph-mgr listening 
> on a port for it.  Is there something else I missed?
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Sergey Malinin



> On 8.10.2018, at 12:37, Yan, Zheng  wrote:
> 
> On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin  wrote:
>> 
>> What additional steps need to be taken in order to (try to) regain access to 
>> the fs providing that I backed up metadata pool, created alternate metadata 
>> pool and ran scan_extents, scan_links, scan_inodes, and somewhat recursive 
>> scrub.
>> After that I only mounted the fs read-only to backup the data.
>> Would anything even work if I had mds journal and purge queue truncated?
>> 
> 
> did you backed up whole metadata pool?  did you make any modification
> to the original metadata pool? If you did, what modifications?

I backed up both journal and purge queue and used cephfs-journal-tool to 
recover dentries, then reset journal and purge queue on original metadata pool.
Before proceeding to alternate metadata pool recovery I was able to start MDS 
but it soon failed throwing lots of 'loaded dup inode' errors, not sure if that 
involved changing anything in the pool.
I have left the original metadata pool untouched since then.


> 
> Yan, Zheng
> 
>> 
>>> On 8.10.2018, at 05:15, Yan, Zheng  wrote:
>>> 
>>> Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and
>>> marking mds repaird can resolve this.
>>> 
>>> Yan, Zheng
>>> On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin  wrote:
 
 Update:
 I discovered http://tracker.ceph.com/issues/24236 and 
 https://github.com/ceph/ceph/pull/22146
 Make sure that it is not relevant in your case before proceeding to 
 operations that modify on-disk data.
 
 
 On 6.10.2018, at 03:17, Sergey Malinin  wrote:
 
 I ended up rescanning the entire fs using alternate metadata pool approach 
 as in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
 The process has not competed yet because during the recovery our cluster 
 encountered another problem with OSDs that I got fixed yesterday (thanks 
 to Igor Fedotov @ SUSE).
 The first stage (scan_extents) completed in 84 hours (120M objects in data 
 pool on 8 hdd OSDs on 4 hosts). The second (scan_inodes) was interrupted 
 by OSDs failure so I have no timing stats but it seems to be runing 2-3 
 times faster than extents scan.
 As to root cause -- in my case I recall that during upgrade I had 
 forgotten to restart 3 OSDs, one of which was holding metadata pool 
 contents, before restarting MDS daemons and that seemed to had an impact 
 on MDS journal corruption, because when I restarted those OSDs, MDS was 
 able to start up but soon failed throwing lots of 'loaded dup inode' 
 errors.
 
 
 On 6.10.2018, at 00:41, Alfredo Daniel Rezinovsky  
 wrote:
 
 Same problem...
 
 # cephfs-journal-tool --journal=purge_queue journal inspect
 2018-10-05 18:37:10.704 7f01f60a9bc0 -1 Missing object 500.016c
 Overall journal integrity: DAMAGED
 Objects missing:
 0x16c
 Corrupt regions:
 0x5b00-
 
 Just after upgrade to 13.2.2
 
 Did you fixed it?
 
 
 On 26/09/18 13:05, Sergey Malinin wrote:
 
 Hello,
 Followed standard upgrade procedure to upgrade from 13.2.1 to 13.2.2.
 After upgrade MDS cluster is down, mds rank 0 and purge_queue journal are 
 damaged. Resetting purge_queue does not seem to work well as journal still 
 appears to be damaged.
 Can anybody help?
 
 mds log:
 
 -789> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.mds2 Updating MDS map to 
 version 586 from mon.2
 -788> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map i 
 am now mds.0.583
 -787> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map 
 state change up:rejoin --> up:active
 -786> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 recovery_done -- 
 successful recovery!
 
  -38> 2018-09-26 18:42:32.707 7f70f28a7700 -1 mds.0.purge_queue _consume: 
 Decode error at read_pos=0x322ec6636
  -37> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 
 set_want_state: up:active -> down:damaged
  -36> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 _send 
 down:damaged seq 137
  -35> 2018-09-26 18:42:32.707 7f70f28a7700 10 monclient: _send_mon_message 
 to mon.ceph3 at mon:6789/0
  -34> 2018-09-26 18:42:32.707 7f70f28a7700  1 -- mds:6800/e4cc09cf --> 
 mon:6789/0 -- mdsbeacon(14c72/mds2 down:damaged seq 137 v24a) v7 -- 
 0x563b321ad480 con 0
 
   -3> 2018-09-26 18:42:32.743 7f70f98b5700  5 -- mds:6800/3838577103 >> 
 mon:6789/0 conn(0x563b3213e000 :-1 
 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=8 cs=1 l=1). rx mon.2 
 seq 29 0x563b321ab880 mdsbeaco
 n(85106/mds2 down:damaged seq 311 v587) v7
   -2> 2018-09-26 18:42:32.743 7f70f98b5700  1 -- mds:6800/3838577103 <== 
 mon.2 mon:6789/0 29  mdsbeacon(85106/mds2 down:damaged seq 311 v587) 

Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Yan, Zheng
On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin  wrote:
>
> What additional steps need to be taken in order to (try to) regain access to 
> the fs providing that I backed up metadata pool, created alternate metadata 
> pool and ran scan_extents, scan_links, scan_inodes, and somewhat recursive 
> scrub.
> After that I only mounted the fs read-only to backup the data.
> Would anything even work if I had mds journal and purge queue truncated?
>

Did you back up the whole metadata pool? Did you make any modifications
to the original metadata pool? If you did, what modifications?

Yan, Zheng

>
> > On 8.10.2018, at 05:15, Yan, Zheng  wrote:
> >
> > Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and
> > marking mds repaird can resolve this.
> >
> > Yan, Zheng
> > On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin  wrote:
> >>
> >> Update:
> >> I discovered http://tracker.ceph.com/issues/24236 and 
> >> https://github.com/ceph/ceph/pull/22146
> >> Make sure that it is not relevant in your case before proceeding to 
> >> operations that modify on-disk data.
> >>
> >>
> >> On 6.10.2018, at 03:17, Sergey Malinin  wrote:
> >>
> >> I ended up rescanning the entire fs using alternate metadata pool approach 
> >> as in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
> >> The process has not competed yet because during the recovery our cluster 
> >> encountered another problem with OSDs that I got fixed yesterday (thanks 
> >> to Igor Fedotov @ SUSE).
> >> The first stage (scan_extents) completed in 84 hours (120M objects in data 
> >> pool on 8 hdd OSDs on 4 hosts). The second (scan_inodes) was interrupted 
> >> by OSDs failure so I have no timing stats but it seems to be runing 2-3 
> >> times faster than extents scan.
> >> As to root cause -- in my case I recall that during upgrade I had 
> >> forgotten to restart 3 OSDs, one of which was holding metadata pool 
> >> contents, before restarting MDS daemons and that seemed to had an impact 
> >> on MDS journal corruption, because when I restarted those OSDs, MDS was 
> >> able to start up but soon failed throwing lots of 'loaded dup inode' 
> >> errors.
> >>
> >>
> >> On 6.10.2018, at 00:41, Alfredo Daniel Rezinovsky  
> >> wrote:
> >>
> >> Same problem...
> >>
> >> # cephfs-journal-tool --journal=purge_queue journal inspect
> >> 2018-10-05 18:37:10.704 7f01f60a9bc0 -1 Missing object 500.016c
> >> Overall journal integrity: DAMAGED
> >> Objects missing:
> >>  0x16c
> >> Corrupt regions:
> >>  0x5b00-
> >>
> >> Just after upgrade to 13.2.2
> >>
> >> Did you fixed it?
> >>
> >>
> >> On 26/09/18 13:05, Sergey Malinin wrote:
> >>
> >> Hello,
> >> Followed standard upgrade procedure to upgrade from 13.2.1 to 13.2.2.
> >> After upgrade MDS cluster is down, mds rank 0 and purge_queue journal are 
> >> damaged. Resetting purge_queue does not seem to work well as journal still 
> >> appears to be damaged.
> >> Can anybody help?
> >>
> >> mds log:
> >>
> >>  -789> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.mds2 Updating MDS map 
> >> to version 586 from mon.2
> >>  -788> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map i 
> >> am now mds.0.583
> >>  -787> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map 
> >> state change up:rejoin --> up:active
> >>  -786> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 recovery_done -- 
> >> successful recovery!
> >> 
> >>   -38> 2018-09-26 18:42:32.707 7f70f28a7700 -1 mds.0.purge_queue _consume: 
> >> Decode error at read_pos=0x322ec6636
> >>   -37> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 
> >> set_want_state: up:active -> down:damaged
> >>   -36> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 _send 
> >> down:damaged seq 137
> >>   -35> 2018-09-26 18:42:32.707 7f70f28a7700 10 monclient: 
> >> _send_mon_message to mon.ceph3 at mon:6789/0
> >>   -34> 2018-09-26 18:42:32.707 7f70f28a7700  1 -- mds:6800/e4cc09cf --> 
> >> mon:6789/0 -- mdsbeacon(14c72/mds2 down:damaged seq 137 v24a) v7 -- 
> >> 0x563b321ad480 con 0
> >> 
> >>-3> 2018-09-26 18:42:32.743 7f70f98b5700  5 -- mds:6800/3838577103 >> 
> >> mon:6789/0 conn(0x563b3213e000 :-1 
> >> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=8 cs=1 l=1). rx mon.2 
> >> seq 29 0x563b321ab880 mdsbeaco
> >> n(85106/mds2 down:damaged seq 311 v587) v7
> >>-2> 2018-09-26 18:42:32.743 7f70f98b5700  1 -- mds:6800/3838577103 <== 
> >> mon.2 mon:6789/0 29  mdsbeacon(85106/mds2 down:damaged seq 311 v587) 
> >> v7  129+0+0 (3296573291 0 0) 0x563b321ab880 con 0x563b3213e
> >> 000
> >>-1> 2018-09-26 18:42:32.743 7f70f98b5700  5 mds.beacon.mds2 
> >> handle_mds_beacon down:damaged seq 311 rtt 0.038261
> >> 0> 2018-09-26 18:42:32.743 7f70f28a7700  1 mds.mds2 respawn!
> >>
> >> # cephfs-journal-tool --journal=purge_queue journal inspect
> >> Overall journal integrity: DAMAGED
> >> Corrupt regions:
> >>  0x322ec65d9-
> >>
> >> # cephfs-journal-tool --journal=purge_queue journal

Re: [ceph-users] After 13.2.2 upgrade: bluefs mount failed to replay log: (5) Input/output error

2018-10-08 Thread Kevin Olbrich
Hi Paul!

I installed ceph-debuginfo and set these:
debug bluestore = 20/20
debug osd = 20/20
debug bluefs = 20/20
debug bdev = 20/20

V: ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic
(stable)

*LOGS*

*OSD 29:*
2018-10-08 10:29:06.001 7f810511a1c0 20 bluefs _read left 0x4d000 len 0x1000
2018-10-08 10:29:06.001 7f810511a1c0 20 bluefs _read got 4096
2018-10-08 10:29:06.001 7f810511a1c0 10 bluefs _replay 0x12b3000: stop:
uuid e510614a-7ca4-eb59-0383-010189889f01 != super.uuid
4df25e30-4769-47b5-b569-01b3f83de70c
2018-10-08 10:29:06.001 7f810511a1c0 10 bluefs _replay log file size was
0x12b3000
2018-10-08 10:29:06.001 7f810511a1c0 -1 bluefs _replay file with link count
0: file(ino 519 size 0x31e2f42 mtime 2018-10-02 12:24:22.632397 bdev 1
allocated 320 extents
[1:0x700820+10,1:0x700900+10,1:0x700910+10,1:0x700920+10,1:0x700930+10,1:0x700940+10,1:0x700950+10,1:0x700960+10,1:0x700970+10,1:0x700980+10,1:0x700990+10,1:0x7009a0+10,1:0x7009b0+10,1:0x7009c0+10,1:0x7009d0+10,1:0x7009e0+10,1:0x7009f0+10,1:0x700a00+10,1:0x700a10+10,1:0x700a20+10,1:0x700a30+10,1:0x700a40+10,1:0x700a50+10,1:0x700a60+10,1:0x700a70+10,1:0x700a80+10,1:0x700a90+10,1:0x700aa0+10,1:0x700ab0+10,1:0x700ac0+10,1:0x700ad0+10,1:0x700ae0+10,1:0x700af0+10,1:0x700b00+10,1:0x700b10+10,1:0x700b20+10,1:0x700b30+10,1:0x700b40+10,1:0x700b50+10,1:0x700b60+10,1:0x700b70+10,1:0x700b80+10,1:0x700b90+10,1:0x700ba0+10,1:0x700bb0+10,1:0x700bc0+10,1:0x700bd0+10,1:0x700be0+10,1:0x700bf0+10,1:0x700c00+10])
2018-10-08 10:29:06.001 7f810511a1c0 -1 bluefs mount failed to replay log:
(5) Input/output error
2018-10-08 10:29:06.001 7f810511a1c0 20 bluefs _stop_alloc
2018-10-08 10:29:06.001 7f810511a1c0 10 bdev(0x558b8f1dea80
/var/lib/ceph/osd/ceph-29/block) discard_drain
2018-10-08 10:29:06.001 7f810511a1c0  1 stupidalloc 0x0x558b8f34d0a0
shutdown
2018-10-08 10:29:06.001 7f810511a1c0 -1
bluestore(/var/lib/ceph/osd/ceph-29) _open_db failed bluefs mount: (5)
Input/output error
2018-10-08 10:29:06.001 7f810511a1c0 20 bdev aio_wait 0x558b8f34f440 done
2018-10-08 10:29:06.001 7f810511a1c0  1 bdev(0x558b8f1dea80
/var/lib/ceph/osd/ceph-29/block) close
2018-10-08 10:29:06.001 7f810511a1c0 10 bdev(0x558b8f1dea80
/var/lib/ceph/osd/ceph-29/block) _aio_stop
2018-10-08 10:29:06.066 7f80ed75f700 10 bdev(0x558b8f1dea80
/var/lib/ceph/osd/ceph-29/block) _aio_thread end
2018-10-08 10:29:06.073 7f810511a1c0 10 bdev(0x558b8f1dea80
/var/lib/ceph/osd/ceph-29/block) _discard_stop
2018-10-08 10:29:06.073 7f80ecf5e700 20 bdev(0x558b8f1dea80
/var/lib/ceph/osd/ceph-29/block) _discard_thread wake
2018-10-08 10:29:06.073 7f80ecf5e700 10 bdev(0x558b8f1dea80
/var/lib/ceph/osd/ceph-29/block) _discard_thread finish
2018-10-08 10:29:06.073 7f810511a1c0 10 bdev(0x558b8f1dea80
/var/lib/ceph/osd/ceph-29/block) _discard_stop stopped
2018-10-08 10:29:06.073 7f810511a1c0  1 bdev(0x558b8f1de000
/var/lib/ceph/osd/ceph-29/block) close
2018-10-08 10:29:06.073 7f810511a1c0 10 bdev(0x558b8f1de000
/var/lib/ceph/osd/ceph-29/block) _aio_stop
2018-10-08 10:29:06.315 7f80ee761700 10 bdev(0x558b8f1de000
/var/lib/ceph/osd/ceph-29/block) _aio_thread end
2018-10-08 10:29:06.321 7f810511a1c0 10 bdev(0x558b8f1de000
/var/lib/ceph/osd/ceph-29/block) _discard_stop
2018-10-08 10:29:06.321 7f80edf60700 20 bdev(0x558b8f1de000
/var/lib/ceph/osd/ceph-29/block) _discard_thread wake
2018-10-08 10:29:06.321 7f80edf60700 10 bdev(0x558b8f1de000
/var/lib/ceph/osd/ceph-29/block) _discard_thread finish
2018-10-08 10:29:06.321 7f810511a1c0 10 bdev(0x558b8f1de000
/var/lib/ceph/osd/ceph-29/block) _discard_stop stopped
2018-10-08 10:29:06.322 7f810511a1c0 -1 osd.29 0 OSD:init: unable to mount
object store
2018-10-08 10:29:06.322 7f810511a1c0 -1  ** ERROR: osd init failed: (5)
Input/output error

*OSD 40 (keeps getting restarted by systemd):*
2018-10-08 10:33:01.867 7fbdd21441c0 20 read_log_and_missing 4754'11872
(4754'11871) modify   5:fd843365:::1000229.29b3:head by
client.1109026.0:2115960 2018-09-23 02:48:36.736842 0
2018-10-08 10:33:01.867 7fbdd21441c0 10 bluefs _read_random h
0x5566a75fb480 0x4a2a19~1036 from file(ino 539 size 0x3fa66ff mtime
2018-10-02 12:19:02.174614 bdev 1 allocated 400 extents
[1:0x7004e0+400])
2018-10-08 10:33:01.867 7fbdd21441c0 20 bluefs _read_random read buffered
0x4a2a19~1036 of 1:0x7004e0+400
2018-10-08 10:33:01.867 7fbdd21441c0  5 bdev(0x5566a70dea80
/var/lib/ceph/osd/ceph-40/block) read_random 0x70052a2a19~1036
2018-10-08 10:33:01.867 7fbdd21441c0 20 bluefs _read_random got 4150
2018-10-08 10:33:01.867 7fbdd21441c0 20
bluestore.OmapIteratorImpl(0x5566bddf5f80) valid i

Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-10-08 Thread Sergey Malinin
What additional steps need to be taken in order to (try to) regain access to 
the fs providing that I backed up metadata pool, created alternate metadata 
pool and ran scan_extents, scan_links, scan_inodes, and somewhat recursive 
scrub.
After that I only mounted the fs read-only to backup the data.
Would anything even work if I had mds journal and purge queue truncated?


> On 8.10.2018, at 05:15, Yan, Zheng  wrote:
> 
> Sorry. this is caused wrong backport. downgrading mds to 13.2.1 and
> marking mds repaird can resolve this.
> 
> Yan, Zheng
> On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin  wrote:
>> 
>> Update:
>> I discovered http://tracker.ceph.com/issues/24236 and 
>> https://github.com/ceph/ceph/pull/22146
>> Make sure that it is not relevant in your case before proceeding with
>> operations that modify on-disk data.
>> 
>> 
>> On 6.10.2018, at 03:17, Sergey Malinin  wrote:
>> 
>> I ended up rescanning the entire fs using the alternate metadata pool approach
>> described in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
>> The process has not completed yet because during the recovery our cluster
>> encountered another problem with the OSDs, which I got fixed yesterday (thanks to
>> Igor Fedotov @ SUSE).
>> The first stage (scan_extents) completed in 84 hours (120M objects in the data
>> pool on 8 HDD OSDs on 4 hosts). The second (scan_inodes) was interrupted by
>> the OSD failure, so I have no timing stats, but it seems to be running 2-3 times
>> faster than the extents scan.
>> As to the root cause -- in my case I recall that during the upgrade I had forgotten
>> to restart 3 OSDs, one of which was holding metadata pool contents, before
>> restarting the MDS daemons, and that seemed to have had an impact on the MDS
>> journal corruption, because when I restarted those OSDs the MDS was able to start
>> up but soon failed, throwing lots of 'loaded dup inode' errors.
>> 
>> 
>> On 6.10.2018, at 00:41, Alfredo Daniel Rezinovsky  
>> wrote:
>> 
>> Same problem...
>> 
>> # cephfs-journal-tool --journal=purge_queue journal inspect
>> 2018-10-05 18:37:10.704 7f01f60a9bc0 -1 Missing object 500.016c
>> Overall journal integrity: DAMAGED
>> Objects missing:
>>  0x16c
>> Corrupt regions:
>>  0x5b00-
>> 
>> Just after the upgrade to 13.2.2.
>>
>> Did you fix it?
>> 
>> 
>> On 26/09/18 13:05, Sergey Malinin wrote:
>> 
>> Hello,
>> I followed the standard upgrade procedure to go from 13.2.1 to 13.2.2.
>> After the upgrade the MDS cluster is down; mds rank 0 and the purge_queue journal
>> are damaged. Resetting the purge_queue does not seem to help, as the journal still
>> appears to be damaged.
>> Can anybody help?
>> 
>> mds log:
>> 
>>  -789> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.mds2 Updating MDS map to 
>> version 586 from mon.2
>>  -788> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map i am 
>> now mds.0.583
>>  -787> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map 
>> state change up:rejoin --> up:active
>>  -786> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 recovery_done -- 
>> successful recovery!
>> 
>>   -38> 2018-09-26 18:42:32.707 7f70f28a7700 -1 mds.0.purge_queue _consume: 
>> Decode error at read_pos=0x322ec6636
>>   -37> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 
>> set_want_state: up:active -> down:damaged
>>   -36> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 _send 
>> down:damaged seq 137
>>   -35> 2018-09-26 18:42:32.707 7f70f28a7700 10 monclient: _send_mon_message 
>> to mon.ceph3 at mon:6789/0
>>   -34> 2018-09-26 18:42:32.707 7f70f28a7700  1 -- mds:6800/e4cc09cf --> 
>> mon:6789/0 -- mdsbeacon(14c72/mds2 down:damaged seq 137 v24a) v7 -- 
>> 0x563b321ad480 con 0
>> 
>>-3> 2018-09-26 18:42:32.743 7f70f98b5700  5 -- mds:6800/3838577103 >> 
>> mon:6789/0 conn(0x563b3213e000 :-1 
>> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=8 cs=1 l=1). rx mon.2 seq 
>> 29 0x563b321ab880 mdsbeaco
>> n(85106/mds2 down:damaged seq 311 v587) v7
>>-2> 2018-09-26 18:42:32.743 7f70f98b5700  1 -- mds:6800/3838577103 <== 
>> mon.2 mon:6789/0 29  mdsbeacon(85106/mds2 down:damaged seq 311 v587) v7 
>>  129+0+0 (3296573291 0 0) 0x563b321ab880 con 0x563b3213e
>> 000
>>-1> 2018-09-26 18:42:32.743 7f70f98b5700  5 mds.beacon.mds2 
>> handle_mds_beacon down:damaged seq 311 rtt 0.038261
>> 0> 2018-09-26 18:42:32.743 7f70f28a7700  1 mds.mds2 respawn!
>> 
>> # cephfs-journal-tool --journal=purge_queue journal inspect
>> Overall journal integrity: DAMAGED
>> Corrupt regions:
>>  0x322ec65d9-
>> 
>> # cephfs-journal-tool --journal=purge_queue journal reset
>> old journal was 13470819801~8463
>> new journal start will be 13472104448 (1276184 bytes past old end)
>> writing journal head
>> done
>> 
>> # cephfs-journal-tool --journal=purge_queue journal inspect
>> 2018-09-26 19:00:52.848 7f3f9fa50bc0 -1 Missing object 500.0c8c
>> Overall journal integrity: DAMAGED
>> Objects missing:
>>  0xc8c
>> Corrupt regions:
>>  0x32300-
>> _

Re: [ceph-users] cephfs poor performance

2018-10-08 Thread Yan, Zheng
On Mon, Oct 8, 2018 at 3:38 PM Tomasz Płaza  wrote:
>
>
> On 08.10.2018 at 09:21, Yan, Zheng wrote:
> > On Mon, Oct 8, 2018 at 1:54 PM Tomasz Płaza  wrote:
> >> Hi,
> >>
> >> Can someone please help me improve performance on our CephFS cluster?
> >>
> >> The system in use is CentOS 7.5 with Ceph 12.2.7.
> >> The hardware in use is as follows:
> >> 3xMON/MGR:
> >> 1xIntel(R) Xeon(R) Bronze 3106
> >> 16GB RAM
> >> 2xSSD for system
> >> 1Gbe NIC
> >>
> >> 2xMDS:
> >> 2xIntel(R) Xeon(R) Bronze 3106
> >> 64GB RAM
> >> 2xSSD for system
> >> 10Gbe NIC
> >>
> >> 6xOSD:
> >> 1xIntel(R) Xeon(R) Silver 4108
> >> 2xSSD for system
> >> 6xHGST HUS726060ALE610 SATA HDD's
> >> 1xINTEL SSDSC2BB150G7 for osd db`s (10G partitions) rest for OSD to
> >> place cephfs_metadata
> >> 10Gbe NIC
> >>
> >> pools (default crush rule aware of device class):
> >> rbd with 1024 pg crush rule replicated_hdd
> >> cephfs_data with 256 pg crush rule replicated_hdd
> >> cephfs_metadata with 32 pg crush rule replicated_ssd
> >>
> >> test done by fio: fio --randrepeat=1 --ioengine=libaio --direct=1
> >> --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k
> >> --iodepth=64 --size=1G --readwrite=randrw --rwmixread=75
> >>
> > Kernel version? Maybe the cephfs driver in your kernel does not support
> > AIO (--iodepth is effectively 1).
> >
> > Yan, Zheng
> The kernel is 3.10.0-862.9.1.el7.x86_64 (I can update it to
> 3.10.0-862.14.4.el7), but I do not know how to check AIO support in the kernel
> driver, or whether it is even relevant, because I mounted it with ceph-fuse -n
> client.cephfs -k /etc/ceph/ceph.client.cephfs.keyring -m
> 192.168.10.1:6789 /mnt/cephfs
>
> Tom Płaza
>

please try kernel mount.
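
A minimal kernel-client mount for comparison might look like the following (a sketch
only; it assumes the same client.cephfs credentials, whose key can be printed with
"ceph auth get-key client.cephfs"):

# mount -t ceph 192.168.10.1:6789:/ /mnt/cephfs -o name=cephfs,secret=<key>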

> >> shows IOPS write/read performance as follows:
> >> rbd 3663/1223
> >> cephfs (fuse) 205/68 (which is a little lower than the raw performance of one
> >> HDD used in the cluster)
> >>
> >> Everything is connected to one Cisco 10Gbe switch.
> >> Please help.
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs poor performance

2018-10-08 Thread Tomasz Płaza


On 08.10.2018 at 09:21, Yan, Zheng wrote:

On Mon, Oct 8, 2018 at 1:54 PM Tomasz Płaza  wrote:

Hi,

Can someone please help me improve performance on our CephFS cluster?

The system in use is CentOS 7.5 with Ceph 12.2.7.
The hardware in use is as follows:
3xMON/MGR:
1xIntel(R) Xeon(R) Bronze 3106
16GB RAM
2xSSD for system
1Gbe NIC

2xMDS:
2xIntel(R) Xeon(R) Bronze 3106
64GB RAM
2xSSD for system
10Gbe NIC

6xOSD:
1xIntel(R) Xeon(R) Silver 4108
2xSSD for system
6xHGST HUS726060ALE610 SATA HDD's
1xINTEL SSDSC2BB150G7 for osd db`s (10G partitions) rest for OSD to
place cephfs_metadata
10Gbe NIC

pools (default crush rule aware of device class):
rbd with 1024 pg crush rule replicated_hdd
cephfs_data with 256 pg crush rule replicated_hdd
cephfs_metadata with 32 pg crush rule replicated_ssd

test done by fio: fio --randrepeat=1 --ioengine=libaio --direct=1
--gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k
--iodepth=64 --size=1G --readwrite=randrw --rwmixread=75


Kernel version? Maybe the cephfs driver in your kernel does not support
AIO (--iodepth is effectively 1).

Yan, Zheng
The kernel is 3.10.0-862.9.1.el7.x86_64 (I can update it to
3.10.0-862.14.4.el7), but I do not know how to check AIO support in the kernel
driver, or whether it is even relevant, because I mounted it with ceph-fuse -n
client.cephfs -k /etc/ceph/ceph.client.cephfs.keyring -m
192.168.10.1:6789 /mnt/cephfs
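
A quick, generic way to confirm which client is actually in use is to check the mount
type: a ceph-fuse mount is listed as type fuse.ceph-fuse, while a kernel mount is
listed as type ceph (the output line below is only illustrative):

# mount | grep ceph
ceph-fuse on /mnt/cephfs type fuse.ceph-fuse (rw,nosuid,nodev,...)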


Tom Płaza


shows IOPS write/read performance as follows:
rbd 3663/1223
cephfs (fuse) 205/68 (which is a little lower than the raw performance of one
HDD used in the cluster)

Everything is connected to one Cisco 10Gbe switch.
Please help.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs poor performance

2018-10-08 Thread Marc Roos
 
That is easy, I think, so I will give it a try:

Faster CPUs, fast NVMe disks, everything 10Gbit or even better 100Gbit,
plus a daily prayer.




-Original Message-
From: Tomasz Płaza [mailto:tomasz.pl...@grupawp.pl] 
Sent: Monday, 8 October 2018 7:46
To: ceph-users@lists.ceph.com
Subject: [ceph-users] cephfs poor performance

Hi,

Can someone please help me improve performance on our CephFS cluster?

The system in use is CentOS 7.5 with Ceph 12.2.7.
The hardware in use is as follows:
3xMON/MGR:
1xIntel(R) Xeon(R) Bronze 3106
16GB RAM
2xSSD for system
1Gbe NIC

2xMDS:
2xIntel(R) Xeon(R) Bronze 3106
64GB RAM
2xSSD for system
10Gbe NIC

6xOSD:
1xIntel(R) Xeon(R) Silver 4108
2xSSD for system
6xHGST HUS726060ALE610 SATA HDD's
1xINTEL SSDSC2BB150G7 for osd db`s (10G partitions), rest for OSD to
place cephfs_metadata
10Gbe NIC

pools (default crush rule aware of device class):
rbd with 1024 pg crush rule replicated_hdd
cephfs_data with 256 pg crush rule replicated_hdd
cephfs_metadata with 32 pg crush rule replicated_ssd

test done by fio: fio --randrepeat=1 --ioengine=libaio --direct=1
--gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k
--iodepth=64 --size=1G --readwrite=randrw --rwmixread=75

shows IOPS write/read performance as follows:
rbd 3663/1223
cephfs (fuse) 205/68 (which is a little lower than the raw performance of one
HDD used in the cluster)

Everything is connected to one Cisco 10Gbe switch.
Please help.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs poor performance

2018-10-08 Thread Yan, Zheng
On Mon, Oct 8, 2018 at 1:54 PM Tomasz Płaza  wrote:
>
> Hi,
>
> Can someone please help me improve performance on our CephFS cluster?
> 
> The system in use is CentOS 7.5 with Ceph 12.2.7.
> The hardware in use is as follows:
> 3xMON/MGR:
> 1xIntel(R) Xeon(R) Bronze 3106
> 16GB RAM
> 2xSSD for system
> 1Gbe NIC
>
> 2xMDS:
> 2xIntel(R) Xeon(R) Bronze 3106
> 64GB RAM
> 2xSSD for system
> 10Gbe NIC
>
> 6xOSD:
> 1xIntel(R) Xeon(R) Silver 4108
> 2xSSD for system
> 6xHGST HUS726060ALE610 SATA HDD's
> 1xINTEL SSDSC2BB150G7 for osd db`s (10G partitions) rest for OSD to
> place cephfs_metadata
> 10Gbe NIC
>
> pools (default crush rule aware of device class):
> rbd with 1024 pg crush rule replicated_hdd
> cephfs_data with 256 pg crush rule replicated_hdd
> cephfs_metadata with 32 pg crush rule replicated_ssd
>
> test done by fio: fio --randrepeat=1 --ioengine=libaio --direct=1
> --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k
> --iodepth=64 --size=1G --readwrite=randrw --rwmixread=75
>

Kernel version? Maybe the cephfs driver in your kernel does not support
AIO (--iodepth is effectively 1).
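
A quick way to check whether queue depth is the limiting factor (a sketch reusing the
fio job from the original post): run the same job against the rbd volume with
--iodepth=1; if those numbers drop close to the cephfs figures, the client is likely
serializing the I/O.

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=1 --size=1G --readwrite=randrw --rwmixread=75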

Yan, Zheng

> shows IOPS write/read performance as follows:
> rbd 3663/1223
> cephfs (fuse) 205/68 (which is a little lower than the raw performance of one
> HDD used in the cluster)
>
> Everything is connected to one Cisco 10Gbe switch.
> Please help.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com