Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Brad Hubbard
On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan  wrote:
> I was on 12.2.5 for a couple weeks and started randomly seeing
> corruption, moved to 12.2.6 via yum update on Sunday, and all hell broke
> loose.  I panicked and moved to Mimic, and when that didn't solve the
> problem, only then did I start to root around in mailing lists archives.
>
> It appears I can't downgrade OSDs back to Luminous now that 12.2.7 is
> out, but I'm unsure how to proceed now that the damaged cluster is
> running under Mimic.  Is there anything I can do to get the cluster back
> online and objects readable?

That depends on what the specific problem is. Can you provide some
data that fills in the blanks around "randomly seeing corruption"?
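For example, the output of

    ceph status
    ceph health detail
    ceph osd tree

plus any assert or error lines from the affected OSD logs around the time
you first noticed the corruption, would be a good starting point.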

>
> Everything is BlueStore and most of it is EC.
>
> Thanks.
>
> -Troy
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Jewel PG stuck inconsistent with 3 0-size objects

2018-07-16 Thread Brad Hubbard
Your issue is different: not only do the omap digests of all replicas fail
to match the omap digest from the authoritative object info, they are also
all different from each other.

What is the min_size of pool 67, and what can you tell us about the events
leading up to this?
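(Something like the following should show the former; the pool id comes
from the pg id 67.2e below:

    ceph osd dump | grep "^pool 67 "

or "ceph osd pool get <pool name> min_size" if you prefer to go by name.)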

On Mon, Jul 16, 2018 at 7:06 PM, Matthew Vernon  wrote:
> Hi,
>
> Our cluster is running 10.2.9 (from Ubuntu; on 16.04 LTS), and we have a
> pg that's stuck inconsistent; if I repair it, it logs "failed to pick
> suitable auth object" (repair log attached, to try and stop my MUA
> mangling it).
>
> We then deep-scrubbed that pg, at which point
> rados list-inconsistent-obj 67.2e --format=json-pretty produces a bit of
> output (also attached), which includes that all 3 osds have a zero-sized
> object e.g.
>
> "osd": 1937,
> "errors": [
> "omap_digest_mismatch_oi"
> ],
> "size": 0,
> "omap_digest": "0x45773901",
> "data_digest": "0x"
>
> All 3 osds have different omap_digest, but all have 0 size. Indeed,
> looking on the OSD disks directly, each object is 0 size (i.e. they are
> identical).
>
> This looks similar to one of the failure modes in
> http://tracker.ceph.com/issues/21388 where there is a suggestion (comment
> 19 from David Zafman) to do:
>
> rados -p default.rgw.buckets.index setomapval
> .dir.861ae926-7ff0-48c5-86d6-a6ba8d0a7a14.7130858.6 temporary-key anything
> [deep-scrub]
> rados -p default.rgw.buckets.index rmomapkey
> .dir.861ae926-7ff0-48c5-86d6-a6ba8d0a7a14.7130858.6 temporary-key
>
> Is this likely to be the correct approach here, too? And is there an
> underlying bug in ceph that still needs fixing? :)
>
> Thanks,
>
> Matthew
>
>
>
> --
>  The Wellcome Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Brad Hubbard
Ceph doesn't shut down systems (as in kill or reboot the box), if that's
what you're saying?

On Mon, Jul 23, 2018 at 5:04 PM, Nicolas Huillard  wrote:
> Le lundi 23 juillet 2018 à 11:07 +0700, Konstantin Shalygin a écrit :
>> > I even have no fancy kernel or device, just real standard Debian.
>> > The
>> > uptime was 6 days since the upgrade from 12.2.6...
>>
>> Nicolas, you should upgrade your 12.2.6 to 12.2.7 due to bugs in this
>> release.
>
> That was done (cf. subject).
> This is happening with 12.2.7, fresh and 6 days old.
>
> --
> Nicolas Huillard
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Brad Hubbard
On Thu, Jul 19, 2018 at 12:47 PM, Troy Ablan  wrote:
>
>
> On 07/18/2018 06:37 PM, Brad Hubbard wrote:
>> On Thu, Jul 19, 2018 at 2:48 AM, Troy Ablan  wrote:
>>>
>>>
>>> On 07/17/2018 11:14 PM, Brad Hubbard wrote:
>>>>
>>>> On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan  wrote:
>>>>>
>>>>> I was on 12.2.5 for a couple weeks and started randomly seeing
>>>>> corruption, moved to 12.2.6 via yum update on Sunday, and all hell broke
>>>>> loose.  I panicked and moved to Mimic, and when that didn't solve the
>>>>> problem, only then did I start to root around in mailing lists archives.
>>>>>
>>>>> It appears I can't downgrade OSDs back to Luminous now that 12.2.7 is
>>>>> out, but I'm unsure how to proceed now that the damaged cluster is
>>>>> running under Mimic.  Is there anything I can do to get the cluster back
>>>>> online and objects readable?
>>>>
>>>> That depends on what the specific problem is. Can you provide some
>>>> data that fills in the blanks around "randomly seeing corruption"?
>>>>
>>> Thanks for the reply, Brad.  I have a feeling that almost all of this stems
>>> from the time the cluster spent running 12.2.6.  When booting VMs that use
>>> rbd as a backing store, they typically get I/O errors during boot and cannot
>>> read critical parts of the image.  I also get similar errors if I try to rbd
>>> export most of the images. Also, CephFS is not starting, as ceph -s indicates
>>> damage.
>>>
>>> Many of the OSDs have been crashing and restarting as I've tried to rbd
>>> export good versions of images (from older snapshots).  Here's one
>>> particular crash:
>>>
>>> 2018-07-18 15:52:15.809 7fcbaab77700 -1
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/h
>>> uge/release/13.2.0/rpm/el7/BUILD/ceph-13.2.0/src/os/bluestore/BlueStore.h:
>>> In function 'void
>>> BlueStore::SharedBlobSet::remove_last(BlueStore::SharedBlob*)' thread
>>> 7fcbaab7
>>> 7700 time 2018-07-18 15:52:15.750916
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.0/rpm/el7/BUILD/ceph-13
>>> .2.0/src/os/bluestore/BlueStore.h: 455: FAILED assert(sb->nref == 0)
>>>
>>>  ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic
>>> (stable)
>>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>> const*)+0xff) [0x7fcbc197a53f]
>>>  2: (()+0x286727) [0x7fcbc197a727]
>>>  3: (BlueStore::SharedBlob::put()+0x1da) [0x5641f39181ca]
>>>  4: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>, boost::intrusive_ptr<BlueStore::SharedBlob>, std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >, std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >, std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x2d) [0x5641f3977cfd]
>>>  5: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>, boost::intrusive_ptr<BlueStore::SharedBlob>, std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >, std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >, std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>>>  6: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>, boost::intrusive_ptr<BlueStore::SharedBlob>, std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >, std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >, std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>>>  7: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>, boost::intrusive_ptr<BlueStore::SharedBlob>, std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >, std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >, std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>>>  8: (BlueStore::TransContext::~TransContext()+0xf7) [0x5641f3979297]
>>>  9: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x610)
>>> [0x5641f391c9b0]
>>>  10: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x9a)
>>> [0x5641f392a38a]
>>>  11: (BlueStore::_kv_finalize_thread()+0x41e) [0x5641f392b3be]
>>>  12: (BlueStore::KVFinalizeThread::entry()+0xd) [0x5641f397d85d]
>>>  13: (()+0x7e25) [0x7fcbbe4d2e25]
>>>  14: (clone()+0x6d) [0x7fcbbd5c3bad]
>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
>>> interpret this.
>>>
>>>
>>> Here's the output of ceph -s that might fill in some configuration
>>> questions.  Since osds are continually restarti

Re: [ceph-users] Slow requests

2018-07-04 Thread Brad Hubbard
On Wed, Jul 4, 2018 at 6:26 PM, Benjamin Naber  wrote:
> Hi @all,
>
> I'm currently testing a setup for a production environment based on the 
> following OSD nodes:
>
> CEPH Version: luminous 12.2.5
>
> 5x OSD Nodes with following specs:
>
> - 8-core Intel Xeon, 2.0 GHz
>
> - 96 GB RAM
>
> - 10x 1.92 TB Intel DC S4500, connected via SATA
>
> - 4x 10 Gbit NICs: 2 bonded via LACP for the front-end network and 2 bonded 
> via LACP for the back-end network.
>
> If I run an fio benchmark from a VM running on an RBD device on a KVM test 
> host, the cluster always runs into slow request warnings, and performance 
> drops heavily.
>
> If I dump the OSD that is stuck, I get the following output:
>
> {
> "ops": [
> {
> "description": "osd_op(client.141944.0:359346834 13.1da 
> 13:5b8b7fd3:::rbd_data.170a3238e1f29.00be:head [write 
> 2097152~1048576] snapc 0=[] ondisk+write+known_if_redirected e2755)",
> "initiated_at": "2018-07-04 10:00:49.475879",
> "age": 287.180328,
> "duration": 287.180355,
> "type_data": {
> "flag_point": "waiting for sub ops",
> "client_info": {
> "client": "client.141944",
> "client_addr": "10.111.90.1:0/3532639465",
> "tid": 359346834
> },
> "events": [
> {
> "time": "2018-07-04 10:00:49.475879",
> "event": "initiated"
> },
> {
> "time": "2018-07-04 10:00:49.476935",
> "event": "queued_for_pg"
> },
> {
> "time": "2018-07-04 10:00:49.477547",
> "event": "reached_pg"
> },
> {
> "time": "2018-07-04 10:00:49.477578",
> "event": "started"
> },
> {
> "time": "2018-07-04 10:00:49.477614",
> "event": "waiting for subops from 5,26"
> },
> {
> "time": "2018-07-04 10:00:49.484679",
> "event": "op_commit"
> },
> {
> "time": "2018-07-04 10:00:49.484681",
> "event": "op_applied"
> },
> {
> "time": "2018-07-04 10:00:49.485588",
> "event": "sub_op_commit_rec from 5"
> }
> ]
> }
> },
> {
> "description": "osd_op(client.141944.0:359346835 13.1da 
> 13:5b8b7fd3:::rbd_data.170a3238e1f29.00be:head [write 
> 3145728~1048576] snapc 0=[] ondisk+write+known_if_redirected e2755)",
> "initiated_at": "2018-07-04 10:00:49.477065",
> "age": 287.179143,
> "duration": 287.179221,
> "type_data": {
> "flag_point": "waiting for sub ops",
> "client_info": {
> "client": "client.141944",
> "client_addr": "10.111.90.1:0/3532639465",
> "tid": 359346835
> },
> "events": [
> {
> "time": "2018-07-04 10:00:49.477065",
> "event": "initiated"
> },
> {
> "time": "2018-07-04 10:00:49.478116",
> "event": "queued_for_pg"
> },
> {
> "time": "2018-07-04 10:00:49.478178",
> "event": "reached_pg"
> },
> {
> "time": "2018-07-04 10:00:49.478201",
> "event": "started"
> },
> {
> "time": "2018-07-04 10:00:49.478232",
> "event": "waiting for subops from 5,26"
> },
> {
> "time": "2018-07-04 10:00:49.484695",
> "event": "op_commit"
> },
> {
> "time": "2018-07-04 10:00:49.484696",
> "event": "op_applied"
> },
> {
> "time": "2018-07-04 10:00:49.485621",
> "event": "sub_op_commit_rec from 5"
> }
> ]
> }
> },
> {
> "description": "osd_op(client.141944.0:359346440 13.11d 
> 13:b8afbe4a:::rbd_data.170a3238e1f29.005c:head [write 0~1048576] 
> snapc 0=[] 

Re: [ceph-users] Slow requests

2018-07-09 Thread Brad Hubbard
On Mon, Jul 9, 2018 at 5:28 PM, Benjamin Naber
 wrote:
> Hi @all,
>
> The problem seems to be solved after downgrading from kernel 4.17.2 to 
> 3.10.0-862.
> Has anyone else had issues with newer kernels on OSD nodes?

I'd suggest you pursue that with whoever supports the kernel
exhibiting the problem.

>
> kind regards
>
> Ben
>
>> Brad Hubbard  hat am 5. Juli 2018 um 01:16 geschrieben:
>>
>>
>> On Wed, Jul 4, 2018 at 6:26 PM, Benjamin Naber  
>> wrote:
>> > Hi @all,
>> >
>> > im currently in testing for setup an production environment based on the 
>> > following OSD Nodes:
>> >
>> > CEPH Version: luminous 12.2.5
>> >
>> > 5x OSD Nodes with following specs:
>> >
>> > - 8 Core Intel Xeon 2,0 GHZ
>> >
>> > - 96GB Ram
>> >
>> > - 10x 1,92 TB Intel DC S4500 connectet via SATA
>> >
>> > - 4x 10 Gbit NIC 2 bonded via LACP for Backend Network 2 bonded via LACP 
>> > for Backend Network.
>> >
>> > if i run some fio benchmark via a VM that ist running on a RBD Device on a 
>> > KVM testing Host. the cluster always runs into slow request warning. Also 
>> > the performance goes heavy down.
>> >
>> > If i dump the osd that stucks, i get the following output:
>> >
>> > {
>> > "ops": [
>> > {
>> > "description": "osd_op(client.141944.0:359346834 13.1da 
>> > 13:5b8b7fd3:::rbd_data.170a3238e1f29.00be:head [write 
>> > 2097152~1048576] snapc 0=[] ondisk+write+known_if_redirected e2755)",
>> > "initiated_at": "2018-07-04 10:00:49.475879",
>> > "age": 287.180328,
>> > "duration": 287.180355,
>> > "type_data": {
>> > "flag_point": "waiting for sub ops",
>> > "client_info": {
>> > "client": "client.141944",
>> > "client_addr": "10.111.90.1:0/3532639465",
>> > "tid": 359346834
>> > },
>> > "events": [
>> > {
>> > "time": "2018-07-04 10:00:49.475879",
>> > "event": "initiated"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.476935",
>> > "event": "queued_for_pg"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.477547",
>> > "event": "reached_pg"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.477578",
>> > "event": "started"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.477614",
>> > "event": "waiting for subops from 5,26"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.484679",
>> > "event": "op_commit"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.484681",
>> > "event": "op_applied"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.485588",
>> > "event": "sub_op_commit_rec from 5"
>> > }
>> > ]
>> > }
>> > },
>> > {
>> > "description": "osd_op(client.141944.0:359346835 13.1da 
>> > 13:5b8b7fd3:::rbd_data.170a3238e1f29.00be:head [write 
>> > 3145728~1048576] snapc 0=[] ondisk+write+known_if_redirected e2755)",
>> > "initiated_at"

Re: [ceph-users] Ceph 12.2.2 - Compiler Hangs on src/rocksdb/monitoring/statistics.cc

2018-01-13 Thread Brad Hubbard


On Sun, Jan 14, 2018 at 4:41 AM, Dyweni - Ceph-Users <6exbab4fy...@dyweni.com> 
wrote:
> Hi,
>
> GLIBC 2.25-r9
> GCC 6.4.0-r1
>
> When compiling Ceph 12.2.2, the compilation hangs (cc1plus goes into an
> infinite loop and never finishes, requiring the process to be killed
> manually) while compiling the file 'src/rocksdb/monitoring/statistics.cc'.
> By forever, I mean I left it sitting and it ran for 140+ minutes.
>
> The specific command which runs forever is:
>
> /usr/i686-pc-linux-gnu/gcc-bin/6.4.0/i686-pc-linux-gnu-g++ -DOS_LINUX
> -DROCKSDB_FALLOCATE_PRESENT -DROCKSDB_LIB_IO_POSIX
> -DROCKSDB_MALLOC_USABLE_SIZE -DROCKSDB_PLATFORM_POSIX
> -I/var/tmp/portage/sys-cluster/ceph-12.2.2/work/ceph-12.2.2/src/rocksdb
> -I/var/tmp/portage/sys-cluster/ceph-12.2.2/work/ceph-12.2.2/src/rocksdb/include
> -isystem
> /var/tmp/portage/sys-cluster/ceph-12.2.2/work/ceph-12.2.2/src/rocksdb/third-party/gtest-1.7.0/fused-src
> -march=i686 -O2 -pipe -W -Wextra -Wall -Wsign-compare -Wshadow
> -Wno-unused-parameter -Wno-unused-variable -Woverloaded-virtual
> -Wnon-virtual-dtor -Wno-missing-field-initializers -std=c++11 -O2
> -fno-omit-frame-pointer -momit-leaf-frame-pointer -Werror
> -fno-builtin-memcmp -fPIC -o
> CMakeFiles/rocksdb.dir/monitoring/statistics.cc.o -c
> /var/tmp/portage/sys-cluster/ceph-12.2.2/work/ceph-12.2.2/src/rocksdb/monitoring/statistics.cc

This code is part of RocksDB, but this is most likely a compiler bug, since it
shouldn't be possible to write code that "tricks" the compiler into looping
forever. I suggest you try a different compiler.
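
As a data point, it may also be worth re-running just that one command by
hand and bisecting the flags, e.g. keeping everything identical but
replacing -O2 with -O1, or dropping only -fno-omit-frame-pointer:

    /usr/i686-pc-linux-gnu/gcc-bin/6.4.0/i686-pc-linux-gnu-g++ \
        ... same -D/-I/-W flags as above ... \
        -O1 -fno-omit-frame-pointer -momit-leaf-frame-pointer \
        -c .../src/rocksdb/monitoring/statistics.cc

(paths and flag list abbreviated here; copy them from the command you
posted). Knowing which single change lets cc1plus finish usually makes for
a much more actionable compiler bug report.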

>
>
> However, if I turn off all optimizations (replace all '-O2' with '-O0') or
> remove '-fno-omit-frame-pointer' (while keeping all the '-O2'), then the
> compilation finishes.
>
>
> Thanks,
> Dyweni
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw fails with "ERROR: failed to initialize watch: (34) Numerical result out of range"

2018-01-16 Thread Brad Hubbard
On Wed, Jan 17, 2018 at 2:20 AM, Nikos Kormpakis <nk...@noc.grnet.gr> wrote:
> On 01/16/2018 12:53 AM, Brad Hubbard wrote:
>> On Tue, Jan 16, 2018 at 1:35 AM, Alexander Peters <apet...@sphinx.at> wrote:
>>> i created the dump output but it looks very cryptic to me so i can't really 
>>> make much sense of it. is there anything to look for in particular?
>>
>> Yes, basically we are looking for any line that ends in "= 34". You
>> might also find piping it through c++filt helps.
>>
>> Something like...
>>
>> $ c++filt 
> Hello,
> we're facing the exact same issue. I added some more info about
> our cluster and output from ltrace in [1].

Unfortunately, the strlen lines in that output are expected.

Is it possible for me to access the ltrace output file somehow
(you could email it directly or use  ceph-post-file perhaps)?
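
For reference, assuming the trace went to /tmp/ltrace.out as in the command
further down, something like

    grep ' = 34$' /tmp/ltrace.out | c++filt | less

should pull out the candidate calls, and

    ceph-post-file /tmp/ltrace.out

should upload the full file and print an id you can paste here.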

>
> Best regards,
> Nikos.
>
> [1] http://tracker.ceph.com/issues/22351
>
>>>
>>> i think i am going to read up on how interpret ltrace output...
>>>
>>> BR
>>> Alex
>>>
>>> - Ursprüngliche Mail -
>>> Von: "Brad Hubbard" <bhubb...@redhat.com>
>>> An: "Alexander Peters" <alexander.pet...@sphinx.at>
>>> CC: "Ceph Users" <ceph-users@lists.ceph.com>
>>> Gesendet: Montag, 15. Januar 2018 03:09:53
>>> Betreff: Re: [ceph-users] radosgw fails with "ERROR: failed to initialize 
>>> watch: (34) Numerical result out of range"
>>>
>>> On Mon, Jan 15, 2018 at 11:38 AM, Brad Hubbard <bhubb...@redhat.com> wrote:
>>>> On Mon, Jan 15, 2018 at 10:38 AM, Alexander Peters
>>>> <alexander.pet...@sphinx.at> wrote:
>>>>> Thanks for the reply - unfortunatly the link you send is behind a paywall 
>>>>> so
>>>>> at least for now i can’t read it.
>>>>
>>>> That's why I provided the cause as laid out in that article (pgp num > pg 
>>>> num).
>>>>
>>>> Do you have any settings in ceph.conf related to pg_num or pgp_num?
>>>>
>>>> If not, please add your details to http://tracker.ceph.com/issues/22351
>>>
>>> Rados can return ERANGE (34) in multiple places so identifying where
>>> might be a big step towards working this out.
>>>
>>> $ ltrace -fo /tmp/ltrace.out /usr/bin/radosgw --cluster ceph --name
>>> client.radosgw.ctrl02 --setuser ceph --setgroup ceph -f -d
>>>
>>> The objective is to find which function(s) return 34.
>>>
>>>>
>>>>>
>>>>> output of ceph osd dump shows that pgp num == pg num:
>>>>>
>>>>> [root@ctrl01 ~]# ceph osd dump
>>>>> epoch 142
>>>>> fsid 0e2d841f-68fd-4629-9813-ab083e8c0f10
>>>>> created 2017-12-20 23:04:59.781525
>>>>> modified 2018-01-14 21:30:57.528682
>>>>> flags sortbitwise,recovery_deletes,purged_snapdirs
>>>>> crush_version 6
>>>>> full_ratio 0.95
>>>>> backfillfull_ratio 0.9
>>>>> nearfull_ratio 0.85
>>>>> require_min_compat_client jewel
>>>>> min_compat_client jewel
>>>>> require_osd_release luminous
>>>>> pool 1 'glance' replicated size 3 min_size 2 crush_rule 0 object_hash
>>>>> rjenkins pg_num 64 pgp_num 64 last_change 119 flags hashpspool 
>>>>> stripe_width
>>>>> 0 application rbd
>>>>> removed_snaps [1~3]
>>>>> pool 2 'cinder-2' replicated size 3 min_size 2 crush_rule 0 object_hash
>>>>> rjenkins pg_num 64 pgp_num 64 last_change 120 flags hashpspool 
>>>>> stripe_width
>>>>> 0 application rbd
>>>>> removed_snaps [1~3]
>>>>> pool 3 'cinder-3' replicated size 3 min_size 2 crush_rule 0 object_hash
>>>>> rjenkins pg_num 64 pgp_num 64 last_change 121 flags hashpspool 
>>>>> stripe_width
>>>>> 0 application rbd
>>>>> removed_snaps [1~3]
>>>>> pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
>>>>> rjenkins pg_num 8 pgp_num 8 last_change 94 owner 18446744073709551615 
>>>>> flags
>>>>> hashpspool stripe_width 0 application rgw
>>>>> max_osd 3
>>>>> osd.0 up   in  weight 1 up_from 82 up_thru 140 down_at 79
>>>>> last_clean_interval [23,78) 10.16.0.11:6800/1795 10.16.0.11:6801/1795
>>>>> 10.16.0.11:6802/1795 10.16.0.11

Re: [ceph-users] OSD doesn't start - fresh installation

2018-01-22 Thread Brad Hubbard
On Mon, Jan 22, 2018 at 10:37 PM, Hüseyin Atatür YILDIRIM <
hyildi...@havelsan.com.tr> wrote:

>
> Hi again,
>
>
>
> In the “journalctl –xe”  output:
>
>
>
> Jan 22 15:29:18 mon02 ceph-osd-prestart.sh[1526]: OSD data directory
> /var/lib/ceph/osd/ceph-1 does not exist; bailing out.
>
>
>
> Also, in my previous post I forgot to say that the “ceph-deploy osd create”
> command doesn't fail and appears to be successful, as you can see from the
> logs.
>
> But the daemons on the nodes don't start.
>
>
>
> Regards,
>
> Atatur
>
>
>
>
>
>
> 
> Hüseyin Atatür YILDIRIM
> SİSTEM MÜHENDİSİ
> Üniversiteler Mah. İhsan Doğramacı Bul. ODTÜ Teknokent Havelsan A.Ş. 23/B
> Çankaya Ankara TÜRKİYE
>
> *From:* Hüseyin Atatür YILDIRIM
> *Sent:* Monday, January 22, 2018 3:19 PM
> *To:* ceph-users@lists.ceph.com
> *Subject:* OSD doesn't start - fresh installation
>
>
>
> Hi all,
>
>
>
> Fresh installation, but the disks were already used. I zapped all the disks
> and ran “ceph-deploy osd create” again but got the same results.
>
> Log is attached. Can you please help?
>

Did you mean "sdb1" rather than "sdb" perhaps?


>
>
>
> Thank you,
>
> Atatur
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pg inconsistent

2018-03-07 Thread Brad Hubbard
On Thu, Mar 8, 2018 at 1:22 AM, Harald Staub  wrote:
> "ceph pg repair" leads to:
> 5.7bd repair 2 errors, 0 fixed
>
> Only an empty list from:
> rados list-inconsistent-obj 5.7bd --format=json-pretty
>
> Inspired by http://tracker.ceph.com/issues/12577 , I tried again with more
> verbose logging and searched the osd logs e.g. for "!=", "mismatch", could
> not find anything interesting. Oh well, these are several millions of lines
> ...
>
> Any hint what I could look for?

Try searching for "scrub_compare_maps" and looking for "5.7bd" in that context.
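
Something along these lines should narrow it down (log path assumed, and it
needs the verbose logs from around the scrub):

    grep -n 'scrub_compare_maps' /var/log/ceph/ceph-osd.*.log | grep '5.7bd'

then look at the surrounding lines (e.g. grep -C 10) on the primary for the
shard-by-shard comparison.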

>
> The 3 OSDs involved are running on 12.2.4, one of them is on BlueStore.
>
> Cheers
>  Harry
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] /var/lib/ceph/osd/ceph-xxx/current/meta shows "Structure needs cleaning"

2018-03-08 Thread Brad Hubbard
On Thu, Mar 8, 2018 at 5:01 PM, 赵贺东  wrote:
> Hi All,
>
> Every time after we activate an OSD, we get “Structure needs cleaning” in 
> /var/lib/ceph/osd/ceph-xxx/current/meta.
>
>
> /var/lib/ceph/osd/ceph-xxx/current/meta
> # ls -l
> ls: reading directory .: Structure needs cleaning
> total 0
>
> Could Anyone say something about this error?

It's an indication of possible corruption on the filesystem containing "meta".

Can you unmount it and run a filesystem check on it?
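
For example, a non-destructive first pass would look something like this
(device and mount point assumed, and assuming XFS; use the matching fsck
tool for whatever filesystem it actually is):

    umount /var/lib/ceph/osd/ceph-xxx
    xfs_repair -n /dev/sdX1

where -n only reports problems and doesn't modify anything.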

At the time the filesystem first detected the corruption it would have
logged it to dmesg and possibly syslog which may give you a clue. Did
you lose power or have a kernel panic or something?

>
> Thank you!
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] /var/lib/ceph/osd/ceph-xxx/current/meta shows "Structure needs cleaning"

2018-03-08 Thread Brad Hubbard
On Thu, Mar 8, 2018 at 7:33 PM, 赵贺东 <zhaohed...@gmail.com> wrote:
> Hi Brad,
>
> Thank you for your attention.
>
>> 在 2018年3月8日,下午4:47,Brad Hubbard <bhubb...@redhat.com> 写道:
>>
>> On Thu, Mar 8, 2018 at 5:01 PM, 赵贺东 <zhaohed...@gmail.com> wrote:
>>> Hi All,
>>>
>>> Every time after we activate osd, we got “Structure needs cleaning” in 
>>> /var/lib/ceph/osd/ceph-xxx/current/meta.
>>>
>>>
>>> /var/lib/ceph/osd/ceph-xxx/current/meta
>>> # ls -l
>>> ls: reading directory .: Structure needs cleaning
>>> total 0
>>>
>>> Could Anyone say something about this error?
>>
>> It's an indication of possible corruption on the filesystem containing 
>> "meta".
>>
>> Can you unmount it and run a filesystem check on it?
> I did some xfs_repair operations, but no effect. “Structure needs cleaning” 
> still exists.
>
>
>
>>
>> At the time the filesystem first detected the corruption it would have
>> logged it to dmesg and possibly syslog which may give you a clue. Did
>> you lose power or have a kernel panic or something?
> We did not lose power.
> You are right, we get a metadata corruption in dmesg every time, just 
> after the OSD activation operation.
>
> [  399.513525] XFS (sda1): Metadata corruption detected at 
> xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
> [  399.524709] XFS (sda1): Unmount and run xfs_repair
> [  399.529511] XFS (sda1): First 64 bytes of corrupted metadata buffer:
> [  399.535917] dd8f2000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb  
> XFSB.s..
> [  399.543959] dd8f2010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 
> [  399.551983] dd8f2020: e5 30 40 22 51 8f 4f 1c 80 73 56 9b 71 aa 92 24  
> .0@"Q.O..sV.q..$
> [  399.560037] dd8f2030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff  
> 
> [  399.568118] XFS (sda1): metadata I/O error: block 0x48b9ff80 
> ("xfs_trans_read_buf_map") error 117 numblks 8
> [  399.583179] XFS (sda1): Metadata corruption detected at 
> xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
> [  399.594378] XFS (sda1): Unmount and run xfs_repair
> [  399.599182] XFS (sda1): First 64 bytes of corrupted metadata buffer:
> [  399.605575] e47db000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb  
> XFSB.s..
> [  399.613613] e47db010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 
> [  399.621637] e47db020: e5 30 40 22 51 8f 4f 1c 80 73 56 9b 71 aa 92 24  
> .0@"Q.O..sV.q..$
> [  399.629679] e47db030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff  
> 
> [  399.637856] XFS (sda1): metadata I/O error: block 0x48b9ff80 
> ("xfs_trans_read_buf_map") error 117 numblks 8
> [  399.648165] XFS (sda1): Metadata corruption detected at 
> xfs_dir3_data_read_verify+0x58/0xd0, xfs_dir3_data block 0x48b9ff80
> [  399.659378] XFS (sda1): Unmount and run xfs_repair
> [  399.664196] XFS (sda1): First 64 bytes of corrupted metadata buffer:
> [  399.670570] e47db000: 58 46 53 42 00 00 10 00 00 00 00 00 91 73 fe fb  
> XFSB.s..
> [  399.678610] e47db010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
> 
> [  399.686643] e47db020: e5 30 40 22 51 8f 4f 1c 80 73 56 9b 71 aa 92 24  
> .0@"Q.O..sV.q..$
> [  399.694681] e47db030: 00 00 00 00 80 00 00 07 ff ff ff ff ff ff ff ff  
> 
> [  399.702794] XFS (sda1): metadata I/O error: block 0x48b9ff80 
> ("xfs_trans_read_buf_map") error 117 numblks 8

I'd suggest the next step is to look for a matching XFS bug in your
distro and, if possible, try a different distro and see if you get the
same result.
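
When you do report it, capturing at least

    uname -r
    xfs_repair -V
    cat /etc/os-release

alongside the dmesg output above should make it much easier to match this
against known XFS issues.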

>
>
> Thank you !
>
>
>>
>>>
>>> Thank you!
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>> --
>> Cheers,
>> Brad
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD crash with segfault Luminous 12.2.4

2018-03-08 Thread Brad Hubbard
On Fri, Mar 9, 2018 at 3:54 AM, Subhachandra Chandra
 wrote:
> I noticed a similar crash too. Unfortunately, I did not get much info in the
> logs.
>
>  *** Caught signal (Segmentation fault) **
>
> Mar 07 17:58:26 data7 ceph-osd-run.sh[796380]:  in thread 7f63a0a97700
> thread_name:safe_timer
>
> Mar 07 17:58:28 data7 ceph-osd-run.sh[796380]: docker_exec.sh: line 56:
> 797138 Segmentation fault  (core dumped) "$@"

The log isn't very helpful AFAICT. Are these both container
environments? If so, what are the details (OS, etc.)?

Can anyone capture a core file? Please feel free to open a tracker on this.
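
For what it's worth, on a systemd based host something roughly like this
should be enough to catch one (unit name and core location assumed, adjust
as needed):

    # allow the OSD unit to dump core: add  [Service] LimitCORE=infinity
    systemctl edit ceph-osd@<id>
    echo '/var/tmp/core.%e.%p.%t' > /proc/sys/kernel/core_pattern
    systemctl restart ceph-osd@<id>

then pick the core up from /var/tmp after the next segfault (or use
coredumpctl if systemd-coredump is in use).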

>
>
> Thanks
>
> Subhachandra
>
>
>
> On Thu, Mar 8, 2018 at 6:00 AM, Dietmar Rieder 
> wrote:
>>
>> Hi,
>>
>> I noticed in my client (using cephfs) logs that an osd was unexpectedly
>> going down.
>> While checking the osd logs for the affected OSD I found that the osd
>> was seg faulting:
>>
>> []
>> 2018-03-07 06:01:28.873049 7fd9af370700 -1 *** Caught signal
>> (Segmentation fault) **
>>  in thread 7fd9af370700 thread_name:safe_timer
>>
>>   ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b)
>> luminous (stable)
>>1: (()+0xa3c611) [0x564585904611]
>> 2: (()+0xf5e0) [0x7fd9b66305e0]
>>  NOTE: a copy of the executable, or `objdump -rdS ` is
>> needed to interpret this.
>> [...]
>>
>> Should I open a ticket for this? What additional information is needed?
>>
>>
>> I put the relevant log entries for download under [1], so maybe someone
>> with more
>> experience can find some useful information therein.
>>
>> Thanks
>>   Dietmar
>>
>>
>> [1] https://expirebox.com/download/6473c34c80e8142e22032469a59df555.html
>>
>> --
>> _
>> D i e t m a r  R i e d e r, Mag.Dr.
>> Innsbruck Medical University
>> Biocenter - Division for Bioinformatics
>> Email: dietmar.rie...@i-med.ac.at
>> Web:   http://www.icbi.at
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-06 Thread Brad Hubbard
On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata <
mbald...@hsamiata.it> wrote:

> Hi
>
> I monitor dmesg in each of the 3 nodes, no hardware issue reported. And
> the problem happens with various different OSDs in different nodes, for me
> it is clear it's not a hardware problem.
>

If you have osd_debug set to 25 or greater when you run the deep scrub you
should get more information about the nature of the read error in the
ReplicatedBackend::be_deep_scrub() function (assuming this is a replicated
pool).

This may create large logs so watch they don't exhaust storage.

> Thanks for reply
>
>
>
> Il 05/03/2018 21:45, Vladimir Prokofev ha scritto:
>
> > always solved by ceph pg repair 
> That doesn't necessarily mean that there's no hardware issue. In my case
> repair also worked fine and returned cluster to OK state every time, but in
> time faulty disk fail another scrub operation, and this repeated multiple
> times before we replaced that disk.
> One last thing to look into is dmesg at your OSD nodes. If there's a
> hardware read error it will be logged in dmesg.
>
> 2018-03-05 18:26 GMT+03:00 Marco Baldini - H.S. Amiata <
> mbald...@hsamiata.it>:
>
>> Hi and thanks for reply
>>
>> The OSDs are all healthy, in fact after a ceph pg repair  the ceph
>> health is back to OK and in the OSD log I see   repair ok, 0 fixed
>>
>> The SMART data of the 3 OSDs seems fine
>>
>> *OSD.5*
>>
>> # ceph-disk list | grep osd.5
>>  /dev/sdd1 ceph data, active, cluster ceph, osd.5, block /dev/sdd2
>>
>> # smartctl -a /dev/sdd
>> smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
>> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>>
>> === START OF INFORMATION SECTION ===
>> Model Family: Seagate Barracuda 7200.14 (AF)
>> Device Model: ST1000DM003-1SB10C
>> Serial Number:Z9A1MA1V
>> LU WWN Device Id: 5 000c50 090c7028b
>> Firmware Version: CC43
>> User Capacity:1,000,204,886,016 bytes [1.00 TB]
>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>> Rotation Rate:7200 rpm
>> Form Factor:  3.5 inches
>> Device is:In smartctl database [for details use: -P show]
>> ATA Version is:   ATA8-ACS T13/1699-D revision 4
>> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>> Local Time is:Mon Mar  5 16:17:22 2018 CET
>> SMART support is: Available - device has SMART capability.
>> SMART support is: Enabled
>>
>> === START OF READ SMART DATA SECTION ===
>> SMART overall-health self-assessment test result: PASSED
>>
>> General SMART Values:
>> Offline data collection status:  (0x82)  Offline data collection activity
>>  was completed without error.
>>  Auto Offline Data Collection: Enabled.
>> Self-test execution status:  (   0)  The previous self-test routine 
>> completed
>>  without error or no self-test has ever
>>  been run.
>> Total time to complete Offline
>> data collection: (0) seconds.
>> Offline data collection
>> capabilities: (0x7b) SMART execute Offline immediate.
>>  Auto Offline data collection on/off 
>> support.
>>  Suspend Offline collection upon new
>>  command.
>>  Offline surface scan supported.
>>  Self-test supported.
>>  Conveyance Self-test supported.
>>  Selective Self-test supported.
>> SMART capabilities:(0x0003)  Saves SMART data before entering
>>  power-saving mode.
>>  Supports SMART auto save timer.
>> Error logging capability:(0x01)  Error logging supported.
>>  General Purpose Logging supported.
>> Short self-test routine
>> recommended polling time: (   1) minutes.
>> Extended self-test routine
>> recommended polling time: ( 109) minutes.
>> Conveyance self-test routine
>> recommended polling time: (   2) minutes.
>> SCT capabilities:   (0x1085) SCT Status supported.
>>
>> SMART Attributes Data Structure revision number: 10
>> Vendor Specific SMART Attributes with Thresholds:
>> ID# ATTRIBUTE_NAME  FLAG VALUE WORST THRESH TYPE  UPDATED  
>> WHEN_FAILED RAW_VALUE
>>   1 Raw_Read_Error_Rate 0x000f   082   063   006Pre-fail  Always 
>>   -   193297722
>>   3 Spin_Up_Time0x0003   097   097   000Pre-fail  Always 
>>   -   0
>>   4 Start_Stop_Count0x0032   100   100   020Old_age   Always 
>>   -   60
>>   5 Reallocated_Sector_Ct   0x0033   100   100   010Pre-fail  Always 
>>   -   0
>>   7 Seek_Error_Rate 

Re: [ceph-users] Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

2018-03-06 Thread Brad Hubbard
debug_osd that is... :)
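
e.g. something along the lines of

    ceph tell osd.<id> injectargs '--debug_osd 25/25'
    ceph pg deep-scrub <pgid>

and set debug_osd back to its default (1/5) once you have the log.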

On Tue, Mar 6, 2018 at 7:10 PM, Brad Hubbard <bhubb...@redhat.com> wrote:

>
>
> On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata <
> mbald...@hsamiata.it> wrote:
>
>> Hi
>>
>> I monitor dmesg in each of the 3 nodes, no hardware issue reported. And
>> the problem happens with various different OSDs in different nodes, for me
>> it is clear it's not an hardware problem.
>>
>
> If you have osd_debug set to 25 or greater when you run the deep scrub you
> should get more information about the nature of the read error in the
> ReplicatedBackend::be_deep_scrub() function (assuming this is a
> replicated pool).
>
> This may create large logs so watch they don't exhaust storage.
>
>> Thanks for reply
>>
>>
>>
>> Il 05/03/2018 21:45, Vladimir Prokofev ha scritto:
>>
>> > always solved by ceph pg repair 
>> That doesn't necessarily means that there's no hardware issue. In my case
>> repair also worked fine and returned cluster to OK state every time, but in
>> time faulty disk fail another scrub operation, and this repeated multiple
>> times before we replaced that disk.
>> One last thing to look into is dmesg at your OSD nodes. If there's a
>> hardware read error it will be logged in dmesg.
>>
>> 2018-03-05 18:26 GMT+03:00 Marco Baldini - H.S. Amiata <
>> mbald...@hsamiata.it>:
>>
>>> Hi and thanks for reply
>>>
>>> The OSDs are all healthy, in fact after a ceph pg repair  the ceph
>>> health is back to OK and in the OSD log I see   repair ok, 0 fixed
>>>
>>> The SMART data of the 3 OSDs seems fine
>>>
>>> *OSD.5*
>>>
>>> # ceph-disk list | grep osd.5
>>>  /dev/sdd1 ceph data, active, cluster ceph, osd.5, block /dev/sdd2
>>>
>>> # smartctl -a /dev/sdd
>>> smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
>>> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>>>
>>> === START OF INFORMATION SECTION ===
>>> Model Family: Seagate Barracuda 7200.14 (AF)
>>> Device Model: ST1000DM003-1SB10C
>>> Serial Number:Z9A1MA1V
>>> LU WWN Device Id: 5 000c50 090c7028b
>>> Firmware Version: CC43
>>> User Capacity:1,000,204,886,016 bytes [1.00 TB]
>>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>> Rotation Rate:7200 rpm
>>> Form Factor:  3.5 inches
>>> Device is:In smartctl database [for details use: -P show]
>>> ATA Version is:   ATA8-ACS T13/1699-D revision 4
>>> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>>> Local Time is:Mon Mar  5 16:17:22 2018 CET
>>> SMART support is: Available - device has SMART capability.
>>> SMART support is: Enabled
>>>
>>> === START OF READ SMART DATA SECTION ===
>>> SMART overall-health self-assessment test result: PASSED
>>>
>>> General SMART Values:
>>> Offline data collection status:  (0x82) Offline data collection activity
>>> was completed without error.
>>> Auto Offline Data Collection: Enabled.
>>> Self-test execution status:  (   0) The previous self-test routine 
>>> completed
>>> without error or no self-test has ever
>>> been run.
>>> Total time to complete Offline
>>> data collection:(0) seconds.
>>> Offline data collection
>>> capabilities:(0x7b) SMART execute Offline immediate.
>>> Auto Offline data collection on/off 
>>> support.
>>> Suspend Offline collection upon new
>>> command.
>>> Offline surface scan supported.
>>> Self-test supported.
>>> Conveyance Self-test supported.
>>> Selective Self-test supported.
>>> SMART capabilities:(0x0003) Saves SMART data before entering
>>> power-saving mode.
>>> Supports SMART auto save timer.
>>> Error logging capability:(0x01) Error logging supported.
>>> General Purpose Logging supported.
&g

Re: [ceph-users] 1 mon unable to join the quorum

2018-04-04 Thread Brad Hubbard
See my latest update in the tracker.

On Sun, Apr 1, 2018 at 2:27 AM, Julien Lavesque
<julien.laves...@objectif-libre.com> wrote:
> At first the cluster was deployed using ceph-ansible, version infernalis.
> For some unknown reason controller02 was out of the quorum and we were
> unable to add it back into the quorum.
>
> We have updated the cluster to jewel version using the rolling-update
> playbook from ceph-ansible
>
> The controller02 was still not in the quorum.
>
> We tried to delete the mon completely and add it again using the manual
> method of http://docs.ceph.com/docs/jewel/rados/operations/add-or-rm-mons/
> (with id controller02)
>
> The logs provided are when the controller02 was added with the manual
> method.
>
> But the controller02 won't join the cluster
>
> Hope it helps you understand.
>
>
>
> On 31/03/2018 02:12, Brad Hubbard wrote:
>>
>> I'm not sure I completely understand your "test". What exactly are you
>> trying to achieve and what documentation are you following?
>>
>> On Fri, Mar 30, 2018 at 10:49 PM, Julien Lavesque
>> <julien.laves...@objectif-libre.com> wrote:
>>>
>>> Brad,
>>>
>>> Thanks for your answer
>>>
>>> On 30/03/2018 02:09, Brad Hubbard wrote:
>>>>
>>>>
>>>> 2018-03-19 11:03:50.819493 7f842ed47640  0 mon.controller02 does not
>>>> exist in monmap, will attempt to join an existing cluster
>>>> 2018-03-19 11:03:50.820323 7f842ed47640  0 starting mon.controller02
>>>> rank -1 at 172.18.8.6:6789/0 mon_data
>>>> /var/lib/ceph/mon/ceph-controller02 fsid
>>>> f37f31b1-92c5-47c8-9834-1757a677d020
>>>>
>>>> We are called 'mon.controller02' and we cannot find our name in the
>>>> local copy of the monmap.
>>>>
>>>> 2018-03-19 11:03:52.346318 7f842735d700 10
>>>> mon.controller02@-1(probing) e68  ready to join, but i'm not in the
>>>> monmap or my addr is blank, trying to join
>>>>
>>>> Our name is not in the copy of the monmap we got from peer controller01
>>>> either.
>>>
>>>
>>>
>>> During our test we completely deleted the controller02 monitor and added
>>> it again.
>>>
>>> The log you have is when the controller02 is added (so it wasn't in the
>>> monmap before)
>>>
>>>
>>>>
>>>> $ cat ../controller02-mon_status.log
>>>> [root@controller02 ~]# ceph --admin-daemon
>>>> /var/run/ceph/ceph-mon.controller02.asok mon_status
>>>> {
>>>> "name": "controller02",
>>>> "rank": 1,
>>>> "state": "electing",
>>>> "election_epoch": 32749,
>>>> "quorum": [],
>>>> "outside_quorum": [],
>>>> "extra_probe_peers": [],
>>>> "sync_provider": [],
>>>> "monmap": {
>>>> "epoch": 71,
>>>> "fsid": "f37f31b1-92c5-47c8-9834-1757a677d020",
>>>> "modified": "2018-03-29 10:48:06.371157",
>>>> "created": "0.00",
>>>> "mons": [
>>>> {
>>>> "rank": 0,
>>>> "name": "controller01",
>>>> "addr": "172.18.8.5:6789\/0"
>>>> },
>>>> {
>>>> "rank": 1,
>>>> "name": "controller02",
>>>> "addr": "172.18.8.6:6789\/0"
>>>> },
>>>> {
>>>> "rank": 2,
>>>> "name": "controller03",
>>>> "addr": "172.18.8.7:6789\/0"
>>>> }
>>>> ]
>>>> }
>>>> }
>>>>
>>>> In the monmaps we are called 'controller02', not 'mon.controller02'.
>>>> These names need to be identical.
>>>>
>>>
>>> The cluster has been deployed using ceph-ansible with the servers
>>> hostname.
>>> All monitors are called mon.controller0x in the monmap and all the 3
>>> monitors have the same configurati

Re: [ceph-users] where is it possible download CentOS 7.5

2018-03-27 Thread Brad Hubbard
See the thread in this very ML titled "Ceph iSCSI is a prank?", last update
thirteen days ago.

If your questions are not answered by that thread let us know.

Please also remember that CentOS is not the only platform that Ceph runs on,
by a long shot, and that not all distros lag as much as it does (not a
criticism, just a fact; the reasons for lagging are valid and well documented,
and should be accepted by those who choose to use them). If you want the
bleeding edge, then RHEL/CentOS should not be your platform of choice.


On Tue, Mar 27, 2018 at 7:04 PM, Max Cuttins  wrote:

> Thanks Jason,
>
> this is exactly what I read around and what I supposed.
> The RHEL 7.5 is not yet released (neither is Kernel 4.16)
>
> So I have two doubts:
>
> *1) If it's not released... why is this in the documentation?*
> Is the documentation talking about a Dev candidate already accessible
> somewhere?
>
> 2) Why is there already an iSCSI board in the dashboard?
> I guess I am missing something, or is it really just there for a future
> implementation and not usable yet?
> And if it is usable... where can I download what is necessary in order to
> start?
>
>
> Il 26/03/2018 14:10, Jason Dillaman ha scritto:
>
> RHEL 7.5 has not been released yet, but it should be released very
> soon. After it's released, it usually takes the CentOS team a little
> time to put together their matching release. I also suspect that Linux
> kernel 4.16 is going to be released in the next week or so as well.
>
> On Sat, Mar 24, 2018 at 7:36 AM, Max Cuttins  
>  wrote:
>
> As stated in the documentation, in order to use iSCSI it is necessary to use
> CentOS 7.5.
> Where can I download it?
>
>
> Thanks
>
>
> iSCSI Targets
>
> Traditionally, block-level access to a Ceph storage cluster has been limited
> to QEMU and librbd, which is a key enabler for adoption within OpenStack
> environments. Starting with the Ceph Luminous release, block-level access is
> expanding to offer standard iSCSI support allowing wider platform usage, and
> potentially opening new use cases.
>
> RHEL/CentOS 7.5; Linux kernel v4.16 or newer; or the Ceph iSCSI client test
> kernel
> A working Ceph Storage cluster, deployed with ceph-ansible or using the
> command-line interface
> iSCSI gateways nodes, which can either be colocated with OSD nodes or on
> dedicated nodes
> Separate network subnets for iSCSI front-end traffic and Ceph back-end
> traffic
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD crash with segfault Luminous 12.2.4

2018-03-27 Thread Brad Hubbard
"NOTE: a copy of the executable, or `objdump -rdS ` is
needed to interpret this."

Have you ever wondered what this means and why it's there? :)

This is at least something you can try. it may provide useful
information, it may not.

This stack looks like it is either corrupted, or possibly not in ceph
but in one of the linked libraries or glibc itself. If it's the
former, it probably won't tell us anything. If it's the latter you
will need the relevant debuginfo installed to get meaningful output
and note that it will probably take a while. '' in this
case is ceph-osd of course.

Alternatively, if you can upload a coredump and an sosreport (so I can
validate exact versions of all packages installed) I can try and take
a look.
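
For reference, the objdump step is just something along these lines (binary
path assumed):

    objdump -rdS /usr/bin/ceph-osd > /tmp/ceph-osd.dump

and, once you have a core and the matching debuginfo installed, something
like

    gdb /usr/bin/ceph-osd /path/to/core --batch -ex 'thread apply all bt' > /tmp/osd-bt.txt

would also be useful to attach.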

On Fri, Mar 23, 2018 at 9:20 PM, Dietmar Rieder
 wrote:
> Hi,
>
>
> I encountered one more two days ago, and I opened a ticket:
>
> http://tracker.ceph.com/issues/23431
>
> In our case it is more like 1 every two weeks, for now...
> And it is affecting different OSDs on different hosts.
>
> Dietmar
>
> On 03/23/2018 11:50 AM, Oliver Freyermuth wrote:
>> Hi together,
>>
>> I notice exactly the same, also the same addresses, Luminous 12.2.4, CentOS 
>> 7.
>> Sadly, logs are equally unhelpful.
>>
>> It happens randomly on an OSD about once per 2-3 days (of the 196 total OSDs 
>> we have). It's also not a container environment.
>>
>> Cheers,
>>   Oliver
>>
>> Am 08.03.2018 um 15:00 schrieb Dietmar Rieder:
>>> Hi,
>>>
>>> I noticed in my client (using cephfs) logs that an osd was unexpectedly
>>> going down.
>>> While checking the osd logs for the affected OSD I found that the osd
>>> was seg faulting:
>>>
>>> []
>>> 2018-03-07 06:01:28.873049 7fd9af370700 -1 *** Caught signal
>>> (Segmentation fault) **
>>>  in thread 7fd9af370700 thread_name:safe_timer
>>>
>>>   ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b)
>>> luminous (stable)
>>>1: (()+0xa3c611) [0x564585904611]
>>> 2: (()+0xf5e0) [0x7fd9b66305e0]
>>>  NOTE: a copy of the executable, or `objdump -rdS ` is
>>> needed to interpret this.
>>> [...]
>>>
>>> Should I open a ticket for this? What additional information is needed?
>>>
>>>
>>> I put the relevant log entries for download under [1], so maybe someone
>>> with more
>>> experience can find some useful information therein.
>>>
>>> Thanks
>>>   Dietmar
>>>
>>>
>>> [1] https://expirebox.com/download/6473c34c80e8142e22032469a59df555.html
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
> --
> _
> D i e t m a r  R i e d e r, Mag.Dr.
> Innsbruck Medical University
> Biocenter - Division for Bioinformatics
> Innrain 80, 6020 Innsbruck
> Phone: +43 512 9003 71402
> Fax: +43 512 9003 73100
> Email: dietmar.rie...@i-med.ac.at
> Web:   http://www.icbi.at
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] where is it possible download CentOS 7.5

2018-03-27 Thread Brad Hubbard
On Tue, Mar 27, 2018 at 9:12 PM, Max Cuttins <m...@phoenixweb.it> wrote:

> Hi Brad,
>
> that post was mine. I knew it quite well.
>
> That post was about confirming the fact that the minimum requirements written
> in the documentation really didn't exist.
>
> However, I never asked whether there is a place where it is possible to
> download the dev or RC version of CentOS 7.5.
> I was thinking about joining the community of testers and developers that
> are already testing Ceph on that "*not ready*" environment.
>
> In that post these questions were not really asked, so no answers were
> given.
>

From that thread.

"The necessary kernel changes actually are included as part of 4.16-rc1
which is available now. We also offer a pre-built test kernel with the
necessary fixes here [1].

[1] https://shaman.ceph.com/repos/kernel/ceph-iscsi-test/"

I notice that URL is unavailable, so maybe the real question should be: why
is that kernel no longer available?

There are plenty more available at
https://shaman.ceph.com/repos/kernel/testing/ but *I* can't tell you which
is relevant; perhaps someone else can.

> I see that you also talked about other distributions. Well, I read around
> that SUSE already implements iSCSI.
> However, as far as I know (which is not much), that distribution uses a
> modified kernel in order to make this work.
> And in order to use it, you need a dashboard that can handle these kinds
> of differences (OpenAttic).
> I already knew OpenAttic is contributing to developing the next generation
> of the Ceph Dashboard (and this sounds damn good!).
> However, this also means to me that the *official dashboard* should not be
> talking about iSCSI at all (as every implementation of iSCSI runs on a
> modified version).
>
> So these are the things I cannot figure out:
> Why is the iSCSI board on the official Ceph dashboard? (I could understand it
> on OpenAttic, which runs on SUSE, but not on the official one.)
>
Why do you believe it should not be?

> And why, in the official documentation, is the minimum requirement to make
> iSCSI work to install CentOS 7.5, which doesn't exist? Is there an RC
> candidate which I can start to use?
>

But it doesn't say that, it says " RHEL/CentOS 7.5; Linux kernel v4.16 or
newer; or the Ceph iSCSI client test kernel
<https://shaman.ceph.com/repos/kernel/ceph-iscsi-test>". You seem to be
ignoring the "Ceph iSCSI client test kernel
<https://shaman.ceph.com/repos/kernel/ceph-iscsi-test>" part?

> And... if SUSE or even other distributions already work with iSCSI... why
> doesn't the documentation just recommend those instead of RHEL or
> CentOS?
>
Because that would be odd, to say the least. If the documentation is
incorrect for CentOS then it was, at least at some point, thought to be
correct, and it will probably be correct again in the near future; if not,
we can review and correct it as necessary.

> There is some confusion between the documentation's minimum requirements,
> what the dashboard suggests can be done, and what I read around about
> modded Ceph for other Linux distributions.
> I created a new post to clarify all these points.
>
> Thanks for your answer! :)
>
>
>
> Il 27/03/2018 11:24, Brad Hubbard ha scritto:
>
> See the thread in this very ML titled "Ceph iSCSI is a prank?", last
> update thirteen days ago.
>
> If your questions are not answered by that thread let us know.
>
> Please also remember that CentOS is not the only platform that ceph runs
> on by a long shot and that not all distros lag as much as it (not a
> criticism, just a fact. The reasons for lagging are valid and well
> documented and should be accepted by those who choose to use them). if you
> want the bleeding edge then rhel/centos should not be your platform of
> choice.
>
>
> On Tue, Mar 27, 2018 at 7:04 PM, Max Cuttins <m...@phoenixweb.it> wrote:
>
>> Thanks Jason,
>>
>> this is exactly what i read around and I supposed.
>> The RHEL 7.5 is not yet released (neither is Kernel 4.16)
>>
>> So my dubt are 2:
>>
>> *1) If it's not released... why is this in the documentation?*
>> Is the documentation talking about a Dev candidate already accessible
>> somewhere?
>>
>> 2) why in the dashboard is there already a iSCSI board?
>> I guess I miss something or is really just for future implementation
>> and not usable yet?
>> And if it is usable... where I can download the necessarie in order to
>> start?
>>
>>
>> Il 26/03/2018 14:10, Jason Dillaman ha scritto:
>>
>> RHEL 7.5 has not been released yet, but it should be released very
>> soon. Afte

Re: [ceph-users] where is it possible download CentOS 7.5

2018-03-27 Thread Brad Hubbard
On Tue, Mar 27, 2018 at 9:46 PM, Brad Hubbard <bhubb...@redhat.com> wrote:

>
>
> On Tue, Mar 27, 2018 at 9:12 PM, Max Cuttins <m...@phoenixweb.it> wrote:
>
>> Hi Brad,
>>
>> that post was mine. I knew it quite well.
>>
> That Post was about confirm the fact that minimum requirements written in
>> the documentation really didn't exists.
>>
>> However I never asked if there is somewhere a place where is possible to
>> download the DEV or the RC of Centos7.5.
>> I was thinking about to join the community of tester and developers that
>> are already testing Ceph on that "*not ready*" environment.
>>
>> In that POST these questions were not really made, so no answer where
>> given.
>>
>
> From that thread.
>
> "The necessary kernel changes actually are included as part of 4.16-rc1
> which is available now. We also offer a pre-built test kernel with the
> necessary fixes here [1].
>
> [1] https://shaman.ceph.com/repos/kernel/ceph-iscsi-test/;
> <https://shaman.ceph.com/repos/kernel/ceph-iscsi-test/>
>
> I notice that URL is unavailable so maybe the real question should be why
> is that kernel no longer available?
>

Turns out this build got "garbage collected" and replacing it is being
worked on right now.


>
> There are plenty more available at https://shaman.ceph.com/repos/
> kernel/testing/ but *I* can't tell you which is relevant but perhaps
> someone else can.
>
> I see that you talked also about other distribution. Well, I read around
>> that Suse already implement iSCSI.
>> However as far as I know (which is not so much), this distribution use
>> modified kernel in order to let this work.
>> And in order to use it it's needed  a dashboard that can handle these
>> kind of differences (OpenAttic).
>> I knew already OpenAttic is contributing in developing the next
>> generation of the Ceph Dashboard (and this sound damn good!).
>> However this also means to me that the *official dashboard* should not
>> be talking about ISCSI at all (as every implementation of iSCSI are running
>> on mod version).
>>
>> So these are the things I cannot figure out:
>> Why is the iSCSI board on the CEPH official dashboard? (I could
>> understand on OpenAttic which run on SUSE but not on the official one).
>>
> Why do you believe it should not be?
>
>> And why, in the official documentation, the minimu requirements to let
>> iSCSI work, is to install CentOS7.5? Which doesn't exist? Is there a RC
>> candidate which I can start to use?
>>
>
> But it doesn't say that, it says " RHEL/CentOS 7.5; Linux kernel v4.16 or
> newer; or the Ceph iSCSI client test kernel
> <https://shaman.ceph.com/repos/kernel/ceph-iscsi-test>". You seem to be
> ignoring the "Ceph iSCSI client test kernel
> <https://shaman.ceph.com/repos/kernel/ceph-iscsi-test>" part?
>
> And... if SUSE or even other distribution works already with iSCSI... why
>> the documentation just doesn't reccomend these ones instead of RHEL or
>> CENTOS?
>>
> Because that would be odd, to say the least. If the documentation is
> incorrect for CentOS then it was, at least at some point, thought to be
> correct and it probably will be correct again in the near future and, if
> not, we can review and correct it as necessary.
>
>> There is something confused about what the documentation minimal
>> requirements, the dashboard suggest to be able to do, and what i read
>> around about modded Ceph for other linux distributions.
>> I create a new post to clarify all these points.
>>
>> Thanks for your answer! :)
>>
>>
>>
>> Il 27/03/2018 11:24, Brad Hubbard ha scritto:
>>
>> See the thread in this very ML titled "Ceph iSCSI is a prank?", last
>> update thirteen days ago.
>>
>> If your questions are not answered by that thread let us know.
>>
>> Please also remember that CentOS is not the only platform that ceph runs
>> on by a long shot and that not all distros lag as much as it does (not a
>> criticism, just a fact; the reasons for lagging are valid and well
>> documented and should be accepted by those who choose to use them). If you
>> want the bleeding edge then RHEL/CentOS should not be your platform of
>> choice.
>>
>>
>> On Tue, Mar 27, 2018 at 7:04 PM, Max Cuttins <m...@phoenixweb.it> wrote:
>>
>>> Thanks Jason,
>>>
>>> this is exactly what I read around and supposed.
>>> The RHEL 7.5 is not yet released (neither is Kernel 4.16)
>>>
>>> So my doubts are two:

Re: [ceph-users] OSD crash with segfault Luminous 12.2.4

2018-03-27 Thread Brad Hubbard
On Tue, Mar 27, 2018 at 9:04 PM, Dietmar Rieder
<dietmar.rie...@i-med.ac.at> wrote:
> Thanks Brad!

Hey Dietmar,

yw.

>
> I added some information to the ticket.
> Unfortunately I still could not grab a coredump, since there was no
> segfault lately.

OK. That may help to get us started. Getting late here for me so I'll
take a look at this tomorrow.

Thanks!

>
>  http://tracker.ceph.com/issues/23431
>
> Maybe Oliver has something to add as well.
>
>
> Dietmar
>
>
> On 03/27/2018 11:37 AM, Brad Hubbard wrote:
>> "NOTE: a copy of the executable, or `objdump -rdS ` is
>> needed to interpret this."
>>
>> Have you ever wondered what this means and why it's there? :)
>>
>> This is at least something you can try. it may provide useful
>> information, it may not.
>>
>> This stack looks like it is either corrupted, or possibly not in ceph
>> but in one of the linked libraries or glibc itself. If it's the
>> former, it probably won't tell us anything. If it's the latter you
>> will need the relevant debuginfo installed to get meaningful output
>> and note that it will probably take a while. '<executable>' in this
>> case is ceph-osd of course.
>>
>> Alternatively, if you can upload a coredump and an sosreport (so I can
>> validate exact versions of all packages installed) I can try and take
>> a look.
>>
>> On Fri, Mar 23, 2018 at 9:20 PM, Dietmar Rieder
>> <dietmar.rie...@i-med.ac.at> wrote:
>>> Hi,
>>>
>>>
>>> I encountered one more two days ago, and I opened a ticket:
>>>
>>> http://tracker.ceph.com/issues/23431
>>>
>>> In our case it is more like 1 every two weeks, for now...
>>> And it is affecting different OSDs on different hosts.
>>>
>>> Dietmar
>>>
>>> On 03/23/2018 11:50 AM, Oliver Freyermuth wrote:
>>>> Hi together,
>>>>
>>>> I notice exactly the same, also the same addresses, Luminous 12.2.4, 
>>>> CentOS 7.
>>>> Sadly, logs are equally unhelpful.
>>>>
>>>> It happens randomly on an OSD about once per 2-3 days (of the 196 total 
>>>> OSDs we have). It's also not a container environment.
>>>>
>>>> Cheers,
>>>>   Oliver
>>>>
>>>> Am 08.03.2018 um 15:00 schrieb Dietmar Rieder:
>>>>> Hi,
>>>>>
>>>>> I noticed in my client (using cephfs) logs that an osd was unexpectedly
>>>>> going down.
>>>>> While checking the osd logs for the affected OSD I found that the osd
>>>>> was seg faulting:
>>>>>
>>>>> []
>>>>> 2018-03-07 06:01:28.873049 7fd9af370700 -1 *** Caught signal
>>>>> (Segmentation fault) **
>>>>>  in thread 7fd9af370700 thread_name:safe_timer
>>>>>
>>>>>   ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b)
>>>>> luminous (stable)
>>>>>1: (()+0xa3c611) [0x564585904611]
>>>>> 2: (()+0xf5e0) [0x7fd9b66305e0]
>>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>>> needed to interpret this.
>>>>> [...]
>>>>>
>>>>> Should I open a ticket for this? What additional information is needed?
>>>>>
>>>>>
>>>>> I put the relevant log entries for download under [1], so maybe someone
>>>>> with more
>>>>> experience can find some useful information therein.
>>>>>
>>>>> Thanks
>>>>>   Dietmar
>>>>>
>>>>>
>>>>> [1] https://expirebox.com/download/6473c34c80e8142e22032469a59df555.html
>>>>>
>>>>>
>>>>>
>>>>> ___
>>>>> ceph-users mailing list
>>>>> ceph-users@lists.ceph.com
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ___
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>
>>>
>>> --
>>> _
>>> D i e t m a r  R i e d e r, Mag.Dr.
>>> Innsbruck Medical University
>>> Biocenter - Division for Bioinformatics
>>> Innrain 80, 6020 Innsbruck
>>> Phone: +43 512 9003 71402
>>> Fax: +43 512 9003 73100
>>> Email: dietmar.rie...@i-med.ac.at
>>> Web:   http://www.icbi.at
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>>
>
>
> --
> _
> D i e t m a r  R i e d e r, Mag.Dr.
> Innsbruck Medical University
> Biocenter - Division for Bioinformatics
> Innrain 80, 6020 Innsbruck
> Phone: +43 512 9003 71402
> Fax: +43 512 9003 73100
> Email: dietmar.rie...@i-med.ac.at
> Web:   http://www.icbi.at
>
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] where is it possible download CentOS 7.5

2018-03-28 Thread Brad Hubbard
On Wed, Mar 28, 2018 at 6:53 PM, Max Cuttins <m...@phoenixweb.it> wrote:
> Il 27/03/2018 13:46, Brad Hubbard ha scritto:
>
>
>
> On Tue, Mar 27, 2018 at 9:12 PM, Max Cuttins <m...@phoenixweb.it> wrote:
>>
>> Hi Brad,
>>
>> that post was mine. I knew it quite well.
>>
>> That post was about confirming the fact that the minimum requirements written in
>> the documentation really didn't exist.
>>
>> However, I never asked if there is somewhere a place where it is possible to
>> download the DEV or the RC of CentOS 7.5.
>> I was thinking about joining the community of testers and developers that
>> are already testing Ceph on that "not ready" environment.
>>
>> In that post these questions were not really asked, so no answers were
>> given.
>
>
> From that thread.
>
> "The necessary kernel changes actually are included as part of 4.16-rc1
> which is available now. We also offer a pre-built test kernel with the
> necessary fixes here [1].
>
> [1] https://shaman.ceph.com/repos/kernel/ceph-iscsi-test/"
>
> I notice that URL is unavailable so maybe the real question should be why is
> that kernel no longer available?
>
>
> Yes, the link was broken and it seemed to me a misprint from old docs.
> As all the other stuff described didn't exist yet, I thought that even
> this kernel test was not available (yet, or anymore).

The link is fixed as of 12-18 hours ago and the kernel is available again.

>
>
> There are plenty more available at
> https://shaman.ceph.com/repos/kernel/testing/ but *I* can't tell you which
> is relevant but perhaps someone else can.
>
>
> However, 4.16 is almost ready to be released (it should have been already).
> At this moment it is just double work to use that kernel and afterwards upgrade
> it to the final one.

OK, I guess you just need to wait (by your choice) then.

>
>
>> I see that you also talked about other distributions. Well, I read around
>> that SUSE already implements iSCSI.
>> However, as far as I know (which is not much), this distribution uses a
>> modified kernel in order to make this work.
>> And in order to use it, a dashboard is needed that can handle these kinds
>> of differences (OpenAttic).
>> I already knew OpenAttic is contributing to developing the next generation
>> of the Ceph Dashboard (and this sounds damn good!).
>> However, this also means to me that the official dashboard should not be
>> talking about iSCSI at all (as every implementation of iSCSI is running on
>> a modified version).
>>
>> So these are the things I cannot figure out:
>> Why is the iSCSI board on the official Ceph dashboard? (I could understand
>> it on OpenAttic, which runs on SUSE, but not on the official one.)
>
> Why do you believe it should not be?
>
>
> Maybe I'm wrong, but I guess that the dashboard manager expects to get
> data/details/stats/config from a particular set of paths, components and
> daemons which cannot be the same for every ad-hoc implementation.
> So there is a dashboard that shows values for a component which is not
> there (instead there could be something else, but written in another way).
> Every ad-hoc implementation (like OpenAttic) of course knows where to find
> data/details/stats/config to work with its implementation (so it's
> understandable that they have a board for iSCSI).
> Right?

Not as far as I'm concerned. See John's email on the subject in this thread.

>
>
>> And why, in the official documentation, is the minimum requirement to make
>> iSCSI work to install CentOS 7.5? Which doesn't exist? Is there an RC
>> candidate which I can start to use?
>
>
> But it doesn't say that, it says " RHEL/CentOS 7.5; Linux kernel v4.16 or
> newer; or the Ceph iSCSI client test kernel". You seem to be ignoring the
> "Ceph iSCSI client test kernel" part?
>
> Yes, the link was broken and it seemed to me a misprint of old docs.
>
> Moreover, at first read I figured out that I needed both CentOS 7.5 AND kernel
> 4.16, OR the test kernel.
> Now you are telling me that the requirements are alternatives, which explains
> to me why the documentation suggests just CentOS and not all the other
> distributions.
> Also this sounds good.
>
> But I don't think that CentOS 7.5 will use kernel 4.16 ... so you are
> telling me that the new feature will be backported to the 3.* kernel?

Nope. I'm not part of the Red Hat kernel team and don't have the
influence to shape what they do.

> Is that right? So I don't need to upgrade the kernel if I use
> RHEL/CentOS 7.5?
> This sounds even better. I was a bit worried about not using the mainstream
> kernel of the distri

Re: [ceph-users] 1 mon unable to join the quorum

2018-03-30 Thread Brad Hubbard
I'm not sure I completely understand your "test". What exactly are you
trying to achieve and what documentation are you following?

On Fri, Mar 30, 2018 at 10:49 PM, Julien Lavesque
<julien.laves...@objectif-libre.com> wrote:
> Brad,
>
> Thanks for your answer
>
> On 30/03/2018 02:09, Brad Hubbard wrote:
>>
>> 2018-03-19 11:03:50.819493 7f842ed47640  0 mon.controller02 does not
>> exist in monmap, will attempt to join an existing cluster
>> 2018-03-19 11:03:50.820323 7f842ed47640  0 starting mon.controller02
>> rank -1 at 172.18.8.6:6789/0 mon_data
>> /var/lib/ceph/mon/ceph-controller02 fsid
>> f37f31b1-92c5-47c8-9834-1757a677d020
>>
>> We are called 'mon.controller02' and we can not find our name in the
>> local copy of the monmap.
>>
>> 2018-03-19 11:03:52.346318 7f842735d700 10
>> mon.controller02@-1(probing) e68  ready to join, but i'm not in the
>> monmap or my addr is blank, trying to join
>>
>> Our name is not in the copy of the monmap we got from peer controller01
>> either.
>
>
> During our test we have deleted completely the controller02 monitor and add
> it again.
>
> The log you have is when the controller02 is added (so it wasn't in the
> monmap before)
>
>
>>
>> $ cat ../controller02-mon_status.log
>> [root@controller02 ~]# ceph --admin-daemon
>> /var/run/ceph/ceph-mon.controller02.asok mon_status
>> {
>> "name": "controller02",
>> "rank": 1,
>> "state": "electing",
>> "election_epoch": 32749,
>> "quorum": [],
>> "outside_quorum": [],
>> "extra_probe_peers": [],
>> "sync_provider": [],
>> "monmap": {
>> "epoch": 71,
>> "fsid": "f37f31b1-92c5-47c8-9834-1757a677d020",
>> "modified": "2018-03-29 10:48:06.371157",
>> "created": "0.00",
>> "mons": [
>> {
>> "rank": 0,
>> "name": "controller01",
>> "addr": "172.18.8.5:6789\/0"
>> },
>> {
>> "rank": 1,
>> "name": "controller02",
>> "addr": "172.18.8.6:6789\/0"
>> },
>> {
>> "rank": 2,
>> "name": "controller03",
>> "addr": "172.18.8.7:6789\/0"
>> }
>> ]
>> }
>> }
>>
>> In the monmaps we are called 'controller02', not 'mon.controller02'.
>> These names need to be identical.
>>
>
> The cluster has been deployed using ceph-ansible with the servers hostname.
> All monitors are called mon.controller0x in the monmap and all the 3
> monitors have the same configuration
>
> We have the same behavior creating a monmap from scratch :
>
> [root@controller03 ~]# monmaptool --create --add controller01
> 172.18.8.5:6789 --add controller02 172.18.8.6:6789 --add controller03
> 172.18.8.7:6789 --fsid f37f31b1-92c5-47c8-9834-1757a677d020 --clobber
> test-monmap
> monmaptool: monmap file test-monmap
> monmaptool: set fsid to f37f31b1-92c5-47c8-9834-1757a677d020
> monmaptool: writing epoch 0 to test-monmap (3 monitors)
>
> [root@controller03 ~]# monmaptool --print test-monmap
> monmaptool: monmap file test-monmap
> epoch 0
> fsid f37f31b1-92c5-47c8-9834-1757a677d020
> last_changed 2018-03-30 14:42:18.809719
> created 2018-03-30 14:42:18.809719
> 0: 172.18.8.5:6789/0 mon.controller01
> 1: 172.18.8.6:6789/0 mon.controller02
> 2: 172.18.8.7:6789/0 mon.controller03
>
>
>>
>> On Thu, Mar 29, 2018 at 7:23 PM, Julien Lavesque
>> <julien.laves...@objectif-libre.com> wrote:
>>>
>>> Hi Brad,
>>>
>>> The results have been uploaded on the tracker
>>> (https://tracker.ceph.com/issues/23403)
>>>
>>> Julien
>>>
>>>
>>> On 29/03/2018 07:54, Brad Hubbard wrote:
>>>>
>>>>
>>>> Can you update with the result of the following commands from all of the
>>>> MONs?
>>>>
>>>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok mon_status
>>>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok
>>>> quorum_status
>>>>
>>>> On Thu, Ma

Re: [ceph-users] 1 mon unable to join the quorum

2018-03-28 Thread Brad Hubbard
Can you update with the result of the following commands from all of the MONs?

# ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok mon_status
# ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok quorum_status

On Thu, Mar 29, 2018 at 3:11 PM, Gauvain Pocentek
 wrote:
> Hello Ceph users,
>
> We are having a problem on a ceph cluster running Jewel: one of the mons
> left the quorum, and we  have not been able to make it join again. The two
> other monitors are running just fine, but obviously we need this third one.
>
> The problem happened before Jewel, when the cluster was running Infernalis.
> We upgraded hoping that it would solve the problem, but no luck.
>
> We've validated several things: no network problem, no clock skew, same OS
> and ceph version everywhere. We've also removed the mon completely, and
> recreated it. We also tried to run an additional mon on one of the OSD
> machines, this mon didn't join the quorum either.
>
> We've opened https://tracker.ceph.com/issues/23403 with logs from the 3 mons
> during a fresh startup of the problematic logs.
>
> Is there anything we could try to do to resolve this issue? We are getting
> out of ideas.
>
> We'd appreciate any suggestion!
>
> Gauvain Pocentek
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 1 mon unable to join the quorum

2018-03-29 Thread Brad Hubbard
2018-03-19 11:03:50.819493 7f842ed47640  0 mon.controller02 does not
exist in monmap, will attempt to join an existing cluster
2018-03-19 11:03:50.820323 7f842ed47640  0 starting mon.controller02
rank -1 at 172.18.8.6:6789/0 mon_data
/var/lib/ceph/mon/ceph-controller02 fsid
f37f31b1-92c5-47c8-9834-1757a677d020

We are called 'mon.controller02' and we can not find our name in the
local copy of the monmap.

2018-03-19 11:03:52.346318 7f842735d700 10
mon.controller02@-1(probing) e68  ready to join, but i'm not in the
monmap or my addr is blank, trying to join

Our name is not in the copy of the monmap we got from peer controller01 either.

$ cat ../controller02-mon_status.log
[root@controller02 ~]# ceph --admin-daemon
/var/run/ceph/ceph-mon.controller02.asok mon_status
{
"name": "controller02",
"rank": 1,
"state": "electing",
"election_epoch": 32749,
"quorum": [],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": {
"epoch": 71,
"fsid": "f37f31b1-92c5-47c8-9834-1757a677d020",
"modified": "2018-03-29 10:48:06.371157",
"created": "0.00",
"mons": [
{
"rank": 0,
"name": "controller01",
"addr": "172.18.8.5:6789\/0"
},
{
"rank": 1,
"name": "controller02",
"addr": "172.18.8.6:6789\/0"
},
{
"rank": 2,
"name": "controller03",
"addr": "172.18.8.7:6789\/0"
}
]
}
}

In the monmaps we are called 'controller02', not 'mon.controller02'.
These names need to be identical.


On Thu, Mar 29, 2018 at 7:23 PM, Julien Lavesque
<julien.laves...@objectif-libre.com> wrote:
> Hi Brad,
>
> The results have been uploaded on the tracker
> (https://tracker.ceph.com/issues/23403)
>
> Julien
>
>
> On 29/03/2018 07:54, Brad Hubbard wrote:
>>
>> Can you update with the result of the following commands from all of the
>> MONs?
>>
>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok mon_status
>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok quorum_status
>>
>> On Thu, Mar 29, 2018 at 3:11 PM, Gauvain Pocentek
>> <gauvain.pocen...@objectif-libre.com> wrote:
>>>
>>> Hello Ceph users,
>>>
>>> We are having a problem on a ceph cluster running Jewel: one of the mons
>>> left the quorum, and we  have not been able to make it join again. The
>>> two
>>> other monitors are running just fine, but obviously we need this third
>>> one.
>>>
>>> The problem happened before Jewel, when the cluster was running
>>> Infernalis.
>>> We upgraded hoping that it would solve the problem, but no luck.
>>>
>>> We've validated several things: no network problem, no clock skew, same
>>> OS
>>> and ceph version everywhere. We've also removed the mon completely, and
>>> recreated it. We also tried to run an additional mon on one of the OSD
>>> machines, this mon didn't join the quorum either.
>>>
>>> We've opened https://tracker.ceph.com/issues/23403 with logs from the 3
>>> mons
>>> during a fresh startup of the problematic logs.
>>>
>>> Is there anything we could try to do to resolve this issue? We are
>>> getting
>>> out of ideas.
>>>
>>> We'd appreciate any suggestion!
>>>
>>> Gauvain Pocentek
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Slow requests troubleshooting in Luminous - details missing

2018-03-05 Thread Brad Hubbard
On Fri, Mar 2, 2018 at 3:54 PM, Alex Gorbachev  wrote:
> On Thu, Mar 1, 2018 at 10:57 PM, David Turner  wrote:
>> Blocked requests and slow requests are synonyms in ceph. They are 2 names
>> for the exact same thing.
>>
>>
>> On Thu, Mar 1, 2018, 10:21 PM Alex Gorbachev  
>> wrote:
>>>
>>> On Thu, Mar 1, 2018 at 2:47 PM, David Turner 
>>> wrote:
>>> > `ceph health detail` should show you more information about the slow
>>> > requests.  If the output is too much stuff, you can grep out for blocked
>>> > or
>>> > something.  It should tell you which OSDs are involved, how long they've
>>> > been slow, etc.  The default is for them to show '> 32 sec' but that may
>>> > very well be much longer and `ceph health detail` will show that.
>>>
>>> Hi David,
>>>
>>> Thank you for the reply.  Unfortunately, the health detail only shows
>>> blocked requests.  This seems to be related to a compression setting
>>> on the pool, nothing in OSD logs.
>>>
>>> I replied to another compression thread.  This makes sense since
>>> compression is new, and in the past all such issues were reflected in
>>> OSD logs and related to either network or OSD hardware.
>>>
>>> Regards,
>>> Alex
>>>
>>> >
>>> > On Thu, Mar 1, 2018 at 2:23 PM Alex Gorbachev 
>>> > wrote:
>>> >>
>>> >> Is there a switch to turn on the display of specific OSD issues?  Or
>>> >> does the below indicate a generic problem, e.g. network and no any
>>> >> specific OSD?
>>> >>
>>> >> 2018-02-28 18:09:36.438300 7f6dead56700  0
>>> >> mon.roc-vm-sc3c234@0(leader).data_health(46) update_stats avail 56%
>>> >> total 15997 MB, used 6154 MB, avail 9008 MB
>>> >> 2018-02-28 18:09:41.477216 7f6dead56700  0 log_channel(cluster) log
>>> >> [WRN] : Health check failed: 73 slow requests are blocked > 32 sec
>>> >> (REQUEST_SLOW)
>>> >> 2018-02-28 18:09:47.552669 7f6dead56700  0 log_channel(cluster) log
>>> >> [WRN] : Health check update: 74 slow requests are blocked > 32 sec
>>> >> (REQUEST_SLOW)
>>> >> 2018-02-28 18:09:53.794882 7f6de8551700  0
>>> >> mon.roc-vm-sc3c234@0(leader) e1 handle_command mon_command({"prefix":
>>> >> "status", "format": "json"} v 0) v1
>>> >>
>>> >> --
>
> I was wrong there - the pool compression does not matter; even an
> uncompressed pool also generates these slow messages.
>
> Question is why no subsequent message relating to specific OSDs (like
> in Jewel and prior, like this example from RH:
>
> 2015-08-24 13:18:10.024659 osd.1 127.0.0.1:6812/3032 9 : cluster [WRN]
> 6 slow requests, 6 included below; oldest blocked for > 61.758455 secs
>
> 2016-07-25 03:44:06.510583 osd.50 [WRN] slow request 30.005692 seconds
> old, received at {date-time}: osd_op(client.4240.0:8
> benchmark_data_ceph-1_39426_object7 [write 0~4194304] 0.69848840) v4
> currently waiting for subops from [610]
>
> In comparison, my Luminous cluster only shows the general slow/blocked 
> message:
>
> 2018-03-01 21:52:54.237270 7f7e419e3700  0 log_channel(cluster) log
> [WRN] : Health check failed: 116 slow requests are blocked > 32 sec
> (REQUEST_SLOW)
> 2018-03-01 21:53:00.282721 7f7e419e3700  0 log_channel(cluster) log
> [WRN] : Health check update: 66 slow requests are blocked > 32 sec
> (REQUEST_SLOW)
> 2018-03-01 21:53:08.534244 7f7e419e3700  0 log_channel(cluster) log
> [WRN] : Health check update: 5 slow requests are blocked > 32 sec
> (REQUEST_SLOW)
> 2018-03-01 21:53:10.382510 7f7e419e3700  0 log_channel(cluster) log
> [INF] : Health check cleared: REQUEST_SLOW (was: 5 slow requests are
> blocked > 32 sec)
> 2018-03-01 21:53:10.382546 7f7e419e3700  0 log_channel(cluster) log
> [INF] : Cluster is now healthy
>
> So where are the details?

Working on this, thanks.

See https://tracker.ceph.com/issues/23236

>
> Thanks,
> Alex
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-28 Thread Brad Hubbard
On Fri, Jun 29, 2018 at 2:38 AM, Andrei Mikhailovsky  wrote:
> Hi Brad,
>
> This has helped to repair the issue. Many thanks for your help on this!!!

No problem.

>
> I had so many objects with broken omap checksum, that I spent at least a few 
> hours identifying those and using the commands you've listed to repair. They 
> were all related to one pool called .rgw.buckets.index . All other pools look 
> okay so far.

So originally you said you were having trouble with "one inconsistent
and stubborn PG". When did that become "so many objects"?

>
> I am wondering what could have got horribly wrong with the above pool?

Is that pool 18? I notice it seems to be size 2, what is min_size on that pool?
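
Something like this will show it (a sketch - swap in the real pool name if it
differs):

# ceph osd pool get .rgw.buckets.index size
# ceph osd pool get .rgw.buckets.index min_size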

As to working out what went wrong. What event(s) coincided with or
preceded the problem? What history can you provide? What data can you
provide from the time leading up to when the issue was first seen?

>
> Cheers
>
> Andrei
> - Original Message -
>> From: "Brad Hubbard" 
>> To: "Andrei Mikhailovsky" 
>> Cc: "ceph-users" 
>> Sent: Thursday, 28 June, 2018 01:08:34
>> Subject: Re: [ceph-users] fixing unrepairable inconsistent PG
>
>> Try the following. You can do this with all osds up and running.
>>
>> # rados -p [name_of_pool_18] setomapval .dir.default.80018061.2
>> temporary-key anything
>> # ceph pg deep-scrub 18.2
>>
>> Once you are sure the scrub has completed and the pg is no longer
>> inconsistent you can remove the temporary key.
>>
>> # rados -p [name_of_pool_18] rmomapkey .dir.default.80018061.2 temporary-key
>>
>>
>> On Wed, Jun 27, 2018 at 9:42 PM, Andrei Mikhailovsky  
>> wrote:
>>> Here is one more thing:
>>>
>>> rados list-inconsistent-obj 18.2
>>> {
>>>"inconsistents" : [
>>>   {
>>>  "object" : {
>>> "locator" : "",
>>> "version" : 632942,
>>> "nspace" : "",
>>> "name" : ".dir.default.80018061.2",
>>> "snap" : "head"
>>>  },
>>>  "union_shard_errors" : [
>>> "omap_digest_mismatch_info"
>>>  ],
>>>  "shards" : [
>>> {
>>>"osd" : 21,
>>>"primary" : true,
>>>"data_digest" : "0x",
>>>"omap_digest" : "0x25e8a1da",
>>>"errors" : [
>>>   "omap_digest_mismatch_info"
>>>],
>>>"size" : 0
>>> },
>>> {
>>>"data_digest" : "0x",
>>>"primary" : false,
>>>"osd" : 28,
>>>"errors" : [
>>>   "omap_digest_mismatch_info"
>>>],
>>>"omap_digest" : "0x25e8a1da",
>>>"size" : 0
>>> }
>>>  ],
>>>  "errors" : [],
>>>  "selected_object_info" : {
>>> "mtime" : "2018-06-19 16:31:44.759717",
>>> "alloc_hint_flags" : 0,
>>> "size" : 0,
>>> "last_reqid" : "client.410876514.0:1",
>>> "local_mtime" : "2018-06-19 16:31:44.760139",
>>> "data_digest" : "0x",
>>> "truncate_seq" : 0,
>>> "legacy_snaps" : [],
>>> "expected_write_size" : 0,
>>> "watchers" : {},
>>> "flags" : [
>>>"dirty",
>>>"data_digest",
>>>    "omap_digest"
>>> ],
>>> "oid" : {
>>>"pool" : 18,
>>>"hash" : 1156456354,
>>>"key" : "",
>>>"oid" : ".dir.default.80018061.2",
>>>"namespace&qu

Re: [ceph-users] How to repair active+clean+inconsistent?

2018-11-11 Thread Brad Hubbard
What does "rados list-inconsistent-obj " say?

Note that you may have to do a deep scrub to populate the output.
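
For example, assuming the problem PG turns out to be 1.65 (take the actual pg
id from "ceph health detail"):

# ceph pg deep-scrub 1.65
# rados list-inconsistent-obj 1.65 --format=json-pretty

Give the deep scrub time to finish before re-running the rados command,
otherwise it can still report that no scrub information is available.
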
On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong  wrote:
>
> Hi folks,
>
> I would appreciate any pointer as to how I can resolve a
> PG stuck in “active+clean+inconsistent” state. This has
> resulted in HEALTH_ERR status for the last 5 days with no
> end in sight. The state got triggered when one of the drives
> in the PG returned I/O error. I’ve since replaced the failed
> drive.
>
> I’m running Jewel (out of centos-release-ceph-jewel) on
> CentOS 7. I’ve tried “ceph pg repair <pgid>” and it didn’t seem
> to do anything. I’ve tried even more drastic measures such as
> comparing all the files (using filestore) under that PG_head
> on all 3 copies and then nuking the outlier. Nothing worked.
>
> Many thanks,
>
> -kc
>
> K.C. Wong
> kcw...@verseon.com
> M: +1 (408) 769-8235
>
> -
> Confidentiality Notice:
> This message contains confidential information. If you are not the
> intended recipient and received this message in error, any use or
> distribution is strictly prohibited. Please also notify us
> immediately by return e-mail, and delete this message from your
> computer system. Thank you.
> -
> 4096R/B8995EDE  E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
> hkps://hkps.pool.sks-keyservers.net
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to subscribe to developers list

2018-11-11 Thread Brad Hubbard
What do you get if you send "help" (without quotes) to
majord...@vger.kernel.org ?

On Sun, Nov 11, 2018 at 10:15 AM Cranage, Steve <
scran...@deepspacestorage.com> wrote:

> Can anyone tell me the secret? A colleague tried and failed many times so
> I tried and got this:
>
>
>
>
>
> Steve Cranage
> --_000_SN4PR0701MB3792CB55C8AA7468ADE7FC4DB2C00SN4PR0701MB3792_
>  Command
> '--_000_sn4pr0701mb3792cb55c8aa7468ade7fc4db2c00sn4pr0701mb3792_' not
> recognized.
>  Content-Type: text/plain; charset="us-ascii"
>  Command 'content-type:' not recognized.
>  Content-Transfer-Encoding: quoted-printable
>  Command 'content-transfer-encoding:' not recognized.
> 
>  subscribe+ceph-devel
>  Command 'subscribe+ceph-devel' not recognized.
>
>
>
> According to the server help, the 'subscribe+ceph-devel’ should be correct
> syntax, but apparently not so.
>
>
>
> TIA!
>
> Principal Architect, Co-Founder
>
> DeepSpace Storage
>
> 719-930-6960
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to repair active+clean+inconsistent?

2018-11-11 Thread Brad Hubbard
On Mon, Nov 12, 2018 at 4:21 PM Ashley Merrick  wrote:
>
> Your need to run "ceph pg deep-scrub 1.65" first

Right, thanks Ashley. That's what the "Note that you may have to do a
deep scrub to populate the output." part of my answer meant but
perhaps I needed to go further?

The system has a record of a scrub error on a previous scan but
subsequent activity in the cluster has invalidated the specifics. You
need to run another scrub to get the specific information for this pg
at this point in time (the information does not remain valid
indefinitely and therefore may need to be renewed depending on
circumstances).
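
A quick way to confirm the deep scrub has actually completed before
re-checking (a sketch; the field name may differ slightly between releases):

# ceph pg 1.65 query | grep last_deep_scrub_stamp

Once that timestamp is newer than the time you issued the deep-scrub,
list-inconsistent-obj should have current data again.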

>
> On Mon, Nov 12, 2018 at 2:20 PM K.C. Wong  wrote:
>>
>> Hi Brad,
>>
>> I got the following:
>>
>> [root@mgmt01 ~]# ceph health detail
>> HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>> pg 1.65 is active+clean+inconsistent, acting [62,67,47]
>> 1 scrub errors
>> [root@mgmt01 ~]# rados list-inconsistent-obj 1.65
>> No scrub information available for pg 1.65
>> error 2: (2) No such file or directory
>> [root@mgmt01 ~]# rados list-inconsistent-snapset 1.65
>> No scrub information available for pg 1.65
>> error 2: (2) No such file or directory
>>
>> Rather odd output, I’d say; not that I understand what
>> that means. I also tried ceph list-inconsistent-pg:
>>
>> [root@mgmt01 ~]# rados lspools
>> rbd
>> cephfs_data
>> cephfs_metadata
>> .rgw.root
>> default.rgw.control
>> default.rgw.data.root
>> default.rgw.gc
>> default.rgw.log
>> ctrl-p
>> prod
>> corp
>> camp
>> dev
>> default.rgw.users.uid
>> default.rgw.users.keys
>> default.rgw.buckets.index
>> default.rgw.buckets.data
>> default.rgw.buckets.non-ec
>> [root@mgmt01 ~]# for i in $(rados lspools); do rados list-inconsistent-pg 
>> $i; done
>> []
>> ["1.65"]
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>>
>> So, that’d put the inconsistency in the cephfs_data pool.
>>
>> Thank you for your help,
>>
>> -kc
>>
>> K.C. Wong
>> kcw...@verseon.com
>> M: +1 (408) 769-8235
>>
>> -
>> Confidentiality Notice:
>> This message contains confidential information. If you are not the
>> intended recipient and received this message in error, any use or
>> distribution is strictly prohibited. Please also notify us
>> immediately by return e-mail, and delete this message from your
>> computer system. Thank you.
>> -
>>
>> 4096R/B8995EDE  E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
>>
>> hkps://hkps.pool.sks-keyservers.net
>>
>> On Nov 11, 2018, at 5:43 PM, Brad Hubbard  wrote:
>>
>> What does "rados list-inconsistent-obj " say?
>>
>> Note that you may have to do a deep scrub to populate the output.
>> On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong  wrote:
>>
>>
>> Hi folks,
>>
>> I would appreciate any pointer as to how I can resolve a
>> PG stuck in “active+clean+inconsistent” state. This has
>> resulted in HEALTH_ERR status for the last 5 days with no
>> end in sight. The state got triggered when one of the drives
>> in the PG returned I/O error. I’ve since replaced the failed
>> drive.
>>
>> I’m running Jewel (out of centos-release-ceph-jewel) on
>> CentOS 7. I’ve tried “ceph pg repair ” and it didn’t seem
>> to do anything. I’ve tried even more drastic measures such as
>> comparing all the files (using filestore) under that PG_head
>> on all 3 copies and then nuking the outlier. Nothing worked.
>>
>> Many thanks,
>>
>> -kc
>>
>> K.C. Wong
>> kcw...@verseon.com
>> M: +1 (408) 769-8235
>>
>> -
>> Confidentiality Notice:
>> This message contains confidential information. If you are not the
>> intended recipient and received this message in error, any use or
>> distribution is strictly prohibited. Please also notify us
>> immediately by return e-mail, and delete this message from your
>> computer system. Thank you.
>> -
>> 4096R/B8995EDE  E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
>> hkps://hkps.pool.sks-keyservers.net
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>>
>> --
>> Cheers,
>> Brad
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to repair active+clean+inconsistent?

2018-11-14 Thread Brad Hubbard
You could try a 'rados get' and then a 'rados put' on the object to start with.
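
Something along these lines (a sketch - double-check the pool and object name
against your list-inconsistent-obj output first, and keep the downloaded copy
in case you need to roll back):

# rados -p cephfs_data get 10ea8bb.0045 /tmp/obj
# rados -p cephfs_data put 10ea8bb.0045 /tmp/obj
# ceph pg deep-scrub 1.65

Rewriting the object should regenerate the missing object-info attr on the
bad replica, and the follow-up deep scrub should then clear the error.
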
On Thu, Nov 15, 2018 at 4:07 AM K.C. Wong  wrote:
>
> So, I’ve issued the deep-scrub command (and the repair command)
> and nothing seems to happen.
> Unrelated to this issue, I have to take down some OSD to prepare
> a host for RMA. One of them happens to be in the replication
> group for this PG. So, a scrub happened indirectly. I now have
> this from “ceph -s”:
>
> cluster 374aed9e-5fc1-47e1-8d29-4416f7425e76
>  health HEALTH_ERR
> 1 pgs inconsistent
> 18446 scrub errors
>  monmap e1: 3 mons at 
> {mgmt01=10.0.1.1:6789/0,mgmt02=10.1.1.1:6789/0,mgmt03=10.2.1.1:6789/0}
> election epoch 252, quorum 0,1,2 mgmt01,mgmt02,mgmt03
>   fsmap e346: 1/1/1 up {0=mgmt01=up:active}, 2 up:standby
>  osdmap e40248: 120 osds: 119 up, 119 in
> flags sortbitwise,require_jewel_osds
>   pgmap v22025963: 3136 pgs, 18 pools, 18975 GB data, 214 Mobjects
> 59473 GB used, 287 TB / 345 TB avail
> 3120 active+clean
>   15 active+clean+scrubbing+deep
>1 active+clean+inconsistent
>
> That’s a lot of scrub errors:
>
> HEALTH_ERR 1 pgs inconsistent; 18446 scrub errors
> pg 1.65 is active+clean+inconsistent, acting [62,67,33]
> 18446 scrub errors
>
> Now, “rados list-inconsistent-obj 1.65” returns a *very* long JSON
> output. Here’s a very small snippet, the errors look the same across:
>
> {
>   “object”:{
> "name":”10ea8bb.0045”,
> "nspace":”",
> "locator":”",
> "snap":"head”,
> "version”:59538
>   },
>   "errors":["attr_name_mismatch”],
>   "union_shard_errors":["oi_attr_missing”],
>   "selected_object_info":"1:a70dc1cc:::10ea8bb.0045:head(2897'59538 
> client.4895965.0:462007 dirty|data_digest|omap_digest s 4194304 uv 59538 dd 
> f437a612 od  alloc_hint [0 0])”,
>   "shards”:[
> {
>   "osd":33,
>   "errors":[],
>   "size":4194304,
>   "omap_digest”:"0x”,
>   "data_digest”:"0xf437a612”,
>   "attrs":[
> {"name":"_”,
>  "value":”EAgNAQAABAM1AA...“,
>  "Base64":true},
> {"name":"snapset”,
>  "value":”AgIZAQ...“,
>  "Base64":true}
>   ]
> },
> {
>   "osd":62,
>   "errors":[],
>   "size":4194304,
>   "omap_digest":"0x”,
>   "data_digest":"0xf437a612”,
>   "attrs”:[
> {"name":"_”,
>  "value":”EAgNAQAABAM1AA...",
>  "Base64":true},
> {"name":"snapset”,
>  "value":”AgIZAQ…",
>  "Base64":true}
>   ]
> },
> {
>   "osd":67,
>   "errors":["oi_attr_missing”],
>   "size":4194304,
>   "omap_digest":"0x”,
>   "data_digest":"0xf437a612”,
>   "attrs":[]
> }
>   ]
> }
>
> Clearly, on osd.67, the “attrs” array is empty. The question is,
> how do I fix this?
>
> Many thanks in advance,
>
> -kc
>
> K.C. Wong
> kcw...@verseon.com
> M: +1 (408) 769-8235
>
> -
> Confidentiality Notice:
> This message contains confidential information. If you are not the
> intended recipient and received this message in error, any use or
> distribution is strictly prohibited. Please also notify us
> immediately by return e-mail, and delete this message from your
> computer system. Thank you.
> -
>
> 4096R/B8995EDE  E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
>
> hkps://hkps.pool.sks-keyservers.net
>
> On Nov 11, 2018, at 10:58 PM, Brad Hubbard  wrote:
>
> On Mon, Nov 12, 2018 at 4:21 PM Ashley Merrick  
> wrote:
>
>
> Your need to run "ceph pg deep-scrub 1.65" first
>
>
> Right, thanks Ashley. That's what the "Note that you may have to do a
> deep scrub to populate the output." part of my answer meant but
> perhaps I needed to go further?
>
> The system has a record of a scrub error on a previous scan but
> subsequent activity in the cluster has invalidated the specifics. You
> need to run another scrub to get the specifi

Re: [ceph-users] OSDs crashing

2018-09-25 Thread Brad Hubbard
On Tue, Sep 25, 2018 at 11:31 PM Josh Haft  wrote:
>
> Hi cephers,
>
> I have a cluster of 7 storage nodes with 12 drives each and the OSD
> processes are regularly crashing. All 84 have crashed at least once in
> the past two days. Cluster is Luminous 12.2.2 on CentOS 7.4.1708,
> kernel version 3.10.0-693.el7.x86_64. I rebooted one of the OSD nodes
> to see if that cleared up the issue, but it did not. This problem has
> been going on for about a month now, but it was much less frequent
> initially - I'd see a crash once every few days or so. I took a look
> through the mailing list and bug reports, but wasn't able to find
> anything resembling this problem.
>
> I am running a second cluster - also 12.2.2, CentOS 7.4.1708, and
> kernel version 3.10.0-693.el7.x86_64 - but I do not see the issue
> there.
>
> Log messages always look similar to the following, and I've pulled out
> the back trace from a core dump as well. The aborting thread always
> looks to be msgr-worker.
>



> #7  0x7f9e731a3a36 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
> #8  0x7f9e731a3a63 in std::terminate () at
> ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
> #9  0x7f9e731fa345 in std::(anonymous
> namespace)::execute_native_thread_routine (__p=<optimized out>) at
> ../../../../../libstdc++-v3/src/c++11/thread.cc:92

That is this code executing.

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/src/c%2B%2B11/thread.cc;h=0351f19e042b0701ba3c2597ecec87144fd631d5;hb=cf82a597b0d189857acb34a08725762c4f5afb50#l76

So the problem is we are generating an exception when our thread gets
run; we should probably catch that before it gets to here, but that's
another story...

The exception is "buffer::malformed_input: entity_addr_t marker != 1"
and there is some precedent for this
(https://tracker.ceph.com/issues/21660,
https://tracker.ceph.com/issues/24819) but I don't think they are your
issue.

We generated that exception because we encountered an ill-formed
entity_addr_t whilst decoding a message.

Could you open a tracker for this issue and upload the entire log from
a crash, preferably with "debug ms >= 5" but be careful as this will
create very large log files. You can use ceph-post-file to upload
large compressed files.
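
Something like this, as a sketch (osd.NN is whichever OSD crashed, and the log
path assumes a default install):

# ceph tell osd.NN injectargs '--debug_ms 5/5'
  ... wait for the next crash, then turn it back down ...
# ceph tell osd.NN injectargs '--debug_ms 0/0'
# gzip -c /var/log/ceph/ceph-osd.NN.log > /tmp/ceph-osd.NN.log.gz
# ceph-post-file /tmp/ceph-osd.NN.log.gz

ceph-post-file prints a tag you can paste into the tracker so we can find the
upload.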

Let me know the tracker ID here once you've created it.

P.S. This is likely fixed in a later version of Luminous since you
seem to be the only one hitting it. Either that or there is something
unusual about your environment.

>
> Has anyone else seen this? Any suggestions on how to proceed? I do
> intend to upgrade to Mimic but would prefer to do it when the cluster
> is stable.
>
> Thanks for your help.
> Josh
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG inconsistent, "pg repair" not working

2018-09-25 Thread Brad Hubbard
On Tue, Sep 25, 2018 at 7:50 PM Sergey Malinin  wrote:
>
> # rados list-inconsistent-obj 1.92
> {"epoch":519,"inconsistents":[]}

It's likely the epoch has changed since the last scrub and you'll need
to run another scrub to repopulate this data.

>
> September 25, 2018 4:58 AM, "Brad Hubbard"  wrote:
>
> > What does the output of the following command look like?
> >
> > $ rados list-inconsistent-obj 1.92
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] process stuck in D state on cephfs kernel mount

2019-01-21 Thread Brad Hubbard
http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html
should still be current enough and makes good reading on the subject.
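
If you want to see which tasks are in D state and roughly where they are
stuck, something like this works (a sketch; reading the stack files needs
root):

# ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'
# cat /proc/<pid>/stack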

On Mon, Jan 21, 2019 at 8:46 PM Stijn De Weirdt  wrote:
>
> hi marc,
>
> > - how to prevent the D state process to accumulate so much load?
> you can't. In Linux, uninterruptible tasks themselves count as "load";
> this does not mean you e.g. ran out of CPU resources.
>
> stijn
>
> >
> > Thanks,
> >
> >
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph 10.2.11 - Status not working

2018-12-17 Thread Brad Hubbard
On Tue, Dec 18, 2018 at 10:23 AM Mike O'Connor  wrote:
>
> Hi All
>
> I have a ceph cluster which has been working without issues for about 2
> years now, it was upgraded about 6 months ago to 10.2.11
>
> root@blade3:/var/lib/ceph/mon# ceph status
> 2018-12-18 10:42:39.242217 7ff770471700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.207:6789/0 pipe(0x7ff768000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768001f90).fault
> 2018-12-18 10:42:45.242745 7ff770471700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768002410).fault
> 2018-12-18 10:42:51.243230 7ff770471700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768002f40).fault
> 2018-12-18 10:42:54.243452 7ff770572700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.205:6789/0 pipe(0x7ff768000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768008060).fault
> 2018-12-18 10:42:57.243715 7ff770471700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768003580).fault
> 2018-12-18 10:43:03.244280 7ff7781b9700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.205:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768003670).fault
>
> All systems can ping each other. I simply cannot see why it's failing.
>
>
> ceph.conf
>
> [global]
>  auth client required = cephx
>  auth cluster required = cephx
>  auth service required = cephx
>  cluster network = 10.1.5.0/24
>  filestore xattr use omap = true
>  fsid = 42a0f015-76da-4f47-b506-da5cdacd030f
>  keyring = /etc/pve/priv/$cluster.$name.keyring
>  osd journal size = 5120
>  osd pool default min size = 1
>  public network = 10.1.5.0/24
>  mon_pg_warn_max_per_osd = 0
>
> [client]
>  rbd cache = true
> [osd]
>  keyring = /var/lib/ceph/osd/ceph-$id/keyring
>  osd max backfills = 1
>  osd recovery max active = 1
>  osd_disk_threads = 1
>  osd_disk_thread_ioprio_class = idle
>  osd_disk_thread_ioprio_priority = 7
> [mon.2]
>  host = blade5
>  mon addr = 10.1.5.205:6789
> [mon.1]
>  host = blade3
>  mon addr = 10.1.5.203:6789
> [mon.3]
>  host = blade7
>  mon addr = 10.1.5.207:6789
> [mon.0]
>  host = blade1
>  mon addr = 10.1.5.201:6789
> [mds]
>  mds data = /var/lib/ceph/mds/mds.$id
>  keyring = /var/lib/ceph/mds/mds.$id/mds.$id.keyring
> [mds.0]
>  host = blade1
> [mds.1]
>  host = blade3
> [mds.2]
>  host = blade5
> [mds.3]
>  host = blade7
>
>
> Any ideas ? more information ?

The system on which you are running the "ceph" client, blade3
(10.1.5.203) is trying to contact monitors on 10.1.5.207 (blade7) port
6789 and 10.1.5.205 (blade5) port 6789. You need to check the ceph-mon
binary is running on blade7 and blade5 and that they are listening on
port 6789 and that that port is accessible from blade3. The simplest
explanation is the MONs are not running. The next simplest is there is
a firewall interfering with blade3's ability to connect to port 6789
on those machines. Check the above and see what you find.
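
For example (a sketch; the exact mon unit names depend on how the cluster was
deployed):

On blade5 and blade7:
# systemctl list-units 'ceph-mon@*'
# ss -tlnp | grep 6789

And from blade3:
# nc -zv 10.1.5.205 6789
# nc -zv 10.1.5.207 6789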

-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph OOM Killer Luminous

2018-12-21 Thread Brad Hubbard
Can you provide the complete OOM message from the dmesg log?
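
For example (a sketch; the exact wording varies a little between kernels):

# dmesg -T | grep -i -B5 -A30 'out of memory'

or, if dmesg has already rotated:

# grep -i -B5 -A30 'out of memory' /var/log/messages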

On Sat, Dec 22, 2018 at 7:53 AM Pardhiv Karri  wrote:
>
>
> Thank You for the quick response Dyweni!
>
> We are using FileStore as this cluster is upgraded from 
> Hammer-->Jewel-->Luminous 12.2.8. 16x2TB HDD per node for all nodes. R730xd 
> has 128GB and R740xd has 96GB of RAM. Everything else is the same.
>
> Thanks,
> Pardhiv Karri
>
> On Fri, Dec 21, 2018 at 1:43 PM Dyweni - Ceph-Users <6exbab4fy...@dyweni.com> 
> wrote:
>>
>> Hi,
>>
>>
>> You could be running out of memory due to the default Bluestore cache sizes.
>>
>>
>> How many disks/OSDs in the R730xd versus the R740xd?  How much memory in 
>> each server type?  How many are HDD versus SSD?  Are you running Bluestore?
>>
>>
>> OSD's in Luminous, which run Bluestore, allocate memory to use as a "cache", 
>> since the kernel-provided page-cache is not available to Bluestore.  
>> Bluestore, by default, will use 1GB of memory for each HDD, and 3GB of 
>> memory for each SSD.  OSD's do not allocate all that memory up front, but 
>> grow into it as it is used.  This cache is in addition to any other memory 
>> the OSD uses.
>>
>>
>> Check out the bluestore_cache_* values (these are specified in bytes) in the 
>> manual cache sizing section of the docs 
>> (http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/).
>>Note that the automatic cache sizing feature wasn't added until 12.2.9.
>>
>>
>>
>> As an example, I have OSD's running on 32bit/armhf nodes.  These nodes have 
>> 2GB of memory.  I run 1 Bluestore OSD on each node.  In my ceph.conf file, I 
>> have 'bluestore cache size = 536870912' and 'bluestore cache kv max = 
>> 268435456'.  I see aprox 1.35-1.4 GB used by each OSD.
>>
>>
>>
>>
>> On 2018-12-21 15:19, Pardhiv Karri wrote:
>>
>> Hi,
>>
>> We have a luminous cluster which was upgraded from Hammer --> Jewel --> 
>> Luminous 12.2.8 recently. Post upgrade we are seeing issue with a few nodes 
>> where they are running out of memory and dying. In the logs we are seeing 
>> OOM killer. We don't have this issue before upgrade. The only difference is 
>> the nodes without any issue are R730xd and the ones with the memory leak are 
>> R740xd. The hardware vendor don't see anything wrong with the hardware. From 
>> Ceph end we are not seeing any issue when it comes to running the cluster, 
>> only issue is with memory leak. Right now we are actively rebooting the 
>> nodes in timely manner to avoid crashes. One R740xd node we set all the OSDs 
>> to 0.0 and there is no memory leak there. Any pointers to fix the issue 
>> would be helpful.
>>
>> Thanks,
>> Pardhiv Karri
>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> --
> Pardhiv Karri
> "Rise and Rise again until LAMBS become LIONS"
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Crush, data placement and randomness

2018-12-06 Thread Brad Hubbard
https://ceph.com/wp-content/uploads/2016/08/weil-crush-sc06.pdf
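
The paper above has the details, but the short version is that the mapping is
deliberately deterministic - that is what lets any client compute an object's
location instead of looking it up. A quick illustration (any pool/object name
works, the object does not even have to exist):

$ ceph osd map rbd some-object
$ ceph osd map rbd some-object   # same inputs, same osdmap -> same OSD set

The answer only changes when an input changes (the cluster map, the pool or
the object name), which is the "pseudo" part of pseudo-random.
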
On Thu, Dec 6, 2018 at 8:11 PM Leon Robinson  wrote:
>
> The most important thing to remember about CRUSH is that the H stands for 
> hashing.
>
> If you hash the same object you're going to get the same result.
>
> e.g. cat /etc/fstab | md5sum is always the same output, unless you change the 
> file contents.
>
> CRUSH uses the number of osds and the object and the pool and a bunch of 
> other things to create a hash which determines placement. If any of that 
> changes then the hash will change, and the placement will change; if it
> restores to exactly how it was, then the placement returns to how it was.
>
> On Thu, 2018-12-06 at 09:44 +0100, Marc Roos wrote:
>
>
>
>
> Afaik it is not random, it is calculated where your objects are stored.
>
> Some algorithm that probably takes into account how many osd's you have
>
> and their sizes.
>
> How can it be random placed? You would not be able to ever find it
>
> again. Because there is not such a thing as a 'file allocation table'
>
>
> But better search for this, I am not that deep into ceph ;)
>
>
>
>
>
> -Original Message-
>
> From: Franck Desjeunes [mailto:
>
> fdesjeu...@gmail.com
>
> ]
>
> Sent: 06 December 2018 08:01
>
> To:
>
> ceph-users@lists.ceph.com
>
>
> Subject: [ceph-users] Crush, data placement and randomness
>
>
> Hi all cephers.
>
>
> I don't know if this is the right place to ask this kind of questions,
>
> but I'll give it a try.
>
>
>
> I'm getting interested in ceph and deep dived into the technical details
>
> of it but I'm struggling to understand few things.
>
>
> When I execute a ceph osd map on an hypothetic object that does not
>
> exist, the command always give me the same OSDs set to store the object.
>
> So, what is the randomness of the CRUSH algorithm if  an object A will
>
> always be stored in the same OSDs set ?
>
>
> In the same way, why when I use librados to read an object, the stack
>
> trace shows that the code goes through the exact same functions calls as
>
> to create an object to get the OSDs set ?
>
>
> As far as I see, for me, CRUSH is fully deterministic and I don't
>
> understand why it is qualified as a pseudo-random algorithm.
>
>
> Thank you for your help.
>
>
> Best regards.
>
>
>
> ___
>
> ceph-users mailing list
>
> ceph-users@lists.ceph.com
>
>
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> --
>
> Leon L. Robinson 
>
> 
>
> NOTICE AND DISCLAIMER
> This e-mail (including any attachments) is intended for the above-named 
> person(s). If you are not the intended recipient, notify the sender 
> immediately, delete this email from your system and do not disclose or use 
> for any purpose. We may monitor all incoming and outgoing emails in line with 
> current legislation. We have taken steps to ensure that this email and 
> attachments are free from any virus, but it remains your responsibility to 
> ensure that viruses do not adversely affect you
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-10 Thread Brad Hubbard
On Fri, Jan 11, 2019 at 12:20 AM Rom Freiman  wrote:
>
> Hey,
> After upgrading to centos7.6, I started encountering the following kernel 
> panic
>
> [17845.147263] XFS (rbd4): Unmounting Filesystem
> [17846.860221] rbd: rbd4: capacity 3221225472 features 0x1
> [17847.109887] XFS (rbd4): Mounting V5 Filesystem
> [17847.191646] XFS (rbd4): Ending clean mount
> [17861.663757] rbd: rbd5: capacity 3221225472 features 0x1
> [17862.930418] usercopy: kernel memory exposure attempt detected from 
> 9d54d26d8800 (kmalloc-512) (1024 bytes)
> [17862.941698] [ cut here ]
> [17862.946854] kernel BUG at mm/usercopy.c:72!
> [17862.951524] invalid opcode:  [#1] SMP
> [17862.956123] Modules linked in: vhost_net vhost macvtap macvlan tun 
> xt_REDIRECT nf_nat_redirect ip6table_mangle xt_nat xt_mark xt_connmark 
> xt_CHECKSUM ip6table_raw xt_physdev iptable_mangle veth iptable_raw rbd 
> libceph dns_resolver ebtable_filter ebtables ip6table_filter ip6_tables 
> xt_comment mlx4_en(OE) mlx4_core(OE) xt_multiport ipt_REJECT nf_reject_ipv4 
> nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype iptable_filter 
> xt_conntrack br_netfilter bridge stp llc xfs openvswitch nf_conntrack_ipv6 
> nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 
> nf_nat nf_conntrack mlx5_core(OE) mlxfw(OE) iTCO_wdt iTCO_vendor_support 
> sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass 
> pcspkr joydev sg mei_me lpc_ich i2c_i801 mei ioatdma ipmi_si ipmi_devintf 
> ipmi_msghandler
> [17863.036328]  dm_multipath ip_tables ext4 mbcache jbd2 dm_thin_pool 
> dm_persistent_data dm_bio_prison dm_bufio libcrc32c sd_mod crc_t10dif 
> crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel 
> ghash_clmulni_intel mgag200 igb aesni_intel isci lrw gf128mul glue_helper 
> ablk_helper ahci drm_kms_helper cryptd libsas dca syscopyarea sysfillrect 
> sysimgblt fb_sys_fops ttm libahci scsi_transport_sas ptp drm libata pps_core 
> mlx_compat(OE) drm_panel_orientation_quirks i2c_algo_bit devlink wmi 
> scsi_transport_iscsi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last 
> unloaded: mlx4_core]
> [17863.094372] CPU: 3 PID: 71755 Comm: msgr-worker-1 Kdump: loaded Tainted: G 
>   OE     3.10.0-957.1.3.el7.x86_64 #1
> [17863.107673] Hardware name: Intel Corporation S2600JF/S2600JF, BIOS 
> SE5C600.86B.02.06.0006.032420170950 03/24/2017
> [17863.119134] task: 9d4e8e33e180 ti: 9d53dbaf8000 task.ti: 
> 9d53dbaf8000
> [17863.127489] RIP: 0010:[]  [] 
> __check_object_size+0x87/0x250
> [17863.137217] RSP: 0018:9d53dbafbb98  EFLAGS: 00010246
> [17863.143140] RAX: 0062 RBX: 9d54d26d8800 RCX: 
> 
> [17863.151106] RDX:  RSI: 9d557bad3898 RDI: 
> 9d557bad3898
> [17863.159072] RBP: 9d53dbafbbb8 R08:  R09: 
> 
> [17863.167038] R10: 0d0f R11: 9d53dbafb896 R12: 
> 0400
> [17863.175001] R13: 0001 R14: 9d54d26d8c00 R15: 
> 0400
> [17863.182968] FS:  7f531fa98700() GS:9d557bac() 
> knlGS:
> [17863.192001] CS:  0010 DS:  ES:  CR0: 80050033
> [17863.198414] CR2: 7f4438516930 CR3: 000f19236000 CR4: 
> 001627e0
> [17863.206379] Call Trace:
> [17863.209114]  [] memcpy_toiovec+0x4d/0xb0
> [17863.215240]  [] skb_copy_datagram_iovec+0x128/0x280
> [17863.222434]  [] tcp_recvmsg+0x22a/0xb30
> [17863.228463]  [] inet_recvmsg+0x80/0xb0
> [17863.234395]  [] sock_aio_read.part.9+0x14c/0x170
> [17863.241297]  [] ? wake_up_q+0x5b/0x80
> [17863.247129]  [] sock_aio_read+0x21/0x30
> [17863.253157]  [] do_sync_read+0x93/0xe0
> [17863.259087]  [] vfs_read+0x145/0x170
> [17863.264823]  [] SyS_read+0x7f/0xf0
> [17863.270366]  [] system_call_fastpath+0x22/0x27
> [17863.277061] Code: 45 d1 48 c7 c6 d4 b6 67 a6 48 c7 c1 e0 4b 68 a6 48 0f 45 
> f1 49 89 c0 4d 89 e1 48 89 d9 48 c7 c7 d0 1a 68 a6 31 c0 e8 20 d5 51 00 <0f> 
> 0b 0f 1f 80 00 00 00 00 48 c7 c0 00 00 c0 a5 4c 39 f0 73 0d
> [17863.298802] RIP  [] __check_object_size+0x87/0x250
> [17863.305912]  RSP 
>
> It seems to be related to rbd operations but I cannot pinpoint directly the 
> reason.

To me this seems to be an issue in the networking subsystem and there
is nothing, at this stage, that implicates the ceph modules.

If the Mellanox modules are involved in any way I would start looking
there (not because I am biased against them, but because experience
tells me that is the place to start) and then move on to the other
networking modules and the kernel more generally. This looks like some
sort of memory accounting error in the networking subsystem. I could
be wrong, of course, but there would need to be further data to tell
either way. I'd suggest capturing a vmcore and getting someone to
analyse it for you would be a good next step.
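
If you go that route, make sure kdump is armed before the next panic (a sketch
for CentOS 7; see the distro docs for crashkernel sizing):

# systemctl enable --now kdump
# grep -o 'crashkernel=[^ ]*' /proc/cmdline

After the next panic the vmcore should land under /var/crash/ and can be
analysed with the crash utility against the matching kernel-debuginfo.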

>
> Versions:
> CentOS Linux release 7.6.1810 (Core)
> Linux 

Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-10 Thread Brad Hubbard
On Fri, Jan 11, 2019 at 9:57 AM Jason Dillaman  wrote:
>
> I think Ilya recently looked into a bug that can occur when
> CONFIG_HARDENED_USERCOPY is enabled and the IO's TCP message goes
> through the loopback interface (i.e. co-located OSDs and krbd).
> Assuming that you have the same setup, you might be hitting the same
> bug.

Thanks for that Jason, I wasn't aware of that bug. I'm interested to
see the details.

>
> On Thu, Jan 10, 2019 at 6:46 PM Brad Hubbard  wrote:
> >
> > On Fri, Jan 11, 2019 at 12:20 AM Rom Freiman  wrote:
> > >
> > > Hey,
> > > After upgrading to centos7.6, I started encountering the following kernel 
> > > panic
> > >
> > > [17845.147263] XFS (rbd4): Unmounting Filesystem
> > > [17846.860221] rbd: rbd4: capacity 3221225472 features 0x1
> > > [17847.109887] XFS (rbd4): Mounting V5 Filesystem
> > > [17847.191646] XFS (rbd4): Ending clean mount
> > > [17861.663757] rbd: rbd5: capacity 3221225472 features 0x1
> > > [17862.930418] usercopy: kernel memory exposure attempt detected from 
> > > 9d54d26d8800 (kmalloc-512) (1024 bytes)
> > > [17862.941698] [ cut here ]
> > > [17862.946854] kernel BUG at mm/usercopy.c:72!
> > > [17862.951524] invalid opcode:  [#1] SMP
> > > [17862.956123] Modules linked in: vhost_net vhost macvtap macvlan tun 
> > > xt_REDIRECT nf_nat_redirect ip6table_mangle xt_nat xt_mark xt_connmark 
> > > xt_CHECKSUM ip6table_raw xt_physdev iptable_mangle veth iptable_raw rbd 
> > > libceph dns_resolver ebtable_filter ebtables ip6table_filter ip6_tables 
> > > xt_comment mlx4_en(OE) mlx4_core(OE) xt_multiport ipt_REJECT 
> > > nf_reject_ipv4 nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype 
> > > iptable_filter xt_conntrack br_netfilter bridge stp llc xfs openvswitch 
> > > nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 
> > > nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack mlx5_core(OE) mlxfw(OE) 
> > > iTCO_wdt iTCO_vendor_support sb_edac intel_powerclamp coretemp intel_rapl 
> > > iosf_mbi kvm_intel kvm irqbypass pcspkr joydev sg mei_me lpc_ich i2c_i801 
> > > mei ioatdma ipmi_si ipmi_devintf ipmi_msghandler
> > > [17863.036328]  dm_multipath ip_tables ext4 mbcache jbd2 dm_thin_pool 
> > > dm_persistent_data dm_bio_prison dm_bufio libcrc32c sd_mod crc_t10dif 
> > > crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul 
> > > crc32c_intel ghash_clmulni_intel mgag200 igb aesni_intel isci lrw 
> > > gf128mul glue_helper ablk_helper ahci drm_kms_helper cryptd libsas dca 
> > > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm libahci 
> > > scsi_transport_sas ptp drm libata pps_core mlx_compat(OE) 
> > > drm_panel_orientation_quirks i2c_algo_bit devlink wmi 
> > > scsi_transport_iscsi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last 
> > > unloaded: mlx4_core]
> > > [17863.094372] CPU: 3 PID: 71755 Comm: msgr-worker-1 Kdump: loaded 
> > > Tainted: G   OE     3.10.0-957.1.3.el7.x86_64 #1
> > > [17863.107673] Hardware name: Intel Corporation S2600JF/S2600JF, BIOS 
> > > SE5C600.86B.02.06.0006.032420170950 03/24/2017
> > > [17863.119134] task: 9d4e8e33e180 ti: 9d53dbaf8000 task.ti: 
> > > 9d53dbaf8000
> > > [17863.127489] RIP: 0010:[]  [] 
> > > __check_object_size+0x87/0x250
> > > [17863.137217] RSP: 0018:9d53dbafbb98  EFLAGS: 00010246
> > > [17863.143140] RAX: 0062 RBX: 9d54d26d8800 RCX: 
> > > 
> > > [17863.151106] RDX:  RSI: 9d557bad3898 RDI: 
> > > 9d557bad3898
> > > [17863.159072] RBP: 9d53dbafbbb8 R08:  R09: 
> > > 
> > > [17863.167038] R10: 0d0f R11: 9d53dbafb896 R12: 
> > > 0400
> > > [17863.175001] R13: 0001 R14: 9d54d26d8c00 R15: 
> > > 0400
> > > [17863.182968] FS:  7f531fa98700() GS:9d557bac() 
> > > knlGS:
> > > [17863.192001] CS:  0010 DS:  ES:  CR0: 80050033
> > > [17863.198414] CR2: 7f4438516930 CR3: 000f19236000 CR4: 
> > > 001627e0
> > > [17863.206379] Call Trace:
> > > [17863.209114]  [] memcpy_toiovec+0x4d/0xb0
> > > [17863.215240]  [] skb_copy_datagram_iovec+0x128/0x280
> > > [17863.222434]  [] tcp_recvmsg+0x22a/0xb30
> > > [17863.228463]  [] inet_recvmsg+0x80/0xb0
> > > [17863.234395]  [] sock

Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Brad Hubbard
Haha, in the email thread he says CentOS but the bug is opened against RHEL :P

Is it worth recommending a fix in skb_can_coalesce() upstream so other
modules don't hit this?

On Fri, Jan 11, 2019 at 7:39 PM Ilya Dryomov  wrote:
>
> On Fri, Jan 11, 2019 at 1:38 AM Brad Hubbard  wrote:
> >
> > On Fri, Jan 11, 2019 at 9:57 AM Jason Dillaman  wrote:
> > >
> > > I think Ilya recently looked into a bug that can occur when
> > > CONFIG_HARDENED_USERCOPY is enabled and the IO's TCP message goes
> > > through the loopback interface (i.e. co-located OSDs and krbd).
> > > Assuming that you have the same setup, you might be hitting the same
> > > bug.
> >
> > Thanks for that Jason, I wasn't aware of that bug. I'm interested to
> > see the details.
>
> Here is Rom's BZ, it has some details:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1665248
>
> Thanks,
>
> Ilya



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Brad Hubbard
On Fri, Jan 11, 2019 at 8:58 PM Rom Freiman  wrote:
>
> Same kernel :)

Not exactly the point I had in mind, but sure ;)

>
>
> On Fri, Jan 11, 2019, 12:49 Brad Hubbard  wrote:
>>
>> Haha, in the email thread he says CentOS but the bug is opened against RHEL 
>> :P
>>
>> Is it worth recommending a fix in skb_can_coalesce() upstream so other
>> modules don't hit this?
>>
>> On Fri, Jan 11, 2019 at 7:39 PM Ilya Dryomov  wrote:
>> >
>> > On Fri, Jan 11, 2019 at 1:38 AM Brad Hubbard  wrote:
>> > >
>> > > On Fri, Jan 11, 2019 at 9:57 AM Jason Dillaman  
>> > > wrote:
>> > > >
>> > > > I think Ilya recently looked into a bug that can occur when
>> > > > CONFIG_HARDENED_USERCOPY is enabled and the IO's TCP message goes
>> > > > through the loopback interface (i.e. co-located OSDs and krbd).
>> > > > Assuming that you have the same setup, you might be hitting the same
>> > > > bug.
>> > >
>> > > Thanks for that Jason, I wasn't aware of that bug. I'm interested to
>> > > see the details.
>> >
>> > Here is Rom's BZ, it has some details:
>> >
>> > https://bugzilla.redhat.com/show_bug.cgi?id=1665248
>> >
>> > Thanks,
>> >
>> > Ilya
>>
>>
>>
>> --
>> Cheers,
>> Brad
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Compacting omap data

2019-01-03 Thread Brad Hubbard
Nautilus will make this easier.

https://github.com/ceph/ceph/pull/18096
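
In the meantime, for a filestore OSD whose omap backend is rocksdb you
can usually compact offline with ceph-kvstore-tool while the OSD is
stopped. A sketch only (osd.510 taken from your example; double-check
the omap path and store type for your version before running it):

# systemctl stop ceph-osd@510
# ceph-kvstore-tool rocksdb /var/lib/ceph/osd/ceph-510/current/omap compact
# systemctl start ceph-osd@510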

On Thu, Jan 3, 2019 at 5:22 AM Bryan Stillwell  wrote:
>
> Recently on one of our bigger clusters (~1,900 OSDs) running Luminous 
> (12.2.8), we had a problem where OSDs would frequently get restarted while 
> deep-scrubbing.
>
> After digging into it I found that a number of the OSDs had very large omap 
> directories (50GiB+).  I believe these were OSDs that had previous held PGs 
> that were part of the .rgw.buckets.index pool which I have recently moved to 
> all SSDs, however, it seems like the data remained on the HDDs.
>
> I was able to reduce the data usage on most of the OSDs (from ~50 GiB to < 
> 200 MiB!) by compacting the omap dbs offline by setting 
> 'leveldb_compact_on_mount = true' in the [osd] section of ceph.conf, but that 
> didn't work on the newer OSDs which use rocksdb.  On those I had to do an 
> online compaction using a command like:
>
> $ ceph tell osd.510 compact
>
> That worked, but today when I tried doing that on some of the SSD-based OSDs 
> which are backing .rgw.buckets.index I started getting slow requests and the 
> compaction ultimately failed with this error:
>
> $ ceph tell osd.1720 compact
> osd.1720: Error ENXIO: osd down
>
> When I tried it again it succeeded:
>
> $ ceph tell osd.1720 compact
> osd.1720: compacted omap in 420.999 seconds
>
> The data usage on that OSD dropped from 57.8 GiB to 43.4 GiB which was nice, 
> but I don't believe that'll get any smaller until I start splitting the PGs 
> in the .rgw.buckets.index pool to better distribute that pool across the 
> SSD-based OSDs.
>
> The first question I have is what is the option to do an offline compaction 
> of rocksdb so I don't impact our customers while compacting the rest of the 
> SSD-based OSDs?
>
> The next question is if there's a way to configure Ceph to automatically 
> compact the omap dbs in the background in a way that doesn't affect user 
> experience?
>
> Finally, I was able to figure out that the omap directories were getting 
> large because we're using filestore on this cluster, but how could someone 
> determine this when using BlueStore?
>
> Thanks,
> Bryan
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [RGWRados]librados: Objecter returned from getxattrs r=-36

2018-09-19 Thread Brad Hubbard
Are you using filestore or bluestore on the OSDs? If filestore, what is
the underlying filesystem?

You could try setting debug_osd and debug_filestore to 20 and see if
that gives some more info?
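
For example, at runtime (osd.0 here is just a placeholder; pick an OSD
that actually serves the object, and remember to turn the levels back
down afterwards):

# ceph tell osd.0 injectargs '--debug_osd 20 --debug_filestore 20'
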
On Wed, Sep 19, 2018 at 12:36 PM fatkun chan  wrote:
>
>
> ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous 
> (stable)
>
> I have a file with long name , when I cat the file through minio client, the 
> error show.
> librados: Objecter returned from getxattrs r=-36
>
>
> the log is come from radosgw
>
> 2018-09-15 03:38:24.763109 7f833c0ed700  2 req 20:0.000272:s3:GET 
> /hand-gesture/:list_bucket:verifying op params
> 2018-09-15 03:38:24.763111 7f833c0ed700  2 req 20:0.000273:s3:GET 
> /hand-gesture/:list_bucket:pre-executing
> 2018-09-15 03:38:24.763112 7f833c0ed700  2 req 20:0.000274:s3:GET 
> /hand-gesture/:list_bucket:executing
> 2018-09-15 03:38:24.763115 7f833c0ed700 10 cls_bucket_list 
> hand-gesture[7f3000c9-66f8-4598-9811-df3800e4469a.804194.12]) start [] 
> num_entries 1001
> 2018-09-15 03:38:24.763822 7f833c0ed700 20 get_obj_state: rctx=0x7f833c0e5790 
> obj=hand-gesture:train_result/mobilenetv2_160_0.35_feature16_pyramid3_minside160_lr0.01_batchsize32_steps2000_limitratio0.5625_slot_blankdata201809041612_bluedata201808302300composite_background_201809111827/201809111827/logs/events.out.tfevents.1536672273.tf-hand-gesture-58-worker-s7uc-0-jsuf7
>  state=0x7f837553c0a0 s->prefetch_data=0
> 2018-09-15 03:38:24.763841 7f833c0ed700 10 librados: getxattrs 
> oid=7f3000c9-66f8-4598-9811-df3800e4469a.804194.12_train_result/mobilenetv2_160_0.35_feature16_pyramid3_minside160_lr0.01_batchsize32_steps2000_limitratio0.5625_slot_blankdata201809041612_bluedata201808302300composite_background_201809111827/201809111827/logs/events.out.tfevents.1536672273.tf-hand-gesture-58-worker-s7uc-0-jsuf7
>  nspace=
> 2018-09-15 03:38:24.764283 7f833c0ed700 10 librados: Objecter returned from 
> getxattrs r=-36
> 2018-09-15 03:38:24.764304 7f833c0ed700  2 req 20:0.001466:s3:GET 
> /hand-gesture/:list_bucket:completing
> 2018-09-15 03:38:24.764308 7f833c0ed700  0 WARNING: set_req_state_err 
> err_no=36 resorting to 500
> 2018-09-15 03:38:24.764355 7f833c0ed700  2 req 20:0.001517:s3:GET 
> /hand-gesture/:list_bucket:op status=-36
> 2018-09-15 03:38:24.764362 7f833c0ed700  2 req 20:0.001524:s3:GET 
> /hand-gesture/:list_bucket:http status=500
> 2018-09-15 03:38:24.764364 7f833c0ed700  1 == req done req=0x7f833c0e7110 
> op status=-36 http_status=500 ==
> 2018-09-15 03:38:24.764371 7f833c0ed700 20 process_request() returned -36
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Slow OPS

2019-03-20 Thread Brad Hubbard
On Thu, Mar 21, 2019 at 12:11 AM Glen Baars  wrote:
>
> Hello Ceph Users,
>
>
>
> Does anyone know what the flag point ‘Started’ is? Is that ceph osd daemon 
> waiting on the disk subsystem?

This is set by "mark_started()", roughly at the point where the pg
starts processing the op. You might want to capture dump_historic_ops
output after the op completes.
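
For example, run on the node hosting the primary from your output:

# ceph daemon osd.41 dump_historic_ops
# ceph daemon osd.41 dump_ops_in_flight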

>
>
>
> Ceph 13.2.4 on centos 7.5
>
>
>
> "description": "osd_op(client.1411875.0:422573570 5.18ds0 
> 5:b1ed18e5:::rbd_data.6.cf7f46b8b4567.0046e41a:head [read
>
> 1703936~16384] snapc 0=[] ondisk+read+known_if_redirected e30622)",
>
> "initiated_at": "2019-03-21 01:04:40.598438",
>
> "age": 11.340626,
>
> "duration": 11.342846,
>
> "type_data": {
>
> "flag_point": "started",
>
> "client_info": {
>
> "client": "client.1411875",
>
> "client_addr": "10.4.37.45:0/627562602",
>
> "tid": 422573570
>
> },
>
> "events": [
>
> {
>
> "time": "2019-03-21 01:04:40.598438",
>
> "event": "initiated"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598438",
>
> "event": "header_read"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598439",
>
> "event": "throttled"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598450",
>
> "event": "all_read"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598499",
>
> "event": "dispatched"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598504",
>
> "event": "queued_for_pg"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598883",
>
> "event": "reached_pg"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598905",
>
> "event": "started"
>
> }
>
> ]
>
> }
>
> }
>
> ],
>
>
>
> Glen
>
> This e-mail is intended solely for the benefit of the addressee(s) and any 
> other named recipient. It is confidential and may contain legally privileged 
> or confidential information. If you are not the recipient, any use, 
> distribution, disclosure or copying of this e-mail is prohibited. The 
> confidentiality and legal privilege attached to this communication is not 
> waived or lost by reason of the mistaken transmission or delivery to you. If 
> you have received this e-mail in error, please notify us immediately.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Slow OPS

2019-03-20 Thread Brad Hubbard
Actually, the lag is between "sub_op_committed" and "commit_sent". Is
there any pattern to these slow requests? Do they involve the same
osd, or set of osds?
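
A quick and dirty way to check from the cluster log (a rough sketch;
adjust the path, and note the wording is "slow requests" or "slow ops"
depending on the release):

# zgrep -E 'slow (request|ops)' /var/log/ceph/ceph.log* | \
    grep -o 'osd\.[0-9]*' | sort | uniq -c | sort -rn | head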

On Thu, Mar 21, 2019 at 3:37 PM Brad Hubbard  wrote:
>
> On Thu, Mar 21, 2019 at 3:20 PM Glen Baars  
> wrote:
> >
> > Thanks for that - we seem to be experiencing the wait in this section of 
> > the ops.
> >
> > {
> > "time": "2019-03-21 14:12:42.830191",
> > "event": "sub_op_committed"
> > },
> > {
> > "time": "2019-03-21 14:12:43.699872",
> > "event": "commit_sent"
> > },
> >
> > Does anyone know what that section is waiting for?
>
> Hi Glen,
>
> These are documented, to some extent, here.
>
> http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
>
> It looks like it may be taking a long time to communicate the commit
> message back to the client? Are these slow ops always the same client?
>
> >
> > Kind regards,
> > Glen Baars
> >
> > -Original Message-
> > From: Brad Hubbard 
> > Sent: Thursday, 21 March 2019 8:23 AM
> > To: Glen Baars 
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Slow OPS
> >
> > On Thu, Mar 21, 2019 at 12:11 AM Glen Baars  
> > wrote:
> > >
> > > Hello Ceph Users,
> > >
> > >
> > >
> > > Does anyone know what the flag point ‘Started’ is? Is that ceph osd 
> > > daemon waiting on the disk subsystem?
> >
> > This is set by "mark_started()" and is roughly set when the pg starts 
> > processing the op. Might want to capture dump_historic_ops output after the 
> > op completes.
> >
> > >
> > >
> > >
> > > Ceph 13.2.4 on centos 7.5
> > >
> > >
> > >
> > > "description": "osd_op(client.1411875.0:422573570 5.18ds0
> > > 5:b1ed18e5:::rbd_data.6.cf7f46b8b4567.0046e41a:head [read
> > >
> > > 1703936~16384] snapc 0=[] ondisk+read+known_if_redirected e30622)",
> > >
> > > "initiated_at": "2019-03-21 01:04:40.598438",
> > >
> > > "age": 11.340626,
> > >
> > > "duration": 11.342846,
> > >
> > > "type_data": {
> > >
> > > "flag_point": "started",
> > >
> > > "client_info": {
> > >
> > > "client": "client.1411875",
> > >
> > > "client_addr": "10.4.37.45:0/627562602",
> > >
> > > "tid": 422573570
> > >
> > > },
> > >
> > > "events": [
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598438",
> > >
> > > "event": "initiated"
> > >
> > > },
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598438",
> > >
> > > "event": "header_read"
> > >
> > > },
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598439",
> > >
> > > "event": "throttled"
> > >
> > > },
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598450",
> > >
> > > "event": "all_read"
> > >
> > > },
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598499",
> > >
> > > "event": "dispatched"
> > >
> > > },
> > >
> &g

Re: [ceph-users] Slow OPS

2019-03-20 Thread Brad Hubbard
On Thu, Mar 21, 2019 at 3:20 PM Glen Baars  wrote:
>
> Thanks for that - we seem to be experiencing the wait in this section of the 
> ops.
>
> {
> "time": "2019-03-21 14:12:42.830191",
> "event": "sub_op_committed"
> },
> {
> "time": "2019-03-21 14:12:43.699872",
> "event": "commit_sent"
> },
>
> Does anyone know what that section is waiting for?

Hi Glen,

These are documented, to some extent, here.

http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/

It looks like it may be taking a long time to communicate the commit
message back to the client? Are these slow ops always the same client?

>
> Kind regards,
> Glen Baars
>
> -Original Message-
> From: Brad Hubbard 
> Sent: Thursday, 21 March 2019 8:23 AM
> To: Glen Baars 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Slow OPS
>
> On Thu, Mar 21, 2019 at 12:11 AM Glen Baars  
> wrote:
> >
> > Hello Ceph Users,
> >
> >
> >
> > Does anyone know what the flag point ‘Started’ is? Is that ceph osd daemon 
> > waiting on the disk subsystem?
>
> This is set by "mark_started()" and is roughly set when the pg starts 
> processing the op. Might want to capture dump_historic_ops output after the 
> op completes.
>
> >
> >
> >
> > Ceph 13.2.4 on centos 7.5
> >
> >
> >
> > "description": "osd_op(client.1411875.0:422573570 5.18ds0
> > 5:b1ed18e5:::rbd_data.6.cf7f46b8b4567.0046e41a:head [read
> >
> > 1703936~16384] snapc 0=[] ondisk+read+known_if_redirected e30622)",
> >
> > "initiated_at": "2019-03-21 01:04:40.598438",
> >
> > "age": 11.340626,
> >
> > "duration": 11.342846,
> >
> > "type_data": {
> >
> > "flag_point": "started",
> >
> > "client_info": {
> >
> > "client": "client.1411875",
> >
> > "client_addr": "10.4.37.45:0/627562602",
> >
> > "tid": 422573570
> >
> > },
> >
> > "events": [
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598438",
> >
> > "event": "initiated"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598438",
> >
> > "event": "header_read"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598439",
> >
> > "event": "throttled"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598450",
> >
> > "event": "all_read"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598499",
> >
> > "event": "dispatched"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598504",
> >
> > "event": "queued_for_pg"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598883",
> >
> > "event": "reached_pg"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598905",
> >
> > "event": "started"
> >
> > }
> >
> > ]
> >
> >

Re: [ceph-users] scrub errors

2019-03-25 Thread Brad Hubbard
It would help to know what version you are running but, to begin with,
could you post the output of the following?

$ sudo ceph pg 10.2a query
$ sudo rados list-inconsistent-obj 10.2a --format=json-pretty

Also, have a read of
http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-pg/
(adjust the URl for your release).

On Tue, Mar 26, 2019 at 8:19 AM solarflow99  wrote:
>
> I noticed my cluster has scrub errors but the deep-scrub command doesn't show 
> any errors.  Is there any way to know what it takes to fix it?
>
>
>
> # ceph health detail
> HEALTH_ERR 1 pgs inconsistent; 47 scrub errors
> pg 10.2a is active+clean+inconsistent, acting [41,38,8]
> 47 scrub errors
>
> # zgrep 10.2a /var/log/ceph/ceph.log*
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 16:20:18.148299 osd.41 
> 192.168.4.19:6809/30077 54885 : cluster [INF] 10.2a deep-scrub starts
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024040 osd.41 
> 192.168.4.19:6809/30077 54886 : cluster [ERR] 10.2a shard 38 missing 
> 10/24083d2a/ec50777d-cc99-46a8-8610-4492213f412f/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024049 osd.41 
> 192.168.4.19:6809/30077 54887 : cluster [ERR] 10.2a shard 38 missing 
> 10/ff183d2a/fce859b9-61a9-46cb-82f1-4b4af31c10db/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024074 osd.41 
> 192.168.4.19:6809/30077 54888 : cluster [ERR] 10.2a shard 38 missing 
> 10/34283d2a/4b7c96cb-c494-4637-8669-e42049bd0e1c/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024076 osd.41 
> 192.168.4.19:6809/30077 54889 : cluster [ERR] 10.2a shard 38 missing 
> 10/df283d2a/bbe61149-99f8-4b83-a42b-b208d18094a8/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024077 osd.41 
> 192.168.4.19:6809/30077 54890 : cluster [ERR] 10.2a shard 38 missing 
> 10/35383d2a/60e8ed9b-bd04-5a43-8917-6f29eba28a66:0014/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024078 osd.41 
> 192.168.4.19:6809/30077 54891 : cluster [ERR] 10.2a shard 38 missing 
> 10/d5383d2a/2bdeb186-561b-4151-b87e-fe7c2e217d41/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024080 osd.41 
> 192.168.4.19:6809/30077 54892 : cluster [ERR] 10.2a shard 38 missing 
> 10/a7383d2a/b6b9d21d-2f4f-4550-8928-52552349db7d/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024081 osd.41 
> 192.168.4.19:6809/30077 54893 : cluster [ERR] 10.2a shard 38 missing 
> 10/9c383d2a/5b552687-c709-4e87-b773-1cce5b262754/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024082 osd.41 
> 192.168.4.19:6809/30077 54894 : cluster [ERR] 10.2a shard 38 missing 
> 10/5d383d2a/cb1a2ea8-0872-4de9-8b93-5ea8d9d8e613/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024083 osd.41 
> 192.168.4.19:6809/30077 54895 : cluster [ERR] 10.2a shard 38 missing 
> 10/8f483d2a/74c7a2b9-f00a-4c89-afbd-c1b8439234ac/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024085 osd.41 
> 192.168.4.19:6809/30077 54896 : cluster [ERR] 10.2a shard 38 missing 
> 10/b1583d2a/b3f00768-82a2-4637-91d1-164f3a51312a/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024086 osd.41 
> 192.168.4.19:6809/30077 54897 : cluster [ERR] 10.2a shard 38 missing 
> 10/35583d2a/e347aff4-7b71-476e-863a-310e767e4160/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024088 osd.41 
> 192.168.4.19:6809/30077 54898 : cluster [ERR] 10.2a shard 38 missing 
> 10/69583d2a/0805d07a-49d1-44cb-87c7-3bd73a0ce692/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024122 osd.41 
> 192.168.4.19:6809/30077 54899 : cluster [ERR] 10.2a shard 38 missing 
> 10/1a583d2a/d65bcf6a-9457-46c3-8fbc-432ebbaad89a/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024123 osd.41 
> 192.168.4.19:6809/30077 54900 : cluster [ERR] 10.2a shard 38 missing 
> 10/6d583d2a/5592f7d6-a131-4eb2-a3dd-b2d96691dd7e/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024124 osd.41 
> 192.168.4.19:6809/30077 54901 : cluster [ERR] 10.2a shard 38 missing 
> 10/f0683d2a/81897399-4cb0-59b3-b9ae-bf043a272137:0003/head
>
>
>
> # ceph pg deep-scrub 10.2a
> instructing pg 10.2a on osd.41 to deep-scrub
>
>
> # ceph -w | grep 10.2a
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] scrub errors

2019-03-25 Thread Brad Hubbard
> "last_active": "2018-09-22 06:33:14.791334",
> "last_peered": "2018-09-22 06:33:14.791334",
> "last_clean": "2018-09-22 06:33:14.791334",
> "last_became_active": "0.00",
> "last_became_peered": "0.00",
> "last_unstale": "2018-09-22 06:33:14.791334",
> "last_undegraded": "2018-09-22 06:33:14.791334",
> "last_fullsized": "2018-09-22 06:33:14.791334",
> "mapping_epoch": 21472,
> "log_start": "21395'11840466",
> "ondisk_log_start": "21395'11840466",
> "created": 8200,
> "last_epoch_clean": 20840,
> "parent": "0.0",
>     "parent_split_bits": 0,
> "last_scrub": "21395'11835365",
> "last_scrub_stamp": "2018-09-21 12:11:47.230141",
> "last_deep_scrub": "21395'11835365",
> "last_deep_scrub_stamp": "2018-09-21 12:11:47.230141",
> "last_clean_scrub_stamp": "2018-09-21 12:11:47.230141",
> "log_size": 3050,
> "ondisk_log_size": 3050,
> "stats_invalid": "0",
> "stat_sum": {
> "num_bytes": 6405126628,
> "num_objects": 241711,
> "num_object_clones": 0,
> "num_object_copies": 725130,
> "num_objects_missing_on_primary": 0,
> "num_objects_degraded": 0,
> "num_objects_misplaced": 0,
> "num_objects_unfound": 0,
> "num_objects_dirty": 241711,
> "num_whiteouts": 0,
> "num_read": 5637862,
> "num_read_kb": 48735376,
> "num_write": 6789687,
> "num_write_kb": 67678402,
> "num_scrub_errors": 0,
> "num_shallow_scrub_errors": 0,
> "num_deep_scrub_errors": 0,
> "num_objects_recovered": 167079,
> "num_bytes_recovered": 5191625476,
> "num_keys_recovered": 0,
> "num_objects_omap": 0,
> "num_objects_hit_set_archive": 0,
> "num_bytes_hit_set_archive": 0
> },
> "up": [
> 41,
> 38,
> 8
> ],
> "acting": [
> 41,
> 38,
> 8
> ],
> "blocked_by": [],
> "up_primary": 41,
> "acting_primary": 41
> },
> "empty": 0,
> "dne": 0,
> "incomplete": 0,
> "last_epoch_started": 21481,
> "hit_set_history": {
> "current_last_update": "0'0",
> "current_last_stamp": "0.00",
> "current_info": {
> "begin": "0.00",
> "end": "0.00",
> "version": "0'0",
> "using_gmt": "0"
> },
> "history": []
> }
> }
> ],
> "recovery_state": [
> {
> "name": "Started\/Primary\/Active",
> "enter_time": "2018-09-22 07:07:48.637248",
> "might_have_unfound": [
> {
> "osd": "7",
> "status": "not queried"
> },
> {
> "osd": "8",
> "status"

Re: [ceph-users] OS Upgrade now monitor wont start

2019-03-24 Thread Brad Hubbard
Do a "ps auwwx" to see how a running monitor was started and use the
equivalent command to try to start the MON that won't start. "ceph-mon
--help" will show you what you need. Most important is to get the ID
portion right and to add "-d" to get it to run in the foreground and
log to stdout. HTH and good luck!
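
For example, copy the flags that ps shows on a working monitor and run
the equivalent by hand on the broken one, something along the lines of
(the ID is usually the short hostname, and --setuser/--setgroup only
apply if your mons run as the ceph user):

# ps auwwx | grep ceph-mon     # on a working monitor, note the flags
# ceph-mon -d --cluster ceph --id $(hostname -s) --setuser ceph --setgroup ceph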

On Mon, Mar 25, 2019 at 11:10 AM Brent Kennedy  wrote:
>
> Upgraded all the OS’s in the cluster to Ubuntu 14.04 LTS from Ubuntu 12.02 
> LTS then finished the upgrade from Firefly to Luminous.
>
>
>
> I then tried to upgrade the first monitor to Ubuntu 16.04 LTS, the OS upgrade 
> went fine, but then the monitor and manager wouldn’t start.  I then used 
> ceph-deploy to install over the existing install to ensure the new packages 
> were installed.  Monitor and manager still wont start.  Oddly enough, it 
> seems that logging wont populate either.  I was trying to find the command to 
> run the monitor manually to see if could read the output since the logging in 
> /var/log/ceph isn’t populating.  I did a file system search to see if a log 
> file was created in another directory, but it appears that’s not the case.  
> Monitor and cluster were healthy before I started the OS upgrade.  Nothing in 
> “Journalctl –xe” other than the services starting up without any errors.  
> Cluster shows 1/3 monitors down in health status though.
>
>
>
> I hope to upgrade all the remaining monitors to 16.04.  I already upgraded 
> the gateways to 16.04 without issue.  All the OSDs are being replaced with 
> newer hardware and going to CentOS 7.6.
>
>
>
>
>
> Regards,
>
> -Brent
>
>
>
> Existing Clusters:
>
> Test: Luminous 12.2.11 with 3 osd servers, 1 mon/man, 1 gateway ( all virtual 
> on SSD )
>
> US Production(HDD): Jewel 10.2.11 with 5 osd servers, 3 mons, 3 gateways 
> behind haproxy LB
>
> UK Production(HDD): Luminous 12.2.11 with 15 osd servers, 3 mons/man, 3 
> gateways behind haproxy LB
>
> US Production(SSD): Luminous 12.2.11 with 6 osd servers, 3 mons/man, 3 
> gateways behind haproxy LB
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] VM management setup

2019-04-05 Thread Brad Hubbard
If you want to do containers at the same time, or transition some/all
of your VMs to containers at some point in the future, maybe something
based on kubevirt [1] would be more future-proof?

[1] http://kubevirt.io/

CNV is an example,
https://www.redhat.com/en/resources/container-native-virtualization

On Sat, Apr 6, 2019 at 7:37 AM Ronny Aasen  wrote:
>
>
> Proxmox VE is a simple solution.
> https://www.proxmox.com/en/proxmox-ve
>
> based on debian. can administer an internal ceph cluster or connect to
> an external connected . easy and almost self explanatory web interface.
>
> good luck in your search !
>
> Ronny
>
>
>
> On 05.04.2019 21:34, jes...@krogh.cc wrote:
> > Hi. Knowing this is a bit off-topic but seeking recommendations
> > and advise anyway.
> >
> > We're seeking a "management" solution for VM's - currently in the 40-50
> > VM - but would like to have better access in managing them and potintially
> > migrate them across multiple hosts, setup block devices, etc, etc.
> >
> > This is only to be used internally in a department where a bunch of
> > engineering people will manage it, no costumers and that kind of thing.
> >
> > Up until now we have been using virt-manager with kvm - and have been
> > quite satisfied when we were in the "few vms", but it seems like the
> > time to move on.
> >
> > Thus we're looking for something "simple" that can help manage a ceph+kvm
> > based setup -  the simpler and more to the point the better.
> >
> > Any recommendations?
> >
> > .. found a lot of names allready ..
> > OpenStack
> > CloudStack
> > Proxmox
> > ..
> >
> > But recommendations are truely welcome.
> >
> > Thanks.
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] scrub errors

2019-03-28 Thread Brad Hubbard
On Fri, Mar 29, 2019 at 7:54 AM solarflow99  wrote:
>
> ok, I tried doing ceph osd out on each of the 4 OSDs 1 by 1.  I got it out of 
> backfill mode but still not sure if it'll fix anything.  pg 10.2a still shows 
> state active+clean+inconsistent.  Peer 8  is now 
> remapped+inconsistent+peering, and the other peer is active+clean+inconsistent

Per the document I linked previously, if a pg remains remapped you
likely have a problem with your configuration. Take a good look at
your crushmap, pg distribution, pool configuration, etc.
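
Some starting points for that review (nothing destructive here):

# ceph osd tree
# ceph osd crush rule dump
# ceph osd dump | grep ^pool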

>
>
> On Wed, Mar 27, 2019 at 4:13 PM Brad Hubbard  wrote:
>>
>> On Thu, Mar 28, 2019 at 8:33 AM solarflow99  wrote:
>> >
>> > yes, but nothing seems to happen.  I don't understand why it lists OSDs 7 
>> > in the  "recovery_state": when i'm only using 3 replicas and it seems to 
>> > use 41,38,8
>>
>> Well, osd 8s state is listed as
>> "active+undersized+degraded+remapped+wait_backfill" so it seems to be
>> stuck waiting for backfill for some reason. One thing you could try is
>> restarting all of the osds including 7 and 17 to see if forcing them
>> to peer again has any positive effect. Don't restart them all at once,
>> just one at a time waiting until each has peered before moving on.
>>
>> >
>> > # ceph health detail
>> > HEALTH_ERR 1 pgs inconsistent; 47 scrub errors
>> > pg 10.2a is active+clean+inconsistent, acting [41,38,8]
>> > 47 scrub errors
>> >
>> >
>> >
>> > As you can see all OSDs are up and in:
>> >
>> > # ceph osd stat
>> >  osdmap e23265: 49 osds: 49 up, 49 in
>> >
>> >
>> >
>> >
>> > And this just stays the same:
>> >
>> > "up": [
>> > 41,
>> > 38,
>> > 8
>> > ],
>> > "acting": [
>> > 41,
>> > 38,
>> > 8
>> >
>> >  "recovery_state": [
>> > {
>> > "name": "Started\/Primary\/Active",
>> > "enter_time": "2018-09-22 07:07:48.637248",
>> > "might_have_unfound": [
>> > {
>> > "osd": "7",
>> > "status": "not queried"
>> > },
>> > {
>> > "osd": "8",
>> > "status": "already probed"
>> > },
>> > {
>> > "osd": "17",
>> > "status": "not queried"
>> > },
>> > {
>> > "osd": "38",
>> > "status": "already probed"
>> > }
>> > ],
>> >
>> >
>> > On Tue, Mar 26, 2019 at 4:53 PM Brad Hubbard  wrote:
>> >>
>> >> http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/
>> >>
>> >> Did you try repairing the pg?
>> >>
>> >>
>> >> On Tue, Mar 26, 2019 at 9:08 AM solarflow99  wrote:
>> >> >
>> >> > yes, I know its old.  I intend to have it replaced but thats a few 
>> >> > months away and was hoping to get past this.  the other OSDs appear to 
>> >> > be ok, I see them up and in, why do you see something wrong?
>> >> >
>> >> > On Mon, Mar 25, 2019 at 4:00 PM Brad Hubbard  
>> >> > wrote:
>> >> >>
>> >> >> Hammer is no longer supported.
>> >> >>
>> >> >> What's the status of osds 7 and 17?
>> >> >>
>> >> >> On Tue, Mar 26, 2019 at 8:56 AM solarflow99  
>> >> >> wrote:
>> >> >> >
>> >> >> > hi, thanks.  Its still using Hammer.  Here's the output from the pg 
>> >> >> > query, the last command you gave doesn't work at all but be too old.
>> >> >> >
>> >> >> >
>> >> >> > # ceph pg 10.2a query
>> >> >> > {
>> >> >> > "state": "active+clean+inconsistent",
>> >> >> > "

Re: [ceph-users] scrub errors

2019-03-27 Thread Brad Hubbard
On Thu, Mar 28, 2019 at 8:33 AM solarflow99  wrote:
>
> yes, but nothing seems to happen.  I don't understand why it lists OSDs 7 in 
> the  "recovery_state": when i'm only using 3 replicas and it seems to use 
> 41,38,8

Well, osd 8's state is listed as
"active+undersized+degraded+remapped+wait_backfill" so it seems to be
stuck waiting for backfill for some reason. One thing you could try is
restarting all of the osds including 7 and 17 to see if forcing them
to peer again has any positive effect. Don't restart them all at once,
just one at a time waiting until each has peered before moving on.
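
On a hammer-era cluster that usually means something like the following
per OSD (use whichever init system your nodes actually run), waiting for
'ceph -s' to settle before moving on to the next one:

# /etc/init.d/ceph restart osd.7     # or: systemctl restart ceph-osd@7
# ceph -s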

>
> # ceph health detail
> HEALTH_ERR 1 pgs inconsistent; 47 scrub errors
> pg 10.2a is active+clean+inconsistent, acting [41,38,8]
> 47 scrub errors
>
>
>
> As you can see all OSDs are up and in:
>
> # ceph osd stat
>  osdmap e23265: 49 osds: 49 up, 49 in
>
>
>
>
> And this just stays the same:
>
> "up": [
> 41,
> 38,
> 8
> ],
> "acting": [
> 41,
> 38,
> 8
>
>  "recovery_state": [
> {
> "name": "Started\/Primary\/Active",
> "enter_time": "2018-09-22 07:07:48.637248",
> "might_have_unfound": [
> {
> "osd": "7",
> "status": "not queried"
> },
> {
> "osd": "8",
> "status": "already probed"
>     },
> {
> "osd": "17",
> "status": "not queried"
> },
> {
> "osd": "38",
> "status": "already probed"
> }
> ],
>
>
> On Tue, Mar 26, 2019 at 4:53 PM Brad Hubbard  wrote:
>>
>> http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/
>>
>> Did you try repairing the pg?
>>
>>
>> On Tue, Mar 26, 2019 at 9:08 AM solarflow99  wrote:
>> >
>> > yes, I know its old.  I intend to have it replaced but thats a few months 
>> > away and was hoping to get past this.  the other OSDs appear to be ok, I 
>> > see them up and in, why do you see something wrong?
>> >
>> > On Mon, Mar 25, 2019 at 4:00 PM Brad Hubbard  wrote:
>> >>
>> >> Hammer is no longer supported.
>> >>
>> >> What's the status of osds 7 and 17?
>> >>
>> >> On Tue, Mar 26, 2019 at 8:56 AM solarflow99  wrote:
>> >> >
>> >> > hi, thanks.  Its still using Hammer.  Here's the output from the pg 
>> >> > query, the last command you gave doesn't work at all but be too old.
>> >> >
>> >> >
>> >> > # ceph pg 10.2a query
>> >> > {
>> >> > "state": "active+clean+inconsistent",
>> >> > "snap_trimq": "[]",
>> >> > "epoch": 23265,
>> >> > "up": [
>> >> > 41,
>> >> > 38,
>> >> > 8
>> >> > ],
>> >> > "acting": [
>> >> > 41,
>> >> > 38,
>> >> > 8
>> >> > ],
>> >> > "actingbackfill": [
>> >> > "8",
>> >> > "38",
>> >> > "41"
>> >> > ],
>> >> > "info": {
>> >> > "pgid": "10.2a",
>> >> > "last_update": "23265'20886859",
>> >> > "last_complete": "23265'20886859",
>> >> > "log_tail": "23265'20883809",
>> >> > "last_user_version": 20886859,
>> >> > "last_backfill": "MAX",
>> >> > "purged_snaps": "[]",
>> >> > "history": {
>> >> > "epoch_created": 8200,
>> >> > "last_epoch_started": 21481,
>> >> > "last_epoch_clean": 21487,
>> >> >

Re: [ceph-users] scrub errors

2019-03-26 Thread Brad Hubbard
http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/

Did you try repairing the pg?
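
That is, something along the lines of:

# ceph pg repair 10.2a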


On Tue, Mar 26, 2019 at 9:08 AM solarflow99  wrote:
>
> yes, I know its old.  I intend to have it replaced but thats a few months 
> away and was hoping to get past this.  the other OSDs appear to be ok, I see 
> them up and in, why do you see something wrong?
>
> On Mon, Mar 25, 2019 at 4:00 PM Brad Hubbard  wrote:
>>
>> Hammer is no longer supported.
>>
>> What's the status of osds 7 and 17?
>>
>> On Tue, Mar 26, 2019 at 8:56 AM solarflow99  wrote:
>> >
>> > hi, thanks.  Its still using Hammer.  Here's the output from the pg query, 
>> > the last command you gave doesn't work at all but be too old.
>> >
>> >
>> > # ceph pg 10.2a query
>> > {
>> > "state": "active+clean+inconsistent",
>> > "snap_trimq": "[]",
>> > "epoch": 23265,
>> > "up": [
>> > 41,
>> > 38,
>> > 8
>> > ],
>> > "acting": [
>> > 41,
>> > 38,
>> > 8
>> > ],
>> > "actingbackfill": [
>> > "8",
>> > "38",
>> > "41"
>> > ],
>> > "info": {
>> > "pgid": "10.2a",
>> > "last_update": "23265'20886859",
>> > "last_complete": "23265'20886859",
>> > "log_tail": "23265'20883809",
>> > "last_user_version": 20886859,
>> > "last_backfill": "MAX",
>> > "purged_snaps": "[]",
>> > "history": {
>> > "epoch_created": 8200,
>> > "last_epoch_started": 21481,
>> > "last_epoch_clean": 21487,
>> > "last_epoch_split": 0,
>> > "same_up_since": 21472,
>> > "same_interval_since": 21474,
>> > "same_primary_since": 8244,
>> > "last_scrub": "23265'20864209",
>> > "last_scrub_stamp": "2019-03-22 22:39:13.930673",
>> > "last_deep_scrub": "23265'20864209",
>> > "last_deep_scrub_stamp": "2019-03-22 22:39:13.930673",
>> > "last_clean_scrub_stamp": "2019-03-15 01:33:21.447438"
>> > },
>> > "stats": {
>> > "version": "23265'20886859",
>> > "reported_seq": "10109937",
>> > "reported_epoch": "23265",
>> > "state": "active+clean+inconsistent",
>> > "last_fresh": "2019-03-25 15:52:53.720768",
>> > "last_change": "2019-03-22 22:39:13.931038",
>> > "last_active": "2019-03-25 15:52:53.720768",
>> > "last_peered": "2019-03-25 15:52:53.720768",
>> > "last_clean": "2019-03-25 15:52:53.720768",
>> > "last_became_active": "0.00",
>> > "last_became_peered": "0.00",
>> > "last_unstale": "2019-03-25 15:52:53.720768",
>> > "last_undegraded": "2019-03-25 15:52:53.720768",
>> > "last_fullsized": "2019-03-25 15:52:53.720768",
>> > "mapping_epoch": 21472,
>> > "log_start": "23265'20883809",
>> > "ondisk_log_start": "23265'20883809",
>> > "created": 8200,
>> > "last_epoch_clean": 21487,
>> > "parent": "0.0",
>> > "parent_split_bits": 0,
>> > "last_scrub": "23265'20864209",
>> > "last_scrub_stamp": "2019-03-22 22:39:13.930673",
>> > "last_deep_scrub": "23265'20864209",
>> > "last

Re: [ceph-users] Fedora 29 Issues.

2019-03-26 Thread Brad Hubbard
https://bugzilla.redhat.com/show_bug.cgi?id=1662496

On Wed, Mar 27, 2019 at 5:00 AM Andrew J. Hutton
 wrote:
>
> More or less followed the install instructions with modifications as
> needed; but I'm suspecting that either a dependency was missed in the
> F29 package or something else is up. I don't see anything obvious; any
> ideas?
>
> When I try to start setting up my first node I get the following:
>
> [root@odin ceph-cluster]# ceph-deploy new thor
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.32): /usr/bin/ceph-deploy new thor
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  username  : None
> [ceph_deploy.cli][INFO  ]  verbose   : False
> [ceph_deploy.cli][INFO  ]  overwrite_conf: False
> [ceph_deploy.cli][INFO  ]  quiet : False
> [ceph_deploy.cli][INFO  ]  cd_conf   :
> 
> [ceph_deploy.cli][INFO  ]  cluster   : ceph
> [ceph_deploy.cli][INFO  ]  ssh_copykey   : True
> [ceph_deploy.cli][INFO  ]  mon   : ['thor']
> [ceph_deploy.cli][INFO  ]  func  :  at 0x7f9fb9ee8ed8>
> [ceph_deploy.cli][INFO  ]  public_network: None
> [ceph_deploy.cli][INFO  ]  ceph_conf : None
> [ceph_deploy.cli][INFO  ]  cluster_network   : None
> [ceph_deploy.cli][INFO  ]  default_release   : False
> [ceph_deploy.cli][INFO  ]  fsid  : None
> [ceph_deploy.new][DEBUG ] Creating new cluster named ceph
> [ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
> [thor][DEBUG ] connected to host: odin
> [ceph_deploy][ERROR ] Traceback (most recent call last):
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/util/decorators.py", line
> 69, in newfunc
> [ceph_deploy][ERROR ] return f(*a, **kw)
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/cli.py", line 169, in _main
> [ceph_deploy][ERROR ] return args.func(args)
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/new.py", line 141, in new
> [ceph_deploy][ERROR ] ssh_copy_keys(host, args.username)
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/new.py", line 35, in
> ssh_copy_keys
> [ceph_deploy][ERROR ] if ssh.can_connect_passwordless(hostname):
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/util/ssh.py", line 22, in
> can_connect_passwordless
> [ceph_deploy][ERROR ] out, err, retval = remoto.process.check(conn,
> command, stop_on_error=False)
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/remoto/process.py", line 163, in check
> [ceph_deploy][ERROR ] kw = extend_path(conn, kw)
> [ceph_deploy][ERROR ] NameError: global name 'extend_path' is not defined
> [ceph_deploy][ERROR ]
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] http://tracker.ceph.com/issues/38122

2019-03-06 Thread Brad Hubbard
+Jos Collin 

On Thu, Mar 7, 2019 at 9:41 AM Milanov, Radoslav Nikiforov 
wrote:

> Can someone elaborate on
>
>
>
> From http://tracker.ceph.com/issues/38122
>
>
>
> Which exactly package is missing?
>
> And why is this happening ? In Mimic all dependencies are resolved by yum?
>
> - Rado
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Failed to repair pg

2019-03-07 Thread Brad Hubbard
You could try reading the data from this object and then writing it
back again using 'rados get' followed by 'rados put'.
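
Roughly like this (the pool name is a placeholder, the object name is
taken from your output; keep a copy of the file somewhere safe):

# rados -p <pool> get rbd_data.dfd5e2235befd0.0001c299 /tmp/rbd_obj
# rados -p <pool> put rbd_data.dfd5e2235befd0.0001c299 /tmp/rbd_obj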

On Fri, Mar 8, 2019 at 3:32 AM Herbert Alexander Faleiros
 wrote:
>
> On Thu, Mar 07, 2019 at 01:37:55PM -0300, Herbert Alexander Faleiros wrote:
> > Hi,
> >
> > # ceph health detail
> > HEALTH_ERR 3 scrub errors; Possible data damage: 1 pg inconsistent
> > OSD_SCRUB_ERRORS 3 scrub errors
> > PG_DAMAGED Possible data damage: 1 pg inconsistent
> > pg 2.2bb is active+clean+inconsistent, acting [36,12,80]
> >
> > # ceph pg repair 2.2bb
> > instructing pg 2.2bb on osd.36 to repair
> >
> > But:
> >
> > 2019-03-07 13:23:38.636881 [ERR]  Health check update: Possible data 
> > damage: 1 pg inconsistent, 1 pg repair (PG_DAMAGED)
> > 2019-03-07 13:20:38.373431 [ERR]  2.2bb deep-scrub 3 errors
> > 2019-03-07 13:20:38.373426 [ERR]  2.2bb deep-scrub 0 missing, 1 
> > inconsistent objects
> > 2019-03-07 13:20:43.486860 [ERR]  Health check update: 3 scrub errors 
> > (OSD_SCRUB_ERRORS)
> > 2019-03-07 13:19:17.741350 [ERR]  deep-scrub 2.2bb 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : is an 
> > unexpected clone
> > 2019-03-07 13:19:17.523042 [ERR]  2.2bb shard 36 soid 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : data_digest 
> > 0x != data_digest 0xfc6b9538 from shard 12, size 0 != size 4194304 
> > from auth oi 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986(482757'14986708 
> > client.112595650.0:344888465 dirty|omap_digest s 4194304 uv 14974021 od 
> >  alloc_hint [0 0 0]), size 0 != size 4194304 from shard 12
> > 2019-03-07 13:19:17.523038 [ERR]  2.2bb shard 36 soid 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : candidate 
> > size 0 info size 4194304 mismatch
> > 2019-03-07 13:16:48.542673 [ERR]  2.2bb repair 2 errors, 1 fixed
> > 2019-03-07 13:16:48.542656 [ERR]  2.2bb repair 1 missing, 0 inconsistent 
> > objects
> > 2019-03-07 13:16:53.774956 [ERR]  Health check update: Possible data 
> > damage: 1 pg inconsistent (PG_DAMAGED)
> > 2019-03-07 13:16:53.774916 [ERR]  Health check update: 2 scrub errors 
> > (OSD_SCRUB_ERRORS)
> > 2019-03-07 13:15:16.986872 [ERR]  repair 2.2bb 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : is an 
> > unexpected clone
> > 2019-03-07 13:15:16.986817 [ERR]  2.2bb shard 36 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : missing
> > 2019-03-07 13:12:18.517442 [ERR]  Health check update: Possible data 
> > damage: 1 pg inconsistent, 1 pg repair (PG_DAMAGED)
> >
> > Also tried deep-scrub and scrub, same results.
> >
> > Also set noscrub,nodeep-scrub, kicked currently active scrubs one at
> > a time using 'ceph osd down '. After the last scrub was kicked,
> > forced scrub ran immediately then 'ceph pg repair', no luck.
> >
> > Finally tryed the manual aproach:
> >
> >  - stop osd.36
> >  - flush-journal
> >  - rm rbd\udata.dfd5e2235befd0.0001c299__4f986_CBDE52BB__2
> >  - start osd.36
> >  - ceph pg repair 2.2bb
> >
> > Also no luck...
> >
> > rbd\udata.dfd5e2235befd0.0001c299__4f986_CBDE52BB__2 at osd.36
> > is empty (0 size). At osd.80 4.0M, osd.2 is bluestore (can't find it).
> >
> > Ceph is 12.2.10, I'm currently migrating all my OSDs to bluestore.
> >
> > Is there anything else I can do?
>
> Should I do something like this? (below, after stop osd.36)
>
> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-36/ --journal-path 
> /dev/sdc1 rbd_data.dfd5e2235befd0.0001c299 remove-clone-metadata 
> 326022
>
> I'm no sure about rbd_data.$RBD and $CLONEID (took from rados
> list-inconsistent-obj, also below).
>
> > # rados list-inconsistent-obj 2.2bb | jq
> > {
> >   "epoch": 484655,
> >   "inconsistents": [
> > {
> >   "object": {
> > "name": "rbd_data.dfd5e2235befd0.0001c299",
> > "nspace": "",
> > "locator": "",
> > "snap": 326022,
> > "version": 14974021
> >   },
> >   "errors": [
> > "data_digest_mismatch",
> > "size_mismatch"
> >   ],
> >   "union_shard_errors": [
> > "size_mismatch_info",
> > "obj_size_info_mismatch"
> >   ],
> >   "selected_object_info": {
> > "oid": {
> >   "oid": "rbd_data.dfd5e2235befd0.0001c299",
> >   "key": "",
> >   "snapid": 326022,
> >   "hash": 3420345019,
> >   "max": 0,
> >   "pool": 2,
> >   "namespace": ""
> > },
> > "version": "482757'14986708",
> > "prior_version": "482697'14980304",
> > "last_reqid": "client.112595650.0:344888465",
> > "user_version": 14974021,
> > "size": 4194304,
> > "mtime": "2019-03-02 22:30:23.812849",
> > "local_mtime": "2019-03-02 22:30:23.813281",
> > "lost": 0,
> > "flags": [
> >   "dirty",
> >   "omap_digest"
> > ],
> > "legacy_snaps": [],
> > "truncate_seq": 

Re: [ceph-users] leak memory when mount cephfs

2019-03-19 Thread Brad Hubbard
On Tue, Mar 19, 2019 at 7:54 PM Zhenshi Zhou  wrote:
>
> Hi,
>
> I mount cephfs on my client servers. Some of the servers mount without any
> error whereas others don't.
>
> The error:
> # ceph-fuse -n client.kvm -m ceph.somedomain.com:6789 /mnt/kvm -r /kvm -d
> 2019-03-19 17:03:29.136 7f8c80eddc80 -1 deliberately leaking some memory
> 2019-03-19 17:03:29.137 7f8c80eddc80  0 ceph version 13.2.4 
> (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable), process ceph-fuse, 
> pid 2951226
> ceph-fuse: symbol lookup error: ceph-fuse: undefined symbol: 
> _Z12pipe_cloexecPi

$ c++filt  _Z12pipe_cloexecPi
pipe_cloexec(int*)

$ sudo find /lib* /usr/lib* -iname '*.so*' | xargs nm -AD 2>&1 | grep
_Z12pipe_cloexecPi
/usr/lib64/ceph/libceph-common.so:0063bb00 T _Z12pipe_cloexecPi
/usr/lib64/ceph/libceph-common.so.0:0063bb00 T _Z12pipe_cloexecPi

This appears to be an incompatibility between ceph-fuse and the
version of libceph-common it is finding. The version of ceph-fuse you
are using expects libceph-common to define the function
"pipe_cloexec(int*)" but it does not. I'd say the version of
libceph-common.so you have installed is too old. Compare it to the
version on a system that works.
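
For example, comparing the installed packages on a broken client against
a working one should show the mismatch (use the dpkg equivalent on
Debian-based systems):

# rpm -qa | grep -E 'ceph|rados' | sort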

>
> I'm not sure why some servers cannot mount cephfs. Are the servers don't have
> enough memory?
>
> Both client and server use version 13.2.4.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Large OMAP Objects in default.rgw.log pool

2019-03-07 Thread Brad Hubbard
On Fri, Mar 8, 2019 at 4:46 AM Samuel Taylor Liston  wrote:
>
> Hello All,
> I have recently had 32 large map objects appear in my default.rgw.log 
> pool.  Running luminous 12.2.8.
>
> Not sure what to think about these.I’ve done a lot of reading 
> about how when these normally occur it is related to a bucket needing 
> resharding, but it doesn’t look like my default.rgw.log pool  has anything in 
> it, let alone buckets.  Here’s some info on the system:
>
> [root@elm-rgw01 ~]# ceph versions
> {
> "mon": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 5
> },
> "mgr": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 1
> },
> "osd": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 192
> },
> "mds": {},
> "rgw": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 1
> },
> "overall": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 199
> }
> }
> [root@elm-rgw01 ~]# ceph osd pool ls
> .rgw.root
> default.rgw.control
> default.rgw.meta
> default.rgw.log
> default.rgw.buckets.index
> default.rgw.buckets.non-ec
> default.rgw.buckets.data
> [root@elm-rgw01 ~]# ceph health detail
> HEALTH_WARN 32 large omap objects
> LARGE_OMAP_OBJECTS 32 large omap objects
> 32 large objects found in pool 'default.rgw.log'
> Search the cluster log for 'Large omap object found' for more details.—
>
> Looking closer at these object they are all of size 0.  Also that pool shows 
> a capacity usage of 0:

The size here relates to data size. Object map (omap) data is metadata,
so an object of size 0 can have considerable omap data associated with
it (the omap data is stored separately from the object in a key/value
database). The large omap warning in the health detail output should
tell you to "Search the cluster log for 'Large omap object found' for
more details." If you do that you should get the names of the specific
objects involved. You can then use the rados commands listomapkeys and
listomapvals to see the specifics of the omap data. Someone more
familiar with rgw can then probably help you out on what purpose they
serve.
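
For example (substitute an object name reported by the 'Large omap
object found' messages in your cluster log):

# grep 'Large omap object found' /var/log/ceph/ceph.log
# rados -p default.rgw.log listomapkeys <object-name> | wc -l
# rados -p default.rgw.log listomapvals <object-name> | head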

HTH.

>
> (just a sampling of the 236 objects at size 0)
>
> [root@elm-mon01 ceph]# for i in `rados ls -p default.rgw.log`; do echo ${i}; 
> rados stat -p default.rgw.log ${i};done
> obj_delete_at_hint.78
> default.rgw.log/obj_delete_at_hint.78 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.70
> default.rgw.log/obj_delete_at_hint.70 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.000104
> default.rgw.log/obj_delete_at_hint.000104 mtime 2019-03-07 
> 11:39:20.00, size 0
> obj_delete_at_hint.26
> default.rgw.log/obj_delete_at_hint.26 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.28
> default.rgw.log/obj_delete_at_hint.28 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.40
> default.rgw.log/obj_delete_at_hint.40 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.15
> default.rgw.log/obj_delete_at_hint.15 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.69
> default.rgw.log/obj_delete_at_hint.69 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.95
> default.rgw.log/obj_delete_at_hint.95 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.03
> default.rgw.log/obj_delete_at_hint.03 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.47
> default.rgw.log/obj_delete_at_hint.47 mtime 2019-03-07 
> 11:39:19.00, size 0
>
>
> [root@elm-mon01 ceph]# rados df
> POOL_NAME  USEDOBJECTS   CLONES COPIES 
> MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPSRD  WR_OPSWR
> .rgw.root  1.09KiB 4  0 12
>   0   00 14853 9.67MiB 0 0B
> default.rgw.buckets.data444TiB 166829939  0 1000979634
>   0   00 362357590  859TiB 909188749 703TiB
> default.rgw.buckets.index   0B   358  0   1074
>   0   00 729694496 1.04TiB 522654976 0B
> default.rgw.buckets.non-ec  0B   182  0546
>   0   00 194204616  148GiB  97962607 0B
> default.rgw.control 0B 8  0 24
>   0   00 0  0B 0 0B
> default.rgw.log 0B   236  0708
>   0   00  33268863 3.01TiB  18415356 0B
> default.rgw.meta   16.2KiB67  0201
>   0   0 

Re: [ceph-users] Slow OPS

2019-03-21 Thread Brad Hubbard
A repop is a replication sub-operation, mostly sent from a primary to
its replicas.

That op only shows a duration of 1.3 seconds and the delay you
mentioned previously was under a second. Do you see larger delays? Are
they always between "sub_op_committed" and "commit_sent"?

What is your workload and how heavily utilised is your
cluster/network? How hard are the underlying disks working?
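
A couple of quick checks while the problem is occurring:

# ceph osd perf      # per-OSD commit/apply latencies
# iostat -x 5 3      # on the OSD nodes, watch %util and await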

On Thu, Mar 21, 2019 at 4:11 PM Glen Baars  wrote:
>
> Hello Brad,
>
> It doesn't seem to be a set of OSDs, the cluster has 160ish OSDs over 9 hosts.
>
> I seem to get a lot of these ops also that don't show a client.
>
> "description": "osd_repop(client.14349712.0:4866968 15.36 
> e30675/22264 15:6dd17247:::rbd_data.2359ef6b8b4567.0042766
> a:head v 30675'5522366)",
> "initiated_at": "2019-03-21 16:51:56.862447",
> "age": 376.527241,
>     "duration": 1.331278,
>
> Kind regards,
> Glen Baars
>
> -Original Message-
> From: Brad Hubbard 
> Sent: Thursday, 21 March 2019 1:43 PM
> To: Glen Baars 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Slow OPS
>
> Actually, the lag is between "sub_op_committed" and "commit_sent". Is there 
> any pattern to these slow requests? Do they involve the same osd, or set of 
> osds?
>
> On Thu, Mar 21, 2019 at 3:37 PM Brad Hubbard  wrote:
> >
> > On Thu, Mar 21, 2019 at 3:20 PM Glen Baars  
> > wrote:
> > >
> > > Thanks for that - we seem to be experiencing the wait in this section of 
> > > the ops.
> > >
> > > {
> > > "time": "2019-03-21 14:12:42.830191",
> > > "event": "sub_op_committed"
> > > },
> > > {
> > > "time": "2019-03-21 14:12:43.699872",
> > > "event": "commit_sent"
> > > },
> > >
> > > Does anyone know what that section is waiting for?
> >
> > Hi Glen,
> >
> > These are documented, to some extent, here.
> >
> > http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting
> > -osd/
> >
> > It looks like it may be taking a long time to communicate the commit
> > message back to the client? Are these slow ops always the same client?
> >
> > >
> > > Kind regards,
> > > Glen Baars
> > >
> > > -Original Message-
> > > From: Brad Hubbard 
> > > Sent: Thursday, 21 March 2019 8:23 AM
> > > To: Glen Baars 
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] Slow OPS
> > >
> > > On Thu, Mar 21, 2019 at 12:11 AM Glen Baars  
> > > wrote:
> > > >
> > > > Hello Ceph Users,
> > > >
> > > >
> > > >
> > > > Does anyone know what the flag point ‘Started’ is? Is that ceph osd 
> > > > daemon waiting on the disk subsystem?
> > >
> > > This is set by "mark_started()" and is roughly set when the pg starts 
> > > processing the op. Might want to capture dump_historic_ops output after 
> > > the op completes.
> > >
> > > >
> > > >
> > > >
> > > > Ceph 13.2.4 on centos 7.5
> > > >
> > > >
> > > >
> > > > "description": "osd_op(client.1411875.0:422573570
> > > > 5.18ds0
> > > > 5:b1ed18e5:::rbd_data.6.cf7f46b8b4567.0046e41a:head [read
> > > >
> > > > 1703936~16384] snapc 0=[] ondisk+read+known_if_redirected
> > > > e30622)",
> > > >
> > > > "initiated_at": "2019-03-21 01:04:40.598438",
> > > >
> > > > "age": 11.340626,
> > > >
> > > > "duration": 11.342846,
> > > >
> > > > "type_data": {
> > > >
> > > > "flag_point": "started",
> > > >
> > > > "client_info": {
> > > >
> > > > "client": "client.1411875",
> > > >
> > > > "client_

Re: [ceph-users] Debugging 'slow requests' ...

2019-02-08 Thread Brad Hubbard
Try capturing another log with debug_ms turned up; 1 or 5 should be OK
to start with.
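
For example, on the two OSDs involved in that op (11 and 23), raise it
temporarily, reproduce, then set it back:

# ceph tell osd.11 injectargs '--debug_ms 1'
# ceph tell osd.23 injectargs '--debug_ms 1'
...reproduce the slow request...
# ceph tell osd.11 injectargs '--debug_ms 0/5'
# ceph tell osd.23 injectargs '--debug_ms 0/5'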

On Fri, Feb 8, 2019 at 8:37 PM Massimo Sgaravatto
 wrote:
>
> Our Luminous ceph cluster has been working without problems for a while, but
> in the last few days we have been suffering from continuous slow requests.
>
> We have indeed done some changes in the infrastructure recently:
>
> - Moved OSD nodes to a new switch
> - Increased pg nums for a pool, to have about ~ 100 PGs/OSD (also because  we 
> have to install new OSDs in the cluster). The output of 'ceph osd df' is 
> attached.
>
> The problem could also be due to some ''bad' client, but in the log I don't 
> see a clear "correlation" with specific clients or images for such blocked 
> requests.
>
> I also tried to update to latest luminous release and latest CentOS7, but 
> this didn't help.
>
>
>
> Attached you can find the detail of one of such slow operations which took 
> about 266 secs (output from 'ceph daemon osd.11 dump_historic_ops').
> So as far as I can understand from these events:
> {
> "time": "2019-02-08 10:26:25.651728",
> "event": "op_commit"
> },
> {
> "time": "2019-02-08 10:26:25.651965",
> "event": "op_applied"
> },
>
>   {
> "time": "2019-02-08 10:26:25.653236",
> "event": "sub_op_commit_rec from 33"
> },
> {
> "time": "2019-02-08 10:30:51.890404",
> "event": "sub_op_commit_rec from 23"
> },
>
> the problem seems to be with the "sub_op_commit_rec from 23" event, which took
> too much time.
> So the problem is that the answer from OSD 23 took too long?
>
>
> In the logs of the 2 OSDs (11 and 23) in that time frame (attached) I can't
> find anything useful.
> When the problem happened the load and the usage of memory was not high in 
> the relevant nodes.
>
>
> Any help to debug the issue is really appreciated ! :-)
>
> Thanks, Massimo
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD fails to start (fsck error, unable to read osd superblock)

2019-02-09 Thread Brad Hubbard
On Sun, Feb 10, 2019 at 1:56 AM Ruben Rodriguez  wrote:
>
> Hi there,
>
> Running 12.2.11-1xenial on a machine with 6 SSD OSD with bluestore.
>
> Today we had two disks fail out of the controller, and after a reboot
> they both seemed to come back fine but ceph-osd was only able to start
> in one of them. The other one gets this:
>
> 2019-02-08 18:53:00.703376 7f64f948ce00 -1
> bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
> checksum at blob offset 0x0, got 0x95104dfc, expected 0xb9e3e26d, device
> location [0x4000~1000], logical extent 0x0~1000, object
> #-1:7b3f43c4:::osd_superblock:0#
> 2019-02-08 18:53:00.703406 7f64f948ce00 -1 osd.3 0 OSD::init() : unable
> to read osd superblock
>
> Note that there are no actual IO errors being shown by the controller in
> dmesg, and that the disk is readable. The metadata FS is mounted and
> looks normal.
>
> I tried running "ceph-bluestore-tool repair --path
> /var/lib/ceph/osd/ceph-3 --deep 1" and that gets many instances of:

Running this with debug_bluestore=30 might give more information on
the nature of the IO error.
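
Something like this should do it (the tool accepts the usual ceph
debug/log options as far as I know, so treat the exact flags as a
suggestion rather than gospel):

$ ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-3 --deep 1 \
    --debug-bluestore 30/30 --log-file /tmp/ceph-osd.3.fsck.log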

>
> 2019-02-08 19:00:31.783815 7fa35bd0df80 -1
> bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
> checksum at blob offset 0x0, got 0x95104dfc, expected 0xb9e3e26d, device
> location [0x4000~1000], logical extent 0x0~1000, object
> #-1:7b3f43c4:::osd_superblock:0#
> 2019-02-08 19:00:31.783866 7fa35bd0df80 -1
> bluestore(/var/lib/ceph/osd/ceph-3) fsck error:
> #-1:7b3f43c4:::osd_superblock:0# error during read: (5) Input/output error
>
> ...which is the same error. Due to a host being down for unrelated
> reasons, this is preventing some PG's from activating, keeping one pool
> inaccessible. There is no critical data in it, but I'm more interested
> in solving the issue for reliability.
>
> Any advice? What does bad crc indicate in this context? Should I send
> this to the bug tracker instead?
> --
> Ruben Rodriguez | Chief Technology Officer, Free Software Foundation
> GPG Key: 05EF 1D2F FE61 747D 1FC8 27C3 7FAC 7D26 472F 4409
> https://fsf.org | https://gnu.org
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Debugging 'slow requests' ...

2019-02-09 Thread Brad Hubbard
_repop(client.171725953.0:404377591 8.9b 
> e1205833/1205735 8:d90ad\
> ab6:::rbd_data.c47f3c390c8495.0001934a:head v 1205833'4767322) v2 -- 
> 0x56308ca42400 con 0
> 2019-02-09 07:29:34.417132 7f5fcf60f700  1 -- 192.168.222.202:6816/157436 <== 
> osd.14 192.168.222.203:6811/158495 11242  
> osd_repop_reply(client.171725953.0:404377591 8.9b e120\
> 5833/1205735) v2  111+0+0 (634943494 0 0) 0x563090642780 con 
> 0x56308bbd
>
> The answer from 14 arrives immediately:
>
> 2019-02-09 07:29:34.417132 7f5fcf60f700  1 -- 192.168.222.202:6816/157436 <== 
> osd.14 192.168.222.203:6811/158495 11242  
> osd_repop_reply(client.171725953.0:404377591 8.9b e120\
> 5833/1205735) v2  111+0+0 (634943494 0 0) 0x563090642780 con 
> 0x56308bbd
>
> while the one from 29 arrives only at 7.35:
>
> 2019-02-09 07:35:14.628614 7f5fcee0e700  1 -- 192.168.222.202:6816/157436 <== 
> osd.29 192.168.222.204:6804/4159520 1952  
> osd_repop_reply(client.171725953.0:404377591 8.9b e120\
> 5833/1205735) v2  111+0+0 (3804866849 0 0) 0x56308f3f2a00 con 
> 0x56308bf61000
>
>
> In osd.29 log it looks like the request only arrives at 07.35 (and it 
> promptly replies):
>
> 2019-02-09 07:35:14.627462 7f99972cc700  1 -- 192.168.222.204:6804/4159520 
> <== osd.5 192.168.222.202:6816/157436 2527  
> osd_repop(client.171725953.0:404377591 8.9b e1205833/1205735) v2  
> 1050+0+123635 (1225076790 0 171428115) 0x5610f5128a00 con 0x5610fc5bf000
> 2019-02-09 07:35:14.628343 7f998d6d4700  1 -- 192.168.222.204:6804/4159520 
> --> 192.168.222.202:6816/157436 -- 
> osd_repop_reply(client.171725953.0:404377591 8.9b e1205833/1205735 ondisk, 
> result = 0) v2 -- 0x5610f4a51180 con 0
>
>
> Network problems ?
>
>
> Full logs for the 3 relevant OSDs (just for that time period) is at: 
> https://drive.google.com/drive/folders/1TG5MomMJsqVbsuFosvYokNptLufxOnPY?usp=sharing
>
>
>
> Thanks again !
> Cheers, Massimo
>
>
>
> On Fri, Feb 8, 2019 at 11:50 PM Brad Hubbard  wrote:
>>
>> Try capturing another log with debug_ms turned up. 1 or 5 should be Ok
>> to start with.
>>
>> On Fri, Feb 8, 2019 at 8:37 PM Massimo Sgaravatto
>>  wrote:
>> >
>> > Our Luminous ceph cluster has been working without problems for a while,
>> > but in the last few days we have been suffering from continuous slow requests.
>> >
>> > We have indeed done some changes in the infrastructure recently:
>> >
>> > - Moved OSD nodes to a new switch
>> > - Increased pg nums for a pool, to have about ~ 100 PGs/OSD (also because  
>> > we have to install new OSDs in the cluster). The output of 'ceph osd df' 
>> > is attached.
>> >
>> > The problem could also be due to some ''bad' client, but in the log I 
>> > don't see a clear "correlation" with specific clients or images for such 
>> > blocked requests.
>> >
>> > I also tried to update to latest luminous release and latest CentOS7, but 
>> > this didn't help.
>> >
>> >
>> >
>> > Attached you can find the detail of one of such slow operations which took 
>> > about 266 secs (output from 'ceph daemon osd.11 dump_historic_ops').
>> > So as far as I can understand from these events:
>> > {
>> > "time": "2019-02-08 10:26:25.651728",
>> > "event": "op_commit"
>> > },
>> > {
>> > "time": "2019-02-08 10:26:25.651965",
>> > "event": "op_applied"
>> > },
>> >
>> >   {
>> > "time": "2019-02-08 10:26:25.653236",
>> > "event": "sub_op_commit_rec from 33"
>> > },
>> > {
>> > "time": "2019-02-08 10:30:51.890404",
>> > "event": "sub_op_commit_rec from 23"
>> > },
>> >
>> > the problem seems to be with the "sub_op_commit_rec from 23" event, which
>> > took too much time.
>> > So the problem is that the answer from OSD 23 took too long?
>> >
>> >
>> > In the logs of the 2 OSDs (11 and 23) in that time frame (attached) I can't
>> > find anything useful.
>> > When the problem happened the load and the usage of memory was not high in 
>> > the relevant nodes.
>> >
>> >
>> > Any help to debug the issue is really appreciated ! :-)
>> >
>> > Thanks, Massimo
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>> --
>> Cheers,
>> Brad



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD fails to start (fsck error, unable to read osd superblock)

2019-02-13 Thread Brad Hubbard
A single OSD should be expendable and you should be able to just "zap"
it and recreate it. Was this not true in your case?
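
Roughly, once you are sure the remaining copies of the data are healthy
(a sketch only; the device name and provisioning steps depend on how the
OSD was originally deployed):

$ ceph osd out 3
$ systemctl stop ceph-osd@3
$ ceph osd purge 3 --yes-i-really-mean-it
$ ceph-volume lvm zap /dev/sdX --destroy              # /dev/sdX = the device backing osd.3
$ ceph-volume lvm create --bluestore --data /dev/sdX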

On Wed, Feb 13, 2019 at 1:27 AM Ruben Rodriguez  wrote:
>
>
>
> On 2/9/19 5:40 PM, Brad Hubbard wrote:
> > On Sun, Feb 10, 2019 at 1:56 AM Ruben Rodriguez  wrote:
> >>
> >> Hi there,
> >>
> >> Running 12.2.11-1xenial on a machine with 6 SSD OSD with bluestore.
> >>
> >> Today we had two disks fail out of the controller, and after a reboot
> >> they both seemed to come back fine but ceph-osd was only able to start
> >> in one of them. The other one gets this:
> >>
> >> 2019-02-08 18:53:00.703376 7f64f948ce00 -1
> >> bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
> >> checksum at blob offset 0x0, got 0x95104dfc, expected 0xb9e3e26d, device
> >> location [0x4000~1000], logical extent 0x0~1000, object
> >> #-1:7b3f43c4:::osd_superblock:0#
> >> 2019-02-08 18:53:00.703406 7f64f948ce00 -1 osd.3 0 OSD::init() : unable
> >> to read osd superblock
> >>
> >> Note that there are no actual IO errors being shown by the controller in
> >> dmesg, and that the disk is readable. The metadata FS is mounted and
> >> looks normal.
> >>
> >> I tried running "ceph-bluestore-tool repair --path
> >> /var/lib/ceph/osd/ceph-3 --deep 1" and that gets many instances of:
> >
> > Running this with debug_bluestore=30 might give more information on
> > the nature of the IO error.
>
> I had collected the logs with debug info already, and nothing
> significant was listed there. I applied this patch
> https://github.com/ceph/ceph/pull/26247 and it allowed me to move
> forward. There was a osd map corruption issue that I had to handle by
> hand, but after that the osd started fine. After it started and
> backfills finished, the bluestore_ignore_data_csum flag is no longer
> needed, so I reverted to standard packages.
>
> --
> Ruben Rodriguez | Chief Technology Officer, Free Software Foundation
> GPG Key: 05EF 1D2F FE61 747D 1FC8  27C3 7FAC 7D26 472F 4409
> https://fsf.org | https://gnu.org
>


-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Debugging 'slow requests' ...

2019-02-11 Thread Brad Hubbard
Glad to help!

On Tue, Feb 12, 2019 at 4:55 PM Massimo Sgaravatto
 wrote:
>
> Thanks a lot Brad !
>
> The problem is indeed in the network: we moved the OSD nodes back to the 
> "old" switches and the problem disappeared.
>
> Now we have to figure out what is wrong/misconfigured with the new switch: we 
> would try to replicate the problem, possibly without a ceph deployment ...
>
> Thanks again for your help !
>
> Cheers, Massimo
>
> On Sun, Feb 10, 2019 at 12:07 AM Brad Hubbard  wrote:
>>
>> The log ends at
>>
>> $ zcat ceph-osd.5.log.gz |tail -2
>> 2019-02-09 07:37:00.022534 7f5fce60d700  1 --
>> 192.168.61.202:6816/157436 >> - conn(0x56308edcf000 :6816
>> s=STATE_ACCEPTING pgs=0 cs=0 l=0)._process_connection sd=296 -
>>
>> The last two messages are outbound to 192.168.222.204 and there are no
>> further messages between these two hosts (other than osd_pings) in the
>> log.
>>
>> $ zcat ceph-osd.5.log.gz |gawk
>> '!/osd_ping/&&/192.168.222.202/&&/192.168.222.204/&&/07:29:3/'|tail -5
>> 2019-02-09 07:29:34.267744 7f5fcee0e700  1 --
>> 192.168.222.202:6816/157436 <== osd.29 192.168.222.204:6804/4159520
>> 1946  rep_scrubmap(8.2bc e1205735 from shard 29) v2  40+0+1492
>> (3695125937 0 2050362985) 0x563090674d80 con 0x56308bf61000
>> 2019-02-09 07:29:34.375223 7f5faf4b4700  1 --
>> 192.168.222.202:6816/157436 --> 192.168.222.204:6804/4159520 --
>> replica_scrub(pg:
>> 8.2bc,from:0'0,to:0'0,epoch:1205833/1205735,start:8:3d4e6145:::rbd_data.35f46d19abe4ed.77a4:0,end:8:3d4e6916:::rbd_data.a6dc2425de9600.0006249c:0,chunky:1,deep:0,version:9,allow_preemption:1,priority=5)
>> v9 -- 0x56308bdf2000 con 0
>> 2019-02-09 07:29:34.378535 7f5fcee0e700  1 --
>> 192.168.222.202:6816/157436 <== osd.29 192.168.222.204:6804/4159520
>> 1947  rep_scrubmap(8.2bc e1205735 from shard 29) v2  40+0+1494
>> (3695125937 0 865217733) 0x563092d90900 con 0x56308bf61000
>> 2019-02-09 07:29:34.415868 7f5faf4b4700  1 --
>> 192.168.222.202:6816/157436 --> 192.168.222.204:6804/4159520 --
>> osd_repop(client.171725953.0:404377591 8.9b e1205833/1205735
>> 8:d90adab6:::rbd_data.c47f3c390c8495.0001934a:head v
>> 1205833'4767322) v2 -- 0x56308ca42400 con 0
>> 2019-02-09 07:29:34.486296 7f5faf4b4700  1 --
>> 192.168.222.202:6816/157436 --> 192.168.222.204:6804/4159520 --
>> replica_scrub(pg:
>> 8.2bc,from:0'0,to:0'0,epoch:1205833/1205735,start:8:3d4e6916:::rbd_data.a6dc2425de9600.0006249c:0,end:8:3d4e7434:::rbd_data.47c1b437840214.0003c594:0,chunky:1,deep:0,version:9,allow_preemption:1,priority=5)
>> v9 -- 0x56308e565340 con 0
>>
>> I'd be taking a good, hard look at the network, yes.
>>
>> On Sat, Feb 9, 2019 at 6:33 PM Massimo Sgaravatto
>>  wrote:
>> >
>> > Thanks for your feedback !
>> >
>> > I increased debug_ms to 1/5.
>> >
>> > This is another slow request (full output from 'ceph daemon osd.5 
>> > dump_historic_ops' for this event is attached):
>> >
>> >
>> > {
>> > "description": "osd_op(client.171725953.0:404377591 8.9b 
>> > 8:d90adab6:
>> > ::rbd_data.c47f3c390c8495.0001934a:head [set-alloc-hint 
>> > object_size 4194
>> > 304 write_size 4194304,write 1413120~122880] snapc 0=[] 
>> > ondisk+write+known_if_re
>> > directed e1205833)",
>> > "initiated_at": "2019-02-09 07:29:34.404655",
>> > "age": 387.914193,
>> > "duration": 340.224154,
>> > "type_data": {
>> > "flag_point": "commit sent; apply or cleanup",
>> > "client_info": {
>> > "client": "client.171725953",
>> > "client_addr": "192.168.61.66:0/4056439540",
>> > "tid": 404377591
>> > },
>> > "events": [
>> > {
>> > "time": "2019-02-09 07:29:34.404655",
>> > "event": "initiated"
>> > },
>> > 
>> > 
>> >{
>> > "time": "2019-02-09 07:29:34.416752",
>> > "event": "op

Re: [ceph-users] backfill_toofull after adding new OSDs

2019-02-06 Thread Brad Hubbard
Let's try to restrict discussion to the original thread
"backfill_toofull while OSDs are not full" and get a tracker opened up
for this issue.

On Sat, Feb 2, 2019 at 11:52 AM Fyodor Ustinov  wrote:
>
> Hi!
>
> Right now, after adding OSD:
>
> # ceph health detail
> HEALTH_ERR 74197563/199392333 objects misplaced (37.212%); Degraded data 
> redundancy (low space): 1 pg backfill_toofull
> OBJECT_MISPLACED 74197563/199392333 objects misplaced (37.212%)
> PG_DEGRADED_FULL Degraded data redundancy (low space): 1 pg backfill_toofull
> pg 6.eb is active+remapped+backfill_wait+backfill_toofull, acting 
> [21,0,47]
>
> # ceph pg ls-by-pool iscsi backfill_toofull
> PG   OBJECTS DEGRADED MISPLACED UNFOUND BYTES  LOG  STATE 
>  STATE_STAMPVERSION   REPORTED   UP   
>   ACTING   SCRUB_STAMPDEEP_SCRUB_STAMP
> 6.eb 6450  1290   0 1645654016 3067 
> active+remapped+backfill_wait+backfill_toofull 2019-02-02 00:20:32.975300 
> 7208'6567 9790:16214 [5,1,21]p5 [21,0,47]p21 2019-01-18 04:13:54.280495 
> 2019-01-18 04:13:54.280495
>
> All OSD have less 40% USE.
>
> ID CLASS WEIGHT  REWEIGHT SIZEUSE AVAIL   %USE  VAR  PGS
>  0   hdd 9.56149  1.0 9.6 TiB 3.2 TiB 6.3 TiB 33.64 1.31 313
>  1   hdd 9.56149  1.0 9.6 TiB 3.3 TiB 6.3 TiB 34.13 1.33 295
>  5   hdd 9.56149  1.0 9.6 TiB 756 GiB 8.8 TiB  7.72 0.30 103
> 47   hdd 9.32390  1.0 9.3 TiB 3.1 TiB 6.2 TiB 33.75 1.31 306
>
> (all other OSD also have less 40%)
>
> ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
>
> Maybe the developers will pay attention to the letter and say something?
>
> - Original Message -
> From: "Fyodor Ustinov" 
> To: "Caspar Smit" 
> Cc: "Jan Kasprzak" , "ceph-users" 
> Sent: Thursday, 31 January, 2019 16:50:24
> Subject: Re: [ceph-users] backfill_toofull after adding new OSDs
>
> Hi!
>
> I saw the same several times when I added a new osd to the cluster. One-two 
> pg in "backfill_toofull" state.
>
> In all versions of mimic.
>
> - Original Message -
> From: "Caspar Smit" 
> To: "Jan Kasprzak" 
> Cc: "ceph-users" 
> Sent: Thursday, 31 January, 2019 15:43:07
> Subject: Re: [ceph-users] backfill_toofull after adding new OSDs
>
> Hi Jan,
>
> You might be hitting the same issue as Wido here:
>
> [ https://www.spinics.net/lists/ceph-users/msg50603.html | 
> https://www.spinics.net/lists/ceph-users/msg50603.html ]
>
> Kind regards,
> Caspar
>
> Op do 31 jan. 2019 om 14:36 schreef Jan Kasprzak < [ mailto:k...@fi.muni.cz | 
> k...@fi.muni.cz ] >:
>
>
> Hello, ceph users,
>
> I see the following HEALTH_ERR during cluster rebalance:
>
> Degraded data redundancy (low space): 8 pgs backfill_toofull
>
> Detailed description:
> I have upgraded my cluster to mimic and added 16 new bluestore OSDs
> on 4 hosts. The hosts are in a separate region in my crush map, and crush
> rules prevented data to be moved on the new OSDs. Now I want to move
> all data to the new OSDs (and possibly decomission the old filestore OSDs).
> I have created the following rule:
>
> # ceph osd crush rule create-replicated on-newhosts newhostsroot host
>
> after this, I am slowly moving the pools one-by-one to this new rule:
>
> # ceph osd pool set test-hdd-pool crush_rule on-newhosts
>
> When I do this, I get the above error. This is misleading, because
> ceph osd df does not suggest the OSDs are getting full (the most full
> OSD is about 41 % full). After rebalancing is done, the HEALTH_ERR
> disappears. Why am I getting this error?
>
> # ceph -s
> cluster:
> id: ...my UUID...
> health: HEALTH_ERR
> 1271/3803223 objects misplaced (0.033%)
> Degraded data redundancy: 40124/3803223 objects degraded (1.055%), 65 pgs 
> degraded, 67 pgs undersized
> Degraded data redundancy (low space): 8 pgs backfill_toofull
>
> services:
> mon: 3 daemons, quorum mon1,mon2,mon3
> mgr: mon2(active), standbys: mon1, mon3
> osd: 80 osds: 80 up, 80 in; 90 remapped pgs
> rgw: 1 daemon active
>
> data:
> pools: 13 pools, 5056 pgs
> objects: 1.27 M objects, 4.8 TiB
> usage: 15 TiB used, 208 TiB / 224 TiB avail
> pgs: 40124/3803223 objects degraded (1.055%)
> 1271/3803223 objects misplaced (0.033%)
> 4963 active+clean
> 41 active+recovery_wait+undersized+degraded+remapped
> 21 active+recovery_wait+undersized+degraded
> 17 active+remapped+backfill_wait
> 5 active+remapped+backfill_wait+backfill_toofull
> 3 active+remapped+backfill_toofull
> 2 active+recovering+undersized+remapped
> 2 active+recovering+undersized+degraded+remapped
> 1 active+clean+remapped
> 1 active+recovering+undersized+degraded
>
> io:
> client: 6.6 MiB/s rd, 2.7 MiB/s wr, 75 op/s rd, 89 op/s wr
> recovery: 2.0 MiB/s, 92 objects/s
>
> Thanks for any hint,
>
> -Yenya
>
> --
> | Jan "Yenya" Kasprzak http://fi.muni.cz/ | fi.muni.cz ] - work | 
> [ http://yenya.net/ | yenya.net ] - private}> |
> | [ http://www.fi.muni.cz/~kas/ | http://www.fi.muni.cz/~kas/ ] GPG: 
> 4096R/A45477D5 |

Re: [ceph-users] showing active config settings

2019-04-15 Thread Brad Hubbard
On Tue, Apr 16, 2019 at 7:38 AM solarflow99  wrote:
>
> Then why doesn't this work?
>
> # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
> osd.0: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
> osd.1: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
> osd.2: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
> osd.3: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
> osd.4: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
>
> # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> osd_recovery_max_active = 3

Did you try "config diff" as Paul suggested?

>
>
>
> On Wed, Apr 10, 2019 at 7:21 AM Eugen Block  wrote:
>>
>> > I always end up using "ceph --admin-daemon
>> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get what
>> > is in effect now for a certain daemon.
>> > Needs you to be on the host of the daemon of course.
>>
>> Me too, I just wanted to try what OP reported. And after trying that,
>> I'll keep it that way. ;-)
>>
>>
>> Zitat von Janne Johansson :
>>
>> > Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block :
>> >
>> >> > If you don't specify which daemon to talk to, it tells you what the
>> >> > defaults would be for a random daemon started just now using the same
>> >> > config as you have in /etc/ceph/ceph.conf.
>> >>
>> >> I tried that, too, but the result is not correct:
>> >>
>> >> host1:~ # ceph -n osd.1 --show-config | grep osd_recovery_max_active
>> >> osd_recovery_max_active = 3
>> >>
>> >
>> > I always end up using "ceph --admin-daemon
>> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get what
>> > is in effect now for a certain daemon.
>> > Needs you to be on the host of the daemon of course.
>> >
>> > --
>> > May the most significant bit of your life be positive.
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] showing active config settings

2019-04-16 Thread Brad Hubbard
$ ceph config set osd osd_recovery_max_active 4
$ ceph daemon osd.0 config diff|grep -A5 osd_recovery_max_active
"osd_recovery_max_active": {
"default": 3,
"mon": 4,
"override": 4,
"final": 4
},

On Wed, Apr 17, 2019 at 5:29 AM solarflow99  wrote:
>
> I wish there was a way to query the running settings from one of the MGR 
> hosts, and it doesn't help that ansible doesn't even copy the keyring to the 
> OSD nodes so commands there wouldn't work anyway.
> I'm still puzzled why it doesn't show any change when I run this no matter 
> what I set it to:
>
> # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> osd_recovery_max_active = 3
>
> in fact it doesn't matter if I use an OSD number that doesn't exist, same 
> thing if I use ceph get
>
>
>
> On Tue, Apr 16, 2019 at 1:18 AM Brad Hubbard  wrote:
>>
>> On Tue, Apr 16, 2019 at 6:03 PM Paul Emmerich  wrote:
>> >
>> > This works, it just says that it *might* require a restart, but this
>> > particular option takes effect without a restart.
>>
>> We've already looked at changing the wording once to make it more palatable.
>>
>> http://tracker.ceph.com/issues/18424
>>
>> >
>> > Implementation detail: this message shows up if there's no internal
>> > function to be called when this option changes, so it can't be sure if
>> > the change is actually doing anything because the option might be
>> > cached or only read on startup. But in this case this option is read
>> > in the relevant path every time and no notification is required. But
>> > the injectargs command can't know that.
>>
>> Right on all counts. The functions are referred to as observers and
>> register to be notified if the value changes, hence "not observed."
>>
>> >
>> > Paul
>> >
>> > On Mon, Apr 15, 2019 at 11:38 PM solarflow99  wrote:
>> > >
>> > > Then why doesn't this work?
>> > >
>> > > # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
>> > > osd.0: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > > osd.1: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > > osd.2: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > > osd.3: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > > osd.4: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > >
>> > > # ceph -n osd.1 --show-config | grep osd_recovery_max_active
>> > > osd_recovery_max_active = 3
>> > >
>> > >
>> > >
>> > > On Wed, Apr 10, 2019 at 7:21 AM Eugen Block  wrote:
>> > >>
>> > >> > I always end up using "ceph --admin-daemon
>> > >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get 
>> > >> > what
>> > >> > is in effect now for a certain daemon.
>> > >> > Needs you to be on the host of the daemon of course.
>> > >>
>> > >> Me too, I just wanted to try what OP reported. And after trying that,
>> > >> I'll keep it that way. ;-)
>> > >>
>> > >>
>> > >> Zitat von Janne Johansson :
>> > >>
>> > >> > Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block :
>> > >> >
>> > >> >> > If you don't specify which daemon to talk to, it tells you what the
>> > >> >> > defaults would be for a random daemon started just now using the 
>> > >> >> > same
>> > >> >> > config as you have in /etc/ceph/ceph.conf.
>> > >> >>
>> > >> >> I tried that, too, but the result is not correct:
>> > >> >>
>> > >> >> host1:~ # ceph -n osd.1 --show-config | grep osd_recovery_max_active
>> > >> >> osd_recovery_max_active = 3
>> > >> >>
>> > >> >
>> > >> > I always end up using "ceph --admin-daemon
>> > >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get 
>> > >> > what
>> > >> > is in effect now for a certain daemon.
>> > >> > Needs you to be on the host of the daemon of course.
>> > >> >
>> > >> > --
>> > >> > May the most significant bit of your life be positive.
>> > >>
>> > >>
>> > >>
>> > >> ___
>> > >> ceph-users mailing list
>> > >> ceph-users@lists.ceph.com
>> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> > >
>> > > ___
>> > > ceph-users mailing list
>> > > ceph-users@lists.ceph.com
>> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>> --
>> Cheers,
>> Brad



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-17 Thread Brad Hubbard
Does it define _ZTIN13PriorityCache8PriCacheE? If it does, and all is
as you say, then it should not say that _ZTIN13PriorityCache8PriCacheE
is undefined. Does ldd show that it is finding the libraries you think
it is? Either it is finding a different version of that library
somewhere else or the version you have may not define that symbol.
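
Something like this should tell you (paths relative to your build dir):

$ ldd ./lib/libfio_ceph_objectstore.so | grep libceph-common
$ nm -D ./lib/libceph-common.so.0 | grep _ZTIN13PriorityCache8PriCacheE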

On Thu, Apr 18, 2019 at 11:12 AM Can Zhang  wrote:
>
> It's already in LD_LIBRARY_PATH, under the same directory of
> libfio_ceph_objectstore.so
>
>
> $ ll lib/|grep libceph-common
> lrwxrwxrwx. 1 root root19 Apr 17 11:15 libceph-common.so ->
> libceph-common.so.0
> -rwxr-xr-x. 1 root root 211853400 Apr 17 11:15 libceph-common.so.0
>
>
>
>
> Best,
> Can Zhang
>
> On Thu, Apr 18, 2019 at 7:00 AM Brad Hubbard  wrote:
> >
> > On Wed, Apr 17, 2019 at 1:37 PM Can Zhang  wrote:
> > >
> > > Thanks for your suggestions.
> > >
> > > I tried to build libfio_ceph_objectstore.so, but it fails to load:
> > >
> > > ```
> > > $ LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
> > >
> > > fio: engine libfio_ceph_objectstore.so not loadable
> > > IO engine libfio_ceph_objectstore.so not found
> > > ```
> > >
> > > I managed to print the dlopen error, it said:
> > >
> > > ```
> > > dlopen error: ./lib/libfio_ceph_objectstore.so: undefined symbol:
> > > _ZTIN13PriorityCache8PriCacheE
> >
> > $ c++filt _ZTIN13PriorityCache8PriCacheE
> > typeinfo for PriorityCache::PriCache
> >
> > $ sudo find /lib* /usr/lib* -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > /usr/lib64/ceph/libceph-common.so:008edab0 V
> > _ZTIN13PriorityCache8PriCacheE
> > /usr/lib64/ceph/libceph-common.so.0:008edab0 V
> > _ZTIN13PriorityCache8PriCacheE
> >
> > It needs to be able to find libceph-common, put it in your path or preload 
> > it.
> >
> > > ```
> > >
> > > I found a not-so-relevant
> > > issue(https://tracker.ceph.com/issues/38360), the error seems to be
> > > caused by mixed versions. My build environment is CentOS 7.5.1804 with
> > > SCL devtoolset-7, and ceph is latest master branch. Does someone know
> > > about the symbol?
> > >
> > >
> > > Best,
> > > Can Zhang
> > >
> > > Best,
> > > Can Zhang
> > >
> > >
> > > On Tue, Apr 16, 2019 at 8:37 PM Igor Fedotov  wrote:
> > > >
> > > > Besides already mentioned store_test.cc one can also use ceph
> > > > objectstore fio plugin
> > > > (https://github.com/ceph/ceph/tree/master/src/test/fio) to access
> > > > standalone BlueStore instance from FIO benchmarking tool.
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Igor
> > > >
> > > > On 4/16/2019 7:58 AM, Can ZHANG wrote:
> > > > > Hi,
> > > > >
> > > > > I'd like to run a standalone Bluestore instance so as to test and tune
> > > > > its performance. Are there any tools about it, or any suggestions?
> > > > >
> > > > >
> > > > >
> > > > > Best,
> > > > > Can Zhang
> > > > >
> > > > > ___
> > > > > ceph-users mailing list
> > > > > ceph-users@lists.ceph.com
> > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> > --
> > Cheers,
> > Brad



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-17 Thread Brad Hubbard
On Wed, Apr 17, 2019 at 1:37 PM Can Zhang  wrote:
>
> Thanks for your suggestions.
>
> I tried to build libfio_ceph_objectstore.so, but it fails to load:
>
> ```
> $ LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
>
> fio: engine libfio_ceph_objectstore.so not loadable
> IO engine libfio_ceph_objectstore.so not found
> ```
>
> I managed to print the dlopen error, it said:
>
> ```
> dlopen error: ./lib/libfio_ceph_objectstore.so: undefined symbol:
> _ZTIN13PriorityCache8PriCacheE

$ c++filt _ZTIN13PriorityCache8PriCacheE
typeinfo for PriorityCache::PriCache

$ sudo find /lib* /usr/lib* -iname '*.so*' | xargs nm -AD 2>&1 | grep
_ZTIN13PriorityCache8PriCacheE
/usr/lib64/ceph/libceph-common.so:008edab0 V
_ZTIN13PriorityCache8PriCacheE
/usr/lib64/ceph/libceph-common.so.0:008edab0 V
_ZTIN13PriorityCache8PriCacheE

It needs to be able to find libceph-common, put it in your path or preload it.
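
For example (the build tree path is just a placeholder, adjust it):

$ export LD_LIBRARY_PATH=/path/to/ceph/build/lib:$LD_LIBRARY_PATH
$ ./bin/fio --enghelp=libfio_ceph_objectstore.so
# or, as a quick test, preload it explicitly:
$ LD_PRELOAD=/path/to/ceph/build/lib/libceph-common.so.0 ./bin/fio --enghelp=libfio_ceph_objectstore.so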

> ```
>
> I found a not-so-relevant
> issue(https://tracker.ceph.com/issues/38360), the error seems to be
> caused by mixed versions. My build environment is CentOS 7.5.1804 with
> SCL devtoolset-7, and ceph is latest master branch. Does someone know
> about the symbol?
>
>
> Best,
> Can Zhang
>
> Best,
> Can Zhang
>
>
> On Tue, Apr 16, 2019 at 8:37 PM Igor Fedotov  wrote:
> >
> > Besides already mentioned store_test.cc one can also use ceph
> > objectstore fio plugin
> > (https://github.com/ceph/ceph/tree/master/src/test/fio) to access
> > standalone BlueStore instance from FIO benchmarking tool.
> >
> >
> > Thanks,
> >
> > Igor
> >
> > On 4/16/2019 7:58 AM, Can ZHANG wrote:
> > > Hi,
> > >
> > > I'd like to run a standalone Bluestore instance so as to test and tune
> > > its performance. Are there any tools about it, or any suggestions?
> > >
> > >
> > >
> > > Best,
> > > Can Zhang
> > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-18 Thread Brad Hubbard
Let me try to reproduce this on centos 7.5 with master and I'll let
you know how I go.

On Thu, Apr 18, 2019 at 3:59 PM Can Zhang  wrote:
>
> Using the commands you provided, I actually find some differences:
>
> On my CentOS VM:
> ```
> # sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> _ZTIN13PriorityCache8PriCacheE
> ./libceph-common.so:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> ./libceph-common.so.0:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> ./libfio_ceph_objectstore.so: U _ZTIN13PriorityCache8PriCacheE
> ```
> ```
> # ldd libfio_ceph_objectstore.so |grep common
> libceph-common.so.0 => /root/ceph/build/lib/libceph-common.so.0
> (0x7fd13f3e7000)
> ```
> On my Ubuntu VM:
> ```
> $ sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> _ZTIN13PriorityCache8PriCacheE
> ./libfio_ceph_objectstore.so:019d13e0 V _ZTIN13PriorityCache8PriCacheE
> ```
> ```
> $ ldd libfio_ceph_objectstore.so |grep common
> libceph-common.so.0 =>
> /home/can/work/ceph/build/lib/libceph-common.so.0 (0x7f024a89e000)
> ```
>
> Notice the "U" and "V" from nm results.
>
>
>
>
> Best,
> Can Zhang
>
> On Thu, Apr 18, 2019 at 9:36 AM Brad Hubbard  wrote:
> >
> > Does it define _ZTIN13PriorityCache8PriCacheE ? If it does, and all is
> > as you say, then it should not say that _ZTIN13PriorityCache8PriCacheE
> > is undefined. Does ldd show that it is finding the libraries you think
> > it is? Either it is finding a different version of that library
> > somewhere else or the version you have may not define that symbol.
> >
> > On Thu, Apr 18, 2019 at 11:12 AM Can Zhang  wrote:
> > >
> > > It's already in LD_LIBRARY_PATH, under the same directory of
> > > libfio_ceph_objectstore.so
> > >
> > >
> > > $ ll lib/|grep libceph-common
> > > lrwxrwxrwx. 1 root root19 Apr 17 11:15 libceph-common.so ->
> > > libceph-common.so.0
> > > -rwxr-xr-x. 1 root root 211853400 Apr 17 11:15 libceph-common.so.0
> > >
> > >
> > >
> > >
> > > Best,
> > > Can Zhang
> > >
> > > On Thu, Apr 18, 2019 at 7:00 AM Brad Hubbard  wrote:
> > > >
> > > > On Wed, Apr 17, 2019 at 1:37 PM Can Zhang  wrote:
> > > > >
> > > > > Thanks for your suggestions.
> > > > >
> > > > > I tried to build libfio_ceph_objectstore.so, but it fails to load:
> > > > >
> > > > > ```
> > > > > $ LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
> > > > >
> > > > > fio: engine libfio_ceph_objectstore.so not loadable
> > > > > IO engine libfio_ceph_objectstore.so not found
> > > > > ```
> > > > >
> > > > > I managed to print the dlopen error, it said:
> > > > >
> > > > > ```
> > > > > dlopen error: ./lib/libfio_ceph_objectstore.so: undefined symbol:
> > > > > _ZTIN13PriorityCache8PriCacheE
> > > >
> > > > $ c++filt _ZTIN13PriorityCache8PriCacheE
> > > > typeinfo for PriorityCache::PriCache
> > > >
> > > > $ sudo find /lib* /usr/lib* -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > /usr/lib64/ceph/libceph-common.so:008edab0 V
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > /usr/lib64/ceph/libceph-common.so.0:008edab0 V
> > > > _ZTIN13PriorityCache8PriCacheE
> > > >
> > > > It needs to be able to find libceph-common, put it in your path or 
> > > > preload it.
> > > >
> > > > > ```
> > > > >
> > > > > I found a not-so-relevant
> > > > > issue(https://tracker.ceph.com/issues/38360), the error seems to be
> > > > > caused by mixed versions. My build environment is CentOS 7.5.1804 with
> > > > > SCL devtoolset-7, and ceph is latest master branch. Does someone know
> > > > > about the symbol?
> > > > >
> > > > >
> > > > > Best,
> > > > > Can Zhang
> > > > >
> > > > > Best,
> > > > > Can Zhang
> > > > >
> > > > >
> > > > > On Tue, Apr 16, 2019 at 8:37 PM Igor Fedotov  wrote:
> > > > > >
> > > > > > Besides already mentioned store_test.cc one can also use ce

Re: [ceph-users] obj_size_info_mismatch error handling

2019-06-17 Thread Brad Hubbard
Can you open a tracker for this, Dan, and provide scrub logs with
debug_osd=20 and rados list-inconsistent-obj output?
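
Roughly like this, substituting the primary osd of the inconsistent pg
and the pgid:

$ ceph tell osd.NN injectargs '--debug_osd 20'
$ ceph pg deep-scrub <pgid>
# once the scrub completes:
$ rados list-inconsistent-obj <pgid> --format=json-pretty
$ ceph tell osd.NN injectargs '--debug_osd 1/5'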

On Mon, Jun 3, 2019 at 10:44 PM Dan van der Ster  wrote:
>
> Hi Reed and Brad,
>
> Did you ever learn more about this problem?
> We currently have a few inconsistencies arriving with the same env
> (cephfs, v13.2.5) and symptoms.
>
> PG Repair doesn't fix the inconsistency, nor does Brad's omap
> workaround earlier in the thread.
> In our case, we can fix by cp'ing the file to a new inode, deleting
> the inconsistent file, then scrubbing the PG.
>
> -- Dan
>
>
> On Fri, May 3, 2019 at 3:18 PM Reed Dier  wrote:
> >
> > Just to follow up for the sake of the mailing list,
> >
> > I had not had a chance to attempt your steps yet, but things appear to have 
> > worked themselves out on their own.
> >
> > Both scrub errors cleared without intervention, and I'm not sure if it is 
> > the results of that object getting touched in CephFS that triggered the 
> > update of the size info, or if something else was able to clear it.
> >
> > Didn't see anything relating to the clearing in mon, mgr, or osd logs.
> >
> > So, not entirely sure what fixed it, but it is resolved on its own.
> >
> > Thanks,
> >
> > Reed
> >
> > On Apr 30, 2019, at 8:01 PM, Brad Hubbard  wrote:
> >
> > On Wed, May 1, 2019 at 10:54 AM Brad Hubbard  wrote:
> >
> >
> > Which size is correct?
> >
> >
> > Sorry, accidental discharge =D
> >
> > If the object info size is *incorrect* try forcing a write to the OI
> > with something like the following.
> >
> > 1. rados -p [name_of_pool_17] setomapval 10008536718.
> > temporary-key anything
> > 2. ceph pg deep-scrub 17.2b9
> > 3. Wait for the scrub to finish
> > 4. rados -p [name_of_pool_2] rmomapkey 10008536718. temporary-key
> >
> > If the object info size is *correct* you could try just doing a rados
> > get followed by a rados put of the object to see if the size is
> > updated correctly.
> >
> > It's more likely the object info size is wrong IMHO.
> >
> >
> > On Tue, Apr 30, 2019 at 1:06 AM Reed Dier  wrote:
> >
> >
> > Hi list,
> >
> > Woke up this morning to two PG's reporting scrub errors, in a way that I 
> > haven't seen before.
> >
> > $ ceph versions
> > {
> >"mon": {
> >"ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 3
> >},
> >"mgr": {
> >"ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 3
> >},
> >"osd": {
> >"ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) 
> > mimic (stable)": 156
> >},
> >"mds": {
> >"ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 2
> >},
> >"overall": {
> >"ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) 
> > mimic (stable)": 156,
> >"ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 8
> >}
> > }
> >
> >
> > OSD_SCRUB_ERRORS 8 scrub errors
> > PG_DAMAGED Possible data damage: 2 pgs inconsistent
> >pg 17.72 is active+clean+inconsistent, acting [3,7,153]
> >pg 17.2b9 is active+clean+inconsistent, acting [19,7,16]
> >
> >
> > Here is what $rados list-inconsistent-obj 17.2b9 --format=json-pretty 
> > yields:
> >
> > {
> >"epoch": 134582,
> >"inconsistents": [
> >{
> >"object": {
> >"name": "10008536718.",
> >"nspace": "",
> >"locator": "",
> >"snap": "head",
> >"version": 0
> >},
> >"errors": [],
> >"union_shard_errors": [
> >"obj_size_info_mismatch"
> >],
> >"shards": [
> >{
> >"osd": 7,
> >"primary": false,
> >"errors": [
> >"obj_size_info_mismatch"
> > 

Re: [ceph-users] obj_size_info_mismatch error handling

2019-04-30 Thread Brad Hubbard
On Wed, May 1, 2019 at 10:54 AM Brad Hubbard  wrote:
>
> Which size is correct?

Sorry, accidental discharge =D

If the object info size is *incorrect* try forcing a write to the OI
with something like the following.

1. rados -p [name_of_pool_17] setomapval 10008536718.
temporary-key anything
2. ceph pg deep-scrub 17.2b9
3. Wait for the scrub to finish
4. rados -p [name_of_pool_2] rmomapkey 10008536718. temporary-key

If the object info size is *correct* you could try just doing a rados
get followed by a rados put of the object to see if the size is
updated correctly.

It's more likely the object info size is wrong IMHO.
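
The get/put round trip would look something like this (pool and object
names are placeholders, use the ones from the scrub output):

$ rados -p <cephfs_data_pool> get <object_name> /tmp/obj
$ rados -p <cephfs_data_pool> put <object_name> /tmp/obj
$ ceph pg deep-scrub 17.2b9          # then re-check with a deep scrub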

>
> On Tue, Apr 30, 2019 at 1:06 AM Reed Dier  wrote:
> >
> > Hi list,
> >
> > Woke up this morning to two PG's reporting scrub errors, in a way that I 
> > haven't seen before.
> >
> > $ ceph versions
> > {
> > "mon": {
> > "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 3
> > },
> > "mgr": {
> > "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 3
> > },
> > "osd": {
> > "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) 
> > mimic (stable)": 156
> > },
> > "mds": {
> > "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 2
> > },
> > "overall": {
> > "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) 
> > mimic (stable)": 156,
> > "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 8
> > }
> > }
> >
> >
> > OSD_SCRUB_ERRORS 8 scrub errors
> > PG_DAMAGED Possible data damage: 2 pgs inconsistent
> > pg 17.72 is active+clean+inconsistent, acting [3,7,153]
> > pg 17.2b9 is active+clean+inconsistent, acting [19,7,16]
> >
> >
> > Here is what $rados list-inconsistent-obj 17.2b9 --format=json-pretty 
> > yields:
> >
> > {
> > "epoch": 134582,
> > "inconsistents": [
> > {
> > "object": {
> > "name": "10008536718.",
> > "nspace": "",
> > "locator": "",
> > "snap": "head",
> > "version": 0
> > },
> > "errors": [],
> > "union_shard_errors": [
> > "obj_size_info_mismatch"
> > ],
> > "shards": [
> > {
> > "osd": 7,
> > "primary": false,
> > "errors": [
> > "obj_size_info_mismatch"
> > ],
> > "size": 5883,
> > "object_info": {
> > "oid": {
> > "oid": "10008536718.",
> > "key": "",
> > "snapid": -2,
> > "hash": 1752643257,
> > "max": 0,
> > "pool": 17,
> > "namespace": ""
> > },
> > "version": "134599'448331",
> > "prior_version": "134599'448330",
> > "last_reqid": "client.1580931080.0:671854",
> > "user_version": 448331,
> > "size": 3505,
> > "mtime": "2019-04-28 15:32:20.003519",
> > "local_mtime": "2019-04-28 15:32:25.991015",
> > "lost": 0,
> > "flags": [
> > "dirty",
> > "data_digest",
> > "omap_digest"
> > ],
> > "truncat

Re: [ceph-users] obj_size_info_mismatch error handling

2019-04-30 Thread Brad Hubbard
Which size is correct?

On Tue, Apr 30, 2019 at 1:06 AM Reed Dier  wrote:
>
> Hi list,
>
> Woke up this morning to two PG's reporting scrub errors, in a way that I 
> haven't seen before.
>
> $ ceph versions
> {
> "mon": {
> "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic 
> (stable)": 3
> },
> "mgr": {
> "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic 
> (stable)": 3
> },
> "osd": {
> "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic 
> (stable)": 156
> },
> "mds": {
> "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic 
> (stable)": 2
> },
> "overall": {
> "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic 
> (stable)": 156,
> "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic 
> (stable)": 8
> }
> }
>
>
> OSD_SCRUB_ERRORS 8 scrub errors
> PG_DAMAGED Possible data damage: 2 pgs inconsistent
> pg 17.72 is active+clean+inconsistent, acting [3,7,153]
> pg 17.2b9 is active+clean+inconsistent, acting [19,7,16]
>
>
> Here is what $rados list-inconsistent-obj 17.2b9 --format=json-pretty yields:
>
> {
> "epoch": 134582,
> "inconsistents": [
> {
> "object": {
> "name": "10008536718.",
> "nspace": "",
> "locator": "",
> "snap": "head",
> "version": 0
> },
> "errors": [],
> "union_shard_errors": [
> "obj_size_info_mismatch"
> ],
> "shards": [
> {
> "osd": 7,
> "primary": false,
> "errors": [
> "obj_size_info_mismatch"
> ],
> "size": 5883,
> "object_info": {
> "oid": {
> "oid": "10008536718.",
> "key": "",
> "snapid": -2,
> "hash": 1752643257,
> "max": 0,
> "pool": 17,
> "namespace": ""
> },
> "version": "134599'448331",
> "prior_version": "134599'448330",
> "last_reqid": "client.1580931080.0:671854",
> "user_version": 448331,
> "size": 3505,
> "mtime": "2019-04-28 15:32:20.003519",
> "local_mtime": "2019-04-28 15:32:25.991015",
> "lost": 0,
> "flags": [
> "dirty",
> "data_digest",
> "omap_digest"
> ],
> "truncate_seq": 899,
> "truncate_size": 0,
> "data_digest": "0xf99a3bd3",
> "omap_digest": "0x",
> "expected_object_size": 0,
> "expected_write_size": 0,
> "alloc_hint_flags": 0,
> "manifest": {
> "type": 0
> },
> "watchers": {}
> }
> },
> {
> "osd": 16,
> "primary": false,
> "errors": [
> "obj_size_info_mismatch"
> ],
> "size": 5883,
> "object_info": {
> "oid": {
> "oid": "10008536718.",
> "key": "",
> "snapid": -2,
> "hash": 1752643257,
> "max": 0,
> "pool": 17,
> "namespace": ""
> },
> "version": "134599'448331",
> "prior_version": "134599'448330",
> "last_reqid": "client.1580931080.0:671854",
> "user_version": 448331,
> "size": 3505,
> "mtime": "2019-04-28 15:32:20.003519",
> "local_mtime": "2019-04-28 15:32:25.991015",
> "lost": 0,
> "flags": [
> "dirty",
> "data_digest",
> "omap_digest"
> ],
> "truncate_seq": 899,
> "truncate_size": 0,
> "data_digest": "0xf99a3bd3",
>   

Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-19 Thread Brad Hubbard
OK. So this works for me with master commit
bdaac2d619d603f53a16c07f9d7bd47751137c4c on Centos 7.5.1804.

I cloned the repo and ran './install-deps.sh' and './do_cmake.sh
-DWITH_FIO=ON' then 'make all'.

# find ./lib  -iname '*.so*' | xargs nm -AD 2>&1 | grep
_ZTIN13PriorityCache8PriCacheE
./lib/libfio_ceph_objectstore.so:018f72d0 V
_ZTIN13PriorityCache8PriCacheE

# LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
conf: Path to a ceph configuration file
oi_attr_len : Set OI(aka '_') attribute to specified length
snapset_attr_len: Set 'snapset' attribute to specified length
_fastinfo_omap_len  : Set '_fastinfo' OMAP attribute to specified length
pglog_simulation: Enables PG Log simulation behavior
pglog_omap_len  : Set pglog omap entry to specified length
pglog_dup_omap_len  : Set duplicate pglog omap entry to specified length
single_pool_mode: Enables the mode when all jobs run against
the same pool
preallocate_files   : Enables/disables file preallocation (touch
and resize) on init

So my result above matches your result on Ubuntu but not on CentOS. It
looks to me like the symbol used to be defined in libceph-common but is
currently defined in libfio_ceph_objectstore.so. For reasons that are
unclear you are seeing the old behaviour. Why that is, and why it isn't
working as designed, is not clear to me, but I suspect that if you clone
the repo again and build from scratch (maybe in a different directory if
you wish to keep debugging; see below) you should get a working
result. Could you try that as a test?

If, on the other hand, you wish to keep debugging your current
environment I'd suggest looking at the output of the following command
as it may shed further light on the issue.

# LD_DEBUG=all LD_LIBRARY_PATH=./lib ./bin/fio
--enghelp=libfio_ceph_objectstore.so

'LD_DEBUG=lib' may suffice but that's difficult to judge without
knowing what the problem is. I still suspect somehow you have
mis-matched libraries and, if that's the case, it's probably not worth
pursuing. If you can give me specific steps so I can reproduce this
from a freshly cloned tree I'd be happy to look further into it.

Good luck.

On Thu, Apr 18, 2019 at 7:00 PM Brad Hubbard  wrote:
>
> Let me try to reproduce this on centos 7.5 with master and I'll let
> you know how I go.
>
> On Thu, Apr 18, 2019 at 3:59 PM Can Zhang  wrote:
> >
> > Using the commands you provided, I actually find some differences:
> >
> > On my CentOS VM:
> > ```
> > # sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > ./libceph-common.so:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > ./libceph-common.so.0:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > ./libfio_ceph_objectstore.so: U 
> > _ZTIN13PriorityCache8PriCacheE
> > ```
> > ```
> > # ldd libfio_ceph_objectstore.so |grep common
> > libceph-common.so.0 => /root/ceph/build/lib/libceph-common.so.0
> > (0x7fd13f3e7000)
> > ```
> > On my Ubuntu VM:
> > ```
> > $ sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > ./libfio_ceph_objectstore.so:019d13e0 V 
> > _ZTIN13PriorityCache8PriCacheE
> > ```
> > ```
> > $ ldd libfio_ceph_objectstore.so |grep common
> > libceph-common.so.0 =>
> > /home/can/work/ceph/build/lib/libceph-common.so.0 (0x7f024a89e000)
> > ```
> >
> > Notice the "U" and "V" from nm results.
> >
> >
> >
> >
> > Best,
> > Can Zhang
> >
> > On Thu, Apr 18, 2019 at 9:36 AM Brad Hubbard  wrote:
> > >
> > > Does it define _ZTIN13PriorityCache8PriCacheE ? If it does, and all is
> > > as you say, then it should not say that _ZTIN13PriorityCache8PriCacheE
> > > is undefined. Does ldd show that it is finding the libraries you think
> > > it is? Either it is finding a different version of that library
> > > somewhere else or the version you have may not define that symbol.
> > >
> > > On Thu, Apr 18, 2019 at 11:12 AM Can Zhang  wrote:
> > > >
> > > > It's already in LD_LIBRARY_PATH, under the same directory of
> > > > libfio_ceph_objectstore.so
> > > >
> > > >
> > > > $ ll lib/|grep libceph-common
> > > > lrwxrwxrwx. 1 root root19 Apr 17 11:15 libceph-common.so ->
> > > > libceph-common.so.0
> > > > -rwxr-xr-x. 1 root root 211853400 Apr 17 11:15 libceph-common.so.0
> > > >
> > > >
> > > >
> > > >
> > > > Best,
> > > > Can Zhang

Re: [ceph-users] set_mon_vals failed to set cluster_network Configuration option 'cluster_network' may not be modified at runtime

2019-07-02 Thread Brad Hubbard
I'd suggest creating a tracker similar to
http://tracker.ceph.com/issues/40554, which was created for the issue
in the thread you mentioned.
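
When you file it, it would help to attach what the mon config store still
contains for those options, e.g.:

$ ceph config dump | grep -E 'cluster_network|public_network'
$ ceph config get client cluster_network
$ ceph config get client public_network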

On Wed, Jul 3, 2019 at 12:29 AM Vandeir Eduardo
 wrote:
>
> Hi,
>
> on client machines, when I use the command rbd, for example, rbd ls
> poolname, this message is always displayed:
>
> 2019-07-02 11:18:10.613 7fb2eaffd700 -1 set_mon_vals failed to set
> cluster_network = 10.1.2.0/24: Configuration option 'cluster_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.613 7fb2eaffd700 -1 set_mon_vals failed to set
> public_network = 10.1.1.0/24: Configuration option 'public_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.621 7fb2ea7fc700 -1 set_mon_vals failed to set
> cluster_network = 10.1.2.0/24: Configuration option 'cluster_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.621 7fb2ea7fc700 -1 set_mon_vals failed to set
> public_network = 10.1.1.0/24: Configuration option 'public_network'
> may not be modified at runtime
>
> After this, rbd image names are displayed normally.
>
> If I run this command on a ceph node, this "warning/information???"
> messages are not displayed. Is there a way to get ride of this? Its
> really annoying.
>
> The only thread I found about something similar was this:
> https://www.spinics.net/lists/ceph-devel/msg42657.html
>
> I already tryied the commands "ceph config rm global cluster_network"
> and "ceph config rm global public_network", but the messages still
> persist.
>
> Any ideas?
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-06-27 Thread Brad Hubbard
On Thu, Jun 27, 2019 at 8:58 PM nokia ceph  wrote:
>
> Hi Team,
>
> We have a requirement to create multiple copies of an object and currently we 
> are handling it in client side to write as separate objects and this causes 
> huge network traffic between client and cluster.
> Is there possibility of cloning an object to multiple copies using librados 
> api?
> Please share the document details if it is feasible.

It may be possible to use an object class to accomplish what you want
to achieve, but the more we understand about what you are trying to do, the
better the advice we can offer (at the moment your description sounds
like replication, which, as you know, is already part of RADOS).

More on object classes from Cephalocon Barcelona in May this year:
https://www.youtube.com/watch?v=EVrP9MXiiuU

>
> Thanks,
> Muthu
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-07-02 Thread Brad Hubbard
On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum  wrote:
>
> I'm not sure how or why you'd get an object class involved in doing
> this in the normal course of affairs.
>
> There's a copy_from op that a client can send and which copies an
> object from another OSD into the target object. That's probably the
> primitive you want to build on. Note that the OSD doesn't do much

Argh! yes, good idea. We really should document that!
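
For a quick experiment there is also "rados cp", though I haven't checked
whether it goes through copy_from on the OSD side or just does a get/put
from the client side:

$ rados -p <pool> cp objectx objectx1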

> consistency checking (it validates that the object version matches an
> input, but if they don't it just returns an error) so the client
> application is responsible for any locking needed.
> -Greg
>
> On Tue, Jul 2, 2019 at 3:49 AM Brad Hubbard  wrote:
> >
> > Yes, this should be possible using an object class which is also a
> > RADOS client (via the RADOS API). You'll still have some client
> > traffic as the machine running the object class will still need to
> > connect to the relevant primary osd and send the write (presumably in
> > some situations though this will be the same machine).
> >
> > On Tue, Jul 2, 2019 at 4:08 PM nokia ceph  wrote:
> > >
> > > Hi Brett,
> > >
> > > I think I was wrong here in the requirement description. It is not about 
> > > data replication , we need same content stored in different object/name.
> > > We store video contents inside the ceph cluster. And our new requirement 
> > > is we need to store same content for different users , hence need same 
> > > content in different object name . if client sends write request for 
> > > object x and sets number of copies as 100, then cluster has to clone 100 
> > > copies of object x and store it as object x1, objectx2,etc. Currently 
> > > this is done in the client side where objectx1, object x2...objectx100 
> > > are cloned inside the client and write request sent for all 100 objects 
> > > which we want to avoid to reduce network consumption.
> > >
> > > Similar usecases are rbd snapshot , radosgw copy .
> > >
> > > Is this possible in object class ?
> > >
> > > thanks,
> > > Muthu
> > >
> > >
> > > On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor 
> > >  wrote:
> > >>
> > >> Ceph already does this by default. For each replicated pool, you can set 
> > >> the 'size' which is the number of copies you want Ceph to maintain. The 
> > >> accepted norm for replicas is 3, but you can set it higher if you want 
> > >> to incur the performance penalty.
> > >>
> > >> On Mon, Jul 1, 2019, 6:01 AM nokia ceph  wrote:
> > >>>
> > >>> Hi Brad,
> > >>>
> > >>> Thank you for your response , and we will check this video as well.
> > >>> Our requirement is while writing an object into the cluster , if we can 
> > >>> provide number of copies to be made , the network consumption between 
> > >>> client and cluster will be only for one object write. However , the 
> > >>> cluster will clone/copy multiple objects and stores inside the cluster.
> > >>>
> > >>> Thanks,
> > >>> Muthu
> > >>>
> > >>> On Fri, Jun 28, 2019 at 9:23 AM Brad Hubbard  
> > >>> wrote:
> > >>>>
> > >>>> On Thu, Jun 27, 2019 at 8:58 PM nokia ceph  
> > >>>> wrote:
> > >>>> >
> > >>>> > Hi Team,
> > >>>> >
> > >>>> > We have a requirement to create multiple copies of an object and 
> > >>>> > currently we are handling it in client side to write as separate 
> > >>>> > objects and this causes huge network traffic between client and 
> > >>>> > cluster.
> > >>>> > Is there possibility of cloning an object to multiple copies using 
> > >>>> > librados api?
> > >>>> > Please share the document details if it is feasible.
> > >>>>
> > >>>> It may be possible to use an object class to accomplish what you want
> > >>>> to achieve but the more we understand what you are trying to do, the
> > >>>> better the advice we can offer (at the moment your description sounds
> > >>>> like replication which is already part of RADOS as you know).
> > >>>>
> > >>>> More on object classes from Cephalocon Barcelona in May this year:
> > >>>> https://www.youtube.com/watch?v=EVrP9MXiiuU
> > >>>>
> > >>>> >
> > >>>> > Thanks,
> > >>>> > Muthu
> > >>>> > ___
> > >>>> > ceph-users mailing list
> > >>>> > ceph-users@lists.ceph.com
> > >>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Cheers,
> > >>>> Brad
> > >>>
> > >>> ___
> > >>> ceph-users mailing list
> > >>> ceph-users@lists.ceph.com
> > >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> > --
> > Cheers,
> > Brad
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-21 Thread Brad Hubbard
Glad it worked.

On Mon, Apr 22, 2019 at 11:01 AM Can Zhang  wrote:
>
> Thanks for your detailed response.
>
> I freshly installed a CentOS 7.6 and run install-deps.sh and
> do_cmake.sh this time, and it works this time. Maybe the problem was
> caused by dirty environment.
>
>
> Best,
> Can Zhang
>
>
> On Fri, Apr 19, 2019 at 6:28 PM Brad Hubbard  wrote:
> >
> > OK. So this works for me with master commit
> > bdaac2d619d603f53a16c07f9d7bd47751137c4c on Centos 7.5.1804.
> >
> > I cloned the repo and ran './install-deps.sh' and './do_cmake.sh
> > -DWITH_FIO=ON' then 'make all'.
> >
> > # find ./lib  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > ./lib/libfio_ceph_objectstore.so:018f72d0 V
> > _ZTIN13PriorityCache8PriCacheE
> >
> > # LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
> > conf: Path to a ceph configuration file
> > oi_attr_len : Set OI(aka '_') attribute to specified length
> > snapset_attr_len: Set 'snapset' attribute to specified length
> > _fastinfo_omap_len  : Set '_fastinfo' OMAP attribute to specified length
> > pglog_simulation: Enables PG Log simulation behavior
> > pglog_omap_len  : Set pglog omap entry to specified length
> > pglog_dup_omap_len  : Set duplicate pglog omap entry to specified length
> > single_pool_mode: Enables the mode when all jobs run against
> > the same pool
> > preallocate_files   : Enables/disables file preallocation (touch
> > and resize) on init
> >
> > So my result above matches your result on ubuntu but not on centos. It
> > looks to me like we used to define in libceph-common but currently
> > it's defined in libfio_ceph_objectstore.so. For reasons that are
> > unclear you are seeing the old behaviour. Why this is and why it isn't
> > working as designed is not clear to me but I suspect if you clone the
> > repo again and build from scratch (maybe in a different directory if
> > you wish to keep debugging, see below) you should get a working
> > result. Could you try that as a test?
> >
> > If, on the other hand, you wish to keep debugging your current
> > environment I'd suggest looking at the output of the following command
> > as it may shed further light on the issue.
> >
> > # LD_DEBUG=all LD_LIBRARY_PATH=./lib ./bin/fio
> > --enghelp=libfio_ceph_objectstore.so
> >
> > 'LD_DEBUG=lib' may suffice but that's difficult to judge without
> > knowing what the problem is. I still suspect somehow you have
> > mis-matched libraries and, if that's the case, it's probably not worth
> > pursuing. If you can give me specific steps so I can reproduce this
> > from a freshly cloned tree I'd be happy to look further into it.
> >
> > Good luck.
> >
> > On Thu, Apr 18, 2019 at 7:00 PM Brad Hubbard  wrote:
> > >
> > > Let me try to reproduce this on centos 7.5 with master and I'll let
> > > you know how I go.
> > >
> > > On Thu, Apr 18, 2019 at 3:59 PM Can Zhang  wrote:
> > > >
> > > > Using the commands you provided, I actually find some differences:
> > > >
> > > > On my CentOS VM:
> > > > ```
> > > > # sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > ./libceph-common.so:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > > > ./libceph-common.so.0:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > > > ./libfio_ceph_objectstore.so: U 
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > ```
> > > > ```
> > > > # ldd libfio_ceph_objectstore.so |grep common
> > > > libceph-common.so.0 => /root/ceph/build/lib/libceph-common.so.0
> > > > (0x7fd13f3e7000)
> > > > ```
> > > > On my Ubuntu VM:
> > > > ```
> > > > $ sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > ./libfio_ceph_objectstore.so:019d13e0 V 
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > ```
> > > > ```
> > > > $ ldd libfio_ceph_objectstore.so |grep common
> > > > libceph-common.so.0 =>
> > > > /home/can/work/ceph/build/lib/libceph-common.so.0 (0x7f024a89e000)
> > > > ```
> > > >
> > > > Notice the "U" and "V" from nm results.
>

Re: [ceph-users] showing active config settings

2019-04-16 Thread Brad Hubbard
On Tue, Apr 16, 2019 at 6:03 PM Paul Emmerich  wrote:
>
> This works, it just says that it *might* require a restart, but this
> particular option takes effect without a restart.

We've already looked at changing the wording once to make it more palatable.

http://tracker.ceph.com/issues/18424

>
> Implementation detail: this message shows up if there's no internal
> function to be called when this option changes, so it can't be sure if
> the change is actually doing anything because the option might be
> cached or only read on startup. But in this case this option is read
> in the relevant path every time and no notification is required. But
> the injectargs command can't know that.

Right on all counts. The functions are referred to as observers and
register to be notified if the value changes, hence "not observed."
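
(For the record, the authoritative way to see what a running daemon is
actually using is to ask the daemon itself, e.g. on the host running
osd.1:

  ceph daemon osd.1 config get osd_recovery_max_active

which sidesteps the --show-config confusion quoted below.)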

>
> Paul
>
> On Mon, Apr 15, 2019 at 11:38 PM solarflow99  wrote:
> >
> > Then why doesn't this work?
> >
> > # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
> > osd.0: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> > osd.1: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> > osd.2: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> > osd.3: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> > osd.4: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> >
> > # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> > osd_recovery_max_active = 3
> >
> >
> >
> > On Wed, Apr 10, 2019 at 7:21 AM Eugen Block  wrote:
> >>
> >> > I always end up using "ceph --admin-daemon
> >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get 
> >> > what
> >> > is in effect now for a certain daemon.
> >> > Needs you to be on the host of the daemon of course.
> >>
> >> Me too, I just wanted to try what OP reported. And after trying that,
> >> I'll keep it that way. ;-)
> >>
> >>
> >> Zitat von Janne Johansson :
> >>
> >> > Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block :
> >> >
> >> >> > If you don't specify which daemon to talk to, it tells you what the
> >> >> > defaults would be for a random daemon started just now using the same
> >> >> > config as you have in /etc/ceph/ceph.conf.
> >> >>
> >> >> I tried that, too, but the result is not correct:
> >> >>
> >> >> host1:~ # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> >> >> osd_recovery_max_active = 3
> >> >>
> >> >
> >> > I always end up using "ceph --admin-daemon
> >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get 
> >> > what
> >> > is in effect now for a certain daemon.
> >> > Needs you to be on the host of the daemon of course.
> >> >
> >> > --
> >> > May the most significant bit of your life be positive.
> >>
> >>
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-07-02 Thread Brad Hubbard
Yes, this should be possible using an object class which is also a
RADOS client (via the RADOS API). You'll still have some client
traffic as the machine running the object class will still need to
connect to the relevant primary osd and send the write (presumably in
some situations though this will be the same machine).
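
For completeness, the object class route would start from a skeleton
like the one below (modelled loosely on src/cls/hello/cls_hello.cc in
the tree; the class/method names here are made up for illustration and
macro/type details can differ between releases). Note that a cls method
can only operate on the object it was invoked against, which is why the
fan-out to objectx1..objectxN would need the embedded RADOS client
mentioned above; that part is deliberately omitted:

  #include "objclass/objclass.h"

  CLS_VER(1,0)
  CLS_NAME(clone_demo)

  cls_handle_t h_class;
  cls_method_handle_t h_make_copies;

  // Invoked from a client via IoCtx::exec(oid, "clone_demo", "make_copies", in, out)
  static int make_copies(cls_method_context_t hctx, bufferlist *in,
                         bufferlist *out)
  {
    uint64_t size = 0;
    int r = cls_cxx_stat(hctx, &size, NULL);
    if (r < 0)
      return r;
    bufferlist data;
    r = cls_cxx_read(hctx, 0, size, &data);
    if (r < 0)
      return r;
    // This is where the class would act as a RADOS client and write
    // the N clones server-side (omitted in this sketch).
    return 0;
  }

  CLS_INIT(clone_demo)
  {
    cls_register("clone_demo", &h_class);
    cls_register_cxx_method(h_class, "make_copies",
                            CLS_METHOD_RD | CLS_METHOD_WR,
                            make_copies, &h_make_copies);
  }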

On Tue, Jul 2, 2019 at 4:08 PM nokia ceph  wrote:
>
> Hi Brett,
>
> I think I was wrong here in the requirement description. It is not about data 
> replication , we need same content stored in different object/name.
> We store video contents inside the ceph cluster. And our new requirement is 
> we need to store same content for different users , hence need same content 
> in different object name . if client sends write request for object x and 
> sets number of copies as 100, then cluster has to clone 100 copies of object 
> x and store it as object x1, objectx2,etc. Currently this is done in the 
> client side where objectx1, object x2...objectx100 are cloned inside the 
> client and write request sent for all 100 objects which we want to avoid to 
> reduce network consumption.
>
> Similar usecases are rbd snapshot , radosgw copy .
>
> Is this possible in object class ?
>
> thanks,
> Muthu
>
>
> On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor  
> wrote:
>>
>> Ceph already does this by default. For each replicated pool, you can set the 
>> 'size' which is the number of copies you want Ceph to maintain. The accepted 
>> norm for replicas is 3, but you can set it higher if you want to incur the 
>> performance penalty.
>>
>> On Mon, Jul 1, 2019, 6:01 AM nokia ceph  wrote:
>>>
>>> Hi Brad,
>>>
>>> Thank you for your response , and we will check this video as well.
>>> Our requirement is while writing an object into the cluster , if we can 
>>> provide number of copies to be made , the network consumption between 
>>> client and cluster will be only for one object write. However , the cluster 
>>> will clone/copy multiple objects and stores inside the cluster.
>>>
>>> Thanks,
>>> Muthu
>>>
>>> On Fri, Jun 28, 2019 at 9:23 AM Brad Hubbard  wrote:
>>>>
>>>> On Thu, Jun 27, 2019 at 8:58 PM nokia ceph  
>>>> wrote:
>>>> >
>>>> > Hi Team,
>>>> >
>>>> > We have a requirement to create multiple copies of an object and 
>>>> > currently we are handling it in client side to write as separate objects 
>>>> > and this causes huge network traffic between client and cluster.
>>>> > Is there possibility of cloning an object to multiple copies using 
>>>> > librados api?
>>>> > Please share the document details if it is feasible.
>>>>
>>>> It may be possible to use an object class to accomplish what you want
>>>> to achieve but the more we understand what you are trying to do, the
>>>> better the advice we can offer (at the moment your description sounds
>>>> like replication which is already part of RADOS as you know).
>>>>
>>>> More on object classes from Cephalocon Barcelona in May this year:
>>>> https://www.youtube.com/watch?v=EVrP9MXiiuU
>>>>
>>>> >
>>>> > Thanks,
>>>> > Muthu
>>>> > ___
>>>> > ceph-users mailing list
>>>> > ceph-users@lists.ceph.com
>>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>>>
>>>> --
>>>> Cheers,
>>>> Brad
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

2019-08-18 Thread Brad Hubbard
On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan  wrote:
>
> Paul,
>
> Thanks for the reply.  All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1

That's this code.

3114   switch (alg) {
3115   case CRUSH_BUCKET_UNIFORM:
3116 size = sizeof(crush_bucket_uniform);
3117 break;
3118   case CRUSH_BUCKET_LIST:
3119 size = sizeof(crush_bucket_list);
3120 break;
3121   case CRUSH_BUCKET_TREE:
3122 size = sizeof(crush_bucket_tree);
3123 break;
3124   case CRUSH_BUCKET_STRAW:
3125 size = sizeof(crush_bucket_straw);
3126 break;
3127   case CRUSH_BUCKET_STRAW2:
3128 size = sizeof(crush_bucket_straw2);
3129 break;
3130   default:
3131 {
3132   char str[128];
3133   snprintf(str, sizeof(str), "unsupported bucket algorithm:
%d", alg);
3134   throw buffer::malformed_input(str);
3135 }
3136   }

CRUSH_BUCKET_UNIFORM = 1
CRUSH_BUCKET_LIST = 2
CRUSH_BUCKET_TREE = 3
CRUSH_BUCKET_STRAW = 4
CRUSH_BUCKET_STRAW2 = 5

So valid values for bucket algorithms are 1 through 5 but, for
whatever reason, at least one of yours is being interpreted as "-1".

This doesn't seem like something that would just happen spontaneously
with no changes to the cluster.

What recent changes have you made to the osdmap? What recent changes
have you made to the crushmap? Have you recently upgraded?
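
You could also sanity check what the crushmap actually contains, e.g.
something like:

  ceph osd getcrushmap -o crush.bin
  crushtool -d crush.bin -o crush.txt
  grep alg crush.txt

Every "alg" line in the decompiled map should be one of
uniform/list/tree/straw/straw2.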

> *** Caught signal (Aborted) **
>   in thread 7f945ee04f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f94531935d0]
>   2: (gsignal()+0x37) [0x7f9451d80207]
>   3: (abort()+0x148) [0x7f9451d818f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f945268f7d5]
>   5: (()+0x5e746) [0x7f945268d746]
>   6: (()+0x5e773) [0x7f945268d773]
>   7: (__cxa_rethrow()+0x49) [0x7f945268d9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f94553218d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f94550ff4ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9455101db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55de1f9a6e60]
>   12: (main()+0x5340) [0x55de1f8c8870]
>   13: (__libc_start_main()+0xf5) [0x7f9451d6c3d5]
>   14: (()+0x3adc10) [0x55de1f9a1c10]
> Aborted
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap46
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f9ce4135f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f9cd84c45d0]
>   2: (gsignal()+0x37) [0x7f9cd70b1207]
>   3: (abort()+0x148) [0x7f9cd70b28f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f9cd79c07d5]
>   5: (()+0x5e746) [0x7f9cd79be746]
>   6: (()+0x5e773) [0x7f9cd79be773]
>   7: (__cxa_rethrow()+0x49) [0x7f9cd79be9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f9cda6528d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f9cda4304ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9cda432db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55cea26c8e60]
>   12: (main()+0x5340) [0x55cea25ea870]
>   13: (__libc_start_main()+0xf5) [0x7f9cd709d3d5]
>   14: (()+0x3adc10) [0x55cea26c3c10]
> Aborted
>
> -[~:#]- ceph osd getmap -o osdmap
> got osdmap epoch 81298
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
>
>
> On 8/14/19 2:54 AM, Paul Emmerich wrote:
> > Starting point to debug/fix this would be to extract the osdmap from
> > one of the dead OSDs:
> >
> > ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
> >
> > Then try to run osdmaptool on that osdmap to see if it also crashes,
> > set some --debug options (don't know which one off the top of my
> > head).
> > Does it also crash? How does it differ from the map retrieved with
> > "ceph osd getmap"?
> >
> > You can also set the osdmap with "--op set-osdmap", does it help to
> > set the osdmap retrieved by "ceph osd getmap"?
> >
> > Paul
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 

Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

2019-08-18 Thread Brad Hubbard
On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan  wrote:
>
> Paul,
>
> Thanks for the reply.  All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f945ee04f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f94531935d0]
>   2: (gsignal()+0x37) [0x7f9451d80207]
>   3: (abort()+0x148) [0x7f9451d818f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f945268f7d5]
>   5: (()+0x5e746) [0x7f945268d746]
>   6: (()+0x5e773) [0x7f945268d773]
>   7: (__cxa_rethrow()+0x49) [0x7f945268d9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f94553218d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f94550ff4ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9455101db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55de1f9a6e60]
>   12: (main()+0x5340) [0x55de1f8c8870]
>   13: (__libc_start_main()+0xf5) [0x7f9451d6c3d5]
>   14: (()+0x3adc10) [0x55de1f9a1c10]
> Aborted
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap46
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f9ce4135f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f9cd84c45d0]
>   2: (gsignal()+0x37) [0x7f9cd70b1207]
>   3: (abort()+0x148) [0x7f9cd70b28f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f9cd79c07d5]
>   5: (()+0x5e746) [0x7f9cd79be746]
>   6: (()+0x5e773) [0x7f9cd79be773]
>   7: (__cxa_rethrow()+0x49) [0x7f9cd79be9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f9cda6528d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f9cda4304ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9cda432db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55cea26c8e60]
>   12: (main()+0x5340) [0x55cea25ea870]
>   13: (__libc_start_main()+0xf5) [0x7f9cd709d3d5]
>   14: (()+0x3adc10) [0x55cea26c3c10]
> Aborted
>
> -[~:#]- ceph osd getmap -o osdmap
> got osdmap epoch 81298
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.

819   auto ch = store->open_collection(coll_t::meta());
 820   const ghobject_t full_oid = OSD::get_osdmap_pobject_name(e);
 821   if (!store->exists(ch, full_oid)) {
 822 cerr << "osdmap (" << full_oid << ") does not exist." << std::endl;
 823 if (!force) {
 824   return -ENOENT;
 825 }
 826 cout << "Creating a new epoch." << std::endl;
 827   }

Adding "--force"should get you past that error.

>
>
>
> On 8/14/19 2:54 AM, Paul Emmerich wrote:
> > Starting point to debug/fix this would be to extract the osdmap from
> > one of the dead OSDs:
> >
> > ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
> >
> > Then try to run osdmaptool on that osdmap to see if it also crashes,
> > set some --debug options (don't know which one off the top of my
> > head).
> > Does it also crash? How does it differ from the map retrieved with
> > "ceph osd getmap"?
> >
> > You can also set the osdmap with "--op set-osdmap", does it help to
> > set the osdmap retrieved by "ceph osd getmap"?
> >
> > Paul
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Possibly a bug on rocksdb

2019-08-11 Thread Brad Hubbard
Could you create a tracker for this?

Also, if you can reproduce this could you gather a log with
debug_osd=20 ? That should show us the superblock it was trying to
decode as well as additional details.
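
Since that osd is down, the simplest way is probably to set it in
ceph.conf before restarting the daemon, e.g.

  [osd]
      debug osd = 20

or, if it is still running, on the fly with something like:

  ceph tell osd.4 injectargs '--debug_osd 20/20'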

On Mon, Aug 12, 2019 at 6:29 AM huxia...@horebdata.cn
 wrote:
>
> Dear folks,
>
> I had an OSD down, not because of a bad disk, but most likely a bug hit on 
> Rockdb. Any one had similar issue?
>
> I am using Luminous 12.2.12 version. Log attached below
>
> thanks,
> Samuel
>
> **
> [root@horeb72 ceph]# head -400 ceph-osd.4.log
> 2019-08-11 07:30:02.186519 7f69bd020700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.73:6801/4096 conn(0x56549cfc0800 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 15 vs existing csq=15 existing_state=STATE_STANDBY
> 2019-08-11 07:30:02.186871 7f69bd020700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.73:6801/4096 conn(0x56549cfc0800 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 16 vs existing csq=15 existing_state=STATE_STANDBY
> 2019-08-11 07:30:02.242291 7f69bc81f700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.71:6805/5046 conn(0x5654b93ed000 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 15 vs existing csq=15 existing_state=STATE_STANDBY
> 2019-08-11 07:30:02.242554 7f69bc81f700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.71:6805/5046 conn(0x5654b93ed000 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 16 vs existing csq=15 existing_state=STATE_STANDBY
> 2019-08-11 07:30:02.260295 7f69bc81f700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.73:6806/4864 conn(0x56544de16800 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 15 vs existing csq=15 
> existing_state=STATE_CONNECTING_WAIT_CONNECT_REPLY
> 2019-08-11 17:11:01.968247 7ff4822f1d80 -1 WARNING: the following dangerous 
> and experimental features are enabled: bluestore,rocksdb
> 2019-08-11 17:11:01.968333 7ff4822f1d80  0 ceph version 12.2.12 
> (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable), process 
> ceph-osd, pid 1048682
> 2019-08-11 17:11:01.970611 7ff4822f1d80  0 pidfile_write: ignore empty 
> --pid-file
> 2019-08-11 17:11:01.991542 7ff4822f1d80 -1 WARNING: the following dangerous 
> and experimental features are enabled: bluestore,rocksdb
> 2019-08-11 17:11:01.997597 7ff4822f1d80  0 load: jerasure load: lrc load: isa
> 2019-08-11 17:11:01.997710 7ff4822f1d80  1 bdev create path 
> /var/lib/ceph/osd/ceph-4/block type kernel
> 2019-08-11 17:11:01.997723 7ff4822f1d80  1 bdev(0x564774656c00 
> /var/lib/ceph/osd/ceph-4/block) open path /var/lib/ceph/osd/ceph-4/block
> 2019-08-11 17:11:01.998127 7ff4822f1d80  1 bdev(0x564774656c00 
> /var/lib/ceph/osd/ceph-4/block) open size 858887553024 (0xc7f9b0, 800GiB) 
> block_size 4096 (4KiB) non-rotational
> 2019-08-11 17:11:01.998231 7ff4822f1d80  1 bdev(0x564774656c00 
> /var/lib/ceph/osd/ceph-4/block) close
> 2019-08-11 17:11:02.265144 7ff4822f1d80  1 bdev create path 
> /var/lib/ceph/osd/ceph-4/block type kernel
> 2019-08-11 17:11:02.265177 7ff4822f1d80  1 bdev(0x564774658a00 
> /var/lib/ceph/osd/ceph-4/block) open path /var/lib/ceph/osd/ceph-4/block
> 2019-08-11 17:11:02.265695 7ff4822f1d80  1 bdev(0x564774658a00 
> /var/lib/ceph/osd/ceph-4/block) open size 858887553024 (0xc7f9b0, 800GiB) 
> block_size 4096 (4KiB) non-rotational
> 2019-08-11 17:11:02.266233 7ff4822f1d80  1 bdev create path 
> /var/lib/ceph/osd/ceph-4/block.db type kernel
> 2019-08-11 17:11:02.266256 7ff4822f1d80  1 bdev(0x564774589a00 
> /var/lib/ceph/osd/ceph-4/block.db) open path /var/lib/ceph/osd/ceph-4/block.db
> 2019-08-11 17:11:02.266812 7ff4822f1d80  1 bdev(0x564774589a00 
> /var/lib/ceph/osd/ceph-4/block.db) open size 2759360 (0x6fc20, 
> 27.9GiB) block_size 4096 (4KiB) non-rotational
> 2019-08-11 17:11:02.266998 7ff4822f1d80  1 bdev create path 
> /var/lib/ceph/osd/ceph-4/block type kernel
> 2019-08-11 17:11:02.267015 7ff4822f1d80  1 bdev(0x564774659a00 
> /var/lib/ceph/osd/ceph-4/block) open path /var/lib/ceph/osd/ceph-4/block
> 2019-08-11 17:11:02.267412 7ff4822f1d80  1 bdev(0x564774659a00 
> /var/lib/ceph/osd/ceph-4/block) open size 858887553024 (0xc7f9b0, 800GiB) 
> block_size 4096 (4KiB) non-rotational
> 2019-08-11 17:11:02.298355 7ff4822f1d80  0  set rocksdb option 
> compaction_readahead_size = 2MB
> 2019-08-11 17:11:02.298368 7ff4822f1d80  0  set rocksdb option 
> compaction_style = kCompactionStyleLevel
> 2019-08-11 17:11:02.299628 7ff4822f1d80  0  set rocksdb option 
> compaction_threads = 32
> 2019-08-11 17:11:02.299648 7ff4822f1d80  0  set rocksdb option compression = 
> kNoCompression
> 2019-08-11 17:11:02.23 7ff4822f1d80  0  set rocksdb option 
> flusher_threads = 8
> 2019-08-11 

Re: [ceph-users] ceph status: pg backfill_toofull, but all OSDs have enough space

2019-08-22 Thread Brad Hubbard
https://tracker.ceph.com/issues/41255 is probably reporting the same issue.

On Thu, Aug 22, 2019 at 6:31 PM Lars Täuber  wrote:
>
> Hi there!
>
> We also experience this behaviour of our cluster while it is moving pgs.
>
> # ceph health detail
> HEALTH_ERR 1 MDSs report slow metadata IOs; Reduced data availability: 2 pgs 
> inactive; Degraded data redundancy (low space): 1 pg backfill_toofull
> MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
> mdsmds1(mds.0): 1 slow metadata IOs are blocked > 30 secs, oldest blocked 
> for 359 secs
> PG_AVAILABILITY Reduced data availability: 2 pgs inactive
> pg 21.231 is stuck inactive for 878.224182, current state remapped, last 
> acting [20,2147483647,13,2147483647,15,10]
> pg 21.240 is stuck inactive for 878.123932, current state remapped, last 
> acting [26,17,21,20,2147483647,2147483647]
> PG_DEGRADED_FULL Degraded data redundancy (low space): 1 pg backfill_toofull
> pg 21.376 is active+remapped+backfill_wait+backfill_toofull, acting 
> [6,11,29,2,10,15]
> # ceph pg map 21.376
> osdmap e68016 pg 21.376 (21.376) -> up [6,5,23,21,10,11] acting 
> [6,11,29,2,10,15]
>
> # ceph osd dump | fgrep ratio
> full_ratio 0.95
> backfillfull_ratio 0.9
> nearfull_ratio 0.85
>
> This happens while the cluster is rebalancing the pgs after I manually mark a 
> single osd out.
> see here:
>  Subject: [ceph-users] pg 21.1f9 is stuck inactive for 53316.902820,  current 
> state remapped
>  http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-August/036634.html
>
>
> Mostly the cluster heals itself at least into state HEALTH_WARN:
>
>
> # ceph health detail
> HEALTH_WARN 1 MDSs report slow metadata IOs; Reduced data availability: 2 pgs 
> inactive
> MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
> mdsmds1(mds.0): 1 slow metadata IOs are blocked > 30 secs, oldest blocked 
> for 1155 secs
> PG_AVAILABILITY Reduced data availability: 2 pgs inactive
> pg 21.231 is stuck inactive for 1677.312219, current state remapped, last 
> acting [20,2147483647,13,2147483647,15,10]
> pg 21.240 is stuck inactive for 1677.211969, current state remapped, last 
> acting [26,17,21,20,2147483647,2147483647]
>
>
>
> Cheers,
> Lars
>
>
> Wed, 21 Aug 2019 17:28:05 -0500
> Reed Dier  ==> Vladimir Brik 
>  :
> > Just chiming in to say that I too had some issues with backfill_toofull 
> > PGs, despite no OSD's being in a backfill_full state, albeit, there were 
> > some nearfull OSDs.
> >
> > I was able to get through it by reweighting down the OSD that was the 
> > target reported by ceph pg dump | grep 'backfill_toofull'.
> >
> > This was on 14.2.2.
> >
> > Reed
> >
> > > On Aug 21, 2019, at 2:50 PM, Vladimir Brik 
> > >  wrote:
> > >
> > > Hello
> > >
> > > After increasing number of PGs in a pool, ceph status is reporting 
> > > "Degraded data redundancy (low space): 1 pg backfill_toofull", but I 
> > > don't understand why, because all OSDs seem to have enough space.
> > >
> > > ceph health detail says:
> > > pg 40.155 is active+remapped+backfill_toofull, acting [20,57,79,85]
> > >
> > > $ ceph pg map 40.155
> > > osdmap e3952 pg 40.155 (40.155) -> up [20,57,66,85] acting [20,57,79,85]
> > >
> > > So I guess Ceph wants to move 40.155 from 66 to 79 (or other way 
> > > around?). According to "osd df", OSD 66's utilization is 71.90%, OSD 79's 
> > > utilization is 58.45%. The OSD with least free space in the cluster is 
> > > 81.23% full, and it's not any of the ones above.
> > >
> > > OSD backfillfull_ratio is 90% (is there a better way to determine this?):
> > > $ ceph osd dump | grep ratio
> > > full_ratio 0.95
> > > backfillfull_ratio 0.9
> > > nearfull_ratio 0.7
> > >
> > > Does anybody know why a PG could be in the backfill_toofull state if no 
> > > OSD is in the backfillfull state?
> > >
> > >
> > > Vlad
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
> --
> Informationstechnologie
> Berlin-Brandenburgische Akademie der Wissenschaften
> Jägerstraße 22-23  10117 Berlin
> Tel.: +49 30 20370-352   http://www.bbaw.de
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore.cc: 11208: ceph_abort_msg("unexpected error")

2019-08-25 Thread Brad Hubbard
https://tracker.ceph.com/issues/38724

On Fri, Aug 23, 2019 at 10:18 PM Paul Emmerich  wrote:
>
> I've seen that before (but never on Nautilus), there's already an
> issue at tracker.ceph.com but I don't recall the id or title.
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Fri, Aug 23, 2019 at 1:47 PM Lars Täuber  wrote:
> >
> > Hi Paul,
> >
> > a result of fgrep is attached.
> > Can you do something with it?
> >
> > I can't read it. Maybe this is the relevant part:
> > " bluestore(/var/lib/ceph/osd/first-16) _txc_add_transaction error (39) 
> > Directory not empty not handled on operation 21 (op 1, counting from 0)"
> >
> > Later I tried it again and the osd is working again.
> >
> > It feels like I hit a bug!?
> >
> > Huge thanks for your help.
> >
> > Cheers,
> > Lars
> >
> > Fri, 23 Aug 2019 13:36:00 +0200
> > Paul Emmerich  ==> Lars Täuber  :
> > > Filter the log for "7f266bdc9700" which is the id of the crashed
> > > thread, it should contain more information on the transaction that
> > > caused the crash.
> > >
> > >
> > > Paul
> > >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-fuse segfaults in 14.2.2

2019-09-06 Thread Brad Hubbard
On Wed, Sep 4, 2019 at 9:42 PM Andras Pataki
 wrote:
>
> Dear ceph users,
>
> After upgrading our ceph-fuse clients to 14.2.2, we've been seeing sporadic 
> segfaults with not super revealing stack traces:
>
> in thread 7fff5a7fc700 thread_name:ceph-fuse
>
>  ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus 
> (stable)
>  1: (()+0xf5d0) [0x760b85d0]
>  2: (()+0x255a0c) [0x557a9a0c]
>  3: (()+0x16b6b) [0x77bb3b6b]
>  4: (()+0x13401) [0x77bb0401]
>  5: (()+0x7dd5) [0x760b0dd5]
>  6: (clone()+0x6d) [0x74b5cead]
>  NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
> interpret this.

if you install the appropriate debuginfo package (this would depend on
your OS) you may get a more enlightening stack.
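For example, on a CentOS/RHEL client that is usually something like

  debuginfo-install ceph-fuse

(or installing the matching ceph-debuginfo package); Debian/Ubuntu ship
the equivalent as -dbg/-dbgsym packages. Exact package names depend on
your repos.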
>
>
> Prior to 14.2.2, we've run 12.2.11 and 13.2.5 and have not seen this issue.  
> Has anyone encountered this?  If it isn't known - I can file a bug tracker 
> for it.

Please do and maybe try to capture a core dump if you can't get a
better backtrace?

>
> Andras
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ZeroDivisionError when running ceph osd status

2019-09-11 Thread Brad Hubbard
On Thu, Sep 12, 2019 at 1:52 AM Benjamin Tayehanpour
 wrote:
>
> Greetings!
>
> I had an OSD down, so I ran ceph osd status and got this:
>
> [root@ceph1 ~]# ceph osd status
> Error EINVAL: Traceback (most recent call last):
>   File "/usr/lib64/ceph/mgr/status/module.py", line 313, in handle_command
> return self.handle_osd_status(cmd)
>   File "/usr/lib64/ceph/mgr/status/module.py", line 297, in
> handle_osd_status
> self.format_dimless(self.get_rate("osd", osd_id.__str__(), "osd.op_w") +
>   File "/usr/lib64/ceph/mgr/status/module.py", line 113, in get_rate
> return (data[-1][1] - data[-2][1]) / float(data[-1][0] - data[-2][0])
> ZeroDivisionError: float division by zero
> [root@ceph1 ~]#
>
> I could still figure out which OSD it was with systemctl, put I had to
> purge the OSD before ceph osd status would run again. Is this normal
> behaviour?

No. Looks like this was fixed recently in master by
https://tracker.ceph.com/projects/ceph/repository/revisions/0164c399f3c22edce6488cd28e5b172b68ca1239/diff/src/pybind/mgr/status/module.py

>
> Cordially yours,
> Benjamin
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-06 Thread Brad Hubbard
-63> 2019-08-07 00:51:52.861 7fe987e49700  1 heartbeat_map
clear_timeout 'OSD::osd_op_tp thread 0x7fe987e49700' had suicide timed
out after 150

You hit a suicide timeout, that's fatal. On line 80 the process kills
the thread based on the assumption it's hung.


src/common/HeartbeatMap.cc:

 66 bool HeartbeatMap::_check(const heartbeat_handle_d *h, const char *who,
 67                           ceph::coarse_mono_clock::rep now)
 68 {
 69   bool healthy = true;
 70   auto was = h->timeout.load();
 71   if (was && was < now) {
 72     ldout(m_cct, 1) << who << " '" << h->name << "'"
 73                     << " had timed out after " << h->grace << dendl;
 74     healthy = false;
 75   }
 76   was = h->suicide_timeout;
 77   if (was && was < now) {
 78     ldout(m_cct, 1) << who << " '" << h->name << "'"
 79                     << " had suicide timed out after " << h->suicide_grace << dendl;
 80     pthread_kill(h->thread_id, SIGABRT);
 81     sleep(1);
 82     ceph_abort_msg("hit suicide timeout");
 83   }
 84   return healthy;
 85 }

You can try increasing the relevant timeouts but you would be better
off looking for the underlying cause of the poor performance. There's
a lot of information out there if you search for "ceph suicide
timeout"

On Wed, Aug 7, 2019 at 9:16 AM EDH - Manuel Rios Fernandez
 wrote:
>
> Hi
>
>
>
> We got a pair of OSD located in  node that crash randomly since 14.2.2
>
>
>
> OS Version : Centos 7.6
>
>
>
> There’re a ton of lines before crash , I will unespected:
>
>
>
> --
>
> 3045> 2019-08-07 00:39:32.013 7fe9a4996700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
> -3044> 2019-08-07 00:39:32.013 7fe9a3994700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
> -3043> 2019-08-07 00:39:32.033 7fe9a4195700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
> -3042> 2019-08-07 00:39:32.033 7fe9a4996700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
> --
>
> -
>
>
>
> Some hundred lines of:
>
> -164> 2019-08-07 00:47:36.628 7fe9a3994700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60
>
>   -163> 2019-08-07 00:47:36.632 7fe9a3994700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60
>
>   -162> 2019-08-07 00:47:36.632 7fe9a3994700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60
>
> -
>
>
>
>-78> 2019-08-07 00:50:51.755 7fe995bfa700 10 monclient: tick
>
>-77> 2019-08-07 00:50:51.755 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:50:21.756453)
>
>-76> 2019-08-07 00:51:01.755 7fe995bfa700 10 monclient: tick
>
>-75> 2019-08-07 00:51:01.755 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:50:31.756604)
>
>-74> 2019-08-07 00:51:11.755 7fe995bfa700 10 monclient: tick
>
>-73> 2019-08-07 00:51:11.755 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:50:41.756788)
>
>-72> 2019-08-07 00:51:21.756 7fe995bfa700 10 monclient: tick
>
>-71> 2019-08-07 00:51:21.756 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:50:51.756982)
>
>-70> 2019-08-07 00:51:31.755 7fe995bfa700 10 monclient: tick
>
>-69> 2019-08-07 00:51:31.755 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:51:01.757206)
>
>-68> 2019-08-07 00:51:41.756 7fe995bfa700 10 monclient: tick
>
>-67> 2019-08-07 00:51:41.756 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:51:11.757364)
>
>-66> 2019-08-07 00:51:51.756 7fe995bfa700 10 monclient: tick
>
>-65> 2019-08-07 00:51:51.756 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:51:21.757535)
>
>-64> 2019-08-07 00:51:52.861 7fe987e49700  1 heartbeat_map clear_timeout 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
>-63> 2019-08-07 00:51:52.861 7fe987e49700  1 heartbeat_map clear_timeout 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had suicide timed out after 150
>
>-62> 2019-08-07 00:51:52.948 7fe99966c700  5 
> bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 
> heap: 6018998272 unmapped: 1721180160 mapped: 4297818112 old cache_size: 
> 1994018210 new cache size: 1992784572
>
>-61> 2019-08-07 00:51:52.948 7fe99966c700  5 
> bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 1992784572 
> kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 763363328 meta_used: 
> 654593191 data_alloc: 452984832 data_used: 455929856
>
>-60> 2019-08-07 

Re: [ceph-users] OSD crashed during the fio test

2019-10-01 Thread Brad Hubbard
Removed ceph-de...@vger.kernel.org and added d...@ceph.io

On Tue, Oct 1, 2019 at 4:26 PM Alex Litvak  wrote:
>
> Hellow everyone,
>
> Can you shed the line on the cause of the crash?  Could actually client 
> request trigger it?
>
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.867 
> 7f093d71e700 -1 bdev(0x55b72c156000 /var/lib/ceph/osd/ceph-17/block) 
> aio_submit retries 16
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.867 
> 7f093d71e700 -1 bdev(0x55b72c156000 /var/lib/ceph/osd/ceph-17/block)  aio 
> submit got (11) Resource temporarily unavailable

The KernelDevice::aio_submit function has tried to submit IO 16 times
(a hard-coded limit) and received an error each time, causing it to
assert. Can you check the status of the underlying device(s)?
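
For example, dmesg and SMART on the device behind osd.17:

  dmesg -T | grep -iE 'error|reset|timeout'
  smartctl -a /dev/sdX    # substitute the actual device backing osd.17

plus the RAID/HBA controller logs if there is one in the path.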

> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
> In fun
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
> 757: F
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  ceph version 14.2.2 
> (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  1: 
> (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) 
> [0x55b71f668cf4]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  2: 
> (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char 
> const*, ...)+0) [0x55b71f668ec2]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  3: 
> (KernelDevice::aio_submit(IOContext*)+0x701) [0x55b71fd61ca1]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  4: 
> (BlueStore::_txc_aio_submit(BlueStore::TransContext*)+0x42) [0x55b71fc29892]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  5: 
> (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x42b) [0x55b71fc496ab]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  6: 
> (BlueStore::queue_transactions(boost::intrusive_ptr&,
>  std::vector std::allocator >&, boost::intrusive_ptr, 
> ThreadPool::T
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  7: (non-virtual thunk to 
> PrimaryLogPG::queue_transactions(std::vector std::allocator >&,
> boost::intrusive_ptr)+0x54) [0x55b71f9b1b84]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  8: 
> (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t 
> const&, eversion_t const&, std::unique_ptr std::default_delete >&&, eversion_t const&, eversion_t const&, 
> s
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  9: 
> (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, 
> PrimaryLogPG::OpContext*)+0xf12) [0x55b71f90e322]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  10: 
> (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0xfae) [0x55b71f969b7e]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  11: 
> (PrimaryLogPG::do_op(boost::intrusive_ptr&)+0x3965) 
> [0x55b71f96de15]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  12: 
> (PrimaryLogPG::do_request(boost::intrusive_ptr&, 
> ThreadPool::TPHandle&)+0xbd4) [0x55b71f96f8a4]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  13: 
> (OSD::dequeue_op(boost::intrusive_ptr, boost::intrusive_ptr, 
> ThreadPool::TPHandle&)+0x1a9) [0x55b71f7a9ea9]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  14: (PGOpItem::run(OSD*, 
> OSDShard*, boost::intrusive_ptr&, ThreadPool::TPHandle&)+0x62) 
> [0x55b71fa475d2]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  15: 
> (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x9f4) 
> [0x55b71f7c6ef4]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  16: 
> (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x433) 
> [0x55b71fdc5ce3]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  17: 
> (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55b71fdc8d80]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  18: (()+0x7dd5) 
> [0x7f0971da9dd5]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  19: (clone()+0x6d) 
> [0x7f0970c7002d]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.879 
> 7f093d71e700 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
> 757: F
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  ceph version 

Re: [ceph-users] ceph pg repair fails...?

2019-10-01 Thread Brad Hubbard
On Wed, Oct 2, 2019 at 1:15 AM Mattia Belluco  wrote:
>
> Hi Jake,
>
> I am curious to see if your problem is similar to ours (despite the fact
> we are still on Luminous).
>
> Could you post the output of:
>
> rados list-inconsistent-obj 
>
> and
>
> rados list-inconsistent-snapset 

Make sure you scrub the pg before running these commands.
Take a look at the information in http://tracker.ceph.com/issues/24994
for hints on how to proceed.
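
i.e. something like:

  ceph pg deep-scrub 2.2a7
  # once the scrub completes:
  rados list-inconsistent-obj 2.2a7 --format=json-pretty
  rados list-inconsistent-snapset 2.2a7 --format=json-pretty
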
>
> Thanks,
>
> Mattia
>
> On 10/1/19 1:08 PM, Jake Grimmett wrote:
> > Dear All,
> >
> > I've just found two inconsistent pg that fail to repair.
> >
> > This might be the same bug as shown here:
> >
> > 
> >
> > Cluster is running Nautilus 14.2.2
> > OS is Scientific Linux 7.6
> > DB/WAL on NVMe, Data on 12TB HDD
> >
> > Logs below cab also be seen here: 
> >
> > [root@ceph-s1 ~]# ceph health detail
> > HEALTH_ERR 22 scrub errors; Possible data damage: 2 pgs inconsistent
> > OSD_SCRUB_ERRORS 22 scrub errors
> > PG_DAMAGED Possible data damage: 2 pgs inconsistent
> > pg 2.2a7 is active+clean+inconsistent+failed_repair, acting
> > [83,60,133,326,281,162,180,172,144,219]
> > pg 2.36b is active+clean+inconsistent+failed_repair, acting
> > [254,268,10,262,32,280,211,114,169,53]
> >
> > Issued "pg repair" commands, osd log shows:
> > [root@ceph-n10 ~]# grep "2.2a7" /var/log/ceph/ceph-osd.83.log
> > 2019-10-01 07:05:02.459 7f9adab4b700  0 log_channel(cluster) log [DBG] :
> > 2.2a7 repair starts
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 83(0) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 60(1) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 133(2) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 144(8) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 162(5) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 172(7) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 180(6) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 219(9) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 281(4) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 326(3) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 soid 2:e5472cab:::1000702081f.:head : failed to pick
> > suitable object info
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > repair 2.2a7s0 2:e5472cab:::1000702081f.:head : on disk size
> > (4096) does not match object info size (0) adjusted for ondisk to (0)
> > 2019-10-01 07:19:47.060 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 repair 11 errors, 0 fixed
> > [root@ceph-n10 ~]#
> >
> > [root@ceph-s1 ~]#  ceph pg repair 2.36b
> > instructing pg 2.36bs0 on osd.254 to repair
> >
> > [root@ceph-n29 ~]# grep "2.36b" /var/log/ceph/ceph-osd.254.log
> > 2019-10-01 11:15:12.215 7fa01f589700  0 log_channel(cluster) log [DBG] :
> > 2.36b repair starts
> > 2019-10-01 11:25:12.241 7fa01f589700 -1 log_channel(cluster) log [ERR] :
> > 2.36b shard 254(0) soid 2:d6cac754:::100070209f6.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 11:25:12.241 7fa01f589700 -1 log_channel(cluster) log [ERR] :
> > 2.36b shard 10(2) soid 2:d6cac754:::100070209f6.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 11:25:12.241 7fa01f589700 -1 log_channel(cluster) log [ERR] :
> > 2.36b shard 32(4) soid 2:d6cac754:::100070209f6.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 11:25:12.241 7fa01f589700 -1 log_channel(cluster) log [ERR] :
> > 2.36b shard 53(9) soid 2:d6cac754:::100070209f6.:head :
> > candidate size 4096 info 

Re: [ceph-users] OSD crashed during the fio test

2019-10-01 Thread Brad Hubbard
If it is only this one osd I'd be inclined to take a hard look at
the underlying hardware and how it behaves/performs compared to the hw
backing identical osds. The less likely possibility is that you have
some sort of "hot spot" causing resource contention for that osd. To
investigate that further you could look at whether the pattern of cpu
and ram usage of that daemon varies significantly compared to the
other osd daemons in the cluster. You could also compare perf dumps
between daemons.
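
For example, grabbing

  ceph daemon osd.34 perf dump

from the suspect osd and from one of its healthy peers (on their
respective hosts) and diffing the latency counters is a quick way to
spot an outlier.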

On Wed, Oct 2, 2019 at 1:46 PM Sasha Litvak
 wrote:
>
> I updated firmware and kernel, running torture tests.  So far no assert, but 
> I still noticed this on the same osd as yesterday
>
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd05d7700' had timed out after 60
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd0dd8700' had timed out after 60
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd2ddc700' had timed out after 60
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd35dd700' had timed out after 60
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd3dde700' had timed out after 60
>
> The spike of latency on this OSD is 6 seconds at that time.  Any ideas?
>
> On Tue, Oct 1, 2019 at 8:03 AM Sasha Litvak  
> wrote:
>>
>> It was hardware indeed.  Dell server reported a disk being reset with power 
>> on.  Checking the usual suspects i.e. controller firmware, controller event 
>> log (if I can get one), drive firmware.
>> I will report more when I get a better idea
>>
>> Thank you!
>>
>> On Tue, Oct 1, 2019 at 2:33 AM Brad Hubbard  wrote:
>>>
>>> Removed ceph-de...@vger.kernel.org and added d...@ceph.io
>>>
>>> On Tue, Oct 1, 2019 at 4:26 PM Alex Litvak  
>>> wrote:
>>> >
>>> > Hellow everyone,
>>> >
>>> > Can you shed the line on the cause of the crash?  Could actually client 
>>> > request trigger it?
>>> >
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.867 
>>> > 7f093d71e700 -1 bdev(0x55b72c156000 /var/lib/ceph/osd/ceph-17/block) 
>>> > aio_submit retries 16
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.867 
>>> > 7f093d71e700 -1 bdev(0x55b72c156000 /var/lib/ceph/osd/ceph-17/block)  aio 
>>> > submit got (11) Resource temporarily unavailable
>>>
>>> The KernelDevice::aio_submit function has tried to submit Io 16 times
>>> (a hard coded limit) and received an error each time causing it to
>>> assert. Can you check the status of the underlying device(s)?
>>>
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
>>> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
>>> > In fun
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
>>> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
>>> > 757: F
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  ceph version 14.2.2 
>>> > (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  1: 
>>> > (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>>> > const*)+0x14a) [0x55b71f668cf4]
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  2: 
>>> > (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, 
>>> > char const*, ...)+0) [0x55b71f668ec2]
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  3: 
>>> > (KernelDevice::aio_submit(IOContext*)+0x701) [0x55b71fd61ca1]
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  4: 
>>> > (BlueStore::_txc_aio_submit(BlueStore::TransContext*)+0x42) 
>>> > [0x55b71fc29892]
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  5: 

Re: [ceph-users] ceph-osd@n crash dumps

2019-10-01 Thread Brad Hubbard
On Tue, Oct 1, 2019 at 10:43 PM Del Monaco, Andrea <
andrea.delmon...@atos.net> wrote:

> Hi list,
>
> After the nodes ran OOM and after reboot, we are not able to restart the
> ceph-osd@x services anymore. (Details about the setup at the end).
>
> I am trying to do this manually, so we can see the error but all i see is
> several crash dumps - this is just one of the OSDs which is not starting.
> Any idea how to get past this??
> [root@ceph001 ~]# /usr/bin/ceph-osd --debug_osd 10 -f --cluster ceph --id
> 83 --setuser ceph --setgroup ceph  > /tmp/dump 2>&1
> starting osd.83 at - osd_data /var/lib/ceph/osd/ceph-83
> /var/lib/ceph/osd/ceph-83/journal
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h:
> In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)'
> thread 2aaf5540 time 2019-10-01 14:19:49.494368
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h:
> 34: FAILED assert(stripe_width % stripe_size == 0)
>  ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x14b) [0x2af3d36b]
>  2: (()+0x26e4f7) [0x2af3d4f7]
>  3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*, std::shared_ptr, unsigned
> long)+0x46d) [0x55c0bd3d]
>  4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map std::string, std::less, std::allocator const, std::string> > > const&, PGBackend::Listener*, coll_t,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*)+0x30a) [0x55b0ba8a]
>  5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr const>, PGPool const&, std::map std::less, std::allocator std::string> > > const&, spg_t)+0x140) [0x55abd100]
>  6: (OSD::_make_pg(std::shared_ptr, spg_t)+0x10cb)
> [0x55914ecb]
>  7: (OSD::load_pgs()+0x4a9) [0x55917e39]
>  8: (OSD::init()+0xc99) [0x559238e9]
>  9: (main()+0x23a3) [0x558017a3]
>  10: (__libc_start_main()+0xf5) [0x2aaab77de495]
>  11: (()+0x385900) [0x558d9900]
> 2019-10-01 14:19:49.500 2aaf5540 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h:
> In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)'
> thread 2aaf5540 time 2019-10-01 14:19:49.494368
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h:
> 34: FAILED assert(stripe_width % stripe_size == 0)
>

 https://tracker.ceph.com/issues/41336 may be relevant here.

Can you post details of the pool involved as well as the erasure code
profile in use for that pool?
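
Something like the following should show that (the pool and profile names
here are placeholders, substitute whatever pool the crashing pg belongs to
and the profile that pool references):

# ceph osd pool ls detail
# ceph osd pool get <poolname> erasure_code_profile
# ceph osd erasure-code-profile get <profilename>

In particular the k, m and stripe_unit values from the profile are
presumably what the failed assert (stripe_width % stripe_size == 0) is
tripping over.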


>  ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x14b) [0x2af3d36b]
>  2: (()+0x26e4f7) [0x2af3d4f7]
>  3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*, std::shared_ptr, unsigned
> long)+0x46d) [0x55c0bd3d]
>  4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map std::string, std::less, std::allocator const, std::string> > > const&, PGBackend::Listener*, coll_t,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*)+0x30a) [0x55b0ba8a]
>  5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr const>, PGPool const&, std::map std::less, std::allocator std::string> > > const&, spg_t)+0x140) [0x55abd100]
>  6: (OSD::_make_pg(std::shared_ptr, spg_t)+0x10cb)
> [0x55914ecb]
>  7: (OSD::load_pgs()+0x4a9) [0x55917e39]
>  8: (OSD::init()+0xc99) [0x559238e9]
>  9: (main()+0x23a3) [0x558017a3]
>  10: (__libc_start_main()+0xf5) [0x2aaab77de495]
>  11: (()+0x385900) [0x558d9900]
>
> *** Caught signal (Aborted) **
>  in thread 2aaf5540 thread_name:ceph-osd
>  ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>  1: (()+0xf5d0) [0x2aaab69765d0]
>  2: (gsignal()+0x37) [0x2aaab77f22c7]
>  3: (abort()+0x148) [0x2aaab77f39b8]
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x248) [0x2af3d468]
>  5: (()+0x26e4f7) [0x2af3d4f7]
>  6: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*, std::shared_ptr, unsigned
> long)+0x46d) [0x55c0bd3d]
>  7: (PGBackend::build_pg_backend(pg_pool_t const&, std::map std::string, std::less, std::allocator const, std::string> > > const&, PGBackend::Listener*, coll_t,
> 

Re: [ceph-users] Inconsistents + FAILED assert(recovery_info.oi.legacy_snaps.size())

2019-10-30 Thread Brad Hubbard
ta_size: 0, omap_header_size: 0, 
> omap_entries_size: 0, attrset_size: 1, recovery_info: 
> ObjectRecoveryInfo(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926@127481'7241006,
>  size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), 
> after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, 
> data_complete:true, omap_recovered_to:, omap_complete:true, error:false), 
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, 
> data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) 
> v3
> -1> 2019-10-30 12:52:31.998899 7fce6189b700  1 -- 
> 129.20.177.4:6810/2125155 <== osd.25 129.20.177.3:6808/810999 24804  
> MOSDPGPush(2.1d9 194334/194298 
> [PushOp(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926, version: 
> 127481'7241006, data_included: [], data_size: 0, omap_header_size: 0, 
> omap_entries_size: 0, attrset_size: 1, recovery_info: 
> ObjectRecoveryInfo(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926@127481'7241006,
>  size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), 
> after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, 
> data_complete:true, omap_recovered_to:, omap_complete:true, error:false), 
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, 
> data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) 
> v3  925+0+0 (542499734 0 0) 0x5648d708eac0 con 0x5648d74a0800
>  0> 2019-10-30 12:52:32.003339 7fce4803e700 -1 
> /build/ceph-12.2.12/src/osd/PrimaryLogPG.cc: In function 'virtual void 
> PrimaryLogPG::on_local_recover(const hobject_t&, const ObjectRecoveryInfo&, 
> ObjectContextRef, bool, ObjectStore::Transaction*)' thread 7fce4803e700 time 
> 2019-10-30 12:52:31.999086
> /build/ceph-12.2.12/src/osd/PrimaryLogPG.cc: 354: FAILED 
> assert(recovery_info.oi.legacy_snaps.size())
>
>
> More informations about this object (0c16b76b8b4567.0001426e)
> from pg 2.1d9 :
> ceph osd map rbd rbd_data.0c16b76b8b4567.0001426e
> osdmap e194356 pool 'rbd' (2) object 
> 'rbd_data.0c16b76b8b4567.0001426e' -> pg 2.181de9d9 (2.1d9) -> up 
> ([27,30,38], p27) acting ([30,25], p30)
>
> I also checked the logs of all the OSDs that already went down and found
> the same errors for this object :
> * osd.4, last time : 2019-10-10 16:15:20
> * osd.32, last time : 2019-10-14 01:54:56
> * osd.33, last time : 2019-10-11 06:24:01
> * osd.34, last time : 2019-10-18 06:24:26
> * osd.20, last time : 2019-10-27 18:12:31
> * osd.28, last time : 2019-10-28 12:57:47
>
> Whether the data comes from osd.25 or osd.30, I get the same
> error. It seems this PG/object tries to recover to a healthy state but
> shuts down my OSDs one by one…
>
> Thus spake Jérémy Gardais (jeremy.gard...@univ-rennes1.fr) on mercredi 30 
> octobre 2019 à 11:09:36:
> > Thus spake Brad Hubbard (bhubb...@redhat.com) on mercredi 30 octobre 2019 à 
> > 12:50:50:
> > > You should probably try and work out what caused the issue and take
> > > steps to minimise the likelihood of a recurrence. This is not expected
> > > behaviour in a correctly configured and stable environment.
>
> This PG 2.1d9 is "only" marked as : 
> "active+undersized+degraded+remapped+backfill_wait", not inconsistent…
>
> Everything i got from PG 2.1d9 (query, list_missing,
> ceph-objectstore-tool list,…) is available here :
> https://cloud.ipr.univ-rennes1.fr/index.php/s/BYtuAURnC7YOAQG?path=%2Fpg.2.1d9
> But nothing looks suspicious to me…
>
> I also separated the logs from the last error on osd.27 and its
> reboot ("only" ~22k lines ^^) :
> https://cloud.ipr.univ-rennes1.fr/index.php/s/BYtuAURnC7YOAQG/download?path=%2F=ceph-osd.27.log.last.error.txt
>
> Does anybody understand these logs, or do I have to live with this damned
> object ? ^^
>
> --
> Gardais Jérémy
> Institut de Physique de Rennes
> Université Rennes 1
> Téléphone: 02-23-23-68-60
> Mail & bonnes pratiques: http://fr.wikipedia.org/wiki/Nétiquette
> ---



-- 
Cheers,
Brad

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-objectstore-tool crash when trying to recover pg from OSD

2019-11-07 Thread Brad Hubbard
I'd suggest you open a tracker under the Bluestore component so
someone can take a look. I'd also suggest you include a log with
'debug_bluestore=20' added to the COT command line.
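
Something like the following should do it, reusing your earlier command
line with the extra debug option and the output captured to a file (the
OSD path and pgid are just the ones from your mail):

# ceph-objectstore-tool --debug --debug_bluestore=20 \
      --data-path /var/lib/ceph/osd/ceph-22 --skip-journal-replay \
      --skip-mount-omap --op info --pgid 2.9f > cot-debug.log 2>&1

Then attach cot-debug.log to the tracker.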

On Thu, Nov 7, 2019 at 6:56 PM Eugene de Beste  wrote:
>
> Hi, does anyone have any feedback for me regarding this?
>
> Here's the log I get when trying to restart the OSD via systemctl: 
> https://pastebin.com/tshuqsLP
> On Mon, 4 Nov 2019 at 12:42, Eugene de Beste  wrote:
>
> Hi everyone
>
> I have a cluster that was initially set up with bad defaults in Luminous. 
> After upgrading to Nautilus I've had a few OSDs crash on me, due to errors 
> seemingly related to https://tracker.ceph.com/issues/42223 and 
> https://tracker.ceph.com/issues/22678.
>
> One of my pools has been running with min_size 1 (yes, I know) and I am now
> stuck with incomplete pgs due to the aforementioned OSD crash.
>
> When trying to use ceph-objectstore-tool to get the pgs out of the OSD,
> I'm running into the same crashes I get when trying to start the OSD:
> ceph-objectstore-tool core dumps and I can't retrieve the pg.
>
> Does anyone have any input on this? I would like to be able to retrieve that 
> data if possible.
>
> Here's the log for ceph-objectstore-tool --debug --data-path 
> /var/lib/ceph/osd/ceph-22 --skip-journal-replay --skip-mount-omap --op info 
> --pgid 2.9f  -- https://pastebin.com/9aGtAfSv
>
> Regards and thanks,
> Eugene
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Inconsistents + FAILED assert(recovery_info.oi.legacy_snaps.size())

2019-10-28 Thread Brad Hubbard
Yes, try and get the pgs healthy, then you can just re-provision the down OSDs.

Run a scrub on each of these pgs and then use the commands on the
following page to find out more information for each case.

https://docs.ceph.com/docs/luminous/rados/troubleshooting/troubleshooting-pg/

Focus on the commands 'list-missing', 'list-inconsistent-obj', and
'list-inconsistent-snapset'.
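
For each affected pg that would look something like this (2.1d9, one of
your pgs, is used here purely as an example; let the deep-scrub complete
before running the query commands):

# ceph pg deep-scrub 2.1d9
# ceph pg 2.1d9 list_missing
# rados list-inconsistent-obj 2.1d9 --format=json-pretty
# rados list-inconsistent-snapset 2.1d9 --format=json-pretty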

Let us know if you get stuck.

P.S. There are several threads about these sorts of issues in this
mailing list that should turn up when doing a web search.

On Tue, Oct 29, 2019 at 5:06 AM Jérémy Gardais
 wrote:
>
> Hello,
>
> For several weeks, I have had some OSDs flapping before being marked out
> of the cluster by Ceph…
> I was hoping for some Ceph magic and just gave it some time to auto-heal
> (and be able to do all the side work…), but it was a bad idea (what a
> surprise :D). I also got some inconsistent PGs, but I was waiting for a
> healthy cluster before trying to fix them.
>
> Now that I have more time, I also have 6 OSDs down+out on my 5-node
> cluster and 1~2 OSDs still flapping from time to time, and I am asking
> myself whether these PGs might be the (one?) source of my problem.
>
> The last OSD error on osd.28 gave these logs :
> -2> 2019-10-28 12:57:47.346460 7fefbdc4d700  5 -- 129.20.177.2:6811/47803 
> >> 129.20.177.3:6808/4141402 conn(0x55de8211a000 :-1 
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2058 cs=1 l=0). rx osd.25 
> seq 169 0x55dea57b3600 MOSDPGPush(2.1d9 191810/191810 
> [PushOp(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926, version: 
> 127481'7241006, data_included: [], data_size: 0, omap_header_size: 0, 
> omap_entries_size: 0, attrset_size: 1, recovery_info: 
> ObjectRecoveryInfo(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926@127481'7241006,
>  size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), 
> after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, 
> data_complete:true, omap_recovered_to:, omap_complete:true, error:false), 
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, 
> data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) 
> v3
> -1> 2019-10-28 12:57:47.346517 7fefbdc4d700  1 -- 129.20.177.2:6811/47803 
> <== osd.25 129.20.177.3:6808/4141402 169  MOSDPGPush(2.1d9 191810/191810 
> [PushOp(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926, version: 
> 127481'7241006, data_included: [], data_size: 0, omap_header_size: 0, 
> omap_entries_size: 0, attrset_size: 1, recovery_info: 
> ObjectRecoveryInfo(2:9b97b818:::rbd_data.c16b76b8b4567.0001426e:5926@127481'7241006,
>  size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), 
> after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, 
> data_complete:true, omap_recovered_to:, omap_complete:true, error:false), 
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, 
> data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) 
> v3  909+0+0 (1239474936 0 0) 0x55dea57b3600 con 0x55de8211a000
>  0> 2019-10-28 12:57:47.353680 7fef99441700 -1 
> /build/ceph-12.2.12/src/osd/PrimaryLogPG.cc: In function 'virtual void 
> PrimaryLogPG::on_local_recover(const hobject_t&, const ObjectRecoveryInfo&, 
> ObjectContextRef, bool, ObjectStore::Transaction*)' thread 7fef99441700 time 
> 2019-10-28 12:57:47.347132
> /build/ceph-12.2.12/src/osd/PrimaryLogPG.cc: 354: FAILED 
> assert(recovery_info.oi.legacy_snaps.size())
>
>  ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous 
> (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> const*)+0x102) [0x55de72039f32]
>  2: (PrimaryLogPG::on_local_recover(hobject_t const&, ObjectRecoveryInfo 
> const&, std::shared_ptr, bool, 
> ObjectStore::Transaction*)+0x135b) [0x55de71be330b]
>  3: (ReplicatedBackend::handle_push(pg_shard_t, PushOp const&, PushReplyOp*, 
> ObjectStore::Transaction*)+0x31d) [0x55de71d4fadd]
>  4: (ReplicatedBackend::_do_push(boost::intrusive_ptr)+0x18f) 
> [0x55de71d4fd7f]
>  5: 
> (ReplicatedBackend::_handle_message(boost::intrusive_ptr)+0x2d1) 
> [0x55de71d5ff11]
>  6: (PGBackend::handle_message(boost::intrusive_ptr)+0x50) 
> [0x55de71c7d030]
>  7: (PrimaryLogPG::do_request(boost::intrusive_ptr&, 
> ThreadPool::TPHandle&)+0x5f1) [0x55de71be87b1]
>  8: (OSD::dequeue_op(boost::intrusive_ptr, 
> boost::intrusive_ptr, ThreadPool::TPHandle&)+0x3f7) 
> [0x55de71a63e97]
>  9: (PGQueueable::RunVis::operator()(boost::intrusive_ptr 
> const&)+0x57) [0x55de71cf5077]
>  10: (OSD::ShardedOpWQ::_process(unsigned int, 
> ceph::heartbeat_handle_d*)+0x108c) [0x55de71a94e1c]
>  11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x88d) 
> [0x55de7203fbbd]
>  12: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55de72041b80]
>  13: (()+0x8064) [0x7fefc12b5064]
>  14: (clone()+0x6d) [0x7fefc03a962d]
>  NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
> 

Re: [ceph-users] Inconsistents + FAILED assert(recovery_info.oi.legacy_snaps.size())

2019-10-29 Thread Brad Hubbard
On Tue, Oct 29, 2019 at 9:09 PM Jérémy Gardais
 wrote:
>
> Thus spake Brad Hubbard (bhubb...@redhat.com) on mardi 29 octobre 2019 à 
> 08:20:31:
> > Yes, try and get the pgs healthy, then you can just re-provision the down 
> > OSDs.
> >
> > Run a scrub on each of these pgs and then use the commands on the
> > following page to find out more information for each case.
> >
> > https://docs.ceph.com/docs/luminous/rados/troubleshooting/troubleshooting-pg/
> >
> > Focus on the commands 'list-missing', 'list-inconsistent-obj', and
> > 'list-inconsistent-snapset'.
> >
> > Let us know if you get stuck.
> >
> > P.S. There are several threads about these sorts of issues in this
> > mailing list that should turn up when doing a web search.
>
> I found this thread :
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg53116.html

That looks like the same issue.

>
> And i start to get additionnals informations to solve PG 2.2ba :
> 1. rados list-inconsistent-snapset 2.2ba --format=json-pretty
> {
> "epoch": 192223,
> "inconsistents": [
> {
> "name": "rbd_data.b4537a2ae8944a.425f",
> "nspace": "",
> "locator": "",
> "snap": 22772,
> "errors": [
> "headless"
> ]
> },
> {
> "name": "rbd_data.b4537a2ae8944a.425f",
> "nspace": "",
> "locator": "",
> "snap": "head",
> "snapset": {
> "snap_context": {
> "seq": 22806,
> "snaps": [
> 22805,
> 22804,
> 22674,
> 22619,
> 20536,
> 17248,
> 14270
> ]
> },
> "head_exists": 1,
> "clones": [
> {
> "snap": 17248,
> "size": 4194304,
> "overlap": "[0~2269184,2277376~1916928]",
> "snaps": [
> 17248
> ]
> },
> {
> "snap": 20536,
> "size": 4194304,
> "overlap": "[0~2269184,2277376~1916928]",
> "snaps": [
> 20536
> ]
> },
> {
> "snap": 22625,
> "size": 4194304,
> "overlap": "[0~2269184,2277376~1916928]",
> "snaps": [
> 22619
> ]
> },
> {
> "snap": 22674,
> "size": 4194304,
> "overlap": "[266240~4096]",
> "snaps": [
> 22674
> ]
> },
> {
> "snap": 22805,
> "size": 4194304,
> "overlap": 
> "[0~942080,958464~901120,1875968~16384,1908736~360448,2285568~1908736]",
> "snaps": [
> 22805,
> 22804
> ]
> }
> ]
> },
> "errors": [
> "extra_clones"
> ],
> "extra clones": [
> 22772
> ]
> }
> ]
> }
>
> 2.a ceph-objectstore-tool from osd.29 and osd.42 :
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-29/ --pgid 2.2ba 
> --op list rbd_data.b4537a2ae8944a.425f
> ["2.2ba",{"oid":"rbd_data.b4537a2ae8944a.425f","key":"","snapid":17248,"hash":71960

Re: [ceph-users] ceph; pg scrub errors

2019-09-24 Thread Brad Hubbard
On Tue, Sep 24, 2019 at 10:51 PM M Ranga Swami Reddy
 wrote:
>
> Interestingly, "rados list-inconsistent-obj ${PG} --format=json" is not
> showing any inconsistent objects.
> And "rados list-missing-obj ${PG} --format=json" is also not showing any
> missing or unfound objects.

Complete a scrub of ${PG} just before you run these commands.
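
For example:

# ceph pg deep-scrub ${PG}
# ceph pg ${PG} query | grep deep_scrub_stamp
# rados list-inconsistent-obj ${PG} --format=json-pretty

(the query is just to confirm the deep-scrub timestamp has updated before
you run the list commands). The list-inconsistent output only reflects the
most recent scrub, so running it against stale scrub data can legitimately
come back empty.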

>
> Thanks
> Swami
>
> On Mon, Sep 23, 2019 at 8:18 PM Robert LeBlanc  wrote:
>>
>> On Thu, Sep 19, 2019 at 4:34 AM M Ranga Swami Reddy
>>  wrote:
>> >
>> > Hi - I am using ceph 12.2.11. Here I am getting a few scrub errors. To fix
>> > these scrub errors I ran "ceph pg repair ".
>> > But the scrub errors are not going away and the repair is taking a long
>> > time, like 8-12 hours.
>>
>> Depending on the size of the PGs and how active the cluster is, it
>> could take a long time as it takes another deep scrub to happen to
>> clear the error status after a repair. Since it is not going away,
>> either the problem is too complicated to automatically repair and
>> needs to be done by hand, or the problem is repaired and when it
>> deep-scrubs to check it, the problem has reappeared or another problem
>> was found and the disk needs to be replaced.
>>
>> Try running:
>> rados list-inconsistent-obj ${PG} --format=json
>>
>> and see what the exact problems are.
>> 
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] lot of inconsistent+failed_repair - failed to pick suitable auth object (14.2.3)

2019-10-10 Thread Brad Hubbard
Does pool 6 have min_size = 1 set?

https://tracker.ceph.com/issues/24994#note-5 would possibly be helpful
here, depending on what the output of the following command looks
like.

# rados list-inconsistent-obj [pgid] --format=json-pretty
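
A quick way to check (substitute whatever pool id 6 is actually called on
your cluster for <poolname>):

# ceph osd pool get <poolname> min_size
# ceph osd dump | grep "^pool 6 "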

On Thu, Oct 10, 2019 at 8:16 PM Kenneth Waegeman
 wrote:
>
> Hi all,
>
> After some node failures and rebalancing, we have a lot of pgs in an
> inconsistent state. I tried to repair them, but it didn't work. This is
> also in the logs:
>
> > 2019-10-10 11:23:27.221 7ff54c9b0700  0 log_channel(cluster) log [DBG]
> > : 6.327 repair starts
> > 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 19 soid 6:e4c130fd:::20005f3b582.:head :
> > omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> > 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> > dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> > a8c4ce76 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 72 soid 6:e4c130fd:::20005f3b582.:head :
> > omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> > 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> > dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> > a8c4ce76 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 91 soid 6:e4c130fd:::20005f3b582.:head :
> > omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> > 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> > dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> > a8c4ce76 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> > : 6.327 soid 6:e4c130fd:::20005f3b582.:head : failed to pick
> > suitable auth object
> > 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 19 soid 6:e4c2e57b:::20005f11daa.:head :
> > omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> > 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> > dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> > 56dd55a2 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 72 soid 6:e4c2e57b:::20005f11daa.:head :
> > omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> > 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> > dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> > 56dd55a2 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 91 soid 6:e4c2e57b:::20005f11daa.:head :
> > omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> > 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> > dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> > 56dd55a2 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 soid 6:e4c2e57b:::20005f11daa.:head : failed to pick
> > suitable auth object
> > 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 19 soid 6:e4c40009:::20005f45f1b.:head :
> > omap_digest 0x7ccf5cc9 != omap_digest 0xe048d29 from auth oi
> > 6:e4c40009:::20005f45f1b.:head(203789'1033837 osd.3.0:3666949
> > dirty|omap|data_digest|omap_digest s 0 uv 1032168 dd  od
> > e048d29 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 72 soid 6:e4c40009:::20005f45f1b.:head :
> > omap_digest 0x7ccf5cc9 != omap_digest 0xe048d29 from auth oi
> > 6:e4c40009:::20005f45f1b.:head(203789'1033837 osd.3.0:3666949
> > dirty|omap|data_digest|omap_digest s 0 uv 1032168 dd  od
> > e048d29 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 91 soid 6:e4c40009:::20005f45f1b.:head :
> > omap_digest 0x7ccf5cc9 != omap_digest 0xe048d29 from auth oi
> > 6:e4c40009:::20005f45f1b.:head(203789'1033837 osd.3.0:3666949
> > dirty|omap|data_digest|omap_digest s 0 uv 1032168 dd  od
> > e048d29 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 soid 6:e4c40009:::20005f45f1b.:head : failed to pick
> > suitable auth object
> > 2019-10-10 11:23:28.041 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 19 soid 6:e4c4a042:::20005f389fb.:head :
> > omap_digest 0xdd1558b8 != omap_digest 0xcf9af548 from auth oi
> > 6:e4c4a042:::20005f389fb.:head(203789'1033899 osd.3.0:3667011
> > dirty|omap|data_digest|omap_digest s 0 uv 1031358 dd  od
> > cf9af548 alloc_hint [0 0 0])
> > 2019-10-10 11:23:28.041 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> 

Re: [ceph-users] lot of inconsistent+failed_repair - failed to pick suitable auth object (14.2.3)

2019-10-10 Thread Brad Hubbard
On Fri, Oct 11, 2019 at 12:27 AM Kenneth Waegeman
 wrote:
>
> Hi Brad, all,
>
> Pool 6 has min_size 2:
>
> pool 6 'metadata' replicated size 3 min_size 2 crush_rule 1 object_hash
> rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn last_change 172476
> flags hashpspool stripe_width 0 application cephfs

This looked like something min_size 1 could cause, but I guess that's
not the cause here.

> so "inconsistents" is empty, which is weird, no ?

Try scrubbing the pg just before running the command.
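
For example:

# ceph pg deep-scrub 6.327

then re-run rados list-inconsistent-obj for 6.327 once the deep-scrub has
finished; the output is only populated from the latest scrub.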

>
> Thanks again!
>
> K
>
>
> On 10/10/2019 12:52, Brad Hubbard wrote:
> > Does pool 6 have min_size = 1 set?
> >
> > https://tracker.ceph.com/issues/24994#note-5 would possibly be helpful
> > here, depending on what the output of the following command looks
> > like.
> >
> > # rados list-inconsistent-obj [pgid] --format=json-pretty
> >
> > On Thu, Oct 10, 2019 at 8:16 PM Kenneth Waegeman
> >  wrote:
> >> Hi all,
> >>
> >> After some node failure and rebalancing, we have a lot of pg's in
> >> inconsistent state. I tried to repair, but it din't work. This is also
> >> in the logs:
> >>
> >>> 2019-10-10 11:23:27.221 7ff54c9b0700  0 log_channel(cluster) log [DBG]
> >>> : 6.327 repair starts
> >>> 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 19 soid 6:e4c130fd:::20005f3b582.:head :
> >>> omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> >>> 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> >>> a8c4ce76 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 72 soid 6:e4c130fd:::20005f3b582.:head :
> >>> omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> >>> 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> >>> a8c4ce76 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 91 soid 6:e4c130fd:::20005f3b582.:head :
> >>> omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> >>> 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> >>> a8c4ce76 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 soid 6:e4c130fd:::20005f3b582.:head : failed to pick
> >>> suitable auth object
> >>> 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 19 soid 6:e4c2e57b:::20005f11daa.:head :
> >>> omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> >>> 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> >>> 56dd55a2 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 72 soid 6:e4c2e57b:::20005f11daa.:head :
> >>> omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> >>> 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> >>> 56dd55a2 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 91 soid 6:e4c2e57b:::20005f11daa.:head :
> >>> omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> >>> 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> >>> 56dd55a2 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 soid 6:e4c2e57b:::20005f11daa.:head : failed to pick
> >>> suitable auth object
> >>> 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 19 soid 6:e4c40009:::20005f45f1b.:head :
> >>> omap_digest 0x7ccf5cc9 != omap_digest 0xe048d29 from auth oi
> >>> 6:e4c40009:::20005f45f1b.:head(203789'1033837 osd.3.0:3666949
> >>> dirty|omap|data

Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-08 Thread Brad Hubbard
On Fri, Oct 4, 2019 at 6:09 PM Marc Roos  wrote:
>
>  >
>  >Try something like the following on each OSD that holds a copy of
>  >rbd_data.1f114174b0dc51.0974 and see what output you get.
>  >Note that you can drop the bluestore flag if they are not bluestore
>  >osds and you will need the osd stopped at the time (set noout). Also
>  >note, snapids are displayed in hexadecimal in the output (but then '4'
>  >is '4' so not a big issues here).
>  >
>  >$ ceph-objectstore-tool --type bluestore --data-path
>  >/var/lib/ceph/osd/ceph-XX/ --pgid 17.36 --op list
>  >rbd_data.1f114174b0dc51.0974
>
> I got these results
>
> osd.7
> Error getting attr on : 17.36_head,#-19:6c00:::scrub_17.36:head#,
> (61) No data available
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]

Ah, so of course the problem is the snapshot is missing. You may need
to try something like the following on each of those osds.

$ ceph-objectstore-tool --type bluestore --data-path
/var/lib/ceph/osd/ceph-XX/ --pgid 17.36
'{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}'
remove-clone-metadata 4
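
Once that has been run for each OSD that still lists the clone (with the
OSD stopped and noout set, as before), restart the OSDs and kick off
another deep-scrub/repair of 17.36; that should hopefully clear the
remaining clone_missing error.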

>
> osd.12
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
> osd.29
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
>
>  >
>  >The likely issue here is the primary believes snapshot 4 is gone but
>  >there is still data and/or metadata on one of the replicas which is
>  >confusing the issue. If that is the case you can use the the
>  >ceph-objectstore-tool to delete the relevant snapshot(s)
>  >



-- 
Cheers,
Brad

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

