Re: [ceph-users] Nautilus - Balancer is always on

2019-08-07 Thread Konstantin Shalygin

ceph mgr module disable balancer

Error EINVAL: module 'balancer' cannot be disabled (always-on)

  


What's the way to restart the balancer? Restart the MGR service?

  


I want to suggest that the balancer developers set up a ceph-balancer.log for this
module, to get more information about what it's doing.



Maybe you should `ceph balancer off` first?



k

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Error Mounting CephFS

2019-08-07 Thread DHilsbos
JC;

Excellent, thank you!

I apologize, normally I'm better about RTFM...

Thank you,

Dominic L. Hilsbos, MBA 
Director – Information Technology 
Perform Air International Inc.
dhils...@performair.com 
www.PerformAir.com


From: JC Lopez [mailto:jelo...@redhat.com] 
Sent: Wednesday, August 07, 2019 11:52 AM
To: Dominic Hilsbos
Cc: Lopez Jean-Charles; fr...@dtu.dk; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Error Mounting CephFS

Hi,

See https://docs.ceph.com/docs/nautilus/cephfs/kernel/

-o mds_namespace={fsname}

Regards
JC

On Aug 7, 2019, at 10:24, dhils...@performair.com wrote:

All;

Thank you for your assistance, this led me to the fact that I hadn't set up the 
Ceph repo on this client server, and the ceph-common I had installed was 
version 10.

I got all of that squared away, and it all works.

I do have a couple of follow-up questions:
Can more than one system mount the same CephFS at the same time?
If your cluster has several CephFS filesystems defined, how do you select which 
gets mounted, as the fs name doesn't appear to be used in the mount command?

Thank you,

Dominic L. Hilsbos, MBA 
Director - Information Technology 
Perform Air International Inc.
dhils...@performair.com 
www.PerformAir.com



-Original Message-
From: Frank Schilder [mailto:fr...@dtu.dk] 
Sent: Wednesday, August 07, 2019 2:48 AM
To: Dominic Hilsbos
Cc: ceph-users
Subject: Re: [ceph-users] Error Mounting CephFS

On Centos7, the option "secretfile" requires installation of ceph-fuse.

Best regards,

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: ceph-users  on behalf of Yan, Zheng 

Sent: 07 August 2019 10:10:19
To: dhils...@performair.com
Cc: ceph-users
Subject: Re: [ceph-users] Error Mounting CephFS

On Wed, Aug 7, 2019 at 3:46 PM  wrote:


All;

I have a server running CentOS 7.6 (1810), that I want to set up with CephFS 
(full disclosure, I'm going to be running samba on the CephFS).  I can mount 
the CephFS fine when I use the option secret=, but when I switch to 
secretfile=, I get an error "No such process."  I installed ceph-common.

Is there a service that I'm not aware I should be starting?
Do I need to install another package?

mount.ceph is missing. Check if it exists and is located in $PATH.
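A hedged way to run that check on the client (assuming the usual CentOS packaging, where mount.ceph ships in ceph-common):

```shell
# Check whether the mount.ceph helper used by "mount -t ceph" is available;
# print where it lives, or a hint if it is absent.
if command -v mount.ceph >/dev/null 2>&1; then
  echo "mount.ceph found at: $(command -v mount.ceph)"
else
  echo "mount.ceph missing - reinstall/upgrade ceph-common"
fi
```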


Thank you,

Dominic L. Hilsbos, MBA
Director - Information Technology
Perform Air International Inc.
dhils...@performair.com
www.PerformAir.com





Re: [ceph-users] Error Mounting CephFS

2019-08-07 Thread JC Lopez
Hi,

See https://docs.ceph.com/docs/nautilus/cephfs/kernel/ 


-o mds_namespace={fsname}
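As a hedged example, in an /etc/fstab entry the option could look like the following sketch (the monitor address, mount point, client name, and fs name are placeholders, not values from this thread):

```
# kernel CephFS mount pinning a specific filesystem via mds_namespace
mon1:6789:/  /mnt/cephfs  ceph  name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=cephfs1,noatime  0 0
```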

Regards
JC

> On Aug 7, 2019, at 10:24, dhils...@performair.com wrote:
> 
> All;
> 
> Thank you for your assistance, this led me to the fact that I hadn't set up 
> the Ceph repo on this client server, and the ceph-common I had installed was 
> version 10.
> 
> I got all of that squared away, and it all works.
> 
> I do have a couple of follow-up questions:
> Can more than one system mount the same CephFS at the same time?
> If your cluster has several CephFS filesystems defined, how do you select 
> which gets mounted, as the fs name doesn't appear to be used in the mount 
> command?
> 
> Thank you,
> 
> Dominic L. Hilsbos, MBA 
> Director - Information Technology 
> Perform Air International Inc.
> dhils...@performair.com 
> www.PerformAir.com
> 
> 
> 
> -Original Message-
> From: Frank Schilder [mailto:fr...@dtu.dk] 
> Sent: Wednesday, August 07, 2019 2:48 AM
> To: Dominic Hilsbos
> Cc: ceph-users
> Subject: Re: [ceph-users] Error Mounting CephFS
> 
> On Centos7, the option "secretfile" requires installation of ceph-fuse.
> 
> Best regards,
> 
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> 
> 
> From: ceph-users  on behalf of Yan, Zheng 
> 
> Sent: 07 August 2019 10:10:19
> To: dhils...@performair.com
> Cc: ceph-users
> Subject: Re: [ceph-users] Error Mounting CephFS
> 
> On Wed, Aug 7, 2019 at 3:46 PM  wrote:
>> 
>> All;
>> 
>> I have a server running CentOS 7.6 (1810), that I want to set up with CephFS 
>> (full disclosure, I'm going to be running samba on the CephFS).  I can mount 
>> the CephFS fine when I use the option secret=, but when I switch to 
>> secretfile=, I get an error "No such process."  I installed ceph-common.
>> 
>> Is there a service that I'm not aware I should be starting?
>> Do I need to install another package?
>> 
> 
> mount.ceph is missing. Check if it exists and is located in $PATH.
> 
>> Thank you,
>> 
>> Dominic L. Hilsbos, MBA
>> Director - Information Technology
>> Perform Air International Inc.
>> dhils...@performair.com
>> www.PerformAir.com
>> 
>> 



Re: [ceph-users] Can kstore be used as OSD objectstore backend when deploying a Ceph Storage Cluster? If can, how to?

2019-08-07 Thread Gregory Farnum
No; KStore is not for real use AFAIK.

On Wed, Aug 7, 2019 at 12:24 AM R.R.Yuan  wrote:
>
> Hi, All,
>
>When deploying a development cluster, there are three types of OSD 
> objectstore backend: filestore, bluestore and kstore.
>But there is no "--kstore" option when using the "ceph-deploy osd" command 
> to deploy a real ceph cluster.
>
>Can kstore be used as the OSD objectstore backend when deploying a real 
> ceph cluster? If it can, how?
>
>
> Thanks a lot
> R.R.Yuan
>
>
>
>


Re: [ceph-users] Error Mounting CephFS

2019-08-07 Thread DHilsbos
All;

Thank you for your assistance, this led me to the fact that I hadn't set up the 
Ceph repo on this client server, and the ceph-common I had installed was 
version 10.

I got all of that squared away, and it all works.

I do have a couple of follow-up questions:
Can more than one system mount the same CephFS at the same time?
If your cluster has several CephFS filesystems defined, how do you select which 
gets mounted, as the fs name doesn't appear to be used in the mount command?

Thank you,

Dominic L. Hilsbos, MBA 
Director - Information Technology 
Perform Air International Inc.
dhils...@performair.com 
www.PerformAir.com



-Original Message-
From: Frank Schilder [mailto:fr...@dtu.dk] 
Sent: Wednesday, August 07, 2019 2:48 AM
To: Dominic Hilsbos
Cc: ceph-users
Subject: Re: [ceph-users] Error Mounting CephFS

On Centos7, the option "secretfile" requires installation of ceph-fuse.

Best regards,

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: ceph-users  on behalf of Yan, Zheng 

Sent: 07 August 2019 10:10:19
To: dhils...@performair.com
Cc: ceph-users
Subject: Re: [ceph-users] Error Mounting CephFS

On Wed, Aug 7, 2019 at 3:46 PM  wrote:
>
> All;
>
> I have a server running CentOS 7.6 (1810), that I want to set up with CephFS 
> (full disclosure, I'm going to be running samba on the CephFS).  I can mount 
> the CephFS fine when I use the option secret=, but when I switch to 
> secretfile=, I get an error "No such process."  I installed ceph-common.
>
> Is there a service that I'm not aware I should be starting?
> Do I need to install another package?
>

mount.ceph is missing. Check if it exists and is located in $PATH.

> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director - Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>


[ceph-users] FYI: Mailing list domain change

2019-08-07 Thread David Galloway
Hi all,

I am in the process of migrating the upstream Ceph mailing lists from
Dreamhost to a self-hosted instance of Mailman 3.

Please update your address book and mail filters to ceph-us...@ceph.io
(notice the Top Level Domain change).

You may receive a "Welcome" e-mail as I subscribe you to the new list.
No other action should be required on your part.
-- 
David Galloway
Systems Administrator, RDU
Ceph Engineering
IRC: dgalloway


[ceph-users] ceph device list empty

2019-08-07 Thread Gary Molenkamp
I'm testing an upgrade to Nautilus on a development cluster and the 
command "ceph device ls" is returning an empty list.

# ceph device ls
DEVICE HOST:DEV DAEMONS LIFE EXPECTANCY
#

I have walked through the luminous upgrade documentation under 
https://docs.ceph.com/docs/master/releases/nautilus/#upgrading-from-mimic-or-luminous
but I don't see anything pertaining to "activating" device support under 
Nautilus.

The devices are visible to ceph-volume on the OSS nodes, i.e.:

osdev-stor1 ~]# ceph-volume lvm list

====== osd.0 ======
  [block]       /dev/ceph-f5eb16ec-7074-477b-8f83-ce87c5f74fa3/osd-block-c1de464f-d838-4558-ba75-1c268e538d6b

      block device  /dev/ceph-f5eb16ec-7074-477b-8f83-ce87c5f74fa3/osd-block-c1de464f-d838-4558-ba75-1c268e538d6b
      block uuid    dlbIm6-H5za-001b-C3mQ-EGks-yoed-zoQpoo
      devices       /dev/sdb

====== osd.2 ======
  [block]       /dev/ceph-37145a74-6b2b-4519-b72e-2defe11732aa/osd-block-e06c513b-5af3-4bf6-927f-1f0142c59e8a

      block device  /dev/ceph-37145a74-6b2b-4519-b72e-2defe11732aa/osd-block-e06c513b-5af3-4bf6-927f-1f0142c59e8a
      block uuid    egdvpm-3bXx-xmNO-ACzp-nxax-Wka2-81rfNT
      devices       /dev/sdc

Is there a step I missed?
Thanks.

Gary.



-- 
Gary Molenkamp  Computer Science/Science Technology Services
Systems Administrator   University of Western Ontario
molen...@uwo.ca http://www.csd.uwo.ca
(519) 661-2111 x86882   (519) 661-3566



Re: [ceph-users] OSD's keep crasching after clusterreboot

2019-08-07 Thread Ansgar Jazdzewski
another update,

we have now taken the more destructive route and removed the cephfs pools
(luckily we had only test data in the filesystem).
Our hope was that during the startup process the OSD would delete the
no-longer-needed PGs, but this is NOT the case.

So we still have the same issue; the only difference is that the PG
no longer belongs to a pool.

 -360> 2019-08-07 14:52:32.655 7fb14db8de00  5 osd.44 pg_epoch: 196586
pg[23.f8s0(unlocked)] enter Initial
 -360> 2019-08-07 14:52:32.659 7fb14db8de00 -1
/build/ceph-13.2.6/src/osd/ECUtil.h: In function
'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread
7fb14db8de00 time 2019-08-07 14:52:32.660169
/build/ceph-13.2.6/src/osd/ECUtil.h: 34: FAILED assert(stripe_width %
stripe_size == 0)
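The failing invariant can be restated in a few lines (a hedged reconstruction of the check, not the actual Ceph source): the pool's stripe_width must be an exact multiple of the stripe size derived from the EC profile.

```python
# Sketch of the assert from ECUtil.h: stripe_width % stripe_size == 0.
# The example numbers below are illustrative, not taken from this cluster.
def stripe_ok(stripe_width: int, stripe_size: int) -> bool:
    """Mirror of FAILED assert(stripe_width % stripe_size == 0)."""
    return stripe_width % stripe_size == 0

print(stripe_ok(4096, 1024))  # True: width is an exact multiple
print(stripe_ok(4100, 1024))  # False: this shape would trip the assert
```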

We can now take one route and try to delete the PG by hand in the OSD
(bluestore): how can this be done? Or we try to upgrade to Nautilus and
hope for the best.

Any help or hints are welcome,
have a nice one
Ansgar

Am Mi., 7. Aug. 2019 um 11:32 Uhr schrieb Ansgar Jazdzewski
:
>
> Hi,
>
> as a follow-up:
> * a full log of one OSD failing to start https://pastebin.com/T8UQ2rZ6
> * our ec-pool creation in the first place https://pastebin.com/20cC06Jn
> * ceph osd dump and ceph osd erasure-code-profile get cephfs
> https://pastebin.com/TRLPaWcH
>
> as we try to dig more into it, it looks like a bug in the cephfs or
> erasure-coding part of ceph.
>
> Ansgar
>
>
> Am Di., 6. Aug. 2019 um 14:50 Uhr schrieb Ansgar Jazdzewski
> :
> >
> > hi folks,
> >
> > we had to move one of our clusters so we had to boot all servers, now
> > we found an Error on all OSD with the EC-Pool.
> >
> > are we missing some options? Will an upgrade to 13.2.6 help?
> >
> >
> > Thanks,
> > Ansgar
> >
> > 2019-08-06 12:10:16.265 7fb337b83200 -1
> > /build/ceph-13.2.4/src/osd/ECUtil.h: In function
> > 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread
> > 7fb337b83200 time 2019-08-06 12:10:16.263025
> > /build/ceph-13.2.4/src/osd/ECUtil.h: 34: FAILED assert(stripe_width %
> > stripe_size == 0)
> >
> > ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic
> > (stable) 1: (ceph::ceph_assert_fail(char const, char const, int, char
> > const)+0x102) [0x7fb32eeb83c2] 2: (()+0x2e5587) [0x7fb32eeb8587] 3:
> > (ECBackend::ECBackend(PGBackend::Listener, coll_t const&,
> > boost::intrusive_ptr&, ObjectStore,
> > CephContext, std::shared_ptr, unsigned
> > long)+0x4de) [0xa4cbbe] 4: (PGBackend::build_pg_backend(pg_pool_t
> > const&, std::map > std::char_traits, std::allocator >,
> > std::cxx11::basic_string,
> > std::allocator >, std::less > std::char_traits, std::allocator > >, std
> > ::allocator > std::char_traits, std::allocator > const,
> > std::cxx11::basic_string,
> > std::allocator > > > > const&, PGBackend::Listener, coll_t,
> > boost::intrusive_ptr&, ObjectStore,
> > CephContext)+0x2f9 ) [0x9474e9] 5:
> > (PrimaryLogPG::PrimaryLogPG(OSDService, std::shared_ptr,
> > PGPool const&, std::map > std::char_traits, std::allocator >,
> > std::cxx11::basic_string,
> > std::allocator >, std::less > std::char_tra its, std::allocator > >,
> > std::allocator > std::char_traits, std::allocator > const,
> > std::cxx11::basic_string,
> > std::allocator > > > > const&, spg_t)+0x138) [0x8f96e8] 6:
> > (OSD::_make_pg(std::shared_ptr, spg_t)+0x11d3)
> > [0x753553] 7: (OSD::load_pgs()+0x4a9) [0x758339] 8:
> > (OSD::init()+0xcd3) [0x7619c3] 9: (main()+0x3678) [0x64d6a8] 10:
> > (libc_start_main()+0xf0) [0x7fb32ca68830] 11: (_start()+0x29)
> > [0x717389] NOTE: a copy of the executable, or objdump -rdS
> >  is needed to interpret this.


Re: [ceph-users] bluestore write iops calculation

2019-08-07 Thread vitalif

I can add RAM, and is there a way to increase rocksdb caching? Can I
increase bluestore_cache_size_hdd to a higher value to cache rocksdb?


In recent releases it's governed by the osd_memory_target parameter. In 
previous releases it's bluestore_cache_size_hdd. Check release notes to 
know for sure.



We have planned to add some SSDs; how many OSDs' rocksdbs can we
put per SSD? And I guess if one SSD goes down then all related OSDs
have to be re-installed.


Yes. At least you'd better not put all 24 block.db's on a single SSD :) 
4-8 HDDs per SSD is usually fine. Also check db_used_bytes in `ceph 
daemon osd.0 perf dump` (replace 0 with actual OSD numbers) to figure 
out how much space your DBs use. If it's below 30 GB you're lucky, because 
in that case the DBs will fit on 30 GB SSD partitions. 
https://yourcmc.ru/wiki/Ceph_performance#About_block.db_sizing
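The sizing advice above can be sketched numerically (illustrative helper names, not a Ceph API; the 6-per-SSD default is just the middle of the 4-8 range):

```python
import math

# 4-8 HDD OSDs per block.db SSD is the rule of thumb mentioned above.
def ssds_needed(n_hdds: int, hdds_per_ssd: int = 6) -> int:
    """How many SSDs to host block.db for n_hdds HDD OSDs."""
    return math.ceil(n_hdds / hdds_per_ssd)

# Check whether a DB (db_used_bytes from "ceph daemon osd.N perf dump")
# would fit a 30 GiB partition.
def fits_30g(db_used_bytes: int) -> bool:
    return db_used_bytes <= 30 * 2**30

print(ssds_needed(24))       # 4 SSDs at 6 HDDs each, not 1 SSD for all 24
print(fits_30g(18 * 2**30))  # True: an 18 GiB DB fits a 30 GiB partition
```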


--
Vitaliy Filippov


Re: [ceph-users] New CRUSH device class questions

2019-08-07 Thread Paul Emmerich
On Wed, Aug 7, 2019 at 9:30 AM Robert LeBlanc  wrote:

>> # ceph osd crush rule dump replicated_racks_nvme
>> {
>>  "rule_id": 0,
>>  "rule_name": "replicated_racks_nvme",
>>  "ruleset": 0,
>>  "type": 1,
>>  "min_size": 1,
>>  "max_size": 10,
>>  "steps": [
>>  {
>>  "op": "take",
>>  "item": -44,
>>  "item_name": "default~nvme"<
>>  },
>>  {
>>  "op": "chooseleaf_firstn",
>>  "num": 0,
>>  "type": "rack"
>>  },
>>  {
>>  "op": "emit"
>>  }
>>  ]
>> }
>> ```
>
>
> Yes, our HDD cluster is much like this, but not Luminous, so we created a 
> separate root with SSD OSDs for the metadata and set up a CRUSH rule for the 
> metadata pool to be mapped to SSD. I understand that the CRUSH rule should 
> have a `step take default class ssd`, which I don't see in your rule unless 
> the `~` in the item_name means device class.

~ is the internal implementation of device classes. Internally it's
still using separate roots, that's how it stays compatible with older
clients that don't know about device classes.

And since it wasn't mentioned here yet: consider upgrading to Nautilus
to benefit from the new and improved accounting for metadata space.
You'll be able to see how much space is used for metadata and quotas
should work properly for metadata usage.


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

>
> Thanks
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


Re: [ceph-users] out of memory bluestore osds

2019-08-07 Thread Mark Nelson

Hi Jaime,


We only use the cache size parameters now if you've disabled 
autotuning.  With autotuning we adjust the cache size on the fly to try 
to keep the mapped process memory under the osd_memory_target.  You can 
set a lower memory target than the default, though you will have far less 
cache for bluestore onodes and rocksdb.  You may notice that it's 
slower, especially if you have a big active data set you are 
processing.  I don't usually recommend setting the osd_memory_target 
below 2GB.  At some point it will have shrunk the caches as far as it 
can and the process memory may start exceeding the target.  (With our 
default rocksdb and pglog settings this usually happens somewhere 
between 1.3-1.7GB once the OSD has been sufficiently saturated with IO.) 
Given memory prices right now, I'd still recommend upgrading RAM if you 
have the ability, though.  You might be able to get away with setting 
each OSD to 2-2.5GB in your scenario, but you'll be pushing it.
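The arithmetic behind this can be sketched as follows (illustrative names, not a Ceph API; the 2 GiB OS reserve is an assumption):

```python
# With osd_memory_target at its 4 GiB default, the 12 OSDs in this scenario
# want ~48 GiB, but the host only has 32 GiB of RAM.
def per_osd_target_gib(total_ram_gib: float, n_osds: int,
                       os_reserve_gib: float = 2.0) -> float:
    """Evenly split the RAM left after an OS reserve across OSD daemons."""
    return (total_ram_gib - os_reserve_gib) / n_osds

print(12 * 4)                      # 48 GiB wanted at the default target
print(per_osd_target_gib(32, 12))  # 2.5 GiB each: the risky 2-2.5 GiB band
```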



I would not recommend lowering the osd_memory_cache_min.  You really 
want rocksdb indexes/filters fitting in cache, and as many bluestore 
onodes as you can get.  In any event, you'll still be bound by the 
(currently hardcoded) 64MB cache chunk allocation size in the autotuner 
which osd_memory_cache_min can't reduce (and that's per cache while 
osd_memory_cache_min is global for the kv,buffer, and rocksdb block 
caches).  IE each cache is going to get 64MB+growth room regardless of 
how low you set osd_memory_cache_min.  That's intentional as we don't 
want a single SST file in rocksdb to be able to completely blow 
everything else out of the block cache during compaction, only to 
quickly become invalid, removed from the cache, and make it look to the 
priority cache system like rocksdb doesn't actually need any more memory 
for cache.



Mark


On 8/7/19 7:44 AM, Jaime Ibar wrote:

Hi all,

we run a Ceph Luminous 12.2.12 cluster, 7 OSD servers with 12x4TB disks each.
Recently we redeployed the osds of one of them using the bluestore backend;
however, after this, we're facing out-of-memory errors (invoked 
oom-killer)
and the OS kills one of the ceph-osd processes.
The osd is restarted automatically and is back online after one minute.
The osd is restarted automatically and back online after one minute.
We're running Ubuntu 16.04, kernel 4.15.0-55-generic.
The server has 32GB of RAM and 4GB of swap partition.
All the disks are hdd, no ssd disks.
Bluestore settings are the default ones

"osd_memory_target": "4294967296"
"osd_memory_cache_min": "134217728"
"bluestore_cache_size": "0"
"bluestore_cache_size_hdd": "1073741824"
"bluestore_cache_autotune": "true"

As stated in the documentation, bluestore assigns by default 4GB of
RAM per OSD (1GB of RAM per 1TB).
So in this case 48GB of RAM would be needed. Am I right?

Are these the minimun requirements for bluestore?
In case adding more RAM is not an option, can any of
osd_memory_target, osd_memory_cache_min, bluestore_cache_size_hdd
be decreased to fit in our server specs?
Would this have any impact on performance?

Thanks
Jaime




Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-07 Thread Igor Fedotov

Manuel,

well, this is a bit different from the tickets I shared... But still 
looks like slow DB access.


80+ seconds for submit/commit latency is TOO HIGH, this definitely might 
cause suicides...


Have you had a chance to inspect disk utilization?


Introducing an NVMe drive for WAL/DB might be a good idea; I can see up to 
20GB allocated for META, so they perfectly fit into a 480GB NVMe drive.


Having a single drive isn't perfect from performance and failure-domain 
points of view, though... I'd rather prefer 4-6 OSDs per drive...



As a workaround you might also try disabling deep scrub.


Thanks,

Igor

On 8/7/2019 2:59 PM, EDH - Manuel Rios Fernandez wrote:


Hi Igor

Yes we got all in same device :

[root@CEPH-MON01 ~]# ceph osd df tree

ID CLASS   WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
31         130.96783        - 131 TiB 114 TiB 114 TiB  14 MiB 204 GiB  17 TiB 86.88 1.03   -        host CEPH008
 5 archive  10.91399  0.80002  11 TiB 7.9 TiB 7.9 TiB 2.6 MiB  15 GiB 3.0 TiB 72.65 0.86 181     up osd.5
 6 archive  10.91399  1.0      11 TiB 9.4 TiB 9.3 TiB 5.8 MiB  17 GiB 1.6 TiB 85.76 1.01 222     up osd.6
11 archive  10.91399  1.0      11 TiB  10 TiB  10 TiB  48 KiB  19 GiB 838 GiB 92.50 1.09 251     up osd.11
45 archive  10.91399  1.0      11 TiB  10 TiB  10 TiB 148 KiB  18 GiB 678 GiB 93.94 1.11 248     up osd.45
46 archive  10.91399  1.0      11 TiB 9.6 TiB 9.5 TiB 4.7 MiB  17 GiB 1.4 TiB 87.52 1.04 235     up osd.46
47 archive  10.91399  1.0      11 TiB 8.8 TiB 8.8 TiB  68 KiB  17 GiB 2.1 TiB 80.43 0.95 211     up osd.47
55 archive  10.91399  1.0      11 TiB 9.9 TiB 9.9 TiB 132 KiB  17 GiB 1.0 TiB 90.74 1.07 243     up osd.55
70 archive  10.91399  1.0      11 TiB  10 TiB  10 TiB  44 KiB  19 GiB 864 GiB 92.27 1.09 236     up osd.70
71 archive  10.91399  1.0      11 TiB 9.2 TiB 9.2 TiB  28 KiB  16 GiB 1.7 TiB 84.19 1.00 228     up osd.71
78 archive  10.91399  1.0      11 TiB 8.9 TiB 8.9 TiB 182 KiB  16 GiB 2.0 TiB 81.87 0.97 215     up osd.78
79 archive  10.91399  1.0      11 TiB  10 TiB  10 TiB 152 KiB  17 GiB 958 GiB 91.43 1.08 238     up osd.79
91 archive  10.91399  1.0      11 TiB 9.7 TiB 9.7 TiB  92 KiB  17 GiB 1.2 TiB 89.22 1.06 232     up osd.91


Disks are 12TB HGST drives for archive purposes.

On the same OSD we see some bluestore commit latency entries in the log:

2019-08-07 06:57:33.681 7f059b06e700  0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 11.163s

2019-08-07 06:57:33.703 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 11.1858s, txc = 0x55e9e3ea2c00

2019-08-07 09:14:00.620 7f059d072700  0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 7.23777s

2019-08-07 09:14:00.650 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 7.26778s, txc = 0x55eaafbf6600

2019-08-07 09:19:08.242 7f059e875700  0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 81.8293s

2019-08-07 09:19:08.291 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 81.8609s, txc = 0x55ea05ee6000

2019-08-07 09:19:08.467 7f059b06e700  0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 87.7795s

2019-08-07 09:19:08.481 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 87.7928s, txc = 0x55eaa7a40600


Maybe moving OMAP + META from all OSDs to a 480GB NVMe per node would help 
in this situation, but I'm not sure.


Manuel

*From:* Igor Fedotov 
*Sent:* Wednesday, August 7, 2019 13:10
*To:* EDH - Manuel Rios Fernandez ; 'Ceph 
Users' 

*Subject:* Re: [ceph-users] 14.2.2 - OSD Crash

Hi Manuel,

as Brad pointed out timeouts and suicides are rather consequences of 
some other issues with OSDs.


I recall at least two recent relevant tickets:

https://tracker.ceph.com/issues/36482

https://tracker.ceph.com/issues/40741 (see last comments)

Both had massive and slow reads from RocksDB which caused timeouts..

Visible symptom for both cases was  unexpectedly high read I/O from 
underlying disks (main and/or DB).


You can use iotop for inspection.

These were worsened by having a significant part of the DB on spinners due 
to spillovers. So I'm wondering what's your layout in this respect:


what drives back the troublesome OSDs, is there any spillover to the slow 
device, and how massive is it?


Also could you please inspect your OSD 

[ceph-users] out of memory bluestore osds

2019-08-07 Thread Jaime Ibar

Hi all,

we run a Ceph Luminous 12.2.12 cluster, 7 OSD servers with 12x4TB disks each.
Recently we redeployed the osds of one of them using the bluestore backend;
however, after this, we're facing out-of-memory errors (invoked oom-killer)
and the OS kills one of the ceph-osd process.
The osd is restarted automatically and back online after one minute.
We're running Ubuntu 16.04, kernel 4.15.0-55-generic.
The server has 32GB of RAM and 4GB of swap partition.
All the disks are hdd, no ssd disks.
Bluestore settings are the default ones

"osd_memory_target": "4294967296"
"osd_memory_cache_min": "134217728"
"bluestore_cache_size": "0"
"bluestore_cache_size_hdd": "1073741824"
"bluestore_cache_autotune": "true"

As stated in the documentation, bluestore assigns by default 4GB of
RAM per OSD (1GB of RAM per 1TB).
So in this case 48GB of RAM would be needed. Am I right?

Are these the minimun requirements for bluestore?
In case adding more RAM is not an option, can any of
osd_memory_target, osd_memory_cache_min, bluestore_cache_size_hdd
be decreased to fit in our server specs?
Would this have any impact on performance?

Thanks
Jaime

--

Jaime Ibar
High Performance & Research Computing, IS Services
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | ja...@tchpc.tcd.ie
Tel: +353-1-896-3725



Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-07 Thread EDH - Manuel Rios Fernandez
Hi Igor 

 

Yes we got all in same device :

 

[root@CEPH-MON01 ~]# ceph osd df tree

ID  CLASS   WEIGHTREWEIGHT SIZERAW USE DATAOMAPMETA
AVAIL   %USE  VAR  PGS STATUS TYPE NAME

31 130.96783- 131 TiB 114 TiB 114 TiB  14 MiB  204 GiB  17
TiB 86.88 1.03   -host CEPH008

  5 archive  10.91399  0.80002  11 TiB 7.9 TiB 7.9 TiB 2.6 MiB   15 GiB 3.0
TiB 72.65 0.86 181 up osd.5

  6 archive  10.91399  1.0  11 TiB 9.4 TiB 9.3 TiB 5.8 MiB   17 GiB 1.6
TiB 85.76 1.01 222 up osd.6

11 archive  10.91399  1.0  11 TiB  10 TiB  10 TiB  48 KiB   19 GiB 838
GiB 92.50 1.09 251 up osd.11

45 archive  10.91399  1.0  11 TiB  10 TiB  10 TiB 148 KiB   18 GiB 678
GiB 93.94 1.11 248 up osd.45

46 archive  10.91399  1.0  11 TiB 9.6 TiB 9.5 TiB 4.7 MiB   17 GiB 1.4
TiB 87.52 1.04 235 up osd.46

47 archive  10.91399  1.0  11 TiB 8.8 TiB 8.8 TiB  68 KiB   17 GiB 2.1
TiB 80.43 0.95 211 up osd.47

55 archive  10.91399  1.0  11 TiB 9.9 TiB 9.9 TiB 132 KiB   17 GiB 1.0
TiB 90.74 1.07 243 up osd.55

70 archive  10.91399  1.0  11 TiB  10 TiB  10 TiB  44 KiB   19 GiB 864
GiB 92.27 1.09 236 up osd.70

71 archive  10.91399  1.0  11 TiB 9.2 TiB 9.2 TiB  28 KiB   16 GiB 1.7
TiB 84.19 1.00 228 up osd.71

78 archive  10.91399  1.0  11 TiB 8.9 TiB 8.9 TiB 182 KiB   16 GiB 2.0
TiB 81.87 0.97 215 up osd.78

79 archive  10.91399  1.0  11 TiB  10 TiB  10 TiB 152 KiB   17 GiB 958
GiB 91.43 1.08 238 up osd.79

91 archive  10.91399  1.0  11 TiB 9.7 TiB 9.7 TiB  92 KiB   17 GiB 1.2
TiB 89.22 1.06 232 up osd.91

 

Disks are 12TB HGST drives for archive purposes.

 

On the same OSD we see some bluestore commit latency entries in the log:

 

2019-08-07 06:57:33.681 7f059b06e700  0 bluestore(/var/lib/ceph/osd/ceph-46)
queue_transactions slow operation observed for l_bluestore_submit_lat,
latency = 11.163s

2019-08-07 06:57:33.703 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46)
_txc_committed_kv slow operation observed for l_bluestore_commit_lat,
latency = 11.1858s, txc = 0x55e9e3ea2c00

2019-08-07 09:14:00.620 7f059d072700  0 bluestore(/var/lib/ceph/osd/ceph-46)
queue_transactions slow operation observed for l_bluestore_submit_lat,
latency = 7.23777s

2019-08-07 09:14:00.650 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46)
_txc_committed_kv slow operation observed for l_bluestore_commit_lat,
latency = 7.26778s, txc = 0x55eaafbf6600

2019-08-07 09:19:08.242 7f059e875700  0 bluestore(/var/lib/ceph/osd/ceph-46)
queue_transactions slow operation observed for l_bluestore_submit_lat,
latency = 81.8293s

2019-08-07 09:19:08.291 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46)
_txc_committed_kv slow operation observed for l_bluestore_commit_lat,
latency = 81.8609s, txc = 0x55ea05ee6000

2019-08-07 09:19:08.467 7f059b06e700  0 bluestore(/var/lib/ceph/osd/ceph-46)
queue_transactions slow operation observed for l_bluestore_submit_lat,
latency = 87.7795s

2019-08-07 09:19:08.481 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46)
_txc_committed_kv slow operation observed for l_bluestore_commit_lat,
latency = 87.7928s, txc = 0x55eaa7a40600

 

Maybe moving OMAP + META from all OSDs to a 480GB NVMe per node would help in
this situation, but I'm not sure.

 

Manuel

 

 

 

 

From: Igor Fedotov  
Sent: Wednesday, August 7, 2019 13:10
To: EDH - Manuel Rios Fernandez ; 'Ceph Users'

Subject: Re: [ceph-users] 14.2.2 - OSD Crash

 

Hi Manuel,

as Brad pointed out timeouts and suicides are rather consequences of some
other issues with OSDs.

I recall at least two recent relevant tickets:

https://tracker.ceph.com/issues/36482

https://tracker.ceph.com/issues/40741 (see last comments)

Both had massive and slow reads from RocksDB which caused timeouts..

Visible symptom for both cases was  unexpectedly high read I/O from
underlying disks (main and/or DB). 

You can use iotop for inspection.

These were worsened by having a significant part of the DB on spinners due to
spillovers. So I'm wondering what's your layout in this respect:

what drives back the troublesome OSDs, is there any spillover to the slow device,
and how massive is it?

Also, could you please inspect your OSD logs for lines containing the
"slow operation observed" substring, and share them if any.

 

Hope this helps.

Thanks,

Igor

 

 

On 8/7/2019 2:16 AM, EDH - Manuel Rios Fernandez wrote:

Hi 

 

We got a pair of OSD located in  node that crash randomly since 14.2.2

 

OS Version : Centos 7.6

 

There are a ton of lines before the crash; the following were unexpected:

 

--

-3045> 2019-08-07 00:39:32.013 7fe9a4996700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15

-3044> 2019-08-07 00:39:32.013 7fe9a3994700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15

-3043> 2019-08-07 00:39:32.033 7fe9a4195700  1 

[ceph-users] Nautilus - Balancer is always on

2019-08-07 Thread EDH - Manuel Rios Fernandez
Hi All,

 

 

ceph mgr module disable balancer

Error EINVAL: module 'balancer' cannot be disabled (always-on)

 

What's the way to restart the balancer? Restart the MGR service?

 

I want to suggest to the Balancer developers that they set up a
ceph-balancer.log for this module, to get more information about what it's
doing.
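
For what it's worth, the always-on module can still be paused and resumed,
which is effectively a restart of its work (Nautilus CLI, run against a live
cluster; a sketch, not a full procedure):

```shell
# Pause the always-on balancer (the module stays loaded, optimization stops)
ceph balancer off
# Resume it
ceph balancer on
# Inspect mode and whether it is currently active
ceph balancer status
# Heavier hammer: restart the active mgr daemon itself
#   systemctl restart ceph-mgr@$(hostname -s)
```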

 

Regards

 

Manuel

 

 

 

 



Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-07 Thread Igor Fedotov

Hi Manuel,

as Brad pointed out timeouts and suicides are rather consequences of 
some other issues with OSDs.


I recall at least two recent relevant tickets:

https://tracker.ceph.com/issues/36482

https://tracker.ceph.com/issues/40741 (see last comments)

Both had massive and slow reads from RocksDB, which caused timeouts.

The visible symptom in both cases was unexpectedly high read I/O from the
underlying disks (main and/or DB).


You can use iotop for inspection.

These were worsened by having a significant part of the DB on spinners due to
spillovers. So I'm wondering what your layout is in this respect:

what drives back the troublesome OSDs, is there any spillover to the slow
device, and how massive is it?


Also, could you please inspect your OSD logs for the presence of lines
containing the "slow operation observed" substring, and share them if any?
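
As a rough illustration, the latencies in such lines can be pulled out and
ranked with standard tools (a sketch; the log path in the comment is the usual
default and may differ on your nodes):

```shell
# Extract "latency = <seconds>s" from "slow operation observed" lines and
# rank them, worst first. On a live node you would feed it the real logs:
#   grep -h "slow operation observed" /var/log/ceph/ceph-osd.*.log | rank_slow_ops
rank_slow_ops() {
    sed -n 's/.*slow operation observed.*latency = \([0-9.]*\)s.*/\1/p' | sort -rn
}

# Demo on two lines quoted earlier in this thread:
rank_slow_ops <<'EOF'
2019-08-07 09:19:08.242 7f059e875700  0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 81.8293s
2019-08-07 09:19:08.291 7f05a8088700  0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 81.8609s, txc = 0x55ea05ee6000
EOF
```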



Hope this helps.

Thanks,

Igor



On 8/7/2019 2:16 AM, EDH - Manuel Rios Fernandez wrote:


Hi

We got a pair of OSD located in  node that crash randomly since 14.2.2

OS Version : Centos 7.6

There are a ton of lines before the crash; the following were unexpected:

--

-3045> 2019-08-07 00:39:32.013 7fe9a4996700  1 heartbeat_map is_healthy 
'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15


-3044> 2019-08-07 00:39:32.013 7fe9a3994700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15


-3043> 2019-08-07 00:39:32.033 7fe9a4195700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15


-3042> 2019-08-07 00:39:32.033 7fe9a4996700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15


--

-

Some hundred lines of:

-164> 2019-08-07 00:47:36.628 7fe9a3994700  1 heartbeat_map is_healthy 
'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60


  -163> 2019-08-07 00:47:36.632 7fe9a3994700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60


  -162> 2019-08-07 00:47:36.632 7fe9a3994700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60


-

   -78> 2019-08-07 00:50:51.755 7fe995bfa700 10 monclient: tick

   -77> 2019-08-07 00:50:51.755 7fe995bfa700 10 monclient: 
_check_auth_rotating have uptodate secrets (they expire after 
2019-08-07 00:50:21.756453)


   -76> 2019-08-07 00:51:01.755 7fe995bfa700 10 monclient: tick

   -75> 2019-08-07 00:51:01.755 7fe995bfa700 10 monclient: 
_check_auth_rotating have uptodate secrets (they expire after 
2019-08-07 00:50:31.756604)


   -74> 2019-08-07 00:51:11.755 7fe995bfa700 10 monclient: tick

   -73> 2019-08-07 00:51:11.755 7fe995bfa700 10 monclient: 
_check_auth_rotating have uptodate secrets (they expire after 
2019-08-07 00:50:41.756788)


   -72> 2019-08-07 00:51:21.756 7fe995bfa700 10 monclient: tick

   -71> 2019-08-07 00:51:21.756 7fe995bfa700 10 monclient: 
_check_auth_rotating have uptodate secrets (they expire after 
2019-08-07 00:50:51.756982)


   -70> 2019-08-07 00:51:31.755 7fe995bfa700 10 monclient: tick

   -69> 2019-08-07 00:51:31.755 7fe995bfa700 10 monclient: 
_check_auth_rotating have uptodate secrets (they expire after 
2019-08-07 00:51:01.757206)


   -68> 2019-08-07 00:51:41.756 7fe995bfa700 10 monclient: tick

   -67> 2019-08-07 00:51:41.756 7fe995bfa700 10 monclient: 
_check_auth_rotating have uptodate secrets (they expire after 
2019-08-07 00:51:11.757364)


   -66> 2019-08-07 00:51:51.756 7fe995bfa700 10 monclient: tick

   -65> 2019-08-07 00:51:51.756 7fe995bfa700 10 monclient: 
_check_auth_rotating have uptodate secrets (they expire after 
2019-08-07 00:51:21.757535)


   -64> 2019-08-07 00:51:52.861 7fe987e49700  1 heartbeat_map 
clear_timeout 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out 
after 15


   -63> 2019-08-07 00:51:52.861 7fe987e49700  1 heartbeat_map 
clear_timeout 'OSD::osd_op_tp thread 0x7fe987e49700' had suicide timed 
out after 150


   -62> 2019-08-07 00:51:52.948 7fe99966c700  5 
bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 
4294967296 heap: 6018998272 unmapped: 1721180160 mapped: 4297818112 
old cache_size: 1994018210 new cache size: 1992784572


   -61> 2019-08-07 00:51:52.948 7fe99966c700  5 
bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 
1992784572 kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 
763363328 meta_used: 654593191 data_alloc: 452984832 data_used: 455929856


   -60> 2019-08-07 00:51:57.923 7fe99966c700  5 
bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 
1994110827 kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 
763363328 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944


   -59> 2019-08-07 00:51:57.973 7fe99966c700  5 
bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 
4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 
old cache_size: 1994110827 new cache size: 1994442069


   -58> 2019-08-07 00:52:01.756 7fe995bfa700 10 monclient: tick

   -57> 

Re: [ceph-users] Error Mounting CephFS

2019-08-07 Thread Frank Schilder
On CentOS 7, the option "secretfile" requires installation of ceph-fuse.

Best regards,

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: ceph-users  on behalf of Yan, Zheng 

Sent: 07 August 2019 10:10:19
To: dhils...@performair.com
Cc: ceph-users
Subject: Re: [ceph-users] Error Mounting CephFS

On Wed, Aug 7, 2019 at 3:46 PM  wrote:
>
> All;
>
> I have a server running CentOS 7.6 (1810), that I want to set up with CephFS 
> (full disclosure, I'm going to be running samba on the CephFS).  I can mount 
> the CephFS fine when I use the option secret=, but when I switch to 
> secretfile=, I get an error "No such process."  I installed ceph-common.
>
> Is there a service that I'm not aware of that I should be starting?
> Do I need to install another package?
>

mount.ceph is missing. Check that it exists and is located in $PATH.

> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director - Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>


Re: [ceph-users] OSD's keep crasching after clusterreboot

2019-08-07 Thread Ansgar Jazdzewski
Hi,

as a follow-up:
* a full log of one OSD failing to start https://pastebin.com/T8UQ2rZ6
* our ec-pool creation in the first place https://pastebin.com/20cC06Jn
* ceph osd dump and ceph osd erasure-code-profile get cephfs
https://pastebin.com/TRLPaWcH

As we try to dig more into it, it looks like a bug in the CephFS or
erasure-coding part of Ceph.

Ansgar


Am Di., 6. Aug. 2019 um 14:50 Uhr schrieb Ansgar Jazdzewski
:
>
> hi folks,
>
> we had to move one of our clusters, so we had to reboot all servers; now
> we have found an error on all OSDs with the EC pool.
>
> do we miss some options? will an upgrade to 13.2.6 help?
>
>
> Thanks,
> Ansgar
>
> 2019-08-06 12:10:16.265 7fb337b83200 -1
> /build/ceph-13.2.4/src/osd/ECUtil.h: In function
> 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread
> 7fb337b83200 time 2019-08-06 12:10:16.263025
> /build/ceph-13.2.4/src/osd/ECUtil.h: 34: FAILED assert(stripe_width %
> stripe_size == 0)
>
> ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic
> (stable) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x102) [0x7fb32eeb83c2] 2: (()+0x2e5587) [0x7fb32eeb8587] 3:
> (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*, std::shared_ptr, unsigned
> long)+0x4de) [0xa4cbbe] 4: (PGBackend::build_pg_backend(pg_pool_t
> const&, std::map std::char_traits, std::allocator >,
> std::__cxx11::basic_string,
> std::allocator >, std::less std::char_traits, std::allocator > >, std
> ::allocator std::char_traits, std::allocator > const,
> std::__cxx11::basic_string,
> std::allocator > > > > const&, PGBackend::Listener*, coll_t,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*)+0x2f9) [0x9474e9] 5:
> (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr,
> PGPool const&, std::map std::char_traits, std::allocator >,
> std::__cxx11::basic_string,
> std::allocator >, std::less std::char_traits, std::allocator > >,
> std::allocator std::char_traits, std::allocator > const,
> std::__cxx11::basic_string,
> std::allocator > > > > const&, spg_t)+0x138) [0x8f96e8] 6:
> (OSD::_make_pg(std::shared_ptr, spg_t)+0x11d3)
> [0x753553] 7: (OSD::load_pgs()+0x4a9) [0x758339] 8:
> (OSD::init()+0xcd3) [0x7619c3] 9: (main()+0x3678) [0x64d6a8] 10:
> (__libc_start_main()+0xf0) [0x7fb32ca68830] 11: (_start()+0x29)
> [0x717389] NOTE: a copy of the executable, or objdump -rdS
> <executable> is needed to interpret this.
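
For what it's worth, the failing assert is pure arithmetic: `stripe_width %
stripe_size == 0`. With the values from your profile you can check it by hand
(k and stripe_unit below are hypothetical; substitute the output of `ceph osd
erasure-code-profile get cephfs` and the pool's stripe_width from `ceph osd
dump`):

```shell
# stripe_width is normally k * stripe_unit; the OSD asserts that it is an
# exact multiple of the per-chunk stripe size. Hypothetical profile values:
k=4
stripe_unit=4096
stripe_width=$(( k * stripe_unit ))
if [ $(( stripe_width % stripe_unit )) -eq 0 ]; then
    echo "stripe_width=$stripe_width: OK (multiple of $stripe_unit)"
else
    echo "stripe_width=$stripe_width: would trip the assert"
fi
```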


Re: [ceph-users] Error Mounting CephFS

2019-08-07 Thread Yan, Zheng
On Wed, Aug 7, 2019 at 3:46 PM  wrote:
>
> All;
>
> I have a server running CentOS 7.6 (1810), that I want to set up with CephFS 
> (full disclosure, I'm going to be running samba on the CephFS).  I can mount 
> the CephFS fine when I use the option secret=, but when I switch to 
> secretfile=, I get an error "No such process."  I installed ceph-common.
>
> Is there a service that I'm not aware of that I should be starting?
> Do I need to install another package?
>

mount.ceph is missing. Check that it exists and is located in $PATH.
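
A quick way to verify the helper (a sketch; `mount.ceph` is shipped in the
ceph-common package, and the mount line in the trailing comment uses example
names and paths):

```shell
# The "secretfile=" option is parsed by the mount.ceph userspace helper;
# without it the kernel only sees options it does not understand and the
# mount fails with "No such process".
check_mount_helper() {
    if command -v mount.ceph >/dev/null 2>&1; then
        echo "mount.ceph found at: $(command -v mount.ceph)"
    else
        echo "mount.ceph not found - install ceph-common and check that /sbin is in PATH"
    fi
}
check_mount_helper
# Once the helper is present, a typical mount (example names/paths):
#   mount -t ceph mon1:6789:/ /mnt/cephfs \
#     -o name=cephfs,secretfile=/etc/ceph/cephfs.secret
```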

> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director - Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>


Re: [ceph-users] New CRUSH device class questions

2019-08-07 Thread Konstantin Shalygin

On 8/7/19 2:30 PM, Robert LeBlanc wrote:

... plus 11 more hosts just like this


Interesting. Please paste your full `ceph osd df tree`. What are your actual
NVMe models?


Yes, our HDD cluster is much like this, but not Luminous, so we 
created a separate root with SSD OSDs for the metadata and set up a 
CRUSH rule for the metadata pool to be mapped to SSD. I understand 
that the CRUSH rule should have a `step take default class ssd` which 
I don't see in your rule unless the `~` in the item_name means device 
class.

Indeed, this is a device class.


And a new crush rule may be created like this: `ceph osd crush rule
create-replicated <name> <root> <failure-domain> <class>`; for me it is: `ceph
osd crush rule create-replicated replicated_racks_nvme default rack nvme`
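
Spelled out against the pool names from this thread (a sketch for a live
cluster; `fs_meta` is the metadata pool from the `ceph osd dump` excerpt
above):

```shell
# Create a replicated rule restricted to the nvme device class
ceph osd crush rule create-replicated replicated_racks_nvme default rack nvme
# Point the CephFS metadata pool at it; PGs then remap to matching OSDs
ceph osd pool set fs_meta crush_rule replicated_racks_nvme
```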




k




Re: [ceph-users] New CRUSH device class questions

2019-08-07 Thread Robert LeBlanc
On Wed, Aug 7, 2019 at 12:08 AM Konstantin Shalygin  wrote:

> On 8/7/19 1:40 PM, Robert LeBlanc wrote:
>
> > Maybe it's the lateness of the day, but I'm not sure how to do that.
> > Do you have an example where all the OSDs are of class ssd?
> Can't parse what you mean. You should always paste your `ceph osd tree`
> first.
>

Our 'ceph osd tree' is like this:
ID  CLASS WEIGHTTYPE NAMESTATUS REWEIGHT PRI-AFF
 -1   892.21326 root default
 -369.16382 host sun-pcs01-osd01
  0   ssd   3.49309 osd.0up  1.0 1.0
  1   ssd   3.42329 osd.1up  0.87482 1.0
  2   ssd   3.49309 osd.2up  0.88989 1.0
  3   ssd   3.42329 osd.3up  0.94989 1.0
  4   ssd   3.49309 osd.4up  0.93993 1.0
  5   ssd   3.42329 osd.5up  1.0 1.0
  6   ssd   3.49309 osd.6up  0.89490 1.0
  7   ssd   3.42329 osd.7up  1.0 1.0
  8   ssd   3.49309 osd.8up  0.89482 1.0
  9   ssd   3.42329 osd.9up  1.0 1.0
100   ssd   3.49309 osd.100  up  1.0 1.0
101   ssd   3.42329 osd.101  up  1.0 1.0
102   ssd   3.49309 osd.102  up  1.0 1.0
103   ssd   3.42329 osd.103  up  0.81482 1.0
104   ssd   3.49309 osd.104  up  0.87973 1.0
105   ssd   3.42329 osd.105  up  0.86485 1.0
106   ssd   3.49309 osd.106  up  0.79965 1.0
107   ssd   3.42329 osd.107  up  1.0 1.0
108   ssd   3.49309 osd.108  up  1.0 1.0
109   ssd   3.42329 osd.109  up  1.0 1.0
 -562.24744 host sun-pcs01-osd02
 10   ssd   3.49309 osd.10   up  1.0 1.0
 11   ssd   3.42329 osd.11   up  0.72473 1.0
 12   ssd   3.49309 osd.12   up  1.0 1.0
 13   ssd   3.42329 osd.13   up  0.78979 1.0
 14   ssd   3.49309 osd.14   up  0.98961 1.0
 15   ssd   3.42329 osd.15   up  1.0 1.0
 16   ssd   3.49309 osd.16   up  0.96495 1.0
 17   ssd   3.42329 osd.17   up  0.94994 1.0
 18   ssd   3.49309 osd.18   up  1.0 1.0
 19   ssd   3.42329 osd.19   up  0.80481 1.0
110   ssd   3.49309 osd.110  up  0.97998 1.0
111   ssd   3.42329 osd.111  up  1.0 1.0
112   ssd   3.49309 osd.112  up  1.0 1.0
113   ssd   3.42329 osd.113  up  0.72974 1.0
116   ssd   3.49309 osd.116  up  0.91992 1.0
117   ssd   3.42329 osd.117  up  0.96997 1.0
118   ssd   3.49309 osd.118  up  0.93959 1.0
119   ssd   3.42329 osd.119  up  0.94481 1.0
... plus 11 more hosts just like this

How do you single out one OSD from each host for the metadata only and
prevent data on that OSD when all the device classes are the same? It seems
that you would need one OSD to be a different class to do that. In a
previous email the conversation was:

Is it possible to add a new device class like 'metadata'?

Yes, but you don't need this. Just use your existing class with another
crush ruleset.


So, I'm trying to figure out how you use the existing class of 'ssd' with
another CRUSH ruleset to accomplish the above.


> > Yes, we can set quotas to limit space usage (or number objects), but
> > you can not reserve some space that other pools can't use. The problem
> > is if we set a quota for the CephFS data pool to the equivalent of 95%
> > there are at least two scenarios that make that quota useless.
>
> Of course. In 95% of CephFS deployments the meta pool is on flash drives
> with enough space for this.
>
>
> ```
>
> pool 21 'fs_data' replicated size 3 min_size 2 crush_rule 4 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 56870 flags hashpspool
> stripe_width 0 application cephfs
> pool 22 'fs_meta' replicated size 3 min_size 2 crush_rule 0 object_hash
> rjenkins pg_num 16 pgp_num 16 last_change 56870 flags hashpspool
> stripe_width 0 application cephfs
>
> ```
>
> ```
>
> # ceph osd crush rule dump replicated_racks_nvme
> {
>  "rule_id": 0,
>  "rule_name": "replicated_racks_nvme",
>  "ruleset": 0,
>  "type": 1,
>  "min_size": 1,
>  "max_size": 10,
>  "steps": [
>  {
>  "op": "take",
>  "item": -44,
>  "item_name": "default~nvme"<
>  },
>  {
>  "op": "chooseleaf_firstn",
>  "num": 0,
>  "type": "rack"
> 

[ceph-users] Can kstore be used as OSD objectstore backend when deploying a Ceph Storage Cluster? If can, how to?

2019-08-07 Thread R.R.Yuan
Hi, All,


   When deploying a development cluster, there are three types of OSD 
objectstore backend: filestore, bluestore and kstore. 
   But there is no "--kstore" option when using the "ceph-deploy osd" command 
to deploy a real Ceph cluster.
  
   Can kstore be used as the OSD objectstore backend when deploying a real 
Ceph cluster? If so, how?




Thanks a lot
R.R.Yuan


Re: [ceph-users] New CRUSH device class questions

2019-08-07 Thread Konstantin Shalygin

On 8/7/19 1:40 PM, Robert LeBlanc wrote:

Maybe it's the lateness of the day, but I'm not sure how to do that. 
Do you have an example where all the OSDs are of class ssd?
Can't parse what you mean. You should always paste your `ceph osd tree` 
first.


Yes, we can set quotas to limit space usage (or number objects), but 
you can not reserve some space that other pools can't use. The problem 
is if we set a quota for the CephFS data pool to the equivalent of 95% 
there are at least two scenarios that make that quota useless.


Of course. In 95% of CephFS deployments the meta pool is on flash drives 
with enough space for this.



```

pool 21 'fs_data' replicated size 3 min_size 2 crush_rule 4 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 56870 flags hashpspool 
stripe_width 0 application cephfs
pool 22 'fs_meta' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 16 pgp_num 16 last_change 56870 flags hashpspool 
stripe_width 0 application cephfs


```

```

# ceph osd crush rule dump replicated_racks_nvme
{
    "rule_id": 0,
    "rule_name": "replicated_racks_nvme",
    "ruleset": 0,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
    {
    "op": "take",
    "item": -44,
    "item_name": "default~nvme"    <
    },
    {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
    },
    {
    "op": "emit"
    }
    ]
}
```



k



Re: [ceph-users] New CRUSH device class questions

2019-08-07 Thread Robert LeBlanc
On Tue, Aug 6, 2019 at 7:56 PM Konstantin Shalygin  wrote:

> Is it possible to add a new device class like 'metadata'?
>
>
> Yes, but you don't need this. Just use your existing class with another
> crush ruleset.
>

Maybe it's the lateness of the day, but I'm not sure how to do that. Do you
have an example where all the OSDs are of class ssd?

> If I set the device class manually, will it be overwritten when the OSD
> boots up?
>
>
> Nope. Classes are assigned automatically when the OSD is created, not when it boots.
>

That's good to know.

> I read https://ceph.com/community/new-luminous-crush-device-classes/ and it
> mentions that Ceph automatically classifies into hdd, ssd, and nvme. Hence
> the question.
>
> But it's not magic. Sometimes a drive can be a SATA SSD, but in the kernel
> it is 'rotational'...
>
I see, so it's not looking to see if the device is in /sys/class/pci or
something.

> We will still have 13 OSDs, it will be overkill for space for metadata, but
> since Ceph lacks a reserve space feature, we don't have  many options. This
> cluster is so fast that it can fill up in the blink of an eye.
>
>
> Not true. You always can set per-pool quota in bytes, for example:
>
> * your meta is 1G;
>
> * your raw space is 300G;
>
> * your data is 90G;
>
> Set quota to your data pool: `ceph osd pool set-quota <your_data_pool>
> max_bytes 96636762000`
>
Yes, we can set quotas to limit space usage (or number objects), but you
can not reserve some space that other pools can't use. The problem is if we
set a quota for the CephFS data pool to the equivalent of 95% there are at
least two scenarios that make that quota useless.

1. A host fails and the cluster recovers. The quota is now past the
capacity of the cluster so if the data pool fills up, no pool can write.
2. The CephFS data pool is an erasure encoded pool, and it shares with a
RGW data pool that is 3x rep. If more writes happen to the RGW data pool,
then the quota will be past the capacity of the cluster.

Both of these cause metadata operations to not be committed and cause lots
of problems with CephFS (can't list a directory with a broken inode in it).
We would prefer to get a truncated file rather than a broken file system.

I wrote a script that calculates 95% of the pool capacity and sets the
quota if the current quota is 1% out of balance. This is run by cron every
5 minutes.
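
The arithmetic of such a cron job is trivial; a sketch in plain sh (the pool
name, the `ceph df` plumbing, and the `set-quota` call are the
environment-specific parts, shown only as comments):

```shell
# Recompute a CephFS data-pool quota as 95% of what the pool can currently
# hold, and decide whether it drifted more than 1% from the current quota.
# On a live cluster the inputs come from `ceph df detail --format json`,
# and the result would be applied with something like:
#   ceph osd pool set-quota fs_data max_bytes "$t"
target_quota() {      # target_quota <stored_bytes> <max_avail_bytes>
    echo $(( ($1 + $2) * 95 / 100 ))
}

quota_drifted() {     # quota_drifted <current_quota> <target>: true if >1% apart
    d=$(( $1 > $2 ? $1 - $2 : $2 - $1 ))
    [ $(( d * 100 )) -gt "$2" ]
}

t=$(target_quota 90000000000 210000000000)   # e.g. 90 GB stored, 210 GB still available
echo "target quota: $t bytes"
```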

If there is a way to reserve some capacity for a pool that no other pool
can use, please provide an example. Think of reserved inode space in
ext4/XFS/etc.

Thank you.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


Re: [ceph-users] radosgw (beast): how to enable verbose log? request, user-agent, etc.

2019-08-07 Thread Félix Barbeira
Hi Manuel,

Yes, I already tried that option, but the result is extremely noisy and
not usable due to the lack of some fields; besides that, forget about parsing
those logs in order to print some stats. Also, I'm not sure if this is good
for rgw performance.

I think I'm going to stick with nginx and made some tests.

Thanks anyway! :)

El mar., 6 ago. 2019 a las 18:06, EDH - Manuel Rios Fernandez (<
mrios...@easydatahost.com>) escribió:

> Hi Felix,
>
>
>
> You can increase debug option with debug rgw in your rgw nodes.
>
>
>
> We got it to 10.
>
>
>
> But at least in our case we switched back to civetweb because beast doesn't
> provide a clear log without a lot of verbosity.
>
>
>
> Regards
>
>
>
> Manuel
>
>
>
>
>
> *De:* ceph-users  *En nombre de *Félix
> Barbeira
> *Enviado el:* martes, 6 de agosto de 2019 17:43
> *Para:* Ceph Users 
> *Asunto:* [ceph-users] radosgw (beast): how to enable verbose log?
> request, user-agent, etc.
>
>
>
> Hi,
>
>
>
> I'm testing radosgw with beast backend and I did not found a way to view
> more information on logfile. This is an example:
>
>
>
> 2019-08-06 16:59:14.488 7fc808234700  1 == starting new request
> req=0x5608245646f0 =
> 2019-08-06 16:59:14.496 7fc808234700  1 == req done req=0x5608245646f0
> op status=0 http_status=204 latency=0.00800043s ==
>
>
>
> I would be interested in typical fields that a regular webserver has:
> origin, request, useragent, etc. I checked the official docs but I don't
> find anything related:
>
>
>
> https://docs.ceph.com/docs/nautilus/radosgw/frontends/
> 
>
>
>
> The only manner I found is to put in front a nginx server running as a
> proxy or an haproxy, but I really don't like that solution because it would
> be an overhead component used only to log requests. Anyone in the same
> situation?
>
>
>
> Thanks in advance.
>
> --
>
> Félix Barbeira.
>


-- 
Félix Barbeira.