Re: [ceph-users] krbd / kcephfs - jewel client features question

2019-10-21 Thread Lei Liu
Hello Ilya and Paul,

Thanks for your reply. Yes, you are right: 0x7fddff8ee8cbffb does not come
from the kernel upgrade; it is reported by a Docker container
(digitalocean/ceph_exporter) used for Ceph monitoring.

Now upmap mode is enabled, client features:

"client": {
    "group": {
        "features": "0x7010fb86aa42ada",
        "release": "jewel",
        "num": 6
    },
    "group": {
        "features": "0x3ffddff8eea4fffb",
        "release": "luminous",
        "num": 6
    }
}
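
As an aside, the hex masks above can be diffed directly to see which feature
bits a jewel-era client lacks. A minimal Python sketch (it reports bit
positions only; mapping positions to feature names would require the exact
Ceph feature headers, so no names are claimed here):

```python
def missing_bits(client_mask: int, reference_mask: int) -> list:
    """Return bit positions set in reference_mask but absent from client_mask."""
    diff = reference_mask & ~client_mask
    return [bit for bit in range(64) if diff & (1 << bit)]

# Masks as reported by "ceph features" above.
jewel = 0x7010fb86aa42ada       # group reported as "jewel"
luminous = 0x3ffddff8eea4fffb   # group reported as "luminous"

print(missing_bits(jewel, luminous))
```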

Thanks again.

Ilya Dryomov wrote on Mon, Oct 21, 2019 at 6:38 PM:

> On Sat, Oct 19, 2019 at 2:00 PM Lei Liu  wrote:
> >
> > Hello llya,
> >
> > After updated client kernel version to 3.10.0-862 , ceph features shows:
> >
> > "client": {
> > "group": {
> > "features": "0x7010fb86aa42ada",
> > "release": "jewel",
> > "num": 5
> > },
> > "group": {
> > "features": "0x7fddff8ee8cbffb",
> > "release": "jewel",
> > "num": 1
> > },
> > "group": {
> > "features": "0x3ffddff8eea4fffb",
> > "release": "luminous",
> > "num": 6
> > },
> > "group": {
> > "features": "0x3ffddff8eeacfffb",
> > "release": "luminous",
> > "num": 1
> > }
> > }
> >
> > both 0x7fddff8ee8cbffb and 0x7010fb86aa42ada are reported by new kernel
> client.
> >
> > Is it now possible to force set-require-min-compat-client to be
> luminous, if not how to fix it?
>
> No, you haven't upgraded the one with features 0x7fddff8ee8cbffb (or
> rather it looks like you have upgraded it from 0x7fddff8ee84bffb, but
> to a version that is still too old).
>
> What exactly did you do on that machine?  That change doesn't look like
> it came from a kernel upgrade.  What is the output of "uname -a" there?
>
> Thanks,
>
> Ilya
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] krbd / kcephfs - jewel client features question

2019-10-19 Thread Lei Liu
Hello Ilya,

After updating the client kernel to 3.10.0-862, ceph features shows:

"client": {
    "group": {
        "features": "0x7010fb86aa42ada",
        "release": "jewel",
        "num": 5
    },
    "group": {
        "features": "0x7fddff8ee8cbffb",
        "release": "jewel",
        "num": 1
    },
    "group": {
        "features": "0x3ffddff8eea4fffb",
        "release": "luminous",
        "num": 6
    },
    "group": {
        "features": "0x3ffddff8eeacfffb",
        "release": "luminous",
        "num": 1
    }
}

Both 0x7fddff8ee8cbffb and 0x7010fb86aa42ada are reported by the new kernel
client.

Is it now possible to force set-require-min-compat-client to luminous? If
not, how can I fix it?

Thanks

Ilya Dryomov wrote on Thu, Oct 17, 2019 at 9:45 PM:

> On Thu, Oct 17, 2019 at 3:38 PM Lei Liu  wrote:
> >
> > Hi Cephers,
> >
> > We have some ceph clusters in 12.2.x version, now we want to use upmap
> balancer,but when i set set-require-min-compat-client to luminous, it's
> failed
> >
> > # ceph osd set-require-min-compat-client luminous
> > Error EPERM: cannot set require_min_compat_client to luminous: 6
> connected client(s) look like jewel (missing 0xa20); 1
> connected client(s) look like jewel (missing 0x800); 1
> connected client(s) look like jewel (missing 0x820); add
> --yes-i-really-mean-it to do it anyway
> >
> > ceph features
> >
> > "client": {
> > "group": {
> > "features": "0x40106b84a842a52",
> > "release": "jewel",
> > "num": 6
> > },
> > "group": {
> > "features": "0x7010fb86aa42ada",
> > "release": "jewel",
> > "num": 1
> > },
> > "group": {
> > "features": "0x7fddff8ee84bffb",
> > "release": "jewel",
> > "num": 1
> > },
> > "group": {
> > "features": "0x3ffddff8eea4fffb",
> > "release": "luminous",
> > "num": 7
> > }
> > }
> >
> > and sessions
> >
> > "MonSession(unknown.0 10.10.100.6:0/1603916368 is open allow *,
> features 0x40106b84a842a52 (jewel))",
> > "MonSession(unknown.0 10.10.100.2:0/2484488531 is open allow *,
> features 0x40106b84a842a52 (jewel))",
> > "MonSession(client.? 10.10.100.6:0/657483412 is open allow *, features
> 0x7fddff8ee84bffb (jewel))",
> > "MonSession(unknown.0 10.10.14.67:0/500706582 is open allow *, features
> 0x7010fb86aa42ada (jewel))"
> >
> > can i use --yes-i-really-mean-it to force enable it ?
>
> No.  0x40106b84a842a52 and 0x7fddff8ee84bffb are too old.
>
> Thanks,
>
> Ilya
>


Re: [ceph-users] kernel cephfs - too many caps used by client

2019-10-18 Thread Lei Liu
Only the OSDs are still on v12.2.8; all of the MDS and MON daemons run v12.2.12.

# ceph versions
{
    "mon": {
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 4
    },
    "osd": {
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 24,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 203
    },
    "mds": {
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 5
    },
    "rgw": {
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 1
    },
    "overall": {
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 37,
        "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable)": 203
    }
}

Lei Liu wrote on Sat, Oct 19, 2019 at 10:09 AM:

> Thanks for your reply.
>
> Yes, Already set it.
>
> [mds]
>> mds_max_caps_per_client = 10485760 # default is 1048576
>
>
> I think the current configuration is big enough for per client. Do I need
> to continue to increase this value?
>
> Thanks.
>
> Patrick Donnelly wrote on Sat, Oct 19, 2019 at 6:30 AM:
>
>> Hello Lei,
>>
>> On Thu, Oct 17, 2019 at 8:43 PM Lei Liu  wrote:
>> >
>> > Hi cephers,
>> >
>> > We have some ceph clusters use cephfs in production(mount with kernel
>> cephfs), but several of clients often keep a lot of caps(millions)
>> unreleased.
>> > I know this is due to the client's inability to complete the cache
>> release, errors might have been encountered, but no logs.
>> >
>> > client kernel version is 3.10.0-957.21.3.el7.x86_64
>> > ceph version is mostly v12.2.8
>> >
>> > ceph status shows:
>> >
>> > x clients failing to respond to cache pressure
>> >
>> > client kernel debug shows:
>> >
>> > # cat
>> /sys/kernel/debug/ceph/a00cc99c-f9f9-4dd9-9281-43cd12310e41.client11291811/caps
>> > total 23801585
>> > avail 1074
>> > used 23800511
>> > reserved 0
>> > min 1024
>> >
>> > mds config:
>> > [mds]
>> > mds_max_caps_per_client = 10485760
>> > # 50G
>> > mds_cache_memory_limit = 53687091200
>> >
>> > I want to know if some ceph configurations can solve this problem ?
>>
>> mds_max_caps_per_client is new in Luminous 12.2.12. See [1]. You need
>> to upgrade.
>>
>> [1] https://tracker.ceph.com/issues/38130
>>
>> --
>> Patrick Donnelly, Ph.D.
>> He / Him / His
>> Senior Software Engineer
>> Red Hat Sunnyvale, CA
>> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
>>
>>


Re: [ceph-users] kernel cephfs - too many caps used by client

2019-10-18 Thread Lei Liu
Thanks for your reply.

Yes, I have already set it.

[mds]
> mds_max_caps_per_client = 10485760 # default is 1048576


I think the current value is already quite large per client. Do I need to
increase it further?

Thanks.

Patrick Donnelly wrote on Sat, Oct 19, 2019 at 6:30 AM:

> Hello Lei,
>
> On Thu, Oct 17, 2019 at 8:43 PM Lei Liu  wrote:
> >
> > Hi cephers,
> >
> > We have some ceph clusters use cephfs in production(mount with kernel
> cephfs), but several of clients often keep a lot of caps(millions)
> unreleased.
> > I know this is due to the client's inability to complete the cache
> release, errors might have been encountered, but no logs.
> >
> > client kernel version is 3.10.0-957.21.3.el7.x86_64
> > ceph version is mostly v12.2.8
> >
> > ceph status shows:
> >
> > x clients failing to respond to cache pressure
> >
> > client kernel debug shows:
> >
> > # cat
> /sys/kernel/debug/ceph/a00cc99c-f9f9-4dd9-9281-43cd12310e41.client11291811/caps
> > total 23801585
> > avail 1074
> > used 23800511
> > reserved 0
> > min 1024
> >
> > mds config:
> > [mds]
> > mds_max_caps_per_client = 10485760
> > # 50G
> > mds_cache_memory_limit = 53687091200
> >
> > I want to know if some ceph configurations can solve this problem ?
>
> mds_max_caps_per_client is new in Luminous 12.2.12. See [1]. You need
> to upgrade.
>
> [1] https://tracker.ceph.com/issues/38130
>
> --
> Patrick Donnelly, Ph.D.
> He / Him / His
> Senior Software Engineer
> Red Hat Sunnyvale, CA
> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
>
>


[ceph-users] kernel cephfs - too many caps used by client

2019-10-17 Thread Lei Liu
Hi cephers,

We have some Ceph clusters that use CephFS in production (mounted with the
kernel client), but several clients often hold a large number of caps
(millions) without releasing them.
I believe this is because the clients fail to complete cache release; errors
may have been encountered, but there are no logs.

client kernel version is 3.10.0-957.21.3.el7.x86_64
ceph version is mostly v12.2.8

ceph status shows:

x clients failing to respond to cache pressure

client kernel debug shows:

# cat
/sys/kernel/debug/ceph/a00cc99c-f9f9-4dd9-9281-43cd12310e41.client11291811/caps
total 23801585
avail 1074
used 23800511
reserved 0
min 1024
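
For reference, the per-mount caps accounting shown above can be pulled into a
script; the field layout is assumed from the debug output quoted here (files
live under /sys/kernel/debug/ceph/<fsid>.client<id>/caps on this kernel):

```python
def parse_caps(text: str) -> dict:
    """Parse the kernel client's caps debug file into {field: count}."""
    stats = {}
    for line in text.strip().splitlines():
        field, value = line.rsplit(None, 1)  # last token is the count
        stats[field] = int(value)
    return stats

sample = """\
total 23801585
avail 1074
used 23800511
reserved 0
min 1024
"""

stats = parse_caps(sample)
print(stats["used"])  # 23800511
```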

mds config:
[mds]
mds_max_caps_per_client = 10485760
# 50G
mds_cache_memory_limit = 53687091200

Are there any Ceph configuration options that can solve this problem?

Any suggestions?

Thanks.


Re: [ceph-users] krbd / kcephfs - jewel client features question

2019-10-17 Thread Lei Liu
Well, I know kernel 4.13 fully supports Ceph Luminous, and I know that Red
Hat backports a lot of things to the 3.10 kernel. So which is the matching
version: 3.10-862 or a later version?

Thanks

Sent from my iPhone

-- Original --
From: 刘磊 (Lei Liu)
Date: Thu, Oct 17, 2019 9:38 PM
To: ceph-users
Subject: Re: krbd / kcephfs - jewel client features question

[The quoted message is identical to the original post "krbd / kcephfs -
jewel client features question" below.]


[ceph-users] krbd / kcephfs - jewel client features question

2019-10-17 Thread Lei Liu
Hi Cephers,

We have some Ceph clusters on version 12.2.x, and now we want to use the
upmap balancer. But when I set set-require-min-compat-client to luminous, it
fails:

# ceph osd set-require-min-compat-client luminous
Error EPERM: cannot set require_min_compat_client to luminous: 6 connected
client(s) look like jewel (missing 0xa20); 1 connected
client(s) look like jewel (missing 0x800); 1 connected
client(s) look like jewel (missing 0x820); add
--yes-i-really-mean-it to do it anyway

ceph features

"client": {
    "group": {
        "features": "0x40106b84a842a52",
        "release": "jewel",
        "num": 6
    },
    "group": {
        "features": "0x7010fb86aa42ada",
        "release": "jewel",
        "num": 1
    },
    "group": {
        "features": "0x7fddff8ee84bffb",
        "release": "jewel",
        "num": 1
    },
    "group": {
        "features": "0x3ffddff8eea4fffb",
        "release": "luminous",
        "num": 7
    }
}

and sessions

"MonSession(unknown.0 10.10.100.6:0/1603916368 is open allow *, features
0x40106b84a842a52 (jewel))",
"MonSession(unknown.0 10.10.100.2:0/2484488531 is open allow *, features
0x40106b84a842a52 (jewel))",
"MonSession(client.? 10.10.100.6:0/657483412 is open allow *, features
0x7fddff8ee84bffb (jewel))",
"MonSession(unknown.0 10.10.14.67:0/500706582 is open allow *, features
0x7010fb86aa42ada (jewel))"
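
The session dump can be summarized mechanically; the MonSession line format
is assumed from the output quoted above, so treat this as a sketch rather
than a parser for every Ceph release:

```python
import re

# Extract address, feature mask, and release from MonSession dump lines.
SESSION_RE = re.compile(
    r"MonSession\(\S+ (?P<addr>\S+) is open .*? features "
    r"(?P<features>0x[0-9a-f]+) \((?P<release>\w+)\)"
)

def parse_sessions(lines):
    """Return (addr, feature_mask, release) for each MonSession line."""
    out = []
    for line in lines:
        match = SESSION_RE.search(line)
        if match:
            out.append((match["addr"], int(match["features"], 16),
                        match["release"]))
    return out

lines = [
    'MonSession(client.? 10.10.100.6:0/657483412 is open allow *, '
    'features 0x7fddff8ee84bffb (jewel))',
]
print(parse_sessions(lines))
```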

Can I use --yes-i-really-mean-it to force-enable it?

I know kernel 4.13 fully supports Ceph Luminous, and I know that Red Hat
backports a lot of things to the 3.10 kernel. So which is the matching
kernel version?


Re: [ceph-users] What's the best practice for Erasure Coding

2019-07-08 Thread Lei Liu
Hi Frank,

Thanks for sharing valuable experience.
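
As a side note to the k/m discussion quoted below, the space and chunk-size
arithmetic for an EC profile is easy to sketch (an illustrative helper, not
part of any Ceph API; performance still has to come from benchmarks as Frank
stresses):

```python
def ec_profile_stats(k: int, m: int, object_size: int = 4 << 20) -> dict:
    """Rough numbers for a k+m erasure-coded pool (illustrative only)."""
    return {
        "raw_overhead": (k + m) / k,      # raw bytes stored per logical byte
        "min_failure_domains": k + m,     # one chunk per host/rack minimum
        "chunk_size": object_size // k,   # data chunk per OSD for one object
        "tolerated_failures": m,
    }

# e.g. 8+2 with 4 MiB objects: 1.25x raw overhead, 512 KiB data chunks
for k, m in [(6, 2), (8, 2)]:
    print((k, m), ec_profile_stats(k, m))
```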

Frank Schilder wrote on Mon, Jul 8, 2019 at 4:36 PM:

> Hi David,
>
> I'm running a cluster with bluestore on raw devices (no lvm) and all
> journals collocated on the same disk with the data. Disks are spinning
> NL-SAS. Our goal was to build storage at lowest cost, therefore all data on
> HDD only. I got a few SSDs that I'm using for FS and RBD meta data. All
> large pools are EC on spinning disk.
>
> I spent at least one month to run detailed benchmarks (rbd bench)
> depending on EC profile, object size, write size, etc. Results were varying
> a lot. My advice would be to run benchmarks with your hardware. If there
> was a single perfect choice, there wouldn't be so many options. For
> example, my tests will not be valid when using separate fast disks for WAL
> and DB.
>
> There are some results though that might be valid in general:
>
> 1) EC pools have high throughput but low IOP/s compared with replicated
> pools
>
> I see single-thread write speeds of up to 1.2GB (gigabyte) per second,
> which is probably the network limit and not the disk limit. IOP/s get
> better with more disks, but are way lower than what replicated pools can
> provide. On a cephfs with EC data pool, small-file IO will be comparably
> slow and eat a lot of resources.
>
> 2) I observe massive network traffic amplification on small IO sizes,
> which is due to the way EC overwrites are handled. This is one bottleneck
> for IOP/s. We have 10G infrastructure and use 2x10G client and 4x10G OSD
> network. OSD bandwidth at least 2x client network, better 4x or more.
>
> 3) k should only have small prime factors, power of 2 if possible
>
> I tested k=5,6,8,10,12. Best results in decreasing order: k=8, k=6. All
> other choices were poor. The value of m seems not relevant for performance.
> Larger k will require more failure domains (more hardware).
>
> 4) object size matters
>
> The best throughput (1M write size) I see with object sizes of 4MB or 8MB,
> with IOP/s getting somewhat better with smaller object sizes but throughput
> dropping fast. I use the default of 4MB in production. Works well for us.
>
> 5) jerasure is quite good and seems most flexible
>
> jerasure is quite CPU efficient and can handle smaller chunk sizes than
> other plugins, which is preferable for IOP/s. However, CPU usage can
> become a problem and a plugin optimized for specific values of k and m
> might help here. Under usual circumstances I see very low load on all OSD
> hosts, even under rebalancing. However, I remember that once I needed to
> rebuild something on all OSDs (I don't remember what it was, sorry). In
> this situation, CPU load went up to 30-50% (meaning up to half the cores
> were at 100%), which is really high considering that each server has only
> 16 disks at the moment and is sized to handle up to 100. CPU power could
> become a bottleneck for us in the future.
>
> These are some general observations and do not replace benchmarks for
> specific use cases. I was hunting for a specific performance pattern, which
> might not be what you want to optimize for. I would recommend to run
> extensive benchmarks if you have to live with a configuration for a long
> time - EC profiles cannot be changed.
>
> We settled on 8+2 and 6+2 pools with jerasure and object size 4M. We also
> use bluestore compression. All meta data pools are on SSD, only very little
> SSD space is required. This choice works well for the majority of our use
> cases. We can still build small expensive pools to accommodate special
> performance requests.
>
> Best regards,
>
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: ceph-users  on behalf of David <
> xiaomajia...@gmail.com>
> Sent: 07 July 2019 20:01:18
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users]  What's the best practice for Erasure Coding
>
> Hi Ceph-Users,
>
> I'm working with a  Ceph cluster (about 50TB, 28 OSDs, all Bluestore on
> lvm).
> Recently, I'm trying to use the Erasure Code pool.
> My question is "what's the best practice for using EC pools ?".
> More specifically, which plugin (jerasure, isa, lrc, shec or  clay) should
> I adopt, and how to choose the combinations of (k,m) (e.g. (k=3,m=2),
> (k=6,m=3) ).
>
> Does anyone share some experience?
>
> Thanks for any help.
>
> Regards,
> David
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>