Re: [ceph-users] RBD EC images for a ZFS pool

2020-01-09 Thread JC Lopez
Hi,

you can actually specify the features you want enabled at creation time, so there
is no need to remove any feature afterwards.

To illustrate Ilya’s message: rbd create rbd/test --size=128M 
--image-feature=layering,striping --stripe-count=8 --stripe-unit=4K

The object size is left at its default (4MB) here, but it can also be altered with
--object-size.
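Applied to the case below (a sketch only, reusing the pool and image names from the
thread and Ilya's suggested striping values), the whole 1T image can be created in
one step, keeping the default object size and never enabling object-map, so the
size limitation never comes into play:

rbd create rbd_backup/vm-118-disk-0 --size 1T --data-pool rbd_ec \
    --image-feature=layering,striping --stripe-unit=4K --stripe-count=8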

Best regards
JC


> On Jan 9, 2020, at 18:32, Kyriazis, George  wrote:
> 
> 
> 
>> On Jan 9, 2020, at 2:16 PM, Ilya Dryomov wrote:
>> 
>> On Thu, Jan 9, 2020 at 2:52 PM Kyriazis, George <george.kyria...@intel.com> wrote:
>>> 
>>> Hello ceph-users!
>>> 
>>> My setup is that I’d like to use RBD images as a replication target of a 
>>> FreeNAS zfs pool.  I have a 2nd FreeNAS (in a VM) to act as a backup target 
>>> in which I mount the RBD image.  All this (except the source FreeNAS 
>>> server) is in Proxmox.
>>> 
>>> Since I am using RBD as a backup target, performance is not really 
>>> critical, but I still don’t want it to take months to complete the backup.  
>>> My source pool size is in the order of ~30TB.
>>> 
>>> I’ve set up an EC RBD pool (and the matching replicated pool) and created an 
>>> image with no problems.  However, with the stock 4MB object size, backup 
>>> speed is quite slow.  I tried creating an image with a 4K object size, but 
>>> even for a relatively small image size (1TB), I get:
>>> 
>>> # rbd -p rbd_backup create vm-118-disk-0 --size 1T --object-size 4K 
>>> --data-pool rbd_ec
>>> 2020-01-09 07:40:27.120 7f3e4aa15f40 -1 librbd::image::CreateRequest: 
>>> validate_layout: image size not compatible with object map
>>> rbd: create error: (22) Invalid argument
>>> #
>> 
>> Yeah, this is an object map limitation.  Given that this is a backup
>> target, you don't really need the object map feature.  Disable it with
>> "rbd feature disable vm-118-disk-0 object-map" and you should be able
>> to create an image of any size.
>> 
> Hmm.. Except I can’t disable a feature on an image that I haven’t created yet. 
> :-). I can start by creating a smaller image, and resize it after I remove that 
> feature.
> 
>> That said, are you sure that object size is the issue?  If you expect
>> small sequential writes and want them to go to different OSDs, look at
>> using a fancy striping pattern instead of changing the object size:
>> 
>>  https://docs.ceph.com/docs/master/man/8/rbd/#striping 
>> 
>> 
>> E.g. with --stripe-unit 4K --stripe-count 8, the first 4K will go to
>> object 1, the second 4K to object 2, etc.  The ninth 4K will return to
>> object 1, the tenth to object 2, etc.  When objects 1-8 become full, it
>> will move on to objects 9-16, then to 17-24, etc.
>> 
>> This way you get the increased parallelism without the very significant
>> overhead of tons of small objects (if your OSDs are capable enough).
>> 
> Thanks for the suggestions.  After yours and Stefan’s suggestions, I’ll 
> experiment a little bit with various parameters and see what gets me the best 
> performance.
> 
> Thanks all!
> 
> George
> 
>> Thanks,
>> 
>>Ilya
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Large OMAP Object

2019-11-14 Thread JC Lopez
Hi

this probably comes from your RGW which is a big consumer/producer of OMAP for 
bucket indexes.

Have a look at this previous post and just adapt the pool name to match the one 
where it’s detected: https://www.spinics.net/lists/ceph-users/msg51681.html
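If it helps, a minimal way to track the object down (a sketch only; the log path
and the bucket-index pool name are assumptions, adjust them to your deployment):

# the cluster log on the MONs records which object crossed the threshold
grep 'Large omap object' /var/log/ceph/ceph.log

# then count OMAP keys per object in the suspected pool to confirm
for obj in $(rados -p default.rgw.buckets.index ls); do
  echo "$(rados -p default.rgw.buckets.index listomapkeys "$obj" | wc -l) $obj"
done | sort -rn | head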

Regards
JC

> On Nov 14, 2019, at 15:23, dhils...@performair.com wrote:
> 
> All;
> 
> We had a warning about a large OMAP object pop up in one of our clusters 
> overnight.  The cluster is configured for CephFS, but nothing mounts a 
> CephFS, at this time.
> 
> The cluster mostly uses RGW.  I've checked the cluster log, the MON log, and 
> the MGR log on one of the mons, with no useful references to the pool / pg 
> where the large OMAP object resides.
> 
> Is my only option to find this large OMAP object to go through the OSD logs 
> for the individual OSDs in the cluster?
> 
> Thank you,
> 
> Dominic L. Hilsbos, MBA 
> Director - Information Technology 
> Perform Air International Inc.
> dhils...@performair.com 
> www.PerformAir.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Create containers/buckets in a custom rgw pool

2019-11-11 Thread JC Lopez
Hi Soumya,

have a look at the pages below; they show you how to map your special pool from 
the RADOS Gateway perspective.

Luminous: https://docs.ceph.com/docs/luminous/radosgw/placement/

Mimic: https://docs.ceph.com/docs/mimic/radosgw/placement/ 

Master: https://docs.ceph.com/docs/master/radosgw/placement/ 


As for mixing RGW data and RBD data in the same pool, this is not best practice and
should be avoided.
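As a rough sketch of what those pages walk you through (the placement-id and pool
names below are made up, and the exact syntax may differ slightly between
releases): you add a placement target to the zonegroup, map it to your pools in
the zone, restart the gateways, and then select it when the container is created.
The pools you point the target at must already exist and be enabled for rgw.

radosgw-admin zonegroup placement add --rgw-zonegroup default \
    --placement-id special-placement
radosgw-admin zone placement add --rgw-zone default \
    --placement-id special-placement \
    --data-pool special.rgw.buckets.data \
    --index-pool special.rgw.buckets.index \
    --data-extra-pool special.rgw.buckets.non-ec
# from Swift, pick the placement at container creation time:
swift post mycontainer -H "X-Storage-Policy: special-placement"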

Best regards
JC

> On Nov 11, 2019, at 16:20, soumya tr  wrote:
> 
> Hey,
> 
> By default, there are some pools created for rgw in a ceph cluster. 
> And when buckets/containers are created using OpenStack horizon or the swift CLI, 
> they get created in the default pools. 
> --
> 1 default.rgw.buckets.data
> 2 default.rgw.control
> 3 default.rgw.data.root
> 4 default.rgw.gc
> 5 default.rgw.log
> 6 default.rgw.intent-log
> 7 default.rgw.meta
> 8 default.rgw.usage
> 9 default.rgw.users.keys
> 10 default.rgw.users.email
> 11 default.rgw.users.swift
> 12 default.rgw.users.uid
> 13 default.rgw.buckets.extra
> 14 default.rgw.buckets.index
> 15 .rgw.root
> --
> 
> I created a custom pool and associated it with the rgw application, but I'm not 
> sure how to make use of it. I didn't find many references for this.
> 
> Is there any way to have the buckets/containers created in a custom rgw pool?
> 
> -- 
> Regards, 
> Soumya
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread JC Lopez
Hi,

your RBD bench and RADOS bench use a 4MB IO request size by default, while your 
FIO job is configured for a 4KB IO request size.

If you want to compare apples to apples (bandwidth), you need to change the FIO IO 
request size to 4194304. Also, you tested a sequential workload with RADOS 
bench but a random one with fio.

Make sure you align all parameters so you obtain results you can actually compare.

Other note: What block size did you specify with your dd command?

By default dd's block size is 512 bytes, so even smaller than the 4KB you used for 
FIO and miles away from the 4MB you used for RADOS bench. Be mindful that 5MB/s 
for your dd with bs=512 is roughly 10,000 tiny writes per second, yet only the 
equivalent of about one 4MB rados bench IO per second.
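For reference, an fio job much closer to what rados bench measures would look
something like this (a sketch only; the pool and image names are placeholders, and
iodepth=16 mirrors rados bench's default of 16 concurrent writes):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=test-image
[seq-write-4m]
rw=write
bs=4M
iodepth=16
runtime=60
time_based

Once the two produce comparable numbers, you can lower bs and switch to randwrite
to see where the latency really comes from.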

JC

> On Oct 4, 2019, at 08:28, Petr Bena  wrote:
> 
> Hello,
> 
> I tried to use FIO on RBD device I just created and writing is really 
> terrible (around 1.5MB/s)
> 
> [root@ceph3 tmp]# fio test.fio
> rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
> 4096B-4096B, ioengine=rbd, iodepth=32
> fio-3.7
> Starting 1 process
> Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=1628KiB/s][r=0,w=407 IOPS][eta 
> 00m:00s]
> rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=115425: Fri Oct  4 17:25:24 
> 2019
>   write: IOPS=384, BW=1538KiB/s (1574kB/s)(39.1MiB/26016msec)
> slat (nsec): min=1452, max=591931, avg=14498.83, stdev=17295.97
> clat (usec): min=1795, max=793172, avg=83218.39, stdev=83485.65
>  lat (usec): min=1810, max=793201, avg=83232.89, stdev=83485.19
> clat percentiles (msec):
>  |  1.00th=[3],  5.00th=[5], 10.00th=[7], 20.00th=[   12],
>  | 30.00th=[   21], 40.00th=[   36], 50.00th=[   61], 60.00th=[   89],
>  | 70.00th=[  116], 80.00th=[  146], 90.00th=[  190], 95.00th=[  218],
>  | 99.00th=[  380], 99.50th=[  430], 99.90th=[  625], 99.95th=[  768],
>  | 99.99th=[  793]
>bw (  KiB/s): min=  520, max= 4648, per=99.77%, avg=1533.40, stdev=754.35, 
> samples=52
>iops: min=  130, max= 1162, avg=383.33, stdev=188.61, samples=52
>   lat (msec)   : 2=0.08%, 4=4.77%, 10=13.56%, 20=11.66%, 50=16.40%
>   lat (msec)   : 100=17.66%, 250=32.53%, 500=3.05%, 750=0.21%, 1000=0.08%
>   cpu  : usr=0.57%, sys=0.52%, ctx=3976, majf=0, minf=8489
>   IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=99.7%, >=64=0.0%
>  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
> >=64=0.0%
>  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, 
> >=64=0.0%
>  issued rwts: total=0,1,0,0 short=0,0,0,0 dropped=0,0,0,0
>  latency   : target=0, window=0, percentile=100.00%, depth=32
> 
> Run status group 0 (all jobs):
>   WRITE: bw=1538KiB/s (1574kB/s), 1538KiB/s-1538KiB/s (1574kB/s-1574kB/s), 
> io=39.1MiB (40.0MB), run=26016-26016msec
> 
> Disk stats (read/write):
> dm-6: ios=0/2, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, 
> aggrios=20/368, aggrmerge=0/195, aggrticks=105/6248, aggrin_queue=6353, 
> aggrutil=9.07%
>   xvda: ios=20/368, merge=0/195, ticks=105/6248, in_queue=6353, util=9.07%
> 
> 
> Uncomparably worse to RADOS bench results
> 
> On 04/10/2019 17:15, Alexandre DERUMIER wrote:
>> Hi,
>> 
 dd if=/dev/zero of=/dev/rbd0 writes at 5MB/s -
>> you are testing with a single thread/iodepth=1 sequentially here.
>> Then only 1 disk at time, and you have network latency too.
>> 
>> rados bench is doing 16 concurrent write.
>> 
>> 
>> Try to test with fio for example, with bigger iodepth,  small block/big 
>> block , seq/rand.
>> 
>> 
>> 
>> - Mail original -
>> De: "Petr Bena" 
>> À: "ceph-users" 
>> Envoyé: Vendredi 4 Octobre 2019 17:06:48
>> Objet: [ceph-users] Optimizing terrible RBD performance
>> 
>> Hello,
>> 
>> If this is too long for you, TL;DR; section on the bottom
>> 
>> I created a CEPH cluster made of 3 SuperMicro servers, each with 2 OSD
>> (WD RED spinning drives) and I would like to optimize the performance of
>> RBD, which I believe is blocked by some wrong CEPH configuration,
>> because from my observation all resources (CPU, RAM, network, disks) are
>> basically unused / idling even when I put load on the RBD.
>> 
>> Each drive should be 50MB/s read / write and when I run RADOS benchmark,
>> I see values that are somewhat acceptable, interesting part is that when
>> I run RADOS benchmark, I can see all disks read / write to their limits,
>> I can see heavy network utilization and even some CPU utilization - on
>> other hand, when I put any load on the RBD device, performance is
>> terrible, reading is very slow (20MB/s) writing as well (5 - 20MB/s),
>> running dd if=/dev/zero of=/dev/rbd0 writes at 5MB/s - and the most
>> weird part - resources are almost unused - no CPU usage, no network
>> traffic, minimal disk activity.
>> 
>> It looks to me as if CEPH isn't even trying to perform much as long as the
>> access is via RBD. Has anyone ever seen this kind of issue? Is there any way
>> to track down why it is so slow? Here are some outputs:
>> 
>> [root@ceph1 cephadm]# ceph --version
>> ceph 

Re: [ceph-users] file location

2019-08-20 Thread JC Lopez
Hi,

find out the inode number, identify from the data pool all the objects that 
belong to this inode, and then run ceph osd map {pool} {objectname} for each of 
them; this will tell you all the PGs your inode's objects are located in.

printf '%x\n' $(stat -c %i {filepath})
{hex-inode}
rados -p {data-pool} ls | grep {hex-inode}
{hex-inode}.00000000
ceph osd map {data-pool} {hex-inode}.00000000
... '{hex-inode}.00000000' -> pg 3.f0b56f30 (3.30) -> up ([1,2], p1) ...

The PG in this example is 3.30, the value shown in parentheses in the ceph osd map output.
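If the file spans more than one object, a small loop (same placeholders as above)
maps all of them in one go:

for obj in $(rados -p {data-pool} ls | grep "^{hex-inode}\."); do
  ceph osd map {data-pool} "$obj"
done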
Regards
JC

> On Aug 20, 2019, at 13:32, Fyodor Ustinov  wrote:
> 
> Hi!
> 
> How do I find out which PGs a file on CephFS is located in?
> 
> WBR,
>Fyodor.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Error Mounting CephFS

2019-08-07 Thread JC Lopez
Hi,

See https://docs.ceph.com/docs/nautilus/cephfs/kernel/ 


To choose which file system gets mounted, pass the mount option -o mds_namespace={fsname}.
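For example (a sketch; the monitor address, mount point, file system name and
secret file path are placeholders to adapt):

mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=myfs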

Regards
JC

> On Aug 7, 2019, at 10:24, dhils...@performair.com wrote:
> 
> All;
> 
> Thank you for your assistance, this led me to the fact that I hadn't set up 
> the Ceph repo on this client server, and the ceph-common I had installed was 
> version 10.
> 
> I got all of that squared away, and it all works.
> 
> I do have a couple follow up questions:
> Can more than one system mount the same  CephFS, at the same time?
> If your cluster has several CephFS filesystems defined, how do you select 
> which gets mounted, as the fs name doesn't appear to be used in the mount 
> command?
> 
> Thank you,
> 
> Dominic L. Hilsbos, MBA 
> Director - Information Technology 
> Perform Air International Inc.
> dhils...@performair.com 
> www.PerformAir.com
> 
> 
> 
> -Original Message-
> From: Frank Schilder [mailto:fr...@dtu.dk] 
> Sent: Wednesday, August 07, 2019 2:48 AM
> To: Dominic Hilsbos
> Cc: ceph-users
> Subject: Re: [ceph-users] Error Mounting CephFS
> 
> On Centos7, the option "secretfile" requires installation of ceph-fuse.
> 
> Best regards,
> 
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> 
> 
> From: ceph-users  on behalf of Yan, Zheng 
> 
> Sent: 07 August 2019 10:10:19
> To: dhils...@performair.com
> Cc: ceph-users
> Subject: Re: [ceph-users] Error Mounting CephFS
> 
> On Wed, Aug 7, 2019 at 3:46 PM  wrote:
>> 
>> All;
>> 
>> I have a server running CentOS 7.6 (1810), that I want to set up with CephFS 
>> (full disclosure, I'm going to be running samba on the CephFS).  I can mount 
>> the CephFS fine when I use the option secret=, but when I switch to 
>> secretfile=, I get an error "No such process."  I installed ceph-common.
>> 
>> Is there a service that I'm not aware I should be starting?
>> Do I need to install another package?
>> 
> 
> mount.ceph is missing.  check if it exists and is located in $PATH
> 
>> Thank you,
>> 
>> Dominic L. Hilsbos, MBA
>> Director - Information Technology
>> Perform Air International Inc.
>> dhils...@performair.com
>> www.PerformAir.com
>> 
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MON crashing when upgrading from Hammer to Luminous

2019-07-22 Thread JC Lopez
First link should be this one:
http://docs.ceph.com/docs/jewel/install/upgrading-ceph/#upgrade-procedures
rather than
http://docs.ceph.com/docs/mimic/install/upgrading-ceph/#upgrade-procedures
to be consistent.

JC

> On Jul 22, 2019, at 13:38, JC Lopez  wrote:
> 
> Hi 
> 
> you’ll have to go from Hammer to Jewel then from Jewel to Luminous for a 
> smooth upgrade.
> - http://docs.ceph.com/docs/mimic/install/upgrading-ceph/#upgrade-procedures
> - http://docs.ceph.com/docs/luminous/release-notes/#upgrading-from-pre-jewel-releases-like-hammer
> 
> Make sure to check any special upgrade requirement from the release notes.
> - http://docs.ceph.com/docs/jewel/release-notes/#upgrading-from-infernalis-or-hammer
> - http://docs.ceph.com/docs/luminous/release-notes/#upgrade-from-jewel-or-kraken
> 
> Regards
> JC
> 
>> On Jul 22, 2019, at 12:20, Armin Ranjbar <z...@zoup.org> wrote:
>> 
>> Dear Everyone,
>> 
>> First of all, guys, seriously, Thank you for Ceph.
>> 
>> now to the problem, upgrading ceph from 0.94.6 
>> (e832001feaf8c176593e0325c8298e3f16dfb403) to 12.2.12-218-g9fd889f 
>> (9fd889fe09c652512ca78854702d5ad9bf3059bb), ceph-mon seems unable to upgrade 
>> its database; the problem is gone if I --force-sync.
>> 
>> This is the message:
>> terminate called after throwing an instance of 
>> 'ceph::buffer::malformed_input'
>>   what():  buffer::malformed_input: void 
>> object_stat_sum_t::decode(ceph::buffer::list::iterator&) decode past end of 
>> struct encoding
>> *** Caught signal (Aborted) **
>> 
>> attached is full log, the output of:
>> ceph-mon --debug_mon 100 -i node-1 -d
>> 
>> ---
>> Armin ranjbar
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MON crashing when upgrading from Hammer to Luminous

2019-07-22 Thread JC Lopez
Hi 

you’ll have to go from Hammer to Jewel then from Jewel to Luminous for a smooth 
upgrade.
- http://docs.ceph.com/docs/mimic/install/upgrading-ceph/#upgrade-procedures
- http://docs.ceph.com/docs/luminous/release-notes/#upgrading-from-pre-jewel-releases-like-hammer

Make sure to check any special upgrade requirement from the release notes.
- http://docs.ceph.com/docs/jewel/release-notes/#upgrading-from-infernalis-or-hammer
- http://docs.ceph.com/docs/luminous/release-notes/#upgrade-from-jewel-or-kraken


Regards
JC

> On Jul 22, 2019, at 12:20, Armin Ranjbar  wrote:
> 
> Dear Everyone,
> 
> First of all, guys, seriously, Thank you for Ceph.
> 
> now to the problem, upgrading ceph from 0.94.6 
> (e832001feaf8c176593e0325c8298e3f16dfb403) to 12.2.12-218-g9fd889f 
> (9fd889fe09c652512ca78854702d5ad9bf3059bb), ceph-mon seems unable to upgrade 
> its database; the problem is gone if I --force-sync.
> 
> This is the message:
> terminate called after throwing an instance of 'ceph::buffer::malformed_input'
>   what():  buffer::malformed_input: void 
> object_stat_sum_t::decode(ceph::buffer::list::iterator&) decode past end of 
> struct encoding
> *** Caught signal (Aborted) **
> 
> attached is full log, the output of:
> ceph-mon --debug_mon 100 -i node-1 -d
> 
> ---
> Armin ranjbar
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ONE pg deep-scrub blocks cluster

2016-08-29 Thread JC Lopez
#
>> maximum number of chunks the scrub will process in one go. Defaults to
>> 25.
>> osd_deep_scrub_stride = 1048576   # Read size
>> during scrubbing operations. The idea here is to do less chunks but
>> bigger sequential reads. Defaults to 512KB=524288.
> 
> Thanks for this suggenstions.
> These are already on my notebook to setup after the cluster is running fine 
> without any issues.
> 
>> If your devices use cfq as an elevator, you can add the following 2
>> lines in the section to lower the priority of the scrubbing read IOs
>> osd_disk_thread_ioprio_class = idle  # Change
>> the cfq priority for the scrub thread. Default is the same priority
>> class as the OSD
>> osd_disk_thread_ioprio_priority = 0   # Change
>> the priority within the class to the lowest possible. Default is the
>> same priority as the OSD
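Pulled together, these would sit in the [osd] section of ceph.conf on the OSD
nodes. A sketch only: the option whose name is cut off above is presumably
osd_scrub_chunk_max, and the values are just examples of tuning down from the
defaults quoted above.

[osd]
osd_scrub_chunk_max = 5                # fewer chunks per scrub step (default 25)
osd_deep_scrub_stride = 1048576        # bigger sequential reads (default 524288)
osd_disk_thread_ioprio_class = idle    # only effective with the cfq elevator
osd_disk_thread_ioprio_priority = 0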
> 
> Actually they are on the default (deadline), but it seems worth changing 
> this to cfq.
> We will have a closer look at this once the cluster is running without any 
> issues.
> 
>> Keep me posted on your tests and findings.
>> If this does fix the performance impact problem, I suggest you apply
>> those changes to the config file and push it to all the OSD nodes and
>> restart the OSDs in a gradual manner.
> 
> Of course!... and done with this eMail :)
> 
> Could the file mentioned above be the cause of the trouble?
> Is it possible to delete it via "rm -f" or would this cause any other 
> issues?

1. Verify that when you do a "rados -p rbd ls | grep vm-101-disk-2" command, you 
can see an object named vm-101-disk-2.
2. Verify whether you have an RBD image named this way: "rbd -p rbd ls | grep vm-101-disk-2"

As I'm not familiar with Proxmox, I'd suggest the following:
If yes to 1, to be safe, copy this file somewhere else and then do a "rados -p 
rbd rm vm-101-disk-2".
If no to 1, to be safe, copy this file somewhere else and then do a "rm -rf 
vm-101-disk-2__head_383C3223__0".

Make sure all your PG copies show the same content and wait for the next scrub 
to see what is happening.
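One way to check that (a sketch; the PG id, object name and FileStore directory
layout are taken from this thread, but the exact OSD data path may differ on your
nodes):

# which PG holds the object and which OSDs are acting for it
ceph osd map rbd vm-101-disk-2

# on each acting OSD host, checksum the on-disk copy of the object
find /var/lib/ceph/osd/ceph-*/current/0.223_head/ \
  -name 'vm-101-disk-2__head_383C3223__0' -exec md5sum {} \;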

If anything goes wrong you will be able to upload an object with the exact same 
content from the file you copied.

Is Proxmox using such huge objects for something, to your knowledge (a VM boot 
image or something else)? Can you search the Proxmox mailing list and its open 
tickets to verify?

And is this the cause of the long deep scrub? I do think so but I’m not in 
front of the cluster.


> 
> Best regards
> - Mehmet
> 
>> Regards
>> JC Lopez
>> S. Technical Instructor, Global Storage Consulting Practice
>> Red Hat, Inc.
>> jelo...@redhat.com
>> +1 408-680-6959
>>> On Aug 26, 2016, at 04:16, Mehmet <c...@elchaka.de> wrote:
>>> Hello JC,
>>> as promised here is my
>>> - ceph.conf (I have done a "diff" on all involved server - all using the 
>>> same ceph.conf) = ceph_conf.txt
>>> - ceph pg 0.223 query = ceph_pg_0223_query_20161236.txt
>>> - ceph -s = ceph_s.txt
>>> - ceph df = ceph_df.txt
>>> - ceph osd df = ceph_osd_df.txt
>>> - ceph osd dump | grep pool = ceph_osd_dump_pool.txt
>>> - ceph osd crush rule dump = ceph_osd_crush_rule_dump.txt
>>> as attached txt files.
>>> I have done again a "ceph pg deep-scrub 0.223" before I have created the 
>>> files above. The issue still exists today ~ 12:24 on ... :*(
>>> The deep-scrub on this pg has taken ~14 minutes:
>>> - 2016-08-26 12:24:01.463411 osd.9 172.16.0.11:6808/29391 1777 : cluster 
>>> [INF] 0.223 deep-scrub starts
>>> - 2016-08-26 12:38:07.201726 osd.9 172.16.0.11:6808/29391 2485 : cluster 
>>> [INF] 0.223 deep-scrub ok
>>> Ceph: version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
>>> OS: Ubuntu 16.04 LTS (Linux osdserver1 4.4.0-31-generic #50-Ubuntu SMP Wed 
>>> Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux)
>>>> As a remark, assuming the size parameter of the rbd pool is set to 3, the 
>>>> number of PGs in your cluster should be higher
>>> I know I could increase this to 2048 (with 30 OSDs). But perhaps we will 
>>> create further pools, so I did not want to set this too high for this pool, 
>>> because it is not possible to decrease the PG count of a pool.
>>> Furthermore if I would change this now and the issue is gone, we would not 
>>> know what the cause was... :)
>>> When you need further informations, please do not hesitate to ask - I will 
>>> provides this as soon as possible.
>>> Please keep in mind t

Re: [ceph-users] Cache Tiering Question

2015-10-15 Thread JC Lopez
Hi Robert

Usable bytes, so before replication: the size of the actual original objects you 
write.
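For example (a sketch with made-up numbers; ssd-pool is the cache pool from your
ceph df output below):

# cap the cache pool at 1 TiB of user data (before replication)
ceph osd pool set ssd-pool target_max_bytes 1099511627776
# start flushing dirty objects at 40% of that cap, evict at 80%
ceph osd pool set ssd-pool cache_target_dirty_ratio 0.4
ceph osd pool set ssd-pool cache_target_full_ratio 0.8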

Cheers
JC

> On 15 Oct 2015, at 16:33, Robert LeBlanc  wrote:
> 
> 
> One more question. Is max_{bytes,objects} before or after replication factor?
> - 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> 
> 
> On Thu, Oct 15, 2015 at 4:42 PM, LOPEZ Jean-Charles  wrote:
>> Hi Robert,
>> 
>> yes they do.
>> 
>> Pools don’t have a size when you create them, hence the value/ratio pair that 
>> has to be defined for the cache tiering mechanism. Pools only have a number 
>> of PGs assigned. So the max values and the ratios for dirty and full 
>> must be set explicitly to match your configuration.
>> 
>> Note that you can define max_bytes and max_objects at the same time. The 
>> first of the two thresholds breached (after applying your ratio settings) will 
>> trigger eviction and/or flushing. The ratios you choose apply to both values.
>> 
>> Cheers
>> JC
>> 
>>> On 15 Oct 2015, at 15:02, Robert LeBlanc  wrote:
>>> 
>>> 
>>> hmmm...
>>> 
>>> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/#relative-sizing
>>> 
>>> makes it sound like it should be based on the size of the pool and
>>> that you don't have to set anything like max bytes/objects. Can you
>>> confirm that cache_target_{dirty,dirty_high,full}_ratio works as a
>>> ratio of target_max_bytes set?
>>> - 
>>> Robert LeBlanc
>>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>>> 
>>> 
>>> On Thu, Oct 15, 2015 at 3:32 PM, Nick Fisk  wrote:
 
 
 
 
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Robert LeBlanc
> Sent: 15 October 2015 22:06
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Cache Tiering Question
> 
> 
> ceph df (ceph version 0.94.3-252-g629b631
> (629b631488f044150422371ac77dfc005f3de1bc)) is showing some odd
> results:
> 
> root@nodez:~# ceph df
> GLOBAL:
>   SIZE   AVAIL  RAW USED %RAW USED
>   24518G 21670G1602G  6.53
> POOLS:
>   NAME ID USED  %USED MAX AVAIL OBJECTS
>   rbd  0  2723G 11.11 6380G 1115793
>   ssd-pool 2  0 0  732G   1
> 
> The rbd pool is showing 11.11% used, but if you calculate the numbers there
> it is 2723/6380=42.68%.
 
 I have a feeling that the percentage is based on the amount used of the
 total cluster size. Ie 2723/24518
 
> 
> Will this cause problems with the relative cache tier settings? Do I need to set
> the percentage based on what Ceph is reporting here?
 
 The flushing/eviction thresholds are based on the target_max_bytes number
 that you set, they have nothing to do with the underlying pool size. It's 
 up
 to you to come up with a sane number for this variable.
 
> 
> Thanks,
> - 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
>>> 