Re: [ceph-users] Intel SSD D3-S4510 and Intel SSD D3-S4610 firmware advisory notice

2019-04-19 Thread Irek Fasikhov
Wow!!!
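
Since the advisory bites at roughly 1,700 cumulative power-on hours, a quick
way to check how close a given drive is (a sketch, assuming smartmontools is
installed; /dev/sdX stands in for the Intel SSD):

smartctl -a /dev/sdX | grep -iE 'firmware|power_on_hours'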

Fri, 19 Apr 2019 at 10:16, Stefan Kooman :

> Hi List,
>
> TL;DR:
>
> For those of you who are running a Ceph cluster with Intel SSD D3-S4510
> and/or Intel SSD D3-S4610 drives with firmware version XCV10100, please
> upgrade to firmware XCV10110 ASAP, at least before ~1700 power-on hours.
>
> More information here:
>
>
> https://support.microsoft.com/en-us/help/4499612/intel-ssd-drives-unresponsive-after-1700-idle-hours
>
>
> https://downloadcenter.intel.com/download/28673/SSD-S4510-S4610-2-5-non-searchable-firmware-links/
>
> Gr. Stefan
>
> P.s. Thanks to Frank Dennis (@jedisct1) for retweeting @NerdPyle:
> https://twitter.com/jedisct1/status/1118623635072258049
>
>
> --
> | BIT BV  http://www.bit.nl/    Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Urgent: Reduced data availability / All pgs inactive

2019-02-20 Thread Irek Fasikhov
Hi,

You have a problem with the mgr (ceph-mgr).
http://docs.ceph.com/docs/master/rados/operations/pg-states/
*The ceph-mgr hasn’t yet received any information about the PG’s state from
an OSD since mgr started up.*
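
If the mgr daemon is up but stuck, restarting the active ceph-mgr usually
brings the PG stats back. A sketch, assuming systemd-managed daemons (the
instance name is a placeholder; run it on the node hosting the active mgr):

ceph mgr dump | grep active_name
systemctl restart ceph-mgr@<instance>
ceph -s   # the "unknown" PGs should clear once the mgr receives fresh stats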

Thu, 21 Feb 2019 at 09:04, Irek Fasikhov :

> Hi,
>
> You have a problem with the mgr (ceph-mgr).
> http://docs.ceph.com/docs/master/rados/operations/pg-states/
> *The ceph-mgr hasn’t yet received any information about the PG’s state
> from an OSD since mgr started up.*
>
>
> Wed, 20 Feb 2019 at 23:10, Ranjan Ghosh :
>
>> Hi all,
>>
>> hope someone can help me. After restarting a node of my 2-node-cluster
>> suddenly I get this:
>>
>> root@yak2 /var/www/projects # ceph -s
>>   cluster:
>> id: 749b2473-9300-4535-97a6-ee6d55008a1b
>> health: HEALTH_WARN
>> Reduced data availability: 200 pgs inactive
>>
>>   services:
>> mon: 3 daemons, quorum yak1,yak2,yak0
>> mgr: yak0.planwerk6.de(active), standbys: yak1.planwerk6.de,
>> yak2.planwerk6.de
>> mds: cephfs-1/1/1 up  {0=yak1.planwerk6.de=up:active}, 1 up:standby
>> osd: 2 osds: 2 up, 2 in
>>
>>   data:
>> pools:   2 pools, 200 pgs
>> objects: 0  objects, 0 B
>> usage:   0 B used, 0 B / 0 B avail
>> pgs: 100.000% pgs unknown
>>  200 unknown
>>
>> And this:
>>
>>
>> root@yak2 /var/www/projects # ceph health detail
>> HEALTH_WARN Reduced data availability: 200 pgs inactive
>> PG_AVAILABILITY Reduced data availability: 200 pgs inactive
>> pg 1.34 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.35 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.36 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.37 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.38 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.39 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3a is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3b is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3c is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3d is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3e is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3f is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.40 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.41 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.42 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.43 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.44 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.45 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.46 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.47 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.48 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.49 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.4a is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.4b is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.4c is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.4d is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.34 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.35 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.36 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.38 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.39 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.3a is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.3b is st

Re: [ceph-users] How to speed up backfill

2018-01-10 Thread Irek Fasikhov
ceph tell osd.* injectargs '--osd_recovery_delay_start 30'
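
For reference, these are the throttles most people raise for faster backfill;
the values below are only illustrative (the usual defaults are
osd_max_backfills=1 and osd_recovery_max_active=3), so increase them gradually
and watch client latency:

ceph tell osd.* injectargs '--osd_max_backfills 4 --osd_recovery_max_active 8'
# revert once the cluster is healthy again:
ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 3'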

2018-01-11 10:31 GMT+03:00 shadow_lin :

> Hi ,
> Mine is purely backfilling (I removed an OSD from the cluster), and it
> started at 600MB/s and ended at about 3MB/s.
> What does your recovery consist of? Is it backfill, log-replay PG
> recovery, or both?
>
> 2018-01-11
> --
> shadow_lin
> --
>
> *From:* Josef Zelenka 
> *Sent:* 2018-01-11 15:26
> *Subject:* Re: [ceph-users] How to speed up backfill
> *To:* "shadow_lin"
> *Cc:* "ceph-users"
>
>
> Hi, our recovery slowed down significantly towards the end; however, it was
> still about five times faster than the original speed. We suspected that
> this is somehow caused by threading (more objects transferred - more
> threads used), but this is only an assumption.
>
> On 11/01/18 05:02, shadow_lin wrote:
>
> Hi,
> I have tried these two methods, and for backfilling it seems only
> osd-max-backfills has an effect.
> How was your recovery speed when it came to the last few PGs or objects?
>
> 2018-01-11
> --
> shadow_lin
> --
>
> *From:* Josef Zelenka 
> 
> *Sent:* 2018-01-11 04:53
> *Subject:* Re: [ceph-users] How to speed up backfill
> *To:* "shadow_lin" 
> *Cc:*
>
>
> Hi, I had the same issue a few days back. I tried playing around with
> these two:
>
> ceph tell 'osd.*' injectargs '--osd-max-backfills '
> ceph tell 'osd.*' injectargs '--osd-recovery-max-active  '
> and it helped greatly (increased our recovery speed 20x), but be careful
> not to overload your systems.
>
>
> On 10/01/18 17:50, shadow_lin wrote:
>
> Hi all,
> I am playing with the backfill settings to try to find out how to control
> the backfill speed.
>
> So far I have only found that "osd max backfills" has an effect on the
> backfill speed. But once all PGs that need backfilling have begun
> backfilling, I can't find any way to speed it up further.
>
> Especially when it comes to the last PG to recover, the speed is only a
> few MB/s (when multiple PGs are being backfilled, the speed can be more
> than 600MB/s in my tests).
>
> I am a little confused about the backfill and recovery settings. Though
> backfilling is a kind of recovery, it seems the recovery settings only
> apply to replaying PG logs to recover PGs.
>
> Would changing "osd recovery max active" or other recovery settings have
> any effect on backfilling?
>
> I did try "osd recovery op priority" and "osd recovery max active" with
> no luck.
>
> Any advice would be greatly appreciated. Thanks.
>
> 2018-01-11
> --
> lin.yunfan
>
>
> ___
> ceph-users mailing list
> ceph-us...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Luminous release_type "rc"

2017-09-26 Thread Irek Fasikhov
Hi
No cause for concern:
https://github.com/ceph/ceph/pull/17348/commits/2b5f84586ec4d20ebb5aacd6f3c71776c621bf3b
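
For the monitoring check Stefan describes, something along these lines should
do (a sketch, assuming the Luminous "ceph versions" command and jq are
available; adjust the mon id to your host):

ceph versions   # one entry per distinct version across all daemons
ceph daemon mon.$(hostname -s) version | jq -r .release_type   # "rc" vs "stable"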

2017-09-26 11:23 GMT+03:00 Stefan Kooman :

> Hi,
>
> I noticed the ceph version still gives "rc" although we are using the
> latest Ceph packages: 12.2.0-1xenial
> (https://download.ceph.com/debian-luminous xenial/main amd64 Packages):
>
> ceph daemon mon.mon5 version
> {"version":"12.2.0","release":"luminous","release_type":"rc"}
>
> Why is this important (to me)? I want to make a monitoring check that
> ensures we
> are running identical, "stable" packages, instead of "beta" / "rc" in
> production.
>
> Gr. Stefan
>
>
>
> --
> | BIT BV  http://www.bit.nl/    Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Long OSD restart after upgrade to 10.2.9

2017-07-16 Thread Irek Fasikhov
Hi, Anton.
You need to run the OSD with debug_ms = 1/1 and debug_osd = 20/20 for
detailed information.
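
Since the delay happens during startup, setting the debug levels in ceph.conf
before restarting the OSD is the most useful; injectargs only reaches a daemon
that is already running. A sketch:

[osd]
    debug ms = 1/1
    debug osd = 20/20

# or, on an OSD that is already up:
ceph tell osd.0 injectargs '--debug_ms 1/1 --debug_osd 20/20'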

2017-07-17 8:26 GMT+03:00 Anton Dmitriev :

> Hi, all!
>
> After upgrading from 10.2.7 to 10.2.9 I see that restarting OSDs with
> 'restart ceph-osd id=N' or 'restart ceph-osd-all' takes about 10 minutes
> for an OSD to go from DOWN to UP. It is the same on all 208 OSDs across 7
> servers.
>
> OSD start is also very slow after rebooting the servers.
>
> Before the upgrade it took no more than 2 minutes.
>
> Does anyone have the same situation as mine?
>
>
> 2017-07-17 08:07:26.895600 7fac2d656840  0 set uid:gid to 4402:4402
> (ceph:ceph)
> 2017-07-17 08:07:26.895615 7fac2d656840  0 ceph version 10.2.9
> (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 197542
> 2017-07-17 08:07:26.897018 7fac2d656840  0 pidfile_write: ignore empty
> --pid-file
> 2017-07-17 08:07:26.906489 7fac2d656840  0 filestore(/var/lib/ceph/osd/ceph-0)
> backend xfs (magic 0x58465342)
> 2017-07-17 08:07:26.917074 7fac2d656840  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config
> option
> 2017-07-17 08:07:26.917092 7fac2d656840  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data
> hole' config option
> 2017-07-17 08:07:26.917112 7fac2d656840  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_features: splice is supported
> 2017-07-17 08:07:27.037031 7fac2d656840  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
> 2017-07-17 08:07:27.037154 7fac2d656840  0 
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_feature: extsize is disabled by conf
> 2017-07-17 08:15:17.839072 7fac2d656840  0 filestore(/var/lib/ceph/osd/ceph-0)
> mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
> 2017-07-17 08:15:20.150446 7fac2d656840  0 
> cls/hello/cls_hello.cc:305: loading cls_hello
> 2017-07-17 08:15:20.152483 7fac2d656840  0 
> cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan
> 2017-07-17 08:15:20.210428 7fac2d656840  0 osd.0 224167 crush map has
> features 2200130813952, adjusting msgr requires for clients
> 2017-07-17 08:15:20.210443 7fac2d656840  0 osd.0 224167 crush map has
> features 2200130813952 was 8705, adjusting msgr requires for mons
> 2017-07-17 08:15:20.210448 7fac2d656840  0 osd.0 224167 crush map has
> features 2200130813952, adjusting msgr requires for osds
> 2017-07-17 08:15:58.902173 7fac2d656840  0 osd.0 224167 load_pgs
> 2017-07-17 08:16:19.083406 7fac2d656840  0 osd.0 224167 load_pgs opened
> 242 pgs
> 2017-07-17 08:16:19.083969 7fac2d656840  0 osd.0 224167 using 0 op queue
> with priority op cut off at 64.
> 2017-07-17 08:16:19.109547 7fac2d656840 -1 osd.0 224167 log_to_monitors
> {default=true}
> 2017-07-17 08:16:19.522448 7fac2d656840  0 osd.0 224167 done with init,
> starting boot process
>
> --
> Dmitriev Anton
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] qemu-img convert vs rbd import performance

2017-07-13 Thread Irek Fasikhov
Hi.

You need to add the following to ceph.conf:
[client]
 rbd cache = true
 rbd readahead trigger requests = 5
 rbd readahead max bytes = 419430400
 *rbd readahead disable after bytes = 0*
 rbd_concurrent_management_ops = 50
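
Also note that qemu-img's own cache flag can override this: "-t none" on an
rbd destination tends to disable the librbd cache again. A hedged suggestion
(not verified on this setup) is to retry the conversion with writeback caching
on the destination, with the source file and volume name as placeholders:

qemu-img convert -p -t writeback -O raw <source-file> rbd:volumes/<volume-id>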

2017-07-13 15:29 GMT+03:00 Mahesh Jambhulkar :

> Seeing some performance issues on my Ceph cluster with *qemu-img convert*
> writing directly to Ceph, compared with the normal rbd import command.
>
> *Direct data copy (without qemu-img convert) took 5 hours 43 minutes for
> 465GB data.*
>
>
> [root@cephlarge vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]# time
> rbd import 66582225-6539-4e5e-9b7a-59aa16739df1 -p volumes
> 66582225-6539-4e5e-9b7a-59aa16739df1_directCopy --image-format 2
> rbd: --pool is deprecated for import, use --dest-pool
> Importing image: 100% complete...done.
>
> real*343m38.028s*
> user4m40.779s
> sys 7m18.916s
> [root@cephlarge vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]# rbd
> info volumes/66582225-6539-4e5e-9b7a-59aa16739df1_directCopy
> rbd image '66582225-6539-4e5e-9b7a-59aa16739df1_directCopy':
> size 465 GB in 119081 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.373174b0dc51
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff,
> deep-flatten
> flags:
> [root@cephlarge vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]#
>
>
> *qemu-img convert is still in progress and has completed merely 10% in
> more than 40 hours (for the same 465GB of data).*
>
> [root@cephlarge mnt]# time qemu-img convert -p -t none -O raw
> /mnt/data/workload_326e8a43-a90a-4fe9-8aab-6d33bcdf5a05/snap
> shot_9f0cee13-8200-4562-82ec-1fb9f234bcd8/vm_id_05e9534e-
> 5c84-4487-9613-1e0e227e4c1a/vm_res_id_24291e4b-93d2-47ad-
> 80a8-bf3c395319b9_vdb/66582225-6539-4e5e-9b7a-59aa16739df1
> rbd:volumes/24291e4b-93d2-47ad-80a8-bf3c395319b9
> (0.00/100%)
>
>
> (10.00/100%)
>
>
> *Rbd bench-write shows speed of ~21MB/s.*
>
> [root@cephlarge ~]# rbd bench-write image01 --pool=rbdbench
> bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
>   SEC   OPS   OPS/SEC   BYTES/SEC
> 2  6780   3133.53  12834946.35
> 3  6831   1920.65  7866998.17
> 4  8896   2040.50  8357871.83
> 5 13058   2562.61  10496432.34
> 6 17225   2836.78  11619432.99
> 7 20345   2736.84  11210076.25
> 8 23534   3761.57  15407392.94
> 9 25689   3601.35  14751109.98
>10 29670   3391.53  13891695.57
>11 33169   3218.29  13182107.64
>12 36356   3135.34  12842344.21
>13 38431   2972.62  12175863.99
>14 47780   4389.77  17980497.11
>15 55452   5156.40  21120627.26
>16 59298   4772.32  19547440.33
>17 61437   5151.20  21099315.94
>18 67702   5861.64  24009295.97
>19 77086   5895.03  24146032.34
>20 85474   5936.09  24314243.88
>21 93848   7499.73  30718898.25
>22100115   7783.39  31880760.34
>23105405   7524.76  30821410.70
>24111677   6797.12  27841003.78
>25116971   6274.51  25700386.48
>26121156   5468.77  22400087.81
>27126484   5345.83  21896515.02
>28137937   6412.41  26265239.30
>29143229   6347.28  25998461.13
>30149505   6548.76  26823729.97
>31159978   7815.37  32011752.09
>32171431   8821.65  36133479.15
>33181084   8795.28  36025472.27
>35182856   6322.41  25896605.75
>36186891   5592.25  22905872.73
>37190906   4876.30  19973339.07
>38190943   3076.87  12602853.89
>39190974   1536.79  6294701.64
>40195323   2344.75  9604081.07
>41198479   2703.00  11071492.89
>42208893   3918.55  16050365.70
>43214172   4702.42  19261091.89
>44215263   5167.53  21166212.98
>45219435   5392.57  22087961.94
>46225731   5242.85  21474728.85
>47234101   5009.43  20518607.70
>48243529   6326.00  25911280.08
>49254058   7944.90  32542315.10
> elapsed:50  ops:   262144  ops/sec:  5215.19  bytes/sec: 21361431.86
> [root@cephlarge ~]#
>
> This CEPH deployment has 2 OSDs.
>
> It would be of great help if anyone can give me pointers.
>
> --
> Regards,
> mahesh j
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] To backup or not to backup the classic way - How to backup hundreds of TB?

2017-02-14 Thread Irek Fasikhov
Hi.

We use the Ceph RADOS Gateway (S3), and we are very happy :).
Each administrator is responsible for their own service.

We use the following S3 clients:
Linux - s3cmd, duply;
Windows - CloudBerry.

P.S. 500 TB of data, 3x replication, 3 datacenters.
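
As an example of the kind of job we mean, a sketch with placeholder bucket
names and paths (s3cmd first needs the RGW endpoint configured via
"s3cmd --configure"):

s3cmd mb s3://backup-projects
s3cmd sync --delete-removed /data/projects/ s3://backup-projects/projects/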

Best regards, Irek Fasikhov
Mobile: +79229045757

2017-02-14 12:15 GMT+03:00 Götz Reinicke :

> Hi,
>
> I guess that's a question that pops up in different places, but I could
> not find any which fits to my thoughts.
>
> Currently we are starting to use Ceph for file shares of the films produced
> by our students and some Xen/VMware VMs. The VM data is already backed up;
> the films' original footage is stored in other places.
>
> We are starting with some 100TB of RBD and mount SMB/NFS shares from the
> clients. Maybe we will look into CephFS soon.
>
> The question is: how would someone handle a backup of 100 TB of data?
> Rsyncing that to another system or using a commercial backup solution
> does not look that attractive, e.g. regarding the price.
>
> One thought: is there some sort of best practice in the Ceph world, e.g.
> replicating to another physically independent cluster? Or using more
> replicas, OSDs and nodes, and doing snapshots within one cluster?
>
> Having production data and backups on the same hardware does not make me
> feel that good either… But the world changes :)
>
> Long story short: how do you back up hundreds of TB?
>
> Curious for suggestions and thoughts. Thanks and regards, Götz
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating data from a Ceph clusters to another

2017-02-09 Thread Irek Fasikhov
Hi.
I recommend using rbd import/export.
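
A minimal sketch of what that looks like, streaming an image between the two
clusters without an intermediate file (the conf paths are placeholders for
however you reach cluster A and cluster B):

rbd -c /etc/ceph/clusterA.conf export poolname/imagename - \
  | rbd -c /etc/ceph/clusterB.conf import - poolname/imagename

Repeat per image (e.g. loop over the output of "rbd ls poolname" on cluster A);
note that snapshots are not carried over by a plain export/import.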

Best regards, Irek Fasikhov
Mobile: +79229045757

2017-02-09 11:13 GMT+03:00 林自均 :

> Hi,
>
> I have 2 Ceph clusters, cluster A and cluster B. I want to move all the
> pools on A to B. The pool names don't conflict between clusters. I guess
> it's like RBD mirroring, except that it's pool mirroring. Is there any
> proper way to do it?
>
> Thanks for any suggestions.
>
> Best,
> John Lin
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-02 Thread Irek Fasikhov
Hi all,

The fuller these disks get, the slower they become.
*In my opinion*, SMR drives can only really be used for RGW.

Best regards, Irek Fasikhov
Mobile: +79229045757

2017-02-03 10:39 GMT+03:00 Christian Balzer :

>
> Hello,
>
> On Fri, 3 Feb 2017 10:30:28 +0300 Irek Fasikhov wrote:
>
> > Hi, Maxime.
> >
> > Linux SMR support only starts with kernel version 4.9.
> >
> What Irek said.
>
> Also, SMR in general is probably a bad match for Ceph.
> Drives like that really want to be treated more like a tape than anything
> else.
>
>
> In general, do you really need all this space, what's your use case?
>
> Unless it's something like a backup/archive cluster or pool with little to
> none concurrent R/W access, you're likely to run out of IOPS (again) long
> before filling these monsters up.
>
> Christian
> >
> > Best regards, Irek Fasikhov
> > Mobile: +79229045757
> >
> > 2017-02-03 10:26 GMT+03:00 Maxime Guyot :
> >
> > > Hi everyone,
> > >
> > >
> > >
> > > I’m wondering if anyone in the ML is running a cluster with archive
> type
> > > HDDs, like the HGST Ultrastar Archive (10TB@7.2k RPM) or the Seagate
> > > Enterprise Archive (8TB@5.9k RPM)?
> > >
> > > As far as I read they both fall in the enterprise class HDDs so
> **might**
> > > be suitable for a low performance, low cost cluster?
> > >
> > >
> > >
> > > Cheers,
> > >
> > > Maxime
> > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> > >
>
>
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com   Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-02 Thread Irek Fasikhov
Hi, Maxime.

Linux SMR support only starts with kernel version 4.9.


Best regards, Irek Fasikhov
Mobile: +79229045757

2017-02-03 10:26 GMT+03:00 Maxime Guyot :

> Hi everyone,
>
>
>
> I’m wondering if anyone in the ML is running a cluster with archive type
> HDDs, like the HGST Ultrastar Archive (10TB@7.2k RPM) or the Seagate
> Enterprise Archive (8TB@5.9k RPM)?
>
> As far as I read they both fall in the enterprise class HDDs so **might**
> be suitable for a low performance, low cost cluster?
>
>
>
> Cheers,
>
> Maxime
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] XFS no space left on device

2016-10-25 Thread Irek Fasikhov
Hi, Vasiliy.

You may have run out of inodes; check "df -i".
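
Two quick checks worth running here, using the device names from your output
(xfs_db -r is read-only; on a mounted filesystem the numbers are only
approximate):

df -i /var/lib/ceph/osd/ceph-123
xfs_db -r -c "freesp -s" /dev/mapper/disk23p1

If inodes look fine, the freesp summary shows whether free space is so
fragmented that XFS can no longer find contiguous extents, which can also
produce ENOSPC at ~89% usage.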

Best regards, Irek Fasikhov
Mobile: +79229045757

2016-10-25 15:52 GMT+03:00 Василий Ангапов :

> This is a a bit more information about that XFS:
>
> root@ed-ds-c178:[~]:$ xfs_info /dev/mapper/disk23p1
> meta-data=/dev/mapper/disk23p1   isize=2048   agcount=6, agsize=268435455
> blks
>  =   sectsz=4096  attr=2, projid32bit=1
>  =   crc=0finobt=0
> data =   bsize=4096   blocks=1465130385, imaxpct=5
>  =   sunit=0  swidth=0 blks
> naming   =version 2  bsize=4096   ascii-ci=0 ftype=0
> log  =internal   bsize=4096   blocks=521728, version=2
>  =   sectsz=4096  sunit=1 blks, lazy-count=1
> realtime =none   extsz=4096   blocks=0, rtextents=0
>
> root@ed-ds-c178:[~]:$ xfs_db /dev/mapper/disk23p1
> xfs_db> frag
> actual 25205642, ideal 22794438, fragmentation factor 9.57%
>
> 2016-10-25 14:59 GMT+03:00 Василий Ангапов :
> > Actually all OSDs are already mounted with inode64 option. Otherwise I
> > could not write beyond 1TB.
> >
> > 2016-10-25 14:53 GMT+03:00 Ashley Merrick :
> >> Sounds like the 32-bit inode limit; mounting with -o inode64 (not 100%
> >> sure how you would do that in Ceph) would allow data to continue to be
> >> written.
> >>
> >> ,Ashley
> >>
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of ??? ???
> >> Sent: 25 October 2016 12:38
> >> To: ceph-users 
> >> Subject: [ceph-users] XFS no space left on device
> >>
> >> Hello,
> >>
> >> I got Ceph 10.2.1 cluster with 10 nodes, each having 29 * 6TB OSDs.
> >> Yesterday I found that 3 OSDs were down and out with 89% space
> utilization.
> >> In logs there is:
> >> 2016-10-24 22:36:37.599253 7f8309c5e800  0 ceph version 10.2.1 (
> 3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid
> >> 2602081
> >> 2016-10-24 22:36:37.600129 7f8309c5e800  0 pidfile_write: ignore empty
> --pid-file
> >> 2016-10-24 22:36:37.635769 7f8309c5e800  0
> >> filestore(/var/lib/ceph/osd/ceph-123) backend xfs (magic 0x58465342)
> >> 2016-10-24 22:36:37.635805 7f8309c5e800 -1
> >> genericfilestorebackend(/var/lib/ceph/osd/ceph-123) detect_features:
> >> unable to create /var/lib/ceph/osd/ceph-123/fiemap_test: (28) No space
> left on device
> >> 2016-10-24 22:36:37.635814 7f8309c5e800 -1
> >> filestore(/var/lib/ceph/osd/ceph-123) _detect_fs: detect_features
> >> error: (28) No space left on device
> >> 2016-10-24 22:36:37.635818 7f8309c5e800 -1
> >> filestore(/var/lib/ceph/osd/ceph-123) FileStore::mount: error in
> >> _detect_fs: (28) No space left on device
> >> 2016-10-24 22:36:37.635824 7f8309c5e800 -1 osd.123 0 OSD:init: unable
> to mount object store
> >> 2016-10-24 22:36:37.635827 7f8309c5e800 -1 ESC[0;31m ** ERROR: osd init
> failed: (28) No space left on deviceESC[0m
> >>
> >> root@ed-ds-c178:[/var/lib/ceph/osd/ceph-123]:$ df -h
> /var/lib/ceph/osd/ceph-123
> >> FilesystemSize  Used Avail Use% Mounted on
> >> /dev/mapper/disk23p1  5.5T  4.9T  651G  89% /var/lib/ceph/osd/ceph-123
> >>
> >> root@ed-ds-c178:[/var/lib/ceph/osd/ceph-123]:$ df -i
> /var/lib/ceph/osd/ceph-123
> >> Filesystem  InodesIUsed IFree IUse% Mounted on
> >> /dev/mapper/disk23p1 146513024 22074752 124438272   16%
> >> /var/lib/ceph/osd/ceph-123
> >>
> >> root@ed-ds-c178:[/var/lib/ceph/osd/ceph-123]:$ touch 123
> >> touch: cannot touch ‘123’: No space left on device
> >>
> >> root@ed-ds-c178:[/var/lib/ceph/osd/ceph-123]:$ grep ceph-123
> /proc/mounts
> >> /dev/mapper/disk23p1 /var/lib/ceph/osd/ceph-123 xfs
> rw,noatime,attr2,inode64,noquota 0 0
> >>
> >> The same situation is for all three down OSDs. OSD can be unmounted and
> mounted without problem:
> >> root@ed-ds-c178:[~]:$ umount /var/lib/ceph/osd/ceph-123 
> >> root@ed-ds-c178:[~]:$
> root@ed-ds-c178:[~]:$ mount /var/lib/ceph/osd/ceph-123 root@ed-ds-c178:[~]:$
> touch /var/lib/ceph/osd/ceph-123/123
> >> touch: cannot touch ‘/var/lib/ceph/osd/ceph-123/123’: No space left on
> device
> >>
> >> xfs_repair gives no error for FS.
> >>
> >> Kernel is
> >> root@ed-ds-c178:[~]:$ uname -r
> >> 4.7.0-1.el7.wg.x86_64
> >>
> >> What else can I do to rectify that situation?
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Irek Fasikhov
Hi, Nick

I switched between forward and writeback. (forward -> writeback)
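
For anyone reproducing this, the switch in question is just the tier
cache-mode command (hammer-era syntax; <cachepool> is a placeholder). Given
the corruption reports in this thread, only try it on a test pool:

ceph osd tier cache-mode <cachepool> forward
ceph osd tier cache-mode <cachepool> writeback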

Best regards, Irek Fasikhov
Mobile: +79229045757

2016-03-17 16:10 GMT+03:00 Nick Fisk :

> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Irek Fasikhov
> > Sent: 17 March 2016 13:00
> > To: Sage Weil 
> > Cc: Robert LeBlanc ; ceph-users  > us...@lists.ceph.com>; Nick Fisk ; William Perkins
> > 
> > Subject: Re: [ceph-users] data corruption with hammer
> >
> > Hi,All.
> >
> > I confirm the problem. When min_read_recency_for_promote > 1, data
> > corruption occurs.
>
> But what scenario is this? Are you switching between forward and
> writeback, or just running in writeback?
>
> >
> >
> > Best regards, Irek Fasikhov
> > Mobile: +79229045757
> >
> > 2016-03-17 15:26 GMT+03:00 Sage Weil :
> > On Thu, 17 Mar 2016, Nick Fisk wrote:
> > > There is got to be something else going on here. All that PR does is to
> > > potentially delay the promotion to hit_set_period*recency instead of
> > > just doing it on the 2nd read regardless, it's got to be uncovering
> > > another bug.
> > >
> > > Do you see the same problem if the cache is in writeback mode before
> you
> > > start the unpacking. Ie is it the switching mid operation which causes
> > > the problem? If it only happens mid operation, does it still occur if
> > > you pause IO when you make the switch?
> > >
> > > Do you also see this if you perform on a RBD mount, to rule out any
> > > librbd/qemu weirdness?
> > >
> > > Do you know if it’s the actual data that is getting corrupted or if
> it's
> > > the FS metadata? I'm only wondering as unpacking should really only be
> > > writing to each object a couple of times, whereas FS metadata could
> > > potentially be being updated+read back lots of times for the same group
> > > of objects and ordering is very important.
> > >
> > > Thinking through it logically the only difference is that with
> recency=1
> > > the object will be copied up to the cache tier, where recency=6 it will
> > > be proxy read for a long time. If I had to guess I would say the issue
> > > would lie somewhere in the proxy read + writeback<->forward logic.
> >
> > That seems reasonable.  Was switching from writeback -> forward always
> > part of the sequence that resulted in corruption?  Not that there is a
> > known ordering issue when switching to forward mode.  I wouldn't really
> > expect it to bite real users but it's possible..
> >
> > http://tracker.ceph.com/issues/12814
> >
> > I've opened a ticket to track this:
> >
> > http://tracker.ceph.com/issues/15171
> >
> > What would be *really* great is if you could reproduce this with a
> > ceph_test_rados workload (from ceph-tests).  I.e., get ceph_test_rados
> > running, and then find the sequence of operations that are sufficient to
> > trigger a failure.
> >
> > sage
> >
> >
> >
> >  >
> > >
> > >
> > > > -Original Message-
> > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > Behalf Of
> > > > Mike Lovell
> > > > Sent: 16 March 2016 23:23
> > > > To: ceph-users ; sw...@redhat.com
> > > > Cc: Robert LeBlanc ; William Perkins
> > > > 
> > > > Subject: Re: [ceph-users] data corruption with hammer
> > > >
> > > > just got done with a test against a build of 0.94.6 minus the two
> commits
> > that
> > > > were backported in PR 7207. everything worked as it should with the
> > cache-
> > > > mode set to writeback and the min_read_recency_for_promote set to 2.
> > > > assuming it works properly on master, there must be a commit that
> we're
> > > > missing on the backport to support this properly.
> > > >
> > > > sage,
> > > > i'm adding you to the recipients on this so hopefully you see it.
> the tl;dr
> > > > version is that the backport of the cache recency fix to hammer
> doesn't
> > work
> > > > right and potentially corrupts data when
> > > > the min_read_recency_for_promote is set to greater than 1.
> > > >
> > > > mike
> > > >
> > > > On Wed, Mar 16, 2016 at 4:41 PM, Mike Lovell
> > > >  wrote:
> > > &g

Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Irek Fasikhov
Hi all,

I confirm the problem. When min_read_recency_for_promote > 1, data corruption occurs.
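
For anyone checking their own cache tiers, the setting lives on the cache
pool; a sketch with <cachepool> as a placeholder (dropping it back to 1 is
what avoided the corruption in the tests quoted below):

ceph osd pool get <cachepool> min_read_recency_for_promote
ceph osd pool set <cachepool> min_read_recency_for_promote 1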

Best regards, Irek Fasikhov
Mobile: +79229045757

2016-03-17 15:26 GMT+03:00 Sage Weil :

> On Thu, 17 Mar 2016, Nick Fisk wrote:
> > There is got to be something else going on here. All that PR does is to
> > potentially delay the promotion to hit_set_period*recency instead of
> > just doing it on the 2nd read regardless, it's got to be uncovering
> > another bug.
> >
> > Do you see the same problem if the cache is in writeback mode before you
> > start the unpacking. Ie is it the switching mid operation which causes
> > the problem? If it only happens mid operation, does it still occur if
> > you pause IO when you make the switch?
> >
> > Do you also see this if you perform on a RBD mount, to rule out any
> > librbd/qemu weirdness?
> >
> > Do you know if it’s the actual data that is getting corrupted or if it's
> > the FS metadata? I'm only wondering as unpacking should really only be
> > writing to each object a couple of times, whereas FS metadata could
> > potentially be being updated+read back lots of times for the same group
> > of objects and ordering is very important.
> >
> > Thinking through it logically the only difference is that with recency=1
> > the object will be copied up to the cache tier, where recency=6 it will
> > be proxy read for a long time. If I had to guess I would say the issue
> > would lie somewhere in the proxy read + writeback<->forward logic.
>
> That seems reasonable.  Was switching from writeback -> forward always
> part of the sequence that resulted in corruption?  Not that there is a
> known ordering issue when switching to forward mode.  I wouldn't really
> expect it to bite real users but it's possible..
>
> http://tracker.ceph.com/issues/12814
>
> I've opened a ticket to track this:
>
> http://tracker.ceph.com/issues/15171
>
> What would be *really* great is if you could reproduce this with a
> ceph_test_rados workload (from ceph-tests).  I.e., get ceph_test_rados
> running, and then find the sequence of operations that are sufficient to
> trigger a failure.
>
> sage
>
>
>
>  >
> >
> >
> > > -Original Message-
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of
> > > Mike Lovell
> > > Sent: 16 March 2016 23:23
> > > To: ceph-users ; sw...@redhat.com
> > > Cc: Robert LeBlanc ; William Perkins
> > > 
> > > Subject: Re: [ceph-users] data corruption with hammer
> > >
> > > just got done with a test against a build of 0.94.6 minus the two
> commits that
> > > were backported in PR 7207. everything worked as it should with the
> cache-
> > > mode set to writeback and the min_read_recency_for_promote set to 2.
> > > assuming it works properly on master, there must be a commit that we're
> > > missing on the backport to support this properly.
> > >
> > > sage,
> > > i'm adding you to the recipients on this so hopefully you see it. the
> tl;dr
> > > version is that the backport of the cache recency fix to hammer
> doesn't work
> > > right and potentially corrupts data when
> > > the min_read_recency_for_promote is set to greater than 1.
> > >
> > > mike
> > >
> > > On Wed, Mar 16, 2016 at 4:41 PM, Mike Lovell
> > >  wrote:
> > > robert and i have done some further investigation the past couple days
> on
> > > this. we have a test environment with a hard drive tier and an ssd
> tier as a
> > > cache. several vms were created with volumes from the ceph cluster. i
> did a
> > > test in each guest where i un-tarred the linux kernel source multiple
> times
> > > and then did a md5sum check against all of the files in the resulting
> source
> > > tree. i started off with the monitors and osds running 0.94.5 and
> never saw
> > > any problems.
> > >
> > > a single node was then upgraded to 0.94.6 which has osds in both the
> ssd and
> > > hard drive tier. i then proceeded to run the same test and, while the
> untar
> > > and md5sum operations were running, i changed the ssd tier cache-mode
> > > from forward to writeback. almost immediately the vms started
> reporting io
> > > errors and odd data corruption. the remainder of the cluster was
> updated to
> > > 0.94.6, including the monitors, and the same thing happened.
> > >
> > > things were cleaned up and reset and then a test was run
> > > where min_read_recency_for_promote for the ssd cache pool was set to 1.
> > > we previously had it set to 6. there was never an error with the
> recency
> > > setting set to 1. i then tested with it set to 2 and it immediately
> caused
> > > failures. we are currently thinking that it is related to the backport
> of the fix
> > > for the recency promotion and are in progress of making a .6 build
> without
> > > that backport to see if we can cause corruption. is anyone using a
> version
> > > from after the original recency fix (PR 6702) with a cache tier in
> writeback
> > > mode? anyone have a similar problem?
> > >
> > > mike
> > >
> > > On Mon, Mar

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-11 Thread Irek Fasikhov
Hi.
You need to read:
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
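
The test from that post, for reference; run it against the raw SSD device
(destructive for any data in the tested area) and look at the sustained 4k
sync-write IOPS:

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 \
    --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test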

Best regards, Irek Fasikhov
Mobile: +79229045757

2016-02-12 10:41 GMT+03:00 Huan Zhang :

> Hi,
>
> Ceph is VERY SLOW with 24 OSDs (SAMSUNG SSDs).
> fio /dev/rbd0 iodepth=1 direct=1   IOPS only ~200
> fio /dev/rbd0 iodepth=32 direct=1 IOPS only ~3000
>
> But testing a single SSD device with fio:
> fio iodepth=1 direct=1   IOPS  ~15000
> fio iodepth=32 direct=1 IOPS  ~3
>
> Why is Ceph SO SLOW? Could you give me some help?
> Appreciated!
>
>
> My Enviroment:
> [root@szcrh-controller ~]# ceph -s
> cluster eb26a8b9-e937-4e56-a273-7166ffaa832e
>  health HEALTH_WARN
> 1 mons down, quorum 0,1,2,3,4 ceph01,ceph02,ceph03,ceph04,
> ceph05
>  monmap e1: 6 mons at {ceph01=
>
> 10.10.204.144:6789/0,ceph02=10.10.204.145:6789/0,ceph03=10.10.204.146:6789/0,ceph04=10.10.204.147:6789/0,ceph05=10.10.204.148:6789/0,ceph06=0.0.0.0:0/5
> }
> election epoch 6, quorum 0,1,2,3,4
> ceph01,ceph02,ceph03,ceph04,ceph05
>  osdmap e114: 24 osds: 24 up, 24 in
> flags sortbitwise
>   pgmap v2213: 1864 pgs, 3 pools, 49181 MB data, 4485 objects
> 144 GB used, 42638 GB / 42782 GB avail
> 1864 active+clean
>
> [root@ceph03 ~]# lsscsi
> [0:0:6:0]diskATA  SAMSUNG MZ7KM1T9 003Q  /dev/sda
> [0:0:7:0]diskATA  SAMSUNG MZ7KM1T9 003Q  /dev/sdb
> [0:0:8:0]diskATA  SAMSUNG MZ7KM1T9 003Q  /dev/sdc
> [0:0:9:0]diskATA  SAMSUNG MZ7KM1T9 003Q  /dev/sdd
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Undersized pgs problem

2015-11-27 Thread Irek Fasikhov
Is your time synchronized (NTP)?
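
The "could not find secret_id" / "got bad authorizer" messages in your OSD
logs are a common symptom of cephx rotating keys drifting out of validity,
which is usually clock skew. A quick sketch to double-check on every node
(assuming ntpd or chrony):

ceph health detail | grep -i clock
ntpq -p          # or: chronyc sources -v
date +%s         # compare across nodes; more than a few seconds apart is bad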

Best regards, Irek Fasikhov
Mobile: +79229045757

2015-11-27 15:57 GMT+03:00 Vasiliy Angapov :

> > It seems that you played around with the crushmap and did something wrong.
> > Compare the output of 'ceph osd tree' and the crushmap. Some 'osd'
> > devices are renamed to 'device'; I think that is your problem.
> Is this actually a mistake? What I did was remove a bunch of OSDs from
> my cluster, which is why the numbering is sparse. But is it an issue to
> have sparse OSD numbering?
>
> > Hi.
> > Vasiliy, yes, it is a problem with the crushmap. Look at the weights:
> > -3 14.56000 host slpeah001
> > -2 14.56000 host slpeah002
> What exactly is wrong here?
>
> I also found out that my OSD logs are full of such records:
> 2015-11-26 08:31:19.273268 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:19.273276 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000
> sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a520).accept: got bad
> authorizer
> 2015-11-26 08:31:24.273207 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:24.273225 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:24.273231 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x3f90b000
> sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a3c0).accept: got bad
> authorizer
> 2015-11-26 08:31:29.273199 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:29.273215 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:29.273222 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000
> sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a260).accept: got bad
> authorizer
> 2015-11-26 08:31:34.273469 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:34.273482 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:34.273486 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x3f90b000
> sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a100).accept: got bad
> authorizer
> 2015-11-26 08:31:39.273310 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:39.273331 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:39.273342 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fcc000
> sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee19fa0).accept: got bad
> authorizer
> 2015-11-26 08:31:44.273753 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:44.273769 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:44.273776 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fcc000
> sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee189a0).accept: got bad
> authorizer
> 2015-11-26 08:31:49.273412 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:49.273431 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:49.273455 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000
> sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee19080).accept: got bad
> authorizer
> 2015-11-26 08:31:54.273293 7fe4f49b1700  0 auth: could not find
> secret_id=2924
>
> What does it mean? Google says it might be a time sync issue, but my
> clocks are perfectly synchronized...
>
> 2015-11-26 21:05 GMT+08:00 Irek Fasikhov :
> > Hi.
> > Vasiliy, yes, it is a problem with the crushmap. Look at the weights:
> > " -3 14.56000 host slpeah001
> >  -2 14.56000 host slpeah002
> >  "
> >
> > Best regards, Irek Fasikhov
> > Mobile: +79229045757
> >
> > 2015-11-26 13:16 GMT+03:00 ЦИТ РТ-Курамшин Камиль Фидаилевич
> > :
> >>
> >> It seems that you played around with the crushmap and did something wrong.
> >> Compare the output of 'ceph osd tree' and the crushmap. Some 'osd'
> >> devices are renamed to 'device'; I think that is your problem.
> >>
> >> Sent from a mobile device.
> >>
> >>
> >> -Original Message-
> >> From: Vasiliy Angapov 
> >> To: ceph

Re: [ceph-users] Undersized pgs problem

2015-11-26 Thread Irek Fasikhov
Hi.
Vasiliy, yes, it is a problem with the crushmap. Look at the weights:
" -3 14.56000 host slpeah001
 -2 14.56000 host slpeah002
 "

Best regards, Irek Fasikhov
Mobile: +79229045757

2015-11-26 13:16 GMT+03:00 ЦИТ РТ-Курамшин Камиль Фидаилевич <
kamil.kurams...@tatar.ru>:

> It seems that you played around with the crushmap and did something wrong.
> Compare the output of 'ceph osd tree' and the crushmap. Some 'osd'
> devices are renamed to 'device'; I think that is your problem.
>
> Sent from a mobile device.
>
>
> -Original Message-
> From: Vasiliy Angapov 
> To: ceph-users 
> Sent: Thu, 26 Nov 2015 07:53
> Subject: [ceph-users] Undersized pgs problem
>
> Hi, colleagues!
>
> I have a small 4-node Ceph cluster (0.94.2); all pools have size 3,
> min_size 1.
> Tonight one host failed and the cluster was unable to rebalance, saying
> there are a lot of undersized PGs.
>
> root@slpeah002:[~]:# ceph -s
> cluster 78eef61a-3e9c-447c-a3ec-ce84c617d728
>  health HEALTH_WARN
> 1486 pgs degraded
> 1486 pgs stuck degraded
> 2257 pgs stuck unclean
> 1486 pgs stuck undersized
> 1486 pgs undersized
> recovery 80429/555185 objects degraded (14.487%)
> recovery 40079/555185 objects misplaced (7.219%)
> 4/20 in osds are down
> 1 mons down, quorum 1,2 slpeah002,slpeah007
>  monmap e7: 3 mons at
> {slpeah001=
> 192.168.254.11:6780/0,slpeah002=192.168.254.12:6780/0,slpeah007=172.31.252.46:6789/0}
>
> election epoch 710, quorum 1,2 slpeah002,slpeah007
>  osdmap e14062: 20 osds: 16 up, 20 in; 771 remapped pgs
>   pgmap v7021316: 4160 pgs, 5 pools, 1045 GB data, 180 kobjects
> 3366 GB used, 93471 GB / 96838 GB avail
>                 80429/555185 objects degraded (14.487%)
> 40079/555185 objects misplaced (7.219%)
> 1903 active+clean
> 1486 active+undersized+degraded
>  771 active+remapped
>   client io 0 B/s rd, 246 kB/s wr, 67 op/s
>
>   root@slpeah002:[~]:# ceph osd tree
> ID  WEIGHT   TYPE NAME  UP/DOWN REWEIGHT PRIMARY-AFFINITY
>  -1 94.63998 root default
>  -9 32.75999 host slpeah007
>  72  5.45999 osd.72  up  1.0  1.0
>  73  5.45999 osd.73  up  1.0  1.0
>  74  5.45999 osd.74  up  1.0  1.0
>  75  5.45999 osd.75  up  1.0  1.0
>  76  5.45999 osd.76  up  1.0  1.0
>  77  5.45999 osd.77  up  1.0  1.0
> -10 32.75999 host slpeah008
>  78  5.45999 osd.78  up  1.0  1.0
>  79  5.45999 osd.79  up  1.0  1.0
>  80  5.45999 osd.80  up  1.0  1.0
>  81  5.45999 osd.81  up  1.0  1.0
>  82  5.45999 osd.82  up  1.0  1.0
>  83  5.45999 osd.83  up  1.0  1.0
>  -3 14.56000 host slpeah001
>   1  3.64000  osd.1 down  1.0  1.0
>  33  3.64000 osd.33down  1.0  1.0
>  34  3.64000 osd.34down  1.0  1.0
>  35  3.64000 osd.35down  1.0  1.0
>  -2 14.56000 host slpeah002
>   0  3.64000 osd.0   up  1.0  1.0
>  36  3.64000 osd.36  up  1.0  1.0
>  37  3.64000 osd.37  up  1.0  1.0
>  38  3.64000 osd.38  up  1.0  1.0
>
> Crushmap:
>
>  # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
> tunable chooseleaf_vary_r 1
> tunable straw_calc_version 1
> tunable allowed_bucket_algs 54
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 device2
> device 3 device3
> device 4 device4
> device 5 device5
> device 6 device6
> device 7 device7
> device 8 device8
> device 9 device9
> device 10 device10
> device 11 device11
> device 12 device12
> device 13 device13
> device 14 device14
> device 15 device15
> device 16 device16
> device 17 device17
> device 18 device18
> device 19 device19
> device 20 device20
> device 21 device21
> device 22 device22
> device 23 device23
> device 24 device24
> device 25 device25
> device 26 device26
> device 27 device27
> device 28 device28
> device 29 device29
> device 30 device30
> device 31 device31
> device 32 device32
> device 33 osd.33
> device 34 osd.34
> device 35 osd.35
> device 36 osd.36
> device 37 osd.37
> device 38 osd.38
> device 39 device39
> device 40 device40
> device 41 device41
> device 42 device42
> device 43 device43
> device 44 device44
> device 45 device45
> device 46 device46
> device 47 device47
> device 48 

Re: [ceph-users] proxmox 4.0 release : lxc with krbd support and qemu librbd improvements

2015-10-07 Thread Irek Fasikhov
Hi, Alexandre.

Very Very Good!
Thank you for your work! :)

Best regards, Irek Fasikhov
Mobile: +79229045757

2015-10-07 7:25 GMT+03:00 Alexandre DERUMIER :

> Hi,
>
> proxmox 4.0 has been released:
>
> http://forum.proxmox.com/threads/23780-Proxmox-VE-4-0-released!
>
>
> Some ceph improvements :
>
> - lxc containers with krbd support (multiple disks + snapshots)
> - qemu with jemalloc support (improve librbd performance)
> - qemu iothread option by disk (improve scaling rbd  with multiple disk)
> - librbd hammer version
>
> Regards,
>
> Alexandre
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Repair inconsistent pgs..

2015-08-17 Thread Irek Fasikhov
Hi, Igor.

You need to repair the PG.

for i in $(ceph pg dump | grep inconsistent | grep -v 'inconsistent+repair' | awk '{print $1}'); do ceph pg repair $i; done

Best regards, Irek Fasikhov
Mobile: +79229045757

2015-08-18 8:27 GMT+03:00 Voloshanenko Igor :

> Hi all, on our production cluster, due to heavy rebalancing ((( we have 2
> PGs in an inconsistent state...
>
> root@temp:~# ceph health detail | grep inc
> HEALTH_ERR 2 pgs inconsistent; 18 scrub errors
> pg 2.490 is active+clean+inconsistent, acting [56,15,29]
> pg 2.c4 is active+clean+inconsistent, acting [56,10,42]
>
> From OSD logs, after recovery attempt:
>
> root@test:~# ceph pg dump | grep -i incons | cut -f 1 | while read i; do
> ceph pg repair ${i} ; done
> dumped all in format plain
> instructing pg 2.490 on osd.56 to repair
> instructing pg 2.c4 on osd.56 to repair
>
> /var/log/ceph/ceph-osd.56.log:51:2015-08-18 07:26:37.035910 7f94663b3700
> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
> f5759490/rbd_data.1631755377d7e.04da/head//2 expected clone
> 90c59490/rbd_data.eb486436f2beb.7a65/141//2
> /var/log/ceph/ceph-osd.56.log:52:2015-08-18 07:26:37.035960 7f94663b3700
> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
> fee49490/rbd_data.12483d3ba0794b.522f/head//2 expected clone
> f5759490/rbd_data.1631755377d7e.04da/141//2
> /var/log/ceph/ceph-osd.56.log:53:2015-08-18 07:26:37.036133 7f94663b3700
> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
> a9b39490/rbd_data.12483d3ba0794b.37b3/head//2 expected clone
> fee49490/rbd_data.12483d3ba0794b.522f/141//2
> /var/log/ceph/ceph-osd.56.log:54:2015-08-18 07:26:37.036243 7f94663b3700
> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
> bac19490/rbd_data.1238e82ae8944a.032e/head//2 expected clone
> a9b39490/rbd_data.12483d3ba0794b.37b3/141//2
> /var/log/ceph/ceph-osd.56.log:55:2015-08-18 07:26:37.036289 7f94663b3700
> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
> 98519490/rbd_data.123e9c2ae8944a.0807/head//2 expected clone
> bac19490/rbd_data.1238e82ae8944a.032e/141//2
> /var/log/ceph/ceph-osd.56.log:56:2015-08-18 07:26:37.036314 7f94663b3700
> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
> c3c09490/rbd_data.1238e82ae8944a.0c2b/head//2 expected clone
> 98519490/rbd_data.123e9c2ae8944a.0807/141//2
> /var/log/ceph/ceph-osd.56.log:57:2015-08-18 07:26:37.036363 7f94663b3700
> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
> 28809490/rbd_data.edea7460fe42b.01d9/head//2 expected clone
> c3c09490/rbd_data.1238e82ae8944a.0c2b/141//2
> /var/log/ceph/ceph-osd.56.log:58:2015-08-18 07:26:37.036432 7f94663b3700
> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
> e1509490/rbd_data.1423897545e146.09a6/head//2 expected clone
> 28809490/rbd_data.edea7460fe42b.01d9/141//2
> /var/log/ceph/ceph-osd.56.log:59:2015-08-18 07:26:38.548765 7f94663b3700
> -1 log_channel(cluster) log [ERR] : 2.490 deep-scrub 17 errors
>
> So, how can I solve the "expected clone" situation by hand?
> Thanks in advance!
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Geographical Replication and Disaster Recovery Support

2015-08-13 Thread Irek Fasikhov
Hi.
This document applies only to RadosGW.

You need to read this document instead:
https://wiki.ceph.com/Planning/Blueprints/Hammer/RBD%3A_Mirroring


Best regards, Irek Fasikhov
Mobile: +79229045757

2015-08-13 11:40 GMT+03:00 Özhan Rüzgar Karaman :

> Hi;
> I would like to learn about Ceph's geographical replication and disaster
> recovery options. I know that currently we do not have a built-in, official
> geo-replication or disaster recovery; there are some third-party tools like
> DRBD, but they are not the kind of solution a business needs.
>
> I also read the RGW document at Ceph Wiki Site.
>
>
> https://wiki.ceph.com/Planning/Blueprints/Dumpling/RGW_Geo-Replication_and_Disaster_Recovery
>
>
> The document is from the Dumpling release, around 2013. Are there any
> active efforts to bring disaster recovery or geographical replication
> features to Ceph? Is it on the current roadmap?
>
> Thanks
> Özhan KARAMAN
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH cache layer. Very slow

2015-08-13 Thread Irek Fasikhov
Hi, Igor.
Try applying the patch from here:
http://www.theirek.com/blog/2014/02/16/patch-dlia-raboty-s-enierghoniezavisimym-keshiem-ssd-diskov

P.S. I no longer track changes in this area (kernel patches), because we
already use the recommended SSDs.

Best regards, Irek Fasikhov
Mobile: +79229045757

2015-08-13 11:56 GMT+03:00 Voloshanenko Igor :

> So, after testing SSD (i wipe 1 SSD, and used it for tests)
>
> root@ix-s2:~# sudo fio --filename=/dev/sda --direct=1 --sync=1 --rw=write
> --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting
> --name=journal-test
> journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync,
> iodepth=1
> fio-2.1.3
> Starting 1 process
> Jobs: 1 (f=1): [W] [100.0% done] [0KB/1152KB/0KB /s] [0/288/0 iops] [eta
> 00m:00s]
> journal-test: (groupid=0, jobs=1): err= 0: pid=2849460: Thu Aug 13
> 10:46:42 2015
>   write: io=68972KB, bw=1149.6KB/s, iops=287, runt= 60001msec
> clat (msec): min=2, max=15, avg= 3.48, stdev= 1.08
>  lat (msec): min=2, max=15, avg= 3.48, stdev= 1.08
> clat percentiles (usec):
>  |  1.00th=[ 2704],  5.00th=[ 2800], 10.00th=[ 2864], 20.00th=[ 2928],
>  | 30.00th=[ 3024], 40.00th=[ 3088], 50.00th=[ 3280], 60.00th=[ 3408],
>  | 70.00th=[ 3504], 80.00th=[ 3728], 90.00th=[ 3856], 95.00th=[ 4016],
>  | 99.00th=[ 9024], 99.50th=[ 9280], 99.90th=[ 9792], 99.95th=[10048],
>  | 99.99th=[14912]
> bw (KB  /s): min= 1064, max= 1213, per=100.00%, avg=1150.07,
> stdev=34.31
> lat (msec) : 4=94.99%, 10=4.96%, 20=0.05%
>   cpu  : usr=0.13%, sys=0.57%, ctx=17248, majf=0, minf=7
>   IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
> >=64=0.0%
>  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>  issued: total=r=0/w=17243/d=0, short=r=0/w=0/d=0
>
> Run status group 0 (all jobs):
>   WRITE: io=68972KB, aggrb=1149KB/s, minb=1149KB/s, maxb=1149KB/s,
> mint=60001msec, maxt=60001msec
>
> Disk stats (read/write):
>   sda: ios=0/17224, merge=0/0, ticks=0/59584, in_queue=59576, util=99.30%
>
> So, it's painful... the SSD does only 287 IOPS at 4K... 1.1 MB/s
>
> I tried to change the cache mode:
> echo temporary write through > /sys/class/scsi_disk/2:0:0:0/cache_type
> echo temporary write through > /sys/class/scsi_disk/3:0:0:0/cache_type
>
> No luck, still the same bad results. I also found this article:
> https://lkml.org/lkml/2013/11/20/264 pointing to an old, very simple
> patch which disables CMD_FLUSH:
> https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba
>
> Does anybody have better ideas on how to improve this? (Or how to disable
> CMD_FLUSH without recompiling the kernel? I use Ubuntu with kernel 4.0.4
> for now, the 4.x branch, because the SSD 850 Pro has an issue with NCQ
> TRIM, and before 4.0.4 this exception was not included in libata.)
>
> 2015-08-12 19:17 GMT+03:00 Pieter Koorts :
>
>> Hi Igor
>>
>> I suspect you have very much the same problem as me.
>>
>> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg22260.html
>>
>> Basically Samsung drives (like many SATA SSD's) are very much hit and
>> miss so you will need to test them like described here to see if they are
>> any good.
>> http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
>>
>> To give you an idea my average performance went from 11MB/s (with Samsung
>> SSD) to 30MB/s (without any SSD) on write performance. This is a very small
>> cluster.
>>
>> Pieter
>>
>> On Aug 12, 2015, at 04:33 PM, Voloshanenko Igor <
>> igor.voloshane...@gmail.com> wrote:
>>
>> Hi all, we have setup CEPH cluster with 60 OSD (2 diff types) (5 nodes,
>> 12 disks on each, 10 HDD, 2 SSD)
>>
>> Also we cover this with custom crushmap with 2 root leaf
>>
>> ID   WEIGHT  TYPE NAME  UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -100 5.0 root ssd
>> -102 1.0 host ix-s2-ssd
>>2 1.0 osd.2   up  1.0  1.0
>>9 1.0 osd.9   up  1.0  1.0
>> -103 1.0 host ix-s3-ssd
>>3 1.0 osd.3   up  1.0  1.0
>>7 1.0 osd.7   up  1.0  1.0
>> -104 1.0 host ix-s5-ssd
>>1 1.0 osd.1   up  1.0  1.0
>>6 1.0 osd.6   up  1.0  1.0
>> -105 1.0 host ix-s6-ssd
>>4 1.0 osd.4   up  1.0  1.0
>>8 1.0 osd.8   up  1.0  1.0
>> -106 1.0 host ix-s7-ssd
>>0 1.0 osd.0   up  1.0  1.0
>>5 1.0 osd.5   up  1.0  1.0
>>   -1 5.0 root platter
>>   -2 1.0 host ix-s2-platter
>>   13 1.0 osd.13  up  1.0  1.0
>>   17 1.0 osd.17   

Re: [ceph-users] RBD performance slowly degrades :-(

2015-08-12 Thread Irek Fasikhov
Hi.
Read this thread here:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg17360.html

Best regards, Irek Fasikhov
Mobile: +79229045757

2015-08-12 14:52 GMT+03:00 Pieter Koorts :

> Hi
>
> Something that's been bugging me for a while: I am trying to diagnose
> iowait time within KVM guests. Guests doing reads or writes tend to show
> about 50% to 90% iowait, but the host itself is only doing about 1% to 2%
> iowait. So the result is that the guests are extremely slow.
>
> I currently run 3x hosts each with a single SSD and single HDD OSD in
> cache-teir writeback mode. Although the SSD (Samsung 850 EVO 120GB) is not
> a great one it should at least perform reasonably compared to a hard disk
> and doing some direct SSD tests I get approximately 100MB/s write and
> 200MB/s read on each SSD.
>
> When I run rados bench though, the benchmark starts with a not great but
> okay speed and as the benchmark progresses it just gets slower and slower
> till it's worse than a USB hard drive. The SSD cache pool is 120GB in size
> (360GB RAW) and in use at about 90GB. I have tried tuning the XFS mount
> options as well but it has had little effect.
>
> Understandably the server spec is not great but I don't expect performance
> to be that bad.
>
> *OSD config:*
> [osd]
> osd crush update on start = false
> osd mount options xfs =
> "rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M"
>
> *Servers spec:*
> Dual Quad Core XEON E5410 and 32GB RAM in each server
> 10GBE @ 10G speed with 8000byte Jumbo Frames.
>
> *Rados bench result:* (starts at 50MB/s average and plummets down to
> 11MB/s)
> sudo rados bench -p rbd 50 write --no-cleanup -t 1
>  Maintaining 1 concurrent writes of 4194304 bytes for up to 50 seconds or
> 0 objects
>  Object prefix: benchmark_data_osc-mgmt-1_10007
>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>  0   0 0 0 0 0 - 0
>  1   11413   51.990652 0.0671911  0.074661
>  2   12726   51.990852 0.0631836 0.0751152
>  3   13736   47.992140 0.0691167 0.0802425
>  4   15150   49.992256 0.0816432 0.0795869
>  5   15655   43.993420  0.208393  0.088523
>  6   1616039.99420  0.241164 0.0999179
>  7   16463   35.993412  0.239001  0.106577
>  8   16665   32.4942 8  0.214354  0.122767
>  9   17271 31.5524  0.132588  0.125438
> 10   17776   30.394820  0.256474  0.128548
> 11   17978   28.3589 8  0.183564  0.138354
> 12   18281   26.995612  0.345809  0.145523
> 13   1858425.84212  0.373247  0.151291
> 14   18685   24.2819 4  0.950586  0.160694
> 15   18685   22.6632 0 -  0.160694
> 16   19089   22.2466 8  0.204714  0.178352
> 17   19493   21.879116  0.282236  0.180571
> 18   19897   21.552416  0.262566  0.183742
> 19   1   101   100   21.049512  0.357659  0.187477
> 20   1   104   10320.59712  0.369327  0.192479
> 21   1   105   104   19.8066 4  0.373233  0.194217
> 22   1   105   104   18.9064 0 -  0.194217
> 23   1   106   105   18.2582 2   2.35078  0.214756
> 24   1   107   106   17.6642 4  0.680246  0.219147
> 25   1   109   108   17.2776 8  0.677688  0.229222
> 26   1   113   112   17.228316   0.29171  0.230487
> 27   1   117   116   17.182816  0.255915  0.231101
> 28   1   120   119   16.997612  0.412411  0.235122
> 29   1   120   119   16.4115 0 -  0.235122
> 30   1   120   119   15.8645 0 -  0.235122
> 31   1   120   119   15.3527 0 -  0.235122
> 32   1   122   121   15.1229 2  0.319309  0.262822
> 33   1   124   123   14.9071 8  0.344094  0.266201
> 34   1   127   126   14.821512   0.33534  0.267913
> 35   1   129   128   14.6266 8  0.355403  0.269241
> 36   1   132   131   14.553612  0.581528  0.274327
> 37   1   132   131   14.1603 0 -  0.274327
> 38   1   133   132   13.8929 2   1.43621   0.28313
> 39   1   134   133   13.6392 4  0.894817  0.287729
> 40   1   134   133  

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-22 Thread Irek Fasikhov
| Proxmox 4.0 will allow to enable|disable 1 iothread by disk.

Alexandre, useful option!
Will it be possible to add it in Proxmox 3.4 as well, at least via the
configuration file? Or does that require changes to the QEMU/KVM source code?
Thanks.

2015-06-22 11:54 GMT+03:00 Alexandre DERUMIER :

> >>It is already possible to do in proxmox 3.4 (with the latest updates
> qemu-kvm 2.2.x). But it is necessary to register in the conf file
> iothread:1. For single drives the ambiguous behavior of productivity.
>
> Yes and no ;)
>
> Currently in proxmox 3.4, iothread:1  generate only 1 iothread for all
> disks.
>
> So, you'll have a small extra boost, but it'll not scale with multiple
> disks.
>
> Proxmox 4.0 will allow to enable|disable 1 iothread by disk.
>
>
> >>Does it also help for single disks or only multiple disks?
>
> Iothread can also help for single disk, because by default qemu use a main
> thread for disk but also other things(don't remember what exactly)
>
>
>
>
> - Mail original -
> De: "Irek Fasikhov" 
> À: "Stefan Priebe" 
> Cc: "aderumier" , "pushpesh sharma" <
> pushpesh@gmail.com>, "Somnath Roy" ,
> "ceph-devel" , "ceph-users" <
> ceph-users@lists.ceph.com>
> Envoyé: Lundi 22 Juin 2015 09:22:13
> Objet: Re: rbd_cache, limiting read on high iops around 40k
>
> It is already possible to do in proxmox 3.4 (with the latest updates
> qemu-kvm 2.2.x). But it is necessary to register in the conf file
> iothread:1. For single drives the ambiguous behavior of productivity.
>
> 2015-06-22 10:12 GMT+03:00 Stefan Priebe - Profihost AG <
> s.pri...@profihost.ag > :
>
>
>
> Am 22.06.2015 um 09:08 schrieb Alexandre DERUMIER < aderum...@odiso.com >:
>
> >>> Just an update, there seems to be no proper way to pass iothread
> >>> parameter from openstack-nova (not at least in Juno release). So a
> >>> default single iothread per VM is what all we have. So in conclusion a
> >>> nova instance max iops on ceph rbd will be limited to 30-40K.
> >
> > Thanks for the update.
> >
> > For proxmox users,
> >
> > I have added iothread option to gui for proxmox 4.0
>
> Can we make iothread the default? Does it also help for single disks or
> only multiple disks?
>
> > and added jemalloc as default memory allocator
> >
> >
> > I have also send a jemmaloc patch to qemu dev mailing
> > https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg05265.html
> >
> > (Help is welcome to push it in qemu upstream ! )
> >
> >
> >
> > - Mail original -
> > De: "pushpesh sharma" < pushpesh@gmail.com >
> > À: "aderumier" < aderum...@odiso.com >
> > Cc: "Somnath Roy" < somnath@sandisk.com >, "Irek Fasikhov" <
> malm...@gmail.com >, "ceph-devel" < ceph-de...@vger.kernel.org >,
> "ceph-users" < ceph-users@lists.ceph.com >
> > Envoyé: Lundi 22 Juin 2015 07:58:47
> > Objet: Re: rbd_cache, limiting read on high iops around 40k
> >
> > Just an update, there seems to be no proper way to pass iothread
> > parameter from openstack-nova (not at least in Juno release). So a
> > default single iothread per VM is what all we have. So in conclusion a
> > nova instance max iops on ceph rbd will be limited to 30-40K.
> >
> > On Tue, Jun 16, 2015 at 10:08 PM, Alexandre DERUMIER
> > < aderum...@odiso.com > wrote:
> >> Hi,
> >>
> >> some news about qemu with tcmalloc vs jemmaloc.
> >>
> >> I'm testing with multiple disks (with iothreads) in 1 qemu guest.
> >>
> >> And if tcmalloc is a little faster than jemmaloc,
> >>
> >> I have hit a lot of time the
> tcmalloc::ThreadCache::ReleaseToCentralCache bug.
> >>
> >> increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, don't help.
> >>
> >>
> >> with multiple disk, I'm around 200k iops with tcmalloc (before hitting
> the bug) and 350kiops with jemmaloc.
> >>
> >> The problem is that when I hit malloc bug, I'm around 4000-1 iops,
> and only way to fix is is to restart qemu ...
> >>
> >>
> >>
> >> - Mail original -
> >> De: "pushpesh sharma" < pushpesh@gmail.com >
> >> À: "aderumier" < aderum...@odiso.com >
> >> Cc: "Somnath Roy" < somnath@sandisk.com >, "Irek Fasikhov" <
> malm...@gmail.com

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-22 Thread Irek Fasikhov
This is already possible in Proxmox 3.4 (with the latest qemu-kvm 2.2.x
updates), but you have to set iothread: 1 in the VM's config file. For single
drives the performance benefit is ambiguous.
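
For illustration only (a sketch, not official documentation; the VMID below is
a placeholder): in the 3.4-era VM config this is a single global option, e.g.

# /etc/pve/qemu-server/101.conf
iothread: 1

Per-disk iothreads only arrive with Proxmox 4.0, as discussed below.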

2015-06-22 10:12 GMT+03:00 Stefan Priebe - Profihost AG <
s.pri...@profihost.ag>:

>
> Am 22.06.2015 um 09:08 schrieb Alexandre DERUMIER :
>
> >>> Just an update, there seems to be no proper way to pass iothread
> >>> parameter from openstack-nova (not at least in Juno release). So a
> >>> default single iothread per VM is what all we have. So in conclusion a
> >>> nova instance max iops on ceph rbd will be limited to 30-40K.
> >
> > Thanks for the update.
> >
> > For proxmox users,
> >
> > I have added iothread option to gui for proxmox 4.0
>
> Can we make iothread the default? Does it also help for single disks or
> only multiple disks?
>
> > and added jemalloc as default memory allocator
> >
> >
> > I have also send a jemmaloc patch to qemu dev mailing
> > https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg05265.html
> >
> > (Help is welcome to push it in qemu upstream ! )
> >
> >
> >
> > - Mail original -
> > De: "pushpesh sharma" 
> > À: "aderumier" 
> > Cc: "Somnath Roy" , "Irek Fasikhov" <
> malm...@gmail.com>, "ceph-devel" ,
> "ceph-users" 
> > Envoyé: Lundi 22 Juin 2015 07:58:47
> > Objet: Re: rbd_cache, limiting read on high iops around 40k
> >
> > Just an update, there seems to be no proper way to pass iothread
> > parameter from openstack-nova (not at least in Juno release). So a
> > default single iothread per VM is what all we have. So in conclusion a
> > nova instance max iops on ceph rbd will be limited to 30-40K.
> >
> > On Tue, Jun 16, 2015 at 10:08 PM, Alexandre DERUMIER
> >  wrote:
> >> Hi,
> >>
> >> some news about qemu with tcmalloc vs jemmaloc.
> >>
> >> I'm testing with multiple disks (with iothreads) in 1 qemu guest.
> >>
> >> And if tcmalloc is a little faster than jemmaloc,
> >>
> >> I have hit a lot of time the
> tcmalloc::ThreadCache::ReleaseToCentralCache bug.
> >>
> >> increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, don't help.
> >>
> >>
> >> with multiple disk, I'm around 200k iops with tcmalloc (before hitting
> the bug) and 350kiops with jemmaloc.
> >>
> >> The problem is that when I hit malloc bug, I'm around 4000-1 iops,
> and only way to fix is is to restart qemu ...
> >>
> >>
> >>
> >> - Mail original -
> >> De: "pushpesh sharma" 
> >> À: "aderumier" 
> >> Cc: "Somnath Roy" , "Irek Fasikhov" <
> malm...@gmail.com>, "ceph-devel" ,
> "ceph-users" 
> >> Envoyé: Vendredi 12 Juin 2015 08:58:21
> >> Objet: Re: rbd_cache, limiting read on high iops around 40k
> >>
> >> Thanks, posted the question in openstack list. Hopefully will get some
> >> expert opinion.
> >>
> >> On Fri, Jun 12, 2015 at 11:33 AM, Alexandre DERUMIER
> >>  wrote:
> >>> Hi,
> >>>
> >>> here a libvirt xml sample from libvirt src
> >>>
> >>> (you need to define  number, then assign then in disks).
> >>>
> >>> I don't use openstack, so I really don't known how it's working with
> it.
> >>>
> >>>
> >>> 
> >>> QEMUGuest1
> >>> c7a5fdbd-edaf-9455-926a-d65c16db1809
> >>> 219136
> >>> 219136
> >>> 2
> >>> 2
> >>> 
> >>> hvm
> >>> 
> >>> 
> >>> 
> >>> destroy
> >>> restart
> >>> destroy
> >>> 
> >>> /usr/bin/qemu
> >>> 
> >>> 
> >>> 
> >>> 
> >>>  function='0x0'/>
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>>
> >>>
> >>> - Mail original -
> >>> De: "pushpesh sharma" 
> >>> À: "aderumier" 
> >>> Cc: "Somnath Roy" , "Irek Fasikhov" <
> malm...@gmail.com>, "ceph-devel" ,

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-16 Thread Irek Fasikhov
If necessary, there are RPM files for centos 7:
gperftools.spec
<https://drive.google.com/file/d/0BxoNLVWxzOJWaVVmWTA3Z18zbUE/edit?usp=drive_web>
pprof-2.4-1.el7.centos.noarch.rpm
<https://drive.google.com/file/d/0BxoNLVWxzOJWRmQ2ZEt6a1pnSVk/edit?usp=drive_web>
gperftools-libs-2.4-1.el7.centos.x86_64.rpm
<https://drive.google.com/file/d/0BxoNLVWxzOJWcVByNUZHWWJqRXc/edit?usp=drive_web>
gperftools-devel-2.4-1.el7.centos.x86_64.rpm
<https://drive.google.com/file/d/0BxoNLVWxzOJWYTUzQTNha3J3NEU/edit?usp=drive_web>
gperftools-debuginfo-2.4-1.el7.centos.x86_64.rpm
<https://drive.google.com/file/d/0BxoNLVWxzOJWVzBic043YUk2LWM/edit?usp=drive_web>
gperftools-2.4-1.el7.centos.x86_64.rpm
<https://drive.google.com/file/d/0BxoNLVWxzOJWNm81QWdQYU9ZaG8/edit?usp=drive_web>

2015-06-17 8:01 GMT+03:00 Alexandre DERUMIER :

> Hi,
> I finally fix it with tcmalloc with
>
> TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=268435456 LD_PRELOAD} =
> "/usr/lib/libtcmalloc_minimal.so.4" qemu
>
> I got almost same result than jemmaloc in this case, maybe a littleb it
> faster
>
>
> Here the iops results for 1qemu vm with iothread by disk (iodepth=32,
> 4krandread, nocache)
>
>
> qemu randread 4k nocache libc6  iops
>
>
> 1 disk  29052
> 2 disks 55878
> 4 disks 127899
> 8 disks 240566
> 15 disks269976
>
> qemu randread 4k nocache jemmaloc   iops
>
> 1 disk   41278
> 2 disks  75781
> 4 disks  195351
> 8 disks  294241
> 15 disks 298199
>
>
>
> qemu randread 4k nocache tcmalloc 16M cache iops
>
>
> 1 disk   37911
> 2 disks  67698
> 4 disks  41076
> 8 disks  43312
> 15 disks 37569
>
>
> qemu randread 4k nocache tcmalloc patched 256M  iops
>
> 1 disk no-iothread
> 1 disk   42160
> 2 disks  83135
> 4 disks  194591
> 8 disks  306038
> 15 disks 302278
>
>
> - Mail original -
> De: "aderumier" 
> À: "Mark Nelson" 
> Cc: "ceph-users" 
> Envoyé: Mardi 16 Juin 2015 20:27:54
> Objet: Re: [ceph-users] rbd_cache, limiting read on high iops around 40k
>
> >>I forgot to ask, is this with the patched version of tcmalloc that
> >>theoretically fixes the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES issue?
>
> Yes, the patched version of tcmalloc, but also the last version from
> gperftools git.
> (I'm talking about qemu here, not osds).
>
> I have tried to increased TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, but it
> doesn't help.
>
>
>
> For osd, increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES is helping.
> (Benchs are still running, I try to overload them as much as possible)
>
>
>
> - Mail original -
> De: "Mark Nelson" 
> À: "ceph-users" 
> Envoyé: Mardi 16 Juin 2015 19:04:27
> Objet: Re: [ceph-users] rbd_cache, limiting read on high iops around 40k
>
> I forgot to ask, is this with the patched version of tcmalloc that
> theoretically fixes the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES issue?
>
> Mark
>
> On 06/16/2015 11:46 AM, Mark Nelson wrote:
> > Hi Alexandre,
> >
> > Excellent find! Have you also informed the QEMU developers of your
> > discovery?
> >
> > Mark
> >
> > On 06/16/2015 11:38 AM, Alexandre DERUMIER wrote:
> >> Hi,
> >>
> >> some news about qemu with tcmalloc vs jemmaloc.
> >>
> >> I'm testing with multiple disks (with iothreads) in 1 qemu guest.
> >>
> >> And if tcmalloc is a little faster than jemmaloc,
> >>
> >> I have hit a lot of time the
> >> tcmalloc::ThreadCache::ReleaseToCentralCache bug.
> >>
> >> increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, don't help.
> >>
> >>
> >> with multiple disk, I'm around 200k iops with tcmalloc (before hitting
> >> the bug) and 350kiops with jemmaloc.
> >>
> >> The problem is that when I hit malloc bug, I'm around 4000-1 iops,
> >> and only way to fix is is to restart qemu ...
> >>
> >>
> >>
> >> - Mail original -
> >> De: "pushpesh sharma" 
> >> À: "aderumier" 
> >> Cc: "Somnath Roy" , "Irek Fasikhov"
> >> , "ceph-devel" ,
> >> "ceph-users" 
> >> Envoyé: Vendredi 12 Juin 2015 08:58:21
> >> Objet: Re: rbd_cache, limiting read on high iops around 40k
> >>
> >> Thanks, posted the question in openstack list. Hopefully will get some
> >> expert opinion.
> >>
> >> On Fri, Jun 12, 2015 at 11:33 AM, Ale

Re: [ceph-users] [Fwd: adding a a monitor wil result in cephx: verify_reply couldn't decrypt with error: error decoding block for decryption]

2015-06-11 Thread Irek Fasikhov
Run the following command by hand: ntpdate <NTP_SERVER_ADDRESS>
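
For example (the NTP server below is a placeholder -- use your own source, and
run this on every monitor host):

ntpdate -q pool.ntp.org   # query only, shows the current offset
ntpdate pool.ntp.org      # one-off sync; keep ntpd/chronyd running afterwards

Once the clocks agree the cephx "could not decrypt" errors should stop; they
are typically a symptom of clock skew between the nodes.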

2015-06-11 12:36 GMT+03:00 Makkelie, R (ITCDCC) - KLM <
ramon.makke...@klm.com>:

>  all ceph releated servers have the same NTP server
> and double checked the time and timezones
> the are all correct
>
>
> -Original Message-
> *From*: Irek Fasikhov  >
> *To*: "Makkelie, R (ITCDCC) - KLM"  <%22Makkelie,%20r%20%28itcdcc%29%20-%20klm%22%20%3cramon.makke...@klm.com%3e>
> >
> *Cc*: ceph-users@lists.ceph.com  <%22ceph-us...@lists.ceph.com%22%20%3cceph-us...@lists.ceph.com%3e>>
> *Subject*: Re: [ceph-users] [Fwd: adding a a monitor wil result in cephx:
> verify_reply couldn't decrypt with error: error decoding block for
> decryption]
> *Date*: Thu, 11 Jun 2015 12:16:53 +0300
>
> It is necessary to synchronize time
>
>
> 2015-06-11 11:09 GMT+03:00 Makkelie, R (ITCDCC) - KLM <
> ramon.makke...@klm.com>:
>
> i'm trying to add a extra monitor to my already existing cluster
> i do this with the ceph-deploy with the following command
>
> ceph-deploy mon add "mynewhost"
>
> the ceph-deploy says its all finished
> but when i take a look at my new monitor host in the logs i see the
> following error
>
> cephx: verify_reply couldn't decrypt with error: error decoding block for
> decryption
>
> and when i take a look in my existing monitor logs i see this error
> cephx: verify_authorizer could not decrypt ticket info: error: NSS AES
> final round failed: -8190
>
> i tried gatherking key's
> copy keys
> reinstall/purge the new monitor node
>
> greetz
> Ramon 
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
>
> -- С уважением, Фасихов Ирек Нургаязович Моб.: +79229045757
> 
>



-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Fwd: adding a a monitor wil result in cephx: verify_reply couldn't decrypt with error: error decoding block for decryption]

2015-06-11 Thread Irek Fasikhov
It is necessary to synchronize time

2015-06-11 11:09 GMT+03:00 Makkelie, R (ITCDCC) - KLM <
ramon.makke...@klm.com>:

>  i'm trying to add a extra monitor to my already existing cluster
> i do this with the ceph-deploy with the following command
>
> ceph-deploy mon add "mynewhost"
>
> the ceph-deploy says its all finished
> but when i take a look at my new monitor host in the logs i see the
> following error
>
> cephx: verify_reply couldn't decrypt with error: error decoding block for
> decryption
>
> and when i take a look in my existing monitor logs i see this error
> cephx: verify_authorizer could not decrypt ticket info: error: NSS AES
> final round failed: -8190
>
> i tried gatherking key's
> copy keys
> reinstall/purge the new monitor node
>
> greetz
> Ramon 
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-09 Thread Irek Fasikhov
Hi, Alexandre.

Very good work!
Do you have an RPM file?
Thanks.

2015-06-10 7:10 GMT+03:00 Alexandre DERUMIER :

> Hi,
>
> I have tested qemu with last tcmalloc 2.4, and the improvement is huge
> with iothread: 50k iops (+45%) !
>
>
>
> qemu : no iothread : glibc : iops=33395
> qemu : no-iothread : tcmalloc (2.2.1) : iops=34516 (+3%)
> qemu : no-iothread : jemmaloc : iops=42226 (+26%)
> qemu : no-iothread : tcmalloc (2.4) : iops=35974 (+7%)
>
>
> qemu : iothread : glibc : iops=34516
> qemu : iothread : tcmalloc : iops=38676 (+12%)
> qemu : iothread : jemmaloc : iops=28023 (-19%)
> qemu : iothread : tcmalloc (2.4) : iops=50276 (+45%)
>
>
>
>
>
> qemu : iothread : tcmalloc (2.4) : iops=50276 (+45%)
> --
> rbd_iodepth32-test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K,
> ioengine=libaio, iodepth=32
> fio-2.1.11
> Starting 1 process
> Jobs: 1 (f=1): [r(1)] [100.0% done] [214.7MB/0KB/0KB /s] [54.1K/0/0 iops]
> [eta 00m:00s]
> rbd_iodepth32-test: (groupid=0, jobs=1): err= 0: pid=894: Wed Jun 10
> 05:54:24 2015
>   read : io=5120.0MB, bw=201108KB/s, iops=50276, runt= 26070msec
> slat (usec): min=1, max=1136, avg= 3.54, stdev= 3.58
> clat (usec): min=128, max=6262, avg=631.41, stdev=197.71
>  lat (usec): min=149, max=6265, avg=635.27, stdev=197.40
> clat percentiles (usec):
>  |  1.00th=[  318],  5.00th=[  378], 10.00th=[  418], 20.00th=[  474],
>  | 30.00th=[  516], 40.00th=[  564], 50.00th=[  612], 60.00th=[  652],
>  | 70.00th=[  700], 80.00th=[  756], 90.00th=[  860], 95.00th=[  980],
>  | 99.00th=[ 1272], 99.50th=[ 1384], 99.90th=[ 1688], 99.95th=[ 1896],
>  | 99.99th=[ 3760]
> bw (KB  /s): min=145608, max=249688, per=100.00%, avg=201108.00,
> stdev=21718.87
> lat (usec) : 250=0.04%, 500=25.84%, 750=53.00%, 1000=16.63%
> lat (msec) : 2=4.46%, 4=0.03%, 10=0.01%
>   cpu  : usr=9.73%, sys=24.93%, ctx=66417, majf=0, minf=38
>   IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
> >=64=0.0%
>  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
> >=64=0.0%
>  issued: total=r=1310720/w=0/d=0, short=r=0/w=0/d=0
>  latency   : target=0, window=0, percentile=100.00%, depth=32
>
> Run status group 0 (all jobs):
>READ: io=5120.0MB, aggrb=201107KB/s, minb=201107KB/s, maxb=201107KB/s,
> mint=26070msec, maxt=26070msec
>
> Disk stats (read/write):
>   vdb: ios=1302555/0, merge=0/0, ticks=715176/0, in_queue=714840,
> util=99.73%
>
>
>
>
>
>
> rbd_iodepth32-test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K,
> ioengine=libaio, iodepth=32
> fio-2.1.11
> Starting 1 process
> Jobs: 1 (f=1): [r(1)] [100.0% done] [158.7MB/0KB/0KB /s] [40.6K/0/0 iops]
> [eta 00m:00s]
> rbd_iodepth32-test: (groupid=0, jobs=1): err= 0: pid=889: Wed Jun 10
> 06:05:06 2015
>   read : io=5120.0MB, bw=143897KB/s, iops=35974, runt= 36435msec
> slat (usec): min=1, max=710, avg= 3.31, stdev= 3.35
> clat (usec): min=191, max=4740, avg=884.66, stdev=315.65
>  lat (usec): min=289, max=4743, avg=888.31, stdev=315.51
> clat percentiles (usec):
>  |  1.00th=[  462],  5.00th=[  516], 10.00th=[  548], 20.00th=[  596],
>  | 30.00th=[  652], 40.00th=[  764], 50.00th=[  868], 60.00th=[  940],
>  | 70.00th=[ 1004], 80.00th=[ 1096], 90.00th=[ 1256], 95.00th=[ 1416],
>  | 99.00th=[ 2024], 99.50th=[ 2224], 99.90th=[ 2544], 99.95th=[ 2640],
>  | 99.99th=[ 3632]
> bw (KB  /s): min=98352, max=177328, per=99.91%, avg=143772.11,
> stdev=21782.39
> lat (usec) : 250=0.01%, 500=3.48%, 750=35.69%, 1000=30.01%
> lat (msec) : 2=29.74%, 4=1.07%, 10=0.01%
>   cpu  : usr=7.10%, sys=16.90%, ctx=54855, majf=0, minf=38
>   IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
> >=64=0.0%
>  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
> >=64=0.0%
>  issued: total=r=1310720/w=0/d=0, short=r=0/w=0/d=0
>  latency   : target=0, window=0, percentile=100.00%, depth=32
>
> Run status group 0 (all jobs):
>READ: io=5120.0MB, aggrb=143896KB/s, minb=143896KB/s, maxb=143896KB/s,
> mint=36435msec, maxt=36435msec
>
> Disk stats (read/write):
>   vdb: ios=1301357/0, merge=0/0, ticks=1033036/0, in_queue=1032716,
> util=99.85%
>
>
> - Mail original -
> De: "aderumier" 
> À: "Robert LeBlanc" 
> Cc: "Mark Nelson" , "ceph-devel" <
> ceph-de...@vger.kernel.org>, "pushpesh sharma" ,
> "ceph-users" 
> Envoyé: Mardi 9 Juin 2015 18:47:27
> Objet: Re: [ceph-users] rbd_cache, limiting read on high iops around 40k
>
> Hi Robert,
>
> >>What I found was that Ceph OSDs performed well with either
> >>tcmalloc or jemalloc (except when RocksDB was built with jemalloc
> >>instead of tcmalloc, I'm still working to dig into why that might be
> >>the case).
> yes,from my test, for osd tcmalloc is a

Re: [ceph-users] active+clean+scrubbing+deep

2015-06-02 Thread Irek Fasikhov
Hi.

Restart the OSD. :)
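
For example (a sketch -- the PG id is the one from your output below, and the
exact restart command depends on your init system):

ceph pg map 19.b3              # the first OSD in the acting set is the primary
service ceph restart osd.NN    # run this on the host holding that primary OSD

Once the primary OSD restarts, the PG should re-peer and the hung deep-scrub
gets rescheduled.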

2015-06-02 11:55 GMT+03:00 Никитенко Виталий :

> Hi!
>
> I have ceph version 0.94.1.
>
> root@ceph-node1:~# ceph -s
> cluster 3e0d58cd-d441-4d44-b49b-6cff08c20abf
>  health HEALTH_OK
>  monmap e2: 3 mons at {ceph-mon=
> 10.10.100.3:6789/0,ceph-node1=10.10.100.1:6789/0,ceph-node2=10.10.100.2:6789/0
> }
> election epoch 428, quorum 0,1,2 ceph-node1,ceph-node2,ceph-mon
>  osdmap e978: 16 osds: 16 up, 16 in
>   pgmap v6735569: 2012 pgs, 8 pools, 2801 GB data, 703 kobjects
> 5617 GB used, 33399 GB / 39016 GB avail
> 2011 active+clean
>1 active+clean+scrubbing+deep
>   client io 174 kB/s rd, 30641 kB/s wr, 80 op/s
>
> root@ceph-node1:~# ceph pg dump  | grep -i deep | cut -f 1
>   dumped all in format plain
>   pg_stat
>   19.b3
>
> In log file i see
> 2015-05-14 03:23:51.556876 7fc708a37700  0 log_channel(cluster) log [INF]
> : 19.b3 deep-scrub starts
> but no "19.b3 deep-scrub ok"
>
> then i do "ceph pg deep-scrub 19.b3", nothing happens and in logs file no
> any records about it.
>
> What can i do to pg return in "active + clean" station?
> is there any sense restart OSD or the entirely server where the OSD?
>
> Thanks.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Irek Fasikhov
Patrik,
at the moment you do not have any problems related to slow requests.
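
If the warning comes back, you can see which operations were slow with, for
example (the OSD id is a placeholder):

ceph health detail
ceph --admin-daemon /var/run/ceph/ceph-osd.<id>.asok dump_historic_ops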

2015-05-12 8:56 GMT+03:00 Patrik Plank :

>  So ok, understand.
>
> But what can I do if the scrubbing process hangs by one page since last
> night:
>
>
> root@ceph01:~# ceph health detail
> HEALTH_OK
>
> root@ceph01:~# ceph pg dump | grep scrub
> pg_statobjectsmipdegrmispunfbyteslog
> disklogstatestate_stampvreportedupup_primary
> actingacting_primarylast_scrubscrub_stamplast_deep_scrub
> deep_scrub_stamp
> 2.5cb1010000423620608324324
> active+clean+scrubbing+deep2015-05-11 23:01:37.0567474749'324
> 4749:6524[14,10]14[14,10]144749'3182015-05-10
> 22:05:29.2528763423'3092015-05-04 21:44:46.609791
>
> Perhaps an idea?
>
>
> best regards
>
>
>  -Original message-
> *From:* Irek Fasikhov 
> *Sent:* Tuesday 12th May 2015 7:49
> *To:* Patrik Plank ; ceph-users@lists.ceph.com
> *Subject:* Re: [ceph-users] HEALTH_WARN 6 requests are blocked
>
> Scrubbing greatly affects the I / O and can slow queries on OSD. For more
> information, look in the 'ceph health detail' and 'ceph pg dump | grep
> scrub'
>
> 2015-05-12 8:42 GMT+03:00 Patrik Plank :
>
>>  Hi,
>>
>>
>> is that the reason for the Health Warn or the scrubbing notification?
>>
>>
>>
>> thanks
>>
>> regards
>>
>>
>>  -Original message-
>> *From:* Irek Fasikhov 
>> *Sent:* Tuesday 12th May 2015 7:33
>> *To:* Patrik Plank 
>> *Cc:* ceph-users@lists.ceph.com >> ceph-users@lists.ceph.com <
>> ceph-users@lists.ceph.com>
>> *Subject:* Re: [ceph-users] HEALTH_WARN 6 requests are blocked
>>
>> Hi, Patrik.
>>
>> You must configure the priority of the I / O for scrubbing.
>>
>> http://dachary.org/?p=3268
>>
>>
>>
>> 2015-05-12 8:03 GMT+03:00 Patrik Plank :
>>
>>>  Hi,
>>>
>>>
>>> the ceph cluster shows always the scrubbing notifications, although he
>>> do not scrub.
>>>
>>> And what does the "Health Warn" mean.
>>>
>>> Does anybody have an idea why the warning is displayed.
>>>
>>> How can I solve this?
>>>
>>>
>>>  cluster 78227661-3a1b-4e56-addc-c2a272933ac2
>>>  health HEALTH_WARN 6 requests are blocked > 32 sec
>>>  monmap e3: 3 mons at {ceph01=
>>> 10.0.0.20:6789/0,ceph02=10.0.0.21:6789/0,ceph03=10.0.0.22:6789/0},
>>> election epoch 92, quorum 0,1,2 ceph01,ceph02,ceph03
>>>  osdmap e4749: 30 osds: 30 up, 30 in
>>>   pgmap v2321129: 4608 pgs, 2 pools, 1712 GB data, 440 kobjects
>>> 3425 GB used, 6708 GB / 10134 GB avail
>>>1 active+clean+scrubbing+deep
>>> 4607 active+clean
>>>   client io 3282 kB/s rd, 10742 kB/s wr, 182 op/s
>>>
>>>
>>> thanks
>>>
>>> best regards
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>
>>
>> --
>>  С уважением, Фасихов Ирек Нургаязович
>> Моб.: +79229045757
>>
>>
>
>
> --
>  С уважением, Фасихов Ирек Нургаязович
> Моб.: +79229045757
>
>


-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Irek Fasikhov
Scrubbing greatly affects I/O and can cause slow requests on the OSDs. For more
information, look at 'ceph health detail' and 'ceph pg dump | grep scrub'.

2015-05-12 8:42 GMT+03:00 Patrik Plank :

>  Hi,
>
>
> is that the reason for the Health Warn or the scrubbing notification?
>
>
>
> thanks
>
> regards
>
>
>  -Original message-
> *From:* Irek Fasikhov 
> *Sent:* Tuesday 12th May 2015 7:33
> *To:* Patrik Plank 
> *Cc:* ceph-users@lists.ceph.com >> ceph-users@lists.ceph.com <
> ceph-users@lists.ceph.com>
> *Subject:* Re: [ceph-users] HEALTH_WARN 6 requests are blocked
>
> Hi, Patrik.
>
> You must configure the priority of the I / O for scrubbing.
>
> http://dachary.org/?p=3268
>
>
>
> 2015-05-12 8:03 GMT+03:00 Patrik Plank :
>
>>  Hi,
>>
>>
>> the ceph cluster shows always the scrubbing notifications, although he do
>> not scrub.
>>
>> And what does the "Health Warn" mean.
>>
>> Does anybody have an idea why the warning is displayed.
>>
>> How can I solve this?
>>
>>
>>  cluster 78227661-3a1b-4e56-addc-c2a272933ac2
>>  health HEALTH_WARN 6 requests are blocked > 32 sec
>>  monmap e3: 3 mons at {ceph01=
>> 10.0.0.20:6789/0,ceph02=10.0.0.21:6789/0,ceph03=10.0.0.22:6789/0},
>> election epoch 92, quorum 0,1,2 ceph01,ceph02,ceph03
>>  osdmap e4749: 30 osds: 30 up, 30 in
>>   pgmap v2321129: 4608 pgs, 2 pools, 1712 GB data, 440 kobjects
>> 3425 GB used, 6708 GB / 10134 GB avail
>>1 active+clean+scrubbing+deep
>> 4607 active+clean
>>   client io 3282 kB/s rd, 10742 kB/s wr, 182 op/s
>>
>>
>> thanks
>>
>> best regards
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
>  С уважением, Фасихов Ирек Нургаязович
> Моб.: +79229045757
>
>


-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Irek Fasikhov
Hi, Patrik.

You must configure the I/O priority for scrubbing:

http://dachary.org/?p=3268
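
In short, the approach there is to lower the I/O priority of the OSD disk
threads. A sketch of the settings it describes (they only take effect when the
data disks use the CFQ scheduler):

ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'

The same two options can be made permanent in the [osd] section of ceph.conf.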



2015-05-12 8:03 GMT+03:00 Patrik Plank :

>  Hi,
>
>
> the ceph cluster shows always the scrubbing notifications, although he do
> not scrub.
>
> And what does the "Health Warn" mean.
>
> Does anybody have an idea why the warning is displayed.
>
> How can I solve this?
>
>
>  cluster 78227661-3a1b-4e56-addc-c2a272933ac2
>  health HEALTH_WARN 6 requests are blocked > 32 sec
>  monmap e3: 3 mons at {ceph01=
> 10.0.0.20:6789/0,ceph02=10.0.0.21:6789/0,ceph03=10.0.0.22:6789/0},
> election epoch 92, quorum 0,1,2 ceph01,ceph02,ceph03
>  osdmap e4749: 30 osds: 30 up, 30 in
>   pgmap v2321129: 4608 pgs, 2 pools, 1712 GB data, 440 kobjects
> 3425 GB used, 6708 GB / 10134 GB avail
>1 active+clean+scrubbing+deep
> 4607 active+clean
>   client io 3282 kB/s rd, 10742 kB/s wr, 182 op/s
>
>
> thanks
>
> best regards
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] very different performance on two volumes in the same pool

2015-04-27 Thread Irek Fasikhov
Hi, Nikola.

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg19152.html
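
If it turns out to be the same tcmalloc thread-cache problem, the usual
mitigation from that thread is to raise the cache limit in the environment the
OSDs are started with and then restart them. A sketch (the file path depends
on your distro and init setup):

# e.g. /etc/sysconfig/ceph on RHEL/CentOS, /etc/default/ceph on Debian/Ubuntu
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728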

2015-04-27 14:17 GMT+03:00 Nikola Ciprich :

> Hello Somnath,
> > Thanks for the perf data..It seems innocuous..I am not seeing single
> tcmalloc trace, are you running with tcmalloc by the way ?
>
> according to ldd, it seems I have it compiled in, yes:
> [root@vfnphav1a ~]# ldd /usr/bin/ceph-osd
> .
> .
> libtcmalloc.so.4 => /usr/lib64/libtcmalloc.so.4 (0x7f7a3756e000)
> .
> .
>
>
> > What about my other question, is the performance of slow volume
> increasing if you stop IO on the other volume ?
> I don't have any other cpeh users, actually whole cluster is idle..
>
> > Are you using default ceph.conf ? Probably, you want to try with
> different osd_op_num_shards (may be = 10 , based on your osd server config)
> and osd_op_num_threads_per_shard (may be = 1). Also, you may want to see
> the effect by doing osd_enable_op_tracker = false
>
> I guess I'm using pretty default settings, few changes probably not much
> related:
>
> [osd]
> osd crush update on start = false
>
> [client]
> rbd cache = true
> rbd cache writethrough until flush = true
>
> [mon]
> debug paxos = 0
>
>
>
> I now tried setting
> throttler perf counter = false
> osd enable op tracker = false
> osd_op_num_threads_per_shard = 1
> osd_op_num_shards = 10
>
> and restarting all ceph servers.. but it seems to make no big difference..
>
>
> >
> > Are you seeing similar resource consumption on both the servers while IO
> is going on ?
> yes, on all three nodes, ceph-osd seems to be consuming lots of CPU during
> benchmark.
>
> >
> > Need some information about your client, are the volumes exposed with
> krbd or running with librbd environment ? If krbd and with same physical
> box, hope you mapped the images with 'noshare' enabled.
>
> I'm using fio with ceph engine, so I guess none rbd related stuff is in
> use here?
>
>
> >
> > Too many questions :-)  But, this may give some indication what is going
> on there.
> :-) hopefully my answers are not too confused, I'm still pretty new to
> ceph..
>
> BR
>
> nik
>
>
> >
> > Thanks & Regards
> > Somnath
> >
> > -Original Message-
> > From: Nikola Ciprich [mailto:nikola.cipr...@linuxbox.cz]
> > Sent: Sunday, April 26, 2015 7:32 AM
> > To: Somnath Roy
> > Cc: ceph-users@lists.ceph.com; n...@linuxbox.cz
> > Subject: Re: [ceph-users] very different performance on two volumes in
> the same pool
> >
> > Hello Somnath,
> >
> > On Fri, Apr 24, 2015 at 04:23:19PM +, Somnath Roy wrote:
> > > This could be again because of tcmalloc issue I reported earlier.
> > >
> > > Two things to observe.
> > >
> > > 1. Is the performance improving if you stop IO on other volume ? If
> so, it could be different issue.
> > there is no other IO.. only cephfs mounted, but no users of it.
> >
> > >
> > > 2. Run perf top in the OSD node and see if tcmalloc traces are popping
> up.
> >
> > don't see anything special:
> >
> >   3.34%  libc-2.12.so  [.] _int_malloc
> >   2.87%  libc-2.12.so  [.] _int_free
> >   2.79%  [vdso][.] __vdso_gettimeofday
> >   2.67%  libsoftokn3.so[.] 0x0001fad9
> >   2.34%  libfreeblpriv3.so [.] 0x000355e6
> >   2.33%  libpthread-2.12.so[.] pthread_mutex_unlock
> >   2.19%  libpthread-2.12.so[.] pthread_mutex_lock
> >   1.80%  libc-2.12.so  [.] malloc
> >   1.43%  [kernel]  [k] do_raw_spin_lock
> >   1.42%  libc-2.12.so  [.] memcpy
> >   1.23%  [kernel]  [k] __switch_to
> >   1.19%  [kernel]  [k]
> acpi_processor_ffh_cstate_enter
> >   1.09%  libc-2.12.so  [.] malloc_consolidate
> >   1.08%  [kernel]  [k] __schedule
> >   1.05%  libtcmalloc.so.4.1.0  [.] 0x00017e6f
> >   0.98%  libc-2.12.so  [.] vfprintf
> >   0.83%  libstdc++.so.6.0.13   [.] std::basic_ostream std::char_traits >& std::__ostream_insert std::char_traits >(std::basic_ostream >   0.76%  libstdc++.so.6.0.13   [.] 0x0008092a
> >   0.73%  libc-2.12.so  [.] __memset_sse2
> >   0.72%  libc-2.12.so  [.] __strlen_sse42
> >   0.70%  libstdc++.so.6.0.13   [.] std::basic_streambuf std::char_traits >::xsputn(char const*, long)
> >   0.68%  libpthread-2.12.so[.] pthread_mutex_trylock
> >   0.67%  librados.so.2.0.0 [.] ceph_crc32c_sctp
> >   0.63%  libpython2.6.so.1.0   [.] 0x0007d823
> >   0.55%  libnss3.so[.] 0x00056d2a
> >   0.52%  libc-2.12.so  [.] free
> >   0.50%  libstdc++.so.6.0.13   [.] std::basic_string std::char_traits, std::allocator >::basic_string(std::string
> const&)
> >
> > should I check anything else?
> > BR
> > nik
> >
> >
> > >
> > > Thanks & Regards
> > > Somnath
> > >
> > > -Original Message-

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-24 Thread Irek Fasikhov
Hi, Alexandre!
Have you tried changing the vm.min_free_kbytes parameter?
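
Something along these lines (the value is only an example -- size it to the
host's RAM):

sysctl vm.min_free_kbytes               # check the current value
sysctl -w vm.min_free_kbytes=524288     # raise the kernel's free-memory reserve
# add the same setting to /etc/sysctl.conf to make it persistent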

2015-04-23 19:24 GMT+03:00 Somnath Roy :

> Alexandre,
> You can configure with --with-jemalloc or ./do_autogen -J to build ceph
> with jemalloc.
>
> Thanks & Regards
> Somnath
>
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Alexandre DERUMIER
> Sent: Thursday, April 23, 2015 4:56 AM
> To: Mark Nelson
> Cc: ceph-users; ceph-devel; Milosz Tanski
> Subject: Re: [ceph-users] strange benchmark problem : restarting osd
> daemon improve performance from 100k iops to 300k iops
>
> >>If you have the means to compile the same version of ceph with
> >>jemalloc, I would be very interested to see how it does.
>
> Yes, sure. (I have around 3-4 weeks to do all the benchs)
>
> But I don't know how to do it ?
> I'm running the cluster on centos7.1, maybe it can be easy to patch the
> srpms to rebuild the package with jemalloc.
>
>
>
> - Mail original -
> De: "Mark Nelson" 
> À: "aderumier" , "Srinivasula Maram" <
> srinivasula.ma...@sandisk.com>
> Cc: "ceph-users" , "ceph-devel" <
> ceph-de...@vger.kernel.org>, "Milosz Tanski" 
> Envoyé: Jeudi 23 Avril 2015 13:33:00
> Objet: Re: [ceph-users] strange benchmark problem : restarting osd daemon
> improve performance from 100k iops to 300k iops
>
> Thanks for the testing Alexandre!
>
> If you have the means to compile the same version of ceph with jemalloc, I
> would be very interested to see how it does.
>
> In some ways I'm glad it turned out not to be NUMA. I still suspect we
> will have to deal with it at some point, but perhaps not today. ;)
>
> Mark
>
> On 04/23/2015 05:58 AM, Alexandre DERUMIER wrote:
> > Maybe it's tcmalloc related
> > I thinked to have patched it correctly, but perf show a lot of
> > tcmalloc::ThreadCache::ReleaseToCentralCache
> >
> > before osd restart (100k)
> > --
> > 11.66% ceph-osd libtcmalloc.so.4.1.2 [.]
> > tcmalloc::ThreadCache::ReleaseToCentralCache
> > 8.51% ceph-osd libtcmalloc.so.4.1.2 [.]
> > tcmalloc::CentralFreeList::FetchFromSpans
> > 3.04% ceph-osd libtcmalloc.so.4.1.2 [.]
> > tcmalloc::CentralFreeList::ReleaseToSpans
> > 2.04% ceph-osd libtcmalloc.so.4.1.2 [.] operator new 1.63% swapper
> > [kernel.kallsyms] [k] intel_idle 1.35% ceph-osd libtcmalloc.so.4.1.2
> > [.] tcmalloc::CentralFreeList::ReleaseListToSpans
> > 1.33% ceph-osd libtcmalloc.so.4.1.2 [.] operator delete 1.07% ceph-osd
> > libstdc++.so.6.0.19 [.] std::basic_string > std::char_traits, std::allocator >::basic_string 0.91%
> > ceph-osd libpthread-2.17.so [.] pthread_mutex_trylock 0.88% ceph-osd
> > libc-2.17.so [.] __memcpy_ssse3_back 0.81% ceph-osd ceph-osd [.]
> > Mutex::Lock 0.79% ceph-osd [kernel.kallsyms] [k]
> > copy_user_enhanced_fast_string 0.74% ceph-osd libpthread-2.17.so [.]
> > pthread_mutex_unlock 0.67% ceph-osd [kernel.kallsyms] [k]
> > _raw_spin_lock 0.63% swapper [kernel.kallsyms] [k]
> > native_write_msr_safe 0.62% ceph-osd [kernel.kallsyms] [k]
> > avc_has_perm_noaudit 0.58% ceph-osd ceph-osd [.] operator< 0.57%
> > ceph-osd [kernel.kallsyms] [k] __schedule 0.57% ceph-osd
> > [kernel.kallsyms] [k] __d_lookup_rcu 0.54% swapper [kernel.kallsyms]
> > [k] __schedule
> >
> >
> > after osd restart (300k iops)
> > --
> > 3.47% ceph-osd libtcmalloc.so.4.1.2 [.] operator new 1.92% ceph-osd
> > libtcmalloc.so.4.1.2 [.] operator delete 1.86% swapper
> > [kernel.kallsyms] [k] intel_idle 1.52% ceph-osd libstdc++.so.6.0.19
> > [.] std::basic_string,
> > std::allocator >::basic_string 1.34% ceph-osd
> > libtcmalloc.so.4.1.2 [.] tcmalloc::ThreadCache::ReleaseToCentralCache
> > 1.24% ceph-osd libc-2.17.so [.] __memcpy_ssse3_back 1.23% ceph-osd
> > ceph-osd [.] Mutex::Lock 1.21% ceph-osd libpthread-2.17.so [.]
> > pthread_mutex_trylock 1.11% ceph-osd [kernel.kallsyms] [k]
> > copy_user_enhanced_fast_string 0.95% ceph-osd libpthread-2.17.so [.]
> > pthread_mutex_unlock 0.94% ceph-osd [kernel.kallsyms] [k]
> > _raw_spin_lock 0.78% ceph-osd [kernel.kallsyms] [k] __d_lookup_rcu
> > 0.70% ceph-osd [kernel.kallsyms] [k] tcp_sendmsg 0.70% ceph-osd
> > ceph-osd [.] Message::Message 0.68% ceph-osd [kernel.kallsyms] [k]
> > __schedule 0.66% ceph-osd [kernel.kallsyms] [k] idle_cpu 0.65%
> > ceph-osd libtcmalloc.so.4.1.2 [.]
> > tcmalloc::CentralFreeList::FetchFromSpans
> > 0.64% swapper [kernel.kallsyms] [k] native_write_msr_safe 0.61%
> > ceph-osd ceph-osd [.]
> > std::tr1::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release
> > 0.60% swapper [kernel.kallsyms] [k] __schedule 0.60% ceph-osd
> > libstdc++.so.6.0.19 [.] 0x000bdd2b 0.57% ceph-osd ceph-osd [.]
> > operator< 0.57% ceph-osd ceph-osd [.] crc32_iscsi_00 0.56% ceph-osd
> > libstdc++.so.6.0.19 [.] std::string::_Rep::_M_dispose 0.55% ceph-osd
> > [kernel.kallsyms] [k] __switch_to 0.54% ceph-osd libc-2.17.so [.]
> > vfprintf 0.52% ceph-osd [kernel.kallsyms] [k] fget_light
> >
> > - Mail original -
> > De: "aderumier" 
> > À: 

Re: [ceph-users] Firefly - Giant : CentOS 7 : install failed ceph-deploy

2015-04-08 Thread Irek Fasikhov
I use CentOS 7.1. The problem is that the base package repository also provides
"ceph-common".

[root@ceph01p24 cluster]# yum --showduplicates list ceph-common
Loaded plugins: dellsysid, etckeeper, fastestmirror, priorities
Loading mirror speeds from cached hostfile
 * base: centos-mirror.rbc.ru
 * epel: be.mirror.eurid.eu
 * extras: ftp.funet.fi
 * updates: centos-mirror.rbc.ru
Installed Packages
ceph-common.x86_64        0.80.7-0.el7.centos      @Ceph
Available Packages
ceph-common.x86_64        0.80.6-0.el7.centos      Ceph
ceph-common.x86_64        0.80.7-0.el7.centos      Ceph
ceph-common.x86_64        0.80.8-0.el7.centos      Ceph
ceph-common.x86_64        0.80.9-0.el7.centos      Ceph
ceph-common.x86_64        1:0.80.7-0.4.el7         epel
ceph-common.x86_64        1:0.80.7-2.el7           base

I make the installation as follows:

rpm -ivh http://ceph.com/rpm-firefly/el7/noarch/ceph-release-1-0.el7.noarch.rpm
yum install redhat-lsb-core-4.1-27.el7.centos.1.x86_64 gperftools-libs.x86_64 yum-plugin-priorities.noarch ntp -y
yum install librbd1-0.80.7-0.el7.centos librados2-0.80.7-0.el7.centos.x86_64.rpm -y
yum install gdisk cryptsetup leveldb python-jinja2 hdparm -y

yum install --disablerepo=base --disablerepo=epel ceph-common-0.80.7-0.el7.centos.x86_64 -y
yum install --disablerepo=base --disablerepo=epel ceph-0.80.7-0.el7.centos -y
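
An alternative to pinning exact package versions, as suggested elsewhere in
this thread, is to rank the repositories so the Ceph repo always wins. A
sketch (the priorities plugin is already installed above; ceph-deploy writes
priority=1 into /etc/yum.repos.d/ceph.repo, so give base/updates e.g.
priority=10 and leave epel at its default of 99):

sed -i '/^\[base\]/a priority=10' /etc/yum.repos.d/CentOS-Base.repo
sed -i '/^\[updates\]/a priority=10' /etc/yum.repos.d/CentOS-Base.repo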

2015-04-08 12:40 GMT+03:00 Vickey Singh :

> Hello Everyone
>
>
> I also tried setting higher priority as suggested by SAM but no luck
>
>
> Please see the Full logs here http://paste.ubuntu.com/10771358/
>
>
> While installing yum searches for correct Ceph repository but it founds 3
> versions of python-ceph under http://ceph.com/rpm-giant/el7/x86_64/
>
>
> How can i instruct yum to install latest version of ceph from giant
> repository ?? FYI i have this setting already
>
>
> [root@rgw-node1 yum.repos.d]# cat /etc/yum/pluginconf.d/priorities.conf
>
> [main]
>
> enabled = 1
>
> check_obsoletes = 1
>
> [root@rgw-node1 yum.repos.d]#
>
>
>
>
> This issue can be easily reproduced, just now i tried on a fresh server
> centos 7.0.1406 but it still fails.
>
> Please help.
>
> Please help.
>
> Please help.
>
>
> # cat /etc/redhat-release
>
> CentOS Linux release 7.0.1406 (Core)
>
> #
>
> # uname -r
>
> 3.10.0-123.20.1.el7.x86_64
>
> #
>
>
> Regards
>
> VS
>
>
> On Wed, Apr 8, 2015 at 11:10 AM, Sam Wouters  wrote:
>
>>  Hi Vickey,
>>
>> we had a similar issue and we resolved it by giving the centos base and
>> update repo a higher priority (ex 10) then the epel repo.
>> The ceph-deploy tool only sets a prio of 1 for the ceph repo's, but the
>> centos and epel repo's stay on the default of 99.
>>
>> regards,
>> Sam
>>
>> On 08-04-15 09:32, Vickey Singh wrote:
>>
>>  Hi Ken
>>
>>
>>  As per your suggestion , i tried enabling epel-testing repository but
>> still no luck.
>>
>>
>>  Please check the below output. I would really appreciate  any help
>> here.
>>
>>
>>
>>  # yum install ceph --enablerepo=epel-testing
>>
>>
>>  ---> Package python-rbd.x86_64 1:0.80.7-0.5.el7 will be installed
>>
>> --> Processing Dependency: librbd1 = 1:0.80.7 for package:
>> 1:python-rbd-0.80.7-0.5.el7.x86_64
>>
>> --> Finished Dependency Resolution
>>
>> Error: Package: 1:python-cephfs-0.80.7-0.4.el7.x86_64 (epel)
>>
>>Requires: libcephfs1 = 1:0.80.7
>>
>>Available: 1:libcephfs1-0.86-0.el7.centos.x86_64 (Ceph)
>>
>>libcephfs1 = 1:0.86-0.el7.centos
>>
>>Available: 1:libcephfs1-0.87-0.el7.centos.x86_64 (Ceph)
>>
>>libcephfs1 = 1:0.87-0.el7.centos
>>
>>Installing: 1:libcephfs1-0.87.1-0.el7.centos.x86_64 (Ceph)
>>
>>libcephfs1 = 1:0.87.1-0.el7.centos
>>
>> *Error: Package: 1:python-rbd-0.80.7-0.5.el7.x86_64 (epel-testing)*
>>
>>Requires: librbd1 = 1:0.80.7
>>
>>Removing: librbd1-0.80.9-0.el7.centos.x86_64 (@Ceph)
>>
>>librbd1 = 0.80.9-0.el7.centos
>>
>>Updated By: 1:librbd1-0.87.1-0.el7.centos.x86_64 (Ceph)
>>
>>librbd1 = 1:0.87.1-0.el7.centos
>>
>>Available: 1:librbd1-0.86-0.el7.centos.x86_64 (Ceph)
>>
>>librbd1 = 1:0.86-0.el7.centos
>>
>>Available: 1:librbd1-0.87-0.el7.centos.x86_64 (Ceph)
>>
>>librbd1 = 1:0.87-0.el7.centos
>>
>> *Error: Package: 1:python-rados-0.80.7-0.5.el7.x86_64 (epel-testing)*
>>
>>Requires: librados2 = 1:0.80.7
>>
>>Removing: librados2-0.80.9-0.el7.centos.x86_64 (@Ceph)
>>
>>librados2 = 0.80.9-0.el7.centos
>>
>>Updat

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
Since you have only three nodes in the cluster, I recommend you add the new
nodes first and only then remove the old ones.

2015-03-03 15:28 GMT+03:00 Irek Fasikhov :

> You have a number of replication?
>
> 2015-03-03 15:14 GMT+03:00 Andrija Panic :
>
>> Hi Irek,
>>
>> yes, stoping OSD (or seting it to OUT) resulted in only 3% of data
>> degraded and moved/recovered.
>> When I after that removed it from Crush map "ceph osd crush rm id",
>> that's when the stuff with 37% happened.
>>
>> And thanks Irek for help - could you kindly just let me know of the
>> prefered steps when removing whole node?
>> Do you mean I first stop all OSDs again, or just remove each OSD from
>> crush map, or perhaps, just decompile cursh map, delete the node
>> completely, compile back in, and let it heal/recover ?
>>
>> Do you think this would result in less data missplaces and moved arround ?
>>
>> Sorry for bugging you, I really appreaciate your help.
>>
>> Thanks
>>
>> On 3 March 2015 at 12:58, Irek Fasikhov  wrote:
>>
>>> A large percentage of the rebuild of the cluster map (But low percentage
>>> degradation). If you had not made "ceph osd crush rm id", the percentage
>>> would be low.
>>> In your case, the correct option is to remove the entire node, rather
>>> than each disk individually
>>>
>>> 2015-03-03 14:27 GMT+03:00 Andrija Panic :
>>>
>>>> Another question - I mentioned here 37% of objects being moved arround
>>>> - this is MISPLACED object (degraded objects were 0.001%, after I removed 1
>>>> OSD from cursh map (out of 44 OSD or so).
>>>>
>>>> Can anybody confirm this is normal behaviour - and are there any
>>>> workarrounds ?
>>>>
>>>> I understand this is because of the object placement algorithm of CEPH,
>>>> but still 37% of object missplaces just by removing 1 OSD from crush maps
>>>> out of 44 make me wonder why this large percentage ?
>>>>
>>>> Seems not good to me, and I have to remove another 7 OSDs (we are
>>>> demoting some old hardware nodes). This means I can potentialy go with 7 x
>>>> the same number of missplaced objects...?
>>>>
>>>> Any thoughts ?
>>>>
>>>> Thanks
>>>>
>>>> On 3 March 2015 at 12:14, Andrija Panic 
>>>> wrote:
>>>>
>>>>> Thanks Irek.
>>>>>
>>>>> Does this mean, that after peering for each PG, there will be delay of
>>>>> 10sec, meaning that every once in a while, I will have 10sec od the 
>>>>> cluster
>>>>> NOT being stressed/overloaded, and then the recovery takes place for that
>>>>> PG, and then another 10sec cluster is fine, and then stressed again ?
>>>>>
>>>>> I'm trying to understand process before actually doing stuff (config
>>>>> reference is there on ceph.com but I don't fully understand the
>>>>> process)
>>>>>
>>>>> Thanks,
>>>>> Andrija
>>>>>
>>>>> On 3 March 2015 at 11:32, Irek Fasikhov  wrote:
>>>>>
>>>>>> Hi.
>>>>>>
>>>>>> Use value "osd_recovery_delay_start"
>>>>>> example:
>>>>>> [root@ceph08 ceph]# ceph --admin-daemon
>>>>>> /var/run/ceph/ceph-osd.94.asok config show  | grep 
>>>>>> osd_recovery_delay_start
>>>>>>   "osd_recovery_delay_start": "10"
>>>>>>
>>>>>> 2015-03-03 13:13 GMT+03:00 Andrija Panic :
>>>>>>
>>>>>>> HI Guys,
>>>>>>>
>>>>>>> I yesterday removed 1 OSD from cluster (out of 42 OSDs), and it
>>>>>>> caused over 37% od the data to rebalance - let's say this is fine (this 
>>>>>>> is
>>>>>>> when I removed it frm Crush Map).
>>>>>>>
>>>>>>> I'm wondering - I have previously set some throtling mechanism, but
>>>>>>> during first 1h of rebalancing, my rate of recovery was going up to 1500
>>>>>>> MB/s - and VMs were unusable completely, and then last 4h of the 
>>>>>>> duration
>>>>>>> of recover this recovery rate went down to, say, 100-200 MB.s and during
>>>>>>> 

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
What replication factor do you have?

2015-03-03 15:14 GMT+03:00 Andrija Panic :

> Hi Irek,
>
> yes, stoping OSD (or seting it to OUT) resulted in only 3% of data
> degraded and moved/recovered.
> When I after that removed it from Crush map "ceph osd crush rm id",
> that's when the stuff with 37% happened.
>
> And thanks Irek for help - could you kindly just let me know of the
> prefered steps when removing whole node?
> Do you mean I first stop all OSDs again, or just remove each OSD from
> crush map, or perhaps, just decompile cursh map, delete the node
> completely, compile back in, and let it heal/recover ?
>
> Do you think this would result in less data missplaces and moved arround ?
>
> Sorry for bugging you, I really appreaciate your help.
>
> Thanks
>
> On 3 March 2015 at 12:58, Irek Fasikhov  wrote:
>
>> A large percentage of the rebuild of the cluster map (But low percentage
>> degradation). If you had not made "ceph osd crush rm id", the percentage
>> would be low.
>> In your case, the correct option is to remove the entire node, rather
>> than each disk individually
>>
>> 2015-03-03 14:27 GMT+03:00 Andrija Panic :
>>
>>> Another question - I mentioned here 37% of objects being moved arround -
>>> this is MISPLACED object (degraded objects were 0.001%, after I removed 1
>>> OSD from cursh map (out of 44 OSD or so).
>>>
>>> Can anybody confirm this is normal behaviour - and are there any
>>> workarrounds ?
>>>
>>> I understand this is because of the object placement algorithm of CEPH,
>>> but still 37% of object missplaces just by removing 1 OSD from crush maps
>>> out of 44 make me wonder why this large percentage ?
>>>
>>> Seems not good to me, and I have to remove another 7 OSDs (we are
>>> demoting some old hardware nodes). This means I can potentialy go with 7 x
>>> the same number of missplaced objects...?
>>>
>>> Any thoughts ?
>>>
>>> Thanks
>>>
>>> On 3 March 2015 at 12:14, Andrija Panic  wrote:
>>>
>>>> Thanks Irek.
>>>>
>>>> Does this mean, that after peering for each PG, there will be delay of
>>>> 10sec, meaning that every once in a while, I will have 10sec od the cluster
>>>> NOT being stressed/overloaded, and then the recovery takes place for that
>>>> PG, and then another 10sec cluster is fine, and then stressed again ?
>>>>
>>>> I'm trying to understand process before actually doing stuff (config
>>>> reference is there on ceph.com but I don't fully understand the
>>>> process)
>>>>
>>>> Thanks,
>>>> Andrija
>>>>
>>>> On 3 March 2015 at 11:32, Irek Fasikhov  wrote:
>>>>
>>>>> Hi.
>>>>>
>>>>> Use value "osd_recovery_delay_start"
>>>>> example:
>>>>> [root@ceph08 ceph]# ceph --admin-daemon
>>>>> /var/run/ceph/ceph-osd.94.asok config show  | grep 
>>>>> osd_recovery_delay_start
>>>>>   "osd_recovery_delay_start": "10"
>>>>>
>>>>> 2015-03-03 13:13 GMT+03:00 Andrija Panic :
>>>>>
>>>>>> HI Guys,
>>>>>>
>>>>>> I yesterday removed 1 OSD from cluster (out of 42 OSDs), and it
>>>>>> caused over 37% od the data to rebalance - let's say this is fine (this 
>>>>>> is
>>>>>> when I removed it frm Crush Map).
>>>>>>
>>>>>> I'm wondering - I have previously set some throtling mechanism, but
>>>>>> during first 1h of rebalancing, my rate of recovery was going up to 1500
>>>>>> MB/s - and VMs were unusable completely, and then last 4h of the duration
>>>>>> of recover this recovery rate went down to, say, 100-200 MB.s and during
>>>>>> this VM performance was still pretty impacted, but at least I could work
>>>>>> more or a less
>>>>>>
>>>>>> So my question, is this behaviour expected, is throtling here working
>>>>>> as expected, since first 1h was almoust no throtling applied if I check 
>>>>>> the
>>>>>> recovery rate 1500MB/s and the impact on Vms.
>>>>>> And last 4h seemed pretty fine (although still lot of impact in
>>>>>> general)
>>>>>>
>>>>>> I changed these throtling on

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
A large percentage of the cluster map is rebuilt (but with a low percentage of
degradation). If you had not run "ceph osd crush rm id", the percentage of
misplaced objects would have stayed low.
In your case, the correct option is to remove the entire node, rather than
each disk individually.
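
A sketch of what that looks like in practice (the OSD ids are placeholders --
use the ones on the node you are retiring):

# drain the whole host in a single rebalance instead of one rebalance per disk
for id in 36 37 38 39 40 41; do ceph osd crush reweight osd.$id 0; done
# after recovery finishes, stop the daemons and clean up each OSD:
#   ceph osd out <id>; ceph auth del osd.<id>; ceph osd crush rm osd.<id>; ceph osd rm <id>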

2015-03-03 14:27 GMT+03:00 Andrija Panic :

> Another question - I mentioned here 37% of objects being moved around -
> these are MISPLACED objects (degraded objects were 0.001%), after I removed 1
> OSD from the crush map (out of 44 OSDs or so).
>
> Can anybody confirm this is normal behaviour - and are there any
> workarounds ?
>
> I understand this is because of the object placement algorithm of CEPH,
> but still, 37% of objects misplaced just by removing 1 OSD out of 44 from the
> crush map makes me wonder why the percentage is so large ?
>
> Seems not good to me, and I have to remove another 7 OSDs (we are demoting
> some old hardware nodes). This means I could potentially end up with 7 x the
> same number of misplaced objects...?
>
> Any thoughts ?
>
> Thanks
>
> On 3 March 2015 at 12:14, Andrija Panic  wrote:
>
>> Thanks Irek.
>>
>> Does this mean, that after peering for each PG, there will be delay of
>> 10sec, meaning that every once in a while, I will have 10sec od the cluster
>> NOT being stressed/overloaded, and then the recovery takes place for that
>> PG, and then another 10sec cluster is fine, and then stressed again ?
>>
>> I'm trying to understand process before actually doing stuff (config
>> reference is there on ceph.com but I don't fully understand the process)
>>
>> Thanks,
>> Andrija
>>
>> On 3 March 2015 at 11:32, Irek Fasikhov  wrote:
>>
>>> Hi.
>>>
>>> Use value "osd_recovery_delay_start"
>>> example:
>>> [root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok
>>> config show  | grep osd_recovery_delay_start
>>>   "osd_recovery_delay_start": "10"
>>>
>>> 2015-03-03 13:13 GMT+03:00 Andrija Panic :
>>>
>>>> HI Guys,
>>>>
>>>> I yesterday removed 1 OSD from cluster (out of 42 OSDs), and it caused
>>>> over 37% od the data to rebalance - let's say this is fine (this is when I
>>>> removed it frm Crush Map).
>>>>
>>>> I'm wondering - I have previously set some throtling mechanism, but
>>>> during first 1h of rebalancing, my rate of recovery was going up to 1500
>>>> MB/s - and VMs were unusable completely, and then last 4h of the duration
>>>> of recover this recovery rate went down to, say, 100-200 MB.s and during
>>>> this VM performance was still pretty impacted, but at least I could work
>>>> more or a less
>>>>
>>>> So my question, is this behaviour expected, is throtling here working
>>>> as expected, since first 1h was almoust no throtling applied if I check the
>>>> recovery rate 1500MB/s and the impact on Vms.
>>>> And last 4h seemed pretty fine (although still lot of impact in general)
>>>>
>>>> I changed these throtling on the fly with:
>>>>
>>>> ceph tell osd.* injectargs '--osd_recovery_max_active 1'
>>>> ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
>>>> ceph tell osd.* injectargs '--osd_max_backfills 1'
>>>>
>>>> My Jorunals are on SSDs (12 OSD per server, of which 6 journals on one
>>>> SSD, 6 journals on another SSD)  - I have 3 of these hosts.
>>>>
>>>> Any thought are welcome.
>>>> --
>>>>
>>>> Andrija Panić
>>>>
>>>> ___
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>>
>>>
>>> --
>>> С уважением, Фасихов Ирек Нургаязович
>>> Моб.: +79229045757
>>>
>>
>>
>>
>> --
>>
>> Andrija Panić
>>
>
>
>
> --
>
> Andrija Panić
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
osd_recovery_delay_start is the delay in seconds between recovery iterations
(osd_recovery_max_active).

It is described here:
https://github.com/ceph/ceph/search?utf8=%E2%9C%93&q=osd_recovery_delay_start
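
A quick sketch of how to check and change it on a running cluster (the OSD id
and the 10 second value are only examples):

ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep osd_recovery_delay_start
ceph tell osd.* injectargs '--osd_recovery_delay_start 10'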


2015-03-03 14:27 GMT+03:00 Andrija Panic :

> Another question - I mentioned here 37% of objects being moved around -
> these are MISPLACED objects (degraded objects were 0.001%), after I removed 1
> OSD from the crush map (out of 44 OSDs or so).
>
> Can anybody confirm this is normal behaviour - and are there any
> workarounds ?
>
> I understand this is because of the object placement algorithm of CEPH,
> but still, 37% of objects misplaced just by removing 1 OSD out of 44 from the
> crush map makes me wonder why the percentage is so large ?
>
> Seems not good to me, and I have to remove another 7 OSDs (we are demoting
> some old hardware nodes). This means I could potentially end up with 7 x the
> same number of misplaced objects...?
>
> Any thoughts ?
>
> Thanks
>
> On 3 March 2015 at 12:14, Andrija Panic  wrote:
>
>> Thanks Irek.
>>
>> Does this mean, that after peering for each PG, there will be delay of
>> 10sec, meaning that every once in a while, I will have 10sec od the cluster
>> NOT being stressed/overloaded, and then the recovery takes place for that
>> PG, and then another 10sec cluster is fine, and then stressed again ?
>>
>> I'm trying to understand process before actually doing stuff (config
>> reference is there on ceph.com but I don't fully understand the process)
>>
>> Thanks,
>> Andrija
>>
>> On 3 March 2015 at 11:32, Irek Fasikhov  wrote:
>>
>>> Hi.
>>>
>>> Use value "osd_recovery_delay_start"
>>> example:
>>> [root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok
>>> config show  | grep osd_recovery_delay_start
>>>   "osd_recovery_delay_start": "10"
>>>
>>> 2015-03-03 13:13 GMT+03:00 Andrija Panic :
>>>
>>>> HI Guys,
>>>>
>>>> I yesterday removed 1 OSD from cluster (out of 42 OSDs), and it caused
>>>> over 37% od the data to rebalance - let's say this is fine (this is when I
>>>> removed it frm Crush Map).
>>>>
>>>> I'm wondering - I have previously set some throtling mechanism, but
>>>> during first 1h of rebalancing, my rate of recovery was going up to 1500
>>>> MB/s - and VMs were unusable completely, and then last 4h of the duration
>>>> of recover this recovery rate went down to, say, 100-200 MB.s and during
>>>> this VM performance was still pretty impacted, but at least I could work
>>>> more or a less
>>>>
>>>> So my question, is this behaviour expected, is throtling here working
>>>> as expected, since first 1h was almoust no throtling applied if I check the
>>>> recovery rate 1500MB/s and the impact on Vms.
>>>> And last 4h seemed pretty fine (although still lot of impact in general)
>>>>
>>>> I changed these throtling on the fly with:
>>>>
>>>> ceph tell osd.* injectargs '--osd_recovery_max_active 1'
>>>> ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
>>>> ceph tell osd.* injectargs '--osd_max_backfills 1'
>>>>
>>>> My Jorunals are on SSDs (12 OSD per server, of which 6 journals on one
>>>> SSD, 6 journals on another SSD)  - I have 3 of these hosts.
>>>>
>>>> Any thought are welcome.
>>>> --
>>>>
>>>> Andrija Panić
>>>>
>>>> ___
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>>
>>>
>>> --
>>> С уважением, Фасихов Ирек Нургаязович
>>> Моб.: +79229045757
>>>
>>
>>
>>
>> --
>>
>> Andrija Panić
>>
>
>
>
> --
>
> Andrija Panić
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
Hi.

Use value "osd_recovery_delay_start"
example:
[root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok
config show  | grep osd_recovery_delay_start
  "osd_recovery_delay_start": "10"

2015-03-03 13:13 GMT+03:00 Andrija Panic :

> Hi Guys,
>
> Yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused
> over 37% of the data to rebalance - let's say this is fine (this is when I
> removed it from the Crush Map).
>
> I'm wondering - I had previously set some throttling mechanism, but during
> the first 1h of rebalancing my recovery rate was going up to 1500 MB/s -
> and VMs were completely unusable - and then for the last 4h of the
> recovery the rate went down to, say, 100-200 MB/s, and during this time
> VM performance was still pretty impacted, but at least I could work more or
> less.
>
> So my question: is this behaviour expected, and is throttling here working as
> expected, since in the first 1h almost no throttling was applied, judging by
> the recovery rate of 1500 MB/s and the impact on VMs,
> while the last 4h seemed pretty fine (although still a lot of impact in
> general)?
>
> I changed these throttling settings on the fly with:
>
> ceph tell osd.* injectargs '--osd_recovery_max_active 1'
> ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
> ceph tell osd.* injectargs '--osd_max_backfills 1'
>
> My journals are on SSDs (12 OSDs per server, of which 6 journals are on one
> SSD and 6 journals on another SSD) - I have 3 of these hosts.
>
> Any thoughts are welcome.
> --
>
> Andrija Panić
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] who is using radosgw with civetweb?

2015-02-26 Thread Irek Fasikhov
I fully support Wido. We also have no problems.

OS: CentOS7
[root@s3backup etc]# ceph -v
ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
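
For anyone who wants to try it, the embedded civetweb frontend needs only
something like this in ceph.conf (the section name and port are just an
example, adjust them to your own gateway instance):

[client.radosgw.gateway]
rgw frontends = "civetweb port=8080"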


2015-02-26 13:22 GMT+03:00 Dan van der Ster :

> Hi Sage,
>
> We switched from apache+fastcgi to civetweb (+haproxy) around one
> month ago and so far it is working quite well. Just like GuangYang, we
> had seen many error 500's with fastcgi, but we never investigated it
> deeply. After moving to civetweb we don't get any errors at all no
> matter what load we send to the gateways.
>
> Here are some details:
>   - the whole cluster, radosgw included, is firefly 0.80.8 and
> Scientific Linux 6.6
>   - we have 6 gateways, each running on a 2-core VM
>   - civetweb is listening on 8080
>   - haproxy is listening on _each_ gateway VM on 80 and 443 and
> proxying to the radosgw's
>   - so far we've written ~20 million objects (mostly very small)
> through civetweb.
>
> Our feedback is that the civetweb configuration is _much_ easier, much
> cleaner, and more reliable than what we had with apache+fastcgi.
> Before, we needed the non-standard apache (with 100-continue support)
> and the fastcgi config was always error-prone.
>
> The main goals we had for adding haproxy were for load balancing and
> to add SSL. Currently haproxy is configured to balance the http
> sessions evenly over all of our gateways -- one civetweb feature which
> would be nice to have would be a /health report (which returns e.g.
> some "load" metric for that gateway) that we could feed into haproxy
> so it would be able to better balance the load.
>
> In conclusion, +1 from us... AFAWCT civetweb is the way to go for Red
> Hat's future supported configuration.
>
> Best Regards, Dan (+Herve who did the work!)
>
>
>
>
> On Wed, Feb 25, 2015 at 8:31 PM, Sage Weil  wrote:
> > Hey,
> >
> > We are considering switching to civetweb (the embedded/standalone rgw web
> > server) as the primary supported RGW frontend instead of the current
> > apache + mod-fastcgi or mod-proxy-fcgi approach.  "Supported" here means
> > both the primary platform the upstream development focuses on and what
> the
> > downstream Red Hat product will officially support.
> >
> > How many people are using RGW standalone using the embedded civetweb
> > server instead of apache?  In production?  At what scale?  What
> > version(s) (civetweb first appeared in firefly and we've backported most
> > fixes).
> >
> > Have you seen any problems?  Any other feedback?  The hope is to (vastly)
> > simplify deployment.
> >
> > Thanks!
> > sage
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Dumpling/Firefly/Hammer SSD/Memstore performance comparison

2015-02-17 Thread Irek Fasikhov
Mark, very very good!

2015-02-17 20:37 GMT+03:00 Mark Nelson :

> Hi All,
>
> I wrote up a short document describing some tests I ran recently to look
> at how SSD backed OSD performance has changed across our LTS releases. This
> is just looking at RADOS performance and not RBD or RGW.  It also doesn't
> offer any real explanations regarding the results.  It's just a first high
> level step toward understanding some of the behaviors folks on the mailing
> list have reported over the last couple of releases.  I hope you find it
> useful.
>
> Mark
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Introducing "Learning Ceph" : The First ever Book on Ceph

2015-02-13 Thread Irek Fasikhov
Karan

Will the book also be available in Russian?

Thanks.

2015-02-13 11:43 GMT+03:00 Karan Singh :

> Here is the new link for sample book :
> https://www.dropbox.com/s/2zcxawtv4q29fm9/Learning_Ceph_Sample.pdf?dl=0
>
>
> 
> Karan Singh
> Systems Specialist , Storage Platforms
> CSC - IT Center for Science,
> Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
> mobile: +358 503 812758
> tel. +358 9 4572001
> fax +358 9 4572302
> http://www.csc.fi/
> 
>
> On 13 Feb 2015, at 05:25, Frank Yu  wrote:
>
> Wow, Cong
> BTW, I found the link of sample copy is 404.
>
>
>
> 2015-02-06 6:53 GMT+08:00 Karan Singh :
>
>> Hello Community Members
>>
>> I am happy to introduce the first book on Ceph with the title “*Learning
>> Ceph*”.
>>
>> Me and many folks from the publishing house together with technical
>> reviewers spent several months to get this book compiled and published.
>>
>> Finally the book is up for sale on , i hope you would like it and surely
>> will learn a lot from it.
>>
>> Amazon :
>> http://www.amazon.com/Learning-Ceph-Karan-Singh/dp/1783985623/ref=sr_1_1?s=books&ie=UTF8&qid=1423174441&sr=1-1&keywords=ceph
>> Packtpub : https://www.packtpub.com/application-development/learning-ceph
>>
>> You can grab the sample copy from here :
>> https://www.dropbox.com/s/ek76r01r9prs6pb/Learning_Ceph_Packt.pdf?dl=0
>>
>> *Finally , I would like to express my sincere thanks to *
>>
>> *Sage Weil* - For developing Ceph and everything around it as well as
>> writing foreword for “Learning Ceph”.
>> *Patrick McGarry *- For his usual off the track support that too always.
>>
>> Last but not the least , to our great community members , who are also
>> reviewers of the book *Don Talton , Julien Recurt , Sebastien Han *and 
>> *Zihong
>> Chen *, Thank you guys for your efforts.
>>
>>
>> 
>> Karan Singh
>> Systems Specialist , Storage Platforms
>> CSC - IT Center for Science,
>> Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
>> mobile: +358 503 812758
>> tel. +358 9 4572001
>> fax +358 9 4572302
>> http://www.csc.fi/
>> 
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
> Regards
> Frank Yu
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph Performance with SSD journal

2015-02-13 Thread Irek Fasikhov
Hi.
What version?

2015-02-13 6:04 GMT+03:00 Sumit Gaur :

> Hi Chir,
> Please find my answer below in blue
>
> On Thu, Feb 12, 2015 at 12:42 PM, Chris Hoy Poy  wrote:
>
>> Hi Sumit,
>>
>> A couple questions:
>>
>> What brand/model SSD?
>>
> samsung 480G SSD(PM853T) having random write 90K IOPS (4K, 368MBps)
>
>>
>> What brand/model HDD?
>>
> 64GB memory, 300GB SAS HDD (seagate), 10Gb nic
>
>>
>> Also how they are connected to controller/motherboard? Are they sharing a
>> bus (ie SATA expander)?
>>
> no , They are connected with local Bus not the SATA expander.
>
>
>>
>> RAM?
>>
> *64GB *
>
>>
>> Also look at the output of  "iostat -x" or similiar, are the SSDs hitting
>> 100% utilisation?
>>
> *No, SSD was hitting 2000 iops only.  *
>
>>
>> I suspect that the 5:1 ratio of HDDs to SDDs is not ideal, you now have
>> 5x the write IO trying to fit into a single SSD.
>>
> * I have not seen any documented reference to calculate the ratio. Could
> you suggest one. Here I want to mention that results for 1024K write
> improve a lot. Problem is with 1024K read and 4k write .*
>
> *SSD journal 810 IOPS and 810MBps*
> *HDD journal 620 IOPS and 620 MBps*
>
>
>
>
>> I'll take a punt on it being a SATA connected SSD (most common), 5x ~130
>> megabytes/second gets very close to most SATA bus limits. If its a shared
>> BUS, you possibly hit that limit even earlier (since all that data is now
>> being written twice out over the bus).
>>
>> cheers;
>> \Chris
>>
>>
>> --
>> *From: *"Sumit Gaur" 
>> *To: *ceph-users@lists.ceph.com
>> *Sent: *Thursday, 12 February, 2015 9:23:35 AM
>> *Subject: *[ceph-users] ceph Performance with SSD journal
>>
>>
>> Hi Ceph-Experts,
>>
>> Have a small ceph architecture related question
>>
>> As blogs and documents suggest that ceph perform much better if we use
>> journal on SSD.
>>
>> I have made the ceph cluster with 30 HDD + 6 SSD for 6 OSD nodes. 5 HDD
>> + 1 SSD on each node and each SSD have 5 partition for journaling 5 OSDs
>> on the node.
>>
>> Now I ran similar test as I ran for all HDD setup.
>>
>> What I saw below two reading goes in wrong direction as expected
>>
>> 1) 4K write IOPS are less for SSD setup, though not major difference but
>> less.
>> 2) 1024K Read IOPS are  less  for SSD setup than HDD setup.
>>
>> On the other hand 4K read and 1024K write both have much better numbers
>> for SSD setup.
>>
>> Let me know if I am missing some obvious concept.
>>
>> Thanks
>> sumit
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 --the VM's read requestbecome too slow

2015-02-12 Thread Irek Fasikhov
Hi.
hmm ... I had been wondering why I was getting such low read speed on another cluster


P.S. ceph 0.80.8
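
A sketch of how the rbd_cache test could be done, assuming the cache is enabled
in the [client] section as in the config quoted below: set

[client]
rbd cache = false

on the client, restart the guest so librbd is re-initialised, and re-run the
same fio read tests.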

2015-02-12 14:33 GMT+03:00 Alexandre DERUMIER :

> >>Hi,
> >>Can you test with disabling rbd_cache ?
>
> >>I remember of a bug detected in giant, not sure it's also the case for
> fireflt
>
> This was this tracker:
>
> http://tracker.ceph.com/issues/9513
>
> But It has been solved and backported to firefly.
>
> Also, can you test 0.80.6 and 0.80.7 ?
>
>
>
>
>
>
>
> - Mail original -
> De: "killingwolf" 
> À: "ceph-users" 
> Envoyé: Jeudi 12 Février 2015 12:16:32
> Objet: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 --the VM's read
> requestbecome too slow
>
> I have this problem too, help!
>
> -- Original Message --
> From: "杨万元";;
> Sent: Thursday, 12 February 2015, 11:14 AM
> To: "ceph-users@lists.ceph.com";
> Subject: [ceph-users] Upgrade 0.80.5 to 0.80.8 --the VM's read requestbecome
> too slow
>
> Hello!
> We use Ceph+Openstack in our private cloud. Recently we upgrade our
> centos6.5 based cluster from Ceph Emperor to Ceph Firefly.
> At first,we use redhat yum repo epel to upgrade, this Ceph's version is
> 0.80.5. First upgrade monitor,then osd,last client. when we complete this
> upgrade, we boot a VM on the cluster,then use fio to test the io
> performance. The io performance is as better as before. Everything is ok!
> Then we upgrade the cluster from 0.80.5 to 0.80.8,when we completed , we
> reboot the VM to load the newest librbd. after that we also use fio to test
> the io performance .then we find the randwrite and write is as good as
> before.but the randread and read is become worse, randwrite's iops from
> 4000-5000 to 300-400 ,and the latency is worse. the write's bw from 400MB/s
> to 115MB/s . then I downgrade the ceph client version from 0.80.8 to
> 0.80.5, then the reslut become normal.
> So I think maybe something cause about librbd. I compare the 0.80.8
> release notes with 0.80.5 (
> http://ceph.com/docs/master/release-notes/#v0-80-8-firefly ), I just find
> this change in 0.80.8 is something about read request : librbd: cap memory
> utilization for read requests (Jason Dillaman) . Who can explain this?
>
>
> My ceph cluster is 400osd,5mons :
> ceph -s
> health HEALTH_OK
> monmap e11: 5 mons at {BJ-M1-Cloud71=
> 172.28.2.71:6789/0,BJ-M1-Cloud73=172.28.2.73:6789/0,BJ-M2-Cloud80=172.28.2.80:6789/0,BJ-M2-Cloud81=172.28.2.81:6789/0,BJ-M3-Cloud85=172.28.2.85:6789/0
> }, election epoch 198, quorum 0,1,2,3,4
> BJ-M1-Cloud71,BJ-M1-Cloud73,BJ-M2-Cloud80,BJ-M2-Cloud81,BJ-M3-Cloud85
> osdmap e120157: 400 osds: 400 up, 400 in
> pgmap v26161895: 29288 pgs, 6 pools, 20862 GB data, 3014 kobjects
> 41084 GB used, 323 TB / 363 TB avail
> 29288 active+clean
> client io 52640 kB/s rd, 32419 kB/s wr, 5193 op/s
>
>
> The follwing is my ceph client conf :
> [global]
> auth_service_required = cephx
> filestore_xattr_use_omap = true
> auth_client_required = cephx
> auth_cluster_required = cephx
> mon_host =
> 172.29.204.24,172.29.204.48,172.29.204.55,172.29.204.58,172.29.204.73
> mon_initial_members = ZR-F5-Cloud24, ZR-F6-Cloud48, ZR-F7-Cloud55,
> ZR-F8-Cloud58, ZR-F9-Cloud73
> fsid = c01c8e28-304e-47a4-b876-cb93acc2e980
> mon osd full ratio = .85
> mon osd nearfull ratio = .75
> public network = 172.29.204.0/24
> mon warn on legacy crush tunables = false
>
> [osd]
> osd op threads = 12
> filestore journal writeahead = true
> filestore merge threshold = 40
> filestore split multiple = 8
>
> [client]
> rbd cache = true
> rbd cache writethrough until flush = false
> rbd cache size = 67108864
> rbd cache max dirty = 50331648
> rbd cache target dirty = 33554432
>
> [client.cinder]
> admin socket = /var/run/ceph/rbd-$pid.asok
>
>
>
> My VM is 8core16G,we use fio scripts is :
> fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randread -size=60G
> -filename=/dev/vdb -name="EBS" -iodepth=32 -runtime=200
> fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=60G
> -filename=/dev/vdb -name="EBS" -iodepth=32 -runtime=200
> fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=read -size=60G
> -filename=/dev/vdb -name="EBS" -iodepth=32 -runtime=200
> fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=write -size=60G
> -filename=/dev/vdb -name="EBS" -iodepth=32 -runtime=200
>
> The following is the io test result
> ceph client verison :0.80.5
> read: bw= 430MB
> write: bw=420MB
> randread: iops= 4875 latency=65ms
> randwrite: iops=6844 latency=46ms
>
> ceph client verison :0.80.8
> read: bw= 115MB
> write: bw=480MB
> randread: iops= 381 latency=83ms
> randwrite: iops=4843 latency=68ms
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757

[ceph-users] 0.80.8 ReplicationPG Fail

2015-02-06 Thread Irek Fasikhov
This morning I found that some OSDs had dropped out of the cache tier pool.
Maybe it's a coincidence, but a rollback was running at that point.

2015-02-05 23:23:18.231723 7fd747ff1700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7fd747ff1700

 ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
 1: /usr/bin/ceph-osd() [0x9bde51]
 2: (()+0xf710) [0x7fd766f97710]
 3: (std::_Rb_tree_decrement(std::_Rb_tree_node_base*)+0xa) [0x7fd7666c1eca]
 4: (ReplicatedPG::make_writeable(ReplicatedPG::OpContext*)+0x14c) [0x87cd5c]
 5: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x1db)
[0x89d29b]
 6: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xcd4) [0x89e0f4]
 7: (ReplicatedPG::do_op(std::tr1::shared_ptr)+0x2ca5) [0x8a2a55]
 8: (ReplicatedPG::do_request(std::tr1::shared_ptr,
ThreadPool::TPHandle&)+0x5b1) [0x832251]
 9: (OSD::dequeue_op(boost::intrusive_ptr,
std::tr1::shared_ptr, ThreadPool::TPHandle&)+0x37c)
[0x61344c]
 10: (OSD::OpWQ::_process(boost::intrusive_ptr,
ThreadPool::TPHandle&)+0x63d) [0x6472ad]
 11: (ThreadPool::WorkQueueVal,
std::tr1::shared_ptr >, boost::intrusive_ptr
>::_void_process(void*, ThreadPool::TPHandle&)+0xae) [0x67dcde]
 12: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0xa2a181]
 13: (ThreadPool::WorkThread::entry()+0x10) [0xa2d260]
 14: (()+0x79d1) [0x7fd766f8f9d1]
 15: (clone()+0x6d) [0x7fd765f088fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.

Are there any ideas? Thanks.

http://tracker.ceph.com/issues/10778
-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD over cache tier over EC pool: rbd rm doesn't remove objects

2015-01-28 Thread Irek Fasikhov
Hi,Sage.

Yes, Firefly.
[root@ceph05 ~]# ceph --version
ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)

Yes, I have seen this behavior.

[root@ceph08 ceph]# rbd info vm-160-disk-1
rbd image 'vm-160-disk-1':
size 32768 MB in 8192 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.179faf52eb141f2
format: 2
features: layering
parent: rbd/base-145-disk-1@__base__
overlap: 32768 MB
[root@ceph08 ceph]# rbd rm vm-160-disk-1
Removing image: 100% complete...done.
[root@ceph08 ceph]# rbd info vm-160-disk-1
2015-01-28 10:39:01.595785 7f1fbea9e760 -1 librbd::ImageCtx: error finding
header: (2) No such file or directoryrbd: error opening image
vm-160-disk-1: (2) No such file or directory

[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
   5944    5944  249633
[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
   5857    5857  245979
[root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
   4377    4377  183819
[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
   5017    5017  210699
[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
   5015    5015  210615
[root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
[root@ceph08 ceph]# rados -p rcachehe ls | grep 179faf52eb141f2 | wc
   1986    1986   83412
[root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
    981     981   41202
[root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
    802     802   33684
[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
   1611    1611   67662
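
One thing that may be worth trying here (a sketch, not a confirmed fix; the
pool name is taken from the listing above) is to force the cache pool to flush
and evict everything, so the deleted/whiteout objects are pushed down to the
base pool:

rados -p rbdcache cache-flush-evict-all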

Thanks, Sage!


Tue Jan 27 2015 at 7:01:43 PM, Sage Weil :

> On Tue, 27 Jan 2015, Irek Fasikhov wrote:
> > Hi,All.
> > Indeed, there is a problem. Removed 1 TB of data space on a cluster is
> not
> > cleared. This feature of the behavior or a bug? And how long will it be
> > cleaned?
>
> Your subject says cache tier but I don't see it in the 'ceph df' output
> below.  The cache tiers will store 'whiteout' objects that cache object
> non-existence that could be delaying some deletion.  You can wrangle the
> cluster into flushing those with
>
>  ceph osd pool set  cache_target_dirty_ratio .05
>
> (though you'll probably want to change it back to the default .4 later).
>
> If there's no cache tier involved, there may be another problem.  What
> version is this?  Firefly?
>
> sage
>
> >
> > Sat Sep 20 2014 at 8:19:24 AM, Mika?l Cluseau :
> >   Hi all,
> >
> >   I have weird behaviour on my firefly "test + convenience
> >   storage" cluster. It consists of 2 nodes with a light imbalance
> >   in available space:
> >
> >   # idweighttype nameup/downreweight
> >   -114.58root default
> >   -28.19host store-1
> >   12.73osd.1up1
> >   02.73osd.0up1
> >   52.73osd.5up1
> >   -36.39host store-2
> >   22.73osd.2up1
> >   32.73osd.3up1
> >   40.93osd.4up1
> >
> >   I used to store ~8TB of rbd volumes, coming to a near-full
> >   state. There was some annoying "stuck misplaced" PGs so I began
> >   to remove 4.5TB of data; the weird thing is: the space hasn't
> >   been reclaimed on the OSDs, they keeped stuck around 84% usage.
> >   I tried to move PGs around and it happens that the space is
> >   correctly "reclaimed" if I take an OSD out, let him empty it XFS
> >   volume and then take it in again.
> >
> >   I'm currently applying this to and OSD in turn, but I though it
> >   could be worth telling about this. The current ceph df output
> >   is:
> >
> >   GLOBAL:
> >   SIZE   AVAIL RAW USED %RAW USED
> >   12103G 5311G 6792G56.12
> >   POOLS:
> >   NAME ID USED   %USED OBJECTS
> >   data 0  0  0 0
> >   metadata 1  0  0 0
> >   rbd  2  444G   3.67  117333
> >   [...]
> >   archives-ec  14 3628G  29.98 928902
> >   archives 15 37518M 0.30  273167
> >
> >   Before "just moving data", AVAIL was around 3TB.
> >
> >   I finished the process with the OSDs on store-1, who show the
> >   

Re: [ceph-users] RBD over cache tier over EC pool: rbd rm doesn't remove objects

2015-01-28 Thread Irek Fasikhov
Sage,
a suggestion: when deleting objects, bypass the cache tier pool.
Thanks

Wed Jan 28 2015 at 5:13:36 PM, Irek Fasikhov :

> Hi,Sage.
>
> Yes, Firefly.
> [root@ceph05 ~]# ceph --version
> ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
>
> Yes, I have seen this behavior.
>
> [root@ceph08 ceph]# rbd info vm-160-disk-1
> rbd image 'vm-160-disk-1':
> size 32768 MB in 8192 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.179faf52eb141f2
> format: 2
> features: layering
> parent: rbd/base-145-disk-1@__base__
> overlap: 32768 MB
> [root@ceph08 ceph]# rbd rm vm-160-disk-1
> Removing image: 100% complete...done.
> [root@ceph08 ceph]# rbd info vm-160-disk-1
> 2015-01-28 10:39:01.595785 7f1fbea9e760 -1 librbd::ImageCtx: error finding
> header: (2) No such file or directoryrbd: error opening image
> vm-160-disk-1: (2) No such file or directory
>
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>59445944  249633
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>58575857  245979
> [root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
>43774377  183819
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>50175017  210699
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>50155015  210615
> [root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
> [root@ceph08 ceph]# rados -p rcachehe ls | grep 179faf52eb141f2 | wc
>19861986   83412
> [root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
> 981 981   41202
> [root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
> 802 802   33684
> [root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
>16111611   67662
>
> Thank, Sage!
>
>
> Tue Jan 27 2015 at 7:01:43 PM, Sage Weil :
>
> On Tue, 27 Jan 2015, Irek Fasikhov wrote:
>> > Hi,All.
>> > Indeed, there is a problem. Removed 1 TB of data space on a cluster is
>> not
>> > cleared. This feature of the behavior or a bug? And how long will it be
>> > cleaned?
>>
>> Your subject says cache tier but I don't see it in the 'ceph df' output
>> below.  The cache tiers will store 'whiteout' objects that cache object
>> non-existence that could be delaying some deletion.  You can wrangle the
>> cluster into flushing those with
>>
>>  ceph osd pool set  cache_target_dirty_ratio .05
>>
>> (though you'll probably want to change it back to the default .4 later).
>>
>> If there's no cache tier involved, there may be another problem.  What
>> version is this?  Firefly?
>>
>> sage
>>
>> >
>> > Sat Sep 20 2014 at 8:19:24 AM, Mika?l Cluseau :
>> >   Hi all,
>> >
>> >   I have weird behaviour on my firefly "test + convenience
>> >   storage" cluster. It consists of 2 nodes with a light imbalance
>> >   in available space:
>> >
>> >   # idweighttype nameup/downreweight
>> >   -114.58root default
>> >   -28.19host store-1
>> >   12.73osd.1up1
>> >   02.73osd.0up1
>> >   52.73osd.5up1
>> >   -36.39host store-2
>> >   22.73osd.2up1
>> >   32.73osd.3up1
>> >   40.93osd.4up1
>> >
>> >   I used to store ~8TB of rbd volumes, coming to a near-full
>> >   state. There was some annoying "stuck misplaced" PGs so I began
>> >   to remove 4.5TB of data; the weird thing is: the space hasn't
>> >   been reclaimed on the OSDs, they keeped stuck around 84% usage.
>> >   I tried to move PGs around and it happens that the space is
>> >   correctly "reclaimed" if I take an OSD out, let him empty it XFS
>> >   volume and then take it in again.
>> >
>> >   I'm currently applying this to and OSD in turn, but I though it
>> >   could be worth telling about this. The current ceph df output
>> >   is:
>> >
>> >   GLOBAL:
>> >   SIZE   AVAIL RAW USED %RAW USED
>> >   12103G 5311G 6792G56.12
>> >   POOLS:
>> >   NAME ID USED   %USED OBJEC

Re: [ceph-users] RBD over cache tier over EC pool: rbd rm doesn't remove objects

2015-01-26 Thread Irek Fasikhov
Hi,All.

Indeed, there is a problem. I removed 1 TB of data, but the space on the
cluster is not reclaimed. Is this expected behaviour or a bug? And how long
will the cleanup take?

Sat Sep 20 2014 at 8:19:24 AM, Mikaël Cluseau :

>  Hi all,
>
> I have weird behaviour on my firefly "test + convenience storage" cluster.
> It consists of 2 nodes with a light imbalance in available space:
>
> # idweighttype nameup/downreweight
> -114.58root default
> -28.19host store-1
> 12.73osd.1up1
> 02.73osd.0up1
> 52.73osd.5up1
> -36.39host store-2
> 22.73osd.2up1
> 32.73osd.3up1
> 40.93osd.4up1
>
> I used to store ~8TB of rbd volumes, coming to a near-full state. There
> was some annoying "stuck misplaced" PGs so I began to remove 4.5TB of data;
> the weird thing is: the space hasn't been reclaimed on the OSDs, they
> keeped stuck around 84% usage. I tried to move PGs around and it happens
> that the space is correctly "reclaimed" if I take an OSD out, let him empty
> it XFS volume and then take it in again.
>
> I'm currently applying this to and OSD in turn, but I though it could be
> worth telling about this. The current ceph df output is:
>
> GLOBAL:
> SIZE   AVAIL RAW USED %RAW USED
> 12103G 5311G 6792G56.12
> POOLS:
> NAME ID USED   %USED OBJECTS
> data 0  0  0 0
> metadata 1  0  0 0
> rbd  2  444G   3.67  117333
> [...]
> archives-ec  14 3628G  29.98 928902
> archives 15 37518M 0.30  273167
>
> Before "just moving data", AVAIL was around 3TB.
>
> I finished the process with the OSDs on store-1, who show the following
> space usage now:
>
> /dev/sdb1 2.8T  1.4T  1.4T  50% /var/lib/ceph/osd/ceph-0
> /dev/sdc1 2.8T  1.3T  1.5T  46% /var/lib/ceph/osd/ceph-1
> /dev/sdd1 2.8T  1.3T  1.5T  48% /var/lib/ceph/osd/ceph-5
>
> I'm currently fixing OSD 2, 3 will be the last one to be fixed. The df on
> store-2 shows the following:
>
> /dev/sdb1   2.8T  1.9T  855G  *70%* /var/lib/ceph/osd/ceph-2
> /dev/sdc1   2.8T  2.4T  417G  *86%* /var/lib/ceph/osd/ceph-3
> /dev/sdd1   932G  481G  451G  52% /var/lib/ceph/osd/ceph-4
>
> OSD 2 was at 84% 3h ago, and OSD 3 was ~75%.
>
> During rbd rm (that took a bit more that 3 days), ceph log was showing
> things like that:
>
> 2014-09-03 16:17:38.831640 mon.0 192.168.1.71:6789/0 417194 : [INF] pgmap
> v14953987: 3196 pgs: 2882 active+clean, 314 active+remapped; 7647 GB data,
> 11067 GB used, 3828 GB / 14896 GB avail; 0 B/s rd, 6778 kB/s wr, 18 op/s;
> -5/5757286 objects degraded (-0.000%)
> [...]
> 2014-09-05 03:09:59.895507 mon.0 192.168.1.71:6789/0 513976 : [INF] pgmap
> v15050766: 3196 pgs: 2882 active+clean, 314 active+remapped; 6010 GB data,
> 11156 GB used, 3740 GB / 14896 GB avail; 0 B/s rd, 0 B/s wr, 8 op/s;
> -388631/5247320 objects degraded (-7.406%)
> [...]
> 2014-09-06 03:56:50.008109 mon.0 192.168.1.71:6789/0 580816 : [INF] pgmap
> v15117604: 3196 pgs: 2882 active+clean, 314 active+remapped; 4865 GB data,
> 11207 GB used, 3689 GB / 14896 GB avail; 0 B/s rd, 6117 kB/s wr, 22 op/s;
> -706519/3699415 objects degraded (-19.098%)
> 2014-09-06 03:56:44.476903 osd.0 192.168.1.71:6805/11793 729 : [WRN] 1
> slow requests, 1 included below; oldest blocked for > 30.058434 secs
> 2014-09-06 03:56:44.476909 osd.0 192.168.1.71:6805/11793 730 : [WRN] slow
> request 30.058434 seconds old, received at 2014-09-06 03:56:14.418429:
> osd_op(client.19843278.0:46081 rb.0.c7fd7f.238e1f29.b3fa [delete]
> 15.b8fb7551 ack+ondisk+write e38950) v4 currently waiting for blocked object
> 2014-09-06 03:56:49.477785 osd.0 192.168.1.71:6805/11793 731 : [WRN] 2
> slow requests, 1 included below; oldest blocked for > 35.059315 secs
> [... stabilizes here:]
> 2014-09-06 22:13:48.771531 mon.0 192.168.1.71:6789/0 632527 : [INF] pgmap
> v15169313: 3196 pgs: 2882 active+clean, 314 active+remapped; 4139 GB data,
> 11215 GB used, 3681 GB / 14896 GB avail; 64 B/s rd, 64 B/s wr, 0 op/s;
> -883219/3420796 objects degraded (-25.819%)
> [...]
> 2014-09-07 03:09:48.491325 mon.0 192.168.1.71:6789/0 633880 : [INF] pgmap
> v15170666: 3196 pgs: 2882 active+clean, 314 active+remapped; 4139 GB data,
> 11215 GB used, 3681 GB / 14896 GB avail; 18727 B/s wr, 2 op/s;
> -883219/3420796 objects degraded (-25.819%)
>
> And now, during data movement I described before:
>
> 2014-09-20 15:16:13.394694 mon.0 [INF] pgmap v15344707: 3196 pgs: 2132
> active+clean, 432 active+remapped+wait_backfill, 621 active+remapped, 11
> active+remapped+backfilling; 4139 GB data, 6831 GB used, 5271 GB / 12103 GB
> avail; 379097/3792969 objects degraded (9.995%)
>
> If some ceph develop

Re: [ceph-users] Part 2: ssd osd fails often with "FAILED assert(soid < scrubber.start || soid >= scrubber.end)"

2015-01-26 Thread Irek Fasikhov
Hi all, hi Loic,
I have exactly the same error. Do I understand correctly that the fix will be
in 0.80.9? Thank you.



Sat Jan 17 2015 at 2:21:09 AM, Loic Dachary :

>
>
> On 14/01/2015 18:33, Udo Lembke wrote:
> > Hi Loic,
> > thanks for the answer. I hope it's not like in
> > http://tracker.ceph.com/issues/8747 where the issue happens with an
> > patched version if understand right.
>
> http://tracker.ceph.com/issues/8747 is a duplicate of
> http://tracker.ceph.com/issues/8011 indeed :-)
> >
> > So I must only wait few month ;-) for an backport...
> >
> > Udo
> >
> > Am 14.01.2015 09:40, schrieb Loic Dachary:
> >> Hi,
> >>
> >> This is http://tracker.ceph.com/issues/8011 which is being
> >> backported.
> >>
> >> Cheers
> >>
> >>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG num calculator live on Ceph.com

2015-01-09 Thread Irek Fasikhov
Very very good :)

Fri, 9 Jan 2015, 2:17, William Bloom (wibloom) :

>  Awesome, thanks Michael.
>
>
>
> Regards
>
> William
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Michael J. Kidd
> *Sent:* Wednesday, January 07, 2015 2:09 PM
> *To:* ceph-us...@ceph.com
> *Subject:* [ceph-users] PG num calculator live on Ceph.com
>
>
>
> Hello all,
>
>   Just a quick heads up that we now have a PG calculator to help determine
> the proper PG per pool numbers to achieve a target PG per OSD ratio.
>
> http://ceph.com/pgcalc
>
> Please check it out!  Happy to answer any questions, and always welcome
> any feedback on the tool / verbiage, etc...
>
> As an aside, we're also working to update the documentation to reflect the
> best practices.  See Ceph.com tracker for this at:
> http://tracker.ceph.com/issues/9867
>
> Thanks!
>
> Michael J. Kidd
> Sr. Storage Consultant
> Inktank Professional Services
>
>  - by Red Hat
>   ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] VM restore on Ceph *very* slow

2014-12-11 Thread Irek Fasikhov
Examples
Backups:
/usr/bin/nice -n +20 /usr/bin/rbd -n client.backup export
test/vm-105-disk-1@rbd_data.505392ae8944a - | /usr/bin/pv -s 40G -n -i 1 |
/usr/bin/nice -n +20 /usr/bin/pbzip2 -c > /backup/vm-105-disk-1
Restore:
pbzip2 -dk /nfs/RBD/big-vm-268-disk-1-LyncV2-20140830-011308.pbzip2 -c |
rbd -n client.rbdbackup -k /etc/ceph/big.keyring -c /etc/ceph/big.conf
import --image-format 2 - rbd/Lyncolddisk1
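
For incremental backups, the export-diff/import-diff variant looks roughly like
this (the snapshot names are placeholders and assume yesterday's snapshot
already exists on both sides):

rbd snap create test/vm-105-disk-1@backup-20141212
rbd export-diff --from-snap backup-20141211 test/vm-105-disk-1@backup-20141212 - | rbd -n client.rbdbackup import-diff - backup/vm-105-disk-1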

2014-12-12 8:38 GMT+03:00 Irek Fasikhov :

> Hi.
>
> For faster operation, use rbd export/export-diff and import/import-diff
>
> 2014-12-11 17:17 GMT+03:00 Lindsay Mathieson 
> :
>
>>
>> Anyone know why a VM live restore would be excessively slow on Ceph?
>> restoring
>> a  small VM with 12GB disk/2GB Ram is taking 18 *minutes*. Larger VM's
>> can be
>> over half an hour.
>>
>> The same VM's on the same disks, but native, or glusterfs take less than
>> 30
>> seconds.
>>
>> VM's are KVM on Proxmox.
>>
>>
>> thanks,
>> --
>> Lindsay
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
> С уважением, Фасихов Ирек Нургаязович
> Моб.: +79229045757
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] VM restore on Ceph *very* slow

2014-12-11 Thread Irek Fasikhov
Hi.

For faster operation, use rbd export/export-diff and import/import-diff

2014-12-11 17:17 GMT+03:00 Lindsay Mathieson :

>
> Anyone know why a VM live restore would be excessively slow on Ceph?
> restoring
> a  small VM with 12GB disk/2GB Ram is taking 18 *minutes*. Larger VM's can
> be
> over half an hour.
>
> The same VM's on the same disks, but native, or glusterfs take less than 30
> seconds.
>
> VM's are KVM on Proxmox.
>
>
> thanks,
> --
> Lindsay
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] system metrics monitoring

2014-12-11 Thread Irek Fasikhov
Hi.

We use Zabbix.

2014-12-12 8:33 GMT+03:00 pragya jain :

> hello sir!
>
> I need some open source monitoring tool for examining these metrics.
>
> Please suggest some open source monitoring software.
>
> Thanks
> Regards
> Pragya Jain
>
>
>   On Thursday, 11 December 2014 9:16 PM, Denish Patel 
> wrote:
>
>
>
> Try http://www.circonus.com
>
> On Thu, Dec 11, 2014 at 1:22 AM, pragya jain 
> wrote:
>
> please somebody reply my query.
>
> Regards
> Pragya Jain
>
>
>   On Tuesday, 9 December 2014 11:53 AM, pragya jain 
> wrote:
>
>
>
> hello all!
>
> As mentioned at statistics and monitoring page of Riak
> Systems Metrics To Graph
> 
> Metric: Available Disk Space, IOWait, Read Operations, Write Operations,
> Network Throughput, Load Average
> Can somebody suggest me some monitoring tools that monitor these metrics?
>
> Regards
> Pragya Jain
>
>
>
> ___
> riak-users mailing list
> riak-us...@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
>
>
> --
> Denish Patel,
> OmniTI Computer Consulting Inc.
> Database Architect,
> http://omniti.com/does/data-management
> http://www.pateldenish.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] What's the difference between ceph-0.87-0.el6.x86_64.rpm and ceph-0.80.7-0.el6.x86_64.rpm

2014-12-10 Thread Irek Fasikhov
Hi, Cao.

https://github.com/ceph/ceph/commits/firefly


2014-12-11 5:00 GMT+03:00 Cao, Buddy :

>  Hi, I tried to download firefly rpm package, but found two rpms existing
> in different folders, what is the difference of 0.87.0 and  0.80.7?
>
>
>
> http://ceph.com/rpm/el6/x86_64/ceph-0.87-0.el6.x86_64.rpm
>
> http://ceph.com/rpm-firefly/el6/x86_64/ceph-0.80.7-0.el6.x86_64.rpm
>
>
>
>
>
> Wei Cao (Buddy)
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Irek Fasikhov
Hi.

http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/

ceph pg force_create_pg <pgid>
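
If many PGs are affected, a small loop can save some typing (a sketch; it
assumes the PG id is the first column of the dump_stuck output):

for pg in $(ceph pg dump_stuck unclean | awk '/^[0-9]+\./ {print $1}'); do
    ceph pg force_create_pg $pg
done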


2014-12-09 14:50 GMT+03:00 Giuseppe Civitella 
:

> Hi all,
>
> last week I installed a new ceph cluster on 3 vm running Ubuntu 14.04 with
> default kernel.
> There is a ceph monitor a two osd hosts. Here are some datails:
> ceph -s
> cluster c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
>  health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
>  monmap e1: 1 mons at {ceph-mon1=10.1.1.83:6789/0}, election epoch 1,
> quorum 0 ceph-mon1
>  osdmap e83: 6 osds: 6 up, 6 in
>   pgmap v231: 192 pgs, 3 pools, 0 bytes data, 0 objects
> 207 MB used, 30446 MB / 30653 MB avail
>  192 active+degraded
>
> root@ceph-mon1:/home/ceph# ceph osd dump
> epoch 99
> fsid c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
> created 2014-12-06 13:15:06.418843
> modified 2014-12-09 11:38:04.353279
> flags
> pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 18 flags hashpspool
> crash_replay_interval 45 stripe_width 0
> pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 19 flags hashpspool stripe_width 0
> pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 20 flags hashpspool stripe_width 0
> max_osd 6
> osd.0 up   in  weight 1 up_from 90 up_thru 90 down_at 89
> last_clean_interval [58,89) 10.1.1.84:6805/995 10.1.1.84:6806/4000995
> 10.1.1.84:6807/4000995 10.1.1.84:6808/4000995 exists,up
> e3895075-614d-48e2-b956-96e13dbd87fe
> osd.1 up   in  weight 1 up_from 88 up_thru 0 down_at 87
> last_clean_interval [8,87) 10.1.1.85:6800/23146 10.1.1.85:6815/7023146
> 10.1.1.85:6816/7023146 10.1.1.85:6817/7023146 exists,up
> 144bc6ee-2e3d-4118-a460-8cc2bb3ec3e8
> osd.2 up   in  weight 1 up_from 61 up_thru 0 down_at 60
> last_clean_interval [11,60) 10.1.1.85:6805/26784 10.1.1.85:6802/5026784
> 10.1.1.85:6811/5026784 10.1.1.85:6812/5026784 exists,up
> 8d5c7108-ef11-4947-b28c-8e20371d6d78
> osd.3 up   in  weight 1 up_from 95 up_thru 0 down_at 94
> last_clean_interval [57,94) 10.1.1.84:6800/810 10.1.1.84:6810/3000810
> 10.1.1.84:6811/3000810 10.1.1.84:6812/3000810 exists,up
> bd762b2d-f94c-4879-8865-cecd63895557
> osd.4 up   in  weight 1 up_from 97 up_thru 0 down_at 96
> last_clean_interval [74,96) 10.1.1.84:6801/9304 10.1.1.84:6802/2009304
> 10.1.1.84:6803/2009304 10.1.1.84:6813/2009304 exists,up
> 7d28a54b-b474-4369-b958-9e6bf6c856aa
> osd.5 up   in  weight 1 up_from 99 up_thru 0 down_at 98
> last_clean_interval [79,98) 10.1.1.85:6801/19513 10.1.1.85:6808/2019513
> 10.1.1.85:6810/2019513 10.1.1.85:6813/2019513 exists,up
> f4d76875-0e40-487c-a26d-320f8b8d60c5
>
> root@ceph-mon1:/home/ceph# ceph osd tree
> # idweight  type name   up/down reweight
> -1  0   root default
> -2  0   host ceph-osd1
> 0   0   osd.0   up  1
> 3   0   osd.3   up  1
> 4   0   osd.4   up  1
> -3  0   host ceph-osd2
> 1   0   osd.1   up  1
> 2   0   osd.2   up  1
> 5   0   osd.5   up  1
>
> Current HEALTH_WARN state says "192 active+degraded" since I rebooted an
> osd host. Previously it was "incomplete". It never reached a HEALTH_OK
> state.
> Any hint about what to do next to have an healthy cluster?
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issue in renaming rbd

2014-12-03 Thread Irek Fasikhov
Sorry :).

root@backhb2:~# rbd ls -l | grep test
test  1024M1
root@backhb2:~# rbd mv rbd/test rbd/test2
root@backhb2:~# rbd ls -l | grep test
test2 1024M1
root@backhb2:~# rbd rename rbd/test2 rbd/test
root@backhb2:~# rbd ls -l | grep test
test  1024M1

2014-12-03 16:36 GMT+03:00 Mallikarjun Biradar <
mallikarjuna.bira...@gmail.com>:

> I am trying to rename rbd.
>
> ems@rack2-storage-5:~$ sudo rbd -h | grep rename
>   (mv | rename)rename src image to dest
> ems@rack2-storage-5:~$
>
>
> On Wed, Dec 3, 2014 at 6:55 PM, Irek Fasikhov  wrote:
>
>> root@backhb2:~# ceph osd pool -h | grep rename
>> osd pool rename  rename  to 
>>
>>
>> 2014-12-03 16:23 GMT+03:00 Mallikarjun Biradar <
>> mallikarjuna.bira...@gmail.com>:
>>
>>> Hi,
>>>
>>> I am trying to rename in the same pool.
>>>
>>> sudo rbd rename rbdPool1 -p testPool2 rbdPoolTest1 -p testPool2
>>>
>>> -Thanks & Regards,
>>> Mallikarjun Biradar
>>>
>>> On Wed, Dec 3, 2014 at 6:50 PM, Irek Fasikhov  wrote:
>>>
>>>> Hi.
>>>> You can only rename in the same pool.
>>>> For transfer to another pool: rbd cp and rbd export/import.
>>>>
>>>> 2014-12-03 16:15 GMT+03:00 Mallikarjun Biradar <
>>>> mallikarjuna.bira...@gmail.com>:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Whether renaming rbd is allowed?
>>>>>
>>>>> I am getting this error,
>>>>>
>>>>> ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 -p testPool2
>>>>> rbdPoolTest1 -p testPool2
>>>>> rbd: mv/rename across pools not supported
>>>>> source pool: testPool2 dest pool: rbd
>>>>> ems@rack6-ramp-4:~$
>>>>>
>>>>> ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 rbdPoolTest1
>>>>> rbd: rename error: (2) No such file or directory
>>>>> 2014-12-03 18:41:50.786397 7f73b4f75840 -1 librbd: error finding
>>>>> source object: (2) No such file or directory
>>>>> ems@rack6-ramp-4:~$
>>>>>
>>>>> ems@rack6-ramp-4:~$ sudo rbd ls -p testPool2
>>>>> rbdPool1
>>>>> ems@rack6-ramp-4:~$
>>>>>
>>>>> Why its taking rbd as destination pool, though I have provided another
>>>>> pool as per syntax.
>>>>>
>>>>> Syntax in rbd help:
>>>>> rbd   (mv | rename)rename src image to
>>>>> dest
>>>>>
>>>>> The rbd which I am trying to rename is mounted and IO is running on it.
>>>>>
>>>>> -Thanks & regards,
>>>>> Mallikarjun Biradar
>>>>>
>>>>> ___
>>>>> ceph-users mailing list
>>>>> ceph-users@lists.ceph.com
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> С уважением, Фасихов Ирек Нургаязович
>>>> Моб.: +79229045757
>>>>
>>>
>>>
>>
>>
>> --
>> С уважением, Фасихов Ирек Нургаязович
>> Моб.: +79229045757
>>
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issue in renaming rbd

2014-12-03 Thread Irek Fasikhov
root@backhb2:~# ceph osd pool -h | grep rename
osd pool rename  rename  to 


2014-12-03 16:23 GMT+03:00 Mallikarjun Biradar <
mallikarjuna.bira...@gmail.com>:

> Hi,
>
> I am trying to rename in the same pool.
>
> sudo rbd rename rbdPool1 -p testPool2 rbdPoolTest1 -p testPool2
>
> -Thanks & Regards,
> Mallikarjun Biradar
>
> On Wed, Dec 3, 2014 at 6:50 PM, Irek Fasikhov  wrote:
>
>> Hi.
>> You can only rename in the same pool.
>> For transfer to another pool: rbd cp and rbd export/import.
>>
>> 2014-12-03 16:15 GMT+03:00 Mallikarjun Biradar <
>> mallikarjuna.bira...@gmail.com>:
>>
>>> Hi all,
>>>
>>> Whether renaming rbd is allowed?
>>>
>>> I am getting this error,
>>>
>>> ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 -p testPool2 rbdPoolTest1
>>> -p testPool2
>>> rbd: mv/rename across pools not supported
>>> source pool: testPool2 dest pool: rbd
>>> ems@rack6-ramp-4:~$
>>>
>>> ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 rbdPoolTest1
>>> rbd: rename error: (2) No such file or directory
>>> 2014-12-03 18:41:50.786397 7f73b4f75840 -1 librbd: error finding source
>>> object: (2) No such file or directory
>>> ems@rack6-ramp-4:~$
>>>
>>> ems@rack6-ramp-4:~$ sudo rbd ls -p testPool2
>>> rbdPool1
>>> ems@rack6-ramp-4:~$
>>>
>>> Why its taking rbd as destination pool, though I have provided another
>>> pool as per syntax.
>>>
>>> Syntax in rbd help:
>>> rbd   (mv | rename)rename src image to
>>> dest
>>>
>>> The rbd which I am trying to rename is mounted and IO is running on it.
>>>
>>> -Thanks & regards,
>>> Mallikarjun Biradar
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>
>>
>> --
>> С уважением, Фасихов Ирек Нургаязович
>> Моб.: +79229045757
>>
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issue in renaming rbd

2014-12-03 Thread Irek Fasikhov
Hi.
You can only rename within the same pool.
To move an image to another pool, use rbd cp or rbd export/import.
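
For example (the pool and image names below are placeholders):

rbd cp testPool2/rbdPool1 otherPool/rbdPool1
# or stream it without an intermediate file:
rbd export testPool2/rbdPool1 - | rbd import --image-format 2 - otherPool/rbdPool1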

2014-12-03 16:15 GMT+03:00 Mallikarjun Biradar <
mallikarjuna.bira...@gmail.com>:

> Hi all,
>
> Whether renaming rbd is allowed?
>
> I am getting this error,
>
> ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 -p testPool2 rbdPoolTest1 -p
> testPool2
> rbd: mv/rename across pools not supported
> source pool: testPool2 dest pool: rbd
> ems@rack6-ramp-4:~$
>
> ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 rbdPoolTest1
> rbd: rename error: (2) No such file or directory
> 2014-12-03 18:41:50.786397 7f73b4f75840 -1 librbd: error finding source
> object: (2) No such file or directory
> ems@rack6-ramp-4:~$
>
> ems@rack6-ramp-4:~$ sudo rbd ls -p testPool2
> rbdPool1
> ems@rack6-ramp-4:~$
>
> Why its taking rbd as destination pool, though I have provided another
> pool as per syntax.
>
> Syntax in rbd help:
> rbd   (mv | rename)rename src image to dest
>
> The rbd which I am trying to rename is mounted and IO is running on it.
>
> -Thanks & regards,
> Mallikarjun Biradar
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] trouble starting second monitor

2014-12-01 Thread Irek Fasikhov
[celtic][DEBUG ] create the mon path if it does not exist

mkdir /var/lib/ceph/mon/<cluster>-<hostname>
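
The warnings at the end of your log also say that celtic is not in
mon_initial_members and that no public network is defined. A rough sketch of
what could be added to ceph.conf before retrying (the 192.168.1.0/24 network is
only a guess based on the monitor address in your ceph status output):

public_network = 192.168.1.0/24
mon_initial_members = black, celtic

and then push the updated config and add the monitor again:

ceph-deploy --overwrite-conf config push celtic
ceph-deploy mon add celtic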

2014-12-01 4:32 GMT+03:00 K Richard Pixley :

> What does this mean, please?
>
> --rich
>
> ceph@adriatic:~/my-cluster$ ceph status
> cluster 1023db58-982f-4b78-b507-481233747b13
>  health HEALTH_OK
>  monmap e1: 1 mons at {black=192.168.1.77:6789/0}, election epoch 2,
> quorum 0 black
>  mdsmap e7: 1/1/1 up {0=adriatic=up:active}, 3 up:standby
>  osdmap e17: 4 osds: 4 up, 4 in
>   pgmap v48: 192 pgs, 3 pools, 1884 bytes data, 20 objects
> 29134 MB used, 113 GB / 149 GB avail
>  192 active+clean
> ceph@adriatic:~/my-cluster$ ceph-deploy mon create celtic
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /home/ceph/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.20): /usr/bin/ceph-deploy mon
> create celtic
> [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts celtic
> [ceph_deploy.mon][DEBUG ] detecting platform for host celtic ...
> [celtic][DEBUG ] connection detected need for sudo
> [celtic][DEBUG ] connected to host: celtic
> [celtic][DEBUG ] detect platform information from remote host
> [celtic][DEBUG ] detect machine type
> [ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
> [celtic][DEBUG ] determining if provided host has same hostname in remote
> [celtic][DEBUG ] get remote short hostname
> [celtic][DEBUG ] deploying mon to celtic
> [celtic][DEBUG ] get remote short hostname
> [celtic][DEBUG ] remote hostname: celtic
> [celtic][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
> [celtic][DEBUG ] create the mon path if it does not exist
> [celtic][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-celtic/
> done
> [celtic][DEBUG ] create a done file to avoid re-doing the mon deployment
> [celtic][DEBUG ] create the init path if it does not exist
> [celtic][DEBUG ] locating the `service` executable...
> [celtic][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph
> id=celtic
> [celtic][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon
> /var/run/ceph/ceph-mon.celtic.asok mon_status
> [celtic][ERROR ] admin_socket: exception getting command descriptions:
> [Errno 2] No such file or directory
> [celtic][WARNIN] monitor: mon.celtic, might not be running yet
> [celtic][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon
> /var/run/ceph/ceph-mon.celtic.asok mon_status
> [celtic][ERROR ] admin_socket: exception getting command descriptions:
> [Errno 2] No such file or directory
> [celtic][WARNIN] celtic is not defined in `mon initial members`
> [celtic][WARNIN] monitor celtic does not exist in monmap
> [celtic][WARNIN] neither `public_addr` nor `public_network` keys are
> defined for monitors
> [celtic][WARNIN] monitors may not be able to form quorum
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] S3CMD and Ceph

2014-11-27 Thread Irek Fasikhov
This configuration works for me:
[rbd@rbdbackup ~]$ cat .s3cfg
[default]
access_key = 2M4PRTYOGI3AXBZFAXFR
secret_key = LQYFttxRn+7bBJ5rD1Y7ckZCN8XjEInOFY3s9RUR
host_base = s3.X.ru
host_bucket = %(bucket)s.s3.X.ru
enable_multipart = True
multipart_chunk_size_mb = 30
use_https = True


2014-11-27 7:43 GMT+03:00 b :

> I'm having some issues with a user in ceph using S3 Browser and S3cmd
>
> It was previously working.
>
> I can no longer use s3cmd to list the contents of a bucket, i am getting
> 403 and 405 errors
> When using S3browser, I can see the contents of the bucket, I can upload
> files, but i cannot create additional folders within the bucket (i get 403
> error)
>
> The bucket is owned by the user, I am using the correct keys, I have
> checked the keys for escape characters, but there are no slashes in the key.
>
> I'm not sure what else I can do to get this to work.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] S3CMD and Ceph

2014-11-27 Thread Irek Fasikhov
Hi, Ben!

It looks like a permissions problem on the bucket itself; your client
configuration is fine.
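
A couple of radosgw-admin checks usually narrow this down (bucket and user
names below are placeholders, not taken from the thread):

radosgw-admin bucket stats --bucket=BUCKET
radosgw-admin user info --uid=USERNAME
# re-link the bucket to the user if the owner looks wrong:
radosgw-admin bucket link --bucket=BUCKET --uid=USERNAME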

2014-11-27 11:39 GMT+03:00 Ben :

>  Even with those settings it doesn't work.
>
> I still get "ERROR: Access to bucket 'BUCKET' was denied'
>
> Radosgw-admin shows me as the owner of the bucket, and when i do 's3cmd
> ls' by itself, it lists all buckets. But when I do 's3cmd ls s3://BUCKET'
> it gives me denied error.
>
>
>
> On 27/11/14 19:32, Irek Fasikhov wrote:
>
>  This configuration works for me:
>  [rbd@rbdbackup ~]$ cat .s3cfg
> [default]
> access_key = 2M4PRTYOGI3AXBZFAXFR
> secret_key = LQYFttxRn+7bBJ5rD1Y7ckZCN8XjEInOFY3s9RUR
> host_base = s3.X.ru
> host_bucket = %(bucket)s.s3.X.ru
> enable_multipart = True
> multipart_chunk_size_mb = 30
> use_https = True
>
>
> 2014-11-27 7:43 GMT+03:00 b :
>
>> I'm having some issues with a user in ceph using S3 Browser and S3cmd
>>
>> It was previously working.
>>
>> I can no longer use s3cmd to list the contents of a bucket, i am getting
>> 403 and 405 errors
>> When using S3browser, I can see the contents of the bucket, I can upload
>> files, but i cannot create additional folders within the bucket (i get 403
>> error)
>>
>> The bucket is owned by the user, I am using the correct keys, I have
>> checked the keys for escape characters, but there are no slashes in the key.
>>
>> I'm not sure what else I can do to get this to work.
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
>  --
>  Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
> Mobile: +79229045757
>
>
>


-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osds fails to start with mismatch in id

2014-11-10 Thread Irek Fasikhov
Hi, Ramakrishna.
I think this shows where the problem is; compare the whoami file in each OSD
data directory with the directory name:
[ceph@ceph05 ~]$ cat /var/lib/ceph/osd/ceph-56/whoami
56
[ceph@ceph05 ~]$ cat /var/lib/ceph/osd/ceph-57/whoami
57
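
If the partitions ended up with the generic Linux data GUID (as in the quoted
output further down), the type codes can be set back so the ceph udev rules
recognise them again. A sketch with sgdisk; device and partition numbers are
placeholders:

# OSD data partition
sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/sdX
# journal partition
sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdX
partprobe /dev/sdX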


Tue Nov 11 2014 at 6:01:40, Ramakrishna Nishtala (rnishtal) <
rnish...@cisco.com>:

>  Hi Greg,
>
> Thanks for the pointer. I think you are right. The full story is like this.
>
>
>
> After installation, everything works fine until I reboot. I do observe
> udevadm getting triggered in logs, but the devices do not come up after
> reboot. Exact issue as http://tracker.ceph.com/issues/5194. But this has
> been fixed a while back per the case details.
>
> As a workaround, I copied the contents from /proc/mounts to fstab and
> that’s where I landed into the issue.
>
>
>
> After your suggestion, defined as UUID in fstab, but similar problem.
>
> blkid.tab now moved to tmpfs and also isn’t consistent ever after issuing
> blkid explicitly to get the UUID’s. Goes in line with ceph-disk comments.
>
>
>
> Decided to reinstall, dd the partitions, zapdisks etc. Did not help. Very
> weird that links below change in /dev/disk/by-uuid and
> /dev/disk/by-partuuid etc.
>
>
>
> *Before reboot*
>
> lrwxrwxrwx 1 root root 10 Nov 10 06:31
> 11aca3e2-a9d5-4bcc-a5b0-441c53d473b6 -> ../../sdd2
>
> lrwxrwxrwx 1 root root 10 Nov 10 06:31
> 89594989-90cb-4144-ac99-0ffd6a04146e -> ../../sde2
>
> lrwxrwxrwx 1 root root 10 Nov 10 06:31
> c17fe791-5525-4b09-92c4-f90eaaf80dc6 -> ../../sda2
>
> lrwxrwxrwx 1 root root 10 Nov 10 06:31
> c57541a1-6820-44a8-943f-94d68b4b03d4 -> ../../sdc2
>
> lrwxrwxrwx 1 root root 10 Nov 10 06:31
> da7030dd-712e-45e4-8d89-6e795d9f8011 -> ../../sdb2
>
>
>
> *After reboot*
>
> lrwxrwxrwx 1 root root 10 Nov 10 09:50
> 11aca3e2-a9d5-4bcc-a5b0-441c53d473b6 -> ../../sdd2
>
> lrwxrwxrwx 1 root root 10 Nov 10 09:50
> 89594989-90cb-4144-ac99-0ffd6a04146e -> ../../sde2
>
> lrwxrwxrwx 1 root root 10 Nov 10 09:50
> c17fe791-5525-4b09-92c4-f90eaaf80dc6 -> ../../sda2
>
> lrwxrwxrwx 1 root root 10 Nov 10 09:50
> c57541a1-6820-44a8-943f-94d68b4b03d4 -> ../../sdb2
>
> lrwxrwxrwx 1 root root 10 Nov 10 09:50
> da7030dd-712e-45e4-8d89-6e795d9f8011 -> ../../sdh2
>
>
>
> Essentially, the transformation here is sdb2->sdh2 and sdc2-> sdb2. In
> fact I haven’t partitioned my sdh at all before the test. The only
> difference probably from the standard procedure is I have pre-created the
> partitions for the journal and data, with parted.
>
>
>
> /lib/udev/rules.d  osd rules has four different partition GUID codes,
>
> "45b0969e-9b03-4f30-b4c6-5ec00ceff106",
>
> "45b0969e-9b03-4f30-b4c6-b4b80ceff106",
>
> "4fbd7e29-9d25-41b8-afd0-062c0ceff05d",
>
> "4fbd7e29-9d25-41b8-afd0-5ec00ceff05d",
>
>
>
> But all my partitions journal/data are having
> ebd0a0a2-b9e5-4433-87c0-68b6b72699c7 as partition guid code.
>
>
>
> Appreciate any help.
>
>
>
> Regards,
>
>
>
> Rama
>
> =
>
> -Original Message-
> From: Gregory Farnum [mailto:g...@gregs42.com]
> Sent: Sunday, November 09, 2014 3:36 PM
> To: Ramakrishna Nishtala (rnishtal)
> Cc: ceph-us...@ceph.com
> Subject: Re: [ceph-users] osds fails to start with mismatch in id
>
>
>
> On Sun, Nov 9, 2014 at 3:21 PM, Ramakrishna Nishtala (rnishtal) <
> rnish...@cisco.com> wrote:
>
> > Hi
>
> >
>
> > I am on ceph 0.87, RHEL 7
>
> >
>
> > Out of 60 few osd’s start and the rest complain about mismatch about
>
> > id’s as below.
>
> >
>
> >
>
> >
>
> > 2014-11-09 07:09:55.501177 7f4633e01880 -1 OSD id 56 != my id 53
>
> >
>
> > 2014-11-09 07:09:55.810048 7f636edf4880 -1 OSD id 57 != my id 54
>
> >
>
> > 2014-11-09 07:09:56.122957 7f459a766880 -1 OSD id 58 != my id 55
>
> >
>
> > 2014-11-09 07:09:56.429771 7f87f8e0c880 -1 OSD id 0 != my id 56
>
> >
>
> > 2014-11-09 07:09:56.741329 7fadd9b91880 -1 OSD id 2 != my id 57
>
> >
>
> >
>
> >
>
> > Found one OSD ID in /var/lib/ceph/cluster-id/keyring. To check this
>
> > out manually corrected it and turned authentication to none too, but
>
> > did not help.
>
> >
>
> >
>
> >
>
> > Any clues, how it can be corrected?
>
>
>
> It sounds like maybe the symlinks to data and journal aren't matching up
> with where they're supposed to be. This is usually a result of using
> unstable /dev links that don't always match to the same physical disks.
> Have you checked that?
>
> -Greg
>  ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Irek Fasikhov
Hi, Udo.
Good values :)

Did you apply any additional tuning on the hosts?
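
One host-side setting that often explains differences like this is interrupt
coalescing on the NIC; a sketch of how to inspect and change it (the interface
name is a placeholder):

ethtool -c eth2
ethtool -C eth2 adaptive-rx off rx-usecs 0
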
Thanks.

Thu Nov 06 2014 at 16:57:36, Udo Lembke :

> Hi,
> from one host to five OSD-hosts.
>
> NIC Intel 82599EB; jumbo-frames; single Switch IBM G8124 (blade network).
>
> rtt min/avg/max/mdev = 0.075/0.114/0.231/0.037 ms
> rtt min/avg/max/mdev = 0.088/0.164/0.739/0.072 ms
> rtt min/avg/max/mdev = 0.081/0.141/0.229/0.030 ms
> rtt min/avg/max/mdev = 0.083/0.115/0.183/0.030 ms
> rtt min/avg/max/mdev = 0.087/0.144/0.190/0.028 ms
>
>
> Udo
>
> Am 06.11.2014 14:18, schrieb Wido den Hollander:
> > Hello,
> >
> > While working at a customer I've ran into a 10GbE latency which seems
> > high to me.
> >
> > I have access to a couple of Ceph cluster and I ran a simple ping test:
> >
> > $ ping -s 8192 -c 100 -n 
> >
> > Two results I got:
> >
> > rtt min/avg/max/mdev = 0.080/0.131/0.235/0.039 ms
> > rtt min/avg/max/mdev = 0.128/0.168/0.226/0.023 ms
> >
> > Both these environment are running with Intel 82599ES 10Gbit cards in
> > LACP. One with Extreme Networks switches, the other with Arista.
> >
> > Now, on a environment with Cisco Nexus 3000 and Nexus 7000 switches I'm
> > seeing:
> >
> > rtt min/avg/max/mdev = 0.160/0.244/0.298/0.029 ms
> >
> > As you can see, the Cisco Nexus network has high latency compared to the
> > other setup.
> >
> > You would say the switches are to blame, but we also tried with a direct
> > TwinAx connection, but that didn't help.
> >
> > This setup also uses the Intel 82599ES cards, so the cards don't seem to
> > be the problem.
> >
> > The MTU is set to 9000 on all these networks and cards.
> >
> > I was wondering, others with a Ceph cluster running on 10GbE, could you
> > perform a simple network latency test like this? I'd like to compare the
> > results.
> >
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG inconsistency

2014-11-06 Thread Irek Fasikhov
Thu Nov 06 2014 at 16:44:09, GuangYang :

> Thanks Dan. By "killed/formatted/replaced the OSD", did you replace the
> disk? Not a filesystem expert here, but I would like to understand what
> happened underneath to cause the EIO and whether that reveals something
> (e.g. hardware issue).
>
> In our case, we are using 6TB drives, so there is a lot of data to
> migrate, and since backfilling/recovering increases latency, we hope to
> avoid that as much as we can.
>

For example, use the following parameters:
osd_recovery_delay_start = 10
osd recovery op priority = 2
osd max backfills = 1
osd recovery max active = 1
osd recovery threads = 1
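
These can also be applied to a running cluster without restarting the OSDs,
for example:

ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 2'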



>
> Thanks,
> Guang
>
> 
> > From: daniel.vanders...@cern.ch
> > Date: Thu, 6 Nov 2014 13:36:46 +
> > Subject: Re: PG inconsistency
> > To: yguan...@outlook.com; ceph-users@lists.ceph.com
> >
> > Hi,
> > I've only ever seen (1), EIO to read a file. In this case I've always
> > just killed / formatted / replaced that OSD completely -- that moves
> > the PG to a new master and the new replication "fixes" the
> > inconsistency. This way, I've never had to pg repair. I don't know if
> > this is a best or even good practise, but it works for us.
> > Cheers, Dan
> >
> > On Thu Nov 06 2014 at 2:24:32 PM GuangYang
> > <yguan...@outlook.com> wrote:
> > Hello Cephers,
> > Recently we observed a couple of inconsistencies in our Ceph cluster,
> > there were two major patterns leading to inconsistency as I observed:
> > 1) EIO to read the file, 2) the digest is inconsistent (for EC) even
> > there is no read error).
> >
> > While ceph has built-in tool sets to repair the inconsistencies, I also
> > would like to check with the community in terms of what is the best
> > ways to handle such issues (e.g. should we run fsck / xfs_repair when
> > such issue happens).
> >
> > In more details, I have the following questions:
> > 1. When there is inconsistency detected, what is the chance there is
> > some hardware issues which need to be repaired physically, or should I
> > run some disk/filesystem tools to further check?
> > 2. Should we use fsck / xfs_repair to fix the inconsistencies, or
> > should we solely relay on Ceph's repair tool sets?
> >
> > It would be great to hear you experience and suggestions.
> >
> > BTW, we are using XFS in the cluster.
> >
> > Thanks,
> > Guang
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG inconsistency

2014-11-06 Thread Irek Fasikhov
Which version of Ceph are you running? If it is 0.80.0 - 0.80.3, see:
https://github.com/ceph/ceph/commit/7557a8139425d1705b481d7f010683169fd5e49b
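
For locating and repairing the affected placement groups with the built-in
tools, the usual sequence is roughly this (the pg id below is only an example):

ceph health detail | grep inconsistent
ceph pg repair 2.1f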

Thu Nov 06 2014 at 16:24:21, GuangYang :

> Hello Cephers,
> Recently we observed a couple of inconsistencies in our Ceph cluster,
> there were two major patterns leading to inconsistency as I observed: 1)
> EIO to read the file, 2) the digest is inconsistent (for EC) even there is
> no read error).
>
> While ceph has built-in tool sets to repair the inconsistencies, I also
> would like to check with the community in terms of what is the best ways to
> handle such issues (e.g. should we run fsck / xfs_repair when such issue
> happens).
>
> In more details, I have the following questions:
> 1. When there is inconsistency detected, what is the chance there is some
> hardware issues which need to be repaired physically, or should I run some
> disk/filesystem tools to further check?
> 2. Should we use fsck / xfs_repair to fix the inconsistencies, or should
> we solely relay on Ceph's repair tool sets?
>
> It would be great to hear you experience and suggestions.
>
> BTW, we are using XFS in the cluster.
>
> Thanks,
> Guang
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Full backup/restore of Ceph cluster?

2014-11-05 Thread Irek Fasikhov
Hi.

I modified the script and added a multithreaded archiver.
See: http://www.theirek.com/blog/2014/10/26/primier-biekapa-rbd-ustroistva
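
For a plain full backup/restore cycle without a second cluster, the
snapshot/export/import commands look roughly like this; pool, image and file
names are placeholders:

rbd snap create pool/image@backup1
rbd export pool/image@backup1 - | gzip > image-backup1.raw.gz
# restore into a new image:
gunzip -c image-backup1.raw.gz | rbd import --image-format 2 - pool/image-restored

Incremental backups can then use rbd export-diff / import-diff between two
snapshots, as in the example quoted below.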

2014-11-05 14:03 GMT+03:00 Alexandre DERUMIER :

> >>What if I just wanted to back up a running cluster without having
> another cluster to replicate to
>
> Yes, import is optionnal,
>
> you can simply export and pipe to tar
>
>
> rbd export-diff --from-snap snap1 pool/image@snap2 - | tar 
>
>
> - Mail original -
>
> De: "Christopher Armstrong" 
> À: "Alexandre DERUMIER" 
> Cc: ceph-users@lists.ceph.com
> Envoyé: Mercredi 5 Novembre 2014 10:08:49
> Objet: Re: [ceph-users] Full backup/restore of Ceph cluster?
>
>
> Hi Alexandre,
>
>
> Thanks for the link! Unless I'm misunderstanding, this is to replicate an
> RBD volume from one cluster to another.
> ? i.e. I'd ideally like a tarball of raw files that I could extract on a
> new host, start the Ceph daemons, and get up and running.
>
>
>
>
>
> Chris Armstrong
> Head of Services
> OpDemand / Deis.io
> GitHub: https://github.com/deis/deis -- Docs: http://docs.deis.io/
>
>
> On Wed, Nov 5, 2014 at 1:04 AM, Alexandre DERUMIER < aderum...@odiso.com
> > wrote:
>
>
> >>Is RBD snapshotting what I'm looking for? Is this even possible?
>
> Yes, you can use rbd snapshoting, export / import
>
> http://ceph.com/dev-notes/incremental-snapshots-with-rbd/
>
> But you need to do it for each rbd volume.
>
> Here a script to do it:
>
> http://www.rapide.nl/blog/item/ceph_-_rbd_replication
>
>
>
> (AFAIK it's not possible to do it at pool level)
>
>
> - Mail original -
>
> De: "Christopher Armstrong" < ch...@opdemand.com >
> À: ceph-users@lists.ceph.com
> Envoyé: Mercredi 5 Novembre 2014 08:52:31
> Objet: [ceph-users] Full backup/restore of Ceph cluster?
>
>
>
>
> Hi folks,
>
>
> I was wondering if anyone has a solution for performing a complete backup
> and restore of a CEph cluster. A Google search came up with some
> articles/blog posts, some of which are old, and I don't really have a great
> idea of the feasibility of this.
>
>
> Here's what I've found:
>
>
> http://ceph.com/community/blog/tag/backup/
>
> http://ceph.com/docs/giant/rbd/rbd-snapshot/
>
> http://t3491.file-systems-ceph-user.file-systemstalk.us/backups-t3491.html
>
>
>
> Is RBD snapshotting what I'm looking for? Is this even possible? Any info
> is much appreciated!
>
>
> Thanks,
>
>
> Chris
>
>
>
>
> Chris Armstrong
> Head of Services
> OpDemand / Deis.io
> GitHub: https://github.com/deis/deis -- Docs: http://docs.deis.io/
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] where to download 0.87 debs?

2014-10-30 Thread Irek Fasikhov
http://ceph.com/debian-giant/ :)

2014-10-30 12:45 GMT+03:00 Jon Kåre Hellan :

>  Will there be debs?
>
> On 30/10/14 10:37, Irek Fasikhov wrote:
>
> Hi.
>
>  Use http://ceph.com/rpm-giant/
>
> 2014-10-30 12:34 GMT+03:00 Kenneth Waegeman :
>
>> Hi,
>>
>> Will http://ceph.com/rpm/ also be updated to have the giant packages?
>>
>> Thanks
>>
>> Kenneth
>>
>>
>>
>>
>> - Message from Patrick McGarry  -
>>Date: Wed, 29 Oct 2014 22:13:50 -0400
>>From: Patrick McGarry 
>> Subject: Re: [ceph-users] where to download 0.87 RPMS?
>>  To: 廖建锋 
>>  Cc: ceph-users 
>>
>>
>>
>>  I have updated the http://ceph.com/get page to reflect a more generic
>>> approach to linking.  It's also worth noting that the new
>>> http://download.ceph.com/ infrastructure is available now.
>>>
>>> To get to the rpms specifically you can either crawl the
>>> download.ceph.com tree or use the symlink at
>>> http://ceph.com/rpm-giant/
>>>
>>> Hope that (and the updated linkage on ceph.com/get) helps.  Thanks!
>>>
>>>
>>> Best Regards,
>>>
>>> Patrick McGarry
>>> Director Ceph Community || Red Hat
>>> http://ceph.com  ||  http://community.redhat.com
>>> @scuttlemonkey || @ceph
>>>
>>>
>>> On Wed, Oct 29, 2014 at 9:15 PM, 廖建锋  wrote:
>>>
>>>>
>>>>
>>>>
>>>> ___
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>  ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>>  - End message from Patrick McGarry  -
>>
>> --
>>
>> Met vriendelijke groeten,
>> Kenneth Waegeman
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
>  --
> Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
> Mobile: +79229045757
>
>
> ___
> ceph-users mailing list
> ceph-us...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] where to download 0.87 RPMS?

2014-10-30 Thread Irek Fasikhov
Hi.

Use http://ceph.com/rpm-giant/

2014-10-30 12:34 GMT+03:00 Kenneth Waegeman :

> Hi,
>
> Will http://ceph.com/rpm/ also be updated to have the giant packages?
>
> Thanks
>
> Kenneth
>
>
>
>
> - Message from Patrick McGarry  -
>Date: Wed, 29 Oct 2014 22:13:50 -0400
>From: Patrick McGarry 
> Subject: Re: [ceph-users] where to download 0.87 RPMS?
>  To: 廖建锋 
>  Cc: ceph-users 
>
>
>
>  I have updated the http://ceph.com/get page to reflect a more generic
>> approach to linking.  It's also worth noting that the new
>> http://download.ceph.com/ infrastructure is available now.
>>
>> To get to the rpms specifically you can either crawl the
>> download.ceph.com tree or use the symlink at
>> http://ceph.com/rpm-giant/
>>
>> Hope that (and the updated linkage on ceph.com/get) helps.  Thanks!
>>
>>
>> Best Regards,
>>
>> Patrick McGarry
>> Director Ceph Community || Red Hat
>> http://ceph.com  ||  http://community.redhat.com
>> @scuttlemonkey || @ceph
>>
>>
>> On Wed, Oct 29, 2014 at 9:15 PM, 廖建锋  wrote:
>>
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>  ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
> - End message from Patrick McGarry  -
>
> --
>
> Met vriendelijke groeten,
> Kenneth Waegeman
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Use 2 osds to create cluster but health check display "active+degraded"

2014-10-29 Thread Irek Fasikhov
ceph osd tree please :)

2014-10-29 12:03 GMT+03:00 Vickie CH :

> Dear all,
> Thanks for the reply.
> Pool replicated size is 2, because the replicated size parameter was already
> written into ceph.conf before deployment.
> Since I am not familiar with the crush map, I will follow Mark's information
> and run a test that changes the crush map to see the result.
>
> ---ceph.conf--
> [global]
> fsid = c404ded6-4086-4f0b-b479-89bc018af954
> mon_initial_members = storage0
> mon_host = 192.168.1.10
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
>
> osd_pool_default_size = 2
> osd_pool_default_min_size = 1
> osd_pool_default_pg_num = 128
> osd_journal_size = 2048
> osd_pool_default_pgp_num = 128
> osd_mkfs_type = xfs
> ---
>
> --ceph osd dump result -
> pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 14 flags hashpspool
> crash_replay_interval 45 stripe_width 0
> pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 15 flags hashpspool stripe_width 0
> pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 16 flags hashpspool stripe_width 0
> max_osd 2
>
> --
>
> Best wishes,
> Mika
>
> Best wishes,
> Mika
>
> 2014-10-29 16:56 GMT+08:00 Mark Kirkwood :
>
>> That is not my experience:
>>
>> $ ceph -v
>> ceph version 0.86-579-g06a73c3 (06a73c39169f2f332dec760f56d3ec20455b1646)
>>
>> $ cat /etc/ceph/ceph.conf
>> [global]
>> ...
>> osd pool default size = 2
>>
>> $ ceph osd dump|grep size
>> pool 2 'hot' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>> rjenkins pg_num 128 pgp_num 128 last_change 47 flags
>> hashpspool,incomplete_clones tier_of 1 cache_mode writeback target_bytes
>> 20 hit_set bloom{false_positive_probability: 0.05, target_size:
>> 0, seed: 0} 3600s x1 stripe_width 0
>> pool 10 '.rgw.root' replicated size 2 min_size 1 crush_ruleset 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 102 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 11 '.rgw.control' replicated size 2 min_size 1 crush_ruleset 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 104 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 12 '.rgw' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>> rjenkins pg_num 8 pgp_num 8 last_change 106 owner 18446744073709551615
>> flags hashpspool stripe_width 0
>> pool 13 '.rgw.gc' replicated size 2 min_size 1 crush_ruleset 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 107 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 14 '.users.uid' replicated size 2 min_size 1 crush_ruleset 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 108 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 15 '.rgw.buckets.index' replicated size 2 min_size 1 crush_ruleset 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 110 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 16 '.rgw.buckets' replicated size 2 min_size 1 crush_ruleset 0
>> object_hash rjenkins pg_num 8 pgp_num 8 last_change 112 owner
>> 18446744073709551615 flags hashpspool stripe_width 0
>> pool 17 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>> rjenkins pg_num 1024 pgp_num 1024 last_change 186 flags hashpspool
>> stripe_width 0
>>
>>
>>
>>
>>
>>
>> On 29/10/14 21:46, Irek Fasikhov wrote:
>>
>>> Hi.
>>> This parameter does not apply to pools by default.
>>> ceph osd dump | grep pool. see size=?
>>>
>>>
>>> 2014-10-29 11:40 GMT+03:00 Vickie CH >> <mailto:mika.leaf...@gmail.com>>:
>>>
>>> Der Irek:
>>>
>>> Thanks for your reply.
>>> Even already set "osd_pool_default_size = 2" the cluster still need
>>> 3 different hosts right?
>>> Is this default number can be changed by user and write into
>>> ceph.conf before deploy?
>>>
>>>
>>> Best wishes,
>>> Mika
>

Re: [ceph-users] Use 2 osds to create cluster but health check display "active+degraded"

2014-10-29 Thread Irek Fasikhov
Mark,
I meant that the parameter is not applied to pools that already exist.
I'm sure the pools DATA, METADATA and RBD (the ones created by default) still
have size = 3.
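
To verify and fix this on the pools that already exist, something along these
lines (repeat for each pool):

ceph osd dump | grep 'replicated size'
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1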

2014-10-29 11:56 GMT+03:00 Mark Kirkwood :

> That is not my experience:
>
> $ ceph -v
> ceph version 0.86-579-g06a73c3 (06a73c39169f2f332dec760f56d3ec20455b1646)
>
> $ cat /etc/ceph/ceph.conf
> [global]
> ...
> osd pool default size = 2
>
> $ ceph osd dump|grep size
> pool 2 'hot' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 128 pgp_num 128 last_change 47 flags
> hashpspool,incomplete_clones tier_of 1 cache_mode writeback target_bytes
> 20 hit_set bloom{false_positive_probability: 0.05, target_size:
> 0, seed: 0} 3600s x1 stripe_width 0
> pool 10 '.rgw.root' replicated size 2 min_size 1 crush_ruleset 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 102 owner
> 18446744073709551615 flags hashpspool stripe_width 0
> pool 11 '.rgw.control' replicated size 2 min_size 1 crush_ruleset 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 104 owner
> 18446744073709551615 flags hashpspool stripe_width 0
> pool 12 '.rgw' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 8 pgp_num 8 last_change 106 owner 18446744073709551615
> flags hashpspool stripe_width 0
> pool 13 '.rgw.gc' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 8 pgp_num 8 last_change 107 owner 18446744073709551615
> flags hashpspool stripe_width 0
> pool 14 '.users.uid' replicated size 2 min_size 1 crush_ruleset 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 108 owner
> 18446744073709551615 flags hashpspool stripe_width 0
> pool 15 '.rgw.buckets.index' replicated size 2 min_size 1 crush_ruleset 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 110 owner
> 18446744073709551615 flags hashpspool stripe_width 0
> pool 16 '.rgw.buckets' replicated size 2 min_size 1 crush_ruleset 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 112 owner
> 18446744073709551615 flags hashpspool stripe_width 0
> pool 17 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 1024 pgp_num 1024 last_change 186 flags hashpspool
> stripe_width 0
>
>
>
>
>
>
> On 29/10/14 21:46, Irek Fasikhov wrote:
>
>> Hi.
>> This parameter does not apply to pools by default.
>> ceph osd dump | grep pool. see size=?
>>
>>
>> 2014-10-29 11:40 GMT+03:00 Vickie CH > <mailto:mika.leaf...@gmail.com>>:
>>
>> Der Irek:
>>
>>     Thanks for your reply.
>> Even already set "osd_pool_default_size = 2" the cluster still need
>> 3 different hosts right?
>> Is this default number can be changed by user and write into
>> ceph.conf before deploy?
>>
>>
>> Best wishes,
>> Mika
>>
>> 2014-10-29 16:29 GMT+08:00 Irek Fasikhov > <mailto:malm...@gmail.com>>:
>>
>> Hi.
>>
>> Because the disc requires three different hosts, the default
>> number of replications 3.
>>
>> 2014-10-29 10:56 GMT+03:00 Vickie CH > <mailto:mika.leaf...@gmail.com>>:
>>
>>
>> Hi all,
>>Try to use two OSDs to create a cluster. After the
>> deply finished, I found the health status is "88
>> active+degraded" "104 active+remapped". Before use 2 osds to
>> create cluster the result is ok. I'm confuse why this
>> situation happened. Do I need to set crush map to fix this
>> problem?
>>
>>
>> --ceph.conf-
>> [global]
>> fsid = c404ded6-4086-4f0b-b479-89bc018af954
>> mon_initial_members = storage0
>> mon_host = 192.168.1.10
>> auth_cluster_required = cephx
>> auth_service_required = cephx
>> auth_client_required = cephx
>> filestore_xattr_use_omap = true
>> osd_pool_default_size = 2
>> osd_pool_default_min_size = 1
>> osd_pool_default_pg_num = 128
>> osd_journal_size = 2048
>> osd_pool_default_pgp_num = 128
>> osd_mkfs_type = xfs
>> -
>>
>> ---ceph -s---
>> cluster c404ded6-4086-4f0b-b479-89bc018af954
>&

Re: [ceph-users] Use 2 osds to create cluster but health check display "active+degraded"

2014-10-29 Thread Irek Fasikhov
Hi.
This parameter is not applied to pools that already exist (including the ones
created by default).
Run ceph osd dump | grep pool and check the size value.


2014-10-29 11:40 GMT+03:00 Vickie CH :

> Dear Irek:
>
> Thanks for your reply.
> Even with "osd_pool_default_size = 2" already set, does the cluster still need 3
> different hosts?
> Can this default number be changed by the user and written into ceph.conf
> before deploying?
>
>
> Best wishes,
> Mika
>
> 2014-10-29 16:29 GMT+08:00 Irek Fasikhov :
>
>> Hi.
>>
>> Because the default replication count is 3 and the CRUSH rule requires the
>> replicas to be placed on three different hosts.
>>
>> 2014-10-29 10:56 GMT+03:00 Vickie CH :
>>
>>> Hi all,
>>>   Try to use two OSDs to create a cluster. After the deploy finished,
>>> I found the health status is "88 active+degraded" "104 active+remapped".
>>> Before, using 2 OSDs to create a cluster, the result was ok. I'm confused why this
>>> situation happened. Do I need to set the crush map to fix this problem?
>>>
>>>
>>> --ceph.conf-
>>> [global]
>>> fsid = c404ded6-4086-4f0b-b479-89bc018af954
>>> mon_initial_members = storage0
>>> mon_host = 192.168.1.10
>>> auth_cluster_required = cephx
>>> auth_service_required = cephx
>>> auth_client_required = cephx
>>> filestore_xattr_use_omap = true
>>> osd_pool_default_size = 2
>>> osd_pool_default_min_size = 1
>>> osd_pool_default_pg_num = 128
>>> osd_journal_size = 2048
>>> osd_pool_default_pgp_num = 128
>>> osd_mkfs_type = xfs
>>> -
>>>
>>> ---ceph -s---
>>> cluster c404ded6-4086-4f0b-b479-89bc018af954
>>>  health HEALTH_WARN 88 pgs degraded; 192 pgs stuck unclean
>>>  monmap e1: 1 mons at {storage0=192.168.10.10:6789/0}, election
>>> epoch 2, quorum 0 storage0
>>>  osdmap e20: 2 osds: 2 up, 2 in
>>>   pgmap v45: 192 pgs, 3 pools, 0 bytes data, 0 objects
>>> 79752 kB used, 1858 GB / 1858 GB avail
>>>   88 active+degraded
>>>  104 active+remapped
>>> 
>>>
>>>
>>> Best wishes,
>>> Mika
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>
>>
>> --
>> Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
>> Mobile: +79229045757
>>
>
>


-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Use 2 osds to create cluster but health check display "active+degraded"

2014-10-29 Thread Irek Fasikhov
Hi.

The default replication count is 3, and the default CRUSH rule places each
replica on a different host, so two OSDs on fewer hosts cannot satisfy it.
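
For a small test setup with fewer hosts than replicas, one workaround is to
relax the CRUSH failure domain from host to osd; a sketch of doing that by
editing the decompiled map (rule names may differ in your map):

ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# in crush.txt change:  step chooseleaf firstn 0 type host
#                  to:  step chooseleaf firstn 0 type osd
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new

Alternatively, reduce the replicated size of the pools to match the number of
hosts.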

2014-10-29 10:56 GMT+03:00 Vickie CH :

> Hi all,
>   Try to use two OSDs to create a cluster. After the deploy finished, I
> found the health status is "88 active+degraded" "104 active+remapped".
> Before, using 2 OSDs to create a cluster, the result was ok. I'm confused why this
> situation happened. Do I need to set the crush map to fix this problem?
>
>
> --ceph.conf-
> [global]
> fsid = c404ded6-4086-4f0b-b479-89bc018af954
> mon_initial_members = storage0
> mon_host = 192.168.1.10
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
> osd_pool_default_size = 2
> osd_pool_default_min_size = 1
> osd_pool_default_pg_num = 128
> osd_journal_size = 2048
> osd_pool_default_pgp_num = 128
> osd_mkfs_type = xfs
> -
>
> ---ceph -s---
> cluster c404ded6-4086-4f0b-b479-89bc018af954
>  health HEALTH_WARN 88 pgs degraded; 192 pgs stuck unclean
>  monmap e1: 1 mons at {storage0=192.168.10.10:6789/0}, election epoch
> 2, quorum 0 storage0
>  osdmap e20: 2 osds: 2 up, 2 in
>   pgmap v45: 192 pgs, 3 pools, 0 bytes data, 0 objects
> 79752 kB used, 1858 GB / 1858 GB avail
>   88 active+degraded
>  104 active+remapped
> 
>
>
> Best wishes,
> Mika
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] When will Ceph 0.72.3?

2014-10-29 Thread Irek Fasikhov
Dear developers.

Very much want io priorities ;)
During the execution of Snap roollback appear slow queries.

Thanks
-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Scrub proces, IO performance

2014-10-28 Thread Irek Fasikhov
No. It appeared in 0.80.6, but there is a bug which is corrected in 0.80.8.
See: http://tracker.ceph.com/issues/9677
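
Once on a release that has it, the relevant settings look like this (they are
honoured only when the OSD disks use the CFQ I/O scheduler):

[osd]
osd_disk_thread_ioprio_class = idle
osd_disk_thread_ioprio_priority = 7

The active scheduler can be checked with cat /sys/block/sdX/queue/scheduler.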

2014-10-28 14:50 GMT+03:00 Mateusz Skała :

> Thanks for the reply, we are now using ceph 0.80.1 firefly; are these options
> available?
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Mateusz Skała
> *Sent:* Tuesday, October 28, 2014 9:27 AM
> *To:* ceph-us...@ceph.com
> *Subject:* [ceph-users] Scrub proces, IO performance
>
>
>
> Hello,
>
> We are using Ceph as a storage backend for KVM, used for hosting MS
> Windows RDP, Linux for web applications with MySQL database and file
> sharing from Linux. Wen scrub or deep-scrub process is active, RDP sessions
> are freezing for a few seconds and web applications have big replay
> latency.
>
> New we have disabled scrubbing  and deep-scrubbing process between  6AM -
> 10PM, when majority of users doesn't work, but user experience is still
> poor, like I write above. We are considering disabling scrubbing process at
> all. Does a new version 0.87 with addresses scrubbing priority is going to
> solve our problem (according to http://tracker.ceph.com/issues/6278)? Can
> we switch off scrubbing at all? How we can change our configuration to
> lower scrubbing performance impact? Does changing block size  can lower
> scrubbing impact or increase performance?
>
>
>
> Our Ceph cluster configuration :
>
>
>
> * we are using ~216 RBD disks for KVM VM's
>
> * ~11TB used, 3.593TB data, replica count 3
>
> * we have 5 mons, 32 OSD
>
> * 3 pools/ 4096pgs (only one - RBD in use)
>
> * 6 nodes (5osd+mon, 1 osd only) in two racks
>
> * 1 SATA disk for system, 1 SSD disk for journal and 4 or 6 SATA disk for
> OSD
>
> * 2 networks on 2 NIC 1Gbps (cluster + public)  on all nodes.
>
> * 2x 10GBps links between racks
>
> * without scrub max 45 iops
>
> * when scrub running 120 - 180 iops
>
>
>
>
>
> ceph.conf
>
>
>
> mon initial members = ceph35, ceph30, ceph20, ceph15, ceph10
>
> mon host = 10.20.8.35, 10.20.8.30, 10.20.8.20, 10.20.8.15, 10.20.8.10
>
>
>
> public network = 10.20.8.0/22
>
> cluster network = 10.20.4.0/22
>
>
>
> filestore xattr use omap = true
>
> filestore max sync interval = 15
>
>
>
> osd journal size = 10240
>
> osd pool default size = 3
>
> osd pool default min size = 1
>
> osd pool default pg num = 2048
>
> osd pool default pgp num = 2048
>
> osd crush chooseleaf type = 1
>
> osd recovery max active = 1
>
> osd recovery op priority = 1
>
> osd max backfills = 1
>
>
>
> auth cluster required = cephx
>
> auth service required = cephx
>
> auth client required = cephx
>
>
>
> rbd default format = 2
>
>
>
> Regards,
>
> Mateusz
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd_disk_thread_ioprio_class/_priorioty ignored?

2014-10-23 Thread Irek Fasikhov
Hi.
The necessary changes are already in git:
https://github.com/ceph/ceph/commit/86926c6089d63014dd770b4bb61fc7aca3998542

2014-10-23 16:42 GMT+04:00 Paweł Sadowski :

> On 10/23/2014 09:10 AM, Paweł Sadowski wrote:
> > Hi,
> >
> > I was trying to determine performance impact of deep-scrubbing with
> > osd_disk_thread_ioprio_class option set but it looks like it's ignored.
> > Performance (during deep-scrub) is the same with this options set or
> > left with defaults (1/3 of "normal" performance).
> >
> >
> > # ceph --admin-daemon /var/run/ceph/ceph-osd.26.asok config show  | grep
> > osd_disk_thread_ioprio
> >   "osd_disk_thread_ioprio_class": "idle",
> >   "osd_disk_thread_ioprio_priority": "7",
> >
> > # ps -efL | grep 'ce[p]h-osd --cluster=ceph -i 26' | awk '{ print $4; }'
> > | xargs --no-run-if-empty ionice -p | sort | uniq -c
> >  18 unknown: prio 0
> > 186 unknown: prio 4
> >
> > # cat /sys/class/block/sdf/queue/scheduler
> > noop deadline [cfq]
> >
> > And finallyGDB:
> >
> > Breakpoint 1, ceph_ioprio_string_to_class (s=...) at
> > common/io_priority.cc:48
> > warning: Source file is more recent than executable.
> > 48return IOPRIO_CLASS_IDLE;
> > (gdb) cont
> > Continuing.
> >
> > Breakpoint 2, OSD::set_disk_tp_priority (this=0x3398000) at
> osd/OSD.cc:8548
> > warning: Source file is more recent than executable.
> > 8548  disk_tp.set_ioprio(cls,
> > cct->_conf->osd_disk_thread_ioprio_priority);
> > (gdb) print cls
> > $1 = -22
> >
> > So the IO priorities are *NOT*set (cls >= 0). I'm not sure where this
> > -22 came from.Any ideas?
> > In the mean time I'll compile ceph from sources and check again.
> >
> >
> >
> > Ceph installed from Ceph repositories:
> >
> > # ceph-osd -v
> > ceph version 0.86 (97dcc0539dfa7dac3de74852305d51580b7b1f82)
> >
> > # apt-cache policy ceph
> > ceph:
> >   Installed: 0.86-1precise
> >   Candidate: 0.86-1precise
> >   Version table:
> >  *** 0.86-1precise 0
> > 500 http://eu.ceph.com/debian-giant/ precise/main amd64 Packages
> > 100 /var/lib/dpkg/status
>
> Following patch corrects problem:
>
> diff --git a/src/common/io_priority.cc b/src/common/io_priority
> index b9eeae8..4cd299a 100644
> --- a/src/common/io_priority.cc
> +++ b/src/common/io_priority.cc
> @@ -41,7 +41,7 @@ int ceph_ioprio_set(int whence, int who, int
>
>  int ceph_ioprio_string_to_class(const std::string& s)
>  {
> -  std::string l;
> +  std::string l(s);
>std::transform(s.begin(), s.end(), l.begin(), ::tolower);
>
>if (l == "idle")
>
>
> # ps -efL | grep 'ce[p]h-osd --cluster=ceph -i 26' | awk '{ print $4; }'
> | xargs --no-run-if-empty ionice -p | sort | uniq -c
>   1 idle
>   4 unknown: prio 0
> 183 unknown: prio 4
>
> Change to *best effort* (ceph tell osd.26 injectargs
> '--osd_disk_thread_ioprio_class be')
>
> # ps -efL | grep 'ce[p]h-osd --cluster=ceph -i 26' | awk '{ print $4; }'
> | xargs --no-run-if-empty ionice -p | sort | uniq -c
>   1 best-effort: prio 7
>   4 unknown: prio 0
> 183 unknown: prio 4
>
>
> --
> PS
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why performance of benchmarks with small blocks is extremely small?

2014-10-01 Thread Irek Fasikhov
Timur, read this thread:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg12486.html
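
As a parallel test from inside a VM (rather than a single-threaded 512-byte
rados bench), something like the fio run below is a closer match to what Ceph
is designed for; the numbers are only an illustration:

fio --name=randwrite --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 --size=1G --direct=1 --ioengine=libaio --group_reporting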


2014-10-01 16:24 GMT+04:00 Andrei Mikhailovsky :

> Timur,
>
> As far as I know, the latest master has a number of improvements for ssd
> disks. If you check the mailing list discussion from a couple of weeks
> back, you can see that the latest stable firefly is not that well optimised
> for ssd drives and IO is limited. However changes are being made to address
> that.
>
> I am well surprised that you can get 10K IOps as in my tests I was not
> getting over 3K IOPs on the ssd disks which are capable of doing 90K IOps.
>
> P.S. does anyone know if the ssd optimisation code will be added to the
> next maintenance release of firefy?
>
> Andrei
> --
>
> *From: *"Timur Nurlygayanov" 
> *To: *"Christian Balzer" 
> *Cc: *ceph-us...@ceph.com
> *Sent: *Wednesday, 1 October, 2014 1:11:25 PM
> *Subject: *Re: [ceph-users] Why performance of benchmarks with small
> blocks is extremely small?
>
>
> Hello Christian,
>
> Thank you for your detailed answer!
>
> I have other pre-production environment with 4 Ceph servers, 4 SSD disks
> per Ceph server (each Ceph OSD node on the separate SSD disk)
> Probably I should move journals to other disks or it is not required in my
> case?
>
> [root@ceph-node ~]# mount | grep ceph
> /dev/sdb4 on /var/lib/ceph/osd/ceph-0 type xfs
> (rw,noexec,nodev,noatime,nodiratime,inode64,logbsize=256k,delaylog,user_xattr,data=writeback)
> /dev/sde4 on /var/lib/ceph/osd/ceph-5 type xfs
> (rw,noexec,nodev,noatime,nodiratime,inode64,logbsize=256k,delaylog,user_xattr,data=writeback)
> /dev/sdd4 on /var/lib/ceph/osd/ceph-2 type xfs
> (rw,noexec,nodev,noatime,nodiratime,inode64,logbsize=256k,delaylog,user_xattr,data=writeback)
> /dev/sdc4 on /var/lib/ceph/osd/ceph-1 type xfs
> (rw,noexec,nodev,noatime,nodiratime,inode64,logbsize=256k,delaylog,user_xattr,data=writeback)
>
> [root@ceph-node ~]# find /var/lib/ceph/osd/ | grep journal
> /var/lib/ceph/osd/ceph-0/journal
> /var/lib/ceph/osd/ceph-5/journal
> /var/lib/ceph/osd/ceph-1/journal
> /var/lib/ceph/osd/ceph-2/journal
>
> My SSD disks have ~ 40k IOPS per disk, but on the VM I can see only ~ 10k
> - 14k IOPS for disks operations.
> To check this I execute the following command on VM with root partition
> mounted on disk in Ceph storage:
>
> root@test-io:/home/ubuntu# rm -rf /tmp/test && spew -d --write -r -b 4096
> 10M /tmp/test
> WTR:56506.22 KiB/s   Transfer time: 00:00:00IOPS:14126.55
>
> Is it expected result or I can improve the performance and get at least
> 30k-40k IOPS on the VM disks? (I have 2x 10Gb/s networks interfaces in LACP
> bonding for storage network, looks like network can't be the bottleneck).
>
> Thank you!
>
>
> On Wed, Oct 1, 2014 at 6:50 AM, Christian Balzer  wrote:
>
>>
>> Hello,
>>
>> [reduced to ceph-users]
>>
>> On Sat, 27 Sep 2014 19:17:22 +0400 Timur Nurlygayanov wrote:
>>
>> > Hello all,
>> >
>> > I installed OpenStack with Glance + Ceph OSD with replication factor 2
>> > and now I can see the write operations are extremely slow.
>> > For example, I can see only 0.04 MB/s write speed when I run rados bench
>> > with 512b blocks:
>> >
>> > rados bench -p test 60 write --no-cleanup -t 1 -b 512
>> >
>> There are 2 things wrong with that this test:
>>
>> 1. You're using rados bench, when in fact you should be testing from
>> within VMs. For starters a VM could make use of the rbd cache you enabled,
>> rados bench won't.
>>
>> 2. Given the parameters of this test you're testing network latency more
>> than anything else. If you monitor the Ceph nodes (atop is a good tool for
>> that), you will probably see that neither CPU nor disks resources are
>> being exhausted. With a single thread rados puts that tiny block of 512
>> bytes on the wire, the primary OSD for the PG has to write this to the
>> journal (on your slow, non-SSD disks) and send it to the secondary OSD,
>> which has to ACK the write to its journal back to the primary one, which
>> in turn then ACKs it to the client (rados bench) and then rados bench can
>> send the next packet.
>> You get the drift.
>>
>> Using your parameters I can get 0.17MB/s on a pre-production cluster
>> that uses 4xQDR Infiniband (IPoIB) connections, on my shitty test cluster
>> with 1GB/s links I get similar results to you, unsurprisingly.
>>
>> Ceph excels only with lots of parallelism, so an individual thread might
>> be slow (and in your case HAS to be slow, which has nothing to do with
>> Ceph per se) but many parallel ones will utilize the resources available.
>>
>> Having data blocks that are adequately sized (4MB, the default rados size)
>> will help for bandwidth and the rbd cache inside a properly configured VM
>> should make that happen.
>>
>> Of course in most real life scenarios you will run out of IOPS long before
>> you run out of bandwidth.
>>
>>
>> >  Maintaining 1 concurrent writes of 512 bytes for up to 60 seconds 

Re: [ceph-users] rbd export -> nc ->rbd import = memory leak

2014-09-26 Thread Irek Fasikhov
I created a task in: http://tracker.ceph.com/issues/9602

2014-09-26 15:54 GMT+04:00 Irek Fasikhov :

> message log:
> Sep 25 11:37:30 ct2 kernel: rbd invoked oom-killer: gfp_mask=0x280da,
> order=0, oom_adj=0, oom_score_adj=0
> Sep 25 11:37:30 ct2 kernel: rbd cpuset=/ mems_allowed=0-1
> Sep 25 11:37:30 ct2 kernel: Pid: 28217, comm: rbd Not tainted
> 2.6.32-431.el6.x86_64 #1
> Sep 25 11:37:30 ct2 kernel: Call Trace:
> Sep 25 11:37:30 ct2 kernel: [] ?
> cpuset_print_task_mems_allowed+0x91/0xb0
> Sep 25 11:37:30 ct2 kernel: [] ? dump_header+0x90/0x1b0
> Sep 25 11:37:30 ct2 kernel: [] ?
> security_real_capable_noaudit+0x3c/0x70
> Sep 25 11:37:30 ct2 kernel: [] ?
> oom_kill_process+0x82/0x2a0
> Sep 25 11:37:30 ct2 kernel: [] ?
> select_bad_process+0x9e/0x120
> Sep 25 11:37:30 ct2 kernel: [] ?
> out_of_memory+0x220/0x3c0
> Sep 25 11:37:30 ct2 kernel: [] ?
> __alloc_pages_nodemask+0x8ac/0x8d0
> Sep 25 11:37:30 ct2 kernel: [] ?
> alloc_pages_vma+0x9a/0x150
> Sep 25 11:37:30 ct2 kernel: [] ?
> handle_pte_fault+0x73d/0xb00
> Sep 25 11:37:30 ct2 kernel: [] ? pte_alloc_one+0x37/0x50
> Sep 25 11:37:30 ct2 kernel: [] ?
> do_huge_pmd_anonymous_page+0xb9/0x3b0
> Sep 25 11:37:30 ct2 kernel: [] ?
> handle_mm_fault+0x22a/0x300
> Sep 25 11:37:30 ct2 kernel: [] ?
> __do_page_fault+0x138/0x480
> Sep 25 11:37:30 ct2 kernel: [] ?
> do_mmap_pgoff+0x335/0x380
> Sep 25 11:37:30 ct2 kernel: [] ? do_page_fault+0x3e/0xa0
> Sep 25 11:37:30 ct2 kernel: [] ? page_fault+0x25/0x30
> Sep 25 11:37:30 ct2 kernel: Mem-Info:
> Sep 25 11:37:30 ct2 kernel: Node 0 DMA per-cpu:
> Sep 25 11:37:30 ct2 kernel: CPU0: hi:0, btch:   1 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU1: hi:0, btch:   1 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU2: hi:0, btch:   1 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU3: hi:0, btch:   1 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU4: hi:0, btch:   1 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU5: hi:0, btch:   1 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU6: hi:0, btch:   1 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU7: hi:0, btch:   1 usd:   0
> Sep 25 11:37:30 ct2 kernel: Node 0 DMA32 per-cpu:
> Sep 25 11:37:30 ct2 kernel: CPU0: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU1: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU2: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU3: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU4: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU5: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU6: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU7: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: Node 0 Normal per-cpu:
> Sep 25 11:37:30 ct2 kernel: CPU0: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU1: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU2: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU3: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU4: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU5: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU6: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU7: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: Node 1 Normal per-cpu:
> Sep 25 11:37:30 ct2 kernel: CPU0: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU1: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU2: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU3: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU4: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU5: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU6: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: CPU7: hi:  186, btch:  31 usd:   0
> Sep 25 11:37:30 ct2 kernel: active_anon:11237039 inactive_anon:870686
> isolated_anon:0
> Sep 25 11:37:30 ct2 kernel: active_file:981 inactive_file:1095
> isolated_file:0
> Sep 25 11:37:30 ct2 kernel: unevictable:0 dirty:68 writeback:971 unstable:0
> Sep 25 11:37:30 ct2 kernel: free:47328 slab_reclaimable:10413
> slab_unreclaimable:35885
> Sep 25 11:37:30 ct2 kernel: mapped:1017 shmem:1 pagetables:39376 bounce:0
> Sep 25 11:37:30 ct2 kernel: Node 0 DMA free:15740kB min:24kB low:28kB
> high:36kB active_anon:0kB inactive_anon:0kB active_file:0kB
> inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
> present:15352kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
> slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB
> unstable:0kB bounce:0kB writ

Re: [ceph-users] rbd export -> nc ->rbd import = memory leak

2014-09-26 Thread Irek Fasikhov
 -1000 udevd
Sep 25 11:37:30 ct2 kernel: [ 1590] 0  159063856   38   1
0 0 rsyslogd
Sep 25 11:37:30 ct2 kernel: [ 1737] 0  173720318   24   6
0 0 master
Sep 25 11:37:30 ct2 kernel: [ 1744]89  174420381   19   0
0 0 qmgr
Sep 25 11:37:30 ct2 kernel: [ 1747] 0  174729333   24   0
0 0 crond
Sep 25 11:37:30 ct2 kernel: [ 1760] 0  1760142111   0
0 0 login
Sep 25 11:37:30 ct2 kernel: [ 1762] 0  1762 10161   5
0 0 mingetty
Sep 25 11:37:30 ct2 kernel: [ 1764] 0  1764 10161   3
0 0 mingetty
Sep 25 11:37:30 ct2 kernel: [ 1766] 0  1766 10161   0
0 0 mingetty
Sep 25 11:37:30 ct2 kernel: [ 1768] 0  1768 10161   6
0 0 mingetty
Sep 25 11:37:30 ct2 kernel: [ 1770] 0  1770 10161   4
0 0 mingetty
Sep 25 11:37:30 ct2 kernel: [ 1784] 0  178427076   25   5
0 0 bash
Sep 25 11:37:30 ct2 kernel: [ 4915] 0  4915 22801   4
0 0 dhclient
Sep 25 11:37:30 ct2 kernel: [ 4958] 0  495825089   37   2
0 0 sshd
Sep 25 11:37:30 ct2 kernel: [ 4962] 0  496227101   12   3
0 0 bash
Sep 25 11:37:30 ct2 kernel: [ 5025] 0  5025166510   4
-17 -1000 sshd
Sep 25 11:37:30 ct2 kernel: [ 5688] 0  5688 6910   10   1
-17 -1000 auditd
Sep 25 11:37:30 ct2 kernel: [17539] 0 17539252660   1
0 0 sfcbd
Sep 25 11:37:30 ct2 kernel: [17540] 0 17540201440   1
0 0 sfcbd
Sep 25 11:37:30 ct2 kernel: [17542] 0 17542232641   5
0 0 sfcbd
Sep 25 11:37:30 ct2 kernel: [17543] 0 17543396091   1
0 0 sfcbd
Sep 25 11:37:30 ct2 kernel: [17932] 0 17932396091   3
0 0 sfcbd
Sep 25 11:37:30 ct2 kernel: [17937] 0 17937396031   2
0 0 sfcbd
Sep 25 11:37:30 ct2 kernel: [17940] 0 17940401231   0
0 0 sfcbd
Sep 25 11:37:30 ct2 kernel: [18632] 0 18632   291742  269   2
0 0 dsm_sa_datamgrd
Sep 25 11:37:30 ct2 kernel: [18780] 0 1878073312   86   4
0 0 dsm_sa_eventmgr
Sep 25 11:37:30 ct2 kernel: [18796] 0 18796   171833   21   0
0 0 dsm_sa_datamgrd
Sep 25 11:37:30 ct2 kernel: [18842] 0 18842   109528  108   4
0 0 dsm_sa_snmpd
Sep 25 11:37:30 ct2 kernel: [19147] 0 19147332511   0
0 0 dsm_om_connsvcd
Sep 25 11:37:30 ct2 kernel: [19148] 0 19148   83139128062   1
0 0 dsm_om_connsvcd
Sep 25 11:37:30 ct2 kernel: [38380] 0 38380250914   4
0 0 sshd
Sep 25 11:37:30 ct2 kernel: [38386] 0 38386270764   1
0 0 bash
Sep 25 11:37:30 ct2 kernel: [28907] 0 28907 28830   5
-17 -1000 udevd
Sep 25 11:37:30 ct2 kernel: [28908] 0 28908 28560   7
-17 -1000 udevd
Sep 25 11:37:30 ct2 kernel: [44087] 0 44087245737   0
0 0 sshd
Sep 25 11:37:30 ct2 kernel: [44090] 0 44090271198   1
0 0 bash
Sep 25 11:37:30 ct2 kernel: [34590]38 34590 6624   34   2
0 0 ntpd
Sep 25 11:37:30 ct2 kernel: [34920] 0 349208408011471   3
0 0 ceph-mon
Sep 25 11:37:30 ct2 kernel: [35674] 0 35674   26818542473   1
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [43044] 0 43044   24747139531   5
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [ 7125] 0  7125   27094241559   1
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [20854] 0 20854   25166640220   0
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [21799] 0 21799   25064129693   5
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [22800] 0 22800   27500433343   0
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [23509] 0 23509   26211328210   4
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [24445] 0 24445   25967233212   0
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [25146] 0 25146   26643735462   2
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [25889] 0 25889   25785434321   0
0 0 ceph-osd
Sep 25 11:37:30 ct2 kernel: [27977] 0 27977   166624  482   2
0 0 python
Sep 25 11:37:30 ct2 kernel: [28216] 0 28216 1918   12   6
0 0 nc
Sep 25 11:37:30 ct2 kernel: [28636]89 2863620344  127   4
0 0 pickup


2014-09-26 15:48 GMT+04:00 Irek Fasikhov :

> OS: CentOS 6.5
> Kernel: 2.6.32-431.el6.x86_64
> Ceph --version: ceph version 0.72.2
> (a913ded2ff138aefb8cb84d347d72164099cfd60)
>
>
>
> 2014-09-26 15:44 GMT+04:00 Irek Fasikhov :
>
>> H

Re: [ceph-users] rbd export -> nc ->rbd import = memory leak

2014-09-26 Thread Irek Fasikhov
OS: CentOS 6.5
Kernel: 2.6.32-431.el6.x86_64
Ceph --version: ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60)



2014-09-26 15:44 GMT+04:00 Irek Fasikhov :

> Hi, All.
>
> I see a memory leak when importing a raw device.
>
> Export Scheme:
> [rbd@rbdbackup ~]$ rbd --no-progress -n client.rbdbackup -k
> /etc/ceph/big.keyring -c /etc/ceph/big.conf export rbdtest/vm-111-disk-1 -
> | nc 10.43.255.252 12345
>
> [root@ct2 ~]# nc -l 12345 | rbd import --no-progress --image-format 2 -
> rbd/vm-111-disk-1
>
> This is the same problem with ssh
>
> Memory usage, see the screenshots:
>
> https://drive.google.com/folderview?id=0BxoNLVWxzOJWSHlTSEZvM3lkQXM&usp=sharing
>
> --
> Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
> Mobile: +79229045757
>



-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rbd export -> nc ->rbd import = memory leak

2014-09-26 Thread Irek Fasikhov
Hi, All.

I see a memory leak when importing a raw device.

Export Scheme:
[rbd@rbdbackup ~]$ rbd --no-progress -n client.rbdbackup -k
/etc/ceph/big.keyring -c /etc/ceph/big.conf export rbdtest/vm-111-disk-1 -
| nc 10.43.255.252 12345

[root@ct2 ~]# nc -l 12345 | rbd import --no-progress --image-format 2 -
rbd/vm-111-disk-1

This is the same problem with ssh

Memory usage, see the screenshots:
https://drive.google.com/folderview?id=0BxoNLVWxzOJWSHlTSEZvM3lkQXM&usp=sharing

-- 
Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mobile: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-24 Thread Irek Fasikhov
osd_op(client.4625.1:9005787)
.


This is due to external factors. For example, the network settings.

2014-09-25 10:05 GMT+04:00 Udo Lembke :

> Hi again,
> sorry - forgot my post... see
>
> osdmap e421: 9 osds: 9 up, 9 in
>
> shows that all your 9 osds are up!
>
> Do you have trouble with your journal/filesystem?
>
> Udo
>
> Am 25.09.2014 08:01, schrieb Udo Lembke:
> > Hi,
> > looks that some osds are down?!
> >
> > What is the output of "ceph osd tree"
> >
> > Udo
> >
> > Am 25.09.2014 04:29, schrieb Aegeaner:
> >> The cluster healthy state is WARN:
> >>
> >>  health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs
> >> incomplete; 28 pgs peering; 292 pgs stale; 87 pgs stuck inactive;
> >> 292 pgs stuck stale; 205 pgs stuck unclean; 22 requests are blocked
> >> > 32 sec; recovery 12474/46357 objects degraded (26.909%)
> >>  monmap e3: 3 mons at
> >> {CVM-0-mon01=
> 172.18.117.146:6789/0,CVM-0-mon02=172.18.117.152:6789/0,CVM-0-mon03=172.18.117.153:6789/0
> },
> >> election epoch 24, quorum 0,1,2 CVM-0-mon01,CVM-0-mon02,CVM-0-mon03
> >>  osdmap e421: 9 osds: 9 up, 9 in
> >>   pgmap v2261: 292 pgs, 4 pools, 91532 MB data, 23178 objects
> >> 330 MB used, 3363 GB / 3363 GB avail
> >> 12474/46357 objects degraded (26.909%)
> >>   20 stale+peering
> >>   87 stale+active+clean
> >>8 stale+down+peering
> >>   59 stale+incomplete
> >>  118 stale+active+degraded
> >>
> >>
> >> What does these errors mean? Can these PGs be recovered?
> >>
> >>
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
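
A minimal sketch of commands commonly used to narrow down stuck PGs like the ones above (the PG id is an example, not taken from this cluster):

# List PGs stuck in stale / inactive / unclean states
ceph pg dump_stuck stale
ceph pg dump_stuck inactive
ceph pg dump_stuck unclean

# Inspect a single problematic PG in detail (example PG id)
ceph pg 2.5 query

# Check which OSDs are blamed for blocked requests
ceph health detail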


Re: [ceph-users] bug: ceph-deploy does not support jumbo frame

2014-09-24 Thread Irek Fasikhov
Have you configured the switch?

2014-09-25 5:07 GMT+04:00 yuelongguang :

> hi,all
> after I set mtu=9000, ceph-deploy waits for a reply all the time at 'detecting
> platform for host.'
>
> how can I find out which commands ceph-deploy needs that OSD host to run?
>
> thanks
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
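
A quick sketch for verifying that jumbo frames actually work end to end before blaming ceph-deploy (the interface name and peer address are assumptions):

# Set the MTU on the interface (the switch ports must also be configured for it)
ip link set dev eth0 mtu 9000

# Verify with a non-fragmenting ping: 8972 = 9000 - 20 (IP) - 8 (ICMP) bytes
ping -M do -s 8972 -c 3 10.0.0.2

# If this ping fails while a normal ping works, the switch (or a hop in between)
# is not passing jumbo frames and ceph-deploy/SSH traffic will appear to hang.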


Re: [ceph-users] Rebalancing slow I/O.

2014-09-24 Thread Irek Fasikhov
Hi, Andrei.

Thanks for the tip, but there are problems with reads on some VMs.
Can the parameter (osd_recover_clone_overlap) cause the RBD to lock up?

Thanks!

2014-09-11 16:49 GMT+04:00 Andrei Mikhailovsky :

> Irek,
>
> have you changed the ceph.conf file to change the recovery priority?
>
> Options like these might help with prioritising repair/rebuild IO against the
> client IO:
>
> osd_recovery_max_chunk = 8388608
> osd_recovery_op_priority = 2
> osd_max_backfills = 1
> osd_recovery_max_active = 1
> osd_recovery_threads = 1
>
>
> Andrei
> ------
> *From: *"Irek Fasikhov" 
> *To: *ceph-users@lists.ceph.com
> *Sent: *Thursday, 11 September, 2014 1:07:06 PM
> *Subject: *[ceph-users] Rebalancing slow I/O.
>
>
> Hi,All.
>
> DELL R720X8,96 OSDs, Network 2x10Gbit LACP.
>
> When one of the nodes crashes, I get very slow I / O operations on virtual
> machines.
> A cluster map by default.
> [ceph@ceph08 ~]$ ceph osd tree
> # idweight  type name   up/down reweight
> -1  262.1   root defaults
> -2  32.76   host ceph01
> 0   2.73osd.0   up  1
> ...
> 11  2.73osd.11  up  1
> -3  32.76   host ceph02
> 13  2.73osd.13  up  1
> ..
> 12  2.73osd.12  up  1
> -4  32.76   host ceph03
> 24  2.73osd.24  up  1
> 
> 35  2.73osd.35  up  1
> -5  32.76   host ceph04
> 37  2.73osd.37  up  1
> .
> 47  2.73osd.47  up  1
> -6  32.76   host ceph05
> 48  2.73osd.48  up  1
> ...
> 59  2.73osd.59  up  1
> -7  32.76   host ceph06
> 60  2.73osd.60  down0
> ...
> 71  2.73osd.71  down0
> -8  32.76   host ceph07
> 72  2.73osd.72  up  1
> 
> 83  2.73osd.83  up  1
> -9  32.76   host ceph08
> 84  2.73osd.84  up  1
> 
> 95  2.73osd.95  up  1
>
>
> If I change the cluster map on the following:
> root---|
>   |
>   |-rack1
>   ||
>   |host ceph01
>   |host ceph02
>   |host ceph03
>   |host ceph04
>   |
>   |---rack2
>|
>   host ceph05
>   host ceph06
>   host ceph07
>   host ceph08
> What will povidenie cluster failover one node? And how much will it affect
> the performance?
> Thank you
>
> --
> С уважением, Фасихов Ирек Нургаязович
> Моб.: +79229045757
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
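
A hedged sketch of applying recovery throttling like the options quoted above at runtime, without restarting the OSDs (values are the ones from the thread; injectargs changes are not persistent across restarts):

# Lower recovery/backfill pressure on all OSDs of the running cluster
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 2'

# To make the same settings persistent, add them to the [osd] section of
# ceph.conf, as in the list of options quoted in the message above.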


[ceph-users] Rebalancing slow I/O.

2014-09-11 Thread Irek Fasikhov
Hi,All.

DELL R720X8,96 OSDs, Network 2x10Gbit LACP.

When one of the nodes crashes, I get very slow I/O operations on the virtual
machines.
The cluster map is the default one.
[ceph@ceph08 ~]$ ceph osd tree
# id    weight  type name       up/down reweight
-1      262.1   root defaults
-2      32.76   host ceph01
0       2.73    osd.0   up      1
...
11      2.73    osd.11  up      1
-3      32.76   host ceph02
13      2.73    osd.13  up      1
..
12      2.73    osd.12  up      1
-4      32.76   host ceph03
24      2.73    osd.24  up      1

35      2.73    osd.35  up      1
-5      32.76   host ceph04
37      2.73    osd.37  up      1
.
47      2.73    osd.47  up      1
-6      32.76   host ceph05
48      2.73    osd.48  up      1
...
59      2.73    osd.59  up      1
-7      32.76   host ceph06
60      2.73    osd.60  down    0
...
71      2.73    osd.71  down    0
-8      32.76   host ceph07
72      2.73    osd.72  up      1

83      2.73    osd.83  up      1
-9      32.76   host ceph08
84      2.73    osd.84  up      1

95      2.73    osd.95  up      1


If I change the cluster map to the following:
root---|
  |
  |-rack1
  ||
  |host ceph01
  |host ceph02
  |host ceph03
  |host ceph04
  |
  |---rack2
   |
  host ceph05
  host ceph06
  host ceph07
  host ceph08
How will the cluster behave when one node fails? And how much will it affect
the performance?
Thank you

-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
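
A minimal sketch of how such a rack-level layout can be built with the CRUSH CLI (bucket and host names follow the tree above; try it on a test cluster first, since moving hosts triggers data movement):

# Create the two rack buckets and attach them to the root
ceph osd crush add-bucket rack1 rack
ceph osd crush add-bucket rack2 rack
ceph osd crush move rack1 root=defaults
ceph osd crush move rack2 root=defaults

# Move the hosts under their racks
ceph osd crush move ceph01 rack=rack1
ceph osd crush move ceph02 rack=rack1
ceph osd crush move ceph03 rack=rack1
ceph osd crush move ceph04 rack=rack1
ceph osd crush move ceph05 rack=rack2
ceph osd crush move ceph06 rack=rack2
ceph osd crush move ceph07 rack=rack2
ceph osd crush move ceph08 rack=rack2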


Re: [ceph-users] enrich ceph test methods, what is your concern about ceph. thanks

2014-08-26 Thread Irek Fasikhov
Sorry..Enter pressed :)

continued...
no, it's not the only way to test; it depends on what you want to use Ceph for


2014-08-26 15:22 GMT+04:00 Irek Fasikhov :

> For me, the bottleneck is single-threaded operation. The recording will
> have more or less solved with the inclusion of rbd cache, but there are
> problems with reading. But I think that these problems can be solved cache
> pool, but have not tested.
>
> It follows that the more threads, the greater the speed of reading and
> writing. But in reality it is different.
>
> The speed and number of operations, depending on many factors, such as
> network latency.
>
> Examples testing, special attention to the charts:
>
>
> https://software.intel.com/en-us/blogs/2013/10/25/measure-ceph-rbd-performance-in-a-quantitative-way-part-i
> and
>
> https://software.intel.com/en-us/blogs/2013/11/20/measure-ceph-rbd-performance-in-a-quantitative-way-part-ii
>
>
> 2014-08-26 15:11 GMT+04:00 yuelongguang :
>
>
>> thanks Irek Fasikhov.
>> is it the only way to test ceph-rbd?  and an important aim of the test is
>> to find where  the bottleneck is.   qemu/librbd/ceph.
>> could you share your test result with me?
>>
>>
>>
>> thanks
>>
>>
>>
>>
>>
>>
>> At 2014-08-26 04:22:22, "Irek Fasikhov"  wrote:
>>
>> Hi.
>> I and many people use fio.
>> For ceph rbd has a special engine:
>> https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
>>
>>
>> 2014-08-26 12:15 GMT+04:00 yuelongguang :
>>
>>> hi,all
>>>
>>> i am planning to do a test on ceph, include performance, throughput,
>>> scalability,availability.
>>> in order to get a full test result, i  hope you all can give me some
>>> advice. meanwhile i can send the result to you,if you like.
>>> as for each category test( performance, throughput,
>>> scalability,availability)  ,  do you have some some test idea and test
>>> tools?
>>> basicly i have know some tools to test throughtput,iops .  but you can
>>> tell the tools you prefer and the result you expect.
>>>
>>> thanks very much
>>>
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>
>>
>> --
>> С уважением, Фасихов Ирек Нургаязович
>> Моб.: +79229045757
>>
>>
>>
>>
>
>
> --
> С уважением, Фасихов Ирек Нургаязович
> Моб.: +79229045757
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
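
For completeness, a minimal sketch of enabling the RBD writeback cache mentioned above on the client side (the values are the common defaults and only examples; QEMU's disk cache mode must also allow writeback for this to take effect):

cat >> /etc/ceph/ceph.conf <<'EOF'
[client]
rbd cache = true
rbd cache writethrough until flush = true
# example sizes: 32 MB cache, 24 MB max dirty
rbd cache size = 33554432
rbd cache max dirty = 25165824
EOF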


Re: [ceph-users] enrich ceph test methods, what is your concern about ceph. thanks

2014-08-26 Thread Irek Fasikhov
For me, the bottleneck is single-threaded operation. Writes are more or less
solved by enabling the rbd cache, but there are still problems with reads. I
think those problems can be solved with a cache pool, but I have not tested it.

In theory, the more threads, the greater the read and write speed. But in
reality it is different.

The speed and number of operations depend on many factors, such as
network latency.

Examples testing, special attention to the charts:

https://software.intel.com/en-us/blogs/2013/10/25/measure-ceph-rbd-performance-in-a-quantitative-way-part-i
and
https://software.intel.com/en-us/blogs/2013/11/20/measure-ceph-rbd-performance-in-a-quantitative-way-part-ii


2014-08-26 15:11 GMT+04:00 yuelongguang :

>
> Thanks, Irek Fasikhov.
> Is it the only way to test ceph-rbd? An important aim of the test is
> to find where the bottleneck is: qemu/librbd/ceph.
> Could you share your test results with me?
>
>
>
> thanks
>
>
>
>
>
>
> At 2014-08-26 04:22:22, "Irek Fasikhov"  wrote:
>
> Hi.
> I and many people use fio.
> For ceph rbd has a special engine:
> https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
>
>
> 2014-08-26 12:15 GMT+04:00 yuelongguang :
>
>> hi,all
>>
>> i am planning to do a test on ceph, include performance, throughput,
>> scalability,availability.
>> in order to get a full test result, i  hope you all can give me some
>> advice. meanwhile i can send the result to you,if you like.
>> as for each category test( performance, throughput,
>> scalability,availability)  ,  do you have some some test idea and test
>> tools?
>> basicly i have know some tools to test throughtput,iops .  but you can
>> tell the tools you prefer and the result you expect.
>>
>> thanks very much
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
> С уважением, Фасихов Ирек Нургаязович
> Моб.: +79229045757
>
>
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph monitor load, low performance

2014-08-26 Thread Irek Fasikhov
I'm sorry, of course I meant the journals :)


2014-08-26 13:16 GMT+04:00 Mateusz Skała :

> Do you mean to move /var/log/ceph/* to an SSD disk?
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
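
A hedged sketch of moving an existing OSD journal to an SSD partition (osd.0 and the partition path are examples; stop the OSD first and make sure the SSD partition is dedicated to this journal):

# Stop the OSD and flush its current journal
service ceph stop osd.0
ceph-osd -i 0 --flush-journal

# Point the OSD at the new journal device (symlink approach)
rm /var/lib/ceph/osd/ceph-0/journal
ln -s /dev/sdg1 /var/lib/ceph/osd/ceph-0/journal

# Create the new journal and start the OSD again
ceph-osd -i 0 --mkjournal
service ceph start osd.0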


Re: [ceph-users] Ceph monitor load, low performance

2014-08-26 Thread Irek Fasikhov
Move the logs onto the SSD and you will immediately increase performance; about
50% of the performance is lost on the logs. And for three replicas it is
recommended to have more than 5 hosts.


2014-08-26 12:17 GMT+04:00 Mateusz Skała :

>
> Hi thanks for reply.
>
>
>
>> From the top of my head, it is recommended to use 3 mons in
>> production. Also, for the 22 osds your number of PGs looks a bit low,
>> you should look at that.
>>
> I get it from http://ceph.com/docs/master/rados/operations/placement-
> groups/
>
> (22osd's * 100)/3 replicas = 733, ~1024 pgs
> Please correct me if I'm wrong.
>
> It will be 5 mons (on 6 hosts) but now we must migrate some data from used
> servers.
>
>
>
>
>> The performance of the cluster is poor - this is too vague. What is
>> your current performance, what benchmarks have you tried, what is your
>> data workload and most importantly, how is your cluster setup. what
>> disks, ssds, network, ram, etc.
>>
>> Please provide more information so that people could help you.
>>
>> Andrei
>>
>
> Hardware informations:
> ceph15:
> RAM: 4GB
> Network: 4x 1GB NIC
> OSD disk's:
> 2x SATA Seagate ST31000524NS
> 2x SATA WDC WD1003FBYX-18Y7B0
>
> ceph25:
> RAM: 16GB
> Network: 4x 1GB NIC
> OSD disk's:
> 2x SATA WDC WD7500BPKX-7
> 2x SATA WDC WD7500BPKX-2
> 2x SATA SSHD ST1000LM014-1EJ164
>
> ceph30
> RAM: 16GB
> Network: 4x 1GB NIC
> OSD disks:
> 6x SATA SSHD ST1000LM014-1EJ164
>
> ceph35:
> RAM: 16GB
> Network: 4x 1GB NIC
> OSD disks:
> 6x SATA SSHD ST1000LM014-1EJ164
>
>
> All journals are on OSD's. 2 NIC are for backend network (10.20.4.0/22)
> and 2 NIC are for frontend (10.20.8.0/22).
>
> We use this cluster as a storage backend for <100 VMs on KVM. I haven't run
> benchmarks, but all VMs were migrated from Xen+GlusterFS(NFS). Before the
> migration every VM was running fine; now each VM hangs for a few seconds from
> time to time and applications on the VMs take much longer to load. GlusterFS
> was running on 2 servers with 1x 1GB NIC and 2x8 disks WDC WD7500BPKX-7.
>
> I made one test with recovery: if a disk is marked out, recovery IO is
> 150-200MB/s but all VMs hang until the recovery ends.
>
> The biggest load is on ceph35: IOPS on each disk are near 150, CPU load ~4-5.
> On the other hosts CPU load is <2, with 120~130 IOPS.
>
> Our ceph.conf
>
> ===
> [global]
>
> fsid=a9d17295-62f2-46f6-8325-1cad7724e97f
> mon initial members = ceph35, ceph30, ceph25, ceph15
> mon host = 10.20.8.35, 10.20.8.30, 10.20.8.25, 10.20.8.15
> public network = 10.20.8.0/22
> cluster network = 10.20.4.0/22
> osd journal size = 1024
> filestore xattr use omap = true
> osd pool default size = 3
> osd pool default min size = 1
> osd pool default pg num = 1024
> osd pool default pgp num = 1024
> osd crush chooseleaf type = 1
> auth cluster required = cephx
> auth service required = cephx
> auth client required = cephx
> rbd default format = 2
>
> ##ceph35 osds
> [osd.0]
> cluster addr = 10.20.4.35
> [osd.1]
> cluster addr = 10.20.4.35
> [osd.2]
> cluster addr = 10.20.4.35
> [osd.3]
> cluster addr = 10.20.4.36
> [osd.4]
> cluster addr = 10.20.4.36
> [osd.5]
> cluster addr = 10.20.4.36
>
> ##ceph25 osds
> [osd.6]
> cluster addr = 10.20.4.25
> public addr = 10.20.8.25
> [osd.7]
> cluster addr = 10.20.4.25
> public addr = 10.20.8.25
> [osd.8]
> cluster addr = 10.20.4.25
> public addr = 10.20.8.25
> [osd.9]
> cluster addr = 10.20.4.26
> public addr = 10.20.8.26
> [osd.10]
> cluster addr = 10.20.4.26
> public addr = 10.20.8.26
> [osd.11]
> cluster addr = 10.20.4.26
> public addr = 10.20.8.26
>
> ##ceph15 osds
> [osd.12]
> cluster addr = 10.20.4.15
> public addr = 10.20.8.15
> [osd.13]
> cluster addr = 10.20.4.15
> public addr = 10.20.8.15
> [osd.14]
> cluster addr = 10.20.4.15
> public addr = 10.20.8.15
> [osd.15]
> cluster addr = 10.20.4.16
> public addr = 10.20.8.16
>
> ##ceph30 osds
> [osd.16]
> cluster addr = 10.20.4.30
> public addr = 10.20.8.30
> [osd.17]
> cluster addr = 10.20.4.30
> public addr = 10.20.8.30
> [osd.18]
> cluster addr = 10.20.4.30
> public addr = 10.20.8.30
> [osd.19]
> cluster addr = 10.20.4.31
> public addr = 10.20.8.31
> [osd.20]
> cluster addr = 10.20.4.31
> public addr = 10.20.8.31
> [osd.21]
> cluster addr = 10.20.4.31
> public addr = 10.20.8.31
>
> [mon.ceph35]
> host = ceph35
> mon addr = 10.20.8.35:6789
> [mon.ceph30]
> host = ceph30
> mon addr = 10.20.8.30:6789
> [mon.ceph25]
> host = ceph25
> mon addr = 10.20.8.25:6789
> [mon.ceph15]
> host = ceph15
> mon addr = 10.20.8.15:6789
> 
>
> Regards,
>
> Mateusz
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
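
As a worked version of the PG calculation quoted above (the 100-PGs-per-OSD rule of thumb, rounded up to the next power of two):

# (number of OSDs * 100) / replica count, then round up to a power of two
echo $(( (22 * 100) / 3 ))     # -> 733
# the next power of two above 733 is 1024, hence pg_num = pgp_num = 1024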


Re: [ceph-users] enrich ceph test methods, what is your concern about ceph. thanks

2014-08-26 Thread Irek Fasikhov
Hi.
I and many people use fio.
For ceph rbd has a special engine:
https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html


2014-08-26 12:15 GMT+04:00 yuelongguang :

> hi,all
>
> I am planning to do a test on Ceph, covering performance, throughput,
> scalability and availability.
> In order to get a full test result, I hope you all can give me some
> advice; meanwhile I can send the result to you, if you like.
> For each test category (performance, throughput,
> scalability, availability), do you have some test ideas and test
> tools?
> Basically I know some tools to test throughput and IOPS, but you can
> tell me the tools you prefer and the results you expect.
>
> thanks very much
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
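
A minimal job file for the fio rbd engine linked above, as a sketch (pool, image and client names are assumptions, and the image must already exist):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio-test
invalidate=0
rw=randwrite
bs=4k
iodepth=32
direct=1
time_based=1
runtime=60

[rbd-4k-randwrite]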


Re: [ceph-users] question about getting rbd.ko and ceph.ko

2014-08-26 Thread Irek Fasikhov
Hi

There is no module support before kernel 2.6.37, and it is not recommended to use them anyway.

But you can use http://elrepo.org/tiki/kernel-ml


2014-08-26 11:56 GMT+04:00 yuelongguang :

> hi,all
>
> is there a way to get rbd.ko and ceph.ko for CentOS 6.X,
>
> or do I have to build them from source code? What is the minimum kernel
> version?
>
> thanks
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
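
A hedged sketch of installing a mainline kernel from ELRepo on CentOS 6 to get usable rbd.ko/ceph.ko (the release RPM version changes over time, so check elrepo.org for the current one):

# Import the ELRepo signing key and install the repository definition
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

# Install the mainline (kernel-ml) kernel, then update the grub default and reboot;
# rbd.ko and ceph.ko ship with this kernel
yum --enablerepo=elrepo-kernel install kernel-ml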


Re: [ceph-users] How to calculate necessary disk amount

2014-08-22 Thread Irek Fasikhov
node1: 4[TB], node2: 4[TB], node3: 4[TB] :)
On 22 Aug 2014 at 12:53, "idzzy"  wrote:

> Hi Irek,
>
> Understood.
>
> Let me ask about only this.
>
> > No, it's for the entire cluster.
>
> Does this mean that the total disk capacity across all nodes must be more than 11.8
> TB?
> e.g  node1: 4[TB], node2: 4[TB], node3: 4[TB]
>
> not each node.
> e.g  node1: 11.8[TB], node2: 11.8[TB], node3:11.8 [TB]
>
> Thank you.
>
>
> On August 22, 2014 at 5:06:02 PM, Irek Fasikhov (malm...@gmail.com) wrote:
>
> I recommend you use replication, because radosgw uses asynchronous
> replication.
>
> Yes divided by nearfull ratio.
> No, it's for the entire cluster.
>
>
> 2014-08-22 11:51 GMT+04:00 idzzy :
>
>>  Hi,
>>
>>  If not use replication, Is it only to divide by nearfull_ratio?
>>  (does only radosgw support replication?)
>>
>> 10T/0.85 = 11.8 TB of each node?
>>
>>  # ceph pg dump | egrep "full_ratio|nearfulll_ratio"
>>  full_ratio 0.95
>> nearfull_ratio 0.85
>>
>>  Sorry I’m not familiar with ceph architecture.
>>  Thanks for the reply.
>>
>>  —
>>  idzzy
>>
>> On August 22, 2014 at 3:53:21 PM, Irek Fasikhov (malm...@gmail.com)
>> wrote:
>>
>>  Hi.
>>
>> 10ТB*2/0.85 ~= 24 TB with two replications, total volume for the raw data.
>>
>>
>>
>>
>
>
> --
> С уважением, Фасихов Ирек Нургаязович
> Моб.: +79229045757
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to calculate necessary disk amount

2014-08-22 Thread Irek Fasikhov
I recommend you use replication, because radosgw uses asynchronous
replication.

Yes divided by nearfull ratio.
No, it's for the entire cluster.


2014-08-22 11:51 GMT+04:00 idzzy :

> Hi,
>
> If I don't use replication, do I only divide by the nearfull_ratio?
> (does only radosgw support replication?)
>
> 10T/0.85 = 11.8 TB for each node?
>
> # ceph pg dump | egrep "full_ratio|nearfulll_ratio"
> full_ratio 0.95
> nearfull_ratio 0.85
>
> Sorry I’m not familiar with ceph architecture.
> Thanks for the reply.
>
> —
> idzzy
>
> On August 22, 2014 at 3:53:21 PM, Irek Fasikhov (malm...@gmail.com) wrote:
>
> Hi.
>
> 10ТB*2/0.85 ~= 24 TB with two replications, total volume for the raw data.
>
>
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to calculate necessary disk amount

2014-08-21 Thread Irek Fasikhov
Hi.

10TB*2/0.85 ~= 24 TB with two replicas, the total volume of raw data.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
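
The same calculation written out, as a sketch (usable data, replica count and the 0.85 nearfull ratio are the numbers from the thread):

# raw capacity needed = usable data * replicas / nearfull_ratio
echo "10 * 2 / 0.85" | bc -l     # ~23.5 TB raw across the whole cluster
echo "10 * 1 / 0.85" | bc -l     # ~11.8 TB raw if no replication is used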


Re: [ceph-users] mounting RBD in linux containers

2014-08-10 Thread Irek Fasikhov
dmesg output please.


2014-08-11 2:16 GMT+04:00 Lorieri :

> same here, did you manage to fix it ?
>
> On Mon, Oct 28, 2013 at 3:13 PM, Kevin Weiler
>  wrote:
> > Hi Josh,
> >
> > We did map it directly to the host, and it seems to work just fine. I
> > think this is a problem with how the container is accessing the rbd
> module.
> >
> > --
> >
> > Kevin Weiler
> >
> > IT
> >
> >
> > IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
> > 60606 | http://imc-chicago.com/
> >
> > Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
> > kevin.wei...@imc-chicago.com
> >
> >
> >
> >
> >
> >
> >
> > On 10/18/13 7:50 PM, "Josh Durgin"  wrote:
> >
> >>On 10/18/2013 10:04 AM, Kevin Weiler wrote:
> >>> The kernel is 3.11.4-201.fc19.x86_64, and the image format is 1. I did,
> >>> however, try a map with an RBD that was format 2. I got the same error.
> >>
> >>To rule out any capability drops as the culprit, can you map an rbd
> >>image on the same host outside of a container?
> >>
> >>Josh
> >>
> >>> --
> >>>
> >>> *Kevin Weiler*
> >>>
> >>> IT
> >>>
> >>> IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
> >>> 60606 | http://imc-chicago.com/
> >>>
> >>> Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
> >>> _kevin.wei...@imc-chicago.com _
> >>>
> >>>
> >>> From: Gregory Farnum mailto:g...@inktank.com>>
> >>> Date: Friday, October 18, 2013 10:26 AM
> >>> To: Omar Marquez  >>> >
> >>> Cc: Kyle Bader mailto:kyle.ba...@gmail.com>>,
> >>> Kevin Weiler  >>> >, "ceph-users@lists.ceph.com
> >>> "  >>> >, Khalid Goudeaux
> >>>  >>>>
> >>> Subject: Re: [ceph-users] mounting RBD in linux containers
> >>>
> >>> What kernel are you running, and which format is the RBD image? I
> >>> thought we had a special return code for when the kernel doesn't
> support
> >>> the features used by that image, but that could be the problem.
> >>> -Greg
> >>>
> >>> On Thursday, October 17, 2013, Omar Marquez wrote:
> >>>
> >>>
> >>> Strace produces below:
> >>>
> >>> …
> >>>
> >>> futex(0xb5637c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0xb56378,
> >>> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
> >>> futex(0xb562f8, FUTEX_WAKE_PRIVATE, 1)  = 1
> >>> add_key(0x424408, 0x7fff82c4e210, 0x7fff82c4e140, 0x22,
> >>> 0xfffe) = 607085216
> >>> stat("/sys/bus/rbd", {st_mode=S_IFDIR|0755, st_size=0, ...}) =
> 0
> >>> *open("/sys/bus/rbd/add", O_WRONLY)  = 3*
> >>> *write(3, "10.198.41.6:6789
> >>> ,10.198.41.8:678
> >>> "..., 96) = -1 EINVAL (Invalid
> >>>argument)*
> >>> close(3)= 0
> >>> rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER,
> 0x7fbf8a7efa90},
> >>> {SIG_DFL, [], 0}, 8) = 0
> >>> rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER,
> >>> 0x7fbf8a7efa90}, {SIG_DFL, [], 0}, 8) = 0
> >>> rt_sigprocmask(SIG_BLOCK, [CHLD], [PIPE], 8) = 0
> >>> clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD,
> >>> parent_tidptr=0x7fff82c4e040) = 22
> >>> wait4(22, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) =
> 22
> >>> rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER,
> 0x7fbf8a7efa90},
> >>> NULL, 8) = 0
> >>> rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER,
> >>> 0x7fbf8a7efa90}, NULL, 8) = 0
> >>> rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0
> >>> write(2, "rbd: add failed: ", 17rbd: add failed: )   = 17
> >>> write(2, "(22) Invalid argument", 21(22) Invalid argument)   =
> >>>21
> >>> write(2, "\n", 1
> >>> )   = 1
> >>> exit_group(1)   = ?
> >>> +++ exited with 1 +++
> >>>
> >>>
> >>> The app is run inside the container with setuid = 0 and the
> >>> container is able to mount all required filesystems … could this
> >>> still be a capability problem? Also I do not see any call to
> >>> capset() in the strace log …
> >>>
> >>> --
> >>> Om
> >>>
> >>>
> >>> From: Kyle Bader 
> >>> Date: Thursday, October 17, 2013 5:08 PM
> >>> To: Kevin Weiler 
> >>> Cc: "ceph-users@lists.ceph.com" , Omar
> >>> Marquez , Khalid Goudeaux
> >>> 
> >>> Subject: Re: [ceph-users] mounting RBD in linux containers
> >>>
> >>> My first guess would be that it's due to LXC dropping capabilities,
> >>> I'd investigate whether CAP_SYS_ADMIN is being dropped. You need
> >>> CAP_SYS_ADMIN for mount and block ioctls, if the container doesn't
> >>> have those privs a map will likely fail. Maybe try tracing the
> >>> command with strace?
> >>>
> >>> On Thu, Oct 17, 2013 at 2:4
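
As a hedged checklist for the capability theory discussed above (the container name and config path are assumptions):

# On the host: confirm the image maps outside the container
rbd map rbd/test-image

# Inside the container: check whether CAP_SYS_ADMIN survives
grep CapEff /proc/self/status        # decode the mask with capsh --decode=<mask>

# In the LXC config: make sure sys_admin is not being dropped
grep -i 'lxc.cap.drop' /var/lib/lxc/mycontainer/config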

Re: [ceph-users] Fw: external monitoring tools for processes

2014-08-10 Thread Irek Fasikhov
Hi.

I use ZABBIX with the following script:
[ceph@ceph08 ~]$ cat /etc/zabbix/external/ceph
#!/usr/bin/python

import sys
import os
import commands
import json
import datetime
import time

# Check arguments: the script needs at least one argument.
if len(sys.argv) == 1:
    print "You will need arguments!"
    sys.exit(1)

# Build the Zabbix low-level-discovery JSON from the discovered OSD ids.
def generate(data, type):
    JSON = "{\"data\":["
    for js in range(len(splits)):
        JSON += "{\"{#" + type + "}\":\"" + splits[js] + "\"},"
    return JSON[:-1] + "]}"

if sys.argv[1] == "osd":
    if len(sys.argv) == 2:
        # Discovery mode: list the locally mounted OSD ids.
        splits = commands.getoutput('df | grep osd | awk {\'print $6\'}| sed \'s/[^0-9]//g\'| sed \':a;N;$!ba;s/\\n/,/g\'').split(",")
        print generate(splits, "OSD")
    else:
        # Item mode: ceph osd <id> <section> <counter>
        ID = sys.argv[2]
        LEVEL = sys.argv[3]
        PERF = sys.argv[4]
        CACHEFILE = "/tmp/zabbix.ceph.osd" + ID + ".cache"
        CACHETTL = 5

        TIME = int(round(float(datetime.datetime.now().strftime("%s"))))

        ## CACHE FOR OPTIMIZATION PERFORMANCE ##
        if os.path.isfile(CACHEFILE):
            CACHETIME = int(round(os.stat(CACHEFILE).st_mtime))
        else:
            CACHETIME = 0
        if TIME - CACHETIME > CACHETTL:
            if os.system('sudo ceph --admin-daemon /var/run/ceph/ceph-osd.' + ID + '.asok perfcounters_dump >' + CACHEFILE) > 0:
                sys.exit(1)

        json_data = open(CACHEFILE)
        data = json.load(json_data)
        json_data.close()
        ## PARSING ##
        if LEVEL in data:
            if PERF in data[LEVEL]:
                try:
                    # Averaged counters expose "sum" and "avgcount"; plain counters are printed as-is.
                    key = data[LEVEL][PERF].has_key("sum")
                    print (data[LEVEL][PERF]["sum"]) / (data[LEVEL][PERF]["avgcount"])
                except AttributeError:
                    print data[LEVEL][PERF]

and zabbix templates:
https://dl.dropboxusercontent.com/u/575018/zbx_export_templates.xml
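
Example invocations of this script, as a sketch (perf counter names depend on the Ceph version, so treat "osd" / "op_latency" as assumptions):

# Low-level discovery of local OSD ids for Zabbix
/etc/zabbix/external/ceph osd

# Read one averaged perf counter for osd.0: <section> <counter>
/etc/zabbix/external/ceph osd 0 osd op_latency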



2014-08-11 7:42 GMT+04:00 pragya jain :

> Could somebody please reply to my question?
>
>
>On Saturday, 9 August 2014 3:34 PM, pragya jain 
> wrote:
>
>
>
> hi all,
>
> can somebody suggest some external monitoring tools which can monitor
> whether the processes in Ceph, such as heartbeating, data scrubbing,
> authentication, backfilling, recovery etc., are working properly or not?
>
> Regards
> Pragya Jain
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] flashcache from fb and dm-cache??

2014-07-30 Thread Irek Fasikhov
Ceph has a cache pool, which can be created from SSDs.
On 30 July 2014 at 18:41, "German Anders"  wrote:

>  Also, has anyone tried flashcache from Facebook on Ceph? Cons? Pros? Any
> perf improvement? And what about dm-cache?
>
>
>
> *German Anders*
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
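
A minimal sketch of putting such a cache pool in front of an existing pool as a writeback tier (pool names and sizes are examples; this needs a Firefly or newer cluster and careful sizing of the SSD pool):

# Attach an SSD-backed pool as a writeback tier in front of the data pool
ceph osd tier add rbd rbd-cache
ceph osd tier cache-mode rbd-cache writeback
ceph osd tier set-overlay rbd rbd-cache

# Basic hit-set and eviction settings for the cache pool (example values)
ceph osd pool set rbd-cache hit_set_type bloom
ceph osd pool set rbd-cache target_max_bytes 100000000000
ceph osd pool set rbd-cache cache_target_dirty_ratio 0.4
ceph osd pool set rbd-cache cache_target_full_ratio 0.8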


[ceph-users] rbd rm. Error: trim_objectcould not find coid

2014-07-23 Thread Irek Fasikhov
Hi, All.

I have encountered the following problem.

One PG had the status 'inconsistent'. I found the RBD device it belonged to and deleted
it; now the OSD produces the following error:


cod 0'0 active+inconsistent snaptrimq=[15~1,89~1]] exit
Started/Primary/Active/Recovering 0.025609 1 0.53
-8> 2014-07-23 12:03:13.386747 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[80.3d6( v 35718'170614 (34929'167412,35718'170614] local-les=35724
n=2242 ec=9510 les/c 35724/35718 35723/35723/35723) [94,36] r=0 lpr=35723
pi=35713-35722/2 ml
cod 0'0 active+inconsistent snaptrimq=[15~1,89~1]] enter
Started/Primary/Active/Recovered
-7> 2014-07-23 12:03:13.386783 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[80.3d6( v 35718'170614 (34929'167412,35718'170614] local-les=35724
n=2242 ec=9510 les/c 35724/35718 35723/35723/35723) [94,36] r=0 lpr=35723
pi=35713-35722/2 ml
cod 0'0 active+inconsistent snaptrimq=[15~1,89~1]] exit
Started/Primary/Active/Recovered 0.35 0 0.00
-6> 2014-07-23 12:03:13.386795 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[80.3d6( v 35718'170614 (34929'167412,35718'170614] local-les=35724
n=2242 ec=9510 les/c 35724/35718 35723/35723/35723) [94,36] r=0 lpr=35723
pi=35713-35722/2 ml
cod 0'0 active+inconsistent snaptrimq=[15~1,89~1]] enter
Started/Primary/Active/Clean
-5> 2014-07-23 12:03:13.386932 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[2.772( v 35722'486163 lc 35716'486156 (35141'483132,35722'486163]
local-les=35724 n=1328 ec=1 les/c 35724/35718 35723/35723/35723) [94,38,59]
r=0 lpr=35723 pi=3
5713-35722/2 lcod 0'0 mlcod 0'0 active+recovery_wait m=4] exit
Started/Primary/Active/WaitLocalRecoveryReserved 4.377808 7 0.96
-4> 2014-07-23 12:03:13.386956 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[2.772( v 35722'486163 lc 35716'486156 (35141'483132,35722'486163]
local-les=35724 n=1328 ec=1 les/c 35724/35718 35723/35723/35723) [94,38,59]
r=0 lpr=35723 pi=3
5713-35722/2 lcod 0'0 mlcod 0'0 active+recovery_wait m=4] enter
Started/Primary/Active/WaitRemoteRecoveryReserved
-3> 2014-07-23 12:03:13.387282 7f36148fd700 -1 osd.94 pg_epoch: 35725
pg[80.3d6( v 35718'170614 (34929'167412,35718'170614] local-les=35724
n=2242 ec=9510 les/c 35724/35725 35723/35723/35723) [94,36] r=0 lpr=35723
mlcod 0'0 active+cl
ean+inconsistent snaptrimq=[15~1,89~1]] trim_objectcould not find coid
f022c7d6/rbd_data.3ed9c72ae8944a.0717/15//80
-2> 2014-07-23 12:03:13.388628 7f3617101700  5 osd.94 pg_epoch: 35725
pg[2.772( v 35722'486163 lc 35716'486156 (35141'483132,35722'486163]
local-les=35724 n=1328 ec=1 les/c 35724/35718 35723/35723/35723) [94,38,59]
r=0 lpr=35723 pi=3
5713-35722/2 lcod 0'0 mlcod 0'0 active+recovery_wait m=4] exit
Started/Primary/Active/WaitRemoteRecoveryReserved 0.001672 2 0.79
-1> 2014-07-23 12:03:13.388670 7f3617101700  5 osd.94 pg_epoch: 35725
pg[2.772( v 35722'486163 lc 35716'486156 (35141'483132,35722'486163]
local-les=35724 n=1328 ec=1 les/c 35724/35718 35723/35723/35723) [94,38,59]
r=0 lpr=35723 pi=3
5713-35722/2 lcod 0'0 mlcod 0'0 active+recovery_wait m=4] enter
Started/Primary/Active/Recovering
 0> 2014-07-23 12:03:13.389138 7f36148fd700 -1 osd/ReplicatedPG.cc: In
function 'ReplicatedPG::RepGather* ReplicatedPG::trim_object(const
hobject_t&)' thread 7f36148fd700 time 2014-07-23 12:03:13.387304
osd/ReplicatedPG.cc: 1824: FAILED assert(0)

[root@ceph08 DIR_7]# find /var/lib/ceph/osd/ceph-94/ -name
'*3ed9c72ae8944a.0717*' -ls
10745283770 -rw-r--r--   1 root root1 Июл 23 11:30
/var/lib/ceph/osd/ceph-94/current/80.3d6_head/DIR_6/DIR_D/DIR_7/rbd\\udata.3ed9c72ae8944a.0717__15_F022C7D6__50



How can I make Ceph forget about the existence of this file?

Ceph version 0.72.2


Thanks.

-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
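
Not an answer from the thread, but a cautious first step often tried for an inconsistent PG before anything more invasive (the PG id is taken from the log above; the snaptrim assert itself may still need a bug report):

# Inspect the inconsistent PG and ask the primary to repair it
ceph pg 80.3d6 query
ceph pg repair 80.3d6

# Watch whether the PG goes back to active+clean afterwards
ceph -w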


Re: [ceph-users] feature set mismatch after upgrade from Emperor to Firefly

2014-07-20 Thread Irek Fasikhov
Hi, Andrei.

ceph osd getcrushmap -o /tmp/crush
crushtool -i /tmp/crush --set-chooseleaf_vary_r 0 -o /tmp/crush.new
ceph osd setcrushmap -i /tmp/crush.new

Or

update kernel 3.15.


2014-07-20 20:19 GMT+04:00 Andrei Mikhailovsky :

> Hello guys,
>
>
> I have noticed the following message/error after upgrading to firefly.
> Does anyone know what needs doing to correct it?
>
>
> Thanks
>
> Andrei
>
>
>
> [   25.911055] libceph: mon1 192.168.168.201:6789 feature set mismatch,
> my 40002 < server's 20002040002, missing 2000200
>
> [   25.911698] libceph: mon1 192.168.168.201:6789 socket error on read
>
> [   35.913049] libceph: mon2 192.168.168.13:6789 feature set mismatch, my
> 40002 < server's 20002040002, missing 2000200
>
> [   35.913694] libceph: mon2 192.168.168.13:6789 socket error on read
>
> [   45.909466] libceph: mon0 192.168.168.200:6789 feature set mismatch,
> my 40002 < server's 20002040002, missing 2000200
>
> [   45.910104] libceph: mon0 192.168.168.200:6789 socket error on read
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
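
A small sketch for checking the effect of the change above (the command exists from Firefly on; the exact output fields vary by version):

# Show the currently active CRUSH tunables, including chooseleaf_vary_r
ceph osd crush show-tunables

# After setting chooseleaf_vary_r back to 0 (or upgrading the client kernel to
# 3.15+), the kernel client should stop logging the feature mismatch
dmesg | grep -i 'feature set mismatch'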


[ceph-users] Ceph RBD and Backup.

2014-07-02 Thread Irek Fasikhov
Hi,All.

Dear community, how do you make backups of Ceph RBD?

Thanks

-- 
Fasihov Irek (aka Kataklysm).
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
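
One common pattern, as a hedged sketch: snapshot the image and ship incremental diffs to a file or another cluster (pool, image and snapshot names are examples):

# Initial full copy
rbd snap create rbd/vm-100-disk-1@backup-2014-07-02
rbd export-diff rbd/vm-100-disk-1@backup-2014-07-02 vm-100-disk-1.full

# Next run: only the blocks changed since the previous snapshot
rbd snap create rbd/vm-100-disk-1@backup-2014-07-03
rbd export-diff --from-snap backup-2014-07-02 \
    rbd/vm-100-disk-1@backup-2014-07-03 vm-100-disk-1.inc

# The diffs can be replayed into an image elsewhere with rbd import-diff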


Re: [ceph-users] Calamari Goes Open Source

2014-05-31 Thread Irek Fasikhov
Very Very Good! Thanks Inktank/RedHat.


2014-05-31 2:43 GMT+04:00 John Kinsella :

>  Cool! Looking forward to kicking the tires on that...
>  On May 30, 2014, at 3:04 PM, Patrick McGarry  wrote:
>
> Hey cephers,
>
> Sorry to push this announcement so late on a Friday but...
>
> Calamari has arrived!
>
> The source code bits have been flipped, the ticket tracker has been
> moved, and we have even given you a little bit of background from both
> a technical and vision point of view:
>
> Technical (ceph.com):
> http://ceph.com/community/ceph-calamari-goes-open-source/
>
> Vision (inktank.com):
> http://www.inktank.com/software/future-of-calamari/
>
> The ceph.com link should give you everything you need to know about
> what tech comprises Calamari, where the source lives, and where the
> discussions will take place.  If you have any questions feel free to
> hit the new ceph-calamari list or stop by IRC and we'll get you
> started.  Hope you all enjoy the GUI!
>
>
>
> Best Regards,
>
> Patrick McGarry
> Director, Community || Inktank
> http://ceph.com  ||  http://inktank.com
> @scuttlemonkey || @ceph || @inktank
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>  Stratosec  - Compliance as a Service
> o: 415.315.9385
> @johnlkinsella 
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


  1   2   >