Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Maged Mokhtar
sure if LVM subsystem on Linux can be tweaked in how large block that is being read / written is? Is there anything I can do to improve the performance, except for replacing with SSD disks? Does it mean that IOPS is my bottleneck now? On 04/10/2019 18:53, Maged Mokhtar wrote: The tests are

Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Maged Mokhtar
The tests are measuring different things, and the fio test result of 1.5 MB/s is not bad. The rados write bench uses a 4M block size and 16 threads by default and is random in nature; you can change the block size and thread count. The dd command uses by default a 512-byte block size and 1 thread
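A minimal sketch of lining the two tests up (pool name, file path and run time are placeholders):

  rados bench -p testpool 60 write -b 4096 -t 32 --no-cleanup    # 4k blocks, 32 threads instead of the 4M/16 defaults
  dd if=/dev/zero of=/mnt/rbd/testfile bs=4M count=256 oflag=direct   # larger blocks and direct io instead of 512 bytes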

Re: [ceph-users] Commit and Apply latency on nautilus

2019-10-01 Thread Maged Mokhtar
Some suggestions: monitor raw resources such as cpu %util, raw disk %util/busy, raw disk iops. Instead of running a mix of workloads at this stage, narrow it down first, for example using rbd rand writes and 4k block sizes, then change 1 param at a time, for example change the block size. See ho
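A sketch of such a narrowed-down test and of watching raw resources while it runs (pool, image and client names are placeholders):

  fio --name=rbd4k --ioengine=rbd --clientname=admin --pool=rbd --rbdname=test --rw=randwrite --bs=4k --iodepth=32 --runtime=60 --time_based
  iostat -x 1    # raw disk %util/busy and iops on the OSD nodes
  top            # cpu %util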

Re: [ceph-users] Ceph + SAMBA (vfs_ceph)

2019-08-28 Thread Maged Mokhtar
On 27/08/2019 21:39, Salsa wrote: I'm running a ceph installation on a lab to evaluate for production and I have a cluster running, but I need to mount on different windows servers and desktops. I created an NFS share and was able to mount it on my Linux desktop, but not a Win 10 desktop. Sinc

Re: [ceph-users] optane + 4x SSDs for VM disk images?

2019-08-12 Thread Maged Mokhtar
On 11/08/2019 19:46, Victor Hooi wrote: Hi I am building a 3-node Ceph cluster to store VM disk images. We are running Ceph Nautilus with KVM. Each node has: Xeon 4116, 512GB RAM per node, Optane 905p NVMe disk with 980 GB. Previously, I was creating four OSDs per Optane disk, and using only

Re: [ceph-users] bluestore write iops calculation

2019-08-02 Thread Maged Mokhtar
On 02/08/2019 08:54, nokia ceph wrote: Hi Team, Could you please help us in understanding the write iops inside ceph cluster . There seems to be mismatch in iops between theoretical and what we see in disk status. Our platform 5 node cluster 120 OSDs, with each node having 24 disks HDD ( d

Re: [ceph-users] New best practices for osds???

2019-07-17 Thread Maged Mokhtar
in most cases write back cache does help a lot for hdd write latency; either raid-0 or some Areca cards support write back in jbod mode. Our observation is that they can help by a 3-5x factor in Bluestore, whereas db/wal on flash will give about 2x; it does depend on hardware but in general we see bene

Re: [ceph-users] Erasure Coding performance for IO < stripe_width

2019-07-08 Thread Maged Mokhtar
On 08/07/2019 13:02, Lars Marowsky-Bree wrote: On 2019-07-08T12:25:30, Dan van der Ster wrote: Is there a specific bench result you're concerned about? We're seeing ~5800 IOPS, ~23 MiB/s on 4 KiB IO (stripe_width 8192) on a pool that could do 3 GiB/s with 4M blocksize. So, yeah, well, that

Re: [ceph-users] rebalancing ceph cluster

2019-06-24 Thread Maged Mokhtar
On 24/06/2019 11:25, jinguk.k...@ungleich.ch wrote: Hello everyone, We have some osd on the ceph. Some osd's usage is more than 77% and another osd's usage is 39% in the same host. I wonder why osd’s usage is different.(Difference is large) and how can i fix it? ID  CLASS   WEIGHT    REWEI

[ceph-users] bluestore_allocated vs bluestore_stored

2019-06-16 Thread Maged Mokhtar
Hi all, I want to understand more the difference between bluestore_allocated and bluestore_stored in the case of no compression. If I am writing fixed objects with sizes greater than min alloc size, would bluestore_allocated still be higher than bluestore_stored ? If so, is this a permanent o

Re: [ceph-users] performance in a small cluster

2019-05-24 Thread Maged Mokhtar
Hi Robert 1) Can you specify how many threads were used in the 4k write rados test? I suspect that only 16 threads were used, since that is the default. Also the average latency was 2.9 ms, giving an average of 344 iops per thread; your average iops were 5512, and dividing this by 344 we get 16
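As a worked illustration of that arithmetic: at 2.9 ms average latency one thread completes roughly 1 / 0.0029 ≈ 344 writes per second, so 5512 total iops / 344 per thread ≈ 16 threads. A comparison run with more threads (pool name is a placeholder):

  rados bench -p testpool 60 write -b 4096 -t 64 --no-cleanup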

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Maged Mokhtar
Not sure. In general important fixes get backported, but will have to wait and see. /Maged On 20/05/2019 22:11, Frank Schilder wrote: Dear Maged, thanks for elaborating on this question. Is there already information in which release this patch will be deployed? Best regards, ===

Re: [ceph-users] Default min_size value for EC pools

2019-05-20 Thread Maged Mokhtar
On 20/05/2019 19:37, Frank Schilder wrote: This is an issue that is coming up every now and then (for example: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg50415.html) and I would consider it a very serious one (I will give an example below). A statement like "min_size = k is unsa

Re: [ceph-users] Samba vfs_ceph or kernel client

2019-05-16 Thread Maged Mokhtar
Thanks a lot for the clarification.  /Maged On 16/05/2019 17:23, David Disseldorp wrote: Hi Maged, On Fri, 10 May 2019 18:32:15 +0200, Maged Mokhtar wrote: What is the recommended way for Samba gateway integration: using vfs_ceph or mounting CephFS via kernel client ? i tested the kernel

Re: [ceph-users] How to maximize the OSD effective queue depth in Ceph?

2019-05-11 Thread Maged Mokhtar
On 10/05/2019 19:54, Mark Lehrer wrote: I'm setting up a new Ceph cluster with fast SSD drives, and there is one problem I want to make sure to address straight away: comically-low OSD queue depths. On the past several clusters I built, there was one major performance problem that I never had

[ceph-users] Samba vfs_ceph or kernel client

2019-05-10 Thread Maged Mokhtar
Hi all, What is the recommended way for Samba gateway integration: using vfs_ceph or mounting CephFS via kernel client ? I tested the kernel solution in a CTDB setup and it gave good performance; does it have any limitations relative to vfs_ceph ? Cheers /Maged __

Re: [ceph-users] Tip for erasure code profile?

2019-05-04 Thread Maged Mokhtar
On 03/05/2019 23:56, Maged Mokhtar wrote: On 03/05/2019 17:45, Robert Sander wrote: Hi, I would be glad if anybody could give me a tip for an erasure code profile and an associated crush ruleset. The cluster spans 2 rooms with each room containing 6 hosts and each host has 12 to 16 OSDs

Re: [ceph-users] Tip for erasure code profile?

2019-05-03 Thread Maged Mokhtar
On 03/05/2019 17:45, Robert Sander wrote: Hi, I would be glad if anybody could give me a tip for an erasure code profile and an associated crush ruleset. The cluster spans 2 rooms with each room containing 6 hosts and each host has 12 to 16 OSDs. The failure domain would be the room level,

Re: [ceph-users] Glance client and RBD export checksum mismatch

2019-04-10 Thread Maged Mokhtar
On 10/04/2019 07:46, Brayan Perera wrote: Dear All, Ceph Version : 12.2.5-2.ge988fb6.el7 We are facing an issue on glance which have backend set to ceph, when we try to create an instance or volume out of an image, it throws checksum error. When we use rbd export and use md5sum, value is mat

Re: [ceph-users] 答复: CEPH ISCSI LIO multipath change delay

2019-03-21 Thread Maged Mokhtar
1 18:34:33 CEPH-client01test iscsid: connect to 172.17.1.23:3260 failed (No route to host) Mar 21 18:34:41 CEPH-client01test iscsid: connect to 172.17.1.23:3260 failed (No route to host) -----Original Message----- From: Maged Mokhtar Sent: 2019-03-20 15:36 To: li jerry ; ceph-users@lists.ceph.com Subject: Re: [c

Re: [ceph-users] fio test rbd - single thread - qd1

2019-03-20 Thread Maged Mokhtar
On 19/03/2019 16:17, jes...@krogh.cc wrote: Hi All. I'm trying to get my head around where we can stretch our Ceph cluster and into what applications. Parallelism works excellently, but baseline throughput is - perhaps - not what I would expect it to be. Luminous cluster running bluestore - all

Re: [ceph-users] CEPH ISCSI LIO multipath change delay

2019-03-20 Thread Maged Mokhtar
On 20/03/2019 07:43, li jerry wrote: Hi,ALL I’ve deployed mimic(13.2.5) cluster on 3 CentOS 7.6 servers, then configured iscsi-target and created a LUN, referring to http://docs.ceph.com/docs/mimic/rbd/iscsi-target-cli/. I have another server which is CentOS 7.4, configured and mounted th

Re: [ceph-users] RBD poor performance

2019-02-28 Thread Maged Mokhtar
ed if they should focus on performance more, now I wished I checked that box ;) -- Maged Mokhtar CEO PetaSAN www.petasan.org

Re: [ceph-users] rados block on SSD - performance - how to tune and get insight?

2019-02-07 Thread Maged Mokhtar
On 07/02/2019 17:07, jes...@krogh.cc wrote: Hi Maged Thanks for your reply. 6k is low as a max write iops value, even for a single client. For a cluster of 3 nodes, we see from 10k to 60k write iops depending on hardware. Can you increase your threads to 64 or 128 via the -t parameter? I can absolu

Re: [ceph-users] rados block on SSD - performance - how to tune and get insight?

2019-02-07 Thread Maged Mokhtar
On 07/02/2019 09:17, jes...@krogh.cc wrote: Hi List We are in the process of moving to the next usecase for our ceph cluster (Bulk, cheap, slow, erasurecoded, cephfs) storage was the first - and that works fine. We're currently on luminous / bluestore, if upgrading is deemed to change what we

Re: [ceph-users] Multicast communication compuverde

2019-02-06 Thread Maged Mokhtar
On 06/02/2019 11:14, Marc Roos wrote: Yes indeed, but for osd's writing the replication or erasure objects you get sort of parallel processing, not? Multicast traffic from storage has a point in things like the old Windows provisioning software Ghost where you could netboot a room full of c

Re: [ceph-users] Recommendations for sharing a file system to a heterogeneous client network?

2019-01-15 Thread Maged Mokhtar

Re: [ceph-users] OSDs busy reading from Bluestore partition while bringing up nodes.

2019-01-12 Thread Maged Mokhtar

Re: [ceph-users] EC Pool Disk Performance Toshiba vs Segate

2018-12-13 Thread Maged Mokhtar
On 13/12/2018 09:53, Ashley Merrick wrote: I have a Mimic Bluestore EC RBD Pool running on 8+2, this is currently running across 4 node's. 3 Node's are running Toshiba disk's while one node is running Segate disks (same size, spinning speed, enterprise disks e.t.c), I have noticed huge diffe

Re: [ceph-users] ceph pg backfill_toofull

2018-12-11 Thread Maged Mokhtar

Re: [ceph-users] will crush rule be used during object relocation in OSD failure ?

2018-11-24 Thread Maged Mokhtar
On 23/11/18 18:00, ST Wong (ITSC) wrote: Hi all, We've 8 osd hosts, 4 in room 1 and 4 in room2. A pool with size = 3 using following crush map is created, to cater for room failure. rule multiroom { id 0 type replicated min_size 2 max_size 4 step

Re: [ceph-users] Benchmark performance when using SSD as the journal

2018-11-13 Thread Maged Mokhtar
Hi Dave, The SSD journal will help boost iops  & latency which will be more apparent for small block sizes. The rados benchmark default block size is 4M, use the -b option to specify the size. Try at 4k, 32k, 64k ... As a side note, this is a rados level test, the rbd image size is not releva
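A small sketch of sweeping block sizes as suggested (pool name is a placeholder):

  for bs in 4096 32768 65536; do
    rados bench -p testpool 30 write -b $bs -t 16 --no-cleanup
  done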

Re: [ceph-users] Drive for Wal and Db

2018-10-22 Thread Maged Mokhtar
sharing SSD between WAL and DB what should be placed on SSD? WAL or DB? - Original Message - From: "Maged Mokhtar" To: "ceph-users" Sent: Saturday, 20 October, 2018 20:05:44 Subject: Re: [ceph-users] Drive for Wal and Db On 20/10/18 18:57, Robert Stanford wrote: Ou

Re: [ceph-users] Drive for Wal and Db

2018-10-20 Thread Maged Mokhtar
On 20/10/18 18:57, Robert Stanford wrote:  Our OSDs are BlueStore and are on regular hard drives. Each OSD has a partition on an SSD for its DB.  Wal is on the regular hard drives.  Should I move the wal to share the SSD with the DB?  Regards R ___

Re: [ceph-users] A basic question on failure domain

2018-10-20 Thread Maged Mokhtar
On 20/10/18 05:28, Cody wrote: Hi folks, I have a rookie question. Does the number of the buckets chosen as the failure domain must be equal or greater than the number of replica (or k+m for erasure coding)? E.g., for an erasure code profile where k=4, m=2, failure domain=rack, does it only w

Re: [ceph-users] bcache, dm-cache support

2018-10-10 Thread Maged Mokhtar
On 10/10/18 21:08, Ilya Dryomov wrote: On Wed, Oct 10, 2018 at 8:48 PM Kjetil Joergensen wrote: Hi, We tested bcache, dm-cache/lvmcache, and one more which name eludes me with PCIe NVME on top of large spinning rust drives behind a SAS3 expander - and decided this were not for us. This w

[ceph-users] bcache, dm-cache support

2018-10-04 Thread Maged Mokhtar
Hello all, Do bcache and dm-cache work well with Ceph ? Is one recommended over the other ? Are there any issues ? There are a few posts in this list around them, but I could not determine if they are ready for mainstream use or not. Appreciate any clarifications.  /Maged _

Re: [ceph-users] CRUSH puzzle: step weighted-take

2018-09-27 Thread Maged Mokhtar
On 27/09/18 17:18, Dan van der Ster wrote: Dear Ceph friends, I have a CRUSH data migration puzzle and wondered if someone could think of a clever solution. Consider an osd tree like this: -2 4428.02979 room 0513-R-0050 -72911.81897 rack RA01 -4917.

Re: [ceph-users] Hyper-v ISCSI support

2018-09-21 Thread Maged Mokhtar
Hi Glen, Yes you need clustered SCSI-3 persistent reservations support. This is supported in SUSE SLE kernels, you may also be interested in PetaSAN: http://www.petasan.org which is based on these kernels. Maged On 21/09/18 12:48, Glen Baars wrote: Hello Ceph Users, We have been using ce

Re: [ceph-users] Slow Ceph: Any plans on torrent-like transfers from OSDs ?

2018-09-14 Thread Maged Mokhtar
On 14/09/18 12:13, Alex Lupsa wrote: Hi, Thank you for the answer Ronny. I did indeed try 2x RBD drives (rdb-cache was already active), striping them, and got double write/read speed instantly. So I am chalking this one on KVM who is single-threaded and not fully ceph-aware it seems. Althoug

Re: [ceph-users] Benchmark does not show gains with DB on SSD

2018-09-12 Thread Maged Mokhtar
On 12/09/18 17:06, Ján Senko wrote: We are benchmarking a test machine which has: 8 cores, 64GB RAM 12 * 12 TB HDD (SATA) 2 * 480 GB SSD (SATA) 1 * 240 GB SSD (NVME) Ceph Mimic Baseline benchmark for HDD only (Erasure Code 4+2) Write 420 MB/s, 100 IOPS, 150ms latency Read 1040 MB/s, 260 IOPS,

Re: [ceph-users] advice with erasure coding

2018-09-07 Thread Maged Mokhtar
On 2018-09-07 13:52, Janne Johansson wrote: > Den fre 7 sep. 2018 kl 13:44 skrev Maged Mokhtar : > >> Good day Cephers, >> >> I want to get some guidance on erasure coding, the docs do state the >> different plugins and settings but to really understand them

Re: [ceph-users] WAL/DB size

2018-09-07 Thread Maged Mokhtar
On 2018-09-07 14:36, Alfredo Deza wrote: > On Fri, Sep 7, 2018 at 8:27 AM, Muhammad Junaid > wrote: > >> Hi there >> >> Asking the questions as a newbie. May be asked a number of times before by >> many but sorry, it is not clear yet to me. >> >> 1. The WAL device is just like journaling dev

[ceph-users] advice with erasure coding

2018-09-07 Thread Maged Mokhtar
Good day Cephers, I want to get some guidance on erasure coding, the docs do state the different plugins and settings but to really understand them all and their use cases is not easy: -Are the majority of implementations using jerasure and just configuring k and m ? -For jerasure: when/if woul

Re: [ceph-users] Performance tuning for SAN SSD config

2018-07-06 Thread Maged Mokhtar
ke FC systems > where you can use commodity hardware and grow as you need, you > generally dont need hba/fc enclosed disks but nothing stopping you > from using your existing system. Also you generally dont need any raid > mirroring configurations in the backend since ceph will hand

Re: [ceph-users] Performance tuning for SAN SSD config

2018-07-06 Thread Maged Mokhtar
On 2018-06-29 18:30, Matthew Stroud wrote: > We back some of our ceph clusters with SAN SSD disk, particularly VSP G/F and > Purestorage. I'm curious what are some settings we should look into modifying > to take advantage of our SAN arrays. We had to manually set the class for the > luns to SS

Re: [ceph-users] CephFS+NFS For VMWare

2018-07-02 Thread Maged Mokhtar
Hi Nick, With iSCSI we reach over 150 MB/s vmotion for a single vm, 1 GB/s for 7-8 vm migrations. Since these are 64KB block sizes, latency/iops is a large factor; you need either controllers with write-back cache or all flash. HDDs without write cache will suffer even with external wal/db on ssds

Re: [ceph-users] pulled a disk out, ceph still thinks its in

2018-06-27 Thread Maged Mokhtar
When you pull a drive out what is the status of the daemon: systemctl status ceph-osd@ID /Maged On 2018-06-27 21:51, pixelfairy wrote: > even pulling a few more out didnt show up in osd tree. had to actually try to > use them. ceph tell osd.N bench works. > > On Sun, Jun 24, 2018 at 2

Re: [ceph-users] How to use libradostriper to improve I/O bandwidth?

2018-06-12 Thread Maged Mokhtar
On 2018-06-12 01:01, Jialin Liu wrote: > Hello Ceph Community, > > I used libradosstriper api to test the striping feature, it doesn't seem to > improve the performance at all, can anyone advise what's wrong with my > settings: > > The rados object store testbed at my center has > osd: 48

Re: [ceph-users] Issues with RBD when rebooting

2018-05-25 Thread Maged Mokhtar
On 2018-05-25 12:11, Josef Zelenka wrote: > Hi, we are running a jewel cluster (54OSDs, six nodes, ubuntu 16.04) that > serves as a backend for openstack(newton) VMs. TOday we had to reboot one of > the nodes(replicated pool, x2) and some of our VMs oopsed with issues with > their FS(mainly dat

Re: [ceph-users] Bluestore cluster, bad IO perf on blocksize<64k... could it be throttling ?

2018-03-23 Thread Maged Mokhtar
On 2018-03-21 19:50, Frederic BRET wrote: > Hi all, > > The context : > - Test cluster aside production one > - Fresh install on Luminous > - choice of Bluestore (coming from Filestore) > - Default config (including wpq queuing) > - 6 nodes SAS12, 14 OSD, 2 SSD, 2 x 10Gb nodes, far more Gb at eac

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread Maged Mokhtar
On 2018-03-12 21:00, Ilya Dryomov wrote: > On Mon, Mar 12, 2018 at 7:41 PM, Maged Mokhtar wrote: > >> On 2018-03-12 14:23, David Disseldorp wrote: >> >> On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote: >> >> 2)I undertand that before switching

Re: [ceph-users] Fwd: [ceph bad performance], can't find a bottleneck

2018-03-12 Thread Maged Mokhtar
Hi, Try increasing the queue depth from the default 128 to 1024: rbd map image-XX -o queue_depth=1024 Also, if you run multiple rbd images/fio tests, do you get higher combined performance ? Maged On 2018-03-12 17:16, Sergey Kotov wrote: > Dear moderator, i subscribed to ceph list today, cou
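A sketch of that combined test over two mapped images (image names and device paths are placeholders):

  rbd map image-01 -o queue_depth=1024
  rbd map image-02 -o queue_depth=1024
  fio --name=multi --filename=/dev/rbd0:/dev/rbd1 --direct=1 --rw=randwrite --bs=4k --iodepth=64 --runtime=60 --time_based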

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread Maged Mokhtar
On 2018-03-12 14:23, David Disseldorp wrote: > On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote: > >> 2)I undertand that before switching the path, the initiator will send a >> TMF ABORT can we pass this to down to the same abort_request() function >> in osd

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-10 Thread Maged Mokhtar
-- From: "Jason Dillaman" Sent: Sunday, March 11, 2018 1:46 AM To: "shadow_lin" Cc: "Lazuardi Nasution" ; "Ceph Users" Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock On Sat, Mar 10, 2018 at 10:11 AM, shadow

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-09 Thread Maged Mokhtar
Hi Mike, > For the easy case, the SCSI command is sent directly to krbd and so if > osd_request_timeout is less than M seconds then the command will be > failed in time and we would not hit the problem above. > If something happens in the target stack like the SCSI command gets > stuck/queued the

Re: [ceph-users] Slow requests troubleshooting in Luminous - details missing

2018-03-02 Thread Maged Mokhtar
On 2018-03-02 07:54, Alex Gorbachev wrote: > On Thu, Mar 1, 2018 at 10:57 PM, David Turner wrote: > Blocked requests and slow requests are synonyms in ceph. They are 2 names > for the exact same thing. > > On Thu, Mar 1, 2018, 10:21 PM Alex Gorbachev > wrote: > On Thu, Mar 1, 2018 at 2:47 PM

Re: [ceph-users] Ceph luminous performance - how to calculate expected results

2018-02-14 Thread Maged Mokhtar
On 2018-02-14 20:14, Steven Vacaroaia wrote: > Hi, > > It is very useful to "set up expectations" from a performance perspective > > I have a cluster using 3 DELL R620 with 64 GB RAM and 10 GB cluster network > > I've seen numerous posts and articles about the topic mentioning the > foll

Re: [ceph-users] Newbie question: stretch ceph cluster

2018-02-14 Thread Maged Mokhtar
Hi, You need to set min_size to 2 in the crush rule. The exact location and replication flow when a client writes data depend on the object name and the number of pgs. The crush rule determines which osds will serve a pg; the first is the primary osd for that pg. The client computes the pg from the
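As a hedged illustration (pool and object names are placeholders): min_size is commonly applied per pool, and the pg/OSD mapping for a given object name can be inspected directly:

  ceph osd pool set mypool min_size 2
  ceph osd map mypool myobject    # shows the pg and the acting set, primary first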

Re: [ceph-users] How to clean data of osd with ssd journal(wal, db if it is bluestore) ?

2018-02-01 Thread Maged Mokhtar
-- > > lin.yunfan > - > > From: Maged Mokhtar > Sent: 2018-02-01 14:22 > Subject: Re: [ceph-users] How to clean data of osd with ssd journal(wal, db if it > is bluestore) ? > To: "David Turner" > Cc: "shadow_lin","ceph-user

Re: [ceph-users] How to clean data of osd with ssd journal(wal, db if it is bluestore) ?

2018-01-31 Thread Maged Mokhtar
I would recommend, as Wido did, to use the dd command. The block db device holds the metadata/allocation of objects stored in the data block; not cleaning this is asking for problems, besides it does not take any time. In our testing building a new cluster on top of an older installation, we did see many cases where os
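A minimal sketch of that cleanup, assuming /dev/sdX is the old wal/db partition (double-check the device name, this overwrite is destructive):

  dd if=/dev/zero of=/dev/sdX bs=1M count=100 oflag=direct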

Re: [ceph-users] Ceph - incorrect output of ceph osd tree

2018-01-31 Thread Maged Mokhtar
try setting: mon_osd_min_down_reporters = 1 On 2018-01-31 20:46, Steven Vacaroaia wrote: > Hi, > > Why is ceph osd tree reports that osd.4 is up when the server on which osd.4 > is running is actually down ?? > > Any help will be appreciated > > [root@osd01 ~]# ping -c 2 osd02 > PING
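A sketch of applying that setting, either at runtime or persistently in ceph.conf (the latter needs a mon restart):

  ceph tell mon.* injectargs '--mon_osd_min_down_reporters 1'

  # ceph.conf
  [global]
  mon osd min down reporters = 1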

Re: [ceph-users] troubleshooting ceph performance

2018-01-30 Thread Maged Mokhtar
On 2018-01-31 08:14, Manuel Sopena Ballesteros wrote: > Dear Ceph community, > > I have a very small ceph cluster for testing with this configuration: > > · 2x compute nodes each with: > > · dual port of 25 nic > > · 2x socket (56 cores with hyperthreading) > > ·

Re: [ceph-users] How ceph client read data from ceph cluster

2018-01-26 Thread Maged Mokhtar
.2.2) the client only read from the primary > osd(one copy),is that true? > > 2018-01-27 > - > > lin.yunfan > --------- > > From: Maged Mokhtar > Sent: 2018-01-26 20:27 > Subject: Re: [ceph-users] How ceph client read data f

Re: [ceph-users] How ceph client read data from ceph cluster

2018-01-26 Thread Maged Mokhtar
On 2018-01-26 09:09, shadow_lin wrote: > Hi List, > I read a old article about how ceph client read from ceph cluster.It said the > client only read from the primary osd. Since ceph cluster in replicate mode > have serveral copys of data only read from one copy seems waste the > performance of

Re: [ceph-users] What is the should be the expected latency of 10Gbit network connections

2018-01-22 Thread Maged Mokhtar
kets transmitted, 10 received, 0% packet loss, time 2363ms > rtt min/avg/max/mdev = 0.014/0.015/0.322/0.006 ms, ipg/ewma 0.023/0.016 ms > > On 22 January 2018 at 22:37, Nick Fisk wrote: > >> Anyone with 25G ethernet willing to do the test? Would love to see what the >> late

Re: [ceph-users] What is the should be the expected latency of 10Gbit network connections

2018-01-22 Thread Maged Mokhtar
On 2018-01-22 08:39, Wido den Hollander wrote: > On 01/20/2018 02:02 PM, Marc Roos wrote: > >> If I test my connections with sockperf via a 1Gbit switch I get around >> 25usec, when I test the 10Gbit connection via the switch I have around >> 12usec is that normal? Or should there be a differnce

Re: [ceph-users] OSDs wrongly marked down

2017-12-20 Thread Maged Mokhtar
Could also be that your hardware is underpowered for the io you have. Try to check your resource load during peak workload together with recovery and scrubbing going on at the same time. On 2017-12-20 17:03, David Turner wrote: > When I have OSDs wrongly marked down it's usually to do with the > filesto

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-12-08 Thread Maged Mokhtar
4M block sizes you will only need 22.5 iops On 2017-12-08 09:59, Maged Mokhtar wrote: > Hi Russell, > > It is probably due to the difference in block sizes used in the test vs your > cluster load. You have a latency problem which is limiting your max write > iops to around
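(That figure is plain arithmetic: 22.5 iops x 4 MB per write ≈ 90 MB/s, so a workload expressed in MB/s needs far fewer operations at 4M than at 4k block sizes.)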

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-12-08 Thread Maged Mokhtar
I have > to believe it is a hardware firmware issue. > And its peculiar seeing performance boost slightly, even 24 hours later, when > I stop then start the OSDs. > > Our actual writes are low, as most of our Ceph Cluster based images are > low-write, high-memory. So a 20GB/d

[ceph-users] Single disk per OSD ?

2017-12-01 Thread Maged Mokhtar
Hi all, I believe most existing setups use 1 disk per OSD. Is this going to be the most common setup in the future ? With the move to lvm, will this favor the use of multiple disks per OSD ? On the other side I also see nvme vendors recommending multiple OSDs ( 2,4 ) per disk as disks are getting

Re: [ceph-users] ceph all-nvme mysql performance tuning

2017-11-29 Thread Maged Mokhtar
me disk with an additional >> partition to use as journal/wal. We double check the c-state and it was >> not configure to use c1, so we change that on all the osd nodes and mon >> nodes and we're going to make some new tests, and see how it goes. I'll >> get back as soon as

Re: [ceph-users] ceph-disk is now deprecated

2017-11-28 Thread Maged Mokhtar
I tend to agree with Wido. Many of us still rely on ceph-disk and hope to see it live a little longer. Maged On 2017-11-28 13:54, Alfredo Deza wrote: > On Tue, Nov 28, 2017 at 3:12 AM, Wido den Hollander wrote: > Op 27 november 2017 om 14:36 schreef Alfredo Deza : > > For the upcoming Lumin

Re: [ceph-users] ceph all-nvme mysql performance tuning

2017-11-27 Thread Maged Mokhtar
On 2017-11-27 15:02, German Anders wrote: > Hi All, > > I've a performance question, we recently install a brand new Ceph cluster > with all-nvme disks, using ceph version 12.2.0 with bluestore configured. The > back-end of the cluster is using a bond IPoIB (active/passive) , and for the > fr

Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-15 Thread Maged Mokhtar
On 2017-11-14 21:54, Milanov, Radoslav Nikiforov wrote: > Hi > > We have 3 node, 27 OSDs cluster running Luminous 12.2.1 > > In filestore configuration there are 3 SSDs used for journals of 9 OSDs on > each hosts (1 SSD has 3 journal paritions for 3 OSDs). > > I've converted filestore to bl

Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-10 Thread Maged Mokhtar
rados benchmark is a client application that simulates client io to stress the cluster. This applies whether you run the test from an external client or from a cluster server that will act as a client. For fast clusters the client will saturate (cpu/net) before the cluster does. To get accurate

Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-10 Thread Maged Mokhtar
Hi Mark, It will be interesting to know: The impact of replication. I guess it will decrease by a higher factor than the replica count. I assume you mean the 30K IOPS per OSD is what the client sees, if so the OSD raw disk itself will be doing more IOPS, is this correct and if so what is the

Re: [ceph-users] Bluestore OSD_DATA, WAL & DB

2017-11-03 Thread Maged Mokhtar
On 2017-11-03 15:59, Wido den Hollander wrote: > Op 3 november 2017 om 14:43 schreef Mark Nelson : > > On 11/03/2017 08:25 AM, Wido den Hollander wrote: > Op 3 november 2017 om 13:33 schreef Mark Nelson : > > On 11/03/2017 02:44 AM, Wido den Hollander wrote: > Op 3 november 2017 om 0:09 schree

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-27 Thread Maged Mokhtar
h a long stick for anything but small toy-test clusters. > > On Fri, Oct 27, 2017 at 3:44 AM, Russell Glaue wrote: > On Wed, Oct 25, 2017 at 7:09 PM, Maged Mokhtar wrote: > It depends on what stage you are in: > in production, probably the best thing is to setup a monitoring tool >

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-26 Thread Maged Mokhtar
and a busy% of below 90% during rados 4k test. Maged On 2017-10-26 16:44, Russell Glaue wrote: > On Wed, Oct 25, 2017 at 7:09 PM, Maged Mokhtar wrote: > >> It depends on what stage you are in: >> in production, probably the best thing is to setup a monitoring tool &g

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-25 Thread Maged Mokhtar
ry well without any hint of a problem. >>>>> >>>>> Any other ideas or suggestions? >>>>> >>>>> -RG >>>>> >>>>> >>>>> On Wed, Oct 18, 2017 at 3:40 PM, Maged Mokhtar >>>>> wrote

Re: [ceph-users] Backup VM (Base image + snapshot)

2017-10-20 Thread Maged Mokhtar
Hi all, Can export-diff work effectively without the fast-diff rbd feature as it is not supported in kernel rbd ? Maged On 2017-10-19 23:18, Oscar Segarra wrote: > Hi Richard, > > Thanks a lot for sharing your experience... I have made deeper investigation > and it looks export-diff is t

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
peed in which in the dd infile can > be read? > And I assume the best test should be run with no other load. > > How does one run the rados bench "as stress"? > > -RG > > On Wed, Oct 18, 2017 at 1:33 PM, Maged Mokhtar wrote: > > measuring resource loa

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
de.servers.com/ssd-performance-2017-c4307a92dea [2] > > On Wed, Oct 18, 2017 at 11:53 AM, Maged Mokhtar wrote: > > Check out the following link: some SSDs perform bad in Ceph due to sync > writes to journal > > https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test
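The test described in that link boils down to a small synchronous-write fio run against the raw SSD (device name is a placeholder, and the run overwrites data on it):

  fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test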

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
0 > Total writes made: 31032 > Write size: 4096 > Object size:4096 > Bandwidth (MB/sec): 3.93282 > Stddev Bandwidth: 3.66265 > Max bandwidth (MB/sec): 13.668 > Min bandwidth (MB/sec): 0 > Average IOPS: 1006 > St

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
First a general comment: local RAID will be faster than Ceph for a single threaded (queue depth=1) io operation test. A single thread Ceph client will see at best the same disk speed for reads and, for writes, 4-6 times slower than a single disk. Not to mention the latency of local disks will be much better.
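A hedged way to reproduce that queue depth 1 comparison on a mapped rbd image (the device path is a placeholder, and the run writes to it):

  fio --filename=/dev/rbd0 --direct=1 --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based --name=qd1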

Re: [ceph-users] Ceph-ISCSI

2017-10-17 Thread Maged Mokhtar
The issue with active/active is the following condition: the client initiator sends a write operation to gateway server A; server A does not respond within the client timeout; the client initiator re-sends the failed write operation to gateway server B; the client initiator sends another write operation to gateway server C

Re: [ceph-users] Bareos and libradosstriper works only for 4M sripe_unit size

2017-10-17 Thread Maged Mokhtar
>> Would it be 4 objects of 24M and 4 objects of 250KB? Or will the last 4 objects be artificially padded (with 0's) to meet the stripe_unit? It will be 4 objects of 24M + 1M stored on the 5th object. If you write 104M : 4 objects of 24M + 8M stored on the 5th object. If you write 105M : 4 obje

Re: [ceph-users] osd max scrubs not honored?

2017-10-15 Thread Maged Mokhtar
Correction: I limit it to 128K: echo 128 > /sys/block/sdX/queue/read_ahead_kb On 2017-10-15 13:14, Maged Mokhtar wrote: > On 2017-10-14 05:02, J David wrote: > >> Thanks all for input on this. >> >> It's taken a couple of weeks, but based on the feedback
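To make that read_ahead_kb setting survive reboots, one option (a sketch, assuming a udev-based distro and that all sd* rotational disks should get it) is a udev rule:

  # /etc/udev/rules.d/80-readahead.rules
  ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{queue/read_ahead_kb}="128"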

Re: [ceph-users] osd max scrubs not honored?

2017-10-15 Thread Maged Mokhtar
On 2017-10-14 05:02, J David wrote: > Thanks all for input on this. > > It's taken a couple of weeks, but based on the feedback from the list, > we've got our version of a scrub-one-at-a-time cron script running and > confirmed that it's working properly. > > Unfortunately, this hasn't really so

Re: [ceph-users] Ceph iSCSI login failed due to authorization failure

2017-10-14 Thread Maged Mokhtar
On 2017-10-14 17:50, Kashif Mumtaz wrote: > Hello Dear, > > I am trying to configure the Ceph iscsi gateway on Ceph Luminious . As per > below > > Ceph iSCSI Gateway -- Ceph Documentation [1] > > [1] > > CEPH ISCSI GATEWAY — CEPH DOCUMENTATION > > Ceph is iscsi gateway are configured and

Re: [ceph-users] Ceph-ISCSI

2017-10-12 Thread Maged Mokhtar
On 2017-10-12 11:32, David Disseldorp wrote: > On Wed, 11 Oct 2017 14:03:59 -0400, Jason Dillaman wrote: > > On Wed, Oct 11, 2017 at 1:10 PM, Samuel Soulard > wrote: Hmmm, If you failover the identity of the > LIO configuration including PGRs > (I believe they are files on disk), this would wor

Re: [ceph-users] Ceph-ISCSI

2017-10-12 Thread Maged Mokhtar
On 2017-10-11 14:57, Jason Dillaman wrote: > On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López > wrote: > >> As far as I am able to understand there are 2 ways of setting iscsi for ceph >> >> 1- using kernel (lrbd) only able on SUSE, CentOS, fedora... > > The target_core_rbd approach is on

Re: [ceph-users] rados_read versus rados_aio_read performance

2017-10-01 Thread Maged Mokhtar
On 2017-10-01 16:47, Alexander Kushnirenko wrote: > Hi, Gregory! > > Thanks for the comment. I compiled a simple program to play with write speed > measurements (from librados examples). The underlying "write" functions are: > rados_write(io, "hw", read_res, 1048576, i*1048576); > rados_aio_write(i

Re: [ceph-users] Get rbd performance stats

2017-09-29 Thread Maged Mokhtar
On 2017-09-29 17:13, Matthew Stroud wrote: > Is there a way I could get a performance stats for rbd images? I'm looking > for iops and throughput. > > This issue we are dealing with is that there was a sudden jump in throughput > and I want to be able to find out with rbd volume might be causi

Re: [ceph-users] Ceph OSD on Hardware RAID

2017-09-29 Thread Maged Mokhtar
On 2017-09-29 17:14, Hauke Homburg wrote: > Hello, > > I think that Ceph users don't recommend running a ceph osd on hardware > RAID, but I haven't found a technical solution for this. > > Can anybody give me such a solution? > > Thanks for your help > > Regards > > Hauke You get better perform

Re: [ceph-users] osd create returns duplicate ID's

2017-09-29 Thread Maged Mokhtar
On 2017-09-29 11:31, Maged Mokhtar wrote: > On 2017-09-29 10:44, Adrian Saul wrote: > > Do you mean that after you delete and remove the crush and auth entries for > the OSD, when you go to create another OSD later it will re-use the previous > OSD ID that you have destro

Re: [ceph-users] osd create returns duplicate ID's

2017-09-29 Thread Maged Mokhtar
On 2017-09-29 10:44, Adrian Saul wrote: > Do you mean that after you delete and remove the crush and auth entries for > the OSD, when you go to create another OSD later it will re-use the previous > OSD ID that you have destroyed in the past? > > Because I have seen that behaviour as well - bu

Re: [ceph-users] RBD features(kernel client) with kernel version

2017-09-26 Thread Maged Mokhtar
On 2017-09-25 14:29, Ilya Dryomov wrote: > On Sat, Sep 23, 2017 at 12:07 AM, Muminul Islam Russell > wrote: > >> Hi Ilya, >> >> Hope you are doing great. >> Sorry for bugging you. I did not find enough resources for my question. I >> would be really helped if you could reply me. My questions

Re: [ceph-users] trying to understanding crush more deeply

2017-09-22 Thread Maged Mokhtar
d this ? Can you explain something > about this ? Apologize for my dummy. And thank you very much . : ) > > On Fri, Sep 22, 2017 at 3:50 PM, Maged Mokhtar wrote: > >> Per section 3.4.4 The default bucket type straw computes the hash of (PG >> number, replica number, bucket
