Re: [ceph-users] Poor performance on all SSD cluster

2014-06-20 Thread Greg Poirier
Thanks Tyler. So, I'm not totally crazy. There is something weird going on. I've looked into things about as much as I can: - We have tested with collocated journals and dedicated journal disks. - We have bonded 10Gb NICs and have verified network configuration and connectivity is sound. - We have

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-20 Thread Tyler Wilson
Greg, Not a real fix for you but I too run a full-ssd cluster and am able to get 112MB/s with your command;
[root@plesk-test ~]# dd if=/dev/zero of=testfilasde bs=16k count=65535 oflag=direct
65535+0 records in
65535+0 records out
1073725440 bytes (1.1 GB) copied, 9.59092 s, 112 MB/s
This of cou

Re: [ceph-users] qemu/librbd versus qemu/kernel module rbd

2014-06-20 Thread Chad Seys
Hi John, Thanks for the reply! Yes, I agree Ceph is exciting! Keep up the good work! > Using librbd, as you've pointed out, doesn't run afoul of potential Linux > kernel deadlocks; however, you normally wouldn't encounter this type of > situation in a production cluster anyway as you'd likel

[ceph-users] Multiple hierarchies and custom placement

2014-06-20 Thread Shayan Saeed
Is it allowed for CRUSH maps to have multiple hierarchies for different pools? For example, I want one pool to treat my cluster as flat, with every host being equal, but the other pool to use a more hierarchical layout such as hosts -> racks -> root. Also, is it currently possible in Ceph to have a custom
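
For what it's worth, CRUSH rules can generally express this kind of per-pool placement: two rules can walk the same hierarchy but stop at different bucket types, and each pool is pointed at its own rule. A rough sketch of what the two rules might look like in a decompiled CRUSH map (rule names, ids, pool names and the "default" root are illustrative, not taken from this thread):

rule flat_hosts {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule rack_aware {
        ruleset 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type rack
        step emit
}

# ceph osd pool set flatpool crush_ruleset 1
# ceph osd pool set rackpool crush_ruleset 2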

Re: [ceph-users] Designing a cluster with ceph and benchmark (ceph vs ext4)

2014-06-20 Thread Listas@Adminlinux
Hi Pieter, At the time our cluster environment was Ubuntu 10.04 + Kernel-2.6.32 + ocfs2-tools-1.4.3. Later we upgraded to Ubuntu 10.10 + Kernel-2.6.35 + ocfs2-tools-1.6.4. We tried to use OCFS2 in production in 2010, but were forced to migrate to a failover-design cluster with Ex

[ceph-users] Poor performance on all SSD cluster

2014-06-20 Thread Greg Poirier
I recently created a 9-node Firefly cluster backed by all SSDs. We have had some pretty severe performance degradation when using O_DIRECT in our tests (as this is how MySQL will be interacting with RBD volumes, this makes the most sense for a preliminary test). Running the following test: dd if=/
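
The full command appears in Tyler Wilson's reply earlier in this digest; as a point of comparison, the direct and buffered variants of the same test look roughly like this (file name illustrative):

# dd if=/dev/zero of=testfile bs=16k count=65535 oflag=direct
# dd if=/dev/zero of=testfile bs=16k count=65535 conv=fdatasync

With oflag=direct the page cache is bypassed and each 16k write must be acknowledged by the RBD stack before the next one is issued, so this test tends to be dominated by per-write latency (journal and network round trips) rather than raw SSD throughput.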

Re: [ceph-users] Designing a cluster with ceph and benchmark (ceph vs ext4)

2014-06-20 Thread Listas@Adminlinux
Thank you Alexandre, I'm setting up a new test environment with a 10Gb network and an updated version of Ceph for a new benchmark. []'s -- Thiago Henrique www.adminlinux.com.br On 01-06-2014 05:23, Alexandre DERUMIER wrote: hi! See design here: http://adminlinux.com.br/cluster_design.txt # d

[ceph-users] Cluster design with Ceph in a service provider

2014-06-20 Thread Listas@Adminlinux
Hi! I work at a large hosting company in Brazil. We have web, email, databases and other services. I'm studying Ceph to understand how it could be useful to us. I would like to hear from people who already use Ceph in a production environment. I wonder what types of services people are

Re: [ceph-users] qemu/librbd versus qemu/kernel module rbd

2014-06-20 Thread John Wilkins
Chad, I'm sure others can speak to performance better than I can. However, using kernel RBD was only vulnerable to deadlocks when trying to mount a block device on the same host that had Ceph server daemons like monitors or OSDs running. It was a kernel issue, not a Ceph issue. Otherwise, the rea

Re: [ceph-users] Error 95: Operation not supported

2014-06-20 Thread Loic Dachary
On 20/06/2014 16:32, Pavel V. Kaygorodov wrote: > Hi! > > I'm getting a strange error, trying to create rbd image: > > # rbd -p images create --size 10 test > rbd: create error: (95) Operation not supported > 2014-06-20 18:28:39.537889 7f32af795780 -1 librbd: error adding image to > directory:

Re: [ceph-users] Bug in erasure-code-profiles

2014-06-20 Thread Loic Dachary
On 20/06/2014 17:50, Shayan Saeed wrote: > I am in the process of learning about ceph. Now I have created a few > erasure-coded profiles and removed them too. However, I get inconsistencies > when I call the command > > "ceph osd erasure-code-profile ls" > > Instead of consistently listing a

Re: [ceph-users] erasure coding parameter's choice and performance

2014-06-20 Thread Loic Dachary
Hi David, On 20/06/2014 14:15, David Z wrote: > > Hi Loic, > > We are evaluating erasure coding and we want to tolerate 3 chunks failure. > Then we choose cauchy_orig because RS's performance should be no better than > cauchy_orig and other algorithms are optimized for raid6 mode. > > For cauc
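
For reference, a jerasure profile tolerating three lost chunks with the cauchy_orig technique could be declared roughly as follows; k=6 and the host failure domain are assumptions for illustration, since the thread only fixes m=3:

# ceph osd erasure-code-profile set cauchy3 plugin=jerasure k=6 m=3 technique=cauchy_orig ruleset-failure-domain=host
# ceph osd erasure-code-profile get cauchy3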

[ceph-users] Bug in erasure-code-profiles

2014-06-20 Thread Shayan Saeed
I am in the process of learning about ceph. Now I have created a few erasure-coded profiles and removed them too. However, I get inconsistencies when I call the command "ceph osd erasure-code-profile ls" Instead of consistently listing all the profiles, it alternates between 2 choices: 1. my-cod
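
For anyone trying to reproduce the inconsistency, the profile management commands involved are roughly these (profile name and k/m values illustrative):

# ceph osd erasure-code-profile set my-code k=2 m=1
# ceph osd erasure-code-profile ls
# ceph osd erasure-code-profile get my-code
# ceph osd erasure-code-profile rm my-code
# ceph osd erasure-code-profile ls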

[ceph-users] Error 95: Operation not supported

2014-06-20 Thread Pavel V. Kaygorodov
Hi! I'm getting a strange error, trying to create rbd image: # rbd -p images create --size 10 test rbd: create error: (95) Operation not supported 2014-06-20 18:28:39.537889 7f32af795780 -1 librbd: error adding image to directory: (95) Operation not supported The images -- erasure encoded pool
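
A likely explanation: creating an RBD image writes omap entries (the image header and the rbd_directory object), and erasure-coded pools do not support omap operations, so librbd gets EOPNOTSUPP (95). The usual Firefly-era workaround is to put a replicated cache tier in front of the EC pool and let RBD go through it; a sketch, with a hypothetical cache pool name and PG count:

# ceph osd pool create images-cache 128
# ceph osd tier add images images-cache
# ceph osd tier cache-mode images-cache writeback
# ceph osd tier set-overlay images images-cache
# rbd -p images create --size 10 test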

[ceph-users] qemu/librbd versus qemu/kernel module rbd

2014-06-20 Thread Chad Seys
Hi All, What are the pros and cons of running a virtual machine (with qemu-kvm) whose image is accessed via librbd or by mounting /dev/rbdX? I've heard that the librbd method has the advantage of not being vulnerable to deadlocks due to memory allocation problems. Would one also benefit
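
For concreteness, the two access paths being compared look roughly like this (pool and image names and the qemu options are illustrative):

With librbd, qemu opens the image directly through librados:
# qemu-system-x86_64 -m 1024 -drive format=raw,file=rbd:rbd/vm-disk:id=admin

With the kernel client, the image is mapped on the host first and qemu is handed an ordinary block device:
# rbd map rbd/vm-disk
# qemu-system-x86_64 -m 1024 -drive format=raw,file=/dev/rbd/rbd/vm-disk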

Re: [ceph-users] OSD asserts in OSD shutdown

2014-06-20 Thread Sahana Lokeshappa
Sahana Lokeshappa Test Development Engineer I 3rd Floor, Bagmane Laurel, Bagmane Tech Park C V Raman nagar, Bangalore 560093 T: +918042422283 sahana.lokesha...@sandisk.com From: Ceph-community [mailto:ceph-community-boun...@lists.ceph.com] On Behalf Of Sahan

[ceph-users] OSD asserts in OSDService

2014-06-20 Thread Sahana Lokeshappa
Hi all, I have a ceph cluster with 8 nodes with 3 OSDs in each node, and 3 monitors. While client IO and recovery IO were going on, I removed one of the OSDs for some reason. The OSD crashed with assert:
 1: (OSD::shutdown()+0x176f) [0x61e94f]
 2: (OSD::handle_signal(int)+0x60) [0x61f210]
 3: (Signa
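
For context, the standard sequence for taking an OSD out of the cluster and removing it looks like this (osd id illustrative); the assert above fires in the daemon's shutdown path once the process is stopped:

# ceph osd out 12
# service ceph stop osd.12        (run on the node hosting the OSD)
# ceph osd crush remove osd.12
# ceph auth del osd.12
# ceph osd rm 12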

[ceph-users] OSD : assert at OSDService

2014-06-20 Thread Sahana
Hi All, I have a ceph cluster with 8 nodes, 3 OSDs in each, and 3 monitors. I ran the command: ceph osd thrash 101
2014-06-09 18:05:58.001622 7f421da58700 0 mon.ip-10-15-16-63@0(leader) e1 handle_command mon_command( {"prefix": "osd thrash", "num_epochs": 101} v 0) v1
The OSDs are thrashed for 101 tim
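
For anyone reproducing this, the thrash command plus a couple of ways to watch what it does to the cluster (standard CLI):

# ceph osd thrash 101
# ceph -w
# ceph osd tree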

Re: [ceph-users] understanding rados df statistics

2014-06-20 Thread george.ryall
Thanks. I hadn’t actually found ‘ceph df’. It probably just needs a brief description of what the raw totals include. One question relating to this: the documentation you’ve linked to suggests that the pool usage stats are converted to megabytes and gigabytes where relevant; are they also conv
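
For reference, the commands being discussed, with what each one reports:

# ceph df           (cluster-wide RAW totals plus per-pool usage)
# ceph df detail
# rados df          (per-pool KB and object counts as reported by the cluster)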

Re: [ceph-users] Cache tier pool in CephFS

2014-06-20 Thread Sherry Shahbazi
Another problem I faced is that when I start a benchmark that generates a heavy workload, the cache pool does not evict/flush objects during the workload generation. Even though I set target_max_objects to a certain amount, it does not flush/evict the objects while the workload is being generated! So my
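
For comparison, the Firefly cache-tier knobs that drive flushing and eviction look like this (pool name and values illustrative):

# ceph osd pool set cache-pool target_max_objects 100000
# ceph osd pool set cache-pool cache_target_dirty_ratio 0.4
# ceph osd pool set cache-pool cache_target_full_ratio 0.8
# ceph osd pool set cache-pool cache_min_flush_age 600
# ceph osd pool set cache-pool cache_min_evict_age 600

Flushing normally only starts once the dirty fraction of target_max_objects is exceeded, and eviction once the full fraction is exceeded, so a cache whose thresholds sit above what the benchmark actually writes may simply never flush or evict during the run.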