[ceph-users] Bug report: unexpected behavior when executing Lua object class

2017-06-02 Thread Zheyuan Chen
Hi, I found two bugs while testing the Lua object class. I am running Ceph 11.2.0. Can anybody take a look at them? Zheyuan Bug 1: I cannot get the returned output in the first script; "data" is always empty. import rados, json > cluster = rados.Rados(conffile='') > cluster.connect() > ioctx

Re: [ceph-users] is there any way to speed up cache evicting?

2017-06-02 Thread jiajia zhong
David, 2017-06-02 21:41 GMT+08:00 David Turner : > I'm thinking you have erasure coding in cephfs and only use cache tiering > because you have to, correct? What is your use case for repeated file > accesses? How much data is written into cephfs at a time? > these days, up

Re: [ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread Christian Wuerdig
Well, what's "best" really depends on your needs and use case. The general advice that has been floated several times now is to have at least N+2 entities of your failure domain in your cluster. So for example, if you run with size=3 then you should have at least 5 OSDs if your failure domain is

Re: [ceph-users] RGW lifecycle not expiring objects

2017-06-02 Thread Yehuda Sadeh-Weinraub
Have you opened a ceph tracker issue, so that we don't lose track of the problem? Thanks, Yehuda On Fri, Jun 2, 2017 at 3:05 PM, wrote: > Hi Graham. > > We are on Kraken and have the same problem with "lifecycle". Various (other) > tools like s3cmd or CyberDuck

Re: [ceph-users] RGW lifecycle not expiring objects

2017-06-02 Thread ceph . novice
Hi Graham. We are on Kraken and have the same problem with "lifecycle". Various (other) tools like s3cmd or CyberDuck do show the applied "expiration" settings, but objects never seem to be purged. If you have any new findings or hints, PLEASE share/let me know. Thanks a lot! Anton

Re: [ceph-users] Crushmap from Rack aware to Node aware

2017-06-02 Thread Anthony D'Atri
All very true and worth considering, but I feel compelled to mention the strategy of setting mon_osd_down_out_subtree_limit carefully to prevent automatic rebalancing. *If* the loss of a failure domain is temporary, i.e. something you can fix fairly quickly, it can be preferable to not start
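
For reference, a minimal sketch of the setting mentioned above (the value is only an example, not taken from the original message):

  # ceph.conf on the monitors: if every OSD in a subtree of this type (or
  # larger) goes down at once, do not automatically mark those OSDs out,
  # so no rebalance starts until an operator decides it is not a quick fix.
  [mon]
  mon osd down out subtree limit = host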

Re: [ceph-users] RBD exclusive-lock and qemu/librbd

2017-06-02 Thread koukou73gr
Coming back to this, with Jason's insight it was quickly revealed that my problem was in reality a cephx authentication permissions issue. Specifically, exclusive-lock requires a cephx user with class-write access to the pool where the image resides. This wasn't clear in the documentation and the
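
For anyone hitting the same issue, a hedged sketch of the kind of cap change involved (the client name and pool are placeholders, not taken from this thread); in cephx OSD caps the 'x' bit grants class-read/class-write:

  ceph auth caps client.qemu \
      mon 'allow r' \
      osd 'allow rwx pool=rbd'
  ceph auth get client.qemu   # verify the resulting caps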

Re: [ceph-users] OSD crash loop - FAILED assert(recovery_info.oi.snaps.size())

2017-06-02 Thread Steve Anthony
I'm seeing this again on two OSDs after adding another 20 disks to my cluster. Is there some way I can determine which snapshots the recovery process is looking for? Or maybe find and remove the objects it's trying to recover, since there's apparently a problem with them? Thanks! -Steve On

Re: [ceph-users] Luminous: bluestore 'tp_osd_tp thread tp_osd_tp' had timed out after 60

2017-06-02 Thread Mark Nelson
I got a chance to run this by Josh and he had a good thought. Just to make sure that it's not IO backing up on the device, it probably makes sense to repeat the test and watch what the queue depth and service times look like. I like using collectl for it: "collectl -sD -oT" The queue depth
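
For reference, the suggested invocation plus an alternative (device selection and tool choice are up to the reader):

  collectl -sD -oT   # per-disk detail with timestamps, as suggested above
  iostat -x 1        # alternative: watch avgqu-sz (queue depth) and await/svctm (service times)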

Re: [ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread Oleg Obleukhov
But what would be best? Keep 3 servers, and how many OSDs? Thanks! > On 2 Jun 2017, at 17:09, David Turner wrote: > > That's good for testing in the small scale. For production I would revisit > using size 3. Glad you got it working. > > On Fri, Jun 2, 2017 at 11:02

Re: [ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread David Turner
That's good for testing in the small scale. For production I would revisit using size 3. Glad you got it working. On Fri, Jun 2, 2017 at 11:02 AM Oleg Obleukhov wrote: > Thanks to everyone, > problem is solved by: > ceph osd pool set cephfs_metadata size 2 > ceph osd

Re: [ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread Oleg Obleukhov
Thanks to everyone, problem is solved by: ceph osd pool set cephfs_metadata size 2 ceph osd pool set cephfs_data size 2 Best, Oleg. > On 2 Jun 2017, at 16:15, Oleg Obleukhov wrote: > > Hello, > I am playing around with ceph (ceph version 10.2.7 >

Re: [ceph-users] Crushmap from Rack aware to Node aware

2017-06-02 Thread David Turner
I agree that running with a min_size of 1 is worse than running with only 3 failure domains. Even if it's just for a short time and you're monitoring it closely... it takes mere seconds before you could have corrupt data with a min_size of 1 (depending on your use case). That right there is the key.

Re: [ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread David Turner
Also, your min_size is set to 2. What this means is that you need at least 2 copies of your data up to be able to access it. You do not want to have a min_size of 1. If you had a min_size of 1 and only 1 copy of your data receiving writes, and then that copy goes down as well... What is to
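
For reference, a minimal sketch of checking and changing this setting (the pool name follows the poster's test cluster):

  ceph osd pool get cephfs_data min_size
  ceph osd pool set cephfs_data min_size 2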

Re: [ceph-users] Crushmap from Rack aware to Node aware

2017-06-02 Thread Laszlo Budai
What you're saying, that if we only have 3 failure domains then Ceph can do nothing to maintain 3 copies in case an entire failure domain is lost, is correct. BUT if you're losing 2 replicas out of 3 of your data, and your min_size is set to 2 (the recommended minimum), then you have an

Re: [ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread Burkhard Linke
Hi, On 06/02/2017 04:15 PM, Oleg Obleukhov wrote: Hello, I am playing around with ceph (ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)) on Debian Jessie and I build a test setup: $ ceph osd tree ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1

Re: [ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread Etienne Menguy
I think it's because af-staging-ceph02's data can only be moved to af-staging-ceph01/3, which already have the data. There is no acceptable place to create the third replica of the data. Etienne From: ceph-users on behalf of

Re: [ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread Ashley Merrick
You only have 3 OSDs, hence with one down you only have 2 left for a replication size of 3. There is no spare OSD to place the 3rd copy on; if you were to add a 4th node the issue would go away. ,Ashley On 2 Jun 2017, at 10:31 PM, Oleg Obleukhov >
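
A minimal sketch of confirming this from the cluster itself (commands only; output omitted):

  ceph health detail                   # which PGs are undersized/degraded
  ceph pg dump_stuck undersized        # the stuck PGs and their acting sets
  ceph osd pool get cephfs_data size   # the replica count the PGs are trying to reach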

[ceph-users] Recovery stuck in active+undersized+degraded

2017-06-02 Thread Oleg Obleukhov
Hello, I am playing around with ceph (ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)) on Debian Jessie and I build a test setup: $ ceph osd tree ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 0.01497 root default -2 0.00499 host af-staging-ceph01

Re: [ceph-users] Crushmap from Rack aware to Node aware

2017-06-02 Thread David Turner
You wouldn't be able to guarantee that the cluster will not use 2 servers from the same rack. The problem with 3 failure domains, however, is that if you lose a full failure domain, Ceph can do nothing to maintain 3 copies of your data. It leaves you in a position where you need to rush to the

Re: [ceph-users] is there any way to speed up cache evicting?

2017-06-02 Thread David Turner
I'm thinking you have erasure coding in cephfs and only use cache tiering because you have to, correct? What is your use case for repeated file accesses? How much data is written into cephfs at a time? For me, my files are infrequently accessed after they are written or read from the EC back-end
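
Since the thread is about speeding up eviction, a hedged sketch of the pool settings that drive flushing/eviction on a cache tier (pool name and values are placeholders, not recommendations):

  ceph osd pool set cache-pool cache_target_dirty_ratio 0.4    # start flushing dirty objects earlier
  ceph osd pool set cache-pool cache_target_full_ratio 0.7     # start evicting clean objects earlier
  ceph osd pool set cache-pool target_max_bytes 500000000000   # absolute size the ratios apply against
  rados -p cache-pool cache-flush-evict-all                    # one-off: flush and evict everything now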

Re: [ceph-users] should I use rocksdb ?

2017-06-02 Thread Mark Nelson
Hi Will, Few people have tried rocksdb as the k/v store for filestore since we never really started supporting it for production use (We ended up deciding to move on to bluestore). I suspect it will be faster than leveldb but I don't think anyone has actually tested filestore+rocksdb to any
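
If someone still wants to experiment, the relevant knob is the filestore omap backend; a hedged sketch only (availability depends on the Ceph version, it generally only applies to newly created OSDs, and as noted above it is not a supported production path):

  [osd]
  filestore omap backend = rocksdb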

Re: [ceph-users] RBD exclusive-lock and qemu/librbd

2017-06-02 Thread Peter Maloney
On 06/02/17 12:25, koukou73gr wrote: > On 2017-06-02 13:01, Peter Maloney wrote: >>> Is it easy for you to reproduce it? I had the same problem, and the same >>> solution. But it isn't easy to reproduce... Jason Dillaman asked me for >>> a gcore dump of a hung process but I wasn't able to get one.

Re: [ceph-users] RBD exclusive-lock and qemu/librbd

2017-06-02 Thread koukou73gr
On 2017-06-02 13:22, Peter Maloney wrote: > On 06/02/17 12:06, koukou73gr wrote: >> Thanks for the reply. >> >> Easy? >> Sure, it happens reliably every time I boot the guest with >> exclusive-lock on :) > If it's that easy, also try with only exclusive-lock, and not object-map > nor fast-diff.

Re: [ceph-users] RBD exclusive-lock and qemu/librbd

2017-06-02 Thread koukou73gr
On 2017-06-02 13:01, Peter Maloney wrote: >> Is it easy for you to reproduce it? I had the same problem, and the same >> solution. But it isn't easy to reproduce... Jason Dillaman asked me for >> a gcore dump of a hung process but I wasn't able to get one. Can you do >> that, and when you reply,

Re: [ceph-users] RBD exclusive-lock and qemu/librbd

2017-06-02 Thread Peter Maloney
On 06/02/17 12:06, koukou73gr wrote: > Thanks for the reply. > > Easy? > Sure, it happens reliably every time I boot the guest with > exclusive-lock on :) If it's that easy, also try with only exclusive-lock, and not object-map nor fast-diff. And also with one or the other of those. > > I'll need

Re: [ceph-users] RBD exclusive-lock and qemu/librbd

2017-06-02 Thread koukou73gr
Thanks for the reply. Easy? Sure, it happens reliably every time I boot the guest with exclusive-lock on :) I'll need some walkthrough on the gcore part though! -K. On 2017-06-02 12:59, Peter Maloney wrote: > On 06/01/17 17:12, koukou73gr wrote: >> Hello list, >> >> Today I had to create a

Re: [ceph-users] RBD exclusive-lock and qemu/librbd

2017-06-02 Thread Peter Maloney
On 06/02/17 11:59, Peter Maloney wrote: > On 06/01/17 17:12, koukou73gr wrote: >> Hello list, >> >> Today I had to create a new image for a VM. This was the first time, >> since our cluster was updated from Hammer to Jewel. So far I was just >> copying an existing golden image and resized it as

Re: [ceph-users] RBD exclusive-lock and qemu/librbd

2017-06-02 Thread Peter Maloney
On 06/01/17 17:12, koukou73gr wrote: > Hello list, > > Today I had to create a new image for a VM. This was the first time, > since our cluster was updated from Hammer to Jewel. So far I was just > copying an existing golden image and resized it as appropriate. But this > time I used rbd create. >

Re: [ceph-users] Crushmap from Rack aware to Node aware

2017-06-02 Thread Laszlo Budai
Hi David, If I understand correctly, your suggestion is the following: if we have, for instance, 12 servers grouped into 3 racks (4/rack), then you would build a crush map saying that you have 6 racks (virtual ones) with 2 servers in each of them, right? In this case, if we are setting the failure
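
Not from the thread, but a hedged sketch of how such "virtual racks" are typically created and populated (bucket and host names are made up):

  ceph osd crush add-bucket vrack1 rack
  ceph osd crush add-bucket vrack2 rack
  ceph osd crush move vrack1 root=default
  ceph osd crush move vrack2 root=default
  ceph osd crush move server01 rack=vrack1
  ceph osd crush move server02 rack=vrack1
  # repeat for the remaining hosts and vrack3 through vrack6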

Re: [ceph-users] is there any way to speed up cache evicting?

2017-06-02 Thread jiajia zhong
Thank you for the guidance :), it makes sense. 2017-06-02 16:17 GMT+08:00 Christian Balzer : > > Hello, > > On Fri, 2 Jun 2017 14:30:56 +0800 jiajia zhong wrote: > > > christian, thanks for your reply. > > > > 2017-06-02 11:39 GMT+08:00 Christian Balzer : > > > > >

[ceph-users] should I use rocksdb ?

2017-06-02 Thread Z Will
Hello gurus: My name is Will. I have just started studying Ceph and have a lot of interest in it. We are using ceph 0.94.10, and I am trying to tune the performance of Ceph to satisfy our requirements. We are using it as an object store now. I have tried some different configurations, but I

Re: [ceph-users] is there any way to speed up cache evicting?

2017-06-02 Thread Christian Balzer
Hello, On Fri, 2 Jun 2017 14:30:56 +0800 jiajia zhong wrote: > christian, thanks for your reply. > > 2017-06-02 11:39 GMT+08:00 Christian Balzer : > > > On Fri, 2 Jun 2017 10:30:46 +0800 jiajia zhong wrote: > > > > > hi guys: > > > > > > Our ceph cluster is working with tier

[ceph-users] About dmClock tests confusion after integrating dmClock QoS library into ceph codebase

2017-06-02 Thread Lijie
Hi Eric, Our team has developed a QoS feature for Ceph using the dmclock library from the community. We treat an RBD image as a dmclock client instead of a pool. We tested our code and the results are confusing. Testing environment: single server with 16 cores, 32 GB of RAM, 8 non-system disks, each runs

Re: [ceph-users] is there any way to speed up cache evicting?

2017-06-02 Thread jiajia zhong
Christian, thanks for your reply. 2017-06-02 11:39 GMT+08:00 Christian Balzer : > On Fri, 2 Jun 2017 10:30:46 +0800 jiajia zhong wrote: > > > hi guys: > > > > Our ceph cluster is working with tier cache. > If so, then I suppose you read all the discussions here as well and not >