Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-03 Thread Wido den Hollander
On 03-11-15 01:54, Voloshanenko Igor wrote: > Thank you, Jason! > > Any advice for troubleshooting? > > I'm looking in the code, and right now don't see any bad things :( > Can you run the CloudStack Agent in DEBUG mode and then see after which lines in the logs it crashes? Wido >
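
If it helps, a minimal sketch of turning on DEBUG logging for the KVM agent; the paths below are the usual 4.x defaults and may differ on your install:

  # Switch the agent's log4j threshold from INFO to DEBUG (blunt, but effective)
  sed -i 's/INFO/DEBUG/g' /etc/cloudstack/agent/log4j-cloud.xml

  # Restart the agent and watch the log while reproducing the crash
  service cloudstack-agent restart
  tail -f /var/log/cloudstack/agent/agent.log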

Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-03 Thread Voloshanenko Igor
Wido, also a minor issue with 0.2.0 java-rados. We still catch: -storage/ae1b6e5f-f5f4-4abe-aee3-084f2fe71876 2015-11-02 11:41:14,958 WARN [cloud.agent.Agent] (agentRequest-Handler-4:null) Caught: java.lang.NegativeArraySizeException at com.ceph.rbd.RbdImage.snapList(Unknown Source) at

Re: [ceph-users] ceph new osd addition and client disconnected

2015-11-03 Thread Chris Taylor
On 2015-11-03 12:01 am, gjprabu wrote: Hi Taylor, Details are below. ceph -s cluster 944fa0af-b7be-45a9-93ff-b9907cfaee3f health HEALTH_OK monmap e2: 3 mons at {integ-hm5=192.168.112.192:6789/0,integ-hm6=192.168.112.193:6789/0,integ-hm7=192.168.112.194:6789/0} election epoch 526, quorum

Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-03 Thread Wido den Hollander
On 03-11-15 10:04, Voloshanenko Igor wrote: > Wido, also a minor issue with 0.2.0 java-rados > Did you also re-compile CloudStack against the new rados-java? I still think it's related to when the Agent starts cleaning up and there are snapshots which need to be unprotected. In the meantime you

Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-03 Thread Voloshanenko Igor
Yes, we recompiled ACS too. Also we deleted all snapshots... but we can only do that for a while... New snapshots are created each day... And the main issue is the agent crash, not the exception itself... Each RBD operation which causes an exception crashes the agent within 20-30 minutes... 2015-11-03 11:09 GMT+02:00

Re: [ceph-users] ceph new osd addition and client disconnected

2015-11-03 Thread gjprabu
Hi Taylor, Details are below. ceph -s cluster 944fa0af-b7be-45a9-93ff-b9907cfaee3f health HEALTH_OK monmap e2: 3 mons at {integ-hm5=192.168.112.192:6789/0,integ-hm6=192.168.112.193:6789/0,integ-hm7=192.168.112.194:6789/0} election epoch 526, quorum 0,1,2

Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-03 Thread Voloshanenko Igor
Wido, it's the main issue. No records at all... So, from last time: 2015-11-02 11:40:33,204 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-2:null) Executing: /bin/bash -c free|grep Mem:|awk '{print $2}' 2015-11-02 11:40:33,207 DEBUG [kvm.resource.LibvirtComputingResource]

[ceph-users] rados bench leaves objects in tiered pool

2015-11-03 Thread Дмитрий Глушенок
Hi, While benchmarking a tiered pool using rados bench I noticed that objects are not being removed after the test. The test was performed using "rados -p rbd bench 3600 write". The pool is not used by anything else. Just before the end of the test: POOLS: NAME ID USED
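
For context, a sketch of the benchmark plus the cleanup steps; "rbd" is the base pool from the report and "cache-pool" is a placeholder for the tier in front of it:

  # Write benchmark against the base pool (all I/O passes through the cache tier)
  rados -p rbd bench 3600 write

  # Ask rados to delete the benchmark objects it created
  rados -p rbd cleanup

  # Force the cache tier to flush and evict everything to the backing pool
  rados -p cache-pool cache-flush-evict-all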

[ceph-users] One object in .rgw.buckets.index causes systemic instability

2015-11-03 Thread Gerd Jakobovitsch
Dear all, I have a cluster running hammer (0.94.5), with 5 nodes. The main usage is S3-compatible object storage. I am running into a very troublesome problem on a ceph cluster. A single object in .rgw.buckets.index is not responding to requests and takes a very long time while
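
A sketch of how to narrow down which PG and OSD serve the troublesome index object; the object name (typically of the form .dir.<bucket_id>) is a placeholder:

  # Map the bucket index object to its placement group and acting OSDs
  ceph osd map .rgw.buckets.index .dir.<bucket_id>

  # Inspect the PG reported above
  ceph pg <pgid> query

  # On the node hosting the primary OSD, look for blocked requests
  ceph daemon osd.<id> dump_ops_in_flight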

[ceph-users] Choosing hp sata or sas SSDs for journals

2015-11-03 Thread Karsten Heymann
Hi, has anyone experience with HP-branded SSDs for journaling? Given that everything else is fixed (raid controller, cpu, etc...) and a fixed budget, would it be better to go with more of the cheaper 6G SATA write-intensive drives or should I aim for (then fewer) 12G SAS models? Here are the
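
Interface aside, one way to compare candidates is to measure small synchronous writes, which is roughly what the journal does. A sketch with fio; /dev/sdX is a placeholder and the test overwrites data on the device:

  # 4k direct, synced sequential writes at queue depth 1
  fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based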

[ceph-users] Ceph Openstack deployment

2015-11-03 Thread Iban Cabrillo
Hi all, During the last week I have been trying to deploy the pre-existing ceph cluster with our openstack instance. The ceph-cinder integration was easy (or at least I think so!!) There is only one volume to attach block storage to our cloud machines. The client.cinder has permission on
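
For comparison, the cephx caps the upstream ceph/OpenStack guide suggested for client.cinder at the time looked roughly like this; the pool names (volumes, vms, images) are assumptions:

  # Read access to the mons, rwx on the volume pools, read-only on images for cloning
  ceph auth get-or-create client.cinder \
      mon 'allow r' \
      osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rx pool=images'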

Re: [ceph-users] rados bench leaves objects in tiered pool

2015-11-03 Thread Robert LeBlanc
Try: rados -p {cachepool} cache-flush-evict-all and see if the objects clean up. - Robert LeBlanc On Tue, Nov 3, 2015 at 8:02 AM, Gregory Farnum wrote: > When

[ceph-users] some postmortem

2015-11-03 Thread Dzianis Kahanovich
OK, now my ceph cluster has died & been re-created. The main problem was too many pgs and disabled swap; then one of the nodes had problems with xfs (it even got stuck on mount) and everything started to die, finally while trying to edit pgs & deleting more than needed. But I see some issues. After a ceph-osd crash (out of RAM

Re: [ceph-users] rados bench leaves objects in tiered pool

2015-11-03 Thread Дмитрий Глушенок
Hi, Thanks Gregory and Robert, now it is a bit clearer. After cache-flush-evict-all almost all objects were deleted, but 101 remained in the cache pool. Also 1 pg changed its state to inconsistent with HEALTH_ERR. "ceph pg repair" changed the object count to 100, but at least ceph became healthy.

Re: [ceph-users] rados bench leaves objects in tiered pool

2015-11-03 Thread Gregory Farnum
Ceph maintains some metadata in objects. In this case it's hitsets, which keep track of object accesses for evaluating how hot an object is when flushing and evicting from the cache. On Tuesday, November 3, 2015, Дмитрий Глушенок wrote: > Hi, > > Thanks Gregory and Robert, now it
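
The hit-set bookkeeping is configured per pool; a small sketch of inspecting it, with "cache-pool" as a placeholder name:

  # How many hit sets are kept, how long each one covers, and their type (e.g. bloom)
  ceph osd pool get cache-pool hit_set_count
  ceph osd pool get cache-pool hit_set_period
  ceph osd pool get cache-pool hit_set_type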

Re: [ceph-users] two or three replicas?

2015-11-03 Thread Udo Lembke
Hi, for production (with enough OSDs) three replicas is the right choice. The chance of data loss if two OSDs fail at the same time is too high. And if this happens, most of your data is lost, because the data is spread over many OSDs... And yes - two replicas is faster for writes. Udo On
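
For reference, a minimal sketch of setting the replica count on an existing pool; "rbd" is a placeholder pool name and min_size controls how many copies must be available to accept I/O:

  # Keep three copies of every object
  ceph osd pool set rbd size 3

  # Keep serving I/O while one copy is temporarily missing
  ceph osd pool set rbd min_size 2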

[ceph-users] Can snapshot of image still be used while flattening the image?

2015-11-03 Thread Jackie
Hi experts, I have rbd images and snapshots as follows: image1 -> snapshot1 (snapshot of image1) -> image2 (cloned from snapshot1) -> snapshot2 (snapshot of image2). While I flatten image2, can I still use snapshot2 to clone a new image? Regards, Jackie
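
A sketch of the operations in question, with placeholder pool/image names; note that a snapshot must be protected before it can be cloned:

  # Detach image2 from its parent snapshot by copying in all parent data
  rbd flatten rbd/image2

  # Protect and clone snapshot2 (a snapshot of image2 itself, not of the parent chain)
  rbd snap protect rbd/image2@snapshot2
  rbd clone rbd/image2@snapshot2 rbd/image3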

[ceph-users] Using LVM on top of a RBD.

2015-11-03 Thread Daniel Hoffman
Hi All. I have a legacy server farm made up of 7 nodes running KVM and using LVM (LVs) for the disks of the virtual machines. The nodes at this time are CentOS 6. We would love to remove this small farm from our network and use Ceph RBD instead of a traditional iSCSI block device as we currently
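
A rough sketch of what LVM on top of a kernel-mapped RBD looks like; pool, image and VG names are placeholders, and whether the CentOS 6 krbd module supports your cluster's feature set is worth checking first:

  # Create and map an RBD image on the KVM host
  rbd create rbd/vmstore --size 1048576    # size in MB
  rbd map rbd/vmstore                      # shows up as /dev/rbd0

  # Use the mapped device as an LVM physical volume
  # (lvm.conf may need a "types" entry so LVM scans /dev/rbd* devices)
  pvcreate /dev/rbd0
  vgcreate vg_ceph /dev/rbd0
  lvcreate -L 50G -n vm_disk1 vg_ceph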

Re: [ceph-users] Choosing hp sata or sas SSDs for journals

2015-11-03 Thread Christian Balzer
Hello, On Tue, 3 Nov 2015 12:01:16 +0100 Karsten Heymann wrote: > Hi, > > has anyone experiences with hp-branded ssds for journaling? Given that > everything else is fixed (raid controller, cpu, etc...) and a fixed A raid controller that can hopefully be run well in JBOD mode or something

[ceph-users] Ceph Amazon S3 API

2015-11-03 Thread Богдан Тимофеев
I have 4 ceph nodes running on virtual machines in our corporate network. I've installed a ceph object gateway on the admin node. Can I somehow use it in Amazon S3 style from my windows machine in the same network, for example using the Amazon S3 Java API? -- Богдан
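
radosgw exposes an S3-compatible API, so in principle yes: create a user on the gateway, then point any S3 client (including the Amazon S3 Java SDK) at the radosgw endpoint instead of s3.amazonaws.com. A sketch, with a placeholder uid:

  # Create an S3-style user; the output contains access_key and secret_key
  radosgw-admin user create --uid=testuser --display-name="Test User"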

Re: [ceph-users] rados bench leaves objects in tiered pool

2015-11-03 Thread Gregory Farnum
When you have a caching pool in writeback mode, updates to objects (including deletes) are handled by writeback rather than writethrough. Since there's no other activity against these pools, there is nothing prompting the cache pool to flush updates out to the backing pool, so the backing pool

[ceph-users] iSCSI over RBD is a good idea?

2015-11-03 Thread Gaetan SLONGO
Dear Ceph users, We are currently working on the design of a virtualization infrastructure using oVirt and we would like to use Ceph. The problem is, at this time there is no native integration of Ceph in oVirt. One possibility is to export RBD devices over iSCSI (maybe you have a better one?). I've
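
For what it's worth, a rough sketch of the common approach at the time: map the image with krbd on a gateway host and export it via LIO/targetcli (device, IQN and backstore names are placeholders; ACLs and portals omitted):

  # Map the image on the iSCSI gateway host (appears as /dev/rbd0)
  rbd map rbd/ovirt-vol1

  # Export the mapped device over iSCSI
  targetcli /backstores/block create name=ovirt-vol1 dev=/dev/rbd0
  targetcli /iscsi create iqn.2015-11.local.example:ovirt-vol1
  targetcli /iscsi/iqn.2015-11.local.example:ovirt-vol1/tpg1/luns create /backstores/block/ovirt-vol1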