Hello,
On Wed, 20 Aug 2014 15:39:11 +0100 Hugo Mills wrote:
We have a ceph system here, and we're seeing performance regularly
descend into unusability for periods of minutes at a time (or longer).
This appears to be triggered by writing large numbers of small files.
Specifications:
I have noticed that when I make a request over HTTPS, the response comes back
as plain HTTP on port 443... Where is this happening, do you have any idea?
On Wed, Aug 20, 2014 at 1:30 PM, Marco Garcês ma...@garces.cc wrote:
swift --insecure -V 1 -A https://gateway.bcitestes.local/auth -U
Hi Hugo,
On 20 Aug 2014, at 17:54, Hugo Mills h.r.mi...@reading.ac.uk wrote:
What are you using for OSD journals?
On each machine, the three OSD journals live on the same ext4
filesystem on an SSD, which is also the root filesystem of the
machine.
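For reference, this layout is what the osd journal option controls in
ceph.conf. A minimal sketch, assuming a hypothetical mount point /ssd for the
shared SSD:

    [osd]
    # hypothetical: one journal file per OSD on the shared root-disk SSD
    osd journal = /ssd/journals/osd.$id/journal
    osd journal size = 5120    # MB

Colocating three journals (plus the root filesystem) on one SSD means all
three share that device's write bandwidth, which is worth keeping in mind
given the stalls described above.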
Also check the CPU usage for the mons
Hi,
You only have one OSD? I’ve seen similar strange things in test pools having
only one OSD — and I kinda explained it by assuming that OSDs need peers (other
OSDs sharing the same PG) to behave correctly. Install a second OSD and see how
it goes...
Cheers, Dan
On 21 Aug 2014, at 02:59,
Just to fill in some of the gaps from yesterday's mail:
On Wed, Aug 20, 2014 at 04:54:28PM +0100, Hugo Mills wrote:
Some questions below I can't answer immediately, but I'll spend
tomorrow morning irritating people by triggering these events (I think
I have a reproducer -- unpacking a
On Thu, Aug 21, 2014 at 07:40:45AM +0000, Dan Van Der Ster wrote:
On 20 Aug 2014, at 17:54, Hugo Mills h.r.mi...@reading.ac.uk wrote:
Does your hardware provide enough IOPS for what your users need?
(e.g. what is the op/s from ceph -w)
Not really an answer to your question, but: Before
Hi Hugo,
On 21 Aug 2014, at 14:17, Hugo Mills h.r.mi...@reading.ac.uk wrote:
Not sure what you mean about colocated journal/OSD. The journals
aren't on the same device as the OSDs. However, all three journals on
each machine are on the same SSD.
embarrassed I obviously didn’t drink
Hi,
I'm trying to start Qemu on top of RBD. In documentation[1] there is a
big warning:
Important
If you set rbd_cache=true, you must set cache=writeback or risk data
loss. Without cache=writeback, QEMU will not send flush requests to
librbd. If QEMU exits uncleanly in this
Sorry for the missing subject.
On 08/21/2014 03:09 PM, Paweł Sadowski wrote:
Hi,
I'm trying to start Qemu on top of RBD. In documentation[1] there is a
big warning:
Important
If you set rbd_cache=true, you must set cache=writeback or risk data
loss. Without cache=writeback, QEMU
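For illustration, a QEMU invocation matching that warning might look like the
following (a sketch; the pool and image names rbd/vm-disk are hypothetical):

    qemu-system-x86_64 -m 1024 \
        -drive format=raw,file=rbd:rbd/vm-disk,cache=writeback

With rbd_cache = true in the [client] section of ceph.conf, cache=writeback
is what makes QEMU send the flush requests that librbd needs to keep the data
safe.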
I understand the concept of Ceph being able to recover from the failure of an
OSD (presumably with a single OSD being on a single disk), but I'm wondering
what the scenario is if an OSD server node containing multiple disks should
fail. Presuming you have a server containing 8-10 disks,
Ceph uses CRUSH (http://ceph.com/docs/master/rados/operations/crush-map/) to
determine object placement. The default generated CRUSH maps are sane, in that
they will put the replicas of each placement group into separate failure domains. You
do not need to worry about this simple failure case, but
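For example, the stock replicated rule that enforces this separation looks
roughly like this (a sketch; names such as replicated_ruleset and default
vary by cluster):

    rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host    # one replica per host
        step emit
    }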
Hi,
On a freshly created 4 node cluster I'm struggling to get the 4th node
to create correctly. ceph-deploy is unable to create the OSDs on it,
and when I log in to the node and run `ceph -s` manually (after
copying the client.admin keyring) with debug parameters, it ends up hanging
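For anyone following along, the sort of invocation meant by "with debug
parameters" is something like this sketch (the debug levels are arbitrary):

    ceph -s --debug-ms=1 --debug-monc=10

A hang at this point usually means the client cannot reach, or cannot
authenticate to, any monitor.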
I can upload files to RadosGW with s3cmd and with the DragonDisk software.
The script can list all buckets and all files in a bucket, but cannot
upload from Python via S3.
###
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__author__ = 'Administrator'
import fnmatch
import os, sys
import boto
import boto.s3.connection  # assumed completion of the truncated import; provides OrdinaryCallingFormat
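A minimal sketch of the listing part that reportedly works, assuming a
hypothetical radosgw host and credentials:

    conn = boto.connect_s3(
        aws_access_key_id='ACCESS_KEY',         # hypothetical
        aws_secret_access_key='SECRET_KEY',     # hypothetical
        host='gateway.example.com',             # hypothetical radosgw host
        is_secure=False,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )
    for bucket in conn.get_all_buckets():
        print bucket.name                       # Python 2, as in the original script

OrdinaryCallingFormat matters here: radosgw installs of this era are
typically not set up for virtual-hosted bucket names.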
Hi,
I have 2 PG in active+remapped state.
ceph health detail
HEALTH_WARN 2 pgs stuck unclean; recovery 24/348041229 degraded (0.000%)
pg 3.1a07 is stuck unclean for 29239.046024, current state
active+remapped, last acting [167,80,145]
pg 3.154a is stuck unclean for 29239.039777, current state
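A reasonable first step with stuck PGs like these is to query one of them
directly and look at the recovery state section, using a pgid from the
health output:

    ceph pg 3.1a07 query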
I have 3 storage servers, each with 30 OSDs. Each OSD has a journal that is a
partition on a virtual drive that is a RAID0 of 6 SSDs. I brought up a 3 OSD (1
per storage server) cluster to get Ceph running and figure out configuration etc.
From: Dan Van Der Ster [mailto:daniel.vanders...@cern.ch]
I am working with Cinder Multi Backends on an Icehouse installation and have
added another backend (Quobyte) to a previously running Cinder/Ceph
installation.
I can now create Quobyte volumes, but can no longer create any Ceph volumes. The
cinder-scheduler log gets an incorrect number for the free size
Yeah, that's fairly bizarre. Have you turned up the monitor logs and
seen what they're doing? Have you checked that the nodes otherwise
have the same configuration (firewall rules, client key permissions,
installed version of Ceph...)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
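A quick way to compare those basics across nodes is a loop like this sketch
(hostnames are hypothetical):

    for h in node1 node2 node3 node4; do
        ssh "$h" 'ceph --version; iptables -S; ls -l /etc/ceph/*.keyring'
    done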
When I use DragonDisk and deselect the "Expect: 100-continue header" option,
the file uploads successfully. When that option is selected, the upload hangs.
Maybe the Python script cannot upload files because of the 100-continue? My
radosgw Apache2 does not use 100-continue.
If my guess is true, how do I disable this in
My radosgw has 100-continue disabled
[global]
fsid = 075f1aae-48de-412e-b024-b0f014dbc8cf
mon_initial_members = ceph01-vm, ceph02-vm, ceph04-vm
mon_host = 192.168.123.251,192.168.123.252,192.168.123.250
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
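For what it's worth, the gateway-side knob for 100-continue is the
rgw print continue option. A sketch, assuming the conventional
client.radosgw.gateway section name:

    [client.radosgw.gateway]
    # set to false for a stock Apache that lacks the 100-continue patch
    rgw print continue = false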
Are the OSD processes still alive? What's the osdmap output of ceph
-w (which was not in the output you pasted)?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Thu, Aug 21, 2014 at 7:11 AM, Bruce McFarland
bruce.mcfarl...@taec.toshiba.com wrote:
I have 3 storage servers
On Thu, Aug 21, 2014 at 8:29 AM, Jens-Christian Fischer
jens-christian.fisc...@switch.ch wrote:
I am working with Cinder Multi Backends on an Icehouse installation and have
added another backend (Quobyte) to a previously running Cinder/Ceph
installation.
I can now create Quobyte volumes,
Yes, all of the ceph-osd processes are up and running. I performed a ceph-mon
restart to see if that might trigger the osdmap update, but there is no INFO
msg from the osdmap or the pgmap that I expect to see when the OSDs are started.
All of the OSDs and their hosts appear in the CRUSH map and in
There was a good discussion of this a month ago:
https://www.mail-archive.com/ceph-users%40lists.ceph.com/msg11483.html
That'll give you some things you can try, and information on how to undo it
if it does cause problems.
You can disable the warning by adding this to the [mon] section of
The default rules are sane for small clusters with few failure domains.
Anyone running anything larger than a single rack should customize the rules.
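As a concrete example of such a customization, assuming the CRUSH map already
defines rack buckets, moving the failure domain from hosts to racks is a
one-line change to the rule's chooseleaf step:

    step chooseleaf firstn 0 type rack    # was: type host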
It's a good idea to figure this out early. Changes to your CRUSH rules can
result in a large percentage of data moving around, which will make your
cluster