Hi,
We have set up a ceph cluster, and while adding it as primary storage in
CloudStack I am getting the below error on the hypervisor server. The
error says the hypervisor server timed out while connecting to the ceph
monitor.
We disabled the firewall and made sure the ports are open. This is the final st
Hi,
I have checked and confirmed that the monitor daemon is running and that the
socket file /var/run/ceph/ceph-mon.mon1.asok has been created, but the
server's messages log is still showing the error.
Mar 22 00:47:38 mon1 ceph-create-keys: admin_socket: exception getting
command descriptions: [Errno 2] No such file or directory
Thanks Brad
--
Deepak
> On Mar 21, 2017, at 9:31 PM, Brad Hubbard wrote:
>
>> On Wed, Mar 22, 2017 at 10:55 AM, Deepak Naidu wrote:
>> Do we know which version of the ceph client has a fix for this bug? Bug:
>> http://tracker.ceph.com/issues/17191
>>
>>
>>
>> I have ceph-common-10.2.6-0 ( on C
On Wed, Mar 22, 2017 at 10:55 AM, Deepak Naidu wrote:
> Do we know which version of the ceph client has a fix for this bug? Bug:
> http://tracker.ceph.com/issues/17191
>
>
>
> I have ceph-common-10.2.6-0 (on CentOS 7.3.1611) & ceph-fs-common-10.2.6-1
> (on Ubuntu 14.04.5)
ceph-client is the repository
Do we know which version of the ceph client has a fix for this bug? Bug:
http://tracker.ceph.com/issues/17191
I have ceph-common-10.2.6-0 (on CentOS 7.3.1611) & ceph-fs-common-10.2.6-1
(on Ubuntu 14.04.5)
--
Deepak
---
Based solely on the information given, the only rpms with this specific commit
in them would be here:
https://shaman.ceph.com/builds/ceph/wip-prune-past-intervals-kraken/
(specifically
https://4.chacra.ceph.com/r/ceph/wip-prune-past-intervals-kraken/8263140fe539f9c3241c1c0f6ee9cfadde9178c0/centos/7/f
> I am sure I remember having to reduce min_size to 1 temporarily in the past
> to allow recovery from having two drives irrecoverably die at the same time
> in one of my clusters.
What was the situation in which you had to do that?
Thanks in advance for sharing your experience.
Regards,
__
Hi,
Thank you for providing me this level of detail.
I ended up just failing the drive since it is still under support and we had in
fact gotten emails about the health of this drive in the past.
I will however use this in the future if we have an issue with a pg and it is
the first time we h
I’m fairly sure I saw it as recently as Hammer, definitely Firefly. YMMV.
> On Mar 21, 2017, at 4:09 PM, Gregory Farnum wrote:
>
> You shouldn't need to set min_size to 1 in order to heal any more. That was
> the case a long time ago but it's been several major LTS releases now. :)
> So: just
Hi,
I'm installing Rados Gateway, using Jewel 10.2.5, and can't seem to find the
correct documentation.
I used ceph-deploy to start the gateway, but can't seem to restart the process
correctly.
Can someone point me to the correct steps?
Also, how do I start my Rados Gateway back up?
This is what I
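For what it's worth, on a systemd-based Jewel install the gateway created by
ceph-deploy usually runs as a ceph-radosgw@ unit; a minimal sketch, assuming the
default instance name rgw.<short hostname>:
  systemctl status ceph-radosgw@rgw.$(hostname -s)    # check whether it is running
  systemctl restart ceph-radosgw@rgw.$(hostname -s)   # restart the gateway process
  systemctl enable ceph-radosgw@rgw.$(hostname -s)    # have it come back after a reboot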
You shouldn't need to set min_size to 1 in order to heal any more. That was
the case a long time ago but it's been several major LTS releases now. :)
So: just don't ever set min_size to 1.
-Greg
On Tue, Mar 21, 2017 at 6:04 PM Anthony D'Atri wrote:
> >> a min_size of 1 is dangerous though because
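As a concrete illustration of checking and raising min_size (the pool name "rbd"
here is just an example):
  ceph osd pool get rbd size        # number of replicas
  ceph osd pool get rbd min_size    # replicas required before I/O is allowed
  ceph osd pool set rbd min_size 2  # the generally recommended floor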
I wanted to share a recent experience in which a few RBD volumes,
formatted as XFS and exported via the Ubuntu NFS kernel server, performed
poorly and even generated "out of space" warnings on a nearly empty
filesystem. I tried a variety of hacks and fixes to no effect, until
things started magicall
Deploying or removing OSDs in parallel can certainly save elapsed time and
avoid moving data more than once. There are certain pitfalls, though, and the
strategy needs careful planning.
- Deploying a new OSD at full weight means a lot of write operations. Running
multiple whole-OSD backfills
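One way to soften the full-weight problem, sketched with a made-up OSD id and
example weights, is to bring the new OSD in at a low CRUSH weight and step it up
while backfill is throttled:
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  ceph osd crush reweight osd.12 0.2   # start the new OSD at a fraction of its target weight
  ceph osd crush reweight osd.12 0.6   # step up once backfill settles
  ceph osd crush reweight osd.12 1.0   # final weight for this example drive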
>> a min_size of 1 is dangerous though because it means you are 1 hard disk
>> failure away from losing the objects within that placement group entirely. a
>> min_size of 2 is generally considered the minimum you want but many people
>> ignore that advice, some wish they hadn't.
>
> I admit I
Greetings,
I have the below two CephFS "volumes/filesystems" created on my ceph cluster. Yes,
I used the "enable_multiple" flag to enable the multiple-filesystems feature. My
question:
1) How do I specify the fs name, i.e. dataX or data1, during a CephFS mount,
using either the kernel mount or the ceph-fuse mount?
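For question 1), one approach is to select the filesystem by name at mount time
(a sketch; the monitor address, mount point and secret file below are made up,
and the kernel option needs a reasonably recent kernel):
  # kernel client: choose the filesystem with the mds_namespace option
  mount -t ceph 192.168.1.10:6789:/ /mnt/data1 \
      -o name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=data1
  # ceph-fuse: the equivalent client option
  ceph-fuse /mnt/data1 --client_mds_namespace=data1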
I came across an inconsistent pg in our 4+2 EC storage pool (ceph
10.2.5). Since "ceph pg repair" wasn't able to correct it, I followed
the general outline given in this thread
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-August/003965.html
# zgrep -Hn ERR /var/log/ceph/ceph-osd.3
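For anyone following the same outline, the usual starting point before digging
through the OSD logs looks roughly like this (the pg id 1.2f is a placeholder):
  ceph health detail | grep inconsistent                  # find the affected pg id
  rados list-inconsistent-obj 1.2f --format=json-pretty   # which object/shard is bad (Jewel and later)
  ceph pg repair 1.2f                                     # the repair that was already attempted here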
The exclusive-lock feature does, by default, automatically transition
the lock between clients that are attempting to use the image. Only
one client will be able to issue writes to the image at a time. If you
ran "dd" against both mappings concurrently, I'd expect you'd see a
vastly decreased throu
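A small sketch of inspecting the feature and the current image users (the
pool/image names are examples):
  rbd info rbd/test-img      # "features:" should list exclusive-lock
  rbd status rbd/test-img    # watchers currently attached to the image
  rbd lock ls rbd/test-img   # advisory locks, if any were taken manually
  # only if concurrent writers are really wanted (dependent features such as
  # object-map must be disabled first):
  rbd feature disable rbd/test-img exclusive-lock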
On 03/17/2017 11:47 AM, Casey Bodley wrote:
On 03/16/2017 03:47 PM, Graham Allan wrote:
This might be a dumb question, but I'm not at all sure what the
"global quotas" in the radosgw region map actually do.
It is like a default quota which is applied to all users or buckets,
without having to
Hi,
On Tue, Mar 21, 2017 at 11:59 AM, Adam Carheden wrote:
> Let's see if I got this. 4 host cluster. size=3, min_size=2. 2 hosts
> fail. Are all of the following accurate?
>
> a. An rbd is split into lots of objects, parts of which will probably
> exist on all 4 hosts.
>
Correct.
>
> b. Some
If it took 7 hours for one drive you have probably already done this (or the
defaults are for low-impact recovery), but before doing anything you want to
be sure your OSD settings for max backfills, max recovery active, and
recovery sleep (perhaps others?) are set such that recovery and
backfilling don't overwhelm pr
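Concretely, those knobs can be checked and adjusted at runtime roughly like this
(osd.3 and the values are only illustrative, not recommendations):
  ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok config show | grep -E 'backfill|recovery'
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-sleep 0.1'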
Generally speaking, you are correct. Adding more OSDs at once is more
efficient than adding fewer at a time.
That being said, do so carefully. We typically add OSDs to our clusters
either 32 or 64 at once, and we have had issues on occasion with bad
drives. It's common for us to have a drive or tw
Let's see if I got this. 4 host cluster. size=3, min_size=2. 2 hosts
fail. Are all of the following accurate?
a. An rbd is split into lots of objects, parts of which will probably
exist on all 4 hosts.
b. Some objects will have 2 of their 3 replicas on 2 of the offline OSDs.
c. Reads can continu
Hi,
Just a quick question about adding OSDs, since most of the docs I can
find talk about adding ONE OSD, and I'd like to add four per server on
my three-node cluster.
This morning I tried the careful approach, and added one OSD to server1.
It all went fine, everything rebuilt and I have a H
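One pattern for adding several OSDs in one go is to hold off rebalancing until
they are all in, so the data only moves once; a sketch, assuming the window of
reduced rebalancing is acceptable:
  ceph osd set norebalance    # keep existing PGs where they are while OSDs are added
  ceph osd set nobackfill     # optionally pause backfill as well
  # ... create and start the four new OSDs here ...
  ceph osd unset nobackfill
  ceph osd unset norebalance  # let the cluster move data once, to its final layout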
Hi Vincent,
There is no buffering until the object reaches 8MB. When the object is written,
it has a given size. RADOS just splits the object into K chunks; padding occurs
if the object size is not a multiple of K.
See also:
http://docs.ceph.com/docs/master/dev/osd_internals/erasure_coding/devel
On 21/03/17 17:48, Wes Dillingham wrote:
> a min_size of 1 is dangerous though because it means you are 1 hard disk
> failure away from losing the objects within that placement group entirely. a
> min_size of 2 is generally considered the minimum you want but many people
> ignore that advice, so
Generally this means the monitor daemon is not running. Is the monitor
daemon running? The monitor daemon creates the admin socket in
/var/run/ceph/$socket
Elaborate on how you are attempting to deploy ceph.
On Tue, Mar 21, 2017 at 9:01 AM, Vince wrote:
> Hi,
>
> I am getting the below error in
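A minimal way to check both the daemon and its socket (the mon id "mon1" is taken
from the log lines quoted above):
  systemctl status ceph-mon@mon1    # is the monitor process running at all?
  ls -l /var/run/ceph/              # the .asok files the local daemons have created
  ceph --admin-daemon /var/run/ceph/ceph-mon.mon1.asok mon_status   # query it directly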
If you had set min_size to 1 you would not have seen the writes pause. A
min_size of 1 is dangerous, though, because it means you are one hard disk
failure away from losing the objects within that placement group entirely.
A min_size of 2 is generally considered the minimum you want, but many
people ign
Hello Ceph team,
Linux Fest NorthWest's CFP is out. It is a bit too far for me to do
it as a day trip from Boston, but it would be nice if someone on the
Pacific coast feels like giving a technical overview / architecture
session.
https://www.linuxfestnorthwest.org/2017/news/2017-call-presentati
Hello, I have a small Ceph cluster installed, and I followed the
manual installation instructions since I do not have internet access.
I have configured the system with two network interfaces, one for the
client network and one for the cluster network.
The problem is that when the system begins
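For reference, the public/cluster split normally lives in ceph.conf, and every
client and daemon must be able to reach the monitors over the public network; a
quick check, with made-up subnets in the expected output:
  grep -E '(public|cluster) network' /etc/ceph/ceph.conf
  # expected something like:
  #   public network  = 192.168.1.0/24   (client traffic, monitors listen here)
  #   cluster network = 192.168.2.0/24   (OSD replication and heartbeats)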
When we use a replicated pool of size 3, for example, each piece of data, a block
of 4MB, is written to one PG which is distributed across 3 hosts (by default). The
OSD holding the primary copy will replicate the block to the OSDs holding the
secondary and tertiary copies of the PG.
With erasure coding, let's take a RAID5-like schema with k=2 and m=
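To make the k/m split concrete, an erasure-code profile and pool can be sketched
like this (the profile and pool names, pg count, and failure domain are examples
for a pre-Luminous release):
  ceph osd erasure-code-profile set myprofile k=2 m=1 ruleset-failure-domain=host
  ceph osd erasure-code-profile get myprofile
  ceph osd pool create ecpool 128 128 erasure myprofile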
Hey cephers,
For those of you that are interested in presenting, sponsoring, or
attending Cephalocon, all of those options are now available on the
Ceph site.
http://ceph.com/cephalocon2017/
If you have any questions, comments, or difficulties, feel free to let
me know. Thanks!
--
Best Regard
Hi Logan,
On 03/21/2017 03:27 PM, Logan Kuhn wrote:
> I like the idea
>
> Being able to play around with different configuration options and using this
> tool as a sanity checker or showing what will change as well as whether or
> not the changes could cause health warn or health err.
The tool
Thanks everyone for the replies. Very informative. However, should I
have expected writes to pause if I'd had min_size set to 1 instead of 2?
And yes, I was under the false impression that my rbd device was a
single object. That explains what all those other things are on a test
cluster where I o
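In case it helps with the poking around, this is roughly how to see the objects a
format-2 rbd image is actually made of (pool and image names are examples):
  rbd info rbd/myimage              # note block_name_prefix and the object size (order)
  rados -p rbd ls | grep rbd_data.  # the many small objects backing the image(s)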
When you replace a failed OSD, it has to recover all of its PGs, so it
is pretty busy. Is it possible to tell the OSD not to become primary for
any of its already-synchronized PGs until every PG (of the OSD) has
recovered? It should accelerate the rebuild process because the OSD won't
have to se
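The closest existing knob is primary affinity; a sketch, with a made-up OSD id,
of steering primary duty away from a recovering OSD and restoring it afterwards
(older releases may need 'mon osd allow primary affinity = true'):
  ceph osd primary-affinity osd.7 0   # avoid making osd.7 primary where possible
  # ... wait for recovery/backfill to finish ...
  ceph osd primary-affinity osd.7 1   # back to normal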
I like the idea
Being able to play around with different configuration options and use this
tool as a sanity checker, showing what will change as well as whether or not
the changes could cause HEALTH_WARN or HEALTH_ERR.
For example, if I were to change the replication level of a pool, how
Hello,
I made some changes to the below files in the ceph kraken v11.2.0 source code,
as per this article:
https://github.com/ceph/ceph-ci/commit/wip-prune-past-intervals-kraken
src/osd/PG.cc
src/osd/PG.h
Is there any way to find out which rpm is affected by these two files? I
believe it should be ce
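Since PG.cc and PG.h are compiled into the OSD daemon, one way to check which
package a rebuilt binary would replace (a sketch, assuming a standard RPM
install path):
  rpm -qf /usr/bin/ceph-osd   # typically reports the ceph-osd package
  rpm -ql ceph-osd | head     # other files shipped in that package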
Hey cephers,
We have now finalized the details for Ceph Day Warsaw (see
http://ceph.com/cephdays) and as a result, we need speakers!
If you would be interested in sharing some of your experiences or work
around Ceph please let me know as soon as possible. Thanks.
--
Best Regards,
Patrick McGa
Hi,
I am getting the below error in the messages log after setting up the ceph monitor.
===
Mar 21 08:48:23 mon1 ceph-create-keys: admin_socket: exception getting
command descriptions: [Errno 2] No such file or directory
Mar 21 08:48:23 mon1 ceph-create-keys: INFO:ceph-create-keys:ceph-mon
admin socke
Hello all,
A few weeks ago Loïc Dachary presented his work on python-crush to the
ceph-devel list, but I don't think it has been presented here yet. In a few words,
python-crush is a new Python 2 and 3 library / API for the CRUSH algorithm.
It also provides a CLI executable with a few built-in tools rela
Hi,
There's something I don't understand about the exclusive-lock feature.
I created an image:
$ ssh host-3
Container Linux by CoreOS stable (1298.6.0)
Update Strategy: No Reboots
host-3 ~ # uname -a
Linux host-3 4.9.9-coreos-r1 #1 SMP Tue Mar 14 21:09:42 UTC 2017 x86_64
Intel(R) Xeon(R) CPU E5
Hello,
we have been patching our ceph cluster from 0.94.7 to 0.94.10. We were updating
one node at a time, and after each OSD node was rebooted we waited for
the cluster health status to be OK.
In the docs we have "stale - The placement group status has not been updated by a
ceph-osd, in
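For reference, a quick way to see which PGs, if any, are stuck in that state
while a node is down (a sketch):
  ceph pg dump_stuck stale     # PGs whose OSDs have not reported in recently
  ceph pg dump_stuck unclean   # PGs still recovering or backfilling
  ceph health detail           # per-PG explanation of any warning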
Hi Wido;
After 30 minutes osd id 3 also crashed with a segmentation fault; I uploaded the
logs again to the same location as ceph.log.wido.20170321-3.tgz. So now all
OSD daemons on that server have crashed.
Thanks
Özhan
On Tue, Mar 21, 2017 at 10:57 AM, Özhan Rüzgar Karaman <
oruzgarkara...@gmail.
time osd id 3 started and operated successfully, but osd id 2 failed
again with the same segmentation fault.
I have uploaded new logs to the same destination
as ceph.log.wido.20170321-2.tgz, and its link is below again.
https://drive.google.com/drive/folders/0B_hD9LJqrkd7NmtJOW5YUnh6UE0?usp=
sharing