[ceph-users] 3 monitor down and recovery

2017-04-05 Thread 云平台事业部
Hello,
I am simulating recovery for the case when all 3 of our monitors are down in our
test environment. I referred to the Ceph mon troubleshooting document, but
encountered the problem "OSD has the store locked".
I stopped all 3 mons and planned to get the monmap from the OSDs. To get the
monmap, the command is as follows:
for osd in /var/lib/ceph/osd/ceph-*; do
  ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path /tmp/mon-store
done
The output:
OSD has the store locked
OSD has the store locked
OSD has the store locked
OSD has the store locked
OSD has the store locked
OSD has the store locked
OSD has the store locked
OSD has the store locked

Ceph version is 10.2.5, OS is RHEL 7.2; there are 3 hosts running both mon and
OSD daemons, with 4 OSDs on each host.

I searched for the problem on the internet and found that some say the OSD must
be stopped first, but it is not practical for us to stop all of the OSDs just
to get the monmap.
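For what it's worth, since ceph-objectstore-tool needs exclusive access to each OSD's store, one way to avoid stopping everything at once is to stop a single OSD, harvest its copy of the mon data, restart it, and move on to the next. This is only a sketch: the systemd unit names and paths are assumptions for a typical RHEL 7 / Jewel install, and with all three mons already down client I/O is blocked anyway, so briefly cycling OSDs one at a time should not make things worse.

```shell
# Sketch: stop one OSD at a time so its store is never locked by a running daemon.
mkdir -p /tmp/mon-store
for osd in /var/lib/ceph/osd/ceph-*; do
    id="${osd##*-}"                      # e.g. /var/lib/ceph/osd/ceph-3 -> 3
    systemctl stop "ceph-osd@${id}"
    ceph-objectstore-tool --data-path "$osd" \
        --op update-mon-db --mon-store-path /tmp/mon-store
    systemctl start "ceph-osd@${id}"
done
```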
Please give me some suggestions, thank you!


Best regards,
Taotao He
Cloud Storage Engineer
Ping An Technology (Shenzhen) Co., Ltd., Cloud Business Division




The information in this email is confidential and may be legally privileged. If 
you have received this email in error or are not the intended recipient, please 
immediately notify the sender and delete this message from your computer. Any 
use, distribution, or copying of this email other than by the intended 
recipient is strictly prohibited. All messages sent to and from us may be 
monitored to ensure compliance with internal policies and to protect our 
business.
Emails are not secure and cannot be guaranteed to be error free as they can be 
intercepted, amended, lost or destroyed, or contain viruses. Anyone who 
communicates with us by email is taken to accept these risks.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd iscsi gateway question

2017-04-05 Thread Adrian Saul

I am not sure if there is a hard and fast rule you are after, but pretty much 
anything that would cause ceph transactions to be blocked (flapping OSD, 
network loss, hung host) has the potential to block RBD IO which would cause 
your iSCSI LUNs to become unresponsive for that period.

For the most part though, once that condition clears things keep working, so
it's not like a hang where you need to reboot to clear it.  Some situations we
have hit with our setup:


-  Failed OSDs (dead disks) – no issues

-  Cluster rebalancing – ok if throttled back to keep service times down

-  Network packet loss (bad fibre) – painful, broken communication 
everywhere, caused a krbd hang needing a reboot

-  RBD snapshot deletion – disk latency through the roof, cluster
unresponsive for minutes at a time; won't do that again.



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Brady 
Deetz
Sent: Thursday, 6 April 2017 12:58 PM
To: ceph-users
Subject: [ceph-users] rbd iscsi gateway question

I apologize if this is a duplicate of something recent, but I'm not finding 
much. Does the issue still exist where dropping an OSD results in a LUN's I/O 
hanging?

I'm attempting to determine if I have to move off of VMWare in order to safely 
use Ceph as my VM storage.
Confidentiality: This email and any attachments are confidential and may be 
subject to copyright, legal or some other professional privilege. They are 
intended solely for the attention and use of the named addressee(s). They may 
only be copied, distributed or disclosed with the consent of the copyright 
owner. If you have received this email by mistake or by breach of the 
confidentiality clause, please notify the sender immediately by return email 
and delete or destroy all copies of the email. Any confidentiality, privilege 
or copyright is not waived or lost because this email has been sent to you by 
mistake.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rbd iscsi gateway question

2017-04-05 Thread Brady Deetz
I apologize if this is a duplicate of something recent, but I'm not finding
much. Does the issue still exist where dropping an OSD results in a LUN's
I/O hanging?

I'm attempting to determine if I have to move off of VMWare in order to
safely use Ceph as my VM storage.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] performance issues

2017-04-05 Thread Christian Balzer

Hello,

first and foremost, do yourself and everybody else a favor by thoroughly
searching the net and thus the ML archives.
This kind of question has come up and been answered countless times.


On Thu, 6 Apr 2017 09:59:10 +0800 PYH wrote:

> what I meant is, when the total IOPS reaches 3000+, the whole cluster 
> gets very slow. so any idea? thanks.
> 
Gee whiz, that tends to happen when you push the borders of your
capacity/design.

> On 2017/4/6 9:51, PYH wrote:
> > Hi,
> > 
> > we have 21 hosts, each has 12 disks (4T sata), no SSD as journal or 
> > cache tier.
That's your problem right there, pure HDD setups will not produce good
IOPS.
 
> > so the total OSD number is 21x12=252.
> > there are three separate hosts for monitor nodes.
> > network is 10Gbps. replicas are 3.
> > 

252 / 3 (replica) / 2 (journal on disk) = 42.
That's ignoring the journaling by the FS, the fact that writing RBD
objects isn't a sequential operation, etc.
If we (optimistically) assume 100 IOPS per HDD, that would give us
4200 IOPS in your case.

Factoring in everything else omitted up there, 3000 IOPS is pretty much
what I would expect.
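The back-of-envelope math above can be written out explicitly; the 100 IOPS per HDD figure is the usual rule-of-thumb assumption for a 7200rpm SATA disk, not a measurement:

```shell
# Rough write-IOPS ceiling for a pure-HDD cluster (numbers from this thread).
osds=252
replica=3
journal_penalty=2     # FileStore journal co-located on the data disk halves write IOPS
iops_per_hdd=100      # rule-of-thumb assumption, not a measurement

spindles=$(( osds / replica / journal_penalty ))
echo "effective spindles: $spindles"
echo "rough write IOPS ceiling: $(( spindles * iops_per_hdd ))"
```

Real-world overheads (FS journaling, non-sequential RBD object writes) push the achievable number below that ceiling, which is consistent with the ~3000 IOPS observed.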

> > under this setup, we can get only 3000+ IOPS for random writes for the whole 
> > cluster. Test method such as,
> > 
> > $ fio -name iops -rw=randwrite -bs=4k -runtime=60 -iodepth 64 -numjobs=2 
> > -filename /dev/rbd0 -ioengine libaio -direct=1
> >
You're testing the kernel client (which may or may not be worse than
librbd user space) here, and a single-client test like this (numjobs won't
help/change things) is also largely affected by latency/RTT issues.
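To compare against krbd, fio can also drive librbd directly via its rbd ioengine; a sketch, where the pool, image and client names are placeholders for your environment (fio must be built with librbd support):

```shell
fio --name=iops --rw=randwrite --bs=4k --runtime=60 --iodepth=64 \
    --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
    --direct=1
```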

Christian
 
> > it's much lower than I expected. Do you have any suggestions?
> > 
> > thanks.
> 


-- 
Christian Balzer            Network/Systems Engineer
ch...@gol.com               Global OnLine Japan/Rakuten Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] performance issues

2017-04-05 Thread PYH
what I meant is, when the total IOPS reaches 3000+, the whole cluster 
gets very slow. Any ideas? thanks.


On 2017/4/6 9:51, PYH wrote:

Hi,

we have 21 hosts, each has 12 disks (4T sata), no SSD as journal or 
cache tier.

so the total OSD number is 21x12=252.
there are three separate hosts for monitor nodes.
network is 10Gbps. replicas are 3.

under this setup, we can get only 3000+ IOPS for random writes for the whole 
cluster. Test method such as,


$ fio -name iops -rw=randwrite -bs=4k -runtime=60 -iodepth 64 -numjobs=2 
-filename /dev/rbd0 -ioengine libaio -direct=1


it's much lower than I expected. Do you have any suggestions?

thanks.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] performance issues

2017-04-05 Thread PYH

Hi,

we have 21 hosts, each has 12 disks (4T sata), no SSD as journal or 
cache tier.

so the total OSD number is 21x12=252.
there are three separate hosts for monitor nodes.
network is 10Gbps. replicas are 3.

under this setup, we can get only 3000+ IOPS for random writes for the whole 
cluster. Test method such as,


$ fio -name iops -rw=randwrite -bs=4k -runtime=60 -iodepth 64 -numjobs=2 
-filename /dev/rbd0 -ioengine libaio -direct=1


it's much lower than I expected. Do you have any suggestions?

thanks.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] clock skew

2017-04-05 Thread Dan Mick

> Just to follow-up on this: we have yet experienced a clock skew since we
> starting using chrony. Just three days ago, I know, bit still...

did you mean "we have not yet..."?

> Perhaps you should try it too, and report if it (seems to) work better
> for you as well.
> 
> But again, just three days, could be I cheer too early.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph df space for rgw.buckets.data shows used even when files are deleted

2017-04-05 Thread Deepak Naidu
Thanks Ben.

Is there a tuning param I can use to speed up the process?

"rgw_gc_max_objs": "32",
"rgw_gc_obj_min_wait": "7200",
"rgw_gc_processor_max_time": "3600",
"rgw_gc_processor_period": "3600",
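For reference, those defaults could in principle be tightened in ceph.conf so that deleted space is reclaimed sooner, at the cost of more background I/O. The section name and values below are illustrative assumptions, not recommendations:

```ini
; use your actual rgw instance section name here
[client.radosgw.gateway]
rgw gc max objs = 97            ; more gc shards (odd/prime values are often suggested)
rgw gc obj min wait = 300       ; seconds before a deleted object is eligible for gc
rgw gc processor period = 600   ; run the gc processor more often
rgw gc processor max time = 600
```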


--
Deepak



From: Ben Hines [mailto:bhi...@gmail.com]
Sent: Wednesday, April 05, 2017 2:41 PM
To: Deepak Naidu
Cc: ceph-users
Subject: Re: [ceph-users] ceph df space for rgw.buckets.data shows used even 
when files are deleted

Ceph's RadosGW uses garbage collection by default.

Try running 'radosgw-admin gc list' to list the objects to be garbage 
collected, or 'radosgw-admin gc process' to trigger them to be deleted.

-Ben

On Wed, Apr 5, 2017 at 12:15 PM, Deepak Naidu 
mailto:dna...@nvidia.com>> wrote:
Folks,

Trying to test the S3 object GW. When I try to upload files, the space is 
shown as used (that's normal behavior), but when the object is deleted it still 
shows as used (I don't understand this).  Example below.

Currently there are no files in the entire S3 bucket, but it still shows space 
used. Any insight is appreciated.

ceph version 10.2.6

NAME                      ID  USED    %USED  MAX AVAIL  OBJECTS
default.rgw.buckets.data  49  51200M  1.08   4598G      12800


--
Deepak

This email message is for the sole use of the intended recipient(s) and may 
contain confidential information.  Any unauthorized review, use, disclosure or 
distribution is prohibited.  If you are not the intended recipient, please 
contact the sender by reply email and destroy all copies of the original 
message.



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph df space for rgw.buckets.data shows used even when files are deleted

2017-04-05 Thread Ben Hines
Ceph's RadosGW uses garbage collection by default.

Try running 'radosgw-admin gc list' to list the objects to be garbage
collected, or 'radosgw-admin gc process' to trigger them to be deleted.

-Ben

On Wed, Apr 5, 2017 at 12:15 PM, Deepak Naidu  wrote:

> Folks,
>
>
>
> Trying to test the S3 object GW. When I try to upload files, the space
> is shown as used (that's normal behavior), but when the object is deleted it
> still shows as used (I don't understand this).  Below example.
>
>
>
> Currently there are no files in the entire S3 bucket, but it still shows
> space used. Any insight is appreciated.
>
>
>
> ceph version 10.2.6
>
>
>
> NAME                      ID  USED    %USED  MAX AVAIL  OBJECTS
> default.rgw.buckets.data  49  51200M  1.08   4598G      12800
>
>
>
>
>
> --
>
> Deepak
> --
> This email message is for the sole use of the intended recipient(s) and
> may contain confidential information.  Any unauthorized review, use,
> disclosure or distribution is prohibited.  If you are not the intended
> recipient, please contact the sender by reply email and destroy all copies
> of the original message.
> --
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] librbd + rbd-nbd

2017-04-05 Thread Prashant Murthy
Hi all,


I wanted to ask if anybody is using librbd (user mode lib) with rbd-nbd
(kernel module) on their Ceph clients. We're currently using krbd, but that
doesn't support some of the features (such as rbd mirroring). So, I wanted
to check if anybody has experience running with nbd + librbd on their
clusters and can provide more details.
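For anyone who wants to try it, the basic rbd-nbd round trip is just a map/unmap; a sketch with placeholder pool/image names (journaling is the image feature rbd-mirror needs that krbd can't handle):

```shell
# sketch only; names and sizes are placeholders
rbd create rbd/nbd-test --size 1024 \
    --image-feature layering,exclusive-lock,journaling
dev=$(rbd-nbd map rbd/nbd-test)   # prints the device node, e.g. /dev/nbd0
mkfs.xfs "$dev"
rbd-nbd unmap "$dev"
```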

Prashant

-- 
Prashant Murthy
Sr Director, Software Engineering | Salesforce
Mobile: 919-961-3041


--
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph df space for rgw.buckets.data shows used even when files are deleted

2017-04-05 Thread Deepak Naidu
Folks,

Trying to test the S3 object GW. When I try to upload files, the space is 
shown as used (that's normal behavior), but when the object is deleted it still 
shows as used (I don't understand this).  Below example.

Currently there are no files in the entire S3 bucket, but it still shows space 
used. Any insight is appreciated.

ceph version 10.2.6

NAME                      ID  USED    %USED  MAX AVAIL  OBJECTS
default.rgw.buckets.data  49  51200M  1.08   4598G      12800


--
Deepak

---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Apply for an official mirror at CN

2017-04-05 Thread SJ Zhu
On Wed, Apr 5, 2017 at 10:55 PM, Patrick McGarry  wrote:
> Ok, I have added the following to ceph dns:
>
> cn    IN    CNAME    mirrors.ustc.edu.cn.

Great, thanks.
Besides, I have enabled HTTPS for https://cn.ceph.com just now.

-- 
Regards,
Shengjing Zhu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Apply for an official mirror at CN

2017-04-05 Thread Patrick McGarry
Ok, I have added the following to ceph dns:

cn    IN    CNAME    mirrors.ustc.edu.cn.

Wido, I haven't added this to the website, doc, or anywhere else for
informational purposes, but the mechanics should be live and accepting
traffic in a bit here. Let me know if you guys need anything else.
Thanks.


On Wed, Apr 5, 2017 at 6:06 AM, Shinobu Kinjo  wrote:
> Adding Patrick who might be the best person.
>
> Regards,
>
> On Wed, Apr 5, 2017 at 6:16 PM, Wido den Hollander  wrote:
>>
>>> Op 5 april 2017 om 8:14 schreef SJ Zhu :
>>>
>>>
>>> Wido, ping?
>>>
>>
>> This might take a while! Has to go through a few hops for this to get fixed.
>>
>> It's on my radar!
>>
>> Wido
>>
>>> On Sat, Apr 1, 2017 at 8:40 PM, SJ Zhu  wrote:
>>> > On Sat, Apr 1, 2017 at 8:10 PM, Wido den Hollander  wrote:
>>> >> Great! Very good to hear. We can CNAME cn.ceph.com to that location?
>>> >
>>> >
>>> > Yes, please CNAME to mirrors.ustc.edu.cn, and I will set vhost in our
>>> > nginx for the
>>> > ceph directory.
>>> >
>>> > Thanks
>>> >
>>> > --
>>> > Regards,
>>> > Shengjing Zhu
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Shengjing Zhu



-- 

Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com  ||  http://community.redhat.com
@scuttlemonkey || @ceph
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Client's read affinity

2017-04-05 Thread Jason Dillaman
Yes, it's a general solution for any read-only parent images. This
will *not* help localize reads for any portions of your image that
have already been copied-on-written from the parent image down to the
cloned image (i.e. the Cinder volume or Nova disk).
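Putting Jason's two settings together, a client-side ceph.conf fragment might look like this; the host/rack names are examples, and the crush location must name buckets that actually exist in your CRUSH map:

```ini
[client]
rbd localize parent reads = true
; where this client sits relative to the CRUSH hierarchy
crush location = root=default rack=rack1 host=compute01
```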

On Wed, Apr 5, 2017 at 10:25 AM, Alejandro Comisario
 wrote:
> Another thing that i would love to ask and clarify is, would this work
> for openstack vms that uses cinder, instead of vms that uses direct
> integration between nova and ceph ?
> We use cinder bootable volumes and normal cinder attached volumes to vms.
>
> thx
>
> On Wed, Apr 5, 2017 at 10:36 AM, Wes Dillingham
>  wrote:
>> This is a big development for us. I have not heard of this option either. I
>> am excited to play with this feature and the implications it may have in
>> improving RBD reads in our multi-datacenter RBD pools.
>>
>> Just to clarify the following options:
>> "rbd localize parent reads = true" and "crush location = foo=bar" are
>> configuration options for the client's ceph.conf and are not needed for OSD
>> hosts as their locations are already encoded in the CRUSH map.
>>
>> It looks like this is a pretty old option (
>> http://narkive.com/ZkTahBVu:5.455.67 )
>>
>> so I am assuming it is relatively tried and true? but I have never heard of
>> it before... is anyone out there using this in a production RBD environment?
>>
>>
>>
>>
>> On Tue, Apr 4, 2017 at 7:36 PM, Jason Dillaman  wrote:
>>>
>>> AFAIK, the OSDs should discover their location in the CRUSH map
>>> automatically -- therefore, this "crush location" config override
>>> would be used for librbd client configuration ("i.e. [client]
>>> section") to describe their location in the CRUSH map relative to
>>> racks, hosts, etc.
>>>
>>> On Tue, Apr 4, 2017 at 3:12 PM, Brian Andrus 
>>> wrote:
>>> > Jason, I haven't heard much about this feature.
>>> >
>>> > Will the localization have effect if the crush location configuration is
>>> > set
>>> > in the [osd] section, or does it need to apply globally for clients as
>>> > well?
>>> >
>>> > On Fri, Mar 31, 2017 at 6:38 AM, Jason Dillaman 
>>> > wrote:
>>> >>
>>> Assuming you are asking about RBD-backed VMs, it is not possible to
>>> localize all reads to the VM image. You can, however, enable
>>> >> localization of the parent image since that is a read-only data set.
>>> >> To enable that feature, set "rbd localize parent reads = true" and
>>> >> populate the "crush location = host=X rack=Y etc=Z" in your ceph.conf.
>>> >>
>>> >> On Fri, Mar 31, 2017 at 9:00 AM, Alejandro Comisario
>>> >>  wrote:
>>> >> > any experiences ?
>>> >> >
>>> >> > On Wed, Mar 29, 2017 at 2:02 PM, Alejandro Comisario
>>> >> >  wrote:
>>> >> >> Guys hi.
>>> >> >> I have a Jewel Cluster divided into two racks which is configured on
>>> >> >> the crush map.
>>> >> >> I have clients (openstack compute nodes) that are closer from one
>>> >> >> rack
>>> >> >> than to another.
>>> >> >>
>>> >> >> I would love to (if is possible) to specify in some way the clients
>>> >> >> to
>>> >> >> read first from the nodes on a specific rack then try the other one
>>> >> >> if
>>> >> >> is not possible.
>>> >> >>
>>> >> >> Is that doable ? can somebody explain me how to do it ?
>>> >> >> best.
>>> >> >>
>>> >> >> --
>>> >> >> Alejandrito
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > Alejandro Comisario
>>> >> > CTO | NUBELIU
>>> >> > E-mail: alejandro@nubeliu.com  Cell: +54 9 11 3770 1857
>>> >> > _
>>> >> > www.nubeliu.com
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Jason
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Brian Andrus | Cloud Systems Engineer | DreamHost
>>> > brian.and...@dreamhost.com | www.dreamhost.com
>>>
>>>
>>>
>>> --
>>> Jason
>>
>>
>>
>>
>> --
>> Respectfully,
>>
>> Wes Dillingham
>> wes_dilling...@harvard.edu
>> Research Computing | Infrastructure Engineer
>> Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 210
>>
>>
>>
>
>
>
> --
> Alejandrito



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Client's read affinity

2017-04-05 Thread Alejandro Comisario
Another thing that I would love to ask and clarify: would this work
for OpenStack VMs that use Cinder, instead of VMs that use direct
integration between Nova and Ceph?
We use Cinder bootable volumes and normal Cinder attached volumes for VMs.

thx

On Wed, Apr 5, 2017 at 10:36 AM, Wes Dillingham
 wrote:
> This is a big development for us. I have not heard of this option either. I
> am excited to play with this feature and the implications it may have in
> improving RBD reads in our multi-datacenter RBD pools.
>
> Just to clarify the following options:
> "rbd localize parent reads = true" and "crush location = foo=bar" are
> configuration options for the client's ceph.conf and are not needed for OSD
> hosts as their locations are already encoded in the CRUSH map.
>
> It looks like this is a pretty old option (
> http://narkive.com/ZkTahBVu:5.455.67 )
>
> so I am assuming it is relatively tried and true? but I have never heard of
> it before... is anyone out there using this in a production RBD environment?
>
>
>
>
> On Tue, Apr 4, 2017 at 7:36 PM, Jason Dillaman  wrote:
>>
>> AFAIK, the OSDs should discover their location in the CRUSH map
>> automatically -- therefore, this "crush location" config override
>> would be used for librbd client configuration ("i.e. [client]
>> section") to describe their location in the CRUSH map relative to
>> racks, hosts, etc.
>>
>> On Tue, Apr 4, 2017 at 3:12 PM, Brian Andrus 
>> wrote:
>> > Jason, I haven't heard much about this feature.
>> >
>> > Will the localization have effect if the crush location configuration is
>> > set
>> > in the [osd] section, or does it need to apply globally for clients as
>> > well?
>> >
>> > On Fri, Mar 31, 2017 at 6:38 AM, Jason Dillaman 
>> > wrote:
>> >>
>> >> Assuming you are asking about RBD-backed VMs, it is not possible to
>> >> localize all reads to the VM image. You can, however, enable
>> >> localization of the parent image since that is a read-only data set.
>> >> To enable that feature, set "rbd localize parent reads = true" and
>> >> populate the "crush location = host=X rack=Y etc=Z" in your ceph.conf.
>> >>
>> >> On Fri, Mar 31, 2017 at 9:00 AM, Alejandro Comisario
>> >>  wrote:
>> >> > any experiences ?
>> >> >
>> >> > On Wed, Mar 29, 2017 at 2:02 PM, Alejandro Comisario
>> >> >  wrote:
>> >> >> Guys hi.
>> >> >> I have a Jewel Cluster divided into two racks which is configured on
>> >> >> the crush map.
>> >> >> I have clients (openstack compute nodes) that are closer from one
>> >> >> rack
>> >> >> than to another.
>> >> >>
>> >> >> I would love to (if is possible) to specify in some way the clients
>> >> >> to
>> >> >> read first from the nodes on a specific rack then try the other one
>> >> >> if
>> >> >> is not possible.
>> >> >>
>> >> >> Is that doable ? can somebody explain me how to do it ?
>> >> >> best.
>> >> >>
>> >> >> --
>> >> >> Alejandrito
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Alejandro Comisario
>> >> > CTO | NUBELIU
>> >> > E-mail: alejandro@nubeliu.com  Cell: +54 9 11 3770 1857
>> >> > _
>> >> > www.nubeliu.com
>> >>
>> >>
>> >>
>> >> --
>> >> Jason
>> >
>> >
>> >
>> >
>> > --
>> > Brian Andrus | Cloud Systems Engineer | DreamHost
>> > brian.and...@dreamhost.com | www.dreamhost.com
>>
>>
>>
>> --
>> Jason
>
>
>
>
> --
> Respectfully,
>
> Wes Dillingham
> wes_dilling...@harvard.edu
> Research Computing | Infrastructure Engineer
> Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 210
>
>
>



-- 
Alejandrito
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CDM Today @ 12:30p EDT

2017-04-05 Thread Patrick McGarry
Hey cephers,

Just a friendly reminder that the Ceph Developer Monthly call will be
at 12:30 EDT / 16:30 UTC. We already have a few things on the docket
for discussion, but if you are doing any Ceph feature or backport
work, please add your item to the list to be discussed:

http://wiki.ceph.com/CDM_05-APR-2017

If you have any questions, please let me know. Thanks.

-- 

Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com  ||  http://community.redhat.com
@scuttlemonkey || @ceph
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw global quotas - how to set in jewel?

2017-04-05 Thread Casey Bodley
A new set of 'radosgw-admin global quota' commands were added for this, 
which we'll backport to kraken and jewel. You can view the updated 
documentation here: 
http://docs.ceph.com/docs/master/radosgw/admin/#reading-writing-global-quotas


Thanks again for pointing this out,
Casey
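From the linked documentation, usage looks roughly like the following once the backport lands. The scope and limit values here are made-up examples, and on a multisite setup the period change has to be committed for it to take effect:

```shell
radosgw-admin global quota get
radosgw-admin global quota set --quota-scope=user --max-objects=1024
radosgw-admin global quota enable --quota-scope=user
radosgw-admin period update --commit
```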


On 04/03/2017 03:23 PM, Graham Allan wrote:
Ah, thanks, I thought I was going crazy for a bit there! The global 
quota would be useful for us (now wanting to retroactively impose 
quotas on pre-existing users), but we can script a workaround instead.


Thanks,
Graham

On 03/29/2017 10:17 AM, Casey Bodley wrote:

Hi Graham, you're absolutely right. In jewel, these settings were moved
into the period, but radosgw-admin doesn't have any commands to modify
them. I opened a tracker issue for this at
http://tracker.ceph.com/issues/19409. For now, it looks like you're
stuck with the 'default quota' settings in ceph.conf.

Thanks,
Casey

On 03/27/2017 03:13 PM, Graham Allan wrote:

I'm following up to myself here, but I'd love to hear if anyone knows
how the global quotas can be set in jewel's radosgw. I haven't found
anything which has an effect - the documentation says to use:

radosgw-admin region-map get > regionmap.json
...edit the json file
radosgw-admin region-map set < regionmap.json

but this has no effect on jewel. There doesn't seem to be any
analogous function in the "period"-related commands which I think
would be the right place to look for jewel.

Am I missing something, or should I open a bug?

Graham




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Client's read affinity

2017-04-05 Thread Wes Dillingham
This is a big development for us. I have not heard of this option either. I
am excited to play with this feature and the implications it may have in
improving RBD reads in our multi-datacenter RBD pools.

Just to clarify the following options:
"rbd localize parent reads = true" and "crush location = foo=bar" are
configuration options for the client's ceph.conf and are not needed for OSD
hosts as their locations are already encoded in the CRUSH map.

It looks like this is a pretty old option ( http://narkive.com/ZkTahBVu:5.455.67 )

so I am assuming it is relatively tried and true? but I have never heard of
it before... is anyone out there using this in a production RBD environment?




On Tue, Apr 4, 2017 at 7:36 PM, Jason Dillaman  wrote:

> AFAIK, the OSDs should discover their location in the CRUSH map
> automatically -- therefore, this "crush location" config override
> would be used for librbd client configuration ("i.e. [client]
> section") to describe their location in the CRUSH map relative to
> racks, hosts, etc.
>
> On Tue, Apr 4, 2017 at 3:12 PM, Brian Andrus 
> wrote:
> > Jason, I haven't heard much about this feature.
> >
> > Will the localization have effect if the crush location configuration is
> set
> > in the [osd] section, or does it need to apply globally for clients as
> well?
> >
> > On Fri, Mar 31, 2017 at 6:38 AM, Jason Dillaman 
> wrote:
> >>
> >> Assuming you are asking about RBD-backed VMs, it is not possible to
> >> localize all reads to the VM image. You can, however, enable
> >> localization of the parent image since that is a read-only data set.
> >> To enable that feature, set "rbd localize parent reads = true" and
> >> populate the "crush location = host=X rack=Y etc=Z" in your ceph.conf.
> >>
> >> On Fri, Mar 31, 2017 at 9:00 AM, Alejandro Comisario
> >>  wrote:
> >> > any experiences ?
> >> >
> >> > On Wed, Mar 29, 2017 at 2:02 PM, Alejandro Comisario
> >> >  wrote:
> >> >> Guys hi.
> >> >> I have a Jewel Cluster divided into two racks which is configured on
> >> >> the crush map.
> >> >> I have clients (openstack compute nodes) that are closer from one
> rack
> >> >> than to another.
> >> >>
> >> >> I would love to (if is possible) to specify in some way the clients
> to
> >> >> read first from the nodes on a specific rack then try the other one
> if
> >> >> is not possible.
> >> >>
> >> >> Is that doable ? can somebody explain me how to do it ?
> >> >> best.
> >> >>
> >> >> --
> >> >> Alejandrito
> >> >
> >> >
> >> >
> >> > --
> >> > Alejandro Comisario
> >> > CTO | NUBELIU
> >> > E-mail: alejandro@nubeliu.com  Cell: +54 9 11 3770 1857
> >> > _
> >> > www.nubeliu.com
> >>
> >>
> >>
> >> --
> >> Jason
> >
> >
> >
> >
> > --
> > Brian Andrus | Cloud Systems Engineer | DreamHost
> > brian.and...@dreamhost.com | www.dreamhost.com
>
>
>
> --
> Jason
>



-- 
Respectfully,

Wes Dillingham
wes_dilling...@harvard.edu
Research Computing | Infrastructure Engineer
Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 210
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw leaking objects

2017-04-05 Thread Luis Periquito
To make my life easier: do you have such a script already written?

Also, has the source of the orphans been found, or will they continue
to happen after the upgrade to the newer version?

thanks,

On Mon, Apr 3, 2017 at 4:59 PM, Yehuda Sadeh-Weinraub  wrote:
> On Mon, Apr 3, 2017 at 1:32 AM, Luis Periquito  wrote:
>>> Right. The tool isn't removing objects (yet), because we wanted to
>>> have more confidence in the tool before having it automatically
>>> deleting all the found objects. The process currently is to manually
>>> move these objects to a different backup pool (via rados cp, rados
>>> rm), then when you're confident that no needed data was lost in the
>>> process remove the backup pool. In the future we'll automate that.
>>
>> My problem exactly. I don't have enough confidence in myself to just
>> delete a bunch of random objects... Any idea as to when such a tool will
>> be available?
>
> Why random? The objects are the ones that the orphan tool pointed at.
> And the idea is to move these objects to a safe place before removal,
> so that even if the wrong objects are removed, they can be recovered.
> There is no current ETA for the tool, but the tool will probably have
> the same two steps as reflected here: 1. backup, 2. remove backup.
>
> Yehuda
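[Editor's note: until the automated tool lands, the manual "backup, then remove" flow Yehuda describes could be sketched as below. This is a hedged sketch, not the official tool: the pool names and the orphan list are assumptions (in practice the list comes from the orphan-find scan), and `get`/`put` is used instead of `cp` so the copy works across pools. With `DRY_RUN=1` (the default) it only prints the commands it would run.]

```shell
#!/bin/sh
# Hedged sketch of the manual "backup, then remove" flow described above.
# Assumptions: pool names are examples, and the orphan list would really
# come from the orphan-find tool rather than the printf below.
SRC_POOL=default.rgw.buckets.data   # hypothetical source pool
BACKUP_POOL=orphan-backup           # hypothetical backup pool
DRY_RUN=${DRY_RUN:-1}               # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stand-in orphan list; replace with the real scan output.
printf '%s\n' obj_a obj_b > /tmp/orphans.txt

while IFS= read -r obj; do
    # Copy each object to the backup pool via get/put, then remove the
    # source copy only after the backup copy has been written.
    run rados -p "$SRC_POOL" get "$obj" /tmp/orphan.bin
    run rados -p "$BACKUP_POOL" put "$obj" /tmp/orphan.bin
    run rados -p "$SRC_POOL" rm "$obj"
done < /tmp/orphans.txt
```

Once you are confident nothing needed was lost, the backup pool itself can be removed, as the message describes.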


[ceph-users] bluestore - OSD booting issue continuously

2017-04-05 Thread nokia ceph
Hello,

Env: Ceph 11.2.0, bluestore, EC 4+1, RHEL 7.2

We are facing an issue where one OSD boots again and again, which is driving
the cluster crazy :(.  As you can see, one PG got into an inconsistent state,
and while we tried to repair that particular PG, its primary OSD went down.
After some time we found traces in the mon logs showing that the OSD boots
continuously.

# ceph pg stat

v1109889: 8192 pgs: 1 active+clean+inconsistent, 8191 active+clean; 31041 GB data, 42473 GB used, 1177 TB / 1218 TB avail; 411 kB/s rd, 28102 kB/s wr, 322 op/s



# ceph health detail | grep 'inconsistent'

HEALTH_ERR 1 pgs inconsistent; 5 scrub errors

pg 1.5c4 is active+clean+inconsistent, acting [258,268,156,14,67]



# ceph pg deep-scrub 1.5c4

instructing pg 1.5c4 on osd.258 to deep-scrub





From ceph-osd.258.log when the problem pops up:

=


2017-04-04 13:08:08.288728 7f6b34fe6700 -1 log_channel(cluster) log [ERR] : 1.5c4s0 scrub stat mismatch, got 1989/1988 objects, 0/0 clones, 1989/1988 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 whiteouts, 4113960196/4110943924 bytes, 0/0 hit_set_archive bytes.

2017-04-04 13:08:08.288744 7f6b34fe6700 -1 log_channel(cluster) log [ERR] :
1.5c4s0 scrub 1 missing, 0 inconsistent objects



2017-04-04 13:08:08.288747 7f6b34fe6700 -1 log_channel(cluster) log [ERR] :
1.5c4 scrub 5 errors


From mon logs:

===



2017-04-05 07:18:17.665821 7f0fb55b5700  0 log_channel(cluster) log [INF] : pgmap v1134421: 8192 pgs: 1 activating+degraded+inconsistent, 4 activating, 27 active+recovery_wait+degraded, 3 active+recovering+degraded, 8157 active+clean; 31619 GB data, 43163 GB used, 1176 TB / 1218 TB avail; 16429 kB/s rd, 109 MB/s wr, 1208 op/s; 554/87717275 objects degraded (0.001%)

2017-04-05 07:18:18.643905 7f0fb55b5700  1 mon.cn1@0(leader).osd e4701
e4701: 335 osds: 334 up, 335 in

2017-04-05 07:18:18.656143 7f0fb55b5700  0 mon.cn1@0(leader).osd e4701
crush map has features 288531987042664448, adjusting msgr requires

2017-04-05 07:18:18.676477 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osdmap e4701: 335 osds: 334 up, 335 in

2017-04-05 07:18:18.731719 7f0fb55b5700  0 log_channel(cluster) log [INF] : pgmap v1134422: 8192 pgs: 1 activating+degraded+inconsistent, 4 activating, 27 active+recovery_wait+degraded, 3 active+recovering+degraded, 8157 active+clean; 31619 GB data, 43163 GB used, 1176 TB / 1218 TB avail; 21108 kB/s rd, 105 MB/s wr, 1165 op/s; 554/87717365 objects degraded (0.001%)

2017-04-05 07:18:19.963217 7f0fb55b5700  1 mon.cn1@0(leader).osd e4702
e4702: 335 osds: 335 up, 335 in

2017-04-05 07:18:19.975002 7f0fb55b5700  0 mon.cn1@0(leader).osd e4702
crush map has features 288531987042664448, adjusting msgr requires

2017-04-05 07:18:19.978696 7f0fb55b5700  0 log_channel(cluster) log [INF] : osd.258 10.139.4.84:6815/27074 boot   <---

2017-04-05 07:18:19.983349 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osdmap e4702: 335 osds: 335 up, 335 in

2017-04-05 07:18:20.079996 7f0fb55b5700  0 log_channel(cluster) log [INF] : pgmap v1134423: 8192 pgs: 1 inconsistent+peering, 15 peering, 23 active+recovery_wait+degraded, 3 active+recovering+degraded, 8150 active+clean; 31619 GB data, 43163 GB used, 1176 TB / 1218 TB avail; 6888 kB/s rd, 72604 kB/s wr, 654 op/s; 489/87717555 objects degraded (0.001%)

2017-04-05 07:18:20.135081 7f0fb55b5700  0 log_channel(cluster) log [INF] : pgmap v1134424: 8192 pgs: 19 stale+active+clean, 1 inconsistent+peering, 15 peering, 23 active+recovery_wait+degraded, 3 active+recovering+degraded, 8131 active+clean; 31619 GB data, 43163 GB used, 1176 TB / 1218 TB avail; 3452 kB/s rd, 97092 kB/s wr, 834 op/s; 489/87717565 objects degraded (0.001%)

2017-04-05 07:18:20.303398 7f0fb55b5700  1 mon.cn1@0(leader).osd e4703
e4703: 335 osds: 335 up, 335 in

A few occurrences of the boot message:

2017-04-05 07:12:40.085065 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osd.258 10.139.4.84:6815/23975 boot
2017-04-05 07:13:37.20 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osd.258 10.139.4.84:6815/24334 boot
2017-04-05 07:14:35.413633 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osd.258 10.139.4.84:6815/25745 boot
2017-04-05 07:15:30.303761 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osd.258 10.139.4.84:6815/26194 boot
2017-04-05 07:16:25.155067 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osd.258 10.139.4.84:6815/26561 boot
2017-04-05 07:17:22.538683 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osd.258 10.139.4.84:6815/26768 boot
2017-04-05 07:18:19.978696 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osd.258 10.139.4.84:6815/27074 boot
2017-04-05 07:19:14.201683 7f0fb55b5700  0 log_channel(cluster) log [INF] :
osd.258 10.139.4.84:6815/28517 boot


From OSD.258:

==

  -5299> 2017-04-05 07:18:20.151610 7f7c9ed5e700  1 osd.258 4702 state: booting -> active

  -5298> 2017-04-05 07:18:20.151622 7f7c9ed5e700  0 osd.258 4702 crush map has features 288531987042
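
[Editor's note: for the inconsistent PG above, the usual inspection/repair flow could be sketched as below. This is a hedged sketch, not advice specific to this cluster: repair should only be attempted once osd.258 (the primary) has stopped flapping, since it must be up and stable. With `DRY_RUN=1` (the default) the script only prints the commands it would run.]

```shell
#!/bin/sh
# Hedged sketch: inspect and repair the inconsistent PG from the logs above.
PG=1.5c4
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Show which objects the deep-scrub flagged (available since jewel).
run rados list-inconsistent-obj "$PG" --format=json-pretty
# Ask the primary OSD to repair the PG once it is up and stable.
run ceph pg repair "$PG"
```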

Re: [ceph-users] Apply for an official mirror at CN

2017-04-05 Thread Shinobu Kinjo
Adding Patrick, who might be the best person for this.

Regards,

On Wed, Apr 5, 2017 at 6:16 PM, Wido den Hollander  wrote:
>
>> Op 5 april 2017 om 8:14 schreef SJ Zhu :
>>
>>
>> Wido, ping?
>>
>
> This might take a while! Has to go through a few hops for this to get fixed.
>
> It's on my radar!
>
> Wido
>
>> On Sat, Apr 1, 2017 at 8:40 PM, SJ Zhu  wrote:
>> > On Sat, Apr 1, 2017 at 8:10 PM, Wido den Hollander  wrote:
>> >> Great! Very good to hear. We can CNAME cn.ceph.com to that location?
>> >
>> >
>> > Yes, please CNAME to mirrors.ustc.edu.cn, and I will set vhost in our
>> > nginx for the
>> > ceph directory.
>> >
>> > Thanks
>> >
>> > --
>> > Regards,
>> > Shengjing Zhu
>>
>>
>>
>> --
>> Regards,
>> Shengjing Zhu


[ceph-users] write to ceph hangs

2017-04-05 Thread Laszlo Budai

Hello,

We have an issue when writing to ceph. From time to time the write operation 
seems to hang for a few seconds.
We've seen https://bugzilla.redhat.com/show_bug.cgi?id=1389503, and there it is said 
that when the qemu process reaches the max open files limit, "the guest OS 
should be paused". Is this the case for the entire guest OS, or is it only for the 
given write operation?

In our case we have observed situations where different processes in the same VM 
access different mountpoints (all of them on Ceph) and some writes fail 
while others do not.
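
[Editor's note: one thing worth checking on the hypervisor is how close the qemu process is to its open-files limit, since each attached RBD volume adds sockets and fds to that process. A hedged sketch (Linux-only, via /proc): PID defaults to the current shell just for demonstration; in practice point it at the qemu process, whose name varies by distro.]

```shell
#!/bin/sh
# Hedged sketch: compare a process's open-fd count to its soft limit.
# In practice something like: PID=$(pgrep -f qemu) sh check_fds.sh
# (the process name and script name are assumptions).
PID=${PID:-$$}

# Soft "Max open files" limit is the 4th field of that line in /proc.
limit=$(awk '/Max open files/ {print $4}' "/proc/$PID/limits")
# Count the fds currently open by the process.
used=$(ls "/proc/$PID/fd" | wc -l)

echo "pid=$PID open_fds=$used soft_limit=$limit"
```

If open_fds is near the soft limit, raising the limit for the qemu process (or its systemd unit) is the usual remedy.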

Kind regards,
Laszlo


Re: [ceph-users] Apply for an official mirror at CN

2017-04-05 Thread Wido den Hollander

> Op 5 april 2017 om 8:14 schreef SJ Zhu :
> 
> 
> Wido, ping?
> 

This might take a while! Has to go through a few hops for this to get fixed.

It's on my radar!

Wido

> On Sat, Apr 1, 2017 at 8:40 PM, SJ Zhu  wrote:
> > On Sat, Apr 1, 2017 at 8:10 PM, Wido den Hollander  wrote:
> >> Great! Very good to hear. We can CNAME cn.ceph.com to that location?
> >
> >
> > Yes, please CNAME to mirrors.ustc.edu.cn, and I will set vhost in our
> > nginx for the
> > ceph directory.
> >
> > Thanks
> >
> > --
> > Regards,
> > Shengjing Zhu
> 
> 
> 
> -- 
> Regards,
> Shengjing Zhu