Re: [ceph-users] osd_snap_trim_sleep keeps locks PG during sleep?

2017-01-20 Thread Nick Fisk
Hi Sam,

 

I have a test cluster, albeit small. I’m happy to run tests + graph results 
with a wip branch and work out reasonable settings…etc

 

From: Samuel Just [mailto:sj...@redhat.com] 
Sent: 19 January 2017 23:23
To: David Turner 
Cc: Nick Fisk ; ceph-users 
Subject: Re: [ceph-users] osd_snap_trim_sleep keeps locks PG during sleep?

 

I could probably put together a wip branch if you have a test cluster you could 
try it out on.

-Sam

 

On Thu, Jan 19, 2017 at 2:27 PM, David Turner <david.tur...@storagecraft.com> wrote:

To be clear, we are willing to change to a snap_trim_sleep of 0 and try to 
manage it with the other available settings... but it is sounding like that 
won't really work for us since our main op thread(s) will just be saturated 
with snap trimming almost all day.  We currently only have ~6 hours/day where 
our snap trim q's are empty.
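
For reference, a minimal sketch of how a setting like osd_snap_trim_sleep can be inspected and changed on a running cluster (the OSD id and values are only illustrative, not a recommendation):

# Current value on one OSD (run on the host carrying osd.0)
ceph daemon osd.0 config get osd_snap_trim_sleep

# Inject a new value into all running OSDs without restarting them
# (injected values do not survive an OSD restart)
ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.25'

# Persist it across restarts via ceph.conf:
# [osd]
# osd snap trim sleep = 0.25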


From: ceph-users [ceph-users-boun...@lists.ceph.com 
 ] on behalf of David Turner 
[david.tur...@storagecraft.com  ]
Sent: Thursday, January 19, 2017 3:25 PM
To: Samuel Just; Nick Fisk


Cc: ceph-users
Subject: Re: [ceph-users] osd_snap_trim_sleep keeps locks PG during sleep?

 

We are a couple of weeks away from upgrading to Jewel in our production 
clusters (after months of testing in our QA environments), but this might 
prevent us from making the migration from Hammer.   We delete ~8,000 
snapshots/day between 3 clusters and our snap_trim_q gets up to about 60 
Million in each of those clusters.  We have to use an osd_snap_trim_sleep of 
0.25 to prevent our clusters from falling on their faces during our big load 
and 0.1 the rest of the day to catch up on the snap trim q.

Is our setup possible to use on Jewel?



From: ceph-users [ceph-users-boun...@lists.ceph.com 
 ] on behalf of Samuel Just 
[sj...@redhat.com  ]
Sent: Thursday, January 19, 2017 2:45 PM
To: Nick Fisk
Cc: ceph-users
Subject: Re: [ceph-users] osd_snap_trim_sleep keeps locks PG during sleep?

Yeah, I think you're probably right.  The answer is probably to add an
explicit rate-limiting element to the way the snaptrim events are
scheduled.
-Sam

On Thu, Jan 19, 2017 at 1:34 PM, Nick Fisk <n...@fisk.me.uk> wrote:
> I will give those both a go and report back, but the more I think about 
> this the less I'm convinced that it's going to help.
>
> I think the problem is a general IO imbalance, there is probably something 
> like 100+ times more trimming IO than client IO and so even if client IO gets 
> promoted to the front of the queue by Ceph, once it hits the Linux IO layer 
> it's fighting for itself. I guess this approach works with scrubbing as each 
> read IO has to wait to be read before the next one is submitted, so the queue 
> can be managed on the OSD. With trimming, writes can buffer up below what the 
> OSD controls.
>
> I don't know if the snap trimming goes nuts because the journals are acking 
> each request and the spinning disks can't keep up, or if it's something else. 
> Does WBThrottle get involved with snap trimming?
>
> But from an underlying disk perspective, there is definitely more than 2 
> snaps per OSD at a time going on, even if the OSD itself is not processing 
> more than 2 at a time. I think there either needs to be another knob so that 
> Ceph can throttle back snaps, not just de-prioritise them. Or, there needs to 
> be a whole new kernel interface where an application can priority-tag individual 
> IOs for CFQ to handle, instead of the current limitation of priority per 
> thread. I realise this is probably very, very hard or impossible. But it would 
> allow Ceph to control IO queues right down to the disk.
>
>> -Original Message-
>> From: Samuel Just [mailto:sj...@redhat.com  ]
>> Sent: 19 January 2
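
One knob that does exist today for pushing background disk work down the queue is the disk-thread ioprio pair; a hedged sketch below. Note these options only reprioritise the OSD's disk/filestore threads, they require the CFQ scheduler on the data disks, and they may not touch snap trimming at all once it runs in the unified op queue:

# Data disk must be using CFQ, otherwise the ioprio class is ignored
cat /sys/block/sda/queue/scheduler

# Drop the OSD disk thread(s) to the idle class, lowest priority
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'

# Equivalent ceph.conf entries:
# [osd]
# osd disk thread ioprio class = idle
# osd disk thread ioprio priority = 7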

Re: [ceph-users] [Ceph-community] Consultation about ceph storage cluster architecture

2017-01-20 Thread Nick Fisk
I think he needs the “gateway” servers because he wishes to expose the storage 
to clients which won’t speak Ceph natively. I’m not sure I would entirely trust 
that Windows port of CephFS, and there are also security concerns with allowing 
end users to talk directly to Ceph. There’s also future stuff like CephFS 
snapshots, which Samba can then wrap around to provide previous versions 
directly into Windows Explorer.

 

You probably also don’t want to use CephFS at this point; there are several 
outstanding points that mean it’s going to require a bit more work than you 
ideally want to go through if you want to export it HA over NFS and SMB 
reliably. There were some posts about a year ago (I think??) which covered 
some of the issues using CTDB with CephFS in an HA fashion, mainly around 
timeouts in CephFS and how they match up to what CTDB expects.

 

Currently the simplest and most reliable way is, like the blog article linked, 
to export NFS+SMB from a filesystem on top of an RBD. This will require the use 
of Pacemaker for HA, and that’s a whole separate topic in itself to get right.
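
As a rough illustration of that RBD-backed layout (image name, mount point and export network are hypothetical, and the Pacemaker resources that would normally manage the mapping, mount and VIP are left out):

# Map an RBD image and put a local filesystem on it
rbd map rbd/nas01                     # appears as e.g. /dev/rbd0
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /srv/nas01

# /etc/exports - NFS export of the mounted filesystem
# /srv/nas01  192.168.0.0/24(rw,sync,no_root_squash)

# /etc/samba/smb.conf - matching SMB share
# [nas01]
#     path = /srv/nas01
#     read only = no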

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David 
Turner
Sent: 20 January 2017 19:08
To: John Petrini ; Joao Eduardo Luis 
Cc: ceph-users@lists.ceph.com; hen shmuel 
Subject: Re: [ceph-users] [Ceph-community] Consultation about ceph storage 
cluster architecture

 

CephFS does not require a central NFS server.  Any Linux server can mount the 
CephFS volume at the same time.  There is also a Windows client for CephFS 
(https://drupal.star.bnl.gov/STAR/blog/mpoat/cephfs-client-windows-based-dokan-060).
I don't see the need for the NFS/SMB server or complicated HA setup.  What you 
do need to make HA is the MDS service.  It is not natively HA, but there are 
several guides to work around that.
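
For example, a plain kernel-client mount of CephFS from any Linux box (monitor address and secret file are placeholders):

# Mount CephFS directly with the kernel client; no NFS gateway involved
mkdir -p /mnt/cephfs
mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret

# Equivalent /etc/fstab line:
# 192.168.0.1:6789:/  /mnt/cephfs  ceph  name=admin,secretfile=/etc/ceph/admin.secret,noatime  0  0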


From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of John Petrini 
[jpetr...@coredial.com]
Sent: Friday, January 20, 2017 10:41 AM
To: Joao Eduardo Luis
Cc: ceph-users@lists.ceph.com; hen shmuel
Subject: Re: [ceph-users] [Ceph-community] Consultation about ceph storage 
cluster architecture

Here's a really good write up on how to cluster NFS servers backed by RBD 
volumes. It could be adapted to use CephFS with relative ease. 

 

https://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/





On Fri, Jan 20, 2017 at 12:35 PM, Joao Eduardo Luis <j...@suse.de> wrote:

Hi,

This email is better suited for the 'ceph-users' list (CC'ed).

You'll likely find more answers there.

  -Joao

On 01/20/2017 04:33 PM, hen shmuel wrote:

I'm new to Ceph and I want to build a Ceph storage cluster at my work site,
to provide NAS services to our clients, as NFS to our Linux server
clients, and as CIFS to our Windows server clients. To my understanding,
in order to do that with Ceph I need to:

 1. build a full Ceph storage cluster
 2. create CephFS "volumes" on my Ceph cluster
 3. mount the CephFS on a Linux server that will be used as a "gateway"
to export the CephFS as NFS to Linux servers and as CIFS to Windows
servers
 4. on my Linux "gateway" server, install an NFS server and an SMB
server to do the export part
 5. in order to overcome the "single point of failure" with this Linux
"gateway" server, I will need to build a cluster of them and use
a clustered NFS server and CTDB or something like that


I wanted to know if I understand it properly and if this is the right
way to do that, or if there is a simplified way to achieve NAS services
like I want.

Re: [ceph-users] Bluestore: v11.2.0 peering not happening when OSD is down

2017-01-20 Thread Shinobu Kinjo
`ceph pg dump` should show you something like:

 * active+undersized+degraded ... [NONE,3,2,4,1]  3  [NONE,3,2,4,1]

Sam,

Am I wrong? Or is it up to something else?
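
A hedged way to pull the same information out of a live cluster (the PG id used below is only an example):

# Per-PG state with up/acting sets; look for NONE in the acting set
ceph pg dump pgs_brief | grep -E 'incomplete|undersized'

# Full peering detail for one suspect PG
ceph pg 1.2f query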


On Sat, Jan 21, 2017 at 4:22 AM, Gregory Farnum  wrote:
> I'm pretty sure the default configs won't let an EC PG go active with
> only "k" OSDs in its PG; it needs at least k+1 (or possibly more? Not
> certain). Running an "n+1" EC config is just not a good idea.
> For testing you could probably adjust this with the equivalent of
> min_size for EC pools, but I don't know the parameters off the top of
> my head.
> -Greg
>
> On Fri, Jan 20, 2017 at 2:15 AM, Muthusamy Muthiah
>  wrote:
>> Hi ,
>>
>> We are validating kraken 11.2.0 with bluestore  on 5 node cluster with EC
>> 4+1.
>>
>> When an OSD is down, peering is not happening and the ceph health status
>> moves to ERR state after a few minutes. This was working in previous development
>> releases. Is any additional configuration required in v11.2.0?
>>
>> Following is our ceph configuration:
>>
>> mon_osd_down_out_interval = 30
>> mon_osd_report_timeout = 30
>> mon_osd_down_out_subtree_limit = host
>> mon_osd_reporter_subtree_level = host
>>
>> and the recovery parameters set to default.
>>
>> [root@ca-cn1 ceph]# ceph osd crush show-tunables
>>
>> {
>> "choose_local_tries": 0,
>> "choose_local_fallback_tries": 0,
>> "choose_total_tries": 50,
>> "chooseleaf_descend_once": 1,
>> "chooseleaf_vary_r": 1,
>> "chooseleaf_stable": 1,
>> "straw_calc_version": 1,
>> "allowed_bucket_algs": 54,
>> "profile": "jewel",
>> "optimal_tunables": 1,
>> "legacy_tunables": 0,
>> "minimum_required_version": "jewel",
>> "require_feature_tunables": 1,
>> "require_feature_tunables2": 1,
>> "has_v2_rules": 1,
>> "require_feature_tunables3": 1,
>> "has_v3_rules": 0,
>> "has_v4_buckets": 0,
>> "require_feature_tunables5": 1,
>> "has_v5_rules": 0
>> }
>>
>> ceph status:
>>
>>  health HEALTH_ERR
>> 173 pgs are stuck inactive for more than 300 seconds
>> 173 pgs incomplete
>> 173 pgs stuck inactive
>> 173 pgs stuck unclean
>>  monmap e2: 5 mons at
>> {ca-cn1=10.50.5.117:6789/0,ca-cn2=10.50.5.118:6789/0,ca-cn3=10.50.5.119:6789/0,ca-cn4=10.50.5.120:6789/0,ca-cn5=10.50.5.121:6789/0}
>> election epoch 106, quorum 0,1,2,3,4
>> ca-cn1,ca-cn2,ca-cn3,ca-cn4,ca-cn5
>> mgr active: ca-cn1 standbys: ca-cn2, ca-cn4, ca-cn5, ca-cn3
>>  osdmap e1128: 60 osds: 59 up, 59 in; 173 remapped pgs
>> flags sortbitwise,require_jewel_osds,require_kraken_osds
>>   pgmap v782747: 2048 pgs, 1 pools, 63133 GB data, 46293 kobjects
>> 85199 GB used, 238 TB / 322 TB avail
>> 1868 active+clean
>>  173 remapped+incomplete
>>7 active+clean+scrubbing
>>
>> MON log:
>>
>> 2017-01-20 09:25:54.715684 7f55bcafb700  0 log_channel(cluster) log [INF] :
>> osd.54 out (down for 31.703786)
>> 2017-01-20 09:25:54.725688 7f55bf4d5700  0 mon.ca-cn1@0(leader).osd e1120
>> crush map has features 288250512065953792, adjusting msgr requires
>> 2017-01-20 09:25:54.729019 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>> osdmap e1120: 60 osds: 59 up, 59 in
>> 2017-01-20 09:25:54.735987 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>> pgmap v781993: 2048 pgs: 1869 active+clean, 173 incomplete, 6
>> active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB / 322 TB avail;
>> 21825 B/s rd, 163 MB/s wr, 2046 op/s
>> 2017-01-20 09:25:55.737749 7f55bf4d5700  0 mon.ca-cn1@0(leader).osd e1121
>> crush map has features 288250512065953792, adjusting msgr requires
>> 2017-01-20 09:25:55.744338 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>> osdmap e1121: 60 osds: 59 up, 59 in
>> 2017-01-20 09:25:55.749616 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>> pgmap v781994: 2048 pgs: 29 remapped+incomplete, 1869 active+clean, 144
>> incomplete, 6 active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB /
>> 322 TB avail; 44503 B/s rd, 45681 kB/s wr, 518 op/s
>> 2017-01-20 09:25:56.768721 7f55bf4d5700  0 log_channel(cluster) log [INF] :
>> pgmap v781995: 2048 pgs: 47 remapped+incomplete, 1869 active+clean, 126
>> incomplete, 6 active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB /
>> 322 TB avail; 20275 B/s rd, 72742 kB/s wr, 665 op/s
>>
>> Thanks,
>> Muthu
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems with http://tracker.ceph.com/?

2017-01-20 Thread Dan Mick

ns3 is still answering, wrongly, for the record

On 1/20/2017 12:18 PM, David Galloway wrote:
This is resolved.  Apparently ns3 was shut down a while ago and ns4 
just took a while to catch up.


ceph.com, download.ceph.com, and docs.ceph.com all have updated DNS 
records.


Sorry again for the trouble this caused all week.  The steps we've 
taken should allow us to return to a reasonable level of stability and 
uptime.


On 01/20/2017 11:28 AM, Dan Mick wrote:

Only in that we changed the zone and apparently it hasn't propagated
properly.  I'll check with RHIT.

Sent from Nine 

*From:* Sean Redmond 
*Sent:* Jan 20, 2017 3:07 AM
*To:* Dan Mick
*Cc:* Shinobu Kinjo; Brian Andrus; ceph-users
*Subject:* Re: [ceph-users] Problems with http://tracker.ceph.com/?

Hi,

Is the current strange DNS issue with docs.ceph.com related to this also? I
noticed that docs.ceph.com is getting a different A record from ns4.redhat.com
vs ns{1..3}.redhat.com


dig output here > http://pastebin.com/WapDY9e2

Thanks

On Thu, Jan 19, 2017 at 11:03 PM, Dan Mick <dm...@redhat.com> wrote:

On 01/19/2017 09:57 AM, Shinobu Kinjo wrote:

>> The good news is the tenant delete failed. The bad news is we're looking for
>> the tracker volume now, which is no longer present in the Ceph project.


We've reloaded a new instance of tracker.ceph.com from a backup of the
database, and believe it's back online now.  The backup was taken at
about 12:31 PDT, so the last 8 or so hours of changes are, sadly, gone,
so if you had tracker updates during that time period, you may need to
redo them.

Sorry for the inconvenience.  We've relocated the tracker service to
hopefully mitigate this vulnerability.



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems with http://tracker.ceph.com/?

2017-01-20 Thread David Galloway
This is resolved.  Apparently ns3 was shut down a while ago and ns4 just 
took a while to catch up.


ceph.com, download.ceph.com, and docs.ceph.com all have updated DNS records.

Sorry again for the trouble this caused all week.  The steps we've taken 
should allow us to return to a reasonable level of stability and uptime.


On 01/20/2017 11:28 AM, Dan Mick wrote:

Only in that we changed the zone and apparently it hasn't propagated
properly.  I'll check with RHIT.

Sent from Nine 

*From:* Sean Redmond 
*Sent:* Jan 20, 2017 3:07 AM
*To:* Dan Mick
*Cc:* Shinobu Kinjo; Brian Andrus; ceph-users
*Subject:* Re: [ceph-users] Problems with http://tracker.ceph.com/?

Hi,

Is the current strange DNS issue with docs.ceph.com related to this also? I
noticed that docs.ceph.com is getting a different A record from ns4.redhat.com
vs ns{1..3}.redhat.com


dig output here > http://pastebin.com/WapDY9e2

Thanks

On Thu, Jan 19, 2017 at 11:03 PM, Dan Mick <dm...@redhat.com> wrote:

On 01/19/2017 09:57 AM, Shinobu Kinjo wrote:

>> The good news is the tenant delete failed. The bad news is we're looking for
>> the tracker volume now, which is no longer present in the Ceph project.

We've reloaded a new instance of tracker.ceph.com from a backup of the
database, and believe it's back online now.  The backup was taken at
about 12:31 PDT, so the last 8 or so hours of changes are, sadly, gone,
so if you had tracker updates during that time period, you may need to
redo them.

Sorry for the inconvenience.  We've relocated the tracker service to
hopefully mitigate this vulnerability.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore: v11.2.0 peering not happening when OSD is down

2017-01-20 Thread Gregory Farnum
I'm pretty sure the default configs won't let an EC PG go active with
only "k" OSDs in its PG; it needs at least k+1 (or possibly more? Not
certain). Running an "n+1" EC config is just not a good idea.
For testing you could probably adjust this with the equivalent of
min_size for EC pools, but I don't know the parameters off the top of
my head.
-Greg
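
For reference, a minimal sketch of how to check those values on the pool in question (pool and profile names below are placeholders, not taken from this thread):

# How many shards must be present before a PG in the pool may go active
ceph osd pool get ecpool min_size

# The k/m layout behind the pool's erasure-code profile
ceph osd erasure-code-profile get myprofile

# min_size can be lowered for testing, at the cost of resilience
# ceph osd pool set ecpool min_size 4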

On Fri, Jan 20, 2017 at 2:15 AM, Muthusamy Muthiah
 wrote:
> Hi ,
>
> We are validating kraken 11.2.0 with bluestore  on 5 node cluster with EC
> 4+1.
>
> When an OSD is down, peering is not happening and the ceph health status
> moves to ERR state after a few minutes. This was working in previous development
> releases. Is any additional configuration required in v11.2.0?
>
> Following is our ceph configuration:
>
> mon_osd_down_out_interval = 30
> mon_osd_report_timeout = 30
> mon_osd_down_out_subtree_limit = host
> mon_osd_reporter_subtree_level = host
>
> and the recovery parameters set to default.
>
> [root@ca-cn1 ceph]# ceph osd crush show-tunables
>
> {
> "choose_local_tries": 0,
> "choose_local_fallback_tries": 0,
> "choose_total_tries": 50,
> "chooseleaf_descend_once": 1,
> "chooseleaf_vary_r": 1,
> "chooseleaf_stable": 1,
> "straw_calc_version": 1,
> "allowed_bucket_algs": 54,
> "profile": "jewel",
> "optimal_tunables": 1,
> "legacy_tunables": 0,
> "minimum_required_version": "jewel",
> "require_feature_tunables": 1,
> "require_feature_tunables2": 1,
> "has_v2_rules": 1,
> "require_feature_tunables3": 1,
> "has_v3_rules": 0,
> "has_v4_buckets": 0,
> "require_feature_tunables5": 1,
> "has_v5_rules": 0
> }
>
> ceph status:
>
>  health HEALTH_ERR
> 173 pgs are stuck inactive for more than 300 seconds
> 173 pgs incomplete
> 173 pgs stuck inactive
> 173 pgs stuck unclean
>  monmap e2: 5 mons at
> {ca-cn1=10.50.5.117:6789/0,ca-cn2=10.50.5.118:6789/0,ca-cn3=10.50.5.119:6789/0,ca-cn4=10.50.5.120:6789/0,ca-cn5=10.50.5.121:6789/0}
> election epoch 106, quorum 0,1,2,3,4
> ca-cn1,ca-cn2,ca-cn3,ca-cn4,ca-cn5
> mgr active: ca-cn1 standbys: ca-cn2, ca-cn4, ca-cn5, ca-cn3
>  osdmap e1128: 60 osds: 59 up, 59 in; 173 remapped pgs
> flags sortbitwise,require_jewel_osds,require_kraken_osds
>   pgmap v782747: 2048 pgs, 1 pools, 63133 GB data, 46293 kobjects
> 85199 GB used, 238 TB / 322 TB avail
> 1868 active+clean
>  173 remapped+incomplete
>7 active+clean+scrubbing
>
> MON log:
>
> 2017-01-20 09:25:54.715684 7f55bcafb700  0 log_channel(cluster) log [INF] :
> osd.54 out (down for 31.703786)
> 2017-01-20 09:25:54.725688 7f55bf4d5700  0 mon.ca-cn1@0(leader).osd e1120
> crush map has features 288250512065953792, adjusting msgr requires
> 2017-01-20 09:25:54.729019 7f55bf4d5700  0 log_channel(cluster) log [INF] :
> osdmap e1120: 60 osds: 59 up, 59 in
> 2017-01-20 09:25:54.735987 7f55bf4d5700  0 log_channel(cluster) log [INF] :
> pgmap v781993: 2048 pgs: 1869 active+clean, 173 incomplete, 6
> active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB / 322 TB avail;
> 21825 B/s rd, 163 MB/s wr, 2046 op/s
> 2017-01-20 09:25:55.737749 7f55bf4d5700  0 mon.ca-cn1@0(leader).osd e1121
> crush map has features 288250512065953792, adjusting msgr requires
> 2017-01-20 09:25:55.744338 7f55bf4d5700  0 log_channel(cluster) log [INF] :
> osdmap e1121: 60 osds: 59 up, 59 in
> 2017-01-20 09:25:55.749616 7f55bf4d5700  0 log_channel(cluster) log [INF] :
> pgmap v781994: 2048 pgs: 29 remapped+incomplete, 1869 active+clean, 144
> incomplete, 6 active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB /
> 322 TB avail; 44503 B/s rd, 45681 kB/s wr, 518 op/s
> 2017-01-20 09:25:56.768721 7f55bf4d5700  0 log_channel(cluster) log [INF] :
> pgmap v781995: 2048 pgs: 47 remapped+incomplete, 1869 active+clean, 126
> incomplete, 6 active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB /
> 322 TB avail; 20275 B/s rd, 72742 kB/s wr, 665 op/s
>
> Thanks,
> Muthu
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Consultation about ceph storage cluster architecture

2017-01-20 Thread David Turner
CephFS does not require a central NFS server.  Any Linux server can mount the 
CephFS volume at the same time.  There is also a Windows client for CephFS 
(https://drupal.star.bnl.gov/STAR/blog/mpoat/cephfs-client-windows-based-dokan-060).
I don't see the need for the NFS/SMB server or complicated HA setup.  What you 
do need to make HA is the MDS service.  It is not natively HA, but there are 
several guides to work around that.




From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of John Petrini 
[jpetr...@coredial.com]
Sent: Friday, January 20, 2017 10:41 AM
To: Joao Eduardo Luis
Cc: ceph-users@lists.ceph.com; hen shmuel
Subject: Re: [ceph-users] [Ceph-community] Consultation about ceph storage 
cluster architecture

Here's a really good write up on how to cluster NFS servers backed by RBD 
volumes. It could be adapted to use CephFS with relative ease.

https://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/





On Fri, Jan 20, 2017 at 12:35 PM, Joao Eduardo Luis <j...@suse.de> wrote:
Hi,

This email is better suited for the 'ceph-users' list (CC'ed).

You'll likely find more answers there.

  -Joao

On 01/20/2017 04:33 PM, hen shmuel wrote:
I'm new to Ceph and I want to build a Ceph storage cluster at my work site,
to provide NAS services to our clients, as NFS to our Linux server
clients, and as CIFS to our Windows server clients. To my understanding,
in order to do that with Ceph I need to:

 1. build a full Ceph storage cluster
 2. create CephFS "volumes" on my Ceph cluster
 3. mount the CephFS on a Linux server that will be used as a "gateway"
to export the CephFS as NFS to Linux servers and as CIFS to Windows
servers
 4. on my Linux "gateway" server, install an NFS server and an SMB
server to do the export part
 5. in order to overcome the "single point of failure" with this Linux
"gateway" server, I will need to build a cluster of them and use
a clustered NFS server and CTDB or something like that


I wanted to know if I understand it properly and if this is the right
way to do that, or if there is a simplified way to achieve NAS services
like I want.

thanks for any help!



___
Ceph-community mailing list
ceph-commun...@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph counters decrementing after changing pg_num

2017-01-20 Thread Shinobu Kinjo
What does `ceph -s` say?

On Sat, Jan 21, 2017 at 3:39 AM, Wido den Hollander  wrote:
>
>> On 20 January 2017 at 17:17 Kai Storbeck wrote:
>>
>>
>> Hello ceph users,
>>
>> My graphs of several counters in our Ceph cluster are showing abnormal
>> behaviour after changing the pg_num and pgp_num respectively.
>
> What counters exactly? Like pg information? It could be that it needs a scrub 
> on all PGs before that information is corrected. This scrub will trigger 
> automatically.
>
>>
>> We're using "http://eu.ceph.com/debian-hammer/ jessie/main".
>>
>>
>> Is this a bug, or will the counters stabilize at some time in the near
>> future? Or, is this otherwise fixable by "turning it off and on again"?
>>
>>
>> Regards,
>> Kai
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph counters decrementing after changing pg_num

2017-01-20 Thread Wido den Hollander

> On 20 January 2017 at 17:17 Kai Storbeck wrote:
> 
> 
> Hello ceph users,
> 
> My graphs of several counters in our Ceph cluster are showing abnormal
> behaviour after changing the pg_num and pgp_num respectively.

What counters exactly? Like pg information? It could be that it needs a scrub 
on all PGs before that information is corrected. This scrub will trigger 
automatically.
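
If you would rather not wait for the automatic scrubs, they can also be kicked off by hand; a small sketch (OSD and PG ids are placeholders):

# Scrub every PG whose primary is on one OSD
ceph osd scrub 12

# Or scrub a single PG
ceph pg scrub 3.1a

# Deep variants also verify object data and take considerably longer
ceph osd deep-scrub 12
ceph pg deep-scrub 3.1a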

> 
> We're using "http://eu.ceph.com/debian-hammer/ jessie/main".
> 
> 
> Is this a bug, or will the counters stabilize at some time in the near
> future? Or, is this otherwise fixable by "turning it off and on again"?
> 
> 
> Regards,
> Kai
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Consultation about ceph storage cluster architecture

2017-01-20 Thread John Petrini
Here's a really good write up on how to cluster NFS servers backed by RBD
volumes. It could be adapted to use CephFS with relative ease.

https://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/


On Fri, Jan 20, 2017 at 12:35 PM, Joao Eduardo Luis  wrote:

> Hi,
>
> This email is better suited for the 'ceph-users' list (CC'ed).
>
> You'll likely find more answers there.
>
>   -Joao
>
> On 01/20/2017 04:33 PM, hen shmuel wrote:
>
>> I'm new to Ceph and I want to build a Ceph storage cluster at my work site,
>> to provide NAS services to our clients, as NFS to our Linux server
>> clients, and as CIFS to our Windows server clients. To my understanding,
>> in order to do that with Ceph I need to:
>>
>>  1. build a full Ceph storage cluster
>>  2. create CephFS "volumes" on my Ceph cluster
>>  3. mount the CephFS on a Linux server that will be used as a "gateway"
>> to export the CephFS as NFS to Linux servers and as CIFS to Windows
>> servers
>>  4. on my Linux "gateway" server, install an NFS server and an SMB
>> server to do the export part
>>  5. in order to overcome the "single point of failure" with this Linux
>> "gateway" server, I will need to build a cluster of them and use
>> a clustered NFS server and CTDB or something like that
>>
>>
>> I wanted to know if I understand it properly and if this is the right
>> way to do that, or if there is a simplified way to achieve NAS services
>> like I want.
>>
>> thanks for any help!
>>
>>
>>
>> ___
>> Ceph-community mailing list
>> ceph-commun...@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Consultation about ceph storage cluster architecture

2017-01-20 Thread Joao Eduardo Luis

Hi,

This email is better suited for the 'ceph-users' list (CC'ed).

You'll likely find more answers there.

  -Joao

On 01/20/2017 04:33 PM, hen shmuel wrote:

I'm new to Ceph and I want to build a Ceph storage cluster at my work site,
to provide NAS services to our clients, as NFS to our Linux server
clients, and as CIFS to our Windows server clients. To my understanding,
in order to do that with Ceph I need to:

 1. build a full Ceph storage cluster
 2. create CephFS "volumes" on my Ceph cluster
 3. mount the CephFS on a Linux server that will be used as a "gateway"
to export the CephFS as NFS to Linux servers and as CIFS to Windows
servers
 4. on my Linux "gateway" server, install an NFS server and an SMB
server to do the export part
 5. in order to overcome the "single point of failure" with this Linux
"gateway" server, I will need to build a cluster of them and use
a clustered NFS server and CTDB or something like that


I wanted to know if I understand it properly and if this is the right
way to do that, or if there is a simplified way to achieve NAS services
like I want.

thanks for any help!



___
Ceph-community mailing list
ceph-commun...@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph counters decrementing after changing pg_num

2017-01-20 Thread Alexandre DERUMIER
If you change the pg_num value,

Ceph will reshuffle almost all data, so depending on the size of your storage it 
can take some time...
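
A small sketch of how to confirm the new values and watch the shuffle settle (pool name is a placeholder):

# Confirm pg_num and pgp_num were both raised
ceph osd pool get rbd pg_num
ceph osd pool get rbd pgp_num

# Watch backfill/recovery progress until the cluster is HEALTH_OK again
ceph -s
ceph -w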

- Original Message -
From: "Kai Storbeck" 
To: "ceph-users" 
Sent: Friday, 20 January 2017 17:17:08
Subject: [ceph-users] Ceph counters decrementing after changing pg_num

Hello ceph users, 

My graphs of several counters in our Ceph cluster are showing abnormal 
behaviour after changing the pg_num and pgp_num respectively. 

We're using "http://eu.ceph.com/debian-hammer/ jessie/main". 


Is this a bug, or will the counters stabilize at some time in the near 
future? Or, is this otherwise fixable by "turning it off and on again"? 


Regards, 
Kai 



___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems with http://tracker.ceph.com/?

2017-01-20 Thread Dan Mick
Only in that we changed the zone and apparently it hasn't propagated properly.  
I'll check with RHIT.

Sent from Nine

From: Sean Redmond 
Sent: Jan 20, 2017 3:07 AM
To: Dan Mick
Cc: Shinobu Kinjo; Brian Andrus; ceph-users
Subject: Re: [ceph-users] Problems with http://tracker.ceph.com/?

Hi,

Is the current strange DNS issue with docs.ceph.com related to this also? I 
noticed that docs.ceph.com is getting a different A record from ns4.redhat.com 
vs ns{1..3}.redhat.com

dig output here > http://pastebin.com/WapDY9e2

Thanks

On Thu, Jan 19, 2017 at 11:03 PM, Dan Mick  wrote:
>
> On 01/19/2017 09:57 AM, Shinobu Kinjo wrote:
>
> >> The good news is the tenant delete failed. The bad news is we're looking for
> >> the tracker volume now, which is no longer present in the Ceph project.
>
> We've reloaded a new instance of tracker.ceph.com from a backup of the
> database, and believe it's back online now.  The backup was taken at
> about 12:31 PDT, so the last 8 or so hours of changes are, sadly, gone,
> so if you had tracker updates during that time period, you may need to
> redo them.
>
> Sorry for the inconvenience.  We've relocated the tracker service to
> hopefully mitigate this vulnerability.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph counters decrementing after changing pg_num

2017-01-20 Thread Kai Storbeck
Hello ceph users,

My graphs of several counters in our Ceph cluster are showing abnormal
behaviour after changing the pg_num and pgp_num respectively.

We're using "http://eu.ceph.com/debian-hammer/ jessie/main".


Is this a bug, or will the counters stabilize at some time in the near
future? Or, is this otherwise fixable by "turning it off and on again"?


Regards,
Kai




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Testing a node by fio - strange results to me (Ahmed Khuraidah)

2017-01-20 Thread Ahmed Khuraidah
Looking forward to reading some good help here.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Question about user's key

2017-01-20 Thread Joao Eduardo Luis

On 01/20/2017 03:52 AM, Chen, Wei D wrote:

Hi,

I have read through some documents about authentication and user management 
in Ceph, and everything works fine for me: I can create
a user and play with the keys and caps of that user. But I cannot find where 
those keys or capabilities are stored; obviously, I can
export that info to a file, but where is it if I don't export it?

Looks like this information (keys and caps) of the user is stored in memory? 
But I can still list it after rebooting my
machine. Or is this info persisted in some type of DB I'm not aware of?

Can anyone help me out?


Authentication keys and caps are kept by the monitor in its store, 
either a leveldb or a rocksdb, in its data directory.


The monitor's data directory is, by default, in 
/var/lib/ceph/mon/ceph-X, with X being the monitor's id. The store is 
within that directory, named `store.db`.


The store is not in human-readable format, but you can use 
ceph-kvstore-tool to walk the keys if you want. Please note that, should 
you want to do this, the monitor must be shut down first.
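
For example, a hedged sketch (monitor id 'a' and the leveldb backend are assumptions; on a running cluster `ceph auth list` shows the same keys and caps without touching the store):

# Live cluster: list all entities with their keys and caps
ceph auth list

# Offline: walk the auth keys straight out of the stopped monitor's store
# (the exact ceph-kvstore-tool invocation differs slightly between releases)
systemctl stop ceph-mon@a
ceph-kvstore-tool leveldb /var/lib/ceph/mon/ceph-a/store.db list auth
systemctl start ceph-mon@a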


  -Joao

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] v11.2.0 kraken released

2017-01-20 Thread Abhishek L
This is the first release of the Kraken series.  It is suitable for
use in production deployments and will be maintained until the next
stable release, Luminous, is completed in the Spring of 2017.

Major Changes from Jewel


- *RADOS*:

  * The new *BlueStore* backend now has a stable disk format and is
passing our failure and stress testing. Although the backend is
still flagged as experimental, we encourage users to try it out
for non-production clusters and non-critical data sets.
  * RADOS now has experimental support for *overwrites on
erasure-coded* pools. Because the disk format and implementation
are not yet finalized, there is a special pool option that must be
enabled to test the new feature. Enabling this option on a cluster
will permanently bar that cluster from being upgraded to future
versions.
  * We now default to the AsyncMessenger (``ms type = async``) instead
of the legacy SimpleMessenger. The most noticeable difference is
that we now use a fixed sized thread pool for network connections
(instead of two threads per socket with SimpleMessenger).
  * Some OSD failures are now detected almost immediately, whereas
previously the heartbeat timeout (which defaults to 20 seconds)
had to expire.  This prevents IO from blocking for an extended
period for failures where the host remains up but the ceph-osd
process is no longer running.
  * There is a new ``ceph-mgr`` daemon. It is currently collocated with
the monitors by default, and is not yet used for much, but the basic
infrastructure is now in place.
  * The size of encoded OSDMaps has been reduced.
  * The OSDs now quiesce scrubbing when recovery or rebalancing is in progress.

- *RGW*:

  * RGW now supports a new zone type that can be used for metadata indexing
via ElasticSearch.
  * RGW now supports the S3 multipart object copy-part API.
  * It is possible now to reshard an existing bucket. Note that bucket
resharding currently requires that all IO (especially writes) to
the specific bucket is quiesced.
  * RGW now supports data compression for objects.
  * Civetweb version has been upgraded to 1.8
  * The Swift static website API is now supported (S3 support has been added
previously).
  * S3 bucket lifecycle API has been added. Note that currently it only supports
object expiration.
  * Support for custom search filters has been added to the LDAP auth
implementation.
  * Support for NFS version 3 has been added to the RGW NFS gateway.
  * A Python binding has been created for librgw.

- *RBD*:

  * RBD now supports images stored in an *erasure-coded* RADOS pool
using the new (experimental) overwrite support. Images must be
created using the new rbd CLI "--data-pool " option to
specify the EC pool where the backing data objects are
stored. Attempting to create an image directly on an EC pool will
not be successful since the image's backing metadata is only
supported on a replicated pool.
  * The rbd-mirror daemon now supports replicating dynamic image
feature updates and image metadata key/value pairs from the
primary image to the non-primary image.
  * The number of image snapshots can be optionally restricted to a
configurable maximum.
  * The rbd Python API now supports asynchronous IO operations.

- *CephFS*:

  * libcephfs function definitions have been changed to enable proper
uid/gid control.  The library version has been increased to reflect the
interface change.
  * Standby replay MDS daemons now consume less memory on workloads
doing deletions.
  * Scrub now repairs backtrace, and populates `damage ls` with
discovered errors.
  * A new `pg_files` subcommand to `cephfs-data-scan` can identify
files affected by a damaged or lost RADOS PG.
  * The false-positive "failing to respond to cache pressure" warnings have
been fixed.


Upgrading from Kraken release candidate 11.1.0
--

* The new *BlueStore* backend had an on-disk format change after 11.1.0.
  Any BlueStore OSDs created with 11.1.0 will need to be destroyed and
  recreated.

Upgrading from Jewel


* All clusters must first be upgraded to Jewel 10.2.z before upgrading
  to Kraken 11.2.z (or, eventually, Luminous 12.2.z).

* The ``sortbitwise`` flag must be set on the Jewel cluster before upgrading
  to Kraken.  The latest Jewel (10.2.4+) releases issue a health warning if
  the flag is not set, so this is probably already set.  If it is not, Kraken
  OSDs will refuse to start and will print an error message in their log.

* You may upgrade OSDs, Monitors, and MDSs in any order.  RGW daemons
  should be upgraded last.

* When upgrading, new ceph-mgr daemon instances will be created automatically
  alongside any monitors.  This will be true for Jewel to Kraken and Jewel to
  Luminous upgrades, but likely not be true for future upgrades bey

Re: [ceph-users] Question about user's key

2017-01-20 Thread Martin Palma
I don't know exactly where but I'm guessing in the database of the
monitor server which should be located at
"/var/lib/ceph/mon/".

Best,
Martin

On Fri, Jan 20, 2017 at 8:55 AM, Chen, Wei D  wrote:
> Hi Martin,
>
> Thanks for your response!
> Could you please tell me where it is on the monitor nodes? Only in memory, or 
> persisted in any files or DBs? Looks like it's not just in memory, but I 
> cannot find where those values are saved, thanks!
>
> Best Regards,
> Dave Chen
>
> From: Martin Palma [mailto:mar...@palma.bz]
> Sent: Friday, January 20, 2017 3:36 PM
> To: Ceph-User; Chen, Wei D; ceph-de...@vger.kernel.org
> Subject: Re: Question about user's key
>
> Hi,
>
> They are stored on the monitore nodes.
>
> Best,
> Martin
>
> On Fri, 20 Jan 2017 at 04:53, Chen, Wei D  wrote:
> Hi,
>
>
>
> I have read through some documents about authentication and user management 
> about ceph, everything works fine with me, I can create
>
> a user and play with the keys and caps of that user. But I cannot find where 
> those keys or capabilities stored, obviously, I can
>
> export those info to a file but where are they if I don't export them out?
>
>
>
> Looks like these information (keys and caps) of the user is stored in memory? 
> but I still can list them out after rebooting my
>
> machine. Or these info are persisted in some type of DB I didn't aware?
>
>
>
> Can anyone help me out?
>
>
>
>
>
> Best Regards,
>
> Dave Chen
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems with http://tracker.ceph.com/?

2017-01-20 Thread Sean Redmond
Hi,

Is the current strange DNS issue with docs.ceph.com related to this also? I
noticed that docs.ceph.com is getting a different A record from
ns4.redhat.com vs ns{1..3}.redhat.com

dig output here > http://pastebin.com/WapDY9e2
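
For anyone who wants to reproduce the comparison, a quick sketch:

# Ask each Red Hat nameserver directly and compare the A records returned
for ns in ns1 ns2 ns3 ns4; do
    echo -n "$ns.redhat.com: "
    dig +short docs.ceph.com @"$ns".redhat.com
done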

Thanks

On Thu, Jan 19, 2017 at 11:03 PM, Dan Mick  wrote:

> On 01/19/2017 09:57 AM, Shinobu Kinjo wrote:
>
> >> The good news is the tenant delete failed. The bad news is we're
> looking for
> >> the tracker volume now, which is no longer present in the Ceph project.
>
> We've reloaded a new instance of tracker.ceph.com from a backup of the
> database, and believe it's back online now.  The backup was taken at
> about 12:31 PDT, so the last 8 or so hours of changes are, sadly, gone,
> so if you had tracker updates during that time period, you may need to
> redo them.
>
> Sorry for the inconvenience.  We've relocated the tracker service to
> hopefully mitigate this vulnerability.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw static website docs 404

2017-01-20 Thread Wido den Hollander

> On 19 January 2017 at 20:00 Ben Hines wrote:
> 
> 
> Sure. However, as a general development process, many projects require
> documentation to go in with a feature. The person who wrote it is the best
> person to explain how to use it.
> 

Probably, but still, it's not a requirement. All I'm trying to say is that Open 
Source is free without guarantee.

> Even just adding a new setting to a list of valid settings is pretty basic,
> quick and easy.  It's odd that major new features are added and effectively
> kept secret.
> 

Maybe the dev didn't want to write docs, he/she forgot or just didn't get to it 
yet.

It would be very much appreciated if you would send a PR with the updated 
documentation :)

Wido

> -Ben
> 
> On Thu, Jan 19, 2017 at 1:56 AM, Wido den Hollander  wrote:
> 
> >
> > > On 19 January 2017 at 2:57 Ben Hines wrote:
> > >
> > >
> > > Aha! Found some docs here in the RHCS site:
> > >
> > > https://access.redhat.com/documentation/en/red-hat-ceph-
> > storage/2/paged/object-gateway-guide-for-red-hat-
> > enterprise-linux/chapter-2-configuration
> > >
> > > Really, ceph.com should have all this too...
> > >
> >
> > I agree, but keep in mind that Ceph is a free, Open Source project. It's
> > free to use and to consume. Writing documentation isn't always the
> > coolest/fanciest/nicest thing to do.
> >
> > You are more then welcome to send a Pull Request on Github to update the
> > documentation for the RGW. That would help others which might be in the
> > same situation as you are.
> >
> > Open Source is by working together and collaborating on a project :)
> >
> > This can be writing code, documentation or helping others on mailinglists.
> > That way we call benefit from the project.
> >
> > Wido
> >
> > > -Ben
> > >
> > > On Wed, Jan 18, 2017 at 5:15 PM, Ben Hines  wrote:
> > >
> > > > Are there docs on the RGW static website feature?
> > > >
> > > > I found 'rgw enable static website' config setting only via the mailing
> > > > list. A search for 'static' on ceph.com turns up release notes, but no
> > > > other documentation. Anyone have pointers on how to set this up and
> > what i
> > > > can do with it? Does it require using dns based buckets, for example?
> > I'd
> > > > like to be able to hit a website with http:
> > ,
> > > > ideally. (without the browser forcing it to download)
> > > >
> > > > thanks,
> > > >
> > > > -Ben
> > > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bluestore: v11.2.0 peering not happening when OSD is down

2017-01-20 Thread Muthusamy Muthiah
Hi ,

We are validating kraken 11.2.0 with bluestore  on 5 node cluster with EC
4+1.

When an OSD is down, peering is not happening and the ceph health status
moves to ERR state after a few minutes. This was working in previous development
releases. Is any additional configuration required in v11.2.0?

Following is our ceph configuration:

mon_osd_down_out_interval = 30
mon_osd_report_timeout = 30
mon_osd_down_out_subtree_limit = host
mon_osd_reporter_subtree_level = host

and the recovery parameters set to default.

[root@ca-cn1 ceph]# ceph osd crush show-tunables

{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 1,
"chooseleaf_stable": 1,
"straw_calc_version": 1,
"allowed_bucket_algs": 54,
"profile": "jewel",
"optimal_tunables": 1,
"legacy_tunables": 0,
"minimum_required_version": "jewel",
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"has_v2_rules": 1,
"require_feature_tunables3": 1,
"has_v3_rules": 0,
"has_v4_buckets": 0,
"require_feature_tunables5": 1,
"has_v5_rules": 0
}

ceph status:

 health HEALTH_ERR
173 pgs are stuck inactive for more than 300 seconds
173 pgs incomplete
173 pgs stuck inactive
173 pgs stuck unclean
 monmap e2: 5 mons at {ca-cn1=
10.50.5.117:6789/0,ca-cn2=10.50.5.118:6789/0,ca-cn3=10.50.5.119:6789/0,ca-cn4=10.50.5.120:6789/0,ca-cn5=10.50.5.121:6789/0
}
election epoch 106, quorum 0,1,2,3,4
ca-cn1,ca-cn2,ca-cn3,ca-cn4,ca-cn5
mgr active: ca-cn1 standbys: ca-cn2, ca-cn4, ca-cn5, ca-cn3
 osdmap e1128: 60 osds: 59 up, 59 in; 173 remapped pgs
flags sortbitwise,require_jewel_osds,require_kraken_osds
  pgmap v782747: 2048 pgs, 1 pools, 63133 GB data, 46293 kobjects
85199 GB used, 238 TB / 322 TB avail
1868 active+clean
 173 remapped+incomplete
   7 active+clean+scrubbing

MON log:

2017-01-20 09:25:54.715684 7f55bcafb700  0 log_channel(cluster) log [INF] :
osd.54 out (down for 31.703786)
2017-01-20 09:25:54.725688 7f55bf4d5700  0 mon.ca-cn1@0(leader).osd e1120
crush map has features 288250512065953792, adjusting msgr requires
2017-01-20 09:25:54.729019 7f55bf4d5700  0 log_channel(cluster) log [INF] :
osdmap e1120: 60 osds: 59 up, 59 in
2017-01-20 09:25:54.735987 7f55bf4d5700  0 log_channel(cluster) log [INF] :
pgmap v781993: 2048 pgs: 1869 active+clean, 173 incomplete, 6
active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB / 322 TB
avail; 21825 B/s rd, 163 MB/s wr, 2046 op/s
2017-01-20 09:25:55.737749 7f55bf4d5700  0 mon.ca-cn1@0(leader).osd e1121
crush map has features 288250512065953792, adjusting msgr requires
2017-01-20 09:25:55.744338 7f55bf4d5700  0 log_channel(cluster) log [INF] :
osdmap e1121: 60 osds: 59 up, 59 in
2017-01-20 09:25:55.749616 7f55bf4d5700  0 log_channel(cluster) log [INF] :
pgmap v781994: 2048 pgs: 29 remapped+incomplete, 1869 active+clean, 144
incomplete, 6 active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB
/ 322 TB avail; 44503 B/s rd, 45681 kB/s wr, 518 op/s
2017-01-20 09:25:56.768721 7f55bf4d5700  0 log_channel(cluster) log [INF] :
pgmap v781995: 2048 pgs: 47 remapped+incomplete, 1869 active+clean, 126
incomplete, 6 active+clean+scrubbing; 63159 GB data, 85201 GB used, 238 TB
/ 322 TB avail; 20275 B/s rd, 72742 kB/s wr, 665 op/s

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] property upgrade Ceph from 10.2.3 to 10.2.5 without downtime

2017-01-20 Thread Luis Periquito
I've run through many upgrades without anyone noticing, including in
very busy openstack environments.

As a rule of thumb you should upgrade MONs, OSDs, MDSs and RadosGWs in
that order, however you should always read the upgrade instructions on
the release notes page
(http://docs.ceph.com/docs/master/release-notes/).

Again as a rule of thumb minor version upgrades have little to no
notices regarding upgrade, usually the issue is regarding major
versions.
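
A rough per-node sketch of that order on a systemd-based cluster (package names, unit names and the exact commands vary by distribution; check cluster health between every step):

# Monitor nodes, one at a time
yum update -y ceph ceph-mon
systemctl restart ceph-mon.target
ceph -s                       # wait for quorum / HEALTH_OK before the next node

# OSD nodes, one at a time
yum update -y ceph ceph-osd
systemctl restart ceph-osd.target
ceph -s                       # wait for recovery to finish

# Then MDS and RGW nodes
systemctl restart ceph-mds.target
systemctl restart ceph-radosgw.target

# Confirm the daemons report the new version
ceph tell osd.* version
ceph --version                # local package on each node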

On Thu, Jan 19, 2017 at 5:50 PM, Oliver Dzombic  wrote:
> Hi,
>
> i did exactly the same, with
>
> Centos 7, 2x OSD server ~ 30 OSDs all in all, 3x mon server, 3x mds server
>
> Following sequence:
>
> - Update OSD server A & restart ceph.target service
> - Update OSD server B & restart ceph.target service
> - Update inactive mon / mds server C & restart ceph.target service
> - Update inactive mon / mds server B & restart ceph.target service
> - Update active mon / mds server A & restart ceph.target service
>
> Result: No downtime, no issues.
>
>
> --
> Mit freundlichen Gruessen / Best regards
>
> Oliver Dzombic
> IP-Interactive
>
> mailto:i...@ip-interactive.de
>
> Anschrift:
>
> IP Interactive UG ( haftungsbeschraenkt )
> Zum Sonnenberg 1-3
> 63571 Gelnhausen
>
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
>
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
>
>
> Am 19.01.2017 um 18:40 schrieb Vy Nguyen Tan:
>> Hello everyone,
>>
>> I am planning to upgrade a Ceph cluster from 10.2.3 to 10.2.5. I am
>> wondering, can I upgrade the Ceph cluster without downtime? And how do I
>> upgrade Ceph from 10.2.3 to 10.2.5 without downtime?
>>
>> Thanks for help.
>>
>> Regards,
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com