Re: [ceph-users] Force an OSD to try to peer

2015-04-14 Thread Martin Millnert
On Tue, Mar 31, 2015 at 10:44:51PM +0300, koukou73gr wrote:
 On 03/31/2015 09:23 PM, Sage Weil wrote:
 
 It's nothing specific to peering (or ceph).  The symptom we've seen is
 just that bytes stop passing across a TCP connection, usually when there are
 some largish messages being sent.  The ping/heartbeat messages get through
 because they are small and we disable Nagle, so they never end up in large
 frames.
 
 Is there any special route one should take in order to transition a
 live cluster to use jumbo frames and avoid such pitfalls with OSD
 peering?

1. Configure the entire switch infrastructure for jumbo frames.
2. Enable config versioning of the switch infrastructure configuration.
3. Bonus points: monitor config changes of the switch infrastructure.
4. Run a ping test with large frames, e.g. using fping, from each node to
every other node (see the sketch after this list).
5. Bonus points: set up such a test in your monitoring infrastructure.
6. Once you trust the config (and monitoring), raise the MTU on all nodes
to jumbo size simultaneously.  This is the critical step, and perhaps it
could be further perfected; ideally you would want an atomic MTU-upgrade
command for the entire cluster.
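
As a rough illustration of step 4, something like the sketch below
(assumptions: Linux iputils ping, passwordless ssh between nodes, a
9000-byte MTU, and placeholder hostnames; not a drop-in tool) sweeps
every node pair with full-size, don't-fragment pings:

  #!/usr/bin/env python3
  # Rough sketch of step 4 above: check that full-size, unfragmented ICMP
  # packets pass between every pair of nodes.  Assumes Linux iputils ping
  # ("-M do" forbids fragmentation, "-s" sets the payload size),
  # passwordless ssh between nodes, and a 9000-byte MTU.
  # Payload = 9000 - 20 (IP header) - 8 (ICMP header) = 8972 bytes.
  import itertools
  import subprocess

  NODES = ["ceph1", "ceph2", "ceph3"]   # placeholders: your mon/OSD hosts
  PAYLOAD = 8972

  def jumbo_ping(src, dst):
      # Run the probe on 'src' over ssh; drop the ssh wrapper to test only
      # from the local machine.
      cmd = ["ssh", src, "ping", "-M", "do", "-c", "1", "-W", "2",
             "-s", str(PAYLOAD), dst]
      return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL).returncode == 0

  failed = [(a, b) for a, b in itertools.permutations(NODES, 2)
            if not jumbo_ping(a, b)]
  for a, b in failed:
      print("jumbo frames do NOT pass from %s to %s" % (a, b))
  if not failed:
      print("all node pairs pass %d-byte unfragmented pings" % PAYLOAD)

Any pair that fails here hits exactly the kind of path that silently
eats large OSD messages while letting small heartbeats through.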

/M




Re: [ceph-users] Force an OSD to try to peer

2015-04-14 Thread Scott Laird
Things *mostly* work if hosts on the same network have different MTUs, at
least with TCP, because the hosts will negotiate the MSS for each
connection.  UDP will still break, but large UDP packets are less common.
You don't want to run that way for very long, but there's no need for an
atomic MTU swap.
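
For what it's worth, you can see the negotiated MSS directly; a minimal
sketch (Python 3 on Linux, TCP_MAXSEG is Linux-specific, host and port
are placeholders):

  #!/usr/bin/env python3
  # Sketch: read the MSS the kernel settled on for one TCP connection.
  # If one end is still at a 1500-byte MTU, the MSS typically lands
  # around 1460 even if the other end is configured for jumbo frames.
  import socket

  HOST, PORT = "ceph1", 6789   # placeholders, e.g. a monitor address

  with socket.create_connection((HOST, PORT), timeout=5) as s:
      mss = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
      print("negotiated MSS to %s:%d is %d bytes" % (HOST, PORT, mss))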

What *really* screws things up is when the host MTU is bigger than the
switch MTU.

On Tue, Apr 14, 2015 at 1:42 AM Martin Millnert mar...@millnert.se wrote:

 [...]



Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
Turns out jumbo frames were not set on all the switch ports. Once that
was resolved the cluster quickly became healthy.

On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc rob...@leblancnet.us wrote:
 I've been working at this peering problem all day. I've done a lot of
 testing at the network layer and I just don't believe that we have a problem
 that would prevent OSDs from peering. When looking through osd_debug 20/20
 logs, it just doesn't look like the OSDs are trying to peer. I don't know if
 it is because there are so many outstanding creations or what. OSDs will
 peer with OSDs on other hosts, but for some reason they only choose a certain
 number and not the ones they need to finish the peering process.

 I've checked: firewall, open files, number of threads allowed. These have
 usually given me an error in the logs that helped me fix the problem.

 I can't find a configuration item that specifies how many peers an OSD
 should contact, or anything that would be artificially limiting the peering
 connections. I've restarted the OSDs a number of times, as well as rebooting
 the hosts. I believe that if the OSDs finish peering everything will clear up.
 I can't find anything in pg query that would help me figure out what is
 blocking it (peering blocked by is empty). The PGs are scattered across all
 the hosts, so we can't pin it down to a specific host.

 Any ideas on what to try would be appreciated.

 [ulhglive-root@ceph9 ~]# ceph --version
 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
 [ulhglive-root@ceph9 ~]# ceph status
 cluster 48de182b-5488-42bb-a6d2-62e8e47b435c
   health HEALTH_WARN 1 pgs down; 1321 pgs peering; 1321 pgs stuck
 inactive; 1321 pgs stuck unclean; too few pgs per osd (17 < min 20)
  monmap e2: 3 mons at
 {mon1=10.217.72.27:6789/0,mon2=10.217.72.28:6789/0,mon3=10.217.72.29:6789/0},
 election epoch 30, quorum 0,1,2 mon1,mon2,mon3
  osdmap e704: 120 osds: 120 up, 120 in
   pgmap v1895: 2048 pgs, 1 pools, 0 bytes data, 0 objects
 11447 MB used, 436 TB / 436 TB avail
  727 active+clean
  990 peering
   37 creating+peering
1 down+peering
  290 remapped+peering
3 creating+remapped+peering

 { "state": "peering",
   "epoch": 707,
   "up": [
         40,
         92,
         48,
         91],
   "acting": [
         40,
         92,
         48,
         91],
   "info": { "pgid": "7.171",
       "last_update": "0'0",
       "last_complete": "0'0",
       "log_tail": "0'0",
       "last_user_version": 0,
       "last_backfill": "MAX",
       "purged_snaps": [],
       "history": { "epoch_created": 293,
           "last_epoch_started": 343,
           "last_epoch_clean": 343,
           "last_epoch_split": 0,
           "same_up_since": 688,
           "same_interval_since": 688,
           "same_primary_since": 608,
           "last_scrub": "0'0",
           "last_scrub_stamp": "2015-03-30 11:11:18.872851",
           "last_deep_scrub": "0'0",
           "last_deep_scrub_stamp": "2015-03-30 11:11:18.872851",
           "last_clean_scrub_stamp": "0.00"},
       "stats": { "version": "0'0",
           "reported_seq": "326",
           "reported_epoch": "707",
           "state": "peering",
           "last_fresh": "2015-03-30 20:10:39.509855",
           "last_change": "2015-03-30 19:44:17.361601",
           "last_active": "2015-03-30 11:37:56.956417",
           "last_clean": "2015-03-30 11:37:56.956417",
           "last_became_active": "0.00",
           "last_unstale": "2015-03-30 20:10:39.509855",
           "mapping_epoch": 683,
           "log_start": "0'0",
           "ondisk_log_start": "0'0",
           "created": 293,
           "last_epoch_clean": 343,
           "parent": "0.0",
           "parent_split_bits": 0,
           "last_scrub": "0'0",
           "last_scrub_stamp": "2015-03-30 11:11:18.872851",
           "last_deep_scrub": "0'0",
           "last_deep_scrub_stamp": "2015-03-30 11:11:18.872851",
           "last_clean_scrub_stamp": "0.00",
           "log_size": 0,
           "ondisk_log_size": 0,
           "stats_invalid": 0,
           "stat_sum": { "num_bytes": 0,
               "num_objects": 0,
               "num_object_clones": 0,
               "num_object_copies": 0,
               "num_objects_missing_on_primary": 0,
               "num_objects_degraded": 0,
               "num_objects_unfound": 0,
               "num_objects_dirty": 0,
               "num_whiteouts": 0,
               "num_read": 0,
               "num_read_kb": 0,
               "num_write": 0,
               "num_write_kb": 0,
               "num_scrub_errors": 0,
               "num_shallow_scrub_errors": 0,
               "num_deep_scrub_errors": 0,
               "num_objects_recovered": 0,
               "num_bytes_recovered": 0,
               "num_keys_recovered": 0,
               "num_objects_omap": 0,
               "num_objects_hit_set_archive": 0},
           "stat_cat_sum": {},
           "up": [
                 40,
                 92,
                 48,
                 91],
           "acting": [
                 40,

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread koukou73gr

On 03/31/2015 09:23 PM, Sage Weil wrote:


It's nothing specific to peering (or ceph).  The symptom we've seen is
just that bytes stop passing across a TCP connection, usually when there are
some largish messages being sent.  The ping/heartbeat messages get through
because they are small and we disable Nagle, so they never end up in large
frames.


Is there any special route one should take in order to transition a live 
cluster to use jumbo frames and avoid such pitfalls with OSD peering?


-K.



Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
At the L2 level, if the hosts and switches don't accept jumbo frames,
they just drop them because they are too big. They are not fragmented,
because they don't go through a router. My problem is that OSDs were
able to peer with other OSDs on the host, but my guess is that they
never sent/received packets larger than 1500 bytes. Then other OSD
processes tried to peer but sent packets larger than 1500 bytes,
causing the packets to be dropped and peering to stall.
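
A quick way to reproduce that symptom outside of Ceph (a sketch only,
not anything Ceph ships; the port and payload sizes are arbitrary
choices, and socket.create_server needs Python 3.8+): run this as a
server on one node and as a client on another, and see whether only the
small payload makes it back:

  #!/usr/bin/env python3
  # Run "probe.py server" on one node and "probe.py client <host>" on
  # another.  If the small payload echoes back but the large one times
  # out, suspect an MTU / jumbo-frame mismatch somewhere in the path.
  import socket
  import sys

  PORT = 5999
  SIZES = (64, 8192)          # well below and well above a 1500-byte MTU

  def server():
      with socket.create_server(("", PORT)) as srv:   # Python 3.8+
          while True:
              conn, _ = srv.accept()
              with conn:
                  while True:
                      data = conn.recv(65536)
                      if not data:
                          break
                      conn.sendall(data)          # echo everything back

  def client(host):
      for size in SIZES:
          try:
              with socket.create_connection((host, PORT), timeout=5) as s:
                  s.settimeout(5)
                  s.sendall(b"x" * size)
                  got = 0
                  while got < size:
                      chunk = s.recv(65536)
                      if not chunk:
                          break
                      got += len(chunk)
                  print("%5d bytes: %s"
                        % (size, "ok" if got == size else "short read"))
          except OSError:
              print("%5d bytes: timed out / failed (possible MTU black hole)"
                    % size)

  if __name__ == "__main__":
      if sys.argv[1] == "server":
          server()
      else:
          client(sys.argv[2])

If the 64-byte payload echoes back but the 8192-byte one times out, the
path is dropping large frames, which matches what the OSD messenger ran
into while peering.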

On Tue, Mar 31, 2015 at 12:10 PM, Somnath Roy somnath@sandisk.com wrote:
 But, do we know why jumbo frames may have an impact on peering?
 In our setup so far, we haven't enabled jumbo frames other than for
 performance reasons (if at all).

 Thanks & Regards
 Somnath

 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
 Robert LeBlanc
 Sent: Tuesday, March 31, 2015 11:08 AM
 To: Sage Weil
 Cc: ceph-devel; Ceph-User
 Subject: Re: [ceph-users] Force an OSD to try to peer

 I was desperate for anything after exhausting every other possibility I could 
 think of. Maybe I should put a checklist in the Ceph docs of things to look 
 for.

 Thanks,

 On Tue, Mar 31, 2015 at 11:36 AM, Sage Weil s...@newdream.net wrote:
 On Tue, 31 Mar 2015, Robert LeBlanc wrote:
 Turns out jumbo frames were not set on all the switch ports. Once that
 was resolved the cluster quickly became healthy.

 I always hesitate to point the finger at the jumbo frames
 configuration but almost every time that is the culprit!

 Thanks for the update.  :)
 sage




 On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc rob...@leblancnet.us 
 wrote:
   [...]

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Sage Weil
On Tue, 31 Mar 2015, Robert LeBlanc wrote:
 Turns out jumbo frames were not set on all the switch ports. Once that
 was resolved the cluster quickly became healthy.

I always hesitate to point the finger at the jumbo frames configuration 
but almost every time that is the culprit!

Thanks for the update.  :)
sage



 
 On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc rob...@leblancnet.us wrote:
  [...]

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Somnath Roy
But, do we know why jumbo frames may have an impact on peering?
In our setup so far, we haven't enabled jumbo frames other than for
performance reasons (if at all).

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Robert 
LeBlanc
Sent: Tuesday, March 31, 2015 11:08 AM
To: Sage Weil
Cc: ceph-devel; Ceph-User
Subject: Re: [ceph-users] Force an OSD to try to peer

I was desperate for anything after exhausting every other possibility I could 
think of. Maybe I should put a checklist in the Ceph docs of things to look for.

Thanks,

On Tue, Mar 31, 2015 at 11:36 AM, Sage Weil s...@newdream.net wrote:
 On Tue, 31 Mar 2015, Robert LeBlanc wrote:
 Turns out jumbo frames were not set on all the switch ports. Once that
 was resolved the cluster quickly became healthy.

 I always hesitate to point the finger at the jumbo frames
 configuration but almost every time that is the culprit!

 Thanks for the update.  :)
 sage




 On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc rob...@leblancnet.us wrote:
  [...]

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
I was desperate for anything after exhausting every other possibility
I could think of. Maybe I should put a checklist in the Ceph docs of
things to look for.

Thanks,

On Tue, Mar 31, 2015 at 11:36 AM, Sage Weil s...@newdream.net wrote:
 On Tue, 31 Mar 2015, Robert LeBlanc wrote:
 Turns out jumbo frames were not set on all the switch ports. Once that
 was resolved the cluster quickly became healthy.

 I always hesitate to point the finger at the jumbo frames configuration
 but almost every time that is the culprit!

 Thanks for the update.  :)
 sage




 On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc rob...@leblancnet.us wrote:
  [...]

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Sage Weil
On Tue, 31 Mar 2015, Somnath Roy wrote:
 But, do we know why jumbo frames may have an impact on peering?
 In our setup so far, we haven't enabled jumbo frames other than for
 performance reasons (if at all).

It's nothing specific to peering (or ceph).  The symptom we've seen is
just that bytes stop passing across a TCP connection, usually when there are
some largish messages being sent.  The ping/heartbeat messages get through
because they are small and we disable Nagle, so they never end up in large
frames.

It's a pain to diagnose.
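
For reference, the heartbeat behaviour comes down to lots of tiny writes
on a socket with Nagle turned off; a minimal illustration (Python 3,
placeholder host and port, not the actual messenger code):

  #!/usr/bin/env python3
  # Illustration only: with Nagle disabled via TCP_NODELAY, each small
  # write goes out as its own small segment, which is why heartbeat-sized
  # messages keep flowing even when larger messages are black-holed by an
  # MTU mismatch.
  import socket
  import time

  with socket.create_connection(("ceph1", 6789), timeout=5) as s:
      s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
      for i in range(5):
          s.sendall(b"ping %d" % i)     # tiny writes -> tiny frames
          time.sleep(0.1)

Each small write leaves as its own small segment and fits a 1500-byte
path, while the jumbo-sized frames carrying larger messages are silently
dropped.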

sage


 
 Thanks & Regards
 Somnath
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
 Robert LeBlanc
 Sent: Tuesday, March 31, 2015 11:08 AM
 To: Sage Weil
 Cc: ceph-devel; Ceph-User
 Subject: Re: [ceph-users] Force an OSD to try to peer
 
 I was desperate for anything after exhausting every other possibility I could 
 think of. Maybe I should put a checklist in the Ceph docs of things to look 
 for.
 
 Thanks,
 
 On Tue, Mar 31, 2015 at 11:36 AM, Sage Weil s...@newdream.net wrote:
  On Tue, 31 Mar 2015, Robert LeBlanc wrote:
  Turns out jumbo frames were not set on all the switch ports. Once that
  was resolved the cluster quickly became healthy.
 
  I always hesitate to point the finger at the jumbo frames
  configuration but almost every time that is the culprit!
 
  Thanks for the update.  :)
  sage
 
 
 
 
  On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc rob...@leblancnet.us 
  wrote:
    [...]