Do those logs have a higher debugging level than the default? If not,
never mind, as they will not have enough information. If they do, however,
we'd be interested in the portion around the moment you set the tunables.
Say, before the upgrade and a bit after you set the tunable. If you want to
be finer-grained, then ideally it would be the moment when those maps were
created, but you'd have to grep the logs for that.
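For instance, something along these lines, assuming the default log location
under /var/log/ceph and that the divergent epoch is 13258 (the exact patterns
may need tweaking):
# look for the tunables being changed in the mon logs
grep -n -i "tunable" /var/log/ceph/ceph-mon.*.log
# or look for activity around the divergent osdmap epoch
grep -n "e13258" /var/log/ceph/ceph-mon.*.log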
Or drop the logs somewhere and I'll take a look.
-Joao
On Jul 3, 2014 5:48 PM, Pierre BLONDEAU pierre.blond...@unicaen.fr
wrote:
On 03/07/2014 13:49, Joao Eduardo Luis wrote:
On 07/03/2014 12:15 AM, Pierre BLONDEAU wrote:
On 03/07/2014 00:55, Samuel Just wrote:
Ah,
~/logs » for i in 20 23; do ../ceph/src/osdmaptool --export-crush /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i -o /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush20
../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0_4E62BB79__none'
../ceph/src/osdmaptool: exported crush map to /tmp/crush23
6d5
< tunable chooseleaf_vary_r 1
Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
The only thing that comes to mind that could cause this is if we changed
the leader's in-memory map, proposed it, it failed, and somehow only the
leader got to write the map to disk. This happened once on a totally
different issue (although I can't pinpoint which one right now).
In such a scenario, the leader would serve the incorrect osdmap to
whoever asked it for osdmaps, while the rest of the quorum would serve the
correct osdmaps to everyone else. That could cause this divergence. Or
it could be something else.
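As a quick sanity check (it won't show what happened back then, but it tells
us which mon is currently the leader and whose store to look at first), you
can run something like:
# show the current quorum and which monitor is the leader
ceph quorum_status --format json-pretty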
Are there logs for the monitors for the timeframe this may have happened
in?
Exactly which timeframe do you want? I have 7 days of logs; I should have
information about the upgrade from firefly to 0.82.
Which mon's logs do you want? All three?
Regards
-Joao
Pierre: do you recall how and when that got set?
I am not sure I understand, but if I remember correctly, after the update to
firefly I was in the state: HEALTH_WARN crush map has legacy tunables, and
I saw "feature set mismatch" in the logs.
So, if I remember correctly, I ran: ceph osd crush tunables optimal for the
crush map problem, and I updated my client and server kernels to
3.16-rc.
Could that be it?
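For reference, the tunables currently set in the crush map can be checked
with something like this (assuming a firefly-or-later ceph CLI):
# dump the crush tunables the cluster currently has set
ceph osd crush show-tunables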
Pierre
-Sam
On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just sam.j...@inktank.com
wrote:
Yeah, divergent osdmaps:
555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0_4E62BB79__none
6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0_4E62BB79__none
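(For context, a sketch of how these checksums can be reproduced from the
attached files; the filenames assume they were saved as-is in the current
directory:)
# checksum the osdmap epoch 13258 object from each osd
md5sum osd-20_osdmap.13258__0_4E62BB79__none osd-23_osdmap.13258__0_4E62BB79__none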
Joao: thoughts?
-Sam
On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
pierre.blond...@unicaen.fr wrote:
The files
When I upgraded:
ceph-deploy install --stable firefly servers...
on each server: service ceph restart mon
on each server: service ceph restart osd
on each server: service ceph restart mds
I upgraded from emperor to firefly. After repair, remap, replace,
etc., I had some PGs that got stuck in the peering state.
I thought, why not try version 0.82, it might solve my problem
(that was my mistake). So I upgraded from firefly to 0.83 with:
ceph-deploy install --testing servers...
Now, all programs are in version 0.82.
I have 3 mons, 36 OSD and 3 mds.
Pierre
PS: I also found inc\uosdmap.13258__0_469271DE__none in each meta
directory.
On 03/07/2014 00:10, Samuel Just wrote:
Also, what version did you upgrade from, and how did you upgrade?
-Sam
On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just sam.j...@inktank.com
wrote:
Ok, in current/meta on osd 20 and osd 23, please attach all files matching
^osdmap.13258.*
There should be one such file on each osd. (It should look something like
osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory; you'll
want to use find.)
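For example, something along these lines (assuming the default filestore
layout under /var/lib/ceph; adjust the paths to your setup):
# locate the osdmap.13258 object in each osd's meta directory
find /var/lib/ceph/osd/ceph-20/current/meta -name 'osdmap.13258*'
find /var/lib/ceph/osd/ceph-23/current/meta -name 'osdmap.13258*'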
What version of ceph is running on your mons? How many mons do
you have?
-Sam
On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
pierre.blond...@unicaen.fr wrote:
Hi,
I did it; the log files are available here:
https://blondeau.users.greyc.fr/cephlog/debug20/
The OSD log files are really big, +/- 80 MB.
After starting osd.20, some other OSDs crashed. The number of OSDs up went
from 31 down to 16.
I noticed that after this the number of down+peering PGs decreased from 367
to 248. Is that normal? Maybe it's temporary, the time it takes the cluster
to verify all the PGs?
Regards
Pierre
On 02/07/2014 19:16, Samuel Just wrote:
You should add
debug osd = 20
debug filestore = 20
debug ms = 1
to the [osd] section of the ceph.conf and restart the osds. I'd
like
all three logs if possible.
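That is, something like this in ceph.conf on each OSD host (a sketch; keep
whatever is already in your [osd] section):
[osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1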
Thanks
-Sam
On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
pierre.blond...@unicaen.fr wrote:
Yes, but how do I do that?
With a command like this?
ceph tell osd.20 injectargs '--debug-osd 20