SOLVED: mkcephfs failing on v0.48 argonaut

2012-07-22 Thread Paul Pettigrew
Hi Sage

Good news, I have been able to resolve this issue. It requires the addition of 
one line of code calling 'sync' to your script: /sbin/mkcephfs

At line 336, I added it to the block below (see the comment and the 'sync' 
line that follows it):

modprobe btrfs || true
mkfs.btrfs $btrfs_devs
btrfs device scan || btrfsctl -a
#Mach: add 'sync' command below to force block changes, else btrfs mount
#command is likely to fail
sync
mount -t btrfs $btrfs_opt $first_dev $btrfs_path
chown $osd_user $btrfs_path
chmod +w $btrfs_path

This has 100% resolved the issue for us.
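
For anyone who wants to try the same ordering outside of mkcephfs, here is a minimal sketch of the sequence above as a standalone function. The function name and its arguments are hypothetical and are not taken from the mkcephfs script; only the order of commands (mkfs, device scan, sync, mount) follows what worked for us.

```shell
#!/bin/sh
# Sketch only: format a device as btrfs, flush pending writes, then mount it.
# mkfs_sync_mount and its parameters are hypothetical names, not mkcephfs code.
mkfs_sync_mount() {
    dev=$1                 # block device to format, e.g. /dev/sdc
    path=$2                # mount point, e.g. /srv/osd.0
    opts=${3:-noatime}     # mount options; noatime matches the runs in this thread

    mkfs.btrfs "$dev" || return 1
    btrfs device scan || btrfsctl -a
    # Flush the freshly written filesystem to disk before mounting; without
    # this the mount that follows failed on our system.
    sync
    mount -t btrfs -o "$opts" "$dev" "$path"
}
```

Run as root it would be called like mkfs_sync_mount /dev/sdc /srv/osd.0, which destroys whatever is on the device, so treat it as an illustration rather than a tool.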

Cheers,
Paul




-----Original Message-----
From: Sage Weil [mailto:s...@inktank.com]
Sent: Wednesday, 11 July 2012 9:35 AM
To: Paul Pettigrew
Cc: ceph-devel@vger.kernel.org
Subject: RE: mkcephfs failing on v0.48 argonaut

Hi Paul,

Were you able to make any progress on this?

On Sun, 8 Jul 2012, Paul Pettigrew wrote:
 Hi Sage

 Confirming running the commands from a root prompt in the same sequence as 
 requested:
 mkfs.btrfs /dev/sdc
  btrfs device scan
  mount /dev/sdc /srv/osd.0


 root@dsanb1-coy:~# mkfs.btrfs /dev/sdc

 WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
 WARNING! - see http://btrfs.wiki.kernel.org before using

 fs created label (null) on /dev/sdc
 nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
 Btrfs Btrfs v0.19
 root@dsanb1-coy:~# btrfs device scan
 Scanning for Btrfs filesystems
 root@dsanb1-coy:~# mount /dev/sdc /srv/osd.0
 mount: wrong fs type, bad option, bad superblock on /dev/sdc,
    missing codepage or helper program, or other error
    In some cases useful info is found in syslog - try
    dmesg | tail  or so

 root@dsanb1-coy:~# mount | grep btrfs
 root@dsanb1-coy:~# mount -t btrfs /dev/sdc /srv/osd.0
 root@dsanb1-coy:~# mount | grep btrfs
 /dev/sdc on /srv/osd.0 type btrfs (rw)


 So you can see I had to pass -t btrfs to mount explicitly.  When mkcephfs 
 runs the command mount -t btrfs -o noatime /dev/sdc /srv/osd.0, it behaves 
 as if the -t btrfs were not present on the line, even though it is (the 
 symptom is the same, at least). Crazy.

 Secondly, as requested I cannot run the command sh -x /sbin/mkcephfs
 -d /tmp/mkcephfs.xgk025tjkQ --prepare-osdfs osd.0 because the
 contents of the /tmp/ directory are deleted each time mkcephfs
 finishes its run. I have however called the overall mkcephfs command
 with sh -x so you can see what is occurring.

Can you modify the mkcephfs command so that when it re-runs itself, it passes 
in -x?

You can also remove the 'rm -r' bit that cleans up on error so that you can run 
the command manually when it fails.
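
The two changes above could be sketched roughly as follows. This is not the actual mkcephfs code: run_child and cleanup_dir are hypothetical helper names, and the real script structures its re-invocation and cleanup differently.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the two debugging changes; names are
# illustrative and do not come from mkcephfs.

# Re-run a command line in a child bash, adding -x when the parent shell is
# itself tracing (i.e. it was started with sh -x or ran set -x).
run_child() {
    case $- in
        *x*) bash -x -c "$*" ;;
        *)   bash -c "$*" ;;
    esac
}

# Remove the temporary work directory only when the run succeeded, so that a
# failed step can be re-run by hand against the same directory.
cleanup_dir() {
    status=$1
    dir=$2
    if [ "$status" -ne 0 ]; then
        echo "keeping $dir for inspection" >&2
        return 0
    fi
    rm -rf "$dir"
}
```

With something like this in place, the failing --prepare-osdfs step can be repeated manually with sh -x against the preserved directory.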


 See below:

 root@dsanb1-coy:~# sh -x /sbin/mkcephfs -c /etc/ceph/ceph.conf --allhosts
 --mkbtrfs -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v
 SNIP
 === osd.0 ===
 + return 0
 + [ -n  ]
 + rdir=/tmp/mkcephfs.kJjIwsEnfZ
 + [ 0 -eq 0 ]
 + cp /tmp/mkcephfs.kJjIwsEnfZ/conf /etc/ceph/ceph.conf
 + [ 1 -eq 1 ]
 + [ osd = osd ]
 + do_root_cmd /sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0
 + [ -z  ]
 + [ 1 -eq 1 ]
 + echo --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0
 --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0
 + ulimit -c unlimited
 + whoami
 + whoami=root
 + [ root = root ]
 + bash -c /sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0

i.e., 'bash -c -x ...' here

 umount: /srv/osd.0: not mounted
 umount: /dev/sdc: not mounted
 /sbin/mkfs.btrfs
 RUNNING: mkfs.btrfs /dev/sdc

 WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
 WARNING! - see http://btrfs.wiki.kernel.org before using

 fs created label (null) on /dev/sdc
 nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
 Btrfs Btrfs v0.19
 Scanning for Btrfs filesystems
 /bin/mount
 RUNNING: mount -t btrfs -o noatime /dev/sdc /srv/osd.0
 mount: wrong fs type, bad option, bad superblock on /dev/sdc,
    missing codepage or helper program, or other error
    In some cases useful info is found in syslog - try
    dmesg | tail  or so

 + echo failed: '/sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs 
 osd.0'
 failed: '/sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0'
 + exit 1
 + rm -rf /tmp/mkcephfs.kJjIwsEnfZ

and comment out this command.

sage


 + exit



 -----Original Message-----
 From: Sage Weil [mailto:s...@inktank.com]
 Sent: Saturday, 7 July 2012 2:20 PM
 To: Paul Pettigrew
 Cc: ceph-devel@vger.kernel.org
 Subject: RE: mkcephfs failing on v0.48 argonaut

 On Sat, 7 Jul 2012, Paul Pettigrew wrote:
  Hi again Sage
 
  This is very perplexing.  Confirming this system is a stock Ubuntu 12.04 
  x64, with no custom kernel or anything else, fully apt-get dist-upgrade'd 
  up to date.
  root@dsanb1-coy:~# uname -r
  3.2.0-26-generic
 
  I have added in the suggestions you made to the script, we now have:
 
  modprobe btrfs || true
  which mkfs.btrfs
  echo RUNNING: mkfs.btrfs $btrfs_devs

RE: mkcephfs failing on v0.48 argonaut

2012-07-07 Thread Paul Pettigrew
Hi Sage

Confirming running the commands from a root prompt in the same sequence as 
requested:
mkfs.btrfs /dev/sdc
 btrfs device scan
 mount /dev/sdc /srv/osd.0


root@dsanb1-coy:~# mkfs.btrfs /dev/sdc

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdc
nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
Btrfs Btrfs v0.19
root@dsanb1-coy:~# btrfs device scan
Scanning for Btrfs filesystems
root@dsanb1-coy:~# mount /dev/sdc /srv/osd.0
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

root@dsanb1-coy:~# mount | grep btrfs
root@dsanb1-coy:~# mount -t btrfs /dev/sdc /srv/osd.0
root@dsanb1-coy:~# mount | grep btrfs
/dev/sdc on /srv/osd.0 type btrfs (rw)


So you can see I had to pass -t btrfs to mount explicitly.  When mkcephfs 
runs the command mount -t btrfs -o noatime /dev/sdc /srv/osd.0, it behaves as 
if the -t btrfs were not present on the line, even though it is (the symptom 
is the same, at least). Crazy.
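
As far as I understand mount(8), when no -t is given mount guesses the filesystem type through the blkid library, so one way to check whether type detection is what fails is to ask blkid directly. The helper below is a hypothetical diagnostic sketch, not something from mkcephfs; it assumes the util-linux blkid tool is installed.

```shell
#!/bin/sh
# Hypothetical diagnostic helper; not part of mkcephfs.
# Prints the filesystem type that a low-level blkid probe finds on a device,
# or "unknown" if nothing is detected or blkid cannot be run.
probed_fs_type() {
    dev=$1
    # -p probes the device directly instead of consulting the blkid cache;
    # -o value -s TYPE prints only the value of the TYPE tag.
    fstype=$(blkid -p -o value -s TYPE "$dev" 2>/dev/null) || true
    echo "${fstype:-unknown}"
}
```

Calling probed_fs_type /dev/sdc straight after mkfs.btrfs, and again after a successful mount, would show whether the btrfs signature is visible at each point.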

Secondly, I cannot run the requested command sh -x /sbin/mkcephfs -d 
/tmp/mkcephfs.xgk025tjkQ --prepare-osdfs osd.0 because the contents of the 
/tmp/ directory are deleted each time mkcephfs finishes its run. I have, 
however, run the overall mkcephfs command with sh -x so you can see what is 
occurring.

See below:

root@dsanb1-coy:~# sh -x /sbin/mkcephfs -c /etc/ceph/ceph.conf --allhosts 
--mkbtrfs -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v
SNIP
=== osd.0 ===
+ return 0
+ [ -n  ]
+ rdir=/tmp/mkcephfs.kJjIwsEnfZ
+ [ 0 -eq 0 ]
+ cp /tmp/mkcephfs.kJjIwsEnfZ/conf /etc/ceph/ceph.conf
+ [ 1 -eq 1 ]
+ [ osd = osd ]
+ do_root_cmd /sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0
+ [ -z  ]
+ [ 1 -eq 1 ]
+ echo --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ 
--prepare-osdfs osd.0
--- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0
+ ulimit -c unlimited
+ whoami
+ whoami=root
+ [ root = root ]
+ bash -c /sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0
umount: /srv/osd.0: not mounted
umount: /dev/sdc: not mounted
/sbin/mkfs.btrfs
RUNNING: mkfs.btrfs /dev/sdc

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdc
nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
/bin/mount
RUNNING: mount -t btrfs -o noatime /dev/sdc /srv/osd.0
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

+ echo failed: '/sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs 
osd.0'
failed: '/sbin/mkcephfs -d /tmp/mkcephfs.kJjIwsEnfZ --prepare-osdfs osd.0'
+ exit 1
+ rm -rf /tmp/mkcephfs.kJjIwsEnfZ
+ exit



-----Original Message-----
From: Sage Weil [mailto:s...@inktank.com]
Sent: Saturday, 7 July 2012 2:20 PM
To: Paul Pettigrew
Cc: ceph-devel@vger.kernel.org
Subject: RE: mkcephfs failing on v0.48 argonaut

On Sat, 7 Jul 2012, Paul Pettigrew wrote:
 Hi again Sage

 This is very perplexing.  Confirming this system is a stock Ubuntu 12.04 x64, 
 with no custom kernel or anything else, fully apt-get dist-upgrade'd up to 
 date.
 root@dsanb1-coy:~# uname -r
 3.2.0-26-generic

 I have added in the suggestions you made to the script, we now have:

 modprobe btrfs || true
 which mkfs.btrfs
 echo RUNNING: mkfs.btrfs $btrfs_devs
 mkfs.btrfs $btrfs_devs
 btrfs device scan || btrfsctl -a
 which mount
 echo RUNNING: mount -t btrfs $btrfs_opt $first_dev $btrfs_path
 mount -t btrfs $btrfs_opt $first_dev $btrfs_path
 echo DID I GET HERE - OR CRASH OUT WITH mount ABOVE?
 chown $osd_user $btrfs_path


 See below that the same command within the mkcephfs that is failing,
 is working fine on a standard command line:

Weirdness!


 === osd.0 ===
 --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.xgk025tjkQ
 --prepare-osdfs osd.0

Can you run this command with -x to see exactly what bash is doing?

 sh -x /sbin/mkcephfs -d /tmp/mkcephfs.xgk025tjkQ --prepare-osdfs osd.0

In particular, I'm curious if you do

 mkfs.btrfs /dev/sdc
 btrfs device scan
 mount /dev/sdc /srv/osd.0

(or whatever the exact sequence that mkcephfs does is) from the command line, 
does it give you the same error?

sage


 umount: /srv/osd.0: not mounted
 umount: /dev/sdc: not mounted
 /sbin/mkfs.btrfs
 RUNNING: mkfs.btrfs /dev/sdc

 WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see
 http://btrfs.wiki.kernel.org before using

 fs created label (null) on /dev/sdc
 nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB Btrfs
 Btrfs v0.19 Scanning for Btrfs filesystems
 /bin/mount
 RUNNING

RE: mkcephfs failing on v0.48 argonaut

2012-07-06 Thread Paul Pettigrew
Hi again Sage

This is very perplexing.  Confirming this system is a stock Ubuntu 12.04 x64, 
with no custom kernel or anything else, fully apt-get dist-upgrade'd up to date.
root@dsanb1-coy:~# uname -r
3.2.0-26-generic

I have added in the suggestions you made to the script, we now have:

modprobe btrfs || true
which mkfs.btrfs
echo RUNNING: mkfs.btrfs $btrfs_devs
mkfs.btrfs $btrfs_devs
btrfs device scan || btrfsctl -a
echo RUNNING: mount -t btrfs $btrfs_opt $first_dev $btrfs_path
which mount
mount -t btrfs $btrfs_opt $first_dev $btrfs_path
echo DID I GET HERE - OR CRASH OUT WITH mount ABOVE?
chown $osd_user $btrfs_path


See below that the same command within the mkcephfs that is failing, is working 
fine on a standard command line:

=== osd.0 ===
--- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.xgk025tjkQ --prepare-osdfs osd.0
umount: /srv/osd.0: not mounted
umount: /dev/sdc: not mounted
/sbin/mkfs.btrfs
RUNNING: mkfs.btrfs /dev/sdc

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdc
nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
RUNNING: mount -t btrfs -o noatime /dev/sdc /srv/osd.0
/bin/mount
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

failed: '/sbin/mkcephfs -d /tmp/mkcephfs.xgk025tjkQ --prepare-osdfs osd.0'
root@dsanb1-coy:~# /bin/mount -t btrfs -o noatime /dev/sdc /srv/osd.0
root@dsanb1-coy:~# mount | grep btrfs
/dev/sdc on /srv/osd.0 type btrfs (rw,noatime)


Remember, this is not isolated to btrfs: as per my original post, it also 
fails when btrfs is not specified.

I can only conclude that /bin/sh or /bin/bash, and the way they interact with 
the mkcephfs script (which calls itself), has somehow become muddled. 
Something odd must be going on when the script output confirms it is calling 
the same command (/bin/mount), yet that command fails and causes the mkcephfs 
script to terminate.

Many thanks - will be a relief to sort this out, as all our Ceph project works 
are on hold til we can sort this one out.

Cheers

Paul



-----Original Message-----
From: Sage Weil [mailto:s...@inktank.com]
Sent: Friday, 6 July 2012 2:09 PM
To: Paul Pettigrew
Cc: ceph-devel@vger.kernel.org
Subject: RE: mkcephfs failing on v0.48 argonaut

On Fri, 6 Jul 2012, Paul Pettigrew wrote:
 Hi Sage - thanks so much for the quick response :-)

 Firstly, and it is a bit hard to see, but the command output below is run 
 with the -v option. To help isolate what command line in the script is 
 failing, I have added in some simple echo output, and the script now looks 
 like:


 ### prepare-osdfs ###

 if [ -n $prepareosdfs ]; then
 SNIP
 modprobe btrfs || true
 echo RUNNING: mkfs.btrfs $btrfs_devs
 mkfs.btrfs $btrfs_devs
 btrfs device scan || btrfsctl -a
 echo RUNNING: mount -t btrfs $btrfs_opt $first_dev $btrfs_path
 mount -t btrfs $btrfs_opt $first_dev $btrfs_path
 echo DID I GET HERE - OR CRASH OUT WITH mount ABOVE?
 chown $osd_user $btrfs_path
 chmod +w $btrfs_path

 exit 0
 fi

 Per the modified script the above, here is the output displayed when running 
 the script:

 root@dsanb1-coy:/srv# /sbin/mkcephfs -c /etc/ceph/ceph.conf --allhosts
 --mkbtrfs -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v temp dir
 is /tmp/mkcephfs.uelzdJ82ej preparing monmap in
 /tmp/mkcephfs.uelzdJ82ej/monmap /usr/bin/monmaptool --create --clobber
 --add alpha 10.32.0.10:6789 --add bravo 10.32.0.25:6789 --add charlie
 10.32.0.11:6789 --print /tmp/mkcephfs.uelzdJ82ej/monmap
 /usr/bin/monmaptool: monmap file /tmp/mkcephfs.uelzdJ82ej/monmap
 /usr/bin/monmaptool: generated fsid
 b254abdd-e036-4186-b6d5-e32b14e53b45
 epoch 0
 fsid b254abdd-e036-4186-b6d5-e32b14e53b45
 last_changed 2012-07-06 12:31:38.416848 created 2012-07-06
 12:31:38.416848
 0: 10.32.0.10:6789/0 mon.alpha
 1: 10.32.0.11:6789/0 mon.charlie
 2: 10.32.0.25:6789/0 mon.bravo
 /usr/bin/monmaptool: writing epoch 0 to
 /tmp/mkcephfs.uelzdJ82ej/monmap (3 monitors) /usr/bin/ceph-conf -c 
 /etc/ceph/ceph.conf -n osd.0 user
 === osd.0 ===
 --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.uelzdJ82ej
 --prepare-osdfs osd.0
 umount: /srv/osd.0: not mounted
 umount: /dev/sdc: not mounted
 RUNNING: mkfs.btrfs /dev/sdc

 WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see
 http://btrfs.wiki.kernel.org before using

 fs created label (null) on /dev/sdc
 nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB Btrfs
 Btrfs v0.19 Scanning for Btrfs filesystems
 RUNNING: mount -t btrfs -o noatime /dev/sdc /srv/osd.0
 mount: wrong fs type, bad option, bad superblock on /dev/sdc,
missing codepage or helper program, or other error
In some cases useful info is found in syslog

RE: mkcephfs failing on v0.48 argonaut

2012-07-06 Thread Paul Pettigrew
UPDATED code now within the below (paste snafu, sorry - ignore most recent 
post), my comments/findings the same however...
Paul

-----Original Message-----

Hi again Sage

This is very perplexing.  Confirming this system is a stock Ubuntu 12.04 x64, 
with no custom kernel or anything else, fully apt-get dist-upgrade'd up to date.
root@dsanb1-coy:~# uname -r
3.2.0-26-generic

I have added in the suggestions you made to the script, we now have:

modprobe btrfs || true
which mkfs.btrfs
echo RUNNING: mkfs.btrfs $btrfs_devs
mkfs.btrfs $btrfs_devs
btrfs device scan || btrfsctl -a
which mount
echo RUNNING: mount -t btrfs $btrfs_opt $first_dev $btrfs_path
mount -t btrfs $btrfs_opt $first_dev $btrfs_path
echo DID I GET HERE - OR CRASH OUT WITH mount ABOVE?
chown $osd_user $btrfs_path
chmod +w $btrfs_path

See below that the same command within the mkcephfs that is failing, is working 
fine on a standard command line:

=== osd.0 ===
--- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.ruZy4Apo23 --prepare-osdfs osd.0
umount: /dev/sdc: not mounted
/sbin/mkfs.btrfs
RUNNING: mkfs.btrfs /dev/sdc

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdc
nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
/bin/mount
RUNNING: mount -t btrfs -o noatime /dev/sdc /srv/osd.0
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

failed: '/sbin/mkcephfs -d /tmp/mkcephfs.ruZy4Apo23 --prepare-osdfs osd.0'

root@dsanb1-coy:~# /bin/mount -t btrfs -o noatime /dev/sdc /srv/osd.0
root@dsanb1-coy:~# mount | grep btrfs
/dev/sdc on /srv/osd.0 type btrfs (rw,noatime)


Remember, this is not isolated to btrfs: as per my original post, it also 
fails when btrfs is not specified.

I can only conclude that /bin/sh or /bin/bash, and the way they interact with 
the mkcephfs script (which calls itself), has somehow become muddled. 
Something odd must be going on when the script output confirms it is calling 
the same command (/bin/mount), yet that command fails and causes the mkcephfs 
script to terminate.

Many thanks - will be a relief to sort this out, as all our Ceph project works 
are on hold til we can sort this one out.

Cheers

Paul



-----Original Message-----
From: Sage Weil [mailto:s...@inktank.com]
Sent: Friday, 6 July 2012 2:09 PM
To: Paul Pettigrew
Cc: ceph-devel@vger.kernel.org
Subject: RE: mkcephfs failing on v0.48 argonaut

On Fri, 6 Jul 2012, Paul Pettigrew wrote:
 Hi Sage - thanks so much for the quick response :-)

 Firstly, and it is a bit hard to see, but the command output below is run 
 with the -v option. To help isolate what command line in the script is 
 failing, I have added in some simple echo output, and the script now looks 
 like:


 ### prepare-osdfs ###

 if [ -n $prepareosdfs ]; then
 SNIP
 modprobe btrfs || true
 echo RUNNING: mkfs.btrfs $btrfs_devs
 mkfs.btrfs $btrfs_devs
 btrfs device scan || btrfsctl -a
 echo RUNNING: mount -t btrfs $btrfs_opt $first_dev $btrfs_path
 mount -t btrfs $btrfs_opt $first_dev $btrfs_path
 echo DID I GET HERE - OR CRASH OUT WITH mount ABOVE?
 chown $osd_user $btrfs_path
 chmod +w $btrfs_path

 exit 0
 fi

 Per the modified script the above, here is the output displayed when running 
 the script:

 root@dsanb1-coy:/srv# /sbin/mkcephfs -c /etc/ceph/ceph.conf --allhosts
 --mkbtrfs -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v temp dir
 is /tmp/mkcephfs.uelzdJ82ej preparing monmap in
 /tmp/mkcephfs.uelzdJ82ej/monmap /usr/bin/monmaptool --create --clobber
 --add alpha 10.32.0.10:6789 --add bravo 10.32.0.25:6789 --add charlie
 10.32.0.11:6789 --print /tmp/mkcephfs.uelzdJ82ej/monmap
 /usr/bin/monmaptool: monmap file /tmp/mkcephfs.uelzdJ82ej/monmap
 /usr/bin/monmaptool: generated fsid
 b254abdd-e036-4186-b6d5-e32b14e53b45
 epoch 0
 fsid b254abdd-e036-4186-b6d5-e32b14e53b45
 last_changed 2012-07-06 12:31:38.416848 created 2012-07-06
 12:31:38.416848
 0: 10.32.0.10:6789/0 mon.alpha
 1: 10.32.0.11:6789/0 mon.charlie
 2: 10.32.0.25:6789/0 mon.bravo
 /usr/bin/monmaptool: writing epoch 0 to
 /tmp/mkcephfs.uelzdJ82ej/monmap (3 monitors) /usr/bin/ceph-conf -c 
 /etc/ceph/ceph.conf -n osd.0 user
 === osd.0 ===
 --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.uelzdJ82ej
 --prepare-osdfs osd.0
 umount: /srv/osd.0: not mounted
 umount: /dev/sdc: not mounted
 RUNNING: mkfs.btrfs /dev/sdc

 WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see
 http://btrfs.wiki.kernel.org before using

 fs created label (null) on /dev/sdc
 nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB Btrfs
 Btrfs v0.19 Scanning for Btrfs filesystems
 RUNNING: mount -t btrfs -o noatime /dev/sdc /srv/osd.0
 mount: wrong fs type

RE: mkcephfs failing on v0.48 argonaut

2012-07-06 Thread Sage Weil
On Sat, 7 Jul 2012, Paul Pettigrew wrote:
 Hi again Sage
 
 This is very perplexing.  Confirming this system is a stock Ubuntu 12.04 x64, 
 with no custom kernel or anything else, fully apt-get dist-upgrade'd up to 
 date.
 root@dsanb1-coy:~# uname -r
 3.2.0-26-generic
 
 I have added in the suggestions you made to the script, we now have:
 
 modprobe btrfs || true
 which mkfs.btrfs
 echo RUNNING: mkfs.btrfs $btrfs_devs
 mkfs.btrfs $btrfs_devs
 btrfs device scan || btrfsctl -a
 echo RUNNING: mount -t btrfs $btrfs_opt $first_dev $btrfs_path
 which mount
 mount -t btrfs $btrfs_opt $first_dev $btrfs_path
 echo DID I GET HERE - OR CRASH OUT WITH mount ABOVE?
 chown $osd_user $btrfs_path
 
 
 See below that the same command within the mkcephfs that is failing, is 
 working fine on a standard command line:

Weirdness!

 
 === osd.0 ===
 --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.xgk025tjkQ --prepare-osdfs 
 osd.0

Can you run this command with -x to see exactly what bash is doing?

 sh -x /sbin/mkcephfs -d /tmp/mkcephfs.xgk025tjkQ --prepare-osdfs osd.0

In particular, I'm curious if you do

 mkfs.btrfs /dev/sdc
 btrfs device scan
 mount /dev/sdc /srv/osd.0

(or whatever the exact sequence that mkcephfs does is) from the command 
line, does it give you the same error?

sage
 

 umount: /srv/osd.0: not mounted
 umount: /dev/sdc: not mounted
 /sbin/mkfs.btrfs
 RUNNING: mkfs.btrfs /dev/sdc
 
 WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
 WARNING! - see http://btrfs.wiki.kernel.org before using
 
 fs created label (null) on /dev/sdc
 nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
 Btrfs Btrfs v0.19
 Scanning for Btrfs filesystems
 RUNNING: mount -t btrfs -o noatime /dev/sdc /srv/osd.0
 /bin/mount
 mount: wrong fs type, bad option, bad superblock on /dev/sdc,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail  or so
 
 failed: '/sbin/mkcephfs -d /tmp/mkcephfs.xgk025tjkQ --prepare-osdfs osd.0'
 root@dsanb1-coy:~# /bin/mount -t btrfs -o noatime /dev/sdc /srv/osd.0
 root@dsanb1-coy:~# mount | grep btrfs
 /dev/sdc on /srv/osd.0 type btrfs (rw,noatime)
 
 
 Remember, this is not isolated to btrfs: as per my original post, it also 
 fails when btrfs is not specified.
 
 I can only conclude that /bin/sh or /bin/bash, and the way they interact 
 with the mkcephfs script (which calls itself), has somehow become muddled. 
 Something odd must be going on when the script output confirms it is calling 
 the same command (/bin/mount), yet that command fails and causes the 
 mkcephfs script to terminate.
 
 Many thanks - will be a relief to sort this out, as all our Ceph project 
 works are on hold til we can sort this one out.
 
 Cheers
 
 Paul
 
 
 
 -----Original Message-----
 From: Sage Weil [mailto:s...@inktank.com]
 Sent: Friday, 6 July 2012 2:09 PM
 To: Paul Pettigrew
 Cc: ceph-devel@vger.kernel.org
 Subject: RE: mkcephfs failing on v0.48 argonaut
 
 On Fri, 6 Jul 2012, Paul Pettigrew wrote:
  Hi Sage - thanks so much for the quick response :-)
 
  Firstly, and it is a bit hard to see, but the command output below is run 
  with the -v option. To help isolate what command line in the script is 
  failing, I have added in some simple echo output, and the script now looks 
  like:
 
 
  ### prepare-osdfs ###
 
  if [ -n $prepareosdfs ]; then
  SNIP
  modprobe btrfs || true
  echo RUNNING: mkfs.btrfs $btrfs_devs
  mkfs.btrfs $btrfs_devs
  btrfs device scan || btrfsctl -a
  echo RUNNING: mount -t btrfs $btrfs_opt $first_dev $btrfs_path
  mount -t btrfs $btrfs_opt $first_dev $btrfs_path
  echo DID I GET HERE - OR CRASH OUT WITH mount ABOVE?
  chown $osd_user $btrfs_path
  chmod +w $btrfs_path
 
  exit 0
  fi
 
  Per the modified script the above, here is the output displayed when 
  running the script:
 
  root@dsanb1-coy:/srv# /sbin/mkcephfs -c /etc/ceph/ceph.conf --allhosts
  --mkbtrfs -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v temp dir
  is /tmp/mkcephfs.uelzdJ82ej preparing monmap in
  /tmp/mkcephfs.uelzdJ82ej/monmap /usr/bin/monmaptool --create --clobber
  --add alpha 10.32.0.10:6789 --add bravo 10.32.0.25:6789 --add charlie
  10.32.0.11:6789 --print /tmp/mkcephfs.uelzdJ82ej/monmap
  /usr/bin/monmaptool: monmap file /tmp/mkcephfs.uelzdJ82ej/monmap
  /usr/bin/monmaptool: generated fsid
  b254abdd-e036-4186-b6d5-e32b14e53b45
  epoch 0
  fsid b254abdd-e036-4186-b6d5-e32b14e53b45
  last_changed 2012-07-06 12:31:38.416848 created 2012-07-06
  12:31:38.416848
  0: 10.32.0.10:6789/0 mon.alpha
  1: 10.32.0.11:6789/0 mon.charlie
  2: 10.32.0.25:6789/0 mon.bravo
  /usr/bin/monmaptool: writing epoch 0 to
  /tmp/mkcephfs.uelzdJ82ej/monmap (3 monitors) /usr/bin/ceph-conf -c 
  /etc/ceph/ceph.conf -n osd.0 user
  === osd.0 ===
  --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.uelzdJ82ej
  --prepare-osdfs osd.0
  umount: /srv/osd.0

Re: mkcephfs failing on v0.48 argonaut

2012-07-05 Thread Sage Weil
Hi Paul,

On Wed, 4 Jul 2012, Paul Pettigrew wrote:
 Firstly, well done guys on achieving this version milestone. I 
 successfully upgraded to the 0.48 format uneventfully on a live (test) 
 system.
 
 The same system was then going through rebuild testing, to confirm 
 that also worked fine.
 
 
 Unfortunately, the mkcephfs command is failing:
 
 root@dsanb1-coy:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts --mkbtrfs -k 
 /etc/ceph/keyring --crushmapsrc crushfile.txt -v
 temp dir is /tmp/mkcephfs.GaRCZ9i06a
 preparing monmap in /tmp/mkcephfs.GaRCZ9i06a/monmap
 /usr/bin/monmaptool --create --clobber --add alpha 10.32.0.10:6789 --add 
 bravo 10.32.0.25:6789 --add charlie 10.32.0.11:6789 --print 
 /tmp/mkcephfs.GaRCZ9i06a/monmap
 /usr/bin/monmaptool: monmap file /tmp/mkcephfs.GaRCZ9i06a/monmap
 /usr/bin/monmaptool: generated fsid c7202495-468c-4678-b678-115c3ee33402
 epoch 0
 fsid c7202495-468c-4678-b678-115c3ee33402
 last_changed 2012-07-04 15:02:31.732275
 created 2012-07-04 15:02:31.732275
 0: 10.32.0.10:6789/0 mon.alpha
 1: 10.32.0.11:6789/0 mon.charlie
 2: 10.32.0.25:6789/0 mon.bravo
 /usr/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.GaRCZ9i06a/monmap (3 
 monitors)
 /usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 user
 === osd.0 ===
 --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.GaRCZ9i06a --prepare-osdfs 
 osd.0
 umount: /srv/osd.0: not mounted
 umount: /dev/disk/by-wwn/wwn-0x50014ee601246234: not mounted
 
 WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
 WARNING! - see http://btrfs.wiki.kernel.org before using
 
 fs created label (null) on /dev/disk/by-wwn/wwn-0x50014ee601246234
 nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
 Btrfs Btrfs v0.19
 Scanning for Btrfs filesystems
 mount: wrong fs type, bad option, bad superblock on /dev/sdc,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail  or so
 
 failed: '/sbin/mkcephfs -d /tmp/mkcephfs.GaRCZ9i06a --prepare-osdfs osd.0'

Hmm.  Can you try running with -v?  That will tell us exactly which 
command it is running, and hopefully we can work backwards from there.

 dmesg/syslog is spitting out at the time of this failure:
 
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.751945] device fsid 
 7de0d192-b710-4629-a201-849df1d9db17 devid 1 transid 27109 /dev/sdp
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.751987] device fsid 
 08fc3479-2fa2-4388-8b61-83e2a742a13e devid 1 transid 28699 /dev/sdo
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.752023] device fsid 
 8b4a7c43-1a05-4dcb-bbed-de2a5c933996 devid 1 transid 24346 /dev/sdn
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.752068] device fsid 
 ba5fb1ca-c642-49b1-8a41-7f56f8e59fbd devid 1 transid 27274 /dev/sdm
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761453] device fsid 
 7fe8c5cf-bf8c-4276-90f2-c3f57f5275fb devid 1 transid 28724 /dev/sdi
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761518] device fsid 
 93fa3631-1202-4d42-8908-e5ef4d3e600d devid 1 transid 25201 /dev/sdh
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761579] device fsid 
 b9a1b5e4-3e5e-4381-a29a-33470f4b870f devid 1 transid 23375 /dev/sdg
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761635] device fsid 
 280ea990-23f8-4c43-9e56-140c82340fdc devid 1 transid 25559 /dev/sdf
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761693] device fsid 
 2f724cde-6de5-4262-b195-1ba3eea2256e devid 1 transid 176 /dev/sde
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761732] device fsid 
 a66f890f-8b08-4393-aab0-f222637ca5a4 devid 1 transid 7 /dev/sdd
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.761769] device fsid 
 6c181a94-697c-4e0c-af0d-05eb04d3626c devid 1 transid 7 /dev/sdc
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.775931] device fsid 
 6c181a94-697c-4e0c-af0d-05eb04d3626c devid 1 transid 7 /dev/sdc
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.779716] btrfs bad fsid on block 
 20971520
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.791594] btrfs bad fsid on block 
 20971520
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.803608] btrfs bad fsid on block 
 20971520
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.815541] btrfs bad fsid on block 
 20971520
 Jul  4 15:02:31 dsanb1-coy kernel: [ 2306.815878] btrfs bad fsid on block 
 20971520
 Jul  4 15:02:32 dsanb1-coy kernel: [ 2306.823554] btrfs bad fsid on block 
 20971520
 Jul  4 15:02:32 dsanb1-coy kernel: [ 2306.823797] btrfs bad fsid on block 
 20971520
 Jul  4 15:02:32 dsanb1-coy kernel: [ 2306.823887] btrfs: failed to read chunk 
 root on sdc
 Jul  4 15:02:32 dsanb1-coy kernel: [ 2306.825622] btrfs: open_ctree failed

Long shot, but is the kernel on that machine recent?

 Also fails if not forcing to use btrfs, eg:
 
 root@dsanb1-coy:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts -k 
 /etc/ceph/keyring --crushmapsrc crushfile.txt -v
 temp dir is /tmp/mkcephfs.ZOh6tBPAH0
 preparing monmap in /tmp/mkcephfs.ZOh6tBPAH0/monmap
 /usr/bin/monmaptool --create --clobber --add alpha 10.32.0.10:6789 --add 
 bravo 10.32.0.25:6789 --add 

RE: mkcephfs failing on v0.48 argonaut

2012-07-05 Thread Paul Pettigrew
Hi Sage - thanks so much for the quick response :-)

Firstly, and it is a bit hard to see, but the command output below is run with 
the -v option. To help isolate what command line in the script is failing, I 
have added in some simple echo output, and the script now looks like:


### prepare-osdfs ###

if [ -n $prepareosdfs ]; then
SNIP
modprobe btrfs || true
echo RUNNING: mkfs.btrfs $btrfs_devs
mkfs.btrfs $btrfs_devs
btrfs device scan || btrfsctl -a
echo RUNNING: mount -t btrfs $btrfs_opt $first_dev $btrfs_path
mount -t btrfs $btrfs_opt $first_dev $btrfs_path
echo DID I GET HERE - OR CRASH OUT WITH mount ABOVE?
chown $osd_user $btrfs_path
chmod +w $btrfs_path

exit 0
fi

With the script modified as above, here is the output displayed when running 
it:

root@dsanb1-coy:/srv# /sbin/mkcephfs -c /etc/ceph/ceph.conf --allhosts 
--mkbtrfs -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v
temp dir is /tmp/mkcephfs.uelzdJ82ej
preparing monmap in /tmp/mkcephfs.uelzdJ82ej/monmap
/usr/bin/monmaptool --create --clobber --add alpha 10.32.0.10:6789 --add bravo 
10.32.0.25:6789 --add charlie 10.32.0.11:6789 --print 
/tmp/mkcephfs.uelzdJ82ej/monmap
/usr/bin/monmaptool: monmap file /tmp/mkcephfs.uelzdJ82ej/monmap
/usr/bin/monmaptool: generated fsid b254abdd-e036-4186-b6d5-e32b14e53b45
epoch 0
fsid b254abdd-e036-4186-b6d5-e32b14e53b45
last_changed 2012-07-06 12:31:38.416848
created 2012-07-06 12:31:38.416848
0: 10.32.0.10:6789/0 mon.alpha
1: 10.32.0.11:6789/0 mon.charlie
2: 10.32.0.25:6789/0 mon.bravo
/usr/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.uelzdJ82ej/monmap (3 
monitors)
/usr/bin/ceph-conf -c /etc/ceph/ceph.conf -n osd.0 user
=== osd.0 ===
--- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.uelzdJ82ej --prepare-osdfs osd.0
umount: /srv/osd.0: not mounted
umount: /dev/sdc: not mounted
RUNNING: mkfs.btrfs /dev/sdc

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdc
nodesize 4096 leafsize 4096 sectorsize 4096 size 1.82TB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
RUNNING: mount -t btrfs -o noatime /dev/sdc /srv/osd.0
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

failed: '/sbin/mkcephfs -d /tmp/mkcephfs.uelzdJ82ej --prepare-osdfs osd.0'


Which clearly isolates the issue to the mount command line.

The trouble is, I can run this precise line on the command line directly 
without error:

root@dsanb1-coy:/srv# mount -t btrfs -o noatime /dev/sdc /srv/osd.0 
root@dsanb1-coy:/srv# mount | grep btrfs
/dev/sdc on /srv/osd.0 type btrfs (rw,noatime)


Therefore, what could possibly be preventing mkcephfs from running a simple 
mount command on the first OSD disk it gets to, when the same command works 
fine from the command line?

Many thanks Sage

Paul

PS: changing the btrfs device scan || btrfsctl -a line as proposed had no 
effect, and neither did putting a sleep 10 immediately before the mount 
line.
PPS: zero-filling /dev/sdc, re-creating a partition, mounting it manually, 
and then writing data to it all works fine. We get the same errors if we 
substitute any of the other HDDs in the server as the first disk (osd.0), 
i.e. I cannot see any issues with the hardware.





-----Original Message-----
From: ceph-devel-ow...@vger.kernel.org 
[mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Sage Weil
Sent: Friday, 6 July 2012 8:18 AM
To: Paul Pettigrew
Cc: ceph-devel@vger.kernel.org
Subject: Re: mkcephfs failing on v0.48 argonaut

Hi Paul,

On Wed, 4 Jul 2012, Paul Pettigrew wrote:
 Firstly, well done guys on achieving this version milestone. I 
 successfully upgraded to the 0.48 format uneventfully on a live (test) 
 system.
 
 The same system was then going through rebuild testing, to confirm 
 that also worked fine.
 
 
 Unfortunately, the mkcephfs command is failing:
 
 root@dsanb1-coy:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts 
 --mkbtrfs -k /etc/ceph/keyring --crushmapsrc crushfile.txt -v temp dir 
 is /tmp/mkcephfs.GaRCZ9i06a preparing monmap in 
 /tmp/mkcephfs.GaRCZ9i06a/monmap /usr/bin/monmaptool --create --clobber 
 --add alpha 10.32.0.10:6789 --add bravo 10.32.0.25:6789 --add charlie 
 10.32.0.11:6789 --print /tmp/mkcephfs.GaRCZ9i06a/monmap
 /usr/bin/monmaptool: monmap file /tmp/mkcephfs.GaRCZ9i06a/monmap
 /usr/bin/monmaptool: generated fsid 
 c7202495-468c-4678-b678-115c3ee33402
 epoch 0
 fsid c7202495-468c-4678-b678-115c3ee33402
 last_changed 2012-07-04 15:02:31.732275 created 2012-07-04 
 15:02:31.732275
 0: 10.32.0.10:6789/0 mon.alpha
 1: 10.32.0.11:6789/0 mon.charlie
 2: 10.32.0.25:6789/0 mon.bravo
 /usr/bin/monmaptool: writing epoch 0 to 
 /tmp/mkcephfs.GaRCZ9i06a/monmap (3 monitors) /usr/bin/ceph-conf -c 
 /etc/ceph/ceph.conf -n osd.0 user
 === osd.0 ===
 --- dsanb1-coy# /sbin/mkcephfs -d /tmp/mkcephfs.GaRCZ9i06a 
 --prepare