[ceph-users] few port per ceph-osd

2013-09-09 Thread Timofey Koolin
I use ceph 0.67.2.
When I start
ceph-osd -i 0
or
ceph-osd -i 1
it starts one process, but that process opens several TCP ports. Is that normal?

netstat -nlp | grep ceph
tcp0  0 10.11.0.73:6789 0.0.0.0:*   LISTEN
 1577/ceph-mon - mon
tcp0  0 10.11.0.73:6800 0.0.0.0:*   LISTEN
 3649/ceph-osd - osd.0
tcp0  0 10.11.0.73:6801 0.0.0.0:*   LISTEN
 3649/ceph-osd - osd.0
tcp0  0 10.11.0.73:6802 0.0.0.0:*   LISTEN
 3649/ceph-osd - osd.0
tcp0  0 10.11.0.73:6803 0.0.0.0:*   LISTEN
 3649/ceph-osd - osd.0
tcp0  0 10.11.0.73:6804 0.0.0.0:*   LISTEN
 3764/ceph-osd - osd.1
tcp0  0 10.11.0.73:6805 0.0.0.0:*   LISTEN
 3764/ceph-osd - osd.1
tcp0  0 10.11.0.73:6808 0.0.0.0:*   LISTEN
 3764/ceph-osd - osd.1
tcp0  0 10.11.0.73:6809 0.0.0.0:*   LISTEN
 3764/ceph-osd - osd.1

-- 
Blog: www.rekby.ru
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CephFS metadata corruption on MDS restart

2013-09-09 Thread Tobias Prousa
Hi Ceph,

 

I recently realized that whenever I'm forced to restart the MDS (i.e. after a stall or a crash due to excessive memory consumption; btw. my MDS host has 32GB of RAM), especially while there are still clients with CephFS mounted, open files tend to have their metadata corrupted. Those files, when corrupted, will consistently report a file size of exactly 4MiB, no matter what the real file size was. The rest of the metadata like name, date, ... seems to be ok. I'm not 100% sure this is directly related to the MDS restart, but it certainly gives me that impression. Also, the files that get corrupted are those that were most likely open or had been written to most recently. I cannot see anything suspicious in the logfiles, either.

 

Some details on my setup:

 

As servers there are 3 nodes running debian wheezy with ceph dumpling (0.67.2-35-g17a7342 from gitbuilder, as 0.67.2 didn't get the MDS out of rejoin any more). Each node runs a MON and three OSDs; in addition, a single one of those nodes runs one instance of the MDS.

 

Then there are 8 clients, running debian wheezy as well, with linux-3.9 from debian backports, mounting the cephfs subdir 'home' as /home using the kernel client (I know 3.9 is rather old for that, but I found no way to mount a subdir of cephfs from fstab using ceph-fuse).

My clients' fstab entry looks something like this:

172.16.17.3:6789:/home/  /home   ceph    name=admin,secretfile=/some/secret/file     0   0

 

At first glance I couldn't find anything similar on the tracker; is anyone else experiencing similar issues?

 

Btw. restarting the MDS gives me a headache every time, as it tends to refuse to come back properly (reporting being active but going into some endless cache-cleanup loop and not answering fs client requests), and the only thing that gets it up again is increasing mds cache size. I ended up with a cache size of about 2M, which used so much memory that restarts were necessary about twice a day. So I shut down the MDS over the weekend, and after about 40 hours I was able to start it up again with a cache size of about 250k. Maybe that information is of some help to you.
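For reference, this is roughly what my ceph.conf looks like now (just a sketch; the right value obviously depends on available RAM):

[mds]
    mds cache size = 250000

It can supposedly also be changed at runtime with something like ceph tell mds.0 injectargs '--mds_cache_size 250000', but I mostly changed the config and restarted.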

 

Best regards,

Tobi
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph 0.67 ceph osd crush move changed

2013-09-09 Thread Vladislav Gorbunov
Has the ceph osd crush move syntax changed in 0.67?
I have crushmap
# id weight type name up/down reweight
-1 10.11 root default
-4 3.82  datacenter dc1
-2 3.82   host cstore3
0 0.55osd.0 up 1
1 0.55osd.1 up 1
2 0.55osd.2 up 1
-5 3.82  datacenter dc2
-3 3.82   host cstore2
10 0.55osd.10 up 1
11 0.55osd.11 up 1
12 0.55osd.12 up 1
-7 1.92  host cstore1
20 0.55   osd.20 up 1
21 0.55   osd.21 up 1
22 0.55   osd.22 up 1
23 0.27   osd.23 up 1

and try to move the newly added host cstore1 to a datacenter with the command:
ceph osd crush move cstore1 root=default datacenter=dc1
Invalid command:  osd id cstore1 not integer
Error EINVAL: invalid command

It worked on ceph 0.61. How can I move a new host to a datacenter in 0.67?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph 0.67 ceph osd crush move changed

2013-09-09 Thread Vladislav Gorbunov
I found a solution with these commands:

ceph osd crush unlink cstore1
ceph osd crush link cstore1 root=default datacenter=dc1
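Afterwards the new placement can be verified with:

ceph osd tree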

2013/9/9 Vladislav Gorbunov :
> Is ceph osd crush move syntax changed on 0.67?
> I have crushmap
> # id weight type name up/down reweight
> -1 10.11 root default
> -4 3.82  datacenter dc1
> -2 3.82   host cstore3
> 0 0.55osd.0 up 1
> 1 0.55osd.1 up 1
> 2 0.55osd.2 up 1
> -5 3.82  datacenter dc2
> -3 3.82   host cstore2
> 10 0.55osd.10 up 1
> 11 0.55osd.11 up 1
> 12 0.55osd.12 up 1
> -7 1.92  host cstore1
> 20 0.55   osd.20 up 1
> 21 0.55   osd.21 up 1
> 22 0.55   osd.22 up 1
> 23 0.27   osd.23 up 1
>
> and try to move new added host cstore1 to datacenter with command:
> ceph osd crush move cstore1 root=default datacenter=dc1
> Invalid command:  osd id cstore1 not integer
> Error EINVAL: invalid command
>
> It's worked on ceph 0.61. How can I move new host to datacenter? in 0.67?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] trouble with ceph-deploy

2013-09-09 Thread Pavel Timoschenkov
for the experiment:

- blank disk sdaf for data

blkid -p /dev/sdaf
/dev/sdaf: PTTYPE="gpt"

- and sda4 partition for journal

blkid -p /dev/sda4
/dev/sda4: PTTYPE="gpt" PART_ENTRY_SCHEME="gpt" PART_ENTRY_NAME="Linux 
filesystem" PART_ENTRY_UUID="cdc46436-b6ed-40bb-adb4-63cf1c41cbe3" 
PART_ENTRY_TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4" PART_ENTRY_NUMBER="4" 
PART_ENTRY_OFFSET="62916608" PART_ENTRY_SIZE="20971520" PART_ENTRY_DISK="8:0"

- zapped disk 

ceph-deploy disk zap ceph001:sdaf ceph001:sda4
[ceph_deploy.osd][DEBUG ] zapping /dev/sdaf on ceph001
[ceph_deploy.osd][DEBUG ] zapping /dev/sda4 on ceph001

- after this:

ceph-deploy osd create ceph001:sdae:/dev/sda4
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks 
ceph001:/dev/sdaf:/dev/sda4
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph001
[ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use.
[ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaf journal 
/dev/sda4 activate True


- after this:

blkid -p /dev/sdaf1
/dev/sdaf1: ambivalent result (probably more filesystems on the device, use 
wipefs(8) to see more details)

wipefs /dev/sdaf1
offset   type

0x3  zfs_member   [raid]

0x0  xfs   [filesystem]
 UUID:  aba50262-0427-4f8b-8eb9-513814af6b81

- and the OSD is not created

but if I'm using a single disk for data and journal:

ceph-deploy disk zap ceph001:sdaf
[ceph_deploy.osd][DEBUG ] zapping /dev/sdaf on ceph001

ceph-deploy osd create ceph001:sdaf
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph001:/dev/sdaf:
[ceph_deploy.osd][DEBUG ] Deploying osd to ceph001
[ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use.
[ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaf journal None 
activate True

OSD created!

-Original Message-
From: Sage Weil [mailto:s...@inktank.com] 
Sent: Friday, September 06, 2013 6:41 PM
To: Pavel Timoschenkov
Cc: Alfredo Deza; ceph-users@lists.ceph.com
Subject: RE: [ceph-users] trouble with ceph-deploy

On Fri, 6 Sep 2013, Pavel Timoschenkov wrote:
> >>>Try
> >>>ceph-disk -v activate /dev/sdaa1
> 
> ceph-disk -v activate /dev/sdaa1
> /dev/sdaa1: ambivalent result (probably more filesystems on the 
> device, use wipefs(8) to see more details)

Looks like there are multiple fs signatures on that partition.  See

http://ozancaglayan.com/2013/01/29/multiple-filesystem-signatures-on-a-partition/

for how to clean that up.  And please share the wipefs output that you see; it 
may be that we need to make the --zap-disk behavior also explicitly clear any 
signatures on the device.

Thanks!
sage


> >>>as there is probably a partition there.  And/or tell us what 
> >>>/proc/partitions contains,
> 
> cat /proc/partitions
> major minor  #blocks  name
> 
> 65  160 2930266584 sdaa
>   65  161 2930265543 sdaa1
> 
> >>>and/or what you get from
> >>>ceph-disk list
> 
> ceph-disk list
> Traceback (most recent call last):
>   File "/usr/sbin/ceph-disk", line 2328, in 
> main()
>   File "/usr/sbin/ceph-disk", line 2317, in main
> args.func(args)
>   File "/usr/sbin/ceph-disk", line 2001, in main_list
> tpath = mount(dev=dev, fstype=fs_type, options='')
>   File "/usr/sbin/ceph-disk", line 678, in mount
> path,
>   File "/usr/lib/python2.7/subprocess.py", line 506, in check_call
> retcode = call(*popenargs, **kwargs)
>   File "/usr/lib/python2.7/subprocess.py", line 493, in call
> return Popen(*popenargs, **kwargs).wait()
>   File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
> errread, errwrite)
>   File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
> raise child_exception
> TypeError: execv() arg 2 must contain only strings
> 
> ==
> -Original Message-
> From: Sage Weil [mailto:s...@inktank.com]
> Sent: Thursday, September 05, 2013 6:37 PM
> To: Pavel Timoschenkov
> Cc: Alfredo Deza; ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] trouble with ceph-deploy
> 
> On Thu, 5 Sep 2013, Pavel Timoschenkov wrote:
> > >>>What happens if you do
> > >>>ceph-disk -v activate /dev/sdaa1
> > >>>on ceph001?
> > 
> > Hi. My issue has not been solved. When i execute ceph-disk -v activate 
> > /dev/sdaa - all is ok:
> > ceph-disk -v activate /dev/sdaa
> 
> Try
> 
>  ceph-disk -v activate /dev/sdaa1
> 
> as there is probably a partition there.  And/or tell us what 
> /proc/partitions contains, and/or what you get from
> 
>  ceph-disk list
> 
> Thanks!
> sage
> 
> 
> > DEBUG:ceph-disk:Mounting /dev/sdaa on /var/lib/ceph/tmp/mnt.yQuXIa 
> > with options noatime
> > mount: Structure needs cleaning
> > but OSD not created all the same:
> > ceph -k ceph.client.admin.keyring -s
> >   cluster 0a2e18d2-fd53-4f01-b63a-84851576c076
> >health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
> >monmap e1: 1 mon

Re: [ceph-users] CephFS metadata corruption on MDS restart

2013-09-09 Thread Yan, Zheng
On Mon, Sep 9, 2013 at 3:29 PM, Tobias Prousa  wrote:
> Hi Ceph,
>
> I recently realized that whenever I'm forced to restart MDS (i.e. stall or
> crash due to execcive memory consumption, btw. my MDS host has 32GB of RAM)
> especially while there are still clients having CephFS mounted, open files
> tend to have their metadata corrupted. Those files, when corrupted, will
> uniquely report a file size of exactly 4MiB, no matter what the real file
> size was. The rest of metadata like name, date, ... seems to be ok. I'm not
> 100% sure this is directly related to MDS restart but it obviously gives me
> the impression. Also the files that get corrupted are those that are highly
> likely open or have been written to most recently. I cannot see anything
> suspect on the logfiles, either.
>
> Some details on my setup:
>
> As servers there are 3 nodes running debian wheezy with ceph dumpling
> (0.67.2-35-g17a7342 from guibuilder, as 0.67.2 didn't get MDS out of rejoin
> any more). Each node runs a MON and three OSDs, furthermore a single one of
> those nodes is running one instance of MDS.
>
> Then there are 8 clients, running debian wheezy as well, with linux-3.9 from
> debian backports, mounting cephfs subdir 'home' as /home using kernel client
> (I know 3.9 is rather old for that, but I found no way to mount a subdir of
> cephfs from fstab using ceph-fuse).
> My clients' fstab entry looks something like that:
> 172.16.17.3:6789:/home/  /home   ceph
> name=admin,secretfile=/some/secret/file 0   0
>
> On a first look, I couldn't find something similar on ther tracker, anyone
> experiencing similar issues?
>
> Btw. restarting MDS gives me some headache every time, as it tends to refuse
> to restart (reporting beeing active but going into some cache cleanup
> endless loop and not answering fs client requests), and the only thing to
> get it up again is to increase mds cache size. I ended up with about 2M
> cache size, which used so much memory that restarts were neccesary about
> twice a day. So I shut down MDS over the weekend and after about 40 hours I
> was able to start up MDS again with about 250k cache size. Maybe that
> information is of some help for you.
>

The bug has been fixed in the 3.11 kernel by commit ccca4e37b1 (libceph:
fix truncate size calculation). We don't backport cephfs bug fixes to
old kernels.
Please update the kernel or use ceph-fuse.
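If you go the ceph-fuse route, mounting just a subdirectory should be possible with something like this (a sketch, assuming the monitor address from your mail):

ceph-fuse -m 172.16.17.3:6789 -r /home /home

The -r option makes ceph-fuse treat that CephFS path as the root of the mount.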

Regards
Yan, Zheng

> Best regards,
> Tobi
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd cp copies of sparse files become fully allocated

2013-09-09 Thread Andrey Korolyov
May I also suggest the same for export/import mechanism? Say, if image
was created by fallocate we may also want to leave holes upon upload
and vice-versa for export.

On Mon, Sep 9, 2013 at 8:45 AM, Sage Weil  wrote:
> On Sat, 7 Sep 2013, Oliver Daudey wrote:
>> Hey all,
>>
>> This topic has been partly discussed here:
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-March/000799.html
>>
>> Tested on Ceph version 0.67.2.
>>
>> If you create a fresh empty image of, say, 100GB in size on RBD and then
>> use "rbd cp" to make a copy of it, even though the image is sparse, the
>> command will attempt to read every part of it and take far more time
>> than expected.
>>
>> After reading the above thread, I understand why the copy of an
>> essentially empty sparse image on RBD would take so long, but it doesn't
>> explain why the copy won't be sparse itself.  If I use "rbd cp" to copy
>> an image, the copy will take it's full allocated size on disk, even if
>> the original was empty.  If I use the QEMU "qemu-img"-tool's
>> "convert"-option to convert the original image to the copy without
>> changing the format, essentially only making a copy, it takes it's time
>> as well, but will be faster than "rbd cp" and the resulting copy will be
>> sparse.
>>
>> Example-commands:
>> rbd create --size 102400 test1
>> rbd cp test1 test2
>> qemu-img convert -p -f rbd -O rbd rbd:rbd/test1 rbd:rbd/test3
>>
>> Shouldn't "rbd cp" at least have an option to attempt to sparsify the
>> copy, or copy the sparse parts as sparse?  Same goes for "rbd clone",
>> BTW.
>
> Yep, this is in fact a bug.  Opened http://tracker.ceph.com/issues/6257.
>
> Thanks!
> sage
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Lost osd-journal

2013-09-09 Thread Timofey Koolin
Does losing the journal mean that I lose all data from this OSD?
And must I have HA (RAID-1 or similar) journal storage if I use data
without replication?

-- 
Blog: www.rekby.ru
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] trouble with ceph-deploy

2013-09-09 Thread Sage Weil
If you manually use wipefs to clear out the fs signatures after you zap, 
does it work then?

I've opened http://tracker.ceph.com/issues/6258 as I think that is the 
answer here, but if you could confirm that wipefs does in fact solve the 
problem, that would be helpful!
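Based on the wipefs listing you posted, removing just the stale zfs signature from the new partition should look roughly like this (a sketch; double-check the offset against your own wipefs output, and don't wipe the xfs signature that ceph-disk just created):

wipefs -o 0x3 /dev/sdaf1
ceph-disk -v activate /dev/sdaf1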

Thanks-
sage


On Mon, 9 Sep 2013, Pavel Timoschenkov wrote:

> for the experiment:
> 
> - blank disk sdae for data
> 
> blkid -p /dev/sdaf
> /dev/sdaf: PTTYPE="gpt"
> 
> - and sda4 partition for journal
> 
> blkid -p /dev/sda4
> /dev/sda4: PTTYPE="gpt" PART_ENTRY_SCHEME="gpt" PART_ENTRY_NAME="Linux 
> filesystem" PART_ENTRY_UUID="cdc46436-b6ed-40bb-adb4-63cf1c41cbe3" 
> PART_ENTRY_TYPE="0fc63daf-8483-4772-8e79-3d69d8477de4" PART_ENTRY_NUMBER="4" 
> PART_ENTRY_OFFSET="62916608" PART_ENTRY_SIZE="20971520" PART_ENTRY_DISK="8:0"
> 
> - zapped disk 
> 
> ceph-deploy disk zap ceph001:sdaf ceph001:sda4
> [ceph_deploy.osd][DEBUG ] zapping /dev/sdaf on ceph001
> [ceph_deploy.osd][DEBUG ] zapping /dev/sda4 on ceph001
> 
> - after this:
> 
> ceph-deploy osd create ceph001:sdae:/dev/sda4
> [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks 
> ceph001:/dev/sdaf:/dev/sda4
> [ceph_deploy.osd][DEBUG ] Deploying osd to ceph001
> [ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use.
> [ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaf journal 
> /dev/sda4 activate True
> 
> 
> - after this:
> 
> blkid -p /dev/sdaf1
> /dev/sdaf1: ambivalent result (probably more filesystems on the device, use 
> wipefs(8) to see more details)
> 
> wipefs /dev/sdaf1
> offset   type
> 
> 0x3  zfs_member   [raid]
> 
> 0x0  xfs   [filesystem]
>  UUID:  aba50262-0427-4f8b-8eb9-513814af6b81
> 
> - and OSD not created
> 
> but if I'm using sungle disk for data and journal:
> 
> ceph-deploy disk zap ceph001:sdaf
> [ceph_deploy.osd][DEBUG ] zapping /dev/sdaf on ceph001
> 
> ceph-deploy osd create ceph001:sdaf
> [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph001:/dev/sdaf:
> [ceph_deploy.osd][DEBUG ] Deploying osd to ceph001
> [ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use.
> [ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaf journal None 
> activate True
> 
> OSD created!
> 
> -Original Message-
> From: Sage Weil [mailto:s...@inktank.com] 
> Sent: Friday, September 06, 2013 6:41 PM
> To: Pavel Timoschenkov
> Cc: Alfredo Deza; ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] trouble with ceph-deploy
> 
> On Fri, 6 Sep 2013, Pavel Timoschenkov wrote:
> > >>>Try
> > >>>ceph-disk -v activate /dev/sdaa1
> > 
> > ceph-disk -v activate /dev/sdaa1
> > /dev/sdaa1: ambivalent result (probably more filesystems on the 
> > device, use wipefs(8) to see more details)
> 
> Looks like thre are multiple fs signatures on that partition.  See
> 
> http://ozancaglayan.com/2013/01/29/multiple-filesystem-signatures-on-a-partition/
> 
> for how to clean that up.  And please share the wipefs output that you see; 
> it may be that we need to make the --zap-disk behavior also explicitly clear 
> any signatures on the device.
> 
> Thanks!
> sage
> 
> 
> > >>>as there is probably a partition there.  And/or tell us what 
> > >>>/proc/partitions contains,
> > 
> > cat /proc/partitions
> > major minor  #blocks  name
> > 
> > 65  160 2930266584 sdaa
> >   65  161 2930265543 sdaa1
> > 
> > >>>and/or what you get from
> > >>>ceph-disk list
> > 
> > ceph-disk list
> > Traceback (most recent call last):
> >   File "/usr/sbin/ceph-disk", line 2328, in 
> > main()
> >   File "/usr/sbin/ceph-disk", line 2317, in main
> > args.func(args)
> >   File "/usr/sbin/ceph-disk", line 2001, in main_list
> > tpath = mount(dev=dev, fstype=fs_type, options='')
> >   File "/usr/sbin/ceph-disk", line 678, in mount
> > path,
> >   File "/usr/lib/python2.7/subprocess.py", line 506, in check_call
> > retcode = call(*popenargs, **kwargs)
> >   File "/usr/lib/python2.7/subprocess.py", line 493, in call
> > return Popen(*popenargs, **kwargs).wait()
> >   File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
> > errread, errwrite)
> >   File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
> > raise child_exception
> > TypeError: execv() arg 2 must contain only strings
> > 
> > ==
> > -Original Message-
> > From: Sage Weil [mailto:s...@inktank.com]
> > Sent: Thursday, September 05, 2013 6:37 PM
> > To: Pavel Timoschenkov
> > Cc: Alfredo Deza; ceph-users@lists.ceph.com
> > Subject: RE: [ceph-users] trouble with ceph-deploy
> > 
> > On Thu, 5 Sep 2013, Pavel Timoschenkov wrote:
> > > >>>What happens if you do
> > > >>>ceph-disk -v activate /dev/sdaa1
> > > >>>on ceph001?
> > > 
> > > Hi. My issue has not been solved. When i execute ceph-disk -v activate 
> > > /dev/sdaa - all is ok:
> > > ceph-

Re: [ceph-users] ceph 0.67 ceph osd crush move changed

2013-09-09 Thread Sage Weil
On Mon, 9 Sep 2013, Vladislav Gorbunov wrote:
> Is ceph osd crush move syntax changed on 0.67?
> I have crushmap
> # id weight type name up/down reweight
> -1 10.11 root default
> -4 3.82  datacenter dc1
> -2 3.82   host cstore3
> 0 0.55osd.0 up 1
> 1 0.55osd.1 up 1
> 2 0.55osd.2 up 1
> -5 3.82  datacenter dc2
> -3 3.82   host cstore2
> 10 0.55osd.10 up 1
> 11 0.55osd.11 up 1
> 12 0.55osd.12 up 1
> -7 1.92  host cstore1
> 20 0.55   osd.20 up 1
> 21 0.55   osd.21 up 1
> 22 0.55   osd.22 up 1
> 23 0.27   osd.23 up 1
> 
> and try to move new added host cstore1 to datacenter with command:
> ceph osd crush move cstore1 root=default datacenter=dc1
> Invalid command:  osd id cstore1 not integer
> Error EINVAL: invalid command
> 
> It's worked on ceph 0.61. How can I move new host to datacenter? in 0.67?

Yep, this is a regression; see http://tracker.ceph.com/issues/6230.

Thanks for the report!  As you pointed out, link + unlink is a workaround.

Thanks!
sage

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Lost osd-journal

2013-09-09 Thread Wido den Hollander

On 09/09/2013 05:14 PM, Timofey Koolin wrote:

> Does losing the journal mean that I lose all data from this OSD?

If you are not using btrfs, yes.

> And must I have HA (RAID-1 or similar) journal storage if I use data
> without replication?

I'd not recommend that, but rather use Ceph's replication. Using HA or
RAID-1 defeats the purpose of Ceph, imho.
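If the goal is only to move or recreate the journal of a healthy OSD (as opposed to recovering from an actual loss), the usual sequence is roughly this sketch (assuming osd.0 and the sysvinit scripts):

service ceph stop osd.0
ceph-osd -i 0 --flush-journal
# set up the new journal device or file, then:
ceph-osd -i 0 --mkjournal
service ceph start osd.0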



--
Blog: www.rekby.ru 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Full OSD questions

2013-09-09 Thread Gaylord Holder

I'm starting to load up my ceph cluster.

I currently have 12 2TB drives (10 up and in, 2 defined but down and out).

rados df

says I have 8TB free, but I have 2 nearly full OSDs.

I don't understand how/why these two disks are filled while the others 
are relatively empty.


How do I tell ceph to spread the data around more, and why isn't it 
already doing it?


Thank you for helping me understand this system better.

Cheers,
-Gaylord
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Documentation OS Recommendations

2013-09-09 Thread Scottix
Great Thanks.


On Mon, Sep 9, 2013 at 11:31 AM, John Wilkins wrote:

> Yes. We'll have an update shortly.
>
> On Mon, Sep 9, 2013 at 11:29 AM, Scottix  wrote:
> > I was looking at someones question on the list and started looking up
> some
> > documentation and found this page.
> > http://ceph.com/docs/next/install/os-recommendations/
> >
> > Do you think you can provide an update for dumpling.
> >
> > Best Regards
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
>
> --
> John Wilkins
> Senior Technical Writer
> Intank
> john.wilk...@inktank.com
> (415) 425-9599
> http://inktank.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Documentation OS Recommendations

2013-09-09 Thread John Wilkins
Yes. We'll have an update shortly.

On Mon, Sep 9, 2013 at 11:29 AM, Scottix  wrote:
> I was looking at someones question on the list and started looking up some
> documentation and found this page.
> http://ceph.com/docs/next/install/os-recommendations/
>
> Do you think you can provide an update for dumpling.
>
> Best Regards
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Documentation OS Recommendations

2013-09-09 Thread Scottix
I was looking at someone's question on the list and started looking up some
documentation and found this page.
http://ceph.com/docs/next/install/os-recommendations/

Do you think you could provide an update for dumpling?

Best Regards
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Full OSD questions

2013-09-09 Thread Samuel Just
This is usually caused by having too few pgs.  Each pool with a
significant amount of data needs at least around 100pgs/osd.
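Checking and raising the values looks roughly like this (a sketch; substitute your pool name and a count that works out to ~100 PGs per OSD):

ceph osd pool get rbd pg_num
ceph osd pool set rbd pg_num 1024
ceph osd pool set rbd pgp_num 1024

Note that pg_num can only be increased, and data won't actually rebalance until pgp_num is raised as well.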
-Sam

On Mon, Sep 9, 2013 at 10:32 AM, Gaylord Holder  wrote:
> I'm starting to load up my ceph cluster.
>
> I currently have 12 2TB drives (10 up and in, 2 defined but down and out).
>
> rados df
>
> says I have 8TB free, but I have 2 nearly full OSDs.
>
> I don't understand how/why these two disks are filled while the others are
> relatively empty.
>
> How do I tell ceph to spread the data around more, and why isn't it already
> doing it?
>
> Thank you for helping me understand this system better.
>
> Cheers,
> -Gaylord
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw geo-replication and disaster recovery problem

2013-09-09 Thread Samuel Just
The regions and zones can be used to distribute among different ceph clusters.
-Sam

On Mon, Sep 2, 2013 at 2:05 AM, 李学慧  wrote:
> Mr.
> Hi! I'm interested in the rgw geo-replication and disaster recovery
> feature.
> But do those 'regions and zones' distribute among several different
> ceph clusters, or just one?
> Thank you !
>
>
> 
>
> ashely
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph freezes for 10+ seconds during benchmark

2013-09-09 Thread Samuel Just
It looks like osd.4 may actually be the problem.  Can you try removing
osd.4 and trying again?
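Something along these lines should do it (a sketch, assuming sysvinit on the OSD host; marking it out first lets the data drain before the daemon is stopped):

ceph osd out 4
# wait for rebalancing to finish, then on the host:
service ceph stop osd.4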
-Sam

On Mon, Sep 2, 2013 at 8:01 AM, Mariusz Gronczewski
 wrote:
> We've installed ceph on test cluster:
> 3x mon, 7xOSD on 2x10k RPM SAS
> Centos 6.4 ( 2.6.32-358.14.1.el6.x86_64  )
> ceph 0.67.2 (also tried with 0.61.7 with same results)
>
> And during rados bench I get very strange behaviour:
> # rados bench -p pbench 100 write
>
>sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
> ...
> 51  16  1503  1487   116.60372  0.306585  0.524611
> 52  16  1525  1509   116.05388  0.171904  0.520352
> 53  16  1541  1525115.0764  0.121784  0.516466
> 54  16  1541  1525   112.939 0 -  0.516466
> 55  16  1541  1525   110.885 0 -  0.516466
> 56  16  1541  1525   108.905 0 -  0.516466
> 57  16  1541  1525   106.994 0 -  0.516466
> ... ( http://pastebin.com/vV50YBVK )
>
> Bandwidth (MB/sec): 81.760
>
> Stddev Bandwidth:   53.8371
> Max bandwidth (MB/sec): 156
> Min bandwidth (MB/sec): 0
> Average Latency:0.782271
> Stddev Latency: 2.51829
> Max latency:26.1715
> Min latency:0.084654
>
> basically benchmark goes at full disk speed and then it stops any I/O for 10+ 
> seconds
>
> During that time all IO and cpu load on all nodes basically stops and ceph -w 
> starts to report:
>
> 2013-09-02 16:44:57.794115 osd.4 [WRN] 6 slow requests, 1 included below; 
> oldest blocked for > 62.953663 secs
> 2013-09-02 16:44:57.794125 osd.4 [WRN] slow request 60.363101 seconds old, 
> received at 2013-09-02 16:43:57.430961: osd_op(client.381797.0:2109 
> benchmark_data_hqblade203.non.3dart.com_18829_object2108 [write 0~4194304] 
> 14.745012c3 e277) v4 currently waiting for subops from [0]
> 2013-09-02 16:45:01.795211 osd.4 [WRN] 6 slow requests, 1 included below; 
> oldest blocked for > 66.954773 secs
> 2013-09-02 16:45:01.795221 osd.4 [WRN] slow request 60.661060 seconds old, 
> received at 2013-09-02 16:44:01.134112: osd_op(client.381797.0:2199 
> benchmark_data_hqblade203.non.3dart.com_18829_object2198 [write 0~4194304] 
> 14.dec41e60 e277) v4 currently waiting for subops from [0]
> 2013-09-02 16:45:02.795582 osd.4 [WRN] 6 slow requests, 2 included below; 
> oldest blocked for > 67.955102 secs
> 2013-09-02 16:45:02.795590 osd.4 [WRN] slow request 60.316291 seconds old, 
> received at 2013-09-02 16:44:02.479210: osd_op(client.381797.0:2230 
> benchmark_data_hqblade203.non.3dart.com_18829_object2229 [write 0~4194304] 
> 14.b3ca5505 e277) v4 currently waiting for subops from [0]
> 2013-09-02 16:45:02.795595 osd.4 [WRN] slow request 60.014792 seconds old, 
> received at 2013-09-02 16:44:02.780709: osd_op(client.381797.0:2234 
> benchmark_data_hqblade203.non.3dart.com_18829_object2233 [write 0~4194304] 
> 14.a8c8cfd5 e277) v4 currently waiting for subops from [0]
> 2013-09-02 16:45:03.723742 osd.0 [WRN] 10 slow requests, 1 included below; 
> oldest blocked for > 69.571037 secs
> 2013-09-02 16:45:03.723748 osd.0 [WRN] slow request 60.871583 seconds
> old, received at 2013-09-02 16:44:02.852110:
> osd_op(client.381797.0:2235
> benchmark_data_hqblade203.non.3dart.com_18829_object2234 [write
> 0~4194304] 14.d44b2ab6 e277) v4 currently waiting for subops from [4]
>
> Any ideas why it is happening and how it can be debugged ? it seems that 
> there is something wrong with osd.0 but there doesnt seem to be anything 
> wrong with machine itself (bonnie++ and dd on machine does not show up any 
> lockups)
>
> --
> Mariusz Gronczewski, Administrator
>
> Efigence Sp. z o. o.
> ul. Wołoska 9a, 02-583 Warszawa
> T: [+48] 22 380 13 13
> F: [+48] 22 380 13 14
> E: mariusz.gronczew...@efigence.com
> 
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD only storage, where to place journal

2013-09-09 Thread Samuel Just
You can't really disable the journal.  It's used for failure recovery.  It
should be fine to place your journal on the same ssd as the osd data
directory (though it does affect performance).
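For reference, a co-located journal is just a file next to the data directory; a minimal ceph.conf sketch (path and size are illustrative, not recommendations):

[osd]
    osd journal = /var/lib/ceph/osd/ceph-$id/journal
    osd journal size = 1024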
-Sam


On Wed, Sep 4, 2013 at 8:40 AM, Neo  wrote:

>
>
>
>  Original Message 
> Subject: Re: [ceph-users] SSD only storage, where to place journal
> Date: Fri, 30 Aug 2013 22:13:12 +0200
> From: Stefan Priebe
> To: Tobias Brunner
> CC: ceph-users@lists.ceph.com
> Hi Stefan.
>
> On 30.08.2013 22:09, Tobias Brunner wrote:
>
> SSD only setup
>
> >> Am I making the correct considerations? What are best practices on an SSD
> >> only storage cluster?
>
> > Yes, that's correct. What I hate at this point is that you lower the SSD
> > speed by writing to the journal, reading from the journal, and then writing
> > to the SSD. Sadly there is no option to disable the journal. I think for
> > SSDs this would be best.
>
> Are there any new considerations by now? Or is disabling the journal somewhat
> tricky, so that we can't build a --nojournal switch for such scenarios?
>
> I'd be glad for answers.
>
> Have fun and keep up the good work
>
> Neo
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] few port per ceph-osd

2013-09-09 Thread Samuel Just
That's normal, each osd listens on a few different ports for different reasons.
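If there's a firewall in the way, the usual approach is to open the whole range OSDs pick from rather than individual ports, e.g. with iptables (a sketch; 6789 is the monitor, 6800-7100 the typical OSD range):

iptables -A INPUT -p tcp -m multiport --dports 6789,6800:7100 -j ACCEPT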
-Sam

On Mon, Sep 9, 2013 at 12:27 AM, Timofey Koolin  wrote:
> I use ceph 0.67.2.
> When I start
> ceph-osd -i 0
> or
> ceph-osd -i 1
> it start one process, but it process open few tcp-ports, is it normal?
>
> netstat -nlp | grep ceph
> tcp0  0 10.11.0.73:6789 0.0.0.0:*   LISTEN
> 1577/ceph-mon - mon
> tcp0  0 10.11.0.73:6800 0.0.0.0:*   LISTEN
> 3649/ceph-osd - osd.0
> tcp0  0 10.11.0.73:6801 0.0.0.0:*   LISTEN
> 3649/ceph-osd - osd.0
> tcp0  0 10.11.0.73:6802 0.0.0.0:*   LISTEN
> 3649/ceph-osd - osd.0
> tcp0  0 10.11.0.73:6803 0.0.0.0:*   LISTEN
> 3649/ceph-osd - osd.0
> tcp0  0 10.11.0.73:6804 0.0.0.0:*   LISTEN
> 3764/ceph-osd - osd.1
> tcp0  0 10.11.0.73:6805 0.0.0.0:*   LISTEN
> 3764/ceph-osd - osd.1
> tcp0  0 10.11.0.73:6808 0.0.0.0:*   LISTEN
> 3764/ceph-osd - osd.1
> tcp0  0 10.11.0.73:6809 0.0.0.0:*   LISTEN
> 3764/ceph-osd - osd.1
>
> --
> Blog: www.rekby.ru
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw md5

2013-09-09 Thread Samuel Just
What do you mean by "directly from Rados"?
-Sam

On Wed, Sep 4, 2013 at 1:40 AM, Art M.  wrote:
> Hello,
>
> As I know, radosgw calculates MD5 of the uploaded file and compares it with
> MD5 provided in header.
>
> Is it possible to get calculated MD5 of uploaded file directly from Rados.
> We want to keep it as a file attribute for future use.
>
> How radosgw calculates MD5? Is it realtime block by block calculation, or
> postupload action?
>
>
> Thanks,
> Arturs Meinarts
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Full OSD questions

2013-09-09 Thread Timofey
I don't see anything very bad.
Try renaming your racks from numbers to unique strings, for example change
rack 1 {
to
rack rack1 {

and so on.
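The usual way to do that is to pull the CRUSH map out, edit it and push it back, roughly:

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# rename the rack buckets (and the matching item lines under root default), then:
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new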

On 09.09.2013, at 23:56, Gaylord Holder wrote:

> Thanks for your assistance.
> 
> Crush map:
> 
> # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
> 
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 osd.4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
> device 9 osd.9
> device 10 osd.10
> device 11 osd.11
> 
> # types
> type 0 osd
> type 1 host
> type 2 rack
> type 3 row
> type 4 room
> type 5 datacenter
> type 6 root
> 
> # buckets
> host celestia {
>   id -2   # do not change unnecessarily
>   # weight 7.600
>   alg straw
>   hash 0  # rjenkins1
>   item osd.0 weight 1.900
>   item osd.1 weight 1.900
>   item osd.2 weight 1.900
>   item osd.3 weight 1.900
> }
> rack 1 {
>   id -3   # do not change unnecessarily
>   # weight 7.600
>   alg straw
>   hash 0  # rjenkins1
>   item celestia weight 7.600
> }
> host luna {
>   id -4   # do not change unnecessarily
>   # weight 7.600
>   alg straw
>   hash 0  # rjenkins1
>   item osd.5 weight 1.900
>   item osd.6 weight 1.900
>   item osd.7 weight 1.900
>   item osd.4 weight 1.900
> }
> rack 2 {
>   id -5   # do not change unnecessarily
>   # weight 7.600
>   alg straw
>   hash 0  # rjenkins1
>   item luna weight 7.600
> }
> host twilight {
>   id -6   # do not change unnecessarily
>   # weight 7.600
>   alg straw
>   hash 0  # rjenkins1
>   item osd.8 weight 1.900
>   item osd.10 weight 1.900
>   item osd.11 weight 1.900
>   item osd.9 weight 1.900
> }
> rack 3 {
>   id -7   # do not change unnecessarily
>   # weight 7.600
>   alg straw
>   hash 0  # rjenkins1
>   item twilight weight 7.600
> }
> root default {
>   id -1   # do not change unnecessarily
>   # weight 22.800
>   alg straw
>   hash 0  # rjenkins1
>   item 1 weight 7.600
>   item 2 weight 7.600
>   item 3 weight 7.600
> }
> 
> # rules
> rule data {
>   ruleset 0
>   type replicated
>   min_size 1
>   max_size 10
>   step take default
>   step chooseleaf firstn 0 type host
>   step emit
> }
> rule metadata {
>   ruleset 1
>   type replicated
>   min_size 1
>   max_size 10
>   step take default
>   step chooseleaf firstn 0 type host
>   step emit
> }
> rule rbd {
>   ruleset 2
>   type replicated
>   min_size 1
>   max_size 10
>   step take default
>   step chooseleaf firstn 0 type host
>   step emit
> }
> 
> # end crush map
> 
> The full osds are 2 and 10.
> 
> -Gaylord
> 
> On 09/09/2013 03:49 PM, Timofey wrote:
> > Show crush map please
> >
> > On 09.09.2013, at 21:32, Gaylord Holder wrote:
> >
> >> I'm starting to load up my ceph cluster.
> >>
> >> I currently have 12 2TB drives (10 up and in, 2 defined but down and out).
> >>
> >> rados df
> >>
> >> says I have 8TB free, but I have 2 nearly full OSDs.
> >>
> >> I don't understand how/why these two disks are filled while the others are 
> >> relatively empty.
> >>
> >> How do I tell ceph to spread the data around more, and why isn't it 
> >> already doing it?
> >>
> >> Thank you for helping me understand this system better.
> >>
> >> Cheers,
> >> -Gaylord
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd cp copies of sparse files become fully allocated

2013-09-09 Thread Josh Durgin

On 09/09/2013 04:57 AM, Andrey Korolyov wrote:

May I also suggest the same for export/import mechanism? Say, if image
was created by fallocate we may also want to leave holes upon upload
and vice-versa for export.


Import and export already omit runs of zeroes. They could detect
smaller runs (currently they look at object size chunks), and export
might be more efficient if it used diff_iterate() instead of
read_iterate(). Have you observed them misbehaving with sparse images?
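A quick way to see that is to export an empty image to a file and compare apparent vs. allocated size (a sketch, reusing the test1 image from the quoted example; the exported file should end up sparse):

rbd export test1 /tmp/test1.raw
ls -lh /tmp/test1.raw   # apparent size, ~100G
du -sh /tmp/test1.raw   # allocated size, should stay small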


On Mon, Sep 9, 2013 at 8:45 AM, Sage Weil  wrote:

On Sat, 7 Sep 2013, Oliver Daudey wrote:

Hey all,

This topic has been partly discussed here:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-March/000799.html

Tested on Ceph version 0.67.2.

If you create a fresh empty image of, say, 100GB in size on RBD and then
use "rbd cp" to make a copy of it, even though the image is sparse, the
command will attempt to read every part of it and take far more time
than expected.

After reading the above thread, I understand why the copy of an
essentially empty sparse image on RBD would take so long, but it doesn't
explain why the copy won't be sparse itself.  If I use "rbd cp" to copy
an image, the copy will take it's full allocated size on disk, even if
the original was empty.  If I use the QEMU "qemu-img"-tool's
"convert"-option to convert the original image to the copy without
changing the format, essentially only making a copy, it takes it's time
as well, but will be faster than "rbd cp" and the resulting copy will be
sparse.

Example-commands:
rbd create --size 102400 test1
rbd cp test1 test2
qemu-img convert -p -f rbd -O rbd rbd:rbd/test1 rbd:rbd/test3

Shouldn't "rbd cp" at least have an option to attempt to sparsify the
copy, or copy the sparse parts as sparse?  Same goes for "rbd clone",
BTW.


Yep, this is in fact a bug.  Opened http://tracker.ceph.com/issues/6257.

Thanks!
sage


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Full OSD questions

2013-09-09 Thread Gaylord Holder

Indeed, that pool was created with the default pg_num of 8.

8 PGs * 2TB/OSD / 2 replicas ~ 8TB, which is about how far I got.

I bumped up the pg_num to 600 for that pool and nothing happened.
I bumped up the pgp_num to 600 for that pool and ceph started shifting 
things around.


Can you explain the difference between pg_num and pgp_num to me?
I can't understand the distinction.

Thank you for your help!

-Gaylord

On 09/09/2013 04:58 PM, Samuel Just wrote:

This is usually caused by having too few pgs.  Each pool with a
significant amount of data needs at least around 100pgs/osd.
-Sam

On Mon, Sep 9, 2013 at 10:32 AM, Gaylord Holder  wrote:

I'm starting to load up my ceph cluster.

I currently have 12 2TB drives (10 up and in, 2 defined but down and out).

rados df

says I have 8TB free, but I have 2 nearly full OSDs.

I don't understand how/why these two disks are filled while the others are
relatively empty.

How do I tell ceph to spread the data around more, and why isn't it already
doing it?

Thank you for helping me understand this system better.

Cheers,
-Gaylord
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] blockdev --setro cannot set krbd to readonly

2013-09-09 Thread Josh Durgin

On 09/08/2013 01:14 AM, Da Chun Ng wrote:

I mapped an image to a system, and used blockdev to make it readonly.
But it failed.
[root@ceph0 mnt]# blockdev --setro /dev/rbd2
[root@ceph0 mnt]# blockdev --getro /dev/rbd2
0

It's on Centos6.4 with kernel 3.10.6 .
Ceph 0.61.8 .

Any idea?


For reasons I can't understand right now, calling set_device_ro(bdev, ro)
in the driver seems to prevent future BLKROSET ioctls from having any
effect, even though they should be calling exactly the same function.
The rbd driver always calls set_device_ro() right now, which causes
the problem.

Presumably there's some cached information that isn't updated if the
driver set the flags during device initialization. There's no reason
you shouldn't be able to change it for non-snapshot mappings though.

I added http://tracker.ceph.com/issues/6265 to track this.

Josh
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Understanding ceph status

2013-09-09 Thread Gaylord Holder

There are a lot of numbers ceph status prints.

Is there any documentation on what they are?

I'm particularly curious about what appears to be a total data figure.

ceph status says I have 314TB, whereas by my calculation I have 24TB.

It also says:

10615 GB used, 8005 GB / 18621 GB avail;

which I take to be 10TB used/8T available for use, and 18TB total available.

This doesn't make sense to me as I have 24TB raw and with default 2x 
replication, I should only have 12TB available??


I see MB/s, K/s, o/s, but what are E/s units?

-Gaylord
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] [RadosGW] Which pools should be set to size = 3 for three copies of object ?

2013-09-09 Thread Kuo Hugo
Hi Folks,
I found that RadosGW created the following pools. The number of copies is 2 by
default. I'd like to raise the replica count to 3 for better reliability. I
tried to find the definition/usage of each pool but had no luck.
Could someone provide information about the usage of each pool and tell me
which ones I should set to size 3?

.rgw.gc
.rgw.control
.users.uid
.users
.rgw
.rgw.buckets
.users.swift
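Once I know which pools matter, I assume the change itself is just something like this for each of them (a sketch):

ceph osd pool set .rgw.buckets size 3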



Appreciate

+Hugo Kuo+
(+886) 935004793
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [RadosGW] Which pools should be set to size = 3 for three copies of object ?

2013-09-09 Thread Kuo Hugo
Thanks for the quick reply.


+Hugo Kuo+
(+886) 935004793


2013/9/10 Yehuda Sadeh 

> by default .rgw.buckets holds the objects data.
>
> On Mon, Sep 9, 2013 at 8:39 PM, Kuo Hugo  wrote:
> > Hi Folks,
> > I found that RadosGW created the following pools. The copies number is 2
> by
> > default. I'd like to tweak the replicas to 3 for better reliability.  I
> > tried to find the definition/usage of each pool but no luck.
> > Could someone provide related information about the usage of each pool
> and
> > which should I set the size to 3 ?
> >
> > .rgw.gc
> > .rgw.control
> > .users.uid
> > .users
> > .rgw
> > .rgw.buckets
> > .users.swift
> >
> >
> >
> > Appreciate
> >
> > +Hugo Kuo+
> > (+886) 935004793
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] New Geek on Duty!

2013-09-09 Thread Ross David Turk

Greetings, ceph-users :)

I’m pleased to share that Xiaoxi from the Intel Asia Pacific R&D Center
is our newest volunteer Geek on Duty!

If you’re not familiar with the Geek on Duty program, here are the
basics: members of our community take shifts on IRC and on the mailing
list to help new users get Ceph up and running quickly.

Xiaoxi will be taking the 10:00 - 13:00 shift in China (which is 7pm PDT,
10pm EDT, 04:00 CEST).  His handle on IRC is “xiaoxi” - everyone say
hello when you see him in the channel next!

Cheers,
Ross

--
Ross Turk
Community, Inktank

@rossturk @inktank @ceph

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com