Re: [ceph-users] phantom osd.0 in osd tree

2016-08-23 Thread M Ranga Swami Reddy
Please share the crushmap.
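(For reference, a minimal sketch of how the CRUSH map is usually dumped and
decompiled for sharing, assuming the crushtool binary is installed:)

# ceph osd getcrushmap -o /tmp/crushmap.bin
# crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt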

Thanks
Swami

On Tue, Aug 23, 2016 at 11:49 PM, Reed Dier  wrote:

> Trying to hunt down a mystery osd populated in the osd tree.
>
> Cluster was deployed using ceph-deploy on an admin node, originally 10.2.1
> at time of deployment, but since upgraded to 10.2.2.
>
> For reference, mons and mds do not live on the osd nodes, and the admin
> node is neither mon, mds, or osd.
>
> Attempting to remove it from the crush map, it says that osd.0 does not
> exist.
>
> Just looking for some insight into this mystery.
>
> Thanks
>
> # ceph osd tree
> ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 58.19960 root default
> -2  7.27489     host node24
>  1  7.27489         osd.1        up      1.0              1.0
> -3  7.27489     host node25
>  2  7.27489         osd.2        up      1.0              1.0
> -4  7.27489     host node26
>  3  7.27489         osd.3        up      1.0              1.0
> -5  7.27489     host node27
>  4  7.27489         osd.4        up      1.0              1.0
> -6  7.27489     host node28
>  5  7.27489         osd.5        up      1.0              1.0
> -7  7.27489     host node29
>  6  7.27489         osd.6        up      1.0              1.0
> -8  7.27539     host node30
>  9  7.27539         osd.9        up      1.0              1.0
> -9  7.27489     host node31
>  7  7.27489         osd.7        up      1.0              1.0
>  0        0         osd.0      down      0                1.0
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


[ceph-users] Memory leak in ceph OSD.

2016-08-23 Thread Khang Nguyễn Nhật
Hi,
I'm using Ceph Jewel 10.2.2. I noticed that when I PUT multiple objects of
the same file, with the same user, to ceph-rgw S3, the RAM usage of ceph-osd
increases and is never released. At the same time, the upload speed drops
significantly.

Please help me solve this problem.
Thanks!


[ceph-users] issue with duplicated data in ceph storage cluster.

2016-08-23 Thread Khang Nguyễn Nhật
Hi,

I'm using Ceph Jewel 10.2.2 and I would like to know what Ceph does with
duplicate data.
Will the Ceph OSDs automatically delete the duplicates, or will ceph-rgw do it? My Ceph
storage cluster uses the S3 API to PUT objects.
Example:
1. Suppose I use one ceph-rgw S3 user to PUT two different objects of the same
source file to ceph-rgw S3: how will my Ceph storage cluster handle that?
2. If I use two ceph-rgw S3 users to PUT two different objects of the same source
file to ceph-rgw S3: how will my Ceph storage cluster handle that?

Please help me solve this problem.
Thanks!


Re: [ceph-users] rbd-nbd: list-mapped : is it possible to display association between rbd volume and nbd device ?

2016-08-23 Thread Jason Dillaman
Would you mind opening a feature tracker ticket [1] to document the
proposal? Any chance you are interested in doing the work?

[1] http://tracker.ceph.com/projects/rbd/issues

On Tue, Aug 23, 2016 at 11:15 AM, Alexandre DERUMIER
 wrote:
> I have found a way: the nbd device stores the pid of the running rbd-nbd process,
> so:
>
>
> #cat /sys/block/nbd0/pid
> 18963
> #cat /proc/18963/cmdline
> rbd-nbd map pool/testimage
>
>
>
>
> - Original Message -
> From: "Jason Dillaman" 
> To: "aderumier" 
> Cc: "ceph-users" 
> Sent: Tuesday, 23 August 2016 16:30:38
> Subject: Re: [ceph-users] rbd-nbd: list-mapped : is it possible to display 
> association between rbd volume and nbd device ?
>
> I don't think this is something that could be trivially added. The
> nbd protocol doesn't really support associating metadata with the
> device. Right now, that "list-mapped" command just tests each nbd
> device to see if it is connected to any backing server (not just
> rbd-nbd backed devices).
>
> On Tue, Aug 23, 2016 at 5:28 AM, Alexandre DERUMIER  
> wrote:
>> Hi, I'm currently testing rbd-nbd, to use it for lxc instead of krbd (to
>> support new rbd features)
>>
>>
>> #rbd-nbd map pool/testimage
>> /dev/nbd0
>> #rbd-nbd list-mapped
>> /dev/nbd0
>>
>>
>> Is it possible to implement something like
>>
>> #rbd-nbd list-mapped
>> /dev/nbd0 pool/testimage
>>
>>
>> Regards,
>>
>> Alexandre
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> --
> Jason
>



-- 
Jason


Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-23 Thread Nick Fisk


> -Original Message-
> From: Wido den Hollander [mailto:w...@42on.com]
> Sent: 23 August 2016 19:45
> To: Ilya Dryomov ; Nick Fisk 
> Cc: ceph-users 
> Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
> 
> 
> > On 23 August 2016 at 18:32, Ilya Dryomov wrote:
> >
> >
> > On Mon, Aug 22, 2016 at 9:22 PM, Nick Fisk  wrote:
> > >> -Original Message-
> > >> From: Wido den Hollander [mailto:w...@42on.com]
> > >> Sent: 22 August 2016 18:22
> > >> To: ceph-users ; n...@fisk.me.uk
> > >> Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
> > >>
> > >>
> > >> > On 22 August 2016 at 15:17, Nick Fisk wrote:
> > >> >
> > >> >
> > >> > Hope it's useful to someone
> > >> >
> > >> > https://gist.github.com/fiskn/6c135ab218d35e8b53ec0148fca47bf6
> > >> >
> > >>
> > >> Thanks for sharing. Might this be worth adding it to ceph-common?
> > >
> > > Maybe, Ilya kindly set the default for krbd to 4MB last year in the 
> > > kernel, but maybe having this available would be handy if people
> ever want a different default. It could be set to 4MB as well, with a note 
> somewhere to point people at its direction if they need to
> change it.
> >
> > I remember you running tests and us talking about it, but I didn't
> > actually do it - the default is still a standard kernel-wide 128k.
> > I hesitated because it's obviously a trade off and we didn't have a
> > clear winner.  Whatever (sensible) default we pick, users with
> > demanding all sequential workloads would want to crank it up anyway.

Yes sorry, I was getting mixed up with the default max_sectors_kb which is now 
4MB.

> >
> > I don't have an opinion on the udev file.
> >
> 
> I would vote for adding it to ceph-common so that it's there and users can 
> easily change it.
> 
> We can still default it to 128k which makes it just a file change for users.

Yep, this sounds like a good idea
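(As a minimal sketch of what that file change boils down to at runtime, assuming an
rbd0 device; the udev rule in the gist just makes the same setting persistent:)

# cat /sys/block/rbd0/queue/read_ahead_kb                (current value; the kernel-wide default is 128)
# echo 16384 > /sys/block/rbd0/queue/read_ahead_kb       (16MB, the value discussed further down)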

> 
> Wido
> 
> > >
> > >>
> > >> And is 16MB something we should want by default or does this apply to 
> > >> your situation better?
> > >
> > > It sort of applies to me. With a 4MB readahead you will probably struggle 
> > > to get much more than around 50-80MB/s sequential
> reads, as the read ahead will only ever hit 1 object at a time. If you want 
> to get nearer 200MB/s then you need to set either 16 or
> 32MB readahead. I need it to stream to LTO6 tape. Depending on what you are 
> doing this may or may not be required.
> >
> > Thanks,
> >
> > Ilya



Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-23 Thread Wido den Hollander

> On 23 August 2016 at 18:32, Ilya Dryomov wrote:
> 
> 
> On Mon, Aug 22, 2016 at 9:22 PM, Nick Fisk  wrote:
> >> -Original Message-
> >> From: Wido den Hollander [mailto:w...@42on.com]
> >> Sent: 22 August 2016 18:22
> >> To: ceph-users ; n...@fisk.me.uk
> >> Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
> >>
> >>
> >> > On 22 August 2016 at 15:17, Nick Fisk wrote:
> >> >
> >> >
> >> > Hope it's useful to someone
> >> >
> >> > https://gist.github.com/fiskn/6c135ab218d35e8b53ec0148fca47bf6
> >> >
> >>
> >> Thanks for sharing. Might this be worth adding it to ceph-common?
> >
> > Maybe, Ilya kindly set the default for krbd to 4MB last year in the kernel, 
> > but maybe having this available would be handy if people ever want a 
> > different default. It could be set to 4MB as well, with a note somewhere to 
> > point people at its direction if they need to change it.
> 
> I remember you running tests and us talking about it, but I didn't
> actually do it - the default is still a standard kernel-wide 128k.
> I hesitated because it's obviously a trade off and we didn't have
> a clear winner.  Whatever (sensible) default we pick, users with
> demanding all sequential workloads would want to crank it up anyway.
> 
> I don't have an opinion on the udev file.
> 

I would vote for adding it to ceph-common so that it's there and users can 
easily change it.

We can still default it to 128k which makes it just a file change for users.

Wido

> >
> >>
> >> And is 16MB something we should want by default or does this apply to your 
> >> situation better?
> >
> > It sort of applies to me. With a 4MB readahead you will probably struggle 
> > to get much more than around 50-80MB/s sequential reads, as the read ahead 
> > will only ever hit 1 object at a time. If you want to get nearer 200MB/s 
> > then you need to set either 16 or 32MB readahead. I need it to stream to 
> > LTO6 tape. Depending on what you are doing this may or may not be required.
> 
> Thanks,
> 
> Ilya


[ceph-users] phantom osd.0 in osd tree

2016-08-23 Thread Reed Dier
Trying to hunt down a mystery osd populated in the osd tree.

Cluster was deployed using ceph-deploy on an admin node, originally 10.2.1 at 
time of deployment, but since upgraded to 10.2.2.

For reference, mons and mds do not live on the osd nodes, and the admin node is 
neither mon, mds, or osd.

Attempting to remove it from the crush map, it says that osd.0 does not exist.

Just looking for some insight into this mystery.

Thanks

# ceph osd tree
ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 58.19960 root default
-2  7.27489     host node24
 1  7.27489         osd.1        up      1.0              1.0
-3  7.27489     host node25
 2  7.27489         osd.2        up      1.0              1.0
-4  7.27489     host node26
 3  7.27489         osd.3        up      1.0              1.0
-5  7.27489     host node27
 4  7.27489         osd.4        up      1.0              1.0
-6  7.27489     host node28
 5  7.27489         osd.5        up      1.0              1.0
-7  7.27489     host node29
 6  7.27489         osd.6        up      1.0              1.0
-8  7.27539     host node30
 9  7.27539         osd.9        up      1.0              1.0
-9  7.27489     host node31
 7  7.27489         osd.7        up      1.0              1.0
 0        0         osd.0      down      0                1.0
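(For what it's worth, a minimal sketch of the usual clean-up for a leftover entry like
this, assuming the id really is unused; the CRUSH step will simply report "does not
exist" if, as above, the entry only lives in the OSD map:)

# ceph osd crush remove osd.0
# ceph auth del osd.0
# ceph osd rm 0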


Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-23 Thread Ilya Dryomov
On Tue, Aug 23, 2016 at 6:15 PM, Nick Fisk  wrote:
>
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
>> Alex Gorbachev
>> Sent: 23 August 2016 16:43
>> To: Wido den Hollander 
>> Cc: ceph-users ; Nick Fisk 
>> Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
>>
>> On Mon, Aug 22, 2016 at 3:29 PM, Wido den Hollander  wrote:
>> >
>> >> On 22 August 2016 at 21:22, Nick Fisk wrote:
>> >>
>> >>
>> >> > -Original Message-
>> >> > From: Wido den Hollander [mailto:w...@42on.com]
>> >> > Sent: 22 August 2016 18:22
>> >> > To: ceph-users ; n...@fisk.me.uk
>> >> > Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
>> >> >
>> >> >
>> >> > > On 22 August 2016 at 15:17, Nick Fisk wrote:
>> >> > >
>> >> > >
>> >> > > Hope it's useful to someone
>> >> > >
>> >> > > https://gist.github.com/fiskn/6c135ab218d35e8b53ec0148fca47bf6
>> >> > >
>> >> >
>> >> > Thanks for sharing. Might this be worth adding it to ceph-common?
>> >>
>> >> Maybe, Ilya kindly set the default for krbd to 4MB last year in the 
>> >> kernel, but maybe having this available would be handy if
> people
>> ever want a different default. It could be set to 4MB as well, with a note 
>> somewhere to point people at its direction if they need
> to
>> change it.
>> >>
>> >
>> > I think it might be handy to have the udev file as redundancy. That way it 
>> > can easily be changed by users. The udev file is
> already
>> present, they just have to modify it.
>> >
>> >> >
>> >> > And is 16MB something we should want by default or does this apply to 
>> >> > your situation better?
>> >>
>> >> It sort of applies to me. With a 4MB readahead you will probably struggle 
>> >> to get much more than around 50-80MB/s sequential
>> reads, as the read ahead will only ever hit 1 object at a time. If you want 
>> to get nearer 200MB/s then you need to set either 16
> or
>> 32MB readahead. I need it to stream to LTO6 tape. Depending on what you are 
>> doing this may or may not be required.
>> >>
>> >
>> > Ah, yes. In a kind of similar use-case I went for using 64MB objects
>> > underneath an RBD device. We needed high sequential Write and
>> > Read performance on those RBD devices since we were storing large files on
>> > there.
>> >
>> > Different approach, kind of similar result.
>>
>> Question: what scheduler were you guys using to facilitate the readahead on 
>> the RBD client?  Have you noticed any difference
>> between different elevators and have you tried blk-mq/scsi-mq?
>
> I thought since kernel 3.19 you didn't have a choice and RBD always used 
> blk-mq? But that's what I'm using as default.

Correct, but since 4.0.  3.19 was the last non-blk-mq-rbd kernel.

Thanks,

Ilya


Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-23 Thread Ilya Dryomov
On Mon, Aug 22, 2016 at 9:22 PM, Nick Fisk  wrote:
>> -Original Message-
>> From: Wido den Hollander [mailto:w...@42on.com]
>> Sent: 22 August 2016 18:22
>> To: ceph-users ; n...@fisk.me.uk
>> Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
>>
>>
>> > On 22 August 2016 at 15:17, Nick Fisk wrote:
>> >
>> >
>> > Hope it's useful to someone
>> >
>> > https://gist.github.com/fiskn/6c135ab218d35e8b53ec0148fca47bf6
>> >
>>
>> Thanks for sharing. Might this be worth adding it to ceph-common?
>
> Maybe, Ilya kindly set the default for krbd to 4MB last year in the kernel, 
> but maybe having this available would be handy if people ever want a 
> different default. It could be set to 4MB as well, with a note somewhere to 
> point people at its direction if they need to change it.

I remember you running tests and us talking about it, but I didn't
actually do it - the default is still a standard kernel-wide 128k.
I hesitated because it's obviously a trade off and we didn't have
a clear winner.  Whatever (sensible) default we pick, users with
demanding all sequential workloads would want to crank it up anyway.

I don't have an opinion on the udev file.

>
>>
>> And is 16MB something we should want by default or does this apply to your 
>> situation better?
>
> It sort of applies to me. With a 4MB readahead you will probably struggle to 
> get much more than around 50-80MB/s sequential reads, as the read ahead will 
> only ever hit 1 object at a time. If you want to get nearer 200MB/s then you 
> need to set either 16 or 32MB readahead. I need it to stream to LTO6 tape. 
> Depending on what you are doing this may or may not be required.

Thanks,

Ilya


Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-23 Thread Nick Fisk

> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alex 
> Gorbachev
> Sent: 23 August 2016 16:43
> To: Wido den Hollander 
> Cc: ceph-users ; Nick Fisk 
> Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
> 
> On Mon, Aug 22, 2016 at 3:29 PM, Wido den Hollander  wrote:
> >
> >> On 22 August 2016 at 21:22, Nick Fisk wrote:
> >>
> >>
> >> > -Original Message-
> >> > From: Wido den Hollander [mailto:w...@42on.com]
> >> > Sent: 22 August 2016 18:22
> >> > To: ceph-users ; n...@fisk.me.uk
> >> > Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
> >> >
> >> >
> >> > > On 22 August 2016 at 15:17, Nick Fisk wrote:
> >> > >
> >> > >
> >> > > Hope it's useful to someone
> >> > >
> >> > > https://gist.github.com/fiskn/6c135ab218d35e8b53ec0148fca47bf6
> >> > >
> >> >
> >> > Thanks for sharing. Might this be worth adding it to ceph-common?
> >>
> >> Maybe, Ilya kindly set the default for krbd to 4MB last year in the 
> >> kernel, but maybe having this available would be handy if
people
> ever want a different default. It could be set to 4MB as well, with a note 
> somewhere to point people at its direction if they need
to
> change it.
> >>
> >
> > I think it might be handy to have the udev file as redundancy. That way it 
> > can easily be changed by users. The udev file is
already
> present, they just have to modify it.
> >
> >> >
> >> > And is 16MB something we should want by default or does this apply to 
> >> > your situation better?
> >>
> >> It sort of applies to me. With a 4MB readahead you will probably struggle 
> >> to get much more than around 50-80MB/s sequential
> reads, as the read ahead will only ever hit 1 object at a time. If you want 
> to get nearer 200MB/s then you need to set either 16
or
> 32MB readahead. I need it to stream to LTO6 tape. Depending on what you are 
> doing this may or may not be required.
> >>
> >
> > Ah, yes. In a kind of similar use-case I went for using 64MB objects
> > underneath an RBD device. We needed high sequential Write and
> > Read performance on those RBD devices since we were storing large files on
> > there.
> >
> > Different approach, kind of similar result.
> 
> Question: what scheduler were you guys using to facilitate the readahead on 
> the RBD client?  Have you noticed any difference
> between different elevators and have you tried blk-mq/scsi-mq?

I thought since kernel 3.19 you didn't have a choice and RBD always used 
blk-mq? But that's what I'm using as default.

> 
> Thank you.
> --
> Alex Gorbachev
> Storcium
> 
> 
> >
> > Wido
> >
> >> >
> >> > Wido
> >> >
> >> > >
> >> > > ___
> >> > > ceph-users mailing list
> >> > > ceph-users@lists.ceph.com
> >> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-23 Thread Alex Gorbachev
On Mon, Aug 22, 2016 at 3:29 PM, Wido den Hollander  wrote:
>
>> On 22 August 2016 at 21:22, Nick Fisk wrote:
>>
>>
>> > -Original Message-
>> > From: Wido den Hollander [mailto:w...@42on.com]
>> > Sent: 22 August 2016 18:22
>> > To: ceph-users ; n...@fisk.me.uk
>> > Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's
>> >
>> >
>> > > On 22 August 2016 at 15:17, Nick Fisk wrote:
>> > >
>> > >
>> > > Hope it's useful to someone
>> > >
>> > > https://gist.github.com/fiskn/6c135ab218d35e8b53ec0148fca47bf6
>> > >
>> >
>> > Thanks for sharing. Might this be worth adding it to ceph-common?
>>
>> Maybe, Ilya kindly set the default for krbd to 4MB last year in the kernel, 
>> but maybe having this available would be handy if people ever want a 
>> different default. It could be set to 4MB as well, with a note somewhere to 
>> point people at its direction if they need to change it.
>>
>
> I think it might be handy to have the udev file as redundancy. That way it 
> can easily be changed by users. The udev file is already present, they just 
> have to modify it.
>
>> >
>> > And is 16MB something we should want by default or does this apply to your 
>> > situation better?
>>
>> It sort of applies to me. With a 4MB readahead you will probably struggle to 
>> get much more than around 50-80MB/s sequential reads, as the read ahead will 
>> only ever hit 1 object at a time. If you want to get nearer 200MB/s then you 
>> need to set either 16 or 32MB readahead. I need it to stream to LTO6 tape. 
>> Depending on what you are doing this may or may not be required.
>>
>
> Ah, yes. In a kind of similar use-case I went for using 64MB objects
> underneath an RBD device. We needed high sequential Write and Read performance
> on those RBD devices since we were storing large files on there.
>
> Different approach, kind of similar result.

Question: what scheduler were you guys using to facilitate the
readahead on the RBD client?  Have you noticed any difference between
different elevators and have you tried blk-mq/scsi-mq?

Thank you.
--
Alex Gorbachev
Storcium


>
> Wido
>
>> >
>> > Wido
>> >
>> > >
>> > > ___
>> > > ceph-users mailing list
>> > > ceph-users@lists.ceph.com
>> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd-nbd: list-mapped : is it possible to display association between rbd volume and nbd device ?

2016-08-23 Thread Alexandre DERUMIER
I have found a way: the nbd device stores the pid of the running rbd-nbd process,
so:


#cat /sys/block/nbd0/pid 
18963
#cat /proc/18963/cmdline 
rbd-nbd map pool/testimage
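(Building on that trick, a minimal sketch that prints the association for every
connected nbd device, assuming /sys/block/nbdX/pid exists whenever a device is in
use:)

for pidfile in /sys/block/nbd*/pid; do
    [ -e "$pidfile" ] || continue
    dev=/dev/$(basename "$(dirname "$pidfile")")
    # /proc/<pid>/cmdline is NUL-separated; turn the NULs into spaces
    cmd=$(tr '\0' ' ' < "/proc/$(cat "$pidfile")/cmdline")
    echo "$dev $cmd"
done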




- Original Message -
From: "Jason Dillaman" 
To: "aderumier" 
Cc: "ceph-users" 
Sent: Tuesday, 23 August 2016 16:30:38
Subject: Re: [ceph-users] rbd-nbd: list-mapped : is it possible to display 
association between rbd volume and nbd device ?

I don't think this is something that could be trivially added. The 
nbd protocol doesn't really support associating metadata with the 
device. Right now, that "list-mapped" command just tests each nbd 
device to see if it is connected to any backing server (not just 
rbd-nbd backed devices). 

On Tue, Aug 23, 2016 at 5:28 AM, Alexandre DERUMIER  
wrote: 
> Hi, I'm currently testing rbd-nbd, to use it for lxc instead of krbd (to 
> support new rbd features) 
> 
> 
> #rbd-nbd map pool/testimage 
> /dev/nbd0 
> #rbd-nbd list-mapped 
> /dev/nbd0 
> 
> 
> Is it possible to implement something like 
> 
> #rbd-nbd list-mapped 
> /dev/nbd0 pool/testimage 
> 
> 
> Regards, 
> 
> Alexandre 
> ___ 
> ceph-users mailing list 
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 



-- 
Jason 



[ceph-users] CephFS + cache tiering in Jewel

2016-08-23 Thread Burkhard Linke

Hi,

the Firefly and Hammer releases did not support transparent usage of 
cache tiering in CephFS. The cache tier itself had to be specified as 
data pool, thus preventing on-the-fly addition and removal of cache tiers.


Does the same restriction also apply to Jewel? I would like to add a 
cache tier to an existing data pool.


Regards,
Burkhard



Re: [ceph-users] rbd-nbd: list-mapped : is it possible to display association between rbd volume and nbd device ?

2016-08-23 Thread Jason Dillaman
I don't think this is something that could be trivially added.  The
nbd protocol doesn't really support associating metadata with the
device. Right now, that "list-mapped" command just tests each nbd
device to see if it is connected to any backing server (not just
rbd-nbd backed devices).

On Tue, Aug 23, 2016 at 5:28 AM, Alexandre DERUMIER  wrote:
> Hi, I'm currently testing rbd-nbd, to use it for lxc instead of krbd (to 
> support new rbd features)
>
>
> #rbd-nbd  map pool/testimage
> /dev/nbd0
> #rbd-nbd list-mapped
> /dev/nbd0
>
>
> Is it possible to implement something like
>
> #rbd-nbd list-mapped
> /dev/nbd0 pool/testimage
>
>
> Regards,
>
> Alexandre
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Jason


[ceph-users] Ceph auth key generation algorithm documentation

2016-08-23 Thread Heller, Chris
I'd like to generate keys for Ceph on systems that don't have
ceph-authtool.
Looking over the Ceph website and googling has turned up nothing.

Is the ceph auth key generation algorithm documented anywhere?

-Chris
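(Not documented as far as I know, but a keyring secret appears to be plain base64 over
a small binary header plus 16 random bytes. A sketch of generating one outside of
ceph-authtool; treat the assumed layout (le16 type=1 for AES, le32 seconds, le32
nanoseconds, le16 key length, then the key itself) as an assumption to verify against
ceph-authtool's own output before relying on it:)

python -c 'import os, struct, time, base64; key = os.urandom(16); hdr = struct.pack("<hiih", 1, int(time.time()), 0, len(key)); print(base64.b64encode(hdr + key).decode())'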


Re: [ceph-users] Help with systemd

2016-08-23 Thread Robert Sander
On 22.08.2016 20:16, K.C. Wong wrote:
> Is there a way
> to force a 'remote-fs' reclassification?

Have you tried adding _netdev to the fstab options?
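(A minimal sketch of where that goes in /etc/fstab; the device, mountpoint and
filesystem type below are placeholders:)

/dev/rbd/rbd/myimage  /mnt/data  xfs  defaults,noatime,_netdev  0 0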

Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin





Re: [ceph-users] Fwd: Re: Merging CephFS data pools

2016-08-23 Thread Дробышевский , Владимир
>
>
> Missing CC to list
>
>
>  Forwarded Message 
> Subject: Re: [ceph-users] Merging CephFS data pools
> Date: Tue, 23 Aug 2016 08:59:45 +0200
> From: Burkhard Linke 
> 
> To: Gregory Farnum  
>
> Hi,
>
>
> On 08/22/2016 10:02 PM, Gregory Farnum wrote:
> > On Thu, Aug 18, 2016 at 12:21 AM, Burkhard Linke
> >  
> >  wrote:
> >> Hi,
> >>
> >> the current setup for CephFS at our site uses two data pools due to
> >> different requirements in the past. I want to merge these two pools now,
> >> eliminating the second pool completely.
> >>
> >> I've written a small script to locate all files on the second pool using
> >> their file layout attributes and replace them with a copy on the correct
> >> pool. This works well for files, but modifies the timestamps of the
> >> directories.
> >> Do you have any idea for a better solution that does not modify timestamps
> >> and plays well with active CephFS clients (e.g. no problem with files being
> >> used)? A simple 'rados cppool' probably does not work since the pool 
> >> id/name
> >> is part of a file's metadata and client will not be aware of moved
> >> files.
> > Can't you just use rsync or something that will set the timestamps itself?
> The script is using 'cp -a', which also preserves the timestamps. So
> file timestamps are ok, but directory timestamps get updated by cp and
> mv. And that's ok from my point of view.
>
> The main concern is data integrity. There are 20TB left to be
> transferred from the old pool, and part of this data is currently in
> active use (including being overwritten in place). If write access to an
> opened file happens while it is being transfered, the changes to that
> file might be lost.
>
> We can coordinate the remaining transfers with the affected users, if no
> other way exists.
>
I believe that the best way is to copy all the files from the old pool to
the other one, then set a service window and make a second pass
to copy only the files that changed, deny access to the source pool (but keep
the data for a while) and open service access again. After some time, if there
are no data loss issues, the old pool can be deleted. I think it's
the only way to guarantee data integrity.

One of the big service integrators was migrating a few months ago from a
traditional proprietary storage solution to Ceph RADOS (they offer
S3-compatible storage services). They used the following migration path:
1. They wrote a special script for this purpose.
2. The script copied all the data from the old storage to RADOS and put a
record into the database for each file with size, timestamp, owner, permissions
and so on. In case of a successful migration the app wrote a "migrated" status to the
db; in case of failure, an error, and that file would be migrated during the
next app run. The migration process took around two weeks because they throttled the
speed to prevent service performance disruption.
3. After that they ran an MD5 hash comparison to verify data integrity.
4. In the end they put the service into maintenance mode for a few hours,
copied and checked all changes made during the migration time and finally
opened access to the new cluster.

Best regards,
Vladimir


>
> Regards,
> Burkhard
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


Re: [ceph-users] RBD Watch Notify for snapshots

2016-08-23 Thread Jason Dillaman
Looks good.  Since you are re-using the RBD header object to send the
watch notification, a running librbd client will most likely print out
an error message along the lines of "failed to decode the
notification" since you are sending "fsfreeze" / "fsunfreeze" as the
payload, but it would be harmless.

On Mon, Aug 22, 2016 at 9:13 AM, Nick Fisk  wrote:
> Hi Jason,
>
> Here is my initial attempt at using the Watch/Notify support to be able to 
> remotely fsfreeze a filesystem on a RBD. Please note this
> was all very new to me and so there will probably be a lot of things that 
> haven't been done in the best way.
>
> https://github.com/fiskn/rbd_freeze
>
> I'm not sure if calling out to bash scripts is the best way of doing the 
> fsfreezing, but it was the easiest way I could think to
> accomplish the task. And it also allowed me to fairly easily run extra checks 
> like seeing if any files have been updated recently.
>
> Let me know what you think.
>
> Nick
>
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
>> Nick Fisk
>> Sent: 08 July 2016 09:58
>> To: dilla...@redhat.com
>> Cc: 'ceph-users' 
>> Subject: Re: [ceph-users] RBD Watch Notify for snapshots
>>
>> Thanks Jason,
>>
>> I think I'm going to start with a bash script which SSH's into the machine 
>> to check if the process has finished writing and then
> calls the
>> fsfreeze as I've got time constraints to getting this working. But I will 
>> definitely revisit this and see if there is something I
> can create
>> which will do as you have described, as it would be a much neater solution.
>>
>> Nick
>>
>> > -Original Message-
>> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>> > Of Jason Dillaman
>> > Sent: 08 July 2016 04:02
>> > To: n...@fisk.me.uk
>> > Cc: ceph-users 
>> > Subject: Re: [ceph-users] RBD Watch Notify for snapshots
>> >
>> > librbd pseudo-automatically handles this by flushing the cache to the
>> > snapshot when a new snapshot is created, but I don't think krbd does the 
>> > same. If it doesn't, it would probably be a nice
> addition to
>> the block driver to support the general case.
>> >
>> > Baring that (or if you want to involve something like fsfreeze), I
>> > think the answer depends on how much you are willing to write some
>> > custom C/C++ code (I don't think the rados python library exposes
>> > watch/notify APIs). A daemon could register a watch on a custom 
>> > per-host/image/etc object which would sync the disk when a
>> notification is received. Prior to creating a snapshot, you would need to 
>> send a notification to this object to alert the daemon
> to
>> sync/fsfreeze/etc.
>> >
>> > On Thu, Jul 7, 2016 at 12:33 PM, Nick Fisk  wrote:
>> > Hi All,
>> >
>> > I have a RBD mounted to a machine via the kernel client and I wish to
>> > be able to take a snapshot and mount it to another machine where it can be 
>> > backed up.
>> >
>> > The big issue is that I need to make sure that the process writing on
>> > the source machine is finished and the FS is sync'd before taking the 
>> > snapshot.
>> >
>> > My question. Is there something I can do with Watch/Notify to trigger
>> > this checking/sync process on the source machine before the snapshot is 
>> > actually taken?
>> >
>> > Thanks,
>> > Nick
>> >
>> > ___
>> > ceph-users mailing list
>> > mailto:ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >
>> >
>> >
>> > --
>> > Jason
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Jason


Re: [ceph-users] BUG ON librbd or libc

2016-08-23 Thread Jason Dillaman
There was almost the exact same issue on the master branch right after
the switch to cmake, because tcmalloc was incorrectly (and partially)
linked into librados/librbd. What occurred was that the std::list
of ceph::buffer::ptr was allocated via tcmalloc but was freed
within librados/librbd via the standard malloc.
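(A quick way to check for that kind of mixed linkage; a sketch, with paths assuming a
typical CentOS 7 install:)

ldd /usr/lib64/librbd.so.1 | grep -i tcmalloc
ldd /usr/libexec/qemu-kvm | grep -i tcmalloc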

On Tue, Aug 23, 2016 at 3:45 AM, Ning Yao  wrote:
> Hi, all
>
> Our vm is terminated unexpectedly when using librbd in our production
> environment with CentOS 7.0 kernel 3.12 with Ceph version 0.94.5 and
> glibc version 2.17. we get log from libvirtd as below
>
> *** Error in `/usr/libexec/qemu-kvm': invalid fastbin entry (free):
> 0x7f7db7eed740 ***
>
> === Backtrace: =
>
> /lib64/libc.so.6(+0x7d1fd)[0x7f8520fe61fd]
>
> /lib64/librbd.so.1(_ZNSt10_List_baseIN4ceph6buffer3ptrESaIS2_EE8_M_clearEv+0x2f)[0x7f8527589b3f]
>
> /lib64/librados.so.2(+0xb8a06)[0x7f8525080a06]
>
> /lib64/librados.so.2(+0xb9457)[0x7f8525081457]
>
> /lib64/librados.so.2(+0x718f7)[0x7f85250398f7]
>
> /lib64/librados.so.2(+0x2bffe5)[0x7f8525287fe5]
>
> /lib64/librados.so.2(+0x2cbaad)[0x7f8525293aad]
>
> /lib64/libpthread.so.0(+0x7df5)[0x7f852ab12df5]
>
> /lib64/libc.so.6(clone+0x6d)[0x7f852105f1ad]
>
> but the log does not show symbols so that we cannot know which
> functions cause the error, any suggestions?
>
>
>
> Regards
> Ning Yao
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Jason


[ceph-users] Very slow S3 sync with big number of object.

2016-08-23 Thread jan hugo prins
Hi,

I'm testing S3 and I created a test where I sync a big part of my
home directory, about 4GB of data in a lot of small objects, towards an S3
bucket.
The first part of the sync was very fast, but after some time it became a
lot slower.

What I basically see is this for every file:

The file gets transferred.
The S3 gateway returns a 200 for the transfer.
s3cmd seems to think that something failed (probably the acl?).
It starts a new transfer of the same file with ?acl appended to the call and
inserts a 3 second delay for this transfer.

With 40,000 files this really takes a very long time.

I have created debug logfiles for this but they are really big and I
probably need to filter out the parts related to one set of calls to
make it usable.
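(A minimal sketch of that kind of filtering, assuming the radosgw log format seen in
the "Signature V2" thread below, where every line of a request carries a "req <id>:"
tag; the request id and log path here are just placeholders:)

grep 'req 624:' /var/log/ceph/client.rgw.log > one-request.log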

The user that I use to do the transfer has full control of the bucket.

Has anyone seen this before?

-- 
Met vriendelijke groet / Best regards,

Jan Hugo Prins
Infra and Isilon storage consultant

Better.be B.V.
Auke Vleerstraat 140 E | 7547 AN Enschede | KvK 08097527
T +31 (0) 53 48 00 694 | M +31 (0)6 26 358 951
jpr...@betterbe.com | www.betterbe.com

This e-mail is intended exclusively for the addressee(s), and may not
be passed on to, or made available for use by any person other than 
the addressee(s). Better.be B.V. rules out any and every liability 
resulting from any electronic transmission.



Re: [ceph-users] Signature V2

2016-08-23 Thread jan hugo prins
Hi,

I already created a ticket for this issue.
http://tracker.ceph.com/issues/17076

The complete logfile should be in this ticket.

Jan Hugo


Jan Hugo Prins


On 08/22/2016 10:36 PM, Gregory Farnum wrote:
> On Thu, Aug 18, 2016 at 11:42 AM, jan hugo prins  wrote:
>> I have been able to reproduce the error and create a debug log from the
>> failure.
>> I can't post the debug log here because there is sensitive information
>> in the debug log like access keys etc.
>> Where can I send this log for analysis? And who is able to have a look
>> at this?
> I can't do anything useful with this, but you can upload the log with
> ceph-post-file and it will only be accessible to upstream Ceph devs.
> Then create a ticket under the RGW project at tracker.ceph.com
> pointing to the files (ceph-post-file will give you a UUID).
> -Greg
>
>> A small part of the debug log without stripped of sensitive information:
>>
>> 2016-08-18 17:26:33.864658 7ff155ffb700 10 -
>> Verifying signatures
>> 2016-08-18 17:26:33.864659 7ff155ffb700 10 Signature =
>> abbeb6af798b2aad58cd398491698f863253f3859d22b4c9558cc808159d256d
>> 2016-08-18 17:26:33.864660 7ff155ffb700 10 New Signature =
>> e13d83bcd1f52103e9056add844e0037accb71436faee1a3e0048dd6c25cd4b6
>> 2016-08-18 17:26:33.864661 7ff155ffb700 10 -
>> 2016-08-18 17:26:33.864664 7ff155ffb700 20 delayed aws4 auth failed
>> 2016-08-18 17:26:33.864674 7ff155ffb700  2 req 624:0.000642:s3:PUT
>> /Photos/Options/x/180x102.jpg:put_obj:completing
>> 2016-08-18 17:26:33.864749 7ff155ffb700  2 req 624:0.000717:s3:PUT
>> /Photos/Options/x/180x102.jpg:put_obj:op status=-2027
>> 2016-08-18 17:26:33.864757 7ff155ffb700  2 req 624:0.000726:s3:PUT
>> /Photos/Options/x/180x102.jpg:put_obj:http status=403
>> 2016-08-18 17:26:33.864762 7ff155ffb700  1 == req done
>> req=0x7ff155ff5710 op status=-2027 http_status=403 ==
>> 2016-08-18 17:26:33.864776 7ff155ffb700 20 process_request() returned -2027
>> 2016-08-18 17:26:33.864801 7ff155ffb700  1 civetweb: 0x7ff1f8003e80:
>> 192.168.2.59 - - [18/Aug/2016:17:26:33 +0200] "PUT
>> /Photos/Options/x/180x102.jpg HTTP/1.1" 403 0 - -
>>
>>
>> Jan Hugo Prins
>>
>>
>> On 08/18/2016 01:32 PM, jan hugo prins wrote:
>>> did some more searching and according to some info I found RGW should
>>> support V4 signatures.
>>>
>>> http://tracker.ceph.com/issues/10333
>>> http://tracker.ceph.com/issues/11858
>>>
>>> The fact that everyone still modifies s3cmd to use Version 2 Signatures
>>> suggests to me that we have a bug in this code.
>>>
>>> If I use V4 signatures most of my requests work fine, but some requests
>>> fail on a signature error.
>>>
>>> Thanks,
>>> Jan Hugo Prins
>>>
>>>
>>> On 08/18/2016 12:46 PM, jan hugo prins wrote:
 Hi everyone.

 To connect to my S3 gateways using s3cmd I had to set the option
 signature_v2 in my s3cfg to true.
 If I didn't do that I would get Signature mismatch errors and this seems
 to be because Amazon uses Signature version 4 while the S3 gateway of
 Ceph only supports Signature Version 2.

 Now I see the following error in a Jave project we are building that
 should talk to S3.

 Aug 18, 2016 12:12:38 PM org.apache.catalina.core.StandardWrapperValve
 invoke
 SEVERE: Servlet.service() for servlet [Default] in context with path
 [/VehicleData] threw exception
 com.betterbe.vd.web.servlet.LsExceptionWrapper: xxx
 caused: com.amazonaws.services.s3.model.AmazonS3Exception: null
 (Service: Amazon S3; Status Code: 400; Error Code:
 XAmzContentSHA256Mismatch; Request ID:
 tx02cc6-0057b58a15-25bba-default), S3 Extended Request
 ID: 25bba-default-default
 at
 com.betterbe.vd.web.dataset.requesthandler.DatasetRequestHandler.handle(DatasetRequestHandler.java:262)
 at com.betterbe.vd.web.servlet.Servlet.handler(Servlet.java:141)
 at com.betterbe.vd.web.servlet.Servlet.doPost(Servlet.java:110)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:646)

 To me this looks a bit the same, though I'm not a Java developer.
 Am I correct, and if so, can I tell the Java S3 client to use Version 2
 signatures?


>> --
>> Met vriendelijke groet / Best regards,
>>
>> Jan Hugo Prins
>> Infra and Isilon storage consultant
>>
>> Better.be B.V.
>> Auke Vleerstraat 140 E | 7547 AN Enschede | KvK 08097527
>> T +31 (0) 53 48 00 694 | M +31 (0)6 26 358 951
>> jpr...@betterbe.com | www.betterbe.com
>>
>> This e-mail is intended exclusively for the addressee(s), and may not
>> be passed on to, or made available for use by any person other than
>> the addressee(s). Better.be B.V. rules out any and every liability
>> resulting from any electronic transmission.
>>
>>
>>

[ceph-users] rbd-nbd: list-mapped : is it possible to display association between rbd volume and nbd device ?

2016-08-23 Thread Alexandre DERUMIER
Hi, I'm currently testing rbd-nbd, to use it for lxc instead of krbd (to 
support new rbd features). 


#rbd-nbd  map pool/testimage 
/dev/nbd0 
#rbd-nbd list-mapped 
/dev/nbd0 


Is it possible to implement something like 

#rbd-nbd list-mapped 
/dev/nbd0 pool/testimage


Regards,

Alexandre


Re: [ceph-users] BUG ON librbd or libc

2016-08-23 Thread Brad Hubbard
On Tue, Aug 23, 2016 at 03:45:58PM +0800, Ning Yao wrote:
> Hi, all
> 
> Our vm is terminated unexpectedly when using librbd in our production
> environment with CentOS 7.0 kernel 3.12 with Ceph version 0.94.5 and
> glibc version 2.17. we get log from libvirtd as below
> 
> *** Error in `/usr/libexec/qemu-kvm': invalid fastbin entry (free):

This means the malloc (free) code detected a memory corruption or accounting
issue. This is unlikely to be an issue in glibc.

> 0x7f7db7eed740 ***
> 
> === Backtrace: =
> 
> /lib64/libc.so.6(+0x7d1fd)[0x7f8520fe61fd]
> 
> /lib64/librbd.so.1(_ZNSt10_List_baseIN4ceph6buffer3ptrESaIS2_EE8_M_clearEv+0x2f)[0x7f8527589b3f]

$ c++filt _ZNSt10_List_baseIN4ceph6buffer3ptrESaIS2_EE8_M_clearEv
std::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >::_M_clear()

The std::_List_base _M_clear() function likely calls free to free some memory
allocated by the list and this is triggering the issue.

I can't find a similar tracker so we'll likely need more information to pin this
down.

> 
> /lib64/librados.so.2(+0xb8a06)[0x7f8525080a06]
> 
> /lib64/librados.so.2(+0xb9457)[0x7f8525081457]
> 
> /lib64/librados.so.2(+0x718f7)[0x7f85250398f7]
> 
> /lib64/librados.so.2(+0x2bffe5)[0x7f8525287fe5]
> 
> /lib64/librados.so.2(+0x2cbaad)[0x7f8525293aad]
> 
> /lib64/libpthread.so.0(+0x7df5)[0x7f852ab12df5]
> 
> /lib64/libc.so.6(clone+0x6d)[0x7f852105f1ad]
> 
> but the log does not show symbols so that we cannot know which
> functions cause the error, any suggestions?

Install the debuginfo packages for ceph (includes debuginfo for librados and
librbd) and glibc ideally.
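(On CentOS 7 that would typically be something along these lines; a sketch, and the
exact package names can differ:)

debuginfo-install -y ceph librados2 librbd1 glibc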

To gather debug logging for rbd you should add something like the following to
ceph.conf on the qemu-kvm host.

[client] # Can also be global since it is inherited
debug ms = 1
debug rbd = 20
debug objectcacher = 20
debug objecter = 20
log file = /var/log/ceph/rbd.log

Then run the following commands.

# touch /var/log/ceph/rbd.log
# chmod 777 /var/log/ceph/rbd.log

Then reproduce the issue if possible and create a tracker, upload the debug log
and the stack trace and let us know here.

-- 
HTH,
Brad

> 
> 
> 
> Regards
> Ning Yao
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ceph-users] Recommended hardware for MDS server

2016-08-23 Thread Burkhard Linke

Hi,


On 08/22/2016 07:27 PM, Wido den Hollander wrote:

On 22 August 2016 at 15:52, Christian Balzer wrote:



Hello,

first off, not a CephFS user, just installed it on a lab setup for fun.
That being said, I tend to read most posts here.

And I do remember participating in similar discussions.

On Mon, 22 Aug 2016 14:47:38 +0200 Burkhard Linke wrote:


Hi,

we are running CephFS with about 70TB data, > 5 million files and about
100 clients. The MDS is currently colocated on a storage box with 14 OSD
(12 HDD, 2SSD). The box has two E52680v3 CPUs and 128 GB RAM. CephFS
runs fine, but it feels like the metadata operations may need more speed.


Firstly, I wouldn't share the MDS with a storage/OSD node; a MON
node would make a more "natural" co-location spot.

Indeed. I always try to avoid to co-locate anything with the OSDs.
The MONs are also colocated with other OSD hosts, but this is also 
subject to change in the near future.



That being said, CPU wise that machine feels vastly overpowered, don't see
more than half of the cores ever utilized for OSD purposes, even in the
most contrived test cases.

Have you monitored that node with something like atop to get a feel what
tasks are using how much (of a specific) CPU?


Excerpt of MDS perf dump:
"mds": {
  "request": 73389282,
  "reply": 73389282,
  "reply_latency": {
  "avgcount": 73389282,
  "sum": 259696.749971457
  },
  "forward": 0,
  "dir_fetch": 4094842,
  "dir_commit": 720085,
  "dir_split": 0,
  "inode_max": 500,
  "inodes": 565,
  "inodes_top": 320979,
  "inodes_bottom": 530518,
  "inodes_pin_tail": 4148568,
  "inodes_pinned": 4469666,
  "inodes_expired": 60001276,
  "inodes_with_caps": 4468714,
  "caps": 4850520,
  "subtrees": 2,
  "traverse": 92378836,
  "traverse_hit": 75743822,
  "traverse_forward": 0,
  "traverse_discover": 0,
  "traverse_dir_fetch": 1719440,
  "traverse_remote_ino": 33,
  "traverse_lock": 3952,
  "load_cent": 7339063064,
  "q": 0,
  "exported": 0,
  "exported_inodes": 0,
  "imported": 0,
  "imported_inodes": 0
  },

The setup is expected grow, with regards to the amount of stored data
and the number of clients. The MDS process currently consumes about 36
TB RAM, with 22 TB resident. Since a large part of the MDS runs single
threaded, a CPU with fewer cores and a higher CPU frequency might be a better
choice in this setup.


I suppose you mean GB up there. ^o^

If memory serves me well, there are knobs to control MDS memory usage, so
tuning them upwards may help.


mds_cache_size you mean probably. That's the amount of inodes the MDS will 
cache at max.

Keep in mind, a single inode uses about 4k of memory. So the default of 100k 
will consume 400MB of memory.

You can increase this to 16.777.216 so it will use about 64GB at max. I would 
still advise to put 128GB of memory in that machine since the MDS might have a 
leak at some points and you want to give it some headroom.

Source: http://docs.ceph.com/docs/master/cephfs/mds-config-ref/
mds_cache_size is already set to 5.000.000 and will need to be changed 
again since there are already cache pressure messages in the ceph logs. 
128GB RAM will definitely be a good idea.
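(A minimal sketch of bumping that value, assuming an MDS named "mds.a"; the name is a
placeholder, and 16777216 inodes at roughly 4k each is where the ~64GB figure above
comes from:)

ceph tell mds.a injectargs '--mds_cache_size 16777216'
(and persistently, "mds cache size = 16777216" in the [mds] section of ceph.conf)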



And yes to the less cores, more speed rationale. Up to a point of course.

Indeed. Faster single-core E5 is better for the MDS than a slower multi-core.

So I'll have a closer look at configurations with E5-1XXX.



Again, checking with atop should give you a better insight there.

Also up there you said metadata stuff feels sluggish, have you considered
moving that pool to SSDs?

I recall from recent benchmarks that there was no benefit in having the 
metadata on SSD. Sure, it might help a bit with maybe a journal replay, but I 
think that regular disks with a proper journal do just fine.
Most of the metadata is read by the MDS upon start and cached in memory 
(that's why the process consumes several GB of RAM...). Given a suitable 
cache size, only journal updates should result in I/O to the metadata 
pool; client requests should be served from memory.


Thanks for hints, I'll go for a single socket setup with a E5-1XXX and 
128GB RAM.


Regards,
Burkhard


[ceph-users] Ceph Day Munich - 23 Sep 2016

2016-08-23 Thread Patrick McGarry
Hey cephers,

We now finally have a date and location confirmed for Ceph Day Munich
in September:

http://ceph.com/cephdays/ceph-day-munich/

If you are interested in being a speaker please send me the following:

1) Speaker Name
2) Speaker Org
3) Talk Title
4) Talk abstract

I will be accepting speakers on a first-come first-served basis. Thanks!


-- 

Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com  ||  http://community.redhat.com
@scuttlemonkey || @ceph


[ceph-users] BUG ON librbd or libc

2016-08-23 Thread Ning Yao
Hi, all

Our VM terminated unexpectedly when using librbd in our production
environment (CentOS 7.0, kernel 3.12, Ceph version 0.94.5,
glibc version 2.17). We get the following log from libvirtd:

*** Error in `/usr/libexec/qemu-kvm': invalid fastbin entry (free):
0x7f7db7eed740 ***

=== Backtrace: =

/lib64/libc.so.6(+0x7d1fd)[0x7f8520fe61fd]

/lib64/librbd.so.1(_ZNSt10_List_baseIN4ceph6buffer3ptrESaIS2_EE8_M_clearEv+0x2f)[0x7f8527589b3f]

/lib64/librados.so.2(+0xb8a06)[0x7f8525080a06]

/lib64/librados.so.2(+0xb9457)[0x7f8525081457]

/lib64/librados.so.2(+0x718f7)[0x7f85250398f7]

/lib64/librados.so.2(+0x2bffe5)[0x7f8525287fe5]

/lib64/librados.so.2(+0x2cbaad)[0x7f8525293aad]

/lib64/libpthread.so.0(+0x7df5)[0x7f852ab12df5]

/lib64/libc.so.6(clone+0x6d)[0x7f852105f1ad]

but the log does not show symbols, so we cannot tell which
functions caused the error. Any suggestions?



Regards
Ning Yao


[ceph-users] Re: BlueStore write amplification

2016-08-23 Thread Zhiyuan Wang
Hi,
Only one node, and only one NVMe SSD; the SSD has 12 partitions, every three
for one OSD.
And fio is 4k randwrite with iodepth 128.
No snapshots.
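(A minimal sketch of that measurement, assuming fio's rbd engine against a test image
named "bench" in pool "rbd"; the image, pool and device names are placeholders, since
the original post doesn't show the exact fio job:)

fio --name=randwrite --ioengine=rbd --pool=rbd --rbdname=bench --rw=randwrite --bs=4k --iodepth=128 --direct=1 --runtime=60 --time_based
iostat -xm 1 nvme0n1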

Thanks

From: Jan Schermer [mailto:j...@schermer.cz]
Sent: 23 August 2016 14:52
To: Zhiyuan Wang 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] BlueStore write amplification

Is that 400MB on all nodes or on each node? If it's on all nodes then 10:1 is 
not that surprising.
What was the block size in your fio benchmark?
We had much higher amplification on our cluster with snapshots and stuff...

Jan

On 23 Aug 2016, at 08:38, Zhiyuan Wang 
> wrote:

Hi
I have tested BlueStore on SSD, and I found that the BW from fio is about 40MB/s, 
but
the write BW from iostat of the SSD is about 400MB/s, nearly ten times as much.
Could someone help explain this?
Thanks a lot.

Below are my configuration file:
[global]
fsid = 31e77e3c-447c-4745-a91a-58bda80a868c
enable experimental unrecoverable data corrupting features = bluestore 
rocksdb
osd objectstore = bluestore

bluestore default buffered read = true
bluestore_min_alloc_size=4096
osd pool default size = 1

osd pg bits = 8
osd pgp bits = 8
auth supported = none
log to syslog = false
filestore xattr use omap = true
auth cluster required = none
auth service required = none
auth client required = none

public network = 192.168.200.233/24
cluster network = 192.168.100.233/24

mon initial members = node3
mon host = 192.168.200.233
mon data = /etc/ceph/mon.node3

filestore merge threshold = 40
filestore split multiple = 8
osd op threads = 8

debug_bluefs = "0/0"
debug_bluestore = "0/0"
debug_bdev = "0/0"
debug_lockdep = "0/0"
debug_context = "0/0"
debug_crush = "0/0"
debug_mds = "0/0"
debug_mds_balancer = "0/0"
debug_mds_locker = "0/0"
debug_mds_log = "0/0"
debug_mds_log_expire = "0/0"
debug_mds_migrator = "0/0"
debug_buffer = "0/0"
debug_timer = "0/0"
debug_filer = "0/0"
debug_objecter = "0/0"
debug_rados = "0/0"
debug_rbd = "0/0"
debug_journaler = "0/0"
debug_objectcacher = "0/0"
debug_client = "0/0"
debug_osd = "0/0"
debug_optracker = "0/0"
debug_objclass = "0/0"
debug_filestore = "0/0"
debug_journal = "0/0"
debug_ms = "0/0"
debug_mon = "0/0"
debug_monc = "0/0"
debug_paxos = "0/0"
debug_tp = "0/0"
debug_auth = "0/0"
debug_finisher = "0/0"
debug_heartbeatmap = "0/0"
debug_perfcounter = "0/0"
debug_rgw = "0/0"
debug_hadoop = "0/0"
debug_asok = "0/0"
debug_throttle = "0/0"

[osd.0]
host = node3
osd data = /etc/ceph/osd-device-0-data
bluestore block path = /dev/disk/by-partlabel/osd-device-0-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-0-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-0-wal

[osd.1]
host = node3
osd data = /etc/ceph/osd-device-1-data
bluestore block path = /dev/disk/by-partlabel/osd-device-1-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-1-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-1-wal
[osd.2]
host = node3
osd data = /etc/ceph/osd-device-2-data
bluestore block path = /dev/disk/by-partlabel/osd-device-2-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-2-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-2-wal


[osd.3]
host = node3
osd data = /etc/ceph/osd-device-3-data
bluestore block path = /dev/disk/by-partlabel/osd-device-3-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-3-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-3-wal
Email Disclaimer & Confidentiality Notice
This message is confidential and intended solely for the use of the recipient 
to whom they are addressed. If you are not the intended recipient you should 
not deliver, distribute or copy this e-mail. Please notify the sender 
immediately by e-mail and delete this e-mail from your system. Copyright © 2016 
by Istuary Innovation Labs, Inc. All rights reserved.



[ceph-users] Fwd: Re: Merging CephFS data pools

2016-08-23 Thread Burkhard Linke


Missing CC to list



-------- Forwarded Message --------
Subject: Re: [ceph-users] Merging CephFS data pools
Date:    Tue, 23 Aug 2016 08:59:45 +0200
From:    Burkhard Linke
To:      Gregory Farnum



Hi,


On 08/22/2016 10:02 PM, Gregory Farnum wrote:

On Thu, Aug 18, 2016 at 12:21 AM, Burkhard Linke
 wrote:

Hi,

the current setup for CephFS at our site uses two data pools due to
different requirements in the past. I want to merge these two pools now,
eliminating the second pool completely.

I've written a small script to locate all files on the second pool using
their file layout attributes and replace them with a copy on the correct
pool. This works well for files, but modifies the timestamps of the
directories.
Do you have any idea for a better solution that does not modify timestamps
and plays well with active CephFS clients (e.g. no problem with files being
used)? A simple 'rados cppool' probably does not work since the pool id/name
is part of a file's metadata and client will not be aware of moved
files.

Can't you just use rsync or something that will set the timestamps itself?

The script is using 'cp -a', which also preserves the timestamps. So
file timestamps are ok, but directory timestamps get updated by cp and
mv. And that's ok from my point of view.
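(A minimal sketch of that kind of relocation loop, not the actual script, assuming the
old pool is named "olddata" and the filesystem is mounted at /cephfs:)

find /cephfs -type f | while IFS= read -r f; do
    pool=$(getfattr -n ceph.file.layout.pool --only-values "$f" 2>/dev/null)
    if [ "$pool" = "olddata" ]; then
        # rewriting the file places its data on the pool of the directory's current layout
        cp -a "$f" "$f.migrating" && mv "$f.migrating" "$f"
    fi
done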

The main concern is data integrity. There are 20TB left to be
transferred from the old pool, and part of this data is currently in
active use (including being overwritten in place). If write access to an
opened file happens while it is being transfered, the changes to that
file might be lost.

We can coordinate the remaining transfers with the affected users, if no
other way exists.

Regards,
Burkhard



Re: [ceph-users] BlueStore write amplification

2016-08-23 Thread Jan Schermer
Is that 400MB on all nodes or on each node? If it's on all nodes then 10:1 is 
not that surprising.
What was the block size in your fio benchmark?
We had much higher amplification on our cluster with snapshots and stuff...

Jan

> On 23 Aug 2016, at 08:38, Zhiyuan Wang  wrote:
> 
> Hi 
> I have tested BlueStore on SSD, and I found that the BW from fio is about 40MB/s, 
> but
> the write BW from iostat of the SSD is about 400MB/s, nearly ten times as much.
> Could someone help explain this? 
> Thanks a lot.
>  
> Below are my configuration file:
> [global]
> fsid = 31e77e3c-447c-4745-a91a-58bda80a868c
> enable experimental unrecoverable data corrupting features = 
> bluestore rocksdb
> osd objectstore = bluestore
>  
> bluestore default buffered read = true
> bluestore_min_alloc_size=4096
> osd pool default size = 1
>  
> osd pg bits = 8
> osd pgp bits = 8
> auth supported = none
> log to syslog = false
> filestore xattr use omap = true
> auth cluster required = none
> auth service required = none
> auth client required = none
>  
> public network = 192.168.200.233/24
> cluster network = 192.168.100.233/24
>  
> mon initial members = node3
> mon host = 192.168.200.233
> mon data = /etc/ceph/mon.node3
>   
> filestore merge threshold = 40
> filestore split multiple = 8
> osd op threads = 8
>  
> debug_bluefs = "0/0"
> debug_bluestore = "0/0"
> debug_bdev = "0/0" 
> debug_lockdep = "0/0" 
> debug_context = "0/0"  
> debug_crush = "0/0"
> debug_mds = "0/0"
> debug_mds_balancer = "0/0"
> debug_mds_locker = "0/0"
> debug_mds_log = "0/0"
> debug_mds_log_expire = "0/0"
> debug_mds_migrator = "0/0"
> debug_buffer = "0/0"
> debug_timer = "0/0"
> debug_filer = "0/0"
> debug_objecter = "0/0"
> debug_rados = "0/0"
> debug_rbd = "0/0"
> debug_journaler = "0/0"
> debug_objectcacher = "0/0"
> debug_client = "0/0"
> debug_osd = "0/0"
> debug_optracker = "0/0"
> debug_objclass = "0/0"
> debug_filestore = "0/0"
> debug_journal = "0/0"
> debug_ms = "0/0"
> debug_mon = "0/0"
> debug_monc = "0/0"
> debug_paxos = "0/0"
> debug_tp = "0/0"
> debug_auth = "0/0"
> debug_finisher = "0/0"
> debug_heartbeatmap = "0/0"
> debug_perfcounter = "0/0"
> debug_rgw = "0/0"
> debug_hadoop = "0/0"
> debug_asok = "0/0"
> debug_throttle = "0/0"
>  
> [osd.0]
> host = node3
> osd data = /etc/ceph/osd-device-0-data
> bluestore block path = /dev/disk/by-partlabel/osd-device-0-block
> bluestore block db path = /dev/disk/by-partlabel/osd-device-0-db
> bluestore block wal path = /dev/disk/by-partlabel/osd-device-0-wal
>  
> [osd.1]
> host = node3
> osd data = /etc/ceph/osd-device-1-data
> bluestore block path = /dev/disk/by-partlabel/osd-device-1-block
> bluestore block db path = /dev/disk/by-partlabel/osd-device-1-db
> bluestore block wal path = /dev/disk/by-partlabel/osd-device-1-wal
> [osd.2]
> host = node3
> osd data = /etc/ceph/osd-device-2-data
> bluestore block path = /dev/disk/by-partlabel/osd-device-2-block
> bluestore block db path = /dev/disk/by-partlabel/osd-device-2-db
> bluestore block wal path = /dev/disk/by-partlabel/osd-device-2-wal
>  
>  
> [osd.3]
> host = node3
> osd data = /etc/ceph/osd-device-3-data
> bluestore block path = /dev/disk/by-partlabel/osd-device-3-block
> bluestore block db path = /dev/disk/by-partlabel/osd-device-3-db
> bluestore block wal path = /dev/disk/by-partlabel/osd-device-3-wal
>  
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore write amplification

2016-08-23 Thread Varada Kari
Hi,

You can refer to the thread "Odd WAL traffic for BlueStore" on the ceph-devel list for 
your questions.
This traffic is mostly observed on the WAL partition of BlueStore, which is used by 
RocksDB. That thread should give more insight into your questions.
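
To see where the extra writes are going, one rough approach (using the OSD id
and partition labels from your config as placeholders) is to watch the WAL,
DB and block partitions separately and compare with the OSD's own counters:

    # resolve the partitions behind the by-partlabel symlinks
    readlink -f /dev/disk/by-partlabel/osd-device-0-wal
    readlink -f /dev/disk/by-partlabel/osd-device-0-db
    readlink -f /dev/disk/by-partlabel/osd-device-0-block

    # watch them individually (substitute the resolved names; with older
    # sysstat you may need 'iostat -x -p <disk> 1' instead)
    iostat -x 1 sdb1 sdb2 sdb3

    # BlueStore/BlueFS perf counters for one OSD (counter names vary by release)
    ceph daemon osd.0 perf dump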

Varada

On Tuesday 23 August 2016 12:09 PM, Zhiyuan Wang wrote:
Hi
I have tested BlueStore on an SSD, and I found that the write bandwidth from fio
is about 40 MB/s, but
the write bandwidth from iostat for the SSD is about 400 MB/s, nearly ten times as much.
Could someone help explain this?
Thanks a lot.

Below is my configuration file:
[global]
fsid = 31e77e3c-447c-4745-a91a-58bda80a868c
enable experimental unrecoverable data corrupting features = bluestore 
rocksdb
osd objectstore = bluestore

bluestore default buffered read = true
bluestore_min_alloc_size=4096
osd pool default size = 1

osd pg bits = 8
osd pgp bits = 8
auth supported = none
log to syslog = false
filestore xattr use omap = true
auth cluster required = none
auth service required = none
auth client required = none

public network = 192.168.200.233/24
cluster network = 192.168.100.233/24

mon initial members = node3
mon host = 192.168.200.233
mon data = /etc/ceph/mon.node3

filestore merge threshold = 40
filestore split multiple = 8
osd op threads = 8

debug_bluefs = "0/0"
debug_bluestore = "0/0"
debug_bdev = "0/0"
debug_lockdep = "0/0"
debug_context = "0/0"
debug_crush = "0/0"
debug_mds = "0/0"
debug_mds_balancer = "0/0"
debug_mds_locker = "0/0"
debug_mds_log = "0/0"
debug_mds_log_expire = "0/0"
debug_mds_migrator = "0/0"
debug_buffer = "0/0"
debug_timer = "0/0"
debug_filer = "0/0"
debug_objecter = "0/0"
debug_rados = "0/0"
debug_rbd = "0/0"
debug_journaler = "0/0"
debug_objectcacher = "0/0"
debug_client = "0/0"
debug_osd = "0/0"
debug_optracker = "0/0"
debug_objclass = "0/0"
debug_filestore = "0/0"
debug_journal = "0/0"
debug_ms = "0/0"
debug_mon = "0/0"
debug_monc = "0/0"
debug_paxos = "0/0"
debug_tp = "0/0"
debug_auth = "0/0"
debug_finisher = "0/0"
debug_heartbeatmap = "0/0"
debug_perfcounter = "0/0"
debug_rgw = "0/0"
debug_hadoop = "0/0"
debug_asok = "0/0"
debug_throttle = "0/0"

[osd.0]
host = node3
osd data = /etc/ceph/osd-device-0-data
bluestore block path = /dev/disk/by-partlabel/osd-device-0-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-0-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-0-wal

[osd.1]
host = node3
osd data = /etc/ceph/osd-device-1-data
bluestore block path = /dev/disk/by-partlabel/osd-device-1-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-1-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-1-wal
[osd.2]
host = node3
osd data = /etc/ceph/osd-device-2-data
bluestore block path = /dev/disk/by-partlabel/osd-device-2-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-2-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-2-wal


[osd.3]
host = node3
osd data = /etc/ceph/osd-device-3-data
bluestore block path = /dev/disk/by-partlabel/osd-device-3-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-3-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-3-wal


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] BlueStore write amplification

2016-08-23 Thread Zhiyuan Wang
Hi
I have tested BlueStore on an SSD, and I found that the write bandwidth from fio
is about 40 MB/s, but
the write bandwidth from iostat for the SSD is about 400 MB/s, nearly ten times as much.
Could someone help explain this?
Thanks a lot.

Below is my configuration file:
[global]
fsid = 31e77e3c-447c-4745-a91a-58bda80a868c
enable experimental unrecoverable data corrupting features = bluestore 
rocksdb
osd objectstore = bluestore

bluestore default buffered read = true
bluestore_min_alloc_size=4096
osd pool default size = 1

osd pg bits = 8
osd pgp bits = 8
auth supported = none
log to syslog = false
filestore xattr use omap = true
auth cluster required = none
auth service required = none
auth client required = none

public network = 192.168.200.233/24
cluster network = 192.168.100.233/24

mon initial members = node3
mon host = 192.168.200.233
mon data = /etc/ceph/mon.node3

filestore merge threshold = 40
filestore split multiple = 8
osd op threads = 8

debug_bluefs = "0/0"
debug_bluestore = "0/0"
debug_bdev = "0/0"
debug_lockdep = "0/0"
debug_context = "0/0"
debug_crush = "0/0"
debug_mds = "0/0"
debug_mds_balancer = "0/0"
debug_mds_locker = "0/0"
debug_mds_log = "0/0"
debug_mds_log_expire = "0/0"
debug_mds_migrator = "0/0"
debug_buffer = "0/0"
debug_timer = "0/0"
debug_filer = "0/0"
debug_objecter = "0/0"
debug_rados = "0/0"
debug_rbd = "0/0"
debug_journaler = "0/0"
debug_objectcacher = "0/0"
debug_client = "0/0"
debug_osd = "0/0"
debug_optracker = "0/0"
debug_objclass = "0/0"
debug_filestore = "0/0"
debug_journal = "0/0"
debug_ms = "0/0"
debug_mon = "0/0"
debug_monc = "0/0"
debug_paxos = "0/0"
debug_tp = "0/0"
debug_auth = "0/0"
debug_finisher = "0/0"
debug_heartbeatmap = "0/0"
debug_perfcounter = "0/0"
debug_rgw = "0/0"
debug_hadoop = "0/0"
debug_asok = "0/0"
debug_throttle = "0/0"

[osd.0]
host = node3
osd data = /etc/ceph/osd-device-0-data
bluestore block path = /dev/disk/by-partlabel/osd-device-0-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-0-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-0-wal

[osd.1]
host = node3
osd data = /etc/ceph/osd-device-1-data
bluestore block path = /dev/disk/by-partlabel/osd-device-1-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-1-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-1-wal
[osd.2]
host = node3
osd data = /etc/ceph/osd-device-2-data
bluestore block path = /dev/disk/by-partlabel/osd-device-2-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-2-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-2-wal


[osd.3]
host = node3
osd data = /etc/ceph/osd-device-3-data
bluestore block path = /dev/disk/by-partlabel/osd-device-3-block
bluestore block db path = /dev/disk/by-partlabel/osd-device-3-db
bluestore block wal path = /dev/disk/by-partlabel/osd-device-3-wal

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com