[ceph-users] OSD latency inaccurate reports?

2015-07-13 Thread Kostis Fardelas
Hello,
I noticed that commit/apply latency reported using:
ceph pg dump -f json-pretty

is very different from the values reported when querying the OSD sockets.
What is your opinion? What are the targets that I should fetch metrics
from in order to be as precise as possible?
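
For reference, the two sources I am comparing are roughly these (osd.0 is
just an example):

  ceph pg dump -f json-pretty      # per-OSD commit/apply latency in the osd_stats section
  ceph daemon osd.0 perf dump      # filestore apply_latency / journal_latency counters

I assume the admin socket values are cumulative avgcount/sum counters since
daemon start, so they would need two samples and a delta to compare fairly -
maybe that already explains part of the difference?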
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Slow requests during ceph osd boot

2015-07-13 Thread Kostis Fardelas
Hello,
after rebooting a ceph node and the OSDs start booting and joining
the cluster, we experience slow requests that get resolved immediately
after the cluster recovers. It is important to note that before the node
reboot, we set the noout flag in order to prevent recovery - so there are
only degraded PGs while the OSDs are shut down - and let the cluster handle
the OSDs going down/up in the lightest way.

Is there any tunable we should consider in order to avoid service
degradation for our ceph clients?
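
The only candidates I can think of are the recovery/backfill throttles,
shown here only as a sketch (the values are examples, not recommendations):

  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  ceph tell osd.* injectargs '--osd-recovery-op-priority 1'

but I would like to hear what others actually use.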

Regards,
Kostis
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Firefly 0.80.10 ready to upgrade to?

2015-07-13 Thread Gregory Farnum
On Mon, Jul 13, 2015 at 11:25 AM, Kostis Fardelas dante1...@gmail.com wrote:
 Hello,
 it seems that new packages for firefly have been uploaded to repo.
 However, I can't find any details in Ceph Release notes. There is only
 one thread in ceph-devel [1], but it is not clear what this new
 version is about. Is it safe to upgrade from 0.80.9 to 0.80.10?

These packages got created and uploaded to the repository without
release notes. I'm not sure why but I believe they're safe to use.
Hopefully Sage and our release guys can resolve that soon as we've
gotten several queries on the subject. :)
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] All pgs with -> up [0] acting [0], new cluster installation

2015-07-13 Thread alberto ayllon
Maybe this can help to get the origin of the problem.

If I run  ceph pg dump, and the end of the response i get:


osdstat kbused kbavail kb hb in hb out
0 36688 5194908 5231596 [1,2,3,4,5,6,7,8] []
1 34004 5197592 5231596 [] []
2 34004 5197592 5231596 [1] []
3 34004 5197592 5231596 [0,1,2,4,5,6,7,8] []
4 34004 5197592 5231596 [1,2] []
5 34004 5197592 5231596 [1,2,4] []
6 34004 5197592 5231596 [0,1,2,3,4,5,7,8] []
7 34004 5197592 5231596 [1,2,4,5] []
8 34004 5197592 5231596 [1,2,4,5,7] []
 sum 308720 46775644 47084364


Please someone can help me?



2015-07-13 11:45 GMT+02:00 alberto ayllon albertoayllon...@gmail.com:

 Hello everybody and thanks for your help.

 Hello, I'm a newbie in CEPH; I'm trying to install a CEPH cluster for test
 purposes.

 I have just installed a CEPH cluster with three VMs (ubuntu 14.04); each
 one has one mon daemon and three OSDs, and each server has 3 disks.
 The cluster has only one pool (rbd) with pg_num and pgp_num = 280, and osd pool
 get rbd size returns 2.

 I made the cluster installation with ceph-deploy; the ceph version is 0.94.2

 I think cluster's OSDs are having peering problems, because if I run ceph
 status, it returns:

 # ceph status
 cluster d54a2216-b522-4744-a7cc-a2106e1281b6
  health HEALTH_WARN
 280 pgs degraded
 280 pgs stuck degraded
 280 pgs stuck unclean
 280 pgs stuck undersized
 280 pgs undersized
  monmap e3: 3 mons at {ceph01=
 172.16.70.158:6789/0,ceph02=172.16.70.159:6789/0,ceph03=172.16.70.160:6789/0
 }
 election epoch 38, quorum 0,1,2 ceph01,ceph02,ceph03
  osdmap e46: 9 osds: 9 up, 9 in
   pgmap v129: 280 pgs, 1 pools, 0 bytes data, 0 objects
 301 MB used, 45679 MB / 45980 MB avail
  280 active+undersized+degraded

 And for all pgs, the command ceph pg map X.yy returns something like:

 osdmap e46 pg 0.d7 (0.d7) -> up [0] acting [0]

 As I know the Acting Set and Up Set must have the same value, but as they
 are both just [0], there are no OSDs defined to
 store the pgs' replicas, and I think this is why all pgs are in
 active+undersized+degraded state.

 Has anyone any idea of what I have to do so that the Acting Set and Up Set
 reach correct values?


 Thanks a lot!


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] He8 drives

2015-07-13 Thread Blair Bethwaite
On 13 July 2015 at 21:36, Emmanuel Florac eflo...@intellique.com wrote:
 I've benchmarked it and found it has about exactly the same performance
 profile as the He6. Compared to the Seagate 6TB it draws much less
 power (almost half), and that's the main selling point IMO, with
 durability.

That's consistent with this other published review (which I found
after the storagereview one):
http://www.tomsitpro.com/articles/hgst-ultrastar-he8-8tb-hdd,2-921-8.html

So seems like a decent option for a capacity-first Ceph cluster.

-- 
Cheers,
~Blairo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] All pgs with -> up [0] acting [0], new cluster installation

2015-07-13 Thread Wido den Hollander


On 13-07-15 14:07, alberto ayllon wrote:
 On 13-07-15 13:12, alberto ayllon wrote:
 Maybe this can help to get the origin of the problem.
 
 If I run  ceph pg dump, and the end of the response i get:
 
 
 What does 'ceph osd tree' tell you?
 
 It seems there is something wrong with your CRUSHMap.
 
 Wido
 
 
 Thanks for your answer Wido.
 
 Here is the output of ceph osd tree;
 
 # ceph osd tree
 ID WEIGHT TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -1      0 root default
 -2      0     host ceph01
  0      0         osd.0        up  1.0  1.0
  3      0         osd.3        up  1.0  1.0
  6      0         osd.6        up  1.0  1.0
 -3      0     host ceph02
  1      0         osd.1        up  1.0  1.0
  4      0         osd.4        up  1.0  1.0
  7      0         osd.7        up  1.0  1.0
 -4      0     host ceph03
  2      0         osd.2        up  1.0  1.0
  5      0         osd.5        up  1.0  1.0
  8      0         osd.8        up  1.0  1.0
 
 

The weights of all the OSDs are zero (0). How big are the disks? I
think they are very tiny, e.g. 10GB?

You probably want a bit bigger disks to test with.

Or set the weight manually of each OSD:

$ ceph osd crush reweight osd.X 1
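
For example, for these nine equally-sized test OSDs (just a sketch; by
convention the weight is the disk size in TB, but any identical non-zero
value will do for a test cluster):

$ for i in $(seq 0 8); do ceph osd crush reweight osd.$i 1; done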

Wido

 
 osdstat kbused  kbavail  kb       hb in              hb out
 0       36688   5194908  5231596  [1,2,3,4,5,6,7,8]  []
 1       34004   5197592  5231596  []                 []
 2       34004   5197592  5231596  [1]                []
 3       34004   5197592  5231596  [0,1,2,4,5,6,7,8]  []
 4       34004   5197592  5231596  [1,2]              []
 5       34004   5197592  5231596  [1,2,4]            []
 6       34004   5197592  5231596  [0,1,2,3,4,5,7,8]  []
 7       34004   5197592  5231596  [1,2,4,5]          []
 8       34004   5197592  5231596  [1,2,4,5,7]        []
  sum    308720  46775644 47084364
 
 
 Please someone can help me?
 
 
 
 2015-07-13 11:45 GMT+02:00 alberto ayllon albertoayllonces at gmail.com:
 
 Hello everybody and thanks for your help.
 
 Hello, I'm a newbie in CEPH, I'm trying to install a CEPH cluster with
 test purpose.
 
 I had just installed a CEPH cluster with three VMs (ubuntu 14.04),
 each one has one mon daemon and three OSDs, also each server has 3
 disk.
 Cluster has only one pool (rbd) with pg and pgp_num = 280, and osd
 pool get rbd size = 2.
 
 I made cluster's installation with  ceph-deploy, ceph version is
 0.94.2
 
 I think cluster's OSDs are having peering problems, because if I run
 ceph status, it returns:
 
 # ceph status
 cluster d54a2216-b522-4744-a7cc-a2106e1281b6
  health HEALTH_WARN
 280 pgs degraded
 280 pgs stuck degraded
 280 pgs stuck unclean
 280 pgs stuck undersized
 280 pgs undersized
  monmap e3: 3 mons at
 {ceph01=172.16.70.158:6789/0,ceph02=172.16.70.159:6789/0,ceph03=172.16.70.160:6789/0}
 election epoch 38, quorum 0,1,2 ceph01,ceph02,ceph03
  osdmap e46: 9 osds: 9 up, 9 in
   pgmap v129: 280 pgs, 1 pools, 0 bytes data, 0 objects
 301 MB used, 45679 MB / 45980 MB avail
  280 active+undersized+degraded
 
 And for all pgs, the command ceph pg map X.yy returns something like:
 
 osdmap e46 pg 0.d7 (0.d7) -> up [0] acting [0]
 
 As I know Acting Set and Up Set must have the same value, but as
 they are equal to 0, there are not defined OSDs to
 stores pgs replicas, and I think this is why all pg are in
 active+undersized+degraded state.
 
 Has anyone any idea of what I have to do for  Active Set and Up
 Set reaches correct values.
 
 
 Thanks a lot!
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users at lists.ceph.com http://lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] All pgs with -> up [0] acting [0], new cluster installation

2015-07-13 Thread alberto ayllon
Hi Wido.

Thanks again.

I will rebuild the cluster with bigger disk.

Again thanks for your help.


2015-07-13 14:15 GMT+02:00 Wido den Hollander w...@42on.com:



 On 13-07-15 14:07, alberto ayllon wrote:
  On 13-07-15 13:12, alberto ayllon wrote:
  Maybe this can help to get the origin of the problem.
 
  If I run  ceph pg dump, and the end of the response i get:
 
 
  What does 'ceph osd tree' tell you?
 
  It seems there is something wrong with your CRUSHMap.
 
  Wido
 
 
  Thanks for your answer Wido.
 
  Here is the output of ceph osd tree;
 
  # ceph osd tree
  ID WEIGHT TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
  -1      0 root default
  -2      0     host ceph01
   0      0         osd.0        up  1.0  1.0
   3      0         osd.3        up  1.0  1.0
   6      0         osd.6        up  1.0  1.0
  -3      0     host ceph02
   1      0         osd.1        up  1.0  1.0
   4      0         osd.4        up  1.0  1.0
   7      0         osd.7        up  1.0  1.0
  -4      0     host ceph03
   2      0         osd.2        up  1.0  1.0
   5      0         osd.5        up  1.0  1.0
   8      0         osd.8        up  1.0  1.0
 
 

 The weights of all the OSDs are zero (0). How big are the disks? I
 think they are very tiny, e.g. 10GB?

 You probably want a bit bigger disks to test with.

 Or set the weight manually of each OSD:

 $ ceph osd crush reweight osd.X 1

 Wido

 
  osdstat kbused  kbavail  kb       hb in              hb out
  0       36688   5194908  5231596  [1,2,3,4,5,6,7,8]  []
  1       34004   5197592  5231596  []                 []
  2       34004   5197592  5231596  [1]                []
  3       34004   5197592  5231596  [0,1,2,4,5,6,7,8]  []
  4       34004   5197592  5231596  [1,2]              []
  5       34004   5197592  5231596  [1,2,4]            []
  6       34004   5197592  5231596  [0,1,2,3,4,5,7,8]  []
  7       34004   5197592  5231596  [1,2,4,5]          []
  8       34004   5197592  5231596  [1,2,4,5,7]        []
   sum    308720  46775644 47084364
 
 
  Please someone can help me?
 
 
 
  2015-07-13 11:45 GMT+02:00 alberto ayllon albertoayllonces at gmail.com:
 
  Hello everybody and thanks for your help.
 
  Hello, I'm a newbie in CEPH, I'm trying to install a CEPH cluster with
  test purpose.
 
  I had just installed a CEPH cluster with three VMs (ubuntu 14.04),
  each one has one mon daemon and three OSDs, also each server has 3
  disk.
  Cluster has only one pool (rbd) with pg and pgp_num = 280, and osd
  pool get rbd size = 2.
 
  I made cluster's installation with  ceph-deploy, ceph version is
  0.94.2
 
  I think cluster's OSDs are having peering problems, because if I run
  ceph status, it returns:
 
  # ceph status
  cluster d54a2216-b522-4744-a7cc-a2106e1281b6
   health HEALTH_WARN
  280 pgs degraded
  280 pgs stuck degraded
  280 pgs stuck unclean
  280 pgs stuck undersized
  280 pgs undersized
   monmap e3: 3 mons at
  {ceph01=172.16.70.158:6789/0,ceph02=172.16.70.159:6789/0,ceph03=172.16.70.160:6789/0}
  election epoch 38, quorum 0,1,2 ceph01,ceph02,ceph03
   osdmap e46: 9 osds: 9 up, 9 in
pgmap v129: 280 pgs, 1 pools, 0 bytes data, 0 objects
  301 MB used, 45679 MB / 45980 MB avail
   280 active+undersized+degraded
 
  And for all pgs, the command ceph pg map X.yy returns something like:
 
  osdmap e46 pg 0.d7 (0.d7) -> up [0] acting [0]
 
  As I know Acting Set and Up Set must have the same value, but as
  they are equal to 0, there are not defined OSDs to
  stores pgs replicas, and I think this is why all pg are in
  active+undersized+degraded state.
 
  Has anyone any idea of what I have to do for  Active Set and Up
  Set reaches correct values.
 
 
  Thanks a lot!
 
 
 
 
  ___
  ceph-users mailing list
  ceph-users at lists.ceph.com http://lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS kernel client reboots on write

2015-07-13 Thread Gregory Farnum
On Mon, Jul 13, 2015 at 9:49 AM, Ilya Dryomov idryo...@gmail.com wrote:
 On Fri, Jul 10, 2015 at 9:36 PM, Jan Pekař jan.pe...@imatic.cz wrote:
 Hi all,

 I think I found a bug in cephfs kernel client.
 When I create directory in cephfs and set layout to

 ceph.dir.layout=stripe_unit=1073741824 stripe_count=1
 object_size=1073741824 pool=somepool

 attempts to write a larger file will cause a kernel hang or reboot.
 When I'm using cephfs client based on fuse, it works (but now I have some
 issues with fuse and concurrent writes too, but it is not this kind of
 problem).

 Which kernel are you running?  What do you see in the dmesg when it
 hangs?  What is the panic splat when it crashes?  How big is the
 larger file that you are trying to write?


 I think object_size and stripe_unit 1073741824 is max value, or can I set it
 higher?

 Default values stripe_unit=4194304 stripe_count=1 object_size=4194304
 works without problem on write.

 My goal was not to split file between osd's each 4MB of its size but save it
 in one piece.

 This is generally not a very good idea - you have to consider the
 distribution of objects across PGs and how your OSDs will be utilized.

Yeah. Beyond that, the OSDs will reject writes exceeding a certain
size (90MB by default). I'm not sure exactly what mismatch you're
running into here but I can think of several different ways a 1GB
write/single object could get stuck; it's just not a good idea.
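
As a rough example (mirroring the syntax you used above, not a recommendation
for your particular workload), a layout that keeps objects at a more
conventional size would look something like:

setfattr -n ceph.dir.layout \
  -v 'stripe_unit=4194304 stripe_count=1 object_size=67108864 pool=somepool' /some/cephfs/dir

i.e. 4MB stripe units packed into 64MB objects, instead of a single 1GB
object per file.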
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Firefly 0.80.10 ready to upgrade to?

2015-07-13 Thread Kostis Fardelas
Hello,
it seems that new packages for firefly have been uploaded to repo.
However, I can't find any details in Ceph Release notes. There is only
one thread in ceph-devel [1], but it is not clear what this new
version is about. Is it safe to upgrade from 0.80.9 to 0.80.10?

Regards,
Kostis

[1] http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/25684
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS kernel client reboots on write

2015-07-13 Thread Ilya Dryomov
On Fri, Jul 10, 2015 at 9:36 PM, Jan Pekař jan.pe...@imatic.cz wrote:
 Hi all,

 I think I found a bug in cephfs kernel client.
 When I create directory in cephfs and set layout to

 ceph.dir.layout=stripe_unit=1073741824 stripe_count=1
 object_size=1073741824 pool=somepool

 attempts to write a larger file will cause a kernel hang or reboot.
 When I'm using cephfs client based on fuse, it works (but now I have some
 issues with fuse and concurrent writes too, but it is not this kind of
 problem).

Which kernel are you running?  What do you see in the dmesg when it
hangs?  What is the panic splat when it crashes?  How big is the
larger file that you are trying to write?


 I think object_size and stripe_unit 1073741824 is max value, or can I set it
 higher?

 Default values stripe_unit=4194304 stripe_count=1 object_size=4194304
 works without problem on write.

 My goal was not to split file between osd's each 4MB of its size but save it
 in one piece.

This is generally not a very good idea - you have to consider the
distribution of objects across PGs and how your OSDs will be utilized.

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] 32 bit limitation for ceph on arm

2015-07-13 Thread Daleep Bais
Hi,

I am building a ceph cluster on Arm. Is there any limitation for 32 bit in
regard to number of nodes, storage capacity etc?

Please suggest..

Thanks.

Daleep Singh Bais
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Firefly 0.80.10 ready to upgrade to?

2015-07-13 Thread Wido den Hollander


On 13-07-15 12:25, Kostis Fardelas wrote:
 Hello,
 it seems that new packages for firefly have been uploaded to repo.
 However, I can't find any details in Ceph Release notes. There is only
 one thread in ceph-devel [1], but it is not clear what this new
 version is about. Is it safe to upgrade from 0.80.9 to 0.80.10?
 

I already have multiple systems running 0.80.10 which came from .7, .8
and .9.

0.80.10 works just fine. It is a 500TB production cluster.
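
For what it's worth, a typical rolling point-release upgrade looks roughly
like this (a sketch for Ubuntu/Upstart systems; adapt the service handling
to your init system):

apt-get update && apt-get install ceph ceph-common
restart ceph-mon-all     # monitor nodes first, one at a time
restart ceph-osd-all     # then the OSD nodes
ceph -s                  # confirm HEALTH_OK before moving to the next node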

Wido

 Regards,
 Kostis
 
 [1] http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/25684
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] All pgs with -> up [0] acting [0], new cluster installation

2015-07-13 Thread alberto ayllon
Hello everybody and thanks for your help.

Hello, I'm a newbie in CEPH; I'm trying to install a CEPH cluster for test
purposes.

I have just installed a CEPH cluster with three VMs (ubuntu 14.04); each one
has one mon daemon and three OSDs, and each server has 3 disks.
The cluster has only one pool (rbd) with pg_num and pgp_num = 280, and osd pool
get rbd size returns 2.

I made the cluster installation with ceph-deploy; the ceph version is 0.94.2

I think cluster's OSDs are having peering problems, because if I run ceph
status, it returns:

# ceph status
cluster d54a2216-b522-4744-a7cc-a2106e1281b6
 health HEALTH_WARN
280 pgs degraded
280 pgs stuck degraded
280 pgs stuck unclean
280 pgs stuck undersized
280 pgs undersized
 monmap e3: 3 mons at {ceph01=
172.16.70.158:6789/0,ceph02=172.16.70.159:6789/0,ceph03=172.16.70.160:6789/0
}
election epoch 38, quorum 0,1,2 ceph01,ceph02,ceph03
 osdmap e46: 9 osds: 9 up, 9 in
  pgmap v129: 280 pgs, 1 pools, 0 bytes data, 0 objects
301 MB used, 45679 MB / 45980 MB avail
 280 active+undersized+degraded

And for all pgs, the command ceph pg map X.yy returns something like:

osdmap e46 pg 0.d7 (0.d7) -> up [0] acting [0]

As I know the Acting Set and Up Set must have the same value, but as they
are both just [0], there are no OSDs defined to
store the pgs' replicas, and I think this is why all pgs are in
active+undersized+degraded state.

Has anyone any idea of what I have to do so that the Acting Set and Up Set
reach correct values?


Thanks a lot!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] All pgs with -> up [0] acting [0], new cluster installation

2015-07-13 Thread alberto ayllon
On 13-07-15 13:12, alberto ayllon wrote:
 Maybe this can help to get the origin of the problem.

 If I run  ceph pg dump, and the end of the response i get:


What does 'ceph osd tree' tell you?

It seems there is something wrong with your CRUSHMap.

Wido


Thanks for your answer Wido.

Here is the output of ceph osd tree;

# ceph osd tree
ID WEIGHT TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1      0 root default
-2      0     host ceph01
 0      0         osd.0        up  1.0  1.0
 3      0         osd.3        up  1.0  1.0
 6      0         osd.6        up  1.0  1.0
-3      0     host ceph02
 1      0         osd.1        up  1.0  1.0
 4      0         osd.4        up  1.0  1.0
 7      0         osd.7        up  1.0  1.0
-4      0     host ceph03
 2      0         osd.2        up  1.0  1.0
 5      0         osd.5        up  1.0  1.0
 8      0         osd.8        up  1.0  1.0



 osdstat kbused  kbavail  kb       hb in              hb out
 0       36688   5194908  5231596  [1,2,3,4,5,6,7,8]  []
 1       34004   5197592  5231596  []                 []
 2       34004   5197592  5231596  [1]                []
 3       34004   5197592  5231596  [0,1,2,4,5,6,7,8]  []
 4       34004   5197592  5231596  [1,2]              []
 5       34004   5197592  5231596  [1,2,4]            []
 6       34004   5197592  5231596  [0,1,2,3,4,5,7,8]  []
 7       34004   5197592  5231596  [1,2,4,5]          []
 8       34004   5197592  5231596  [1,2,4,5,7]        []
  sum    308720  46775644 47084364


 Please someone can help me?



 2015-07-13 11:45 GMT+02:00 alberto ayllon albertoayllonces at gmail.com:

 Hello everybody and thanks for your help.

 Hello, I'm a newbie in CEPH, I'm trying to install a CEPH cluster with
 test purpose.

 I had just installed a CEPH cluster with three VMs (ubuntu 14.04),
 each one has one mon daemon and three OSDs, also each server has 3
disk.
 Cluster has only one pool (rbd) with pg and pgp_num = 280, and osd
 pool get rbd size = 2.

 I made cluster's installation with  ceph-deploy, ceph version is
 0.94.2

 I think cluster's OSDs are having peering problems, because if I run
 ceph status, it returns:

 # ceph status
 cluster d54a2216-b522-4744-a7cc-a2106e1281b6
  health HEALTH_WARN
 280 pgs degraded
 280 pgs stuck degraded
 280 pgs stuck unclean
 280 pgs stuck undersized
 280 pgs undersized
  monmap e3: 3 mons at
 {ceph01=172.16.70.158:6789/0,ceph02=172.16.70.159:6789/0,ceph03=172.16.70.160:6789/0}
 election epoch 38, quorum 0,1,2 ceph01,ceph02,ceph03
  osdmap e46: 9 osds: 9 up, 9 in
   pgmap v129: 280 pgs, 1 pools, 0 bytes data, 0 objects
 301 MB used, 45679 MB / 45980 MB avail
  280 active+undersized+degraded

 And for all pgs, the command ceph pg map X.yy returns something like:

 osdmap e46 pg 0.d7 (0.d7) -> up [0] acting [0]

 As I know Acting Set and Up Set must have the same value, but as
 they are equal to 0, there are not defined OSDs to
 stores pgs replicas, and I think this is why all pg are in
 active+undersized+degraded state.

 Has anyone any idea of what I have to do for  Active Set and Up
 Set reaches correct values.


 Thanks a lot!




 ___
 ceph-users mailing list
 ceph-users at lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] xattrs vs omap

2015-07-13 Thread Jan Schermer
Sorry for reviving an old thread, but could I get some input on this, pretty 
please?

ext4 has 256-byte inodes by default (at least according to docs)
but the fragment below says:
OPTION(filestore_max_inline_xattr_size_other, OPT_U32, 512)

The default 512b is too much if the inode is just 256b, so shouldn’t that be 
256b in case people use the default ext4 inode size?

Anyway, is it better to format ext4 with larger inodes (say 2048b) and set 
filestore_max_inline_xattr_size_other=1536, or leave it at defaults?
(As I understand it, on ext4 xattrs are limited to one block; inode size +
something can spill to one different inode - maybe someone knows better).

Is filestore_max_inline_xattr_size an absolute limit, or is it
filestore_max_inline_xattr_size*filestore_max_inline_xattrs in reality?

Does OSD do the sane thing if for some reason the xattrs do not fit? What are 
the performance implications of storing the xattrs in leveldb?

And lastly - what size of xattrs should I really expect if all I use is RBD for 
OpenStack instances? (No radosgw, no cephfs, but heavy on rbd image and pool 
snapshots). This overhead is quite large

My plan so far is to format the drives like this:
mkfs.ext4 -I 2048 -b 4096 -i 524288 -E stride=32,stripe-width=256
(2048b inode, 4096b block size, one inode per 512k of space)
and set filestore_max_inline_xattr_size_other=1536
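
i.e. in ceph.conf something like this (my own sketch, not tested anywhere yet):

[osd]
filestore_max_inline_xattr_size_other = 1536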

Does that make sense?

Thanks!

Jan



 On 02 Jul 2015, at 12:18, Jan Schermer j...@schermer.cz wrote:
 
 Does anyone have a known-good set of parameters for ext4? I want to try it as 
 well but I’m a bit worried what happens if I get it wrong.
 
 Thanks
 
 Jan
 
 
 
 On 02 Jul 2015, at 09:40, Nick Fisk n...@fisk.me.uk wrote:
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 Christian Balzer
 Sent: 02 July 2015 02:23
 To: Ceph Users
 Subject: Re: [ceph-users] xattrs vs omap
 
 On Thu, 2 Jul 2015 00:36:18 + Somnath Roy wrote:
 
 It is replaced with the following config option..
 
 // Use omap for xattrs for attrs over
 // filestore_max_inline_xattr_size or
 OPTION(filestore_max_inline_xattr_size, OPT_U32, 0) //Override
 OPTION(filestore_max_inline_xattr_size_xfs, OPT_U32, 65536)
 OPTION(filestore_max_inline_xattr_size_btrfs, OPT_U32, 2048)
 OPTION(filestore_max_inline_xattr_size_other, OPT_U32, 512)
 
 // for more than filestore_max_inline_xattrs attrs
 OPTION(filestore_max_inline_xattrs, OPT_U32, 0) //Override
 OPTION(filestore_max_inline_xattrs_xfs, OPT_U32, 10)
 OPTION(filestore_max_inline_xattrs_btrfs, OPT_U32, 10)
 OPTION(filestore_max_inline_xattrs_other, OPT_U32, 2)
 
 
 If these limits are crossed, xattrs will be stored in omap.
 
 Sounds fair.
 
 Since I only use RBD I don't think it will ever exceed this.
 
 Possibly, see my thread  about performance difference between new and old
 pools. Still not quite sure what's going on, but for some reasons some of
 the objects behind RBD's have larger xattrs which is causing really poor
 performance.
 
 
 Thanks,
 
 Chibi
 For ext4, you can use either filestore_max*_other or
 filestore_max_inline_xattrs/ filestore_max_inline_xattr_size. In any
 case, the latter two will override everything.
 
 Thanks & Regards
 Somnath
 
 -Original Message-
 From: Christian Balzer [mailto:ch...@gol.com]
 Sent: Wednesday, July 01, 2015 5:26 PM
 To: Ceph Users
 Cc: Somnath Roy
 Subject: Re: [ceph-users] xattrs vs omap
 
 
 Hello,
 
 On Wed, 1 Jul 2015 15:24:13 + Somnath Roy wrote:
 
 It doesn't matter, I think filestore_xattr_use_omap is a 'noop'  and
 not used in the Hammer.
 
 Then what was this functionality replaced with, esp. considering EXT4
 based OSDs?
 
 Chibi
 Thanks & Regards
 Somnath
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
 Behalf Of Adam Tygart Sent: Wednesday, July 01, 2015 8:20 AM
 To: Ceph Users
 Subject: [ceph-users] xattrs vs omap
 
 Hello all,
 
 I've got a coworker who put filestore_xattr_use_omap = true in the
 ceph.conf when we first started building the cluster. Now he can't
 remember why. He thinks it may be a holdover from our first Ceph
 cluster (running dumpling on ext4, iirc).
 
 In the newly built cluster, we are using XFS with 2048 byte inodes,
 running Ceph 0.94.2. It currently has production data in it.
 
 From my reading of other threads, it looks like this is probably not
 something you want set to true (at least on XFS), due to performance
 implications. Is this something you can change on a running cluster?
 Is it worth the hassle?
 
 Thanks,
 Adam
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 

Re: [ceph-users] All pgs with -> up [0] acting [0], new cluster installation

2015-07-13 Thread Wido den Hollander


On 13-07-15 13:12, alberto ayllon wrote:
 Maybe this can help to get the origin of the problem.
 
 If I run  ceph pg dump, and the end of the response i get:
 

What does 'ceph osd tree' tell you?

It seems there is something wrong with your CRUSHMap.

Wido

 
 osdstat kbused  kbavail  kb       hb in              hb out
 0       36688   5194908  5231596  [1,2,3,4,5,6,7,8]  []
 1       34004   5197592  5231596  []                 []
 2       34004   5197592  5231596  [1]                []
 3       34004   5197592  5231596  [0,1,2,4,5,6,7,8]  []
 4       34004   5197592  5231596  [1,2]              []
 5       34004   5197592  5231596  [1,2,4]            []
 6       34004   5197592  5231596  [0,1,2,3,4,5,7,8]  []
 7       34004   5197592  5231596  [1,2,4,5]          []
 8       34004   5197592  5231596  [1,2,4,5,7]        []
  sum    308720  46775644 47084364
 
 
 Please someone can help me?
 
 
 
 2015-07-13 11:45 GMT+02:00 alberto ayllon albertoayllon...@gmail.com:
 
 Hello everybody and thanks for your help.
 
 Hello, I'm a newbie in CEPH, I'm trying to install a CEPH cluster with
 test purpose.
 
 I had just installed a CEPH cluster with three VMs (ubuntu 14.04),
 each one has one mon daemon and three OSDs, also each server has 3 disk.
 Cluster has only one pool (rbd) with pg and pgp_num = 280, and osd
 pool get rbd size = 2.
 
 I made cluster's installation with  ceph-deploy, ceph version is
 0.94.2
 
 I think cluster's OSDs are having peering problems, because if I run
 ceph status, it returns:
 
 # ceph status
 cluster d54a2216-b522-4744-a7cc-a2106e1281b6
  health HEALTH_WARN
 280 pgs degraded
 280 pgs stuck degraded
 280 pgs stuck unclean
 280 pgs stuck undersized
 280 pgs undersized
  monmap e3: 3 mons at
 {ceph01=172.16.70.158:6789/0,ceph02=172.16.70.159:6789/0,ceph03=172.16.70.160:6789/0}
 election epoch 38, quorum 0,1,2 ceph01,ceph02,ceph03
  osdmap e46: 9 osds: 9 up, 9 in
   pgmap v129: 280 pgs, 1 pools, 0 bytes data, 0 objects
 301 MB used, 45679 MB / 45980 MB avail
  280 active+undersized+degraded
 
 And for all pgs, the command ceph pg map X.yy returns something like:
 
 osdmap e46 pg 0.d7 (0.d7) -> up [0] acting [0]
 
 As I know Acting Set and Up Set must have the same value, but as
 they are equal to 0, there are not defined OSDs to
 stores pgs replicas, and I think this is why all pg are in
 active+undersized+degraded state.
 
 Has anyone any idea of what I have to do for  Active Set and Up
 Set reaches correct values.
 
 
 Thanks a lot!
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] He8 drives

2015-07-13 Thread Emmanuel Florac
On Wed, 8 Jul 2015 10:28:17 +1000,
Blair Bethwaite blair.bethwa...@gmail.com wrote:

 Does anyone have any experience with the newish HGST He8 8TB Helium
 filled HDDs? 

I've benchmarked it and found it has about exactly the same performance
profile as the He6. Compared to the Seagate 6TB it draws much less
power (almost half), and that's the main selling point IMO, with
durability.

-- 

Emmanuel Florac |   Direction technique
|   Intellique
|   eflo...@intellique.com
|   +33 1 78 94 84 02

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 32 bit limitation for ceph on arm

2015-07-13 Thread Shinobu Kinjo
Why do you stick to 32bit?

 Kinjo

On Mon, Jul 13, 2015 at 7:35 PM, Daleep Bais daleepb...@gmail.com wrote:

 Hi,

 I am building a ceph cluster on Arm. Is there any limitation for 32 bit in
 regard to number of nodes, storage capacity etc?

 Please suggest..

 Thanks.

 Daleep Singh Bais

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Life w/ Linux http://i-shinobu.hatenablog.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] slow requests going up and down

2015-07-13 Thread Will . Boege
Does the ceph health detail show anything about stale or unclean PGs, or
are you just getting the blocked ops messages?

On 7/13/15, 5:38 PM, Deneau, Tom tom.den...@amd.com wrote:

I have a cluster where over the weekend something happened and successive
calls to ceph health detail show things like below.
What does it mean when the number of blocked requests goes up and down
like this?
Some clients are still running successfully.

-- Tom Deneau, AMD



HEALTH_WARN 20 requests are blocked > 32 sec; 2 osds have slow requests
20 ops are blocked > 536871 sec
2 ops are blocked > 536871 sec on osd.5
18 ops are blocked > 536871 sec on osd.7
2 osds have slow requests

HEALTH_WARN 4 requests are blocked > 32 sec; 2 osds have slow requests
4 ops are blocked > 536871 sec
2 ops are blocked > 536871 sec on osd.5
2 ops are blocked > 536871 sec on osd.7
2 osds have slow requests

HEALTH_WARN 27 requests are blocked > 32 sec; 2 osds have slow requests
27 ops are blocked > 536871 sec
2 ops are blocked > 536871 sec on osd.5
25 ops are blocked > 536871 sec on osd.7
2 osds have slow requests

HEALTH_WARN 34 requests are blocked > 32 sec; 2 osds have slow requests
34 ops are blocked > 536871 sec
9 ops are blocked > 536871 sec on osd.5
25 ops are blocked > 536871 sec on osd.7
2 osds have slow requests
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issue with journal on another drive

2015-07-13 Thread Rimma Iontel

Thank you Lionel,

This was very helpful.  I actually chose to split the partition and then 
recreated the OSDs.  Everything is up and running now.


Rimma

On 7/13/15 6:34 PM, Lionel Bouton wrote:

On 07/14/15 00:08, Rimma Iontel wrote:

Hi all,

[...]
Is there something that needed to be done to the journal partition to
enable sharing between multiple OSDs? Or is there something else
that's causing the issue?


IIRC you can't share a volume between multiple OSDs. What you could do
if splitting this partition isn't possible is create a LVM volume group
with it as a single physical volume (change type of partition to lvm,
pvcreate /dev/sda6, vgcreate journal_vg /dev/sda6). Then you can create
a logical volume in it for each of your OSDs (lvcreate -n
osd<n>_journal -L <one_third_of_available_space> journal_vg) and use
them (/dev/journal_vg/osd<n>_journal) in your configuration.
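
As a concrete sketch (device names and sizes are only examples - size the
volumes to your journals):

pvcreate /dev/sda6
vgcreate journal_vg /dev/sda6
lvcreate -n osd0_journal -L 30G journal_vg   # repeat for osd1_journal, osd2_journal
ceph-deploy --overwrite-conf osd create node:/dev/sdb:/dev/journal_vg/osd0_journal

and likewise for sdc and sdd with their own logical volumes.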

Lionel


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 32 bit limitation for ceph on arm

2015-07-13 Thread Daleep Bais
Hi,

I have existing hardware which I have to use.

Please suggest so that accordingly I could implement.

Thanks

On Mon, Jul 13, 2015 at 5:51 PM, Shinobu Kinjo shinobu...@gmail.com wrote:

 Why do you stick to 32bit?

  Kinjo

 On Mon, Jul 13, 2015 at 7:35 PM, Daleep Bais daleepb...@gmail.com wrote:

 Hi,

 I am building a ceph cluster on Arm. Is there any limitation for 32 bit
 in regard to number of nodes, storage capacity etc?

 Please suggest..

 Thanks.

 Daleep Singh Bais

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 Life w/ Linux http://i-shinobu.hatenablog.com/

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] slow requests going up and down

2015-07-13 Thread Christian Balzer

Hello,

to quote Sherlock Holmes:

Data, data, data. I cannot make bricks without clay.

That the number of blocked requests is varying is indeed interesting, but
I presume you're more interested in fixing this than dissecting this
particular tidbit?

If so...

Start with the basics, all relevant software version, a description of
your cluster, full outputs of ceph osd tree and ceph -s, etc.

The same 2 OSDs are affected, anything peculiar going on in their logs?

How about their SMART status?

Are they being deep-scrubbed (logs above) or otherwise busy (atop, iostat)?

You may find something in the performance counters, blocked requests
section, see: http://ceph.com/docs/v0.69/dev/perf_counters/

Lastly, the most likely fix will be restarting the affected OSDs. 
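
For example (osd.5 taken from your output; log path and service syntax will
vary by distribution):

ceph daemon osd.5 dump_ops_in_flight      # shows what the blocked ops are waiting on
ceph daemon osd.5 perf dump               # op/journal latencies, throttle counters
grep 'slow request' /var/log/ceph/ceph-osd.5.log | tail
restart ceph-osd id=5                     # Upstart syntax; adjust for your init system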

See also:

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg15410.html

Christian

On Mon, 13 Jul 2015 22:38:57 + Deneau, Tom wrote:

 I have a cluster where over the weekend something happened and
 successive calls to ceph health detail show things like below. What does
 it mean when the number of blocked requests goes up and down like this?
 Some clients are still running successfully.
 
 -- Tom Deneau, AMD
 
 
 
 HEALTH_WARN 20 requests are blocked > 32 sec; 2 osds have slow requests
 20 ops are blocked > 536871 sec
 2 ops are blocked > 536871 sec on osd.5
 18 ops are blocked > 536871 sec on osd.7
 2 osds have slow requests
 
 HEALTH_WARN 4 requests are blocked > 32 sec; 2 osds have slow requests
 4 ops are blocked > 536871 sec
 2 ops are blocked > 536871 sec on osd.5
 2 ops are blocked > 536871 sec on osd.7
 2 osds have slow requests
 
 HEALTH_WARN 27 requests are blocked > 32 sec; 2 osds have slow requests
 27 ops are blocked > 536871 sec
 2 ops are blocked > 536871 sec on osd.5
 25 ops are blocked > 536871 sec on osd.7
 2 osds have slow requests
 
 HEALTH_WARN 34 requests are blocked > 32 sec; 2 osds have slow requests
 34 ops are blocked > 536871 sec
 9 ops are blocked > 536871 sec on osd.5
 25 ops are blocked > 536871 sec on osd.7
 2 osds have slow requests
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 


-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] how to recover from: 1 pgs down; 10 pgs incomplete; 10 pgs stuck inactive; 10 pgs stuck unclean

2015-07-13 Thread Jelle de Jong
Hello everybody,

I was testing a ceph cluster with osd_pool_default_size = 2 and while
rebuilding the OSD on one ceph node, a disk in another node started
getting read errors and ceph kept taking that OSD down. Instead of me
executing ceph osd set nodown while the other node was rebuilding, I kept
restarting the OSD for a while, and ceph took the OSD in for a few
minutes and then took it back down.

I then removed the bad OSD from the cluster and later added it back in
with nodown flag set and a weight of zero, moving all the data away.
Then removed the OSD again and added a new OSD with a new hard drive.

However I ended up with the following cluster status and I can't seem to
find how to get the cluster healthy again. I'm doing this as tests
before taking this ceph configuration in further production.

http://paste.debian.net/plain/281922

If I lost data, my bad, but how could I figure out in what pool the data
was lost and in what rbd volume (so what kvm guest lost data).
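
A rough way to narrow this down, as far as I understand it (pool and image
names below are placeholders):

ceph health detail | grep incomplete   # PG ids look like 2.1a; the part before the dot is the pool id
ceph osd lspools                       # maps pool ids to pool names
rbd -p <poolname> info <image>         # block_name_prefix links an image to its object names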

Kind regards,

Jelle de Jong
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ruby bindings for Librados

2015-07-13 Thread Wido den Hollander
Hi,

I have a Ruby application which currently talks S3, but I want to have
the application talk native RADOS.

Now looking online I found various Ruby bindings for librados, but none
of them seem official.

What I found:

* desperados: https://github.com/johnl/desperados
* ceph-ruby: https://github.com/netskin/ceph-ruby

The last commit for desperados was in March 2013 and ceph-ruby in April
2015.

Anybody out there using Ruby bindings? If so, which one and what are the
experiences?

-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ruby bindings for Librados

2015-07-13 Thread Corin Langosch
Hi Wido,

I'm the dev of https://github.com/netskin/ceph-ruby and still use it in 
production on some systems. It has everything I
need so I didn't develop any further. If you find any bugs or need new 
features, just open an issue and I'm happy to
have a look.

Best
Corin

Am 13.07.2015 um 21:24 schrieb Wido den Hollander:
 Hi,
 
 I have an Ruby application which currently talks S3, but I want to have
 the application talk native RADOS.
 
 Now looking online I found various Ruby bindings for librados, but none
 of them seem official.
 
 What I found:
 
 * desperados: https://github.com/johnl/desperados
 * ceph-ruby: https://github.com/netskin/ceph-ruby
 
 The last commit for desperados was in March 2013 and ceph-ruby in April
 2015.
 
 Anybody out there using Ruby bindings? If so, which one and what are the
 experiences?
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] He8 drives

2015-07-13 Thread Udo Lembke
Hi,
I have just expanded our ceph cluster (7 nodes) with one 8TB HGST (a change
from 4TB to 8TB) on each node (plus 11 4TB HGST).
But I have set the primary affinity to 0 for the 8TB disks... so in this
case my performance values are not 8TB-disk related.
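
For reference, that is just (osd.X being each 8TB OSD; depending on the
version the monitors may need mon_osd_allow_primary_affinity=true to accept it):

ceph osd primary-affinity osd.X 0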

Udo

On 08.07.2015 02:28, Blair Bethwaite wrote:
 Hi folks,

 Does anyone have any experience with the newish HGST He8 8TB Helium
 filled HDDs? Storagereview looked at them here:
 http://www.storagereview.com/hgst_ultrastar_helium_he8_8tb_enterprise_hard_drive_review.
 I'm torn as to the lower read performance shown there than e.g. the
 He6 or Seagate 6TB, but thing is, I think we probably have enough
 aggregate IOPs with ~170 drives. Has anyone tried these in a Ceph
 cluster yet?


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] xattrs vs omap

2015-07-13 Thread Somnath Roy
inline

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan 
Schermer
Sent: Monday, July 13, 2015 2:32 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] xattrs vs omap

Sorry for reviving an old thread, but could I get some input on this, pretty 
please?

ext4 has 256-byte inodes by default (at least according to docs) but the 
fragment below says:
OPTION(filestore_max_inline_xattr_size_other, OPT_U32, 512)

The default 512b is too much if the inode is just 256b, so shouldn’t that be 
256b in case people use the default ext4 inode size?

Anyway, is it better to format ext4 with larger inodes (say 2048b) and set 
filestore_max_inline_xattr_size_other=1536, or leave it at defaults?
[Somnath] Why 1536 ? why not 1024 or any power of 2 ? I am not seeing any harm 
though, but, curious.
(As I understand it, on ext4 xattrs are limited to one block; inode size +
something can spill to one different inode - maybe someone knows better).


[Somnath] The xattr size ("_") is now more than 256 bytes and it will spill
over, so a bigger inode size will be good. But I would suggest doing your
benchmark before putting it into production.

Is filestore_max_inline_xattr_size an absolute limit, or is it
filestore_max_inline_xattr_size*filestore_max_inline_xattrs in reality?

[Somnath] The *_size options track the xattr size per attribute and *inline_xattrs
keeps track of the max number of inline attributes allowed. So, if a xattr size is >
*_size, it will go to omap, and also if the total number of xattrs >
*inline_xattrs, it will go to omap.
If you are only using rbd, the number of inline xattrs will always be 2 and it
will not cross that default max limit.

Does OSD do the sane thing if for some reason the xattrs do not fit? What are 
the performance implications of storing the xattrs in leveldb?

[Somnath] Even though I don't have the exact numbers, it has a significant
overhead if the xattrs go to leveldb.

And lastly - what size of xattrs should I really expect if all I use is RBD for 
OpenStack instances? (No radosgw, no cephfs, but heavy on rbd image and pool 
snapshots). This overhead is quite large

[Somnath] It will be 2 xattrs; the default "_" will be a little bigger than 256 bytes,
and "_snapset" is small (it depends on the number of snaps/clones, but it is unlikely to
cross the 256-byte range).
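
(As a quick way to check the actual on-disk sizes - the path below is only an
example - one can dump the attributes of an rbd object file on an OSD host:

getfattr -d -e hex /var/lib/ceph/osd/ceph-0/current/<pg>_head/<object file>

and look at the lengths of the user.ceph.* entries.)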

My plan so far is to format the drives like this:
mkfs.ext4 -I 2048 -b 4096 -i 524288 -E stride=32,stripe-width=256 (2048b inode,
4096b block size, one inode per 512k of space) and set
filestore_max_inline_xattr_size_other=1536
[Somnath] Not much idea on ext4, sorry..

Does that make sense?

Thanks!

Jan



 On 02 Jul 2015, at 12:18, Jan Schermer j...@schermer.cz wrote:

 Does anyone have a known-good set of parameters for ext4? I want to try it as 
 well but I’m a bit worried what happens if I get it wrong.

 Thanks

 Jan



 On 02 Jul 2015, at 09:40, Nick Fisk n...@fisk.me.uk wrote:

 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
 Behalf Of Christian Balzer
 Sent: 02 July 2015 02:23
 To: Ceph Users
 Subject: Re: [ceph-users] xattrs vs omap

 On Thu, 2 Jul 2015 00:36:18 + Somnath Roy wrote:

 It is replaced with the following config option..

 // Use omap for xattrs for attrs over //
 filestore_max_inline_xattr_size or
 OPTION(filestore_max_inline_xattr_size, OPT_U32, 0) //Override
 OPTION(filestore_max_inline_xattr_size_xfs, OPT_U32, 65536)
 OPTION(filestore_max_inline_xattr_size_btrfs, OPT_U32, 2048)
 OPTION(filestore_max_inline_xattr_size_other, OPT_U32, 512)

 // for more than filestore_max_inline_xattrs attrs
 OPTION(filestore_max_inline_xattrs, OPT_U32, 0) //Override
 OPTION(filestore_max_inline_xattrs_xfs, OPT_U32, 10)
 OPTION(filestore_max_inline_xattrs_btrfs, OPT_U32, 10)
 OPTION(filestore_max_inline_xattrs_other, OPT_U32, 2)


 If these limits are crossed, xattrs will be stored in omap.

 Sounds fair.

 Since I only use RBD I don't think it will ever exceed this.

 Possibly, see my thread  about performance difference between new and
 old pools. Still not quite sure what's going on, but for some reasons
 some of the objects behind RBD's have larger xattrs which is causing
 really poor performance.


 Thanks,

 Chibi
 For ext4, you can use either filestore_max*_other or
 filestore_max_inline_xattrs/ filestore_max_inline_xattr_size. In any
 case, the latter two will override everything.

 Thanks & Regards
 Somnath

 -Original Message-
 From: Christian Balzer [mailto:ch...@gol.com]
 Sent: Wednesday, July 01, 2015 5:26 PM
 To: Ceph Users
 Cc: Somnath Roy
 Subject: Re: [ceph-users] xattrs vs omap


 Hello,

 On Wed, 1 Jul 2015 15:24:13 + Somnath Roy wrote:

 It doesn't matter, I think filestore_xattr_use_omap is a 'noop'
 and not used in the Hammer.

 Then what was this functionality replaced with, esp. considering
 EXT4 based OSDs?

 Chibi
 Thanks & Regards
 Somnath

 -Original Message-
 From: ceph-users 

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-13 Thread Eric Eastman
Thanks John. I will back the test down to the simple case of 1 client
without the kernel driver and only running NFS Ganesha, and work forward
till I trip the problem and report my findings.

Eric

On Mon, Jul 13, 2015 at 2:18 AM, John Spray john.sp...@redhat.com wrote:



 On 13/07/2015 04:02, Eric Eastman wrote:

 Hi John,

 I am seeing this problem with Ceph v9.0.1 with the v4.1 kernel on all
 nodes.  This system is using 4 Ceph FS client systems. They all have
 the kernel driver version of CephFS loaded, but none are mounting the
 file system. All 4 clients are using the libcephfs VFS interface to
 Ganesha NFS (V2.2.0-2) and Samba (Version 4.3.0pre1-GIT-0791bb0) to
 share out the Ceph file system.

 # ceph -s
  cluster 6d8aae1e-1125-11e5-a708-001b78e265be
   health HEALTH_WARN
  4 near full osd(s)
  mds0: Client ede-c2-gw01 failing to respond to cache pressure
  mds0: Client ede-c2-gw02:cephfs failing to respond to cache
 pressure
  mds0: Client ede-c2-gw03:cephfs failing to respond to cache
 pressure
   monmap e1: 3 mons at
 {ede-c2-mon01=
 10.15.2.121:6789/0,ede-c2-mon02=10.15.2.122:6789/0,ede-c2-mon03=10.15.2.123:6789/0
 }
  election epoch 8, quorum 0,1,2
 ede-c2-mon01,ede-c2-mon02,ede-c2-mon03
   mdsmap e912: 1/1/1 up {0=ede-c2-mds03=up:active}, 2 up:standby
   osdmap e272: 8 osds: 8 up, 8 in
pgmap v225264: 832 pgs, 4 pools, 188 GB data, 5173 kobjects
  212 GB used, 48715 MB / 263 GB avail
   832 active+clean
client io 1379 kB/s rd, 20653 B/s wr, 98 op/s


 It would help if we knew whether it's the kernel clients or the userspace
 clients that are generating the warnings here.  You've probably already
 done this, but I'd get rid of any unused kernel client mounts to simplify
 the situation.

 We haven't tested the cache limit enforcement with NFS Ganesha, so there
 is a decent chance that it is broken.  The ganehsha FSAL is doing
 ll_get/ll_put reference counting on inodes, so it seems quite possible that
 its cache is pinning things that we would otherwise be evicting in response
 to cache pressure.  You mention samba as well,

 You can see if the MDS cache is indeed exceeding its limit by looking at
 the output of:
 ceph daemon mds.<daemon id> perf dump mds

 ...where the inodes value tells you how many are in the cache, vs.
 inode_max.
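
 For example, with the active MDS from your mdsmap (run on the host where
 that daemon lives):

 ceph daemon mds.ede-c2-mds03 perf dump mds | egrep '"inodes"|"inode_max"'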

 If you can, it would be useful to boil this down to a straightforward test
 case: if you start with a healthy cluster, mount a single ganesha client,
 and do your 5 million file procedure, do you get the warning?  Same for
 samba/kernel mounts -- this is likely to be a client side issue, so we need
 to confirm which client is misbehaving.

 Cheers,
 John



 # cat /proc/version
 Linux version 4.1.0-040100-generic (kernel@gomeisa) (gcc version 4.6.3
 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201506220235 SMP Mon Jun 22 06:36:19
 UTC 2015

 # ceph -v
 ceph version 9.0.1 (997b3f998d565a744bfefaaf34b08b891f8dbf64)

 The systems are all running Ubuntu Trusty that has been upgraded to
 the 4.1 kernel. This is all physical machines and no VMs.  The test
 run that caused the problem was create and verifying 5 million small
 files.

 We have some tools that flag when Ceph is in a WARN state so it would
 be nice to get rid of this warning.

 Please let me know what additional information you need.

 Thanks,

 Eric

 On Fri, Jul 10, 2015 at 4:19 AM, 谷枫 feiche...@gmail.com wrote:

 Thank you John,
 All my servers are ubuntu 14.04 with the 3.16 kernel.
 Not all of the clients show this problem; the cluster seems to be functioning
 well now.
 As you say, I will change the mds_cache_size to 50 from 10 to take a
 test, thanks again!

 2015-07-10 17:00 GMT+08:00 John Spray john.sp...@redhat.com:


 This is usually caused by use of older kernel clients.  I don't remember
 exactly what version it was fixed in, but iirc we've seen the problem
 with
 3.14 and seen it go away with 3.18.

 If your system is otherwise functioning well, this is not a critical
 error
 -- it just means that the MDS might not be able to fully control its
 memory
 usage (i.e. it can exceed mds_cache_size).

 John



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph packages for openSUSE 13.2, Factory, Tumbleweed

2015-07-13 Thread Nathan Cutler
This is to announce that ceph has been packaged for openSUSE 13.2, 
openSUSE Factory, and openSUSE Tumbleweed. It is building in the 
OpenSUSE Build Service (OBS), filesystems:ceph project, from the 
development branch of what will become SUSE Enterprise Storage 2.


https://build.opensuse.org/package/show/filesystems:ceph/ceph

If you have the time and inclination to test the OBS ceph packages on
openSUSE 13.2, Factory, and/or Tumbleweed, I will be interested to hear 
from you. The same applies if you need help downloading/installing the 
packages.


Thanks and regards.

--
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ruby bindings for Librados

2015-07-13 Thread Wido den Hollander
On 07/13/2015 09:43 PM, Corin Langosch wrote:
 Hi Wido,
 
 I'm the dev of https://github.com/netskin/ceph-ruby and still use it in 
 production on some systems. It has everything I
 need so I didn't develop any further. If you find any bugs or need new 
 features, just open an issue and I'm happy to
 have a look.
 

Ah, that's great! We should look into making a Ruby binding official
and moving it to Ceph's Github project. That would make it more clear
for end-users.

I see that RADOS namespaces are currently not implemented in the Ruby
bindings. Not many bindings have them though. Might be worth looking at.

I'll give the current bindings a try btw!

 Best
 Corin
 
 Am 13.07.2015 um 21:24 schrieb Wido den Hollander:
 Hi,

 I have an Ruby application which currently talks S3, but I want to have
 the application talk native RADOS.

 Now looking online I found various Ruby bindings for librados, but none
 of them seem official.

 What I found:

 * desperados: https://github.com/johnl/desperados
 * ceph-ruby: https://github.com/netskin/ceph-ruby

 The last commit for desperados was in March 2013 and ceph-ruby in April
 2015.

 Anybody out there using Ruby bindings? If so, which one and what are the
 experiences?



-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS kernel client reboots on write

2015-07-13 Thread Jan Pekař



On 2015-07-13 12:01, Gregory Farnum wrote:

On Mon, Jul 13, 2015 at 9:49 AM, Ilya Dryomov idryo...@gmail.com wrote:

On Fri, Jul 10, 2015 at 9:36 PM, Jan Pekař jan.pe...@imatic.cz wrote:

Hi all,

I think I found a bug in cephfs kernel client.
When I create directory in cephfs and set layout to

ceph.dir.layout=stripe_unit=1073741824 stripe_count=1
object_size=1073741824 pool=somepool

attempts to write a larger file will cause a kernel hang or reboot.
When I'm using cephfs client based on fuse, it works (but now I have some
issues with fuse and concurrent writes too, but it is not this kind of
problem).


Which kernel are you running?  What do you see in the dmesg when it
hangs?  What is the panic splat when it crashes?  How big is the
larger file that you are trying to write?

I'm running 4.0.3 kernel but it was the same with older ones.
Computer hangs, so I cannot display dmesg. I will try to catch it with 
remote syslog.

Larger file is about 500MB. Last time 300MB was ok.




I think object_size and stripe_unit 1073741824 is max value, or can I set it
higher?

Default values stripe_unit=4194304 stripe_count=1 object_size=4194304
works without problem on write.

My goal was not to split file between osd's each 4MB of its size but save it
in one piece.


This is generally not a very good idea - you have to consider the
distribution of objects across PGs and how your OSDs will be utilized.


Yeah. Beyond that, the OSDs will reject writes exceeding a certain
size (90MB by default). I'm not sure exactly what mismatch you're
running into here but I can think of several different ways a 1GB
write/single object could get stuck; it's just not a good idea.
-Greg
I have been using it this way from the beginning, and with FUSE I had no problem
with big files. Objects in my OSDs are often 1GB and there is no problem with that.






--

Ing. Jan Pekař
jan.pe...@imatic.cz | +420603811737

Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz

--
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Issue with journal on another drive

2015-07-13 Thread Rimma Iontel

Hi all,

I am trying to set up a three-node ceph cluster.  Each node is running 
RHEL 7.1 and has three 1TB HDD drives for OSDs (sdb, sdc, sdd) and an 
SSD partition (/dev/sda6) for the journal.


I zapped the HDDs and used the following to create OSDs:

# ceph-deploy --overwrite-conf osd create node:/dev/sdb:/dev/sda6
# ceph-deploy --overwrite-conf osd create node:/dev/sdc:/dev/sda6
# ceph-deploy --overwrite-conf osd create node:/dev/sdd:/dev/sda6

Didn't get any errors but some of the OSDs are not coming up on the nodes:

# ceph osd tree
# id    weight  type name       up/down reweight
-1      8.19    root default
-2      2.73            host osd-01
3       0.91                    osd.3   up      1
0       0.91                    osd.0   up      1
1       0.91                    osd.1   down    0
-3      2.73            host osd-02
4       0.91                    osd.4   up      1
2       0.91                    osd.2   down    0
7       0.91                    osd.7   down    0
-4      2.73            host osd-03
8       0.91                    osd.8   up      1
5       0.91                    osd.5   down    0
6       0.91                    osd.6   up      1

Cluster is not doing well:

# ceph -s
cluster a1a1fa57-d9eb-4eb1-b0de-7729ce7eb10c
 health HEALTH_WARN 1724 pgs degraded; 96 pgs incomplete; 2 pgs 
stale; 96 pgs stuck inactive; 2 pgs stuck stale; 2666 pgs stuck unclean; 
recovery 4/24 objects degraded (16.667%)
 monmap e1: 3 mons at 
{cntrl-01=10.10.103.21:6789/0,cntrl-02=10.10.103.22:6789/0,cntrl-03=10.10.103.23:6789/0}, 
election epoch 18, quorum 0,1,2 cntrl-01,cntrl-02,cntrl-03

 osdmap e345: 9 osds: 5 up, 5 in
  pgmap v16755: 4096 pgs, 2 pools, 12976 kB data, 8 objects
385 MB used, 4654 GB / 4655 GB avail
4/24 objects degraded (16.667%)
  46 active
 627 active+degraded+remapped
1430 active+clean
  52 incomplete
1097 active+degraded
 798 active+remapped
   2 stale+active
  44 remapped+incomplete

I see the following in the logs for the failed OSDs:

2015-07-13 13:58:39.562223 7fafeb12d7c0  0 ceph version 0.80.8 
(69eaad7f8308f21573c604f121956e64679a52a7), process ceph-osd, pid 4906
2015-07-13 13:58:39.592437 7fafeb12d7c0  0 
filestore(/var/lib/ceph/osd/ceph-7) mount detected xfs (libxfs)
2015-07-13 13:58:39.592447 7fafeb12d7c0  1 
filestore(/var/lib/ceph/osd/ceph-7)  disabling 'filestore replica 
fadvise' due to known issues with fadvise(DONTNEED) on xfs
2015-07-13 13:58:39.635624 7fafeb12d7c0  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-7) detect_features: 
FIEMAP ioctl is supported and appears to work
2015-07-13 13:58:39.635633 7fafeb12d7c0  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-7) detect_features: 
FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-07-13 13:58:39.643786 7fafeb12d7c0  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-7) detect_features: 
syncfs(2) syscall fully supported (by glibc and kernel)
2015-07-13 13:58:39.643838 7fafeb12d7c0  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-7) detect_feature: extsize is 
disabled by conf
2015-07-13 13:58:39.792118 7fafeb12d7c0  0 
filestore(/var/lib/ceph/osd/ceph-7) mount: enabling WRITEAHEAD journal 
mode: checkpoint is not enabled
2015-07-13 13:58:40.064871 7fafeb12d7c0  1 journal _open 
/var/lib/ceph/osd/ceph-7/journal fd 20: 131080388608 bytes, block size 
4096 bytes, directio = 1, aio = 1
2015-07-13 13:58:40.064897 7fafeb12d7c0 -1 journal FileJournal::open: 
ondisk fsid 60436b03-ece2-4709-a847-cf46ae9d7481 doesn't match expected 
1d4e4290-0e91-4f53-a477-bfc09990ef72, invalid (someone else's?) journal
2015-07-13 13:58:40.064928 7fafeb12d7c0 -1 
filestore(/var/lib/ceph/osd/ceph-7) mount failed to open journal 
/var/lib/ceph/osd/ceph-7/journal: (22) Invalid argument
2015-07-13 13:58:40.073118 7fafeb12d7c0 -1 ** ERROR: error converting store 
/var/lib/ceph/osd/ceph-7: (22) Invalid argument


Is there something that needs to be done to the journal partition to enable 
sharing between multiple OSDs?  Or is there something else that's 
causing the issue?


Thanks.

--
Rimma

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] slow requests going up and down

2015-07-13 Thread Deneau, Tom
I have a cluster where something happened over the weekend, and successive calls 
to ceph health detail show output like the examples below.
What does it mean when the number of blocked requests goes up and down like 
this?
Some clients are still running successfully.

-- Tom Deneau, AMD



HEALTH_WARN 20 requests are blocked > 32 sec; 2 osds have slow requests
20 ops are blocked > 536871 sec
2 ops are blocked > 536871 sec on osd.5
18 ops are blocked > 536871 sec on osd.7
2 osds have slow requests

HEALTH_WARN 4 requests are blocked > 32 sec; 2 osds have slow requests
4 ops are blocked > 536871 sec
2 ops are blocked > 536871 sec on osd.5
2 ops are blocked > 536871 sec on osd.7
2 osds have slow requests

HEALTH_WARN 27 requests are blocked > 32 sec; 2 osds have slow requests
27 ops are blocked > 536871 sec
2 ops are blocked > 536871 sec on osd.5
25 ops are blocked > 536871 sec on osd.7
2 osds have slow requests

HEALTH_WARN 34 requests are blocked > 32 sec; 2 osds have slow requests
34 ops are blocked > 536871 sec
9 ops are blocked > 536871 sec on osd.5
25 ops are blocked > 536871 sec on osd.7
2 osds have slow requests
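
As a diagnostic sketch (daemon names taken from the output above; these have to
be run on the host where each affected OSD lives), the blocked operations can
usually be inspected through the admin socket:

# ceph daemon osd.5 dump_ops_in_flight
# ceph daemon osd.7 dump_ops_in_flight
# ceph daemon osd.7 dump_historic_ops

dump_ops_in_flight shows what each stuck request is currently waiting on (for
example a sub-op to another OSD), which usually points at the culprit.
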
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph daemons stucked in FUTEX_WAIT syscall

2015-07-13 Thread Simion Rad
Hi ,

I'm running a small CephFS cluster (21 TB, 16 OSDs of different sizes between 
400 GB and 3.5 TB) that is used as a file warehouse (both small and big files).
Every day there are times when a lot of processes running on the client servers 
(using either the FUSE or the kernel client) become stuck in D state, and when I 
strace them I see them waiting in the FUTEX_WAIT syscall.
I can see the same issue on all OSD daemons.
The Ceph version I'm running is Firefly 0.80.10, both on clients and on server 
daemons.
I use ext4 as osd filesystem.
Operating system on servers: Ubuntu 14.04 with kernel 3.13.
Operating system on clients: Ubuntu 12.04 LTS with the HWE option, kernel 3.13.
The osd daemons are using RAID5 virtual disks (6 x 300 GB 10K RPM disks on RAID 
controller Dell PERC H700 with 512MB BBU using write-back mode).
The servers the ceph daemons are running on are also hosting KVM VMs 
(OpenStack Nova).
Because of this unfortunate setup the performance is really bad, but at least I 
shouldn't see this many locking issues (or should I?).
The only thing which temporarily improves the performance is restarting every 
OSD. After such a restart I see some processes on the client machines resume I/O, 
but only for a couple of hours; then the whole process must be repeated.
I cannot afford to run a setup without RAID because there isn't enough RAM left 
for a couple of additional OSD daemons.

The ceph.conf settings I use  :

auth cluster required = cephx
auth service required = cephx
auth client required = cephx
filestore xattr use omap = true
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128
public network = 10.71.13.0/24
cluster network = 10.71.12.0/24

Has anyone else experienced this kind of behaviour (processes stuck in the 
FUTEX_WAIT syscall) when running the Firefly release on Ubuntu 14.04?
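
As a diagnostic sketch only (the PID is a placeholder): it may help to record
where the stuck processes are actually waiting, e.g. list the D-state processes
with their kernel wait channel, then look at one of them from both the kernel
and the userspace side:

# ps -eo pid,stat,wchan:32,args | awk '$2 ~ /^D/'
# cat /proc/<pid>/stack
# strace -f -p <pid>

Comparing the kernel stacks of a few stuck client processes (and of the OSD
threads) would make it easier to tell whether this is a client-, network- or
OSD-side stall.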

Thanks,
Simion Rad.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issue with journal on another drive

2015-07-13 Thread Lionel Bouton
On 07/14/15 00:08, Rimma Iontel wrote:
 Hi all,

 [...]
 Is there something that needs to be done to the journal partition to
 enable sharing between multiple OSDs?  Or is there something else
 that's causing the issue?


IIRC you can't share a volume between multiple OSDs. What you could do
if splitting this partition isn't possible is create an LVM volume group
with it as a single physical volume (change the partition type to lvm,
pvcreate /dev/sda6, vgcreate journal_vg /dev/sda6). Then you can create
a logical volume in it for each of your OSDs (lvcreate -n
osd<n>_journal -L <one_third_of_available_space> journal_vg) and use
them (/dev/journal_vg/osd<n>_journal) in your configuration.
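
Spelled out, and assuming three OSDs with an arbitrary example journal size of
30 GB each, that would look roughly like:

# pvcreate /dev/sda6
# vgcreate journal_vg /dev/sda6
# lvcreate -n osd0_journal -L 30G journal_vg
# lvcreate -n osd1_journal -L 30G journal_vg
# lvcreate -n osd2_journal -L 30G journal_vg

The OSDs that were created against the shared partition would then need to be
zapped and recreated, pointing each one at its own logical volume, e.g.
node:/dev/sdb:/dev/journal_vg/osd0_journal in the ceph-deploy syntax used above.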

Lionel
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Configuring Ceph without DNS

2015-07-13 Thread Nigel Williams

 On 13 Jul 2015, at 4:58 pm, Abhishek Varshney abhishekvrs...@gmail.com 
 wrote:
 I have a requirement wherein I wish to setup Ceph where hostname resolution 
 is not supported and I just have IP addresses to work with. Is there a way 
 through which I can achieve this in Ceph? If yes, what are the caveats 
 associated with that approach?

We’ve been operating our Dumpling (now Firefly) cluster this way since it was 
put into production over 18 months ago, using hosts files to define all the Ceph 
hosts; it works perfectly well.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Configuring Ceph without DNS

2015-07-13 Thread Abhishek Varshney
Hi,

I have a requirement wherein I wish to setup Ceph where hostname resolution
is not supported and I just have IP addresses to work with. Is there a way
through which I can achieve this in Ceph? If yes, what are the caveats
associated with that approach?

PS: I am using ceph-deploy for deployment.

Thanks
Abhishek
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Configuring Ceph without DNS

2015-07-13 Thread Peter Michael Calum
Hi,

Could you try using hosts files instead of DNS? Defining all Ceph hosts 
in /etc/hosts with their IPs
should solve the problem.
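
For example, one line per Ceph node in /etc/hosts on every machine (names and
addresses below are placeholders):

10.0.0.11   ceph-node1
10.0.0.12   ceph-node2
10.0.0.13   ceph-node3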

thanks,
Peter Calum

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of Abhishek 
Varshney
Sent: 13 July 2015 08:59
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Configuring Ceph without DNS

Hi,

I have a requirement wherein I wish to setup Ceph where hostname resolution is 
not supported and I just have IP addresses to work with. Is there a way through 
which I can achieve this in Ceph? If yes, what are the caveats associated with 
that approach?

PS: I am using ceph-deploy for deployment.

Thanks
Abhishek
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Configuring Ceph without DNS

2015-07-13 Thread Abhishek Varshney
Hi Peter and Nigel,

I have tried /etc/hosts and it works perfectly fine! But I am looking for
an alternative (if any) to do away with hostnames completely and just use
IP addresses instead.

Thanks
Abhishek

On 13 July 2015 at 12:40, Nigel Williams nigel.d.willi...@gmail.com wrote:


  On 13 Jul 2015, at 4:58 pm, Abhishek Varshney abhishekvrs...@gmail.com
 wrote:
  I have a requirement wherein I wish to setup Ceph where hostname
 resolution is not supported and I just have IP addresses to work with. Is
 there a way through which I can achieve this in Ceph? If yes, what are the
 caveats associated with that approach?

 We’ve been operating our Dumpling (now Firefly) cluster this way since it
 was put into production over 18-months ago, using host files to define all
 the Ceph hosts, works perfectly well.


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs without admin key

2015-07-13 Thread John Spray
Yes: clients need an MDS key that says 'allow', and an OSD key that 
permits access to the RADOS pool you're using as your CephFS data pool.


If you're already trying that and getting an error, please post the caps 
you're using.


Thanks,
John
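
For reference, a restricted client key along those lines might be created like
this (the client name, pool name, keyring path and mount point are placeholders,
and the exact caps may need tuning for your release):

# ceph auth get-or-create client.fsuser mon 'allow r' mds 'allow' osd 'allow rw pool=cephfs_data' -o /etc/ceph/ceph.client.fsuser.keyring
# ceph-fuse --id fsuser -k /etc/ceph/ceph.client.fsuser.keyring /mnt/cephfs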


On 12/07/2015 14:12, Bernhard Duebi wrote:

Hi,

I'm new to Ceph. I set up a small cluster and successfully connected kvm/qemu to 
use block devices. Now I'm experimenting with CephFS. I use ceph-fuse on SLES12 
(Ceph 0.94). I can mount the file system and write to it, but only when the 
admin keyring is present, which gives the FS client full admin privileges.
For kvm/qemu I can limit access by creating a key with limited privileges. I was 
googling whether the same is possible for CephFS. I found some answers, but none of 
them work because I always get permission denied.

Any hints how the key should look like?

Thanks
Bernhard



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Configuring Ceph without DNS

2015-07-13 Thread Wido den Hollander


On 13-07-15 09:13, Abhishek Varshney wrote:
 Hi Peter and Nigel,
 
 I have tried /etc/hosts and it works perfectly fine! But I am looking
 for an alternative (if any) to do away completely with hostnames and
 just use IP addresses instead.
 

It's just that ceph-deploy wants DNS, but if you go for manual
bootstrapping there is no requirement for DNS at all.

Ceph internally doesn't do anything with DNS; it has the monitor
addresses hardcoded in the monmap, and that is authoritative for the cluster.

Wido
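
For illustration, a hand-written ceph.conf can refer to the monitors purely by
IP; a minimal sketch with placeholder values (the names in mon initial members
are monitor IDs, not hostnames, so no resolution is involved):

[global]
fsid = <your cluster fsid>
mon initial members = a, b, c
mon host = 192.0.2.11, 192.0.2.12, 192.0.2.13
auth cluster required = cephx
auth service required = cephx
auth client required = cephx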

 Thanks
 Abhishek
 
 On 13 July 2015 at 12:40, Nigel Williams nigel.d.willi...@gmail.com
 mailto:nigel.d.willi...@gmail.com wrote:
 
 
  On 13 Jul 2015, at 4:58 pm, Abhishek Varshney abhishekvrs...@gmail.com 
 mailto:abhishekvrs...@gmail.com wrote:
  I have a requirement wherein I wish to setup Ceph where hostname 
 resolution is not supported and I just have IP addresses to work with. Is 
 there a way through which I can achieve this in Ceph? If yes, what are the 
 caveats associated with that approach?
 
 We’ve been operating our Dumpling (now Firefly) cluster this way
 since it was put into production over 18-months ago, using host
 files to define all the Ceph hosts, works perfectly well.
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com mailto:ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-13 Thread John Spray



On 13/07/2015 04:02, Eric Eastman wrote:

Hi John,

I am seeing this problem with Ceph v9.0.1 with the v4.1 kernel on all
nodes.  This setup uses 4 CephFS client systems. They all have
the kernel driver version of CephFS loaded, but none are mounting the
file system. All 4 clients are using the libcephfs VFS interface to
Ganesha NFS (V2.2.0-2) and Samba (Version 4.3.0pre1-GIT-0791bb0) to
share out the Ceph file system.

# ceph -s
 cluster 6d8aae1e-1125-11e5-a708-001b78e265be
  health HEALTH_WARN
 4 near full osd(s)
 mds0: Client ede-c2-gw01 failing to respond to cache pressure
 mds0: Client ede-c2-gw02:cephfs failing to respond to cache 
pressure
 mds0: Client ede-c2-gw03:cephfs failing to respond to cache 
pressure
  monmap e1: 3 mons at
{ede-c2-mon01=10.15.2.121:6789/0,ede-c2-mon02=10.15.2.122:6789/0,ede-c2-mon03=10.15.2.123:6789/0}
 election epoch 8, quorum 0,1,2
ede-c2-mon01,ede-c2-mon02,ede-c2-mon03
  mdsmap e912: 1/1/1 up {0=ede-c2-mds03=up:active}, 2 up:standby
  osdmap e272: 8 osds: 8 up, 8 in
   pgmap v225264: 832 pgs, 4 pools, 188 GB data, 5173 kobjects
 212 GB used, 48715 MB / 263 GB avail
  832 active+clean
   client io 1379 kB/s rd, 20653 B/s wr, 98 op/s


It would help if we knew whether it's the kernel clients or the 
userspace clients that are generating the warnings here.  You've 
probably already done this, but I'd get rid of any unused kernel client 
mounts to simplify the situation.


We haven't tested the cache limit enforcement with NFS Ganesha, so there 
is a decent chance that it is broken.  The Ganesha FSAL is doing 
ll_get/ll_put reference counting on inodes, so it seems quite possible 
that its cache is pinning things that we would otherwise be evicting in 
response to cache pressure.  You mention samba as well,


You can see if the MDS cache is indeed exceeding its limit by looking at 
the output of:

ceph daemon mds.<daemon id> perf dump mds

...where the inodes value tells you how many are in the cache, vs. 
inode_max.
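
As a concrete sketch, using the active MDS name from the ceph -s output above
(run on the host where that MDS lives):

# ceph daemon mds.ede-c2-mds03 perf dump mds | grep -E '"inodes"|"inode_max"'

If inodes stays well above inode_max, the cache really is exceeding its limit.
mds_cache_size can also be raised at runtime, e.g. with
ceph daemon mds.ede-c2-mds03 config set mds_cache_size 500000, though that only
treats the symptom.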


If you can, it would be useful to boil this down to a straightforward 
test case: if you start with a healthy cluster, mount a single ganesha 
client, and do your 5 million file procedure, do you get the warning?  
Same for samba/kernel mounts -- this is likely to be a client side 
issue, so we need to confirm which client is misbehaving.


Cheers,
John



# cat /proc/version
Linux version 4.1.0-040100-generic (kernel@gomeisa) (gcc version 4.6.3
(Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201506220235 SMP Mon Jun 22 06:36:19
UTC 2015

# ceph -v
ceph version 9.0.1 (997b3f998d565a744bfefaaf34b08b891f8dbf64)

The systems are all running Ubuntu Trusty upgraded to the 4.1 kernel.
These are all physical machines, no VMs.  The test run that caused the
problem was creating and verifying 5 million small files.

We have some tools that flag when Ceph is in a WARN state so it would
be nice to get rid of this warning.

Please let me know what additional information you need.

Thanks,

Eric

On Fri, Jul 10, 2015 at 4:19 AM, 谷枫 feiche...@gmail.com wrote:

Thank you John,
All my servers are Ubuntu 14.04 with the 3.16 kernel.
Not all clients show this problem; the cluster seems to be functioning well
now.
As you say, I will change the mds_cache_size to 500000 from 100000 and run a
test. Thanks again!

2015-07-10 17:00 GMT+08:00 John Spray john.sp...@redhat.com:


This is usually caused by use of older kernel clients.  I don't remember
exactly what version it was fixed in, but iirc we've seen the problem with
3.14 and seen it go away with 3.18.

If your system is otherwise functioning well, this is not a critical error
-- it just means that the MDS might not be able to fully control its memory
usage (i.e. it can exceed mds_cache_size).

John



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com