[ceph-users] Multiple OSD's in Each node with replica 2

2015-03-23 Thread Azad Aliyar
I have a doubt. In a scenario (3 nodes x 4 OSDs each x 2 replicas), I tested
with a node down and, as long as there was space available, all objects were
still there.

Is it possible for all replicas of an object to be saved on the same node?

Is it possible to lose any?

Is there a mechanism that prevents replicas from being stored on another OSD
in the same node?

I would love for someone to answer this; any information is highly appreciated.
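
For reference, the separation of replicas across nodes is controlled by the
pool's CRUSH rule rather than by the OSDs themselves. A minimal sketch of a
firefly-era default replicated rule, assuming the standard host/OSD CRUSH
hierarchy (the rule and bucket names here are illustrative):

   rule replicated_ruleset {
           ruleset 0
           type replicated
           min_size 1
           max_size 10
           step take default
           # pick each replica from a different host bucket
           step chooseleaf firstn 0 type host
           step emit
   }

With "chooseleaf ... type host", two replicas of the same object should not
land on OSDs of the same node (as long as enough hosts are up); with
"type osd" instead, they can.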
-- 
   Warm Regards,  Azad Aliyar
 Linux Server Engineer
 *Email* :  azad.ali...@sparksupport.com   *|*   *Skype* :   spark.azad
http://www.sparksupport.com http://www.sparkmycloud.com
https://www.facebook.com/sparksupport
http://www.linkedin.com/company/244846  https://twitter.com/sparksupport
3rd Floor, Leela Infopark, Phase -2,Kakanad, Kochi-30, Kerala, India
*Phone*:+91 484 6561696 , *Mobile*:91-8129270421.   *Confidentiality
Notice:* Information in this e-mail is proprietary to SparkSupport. and is
intended for use only by the addressed, and may contain information that is
privileged, confidential or exempt from disclosure. If you are not the
intended recipient, you are notified that any use of this information in
any manner is strictly prohibited. Please delete this mail & notify us
immediately at i...@sparksupport.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [SPAM] Changing pg_num => RBD VM down !

2015-03-16 Thread Azad Aliyar
May I know your Ceph version? The latest version of firefly, 0.80.9, has
patches to avoid excessive data migration during reweighting of OSDs. You may
need to set a tunable in order to make this patch active.

This is a bugfix release for firefly.  It fixes a performance regression
in librbd, an important CRUSH misbehavior (see below), and several RGW
bugs.  We have also backported support for flock/fcntl locks to ceph-fuse
and libcephfs.

We recommend that all Firefly users upgrade.

For more detailed information, see
  http://docs.ceph.com/docs/master/_downloads/v0.80.9.txt

Adjusting CRUSH maps
--------------------

* This point release fixes several issues with CRUSH that trigger
  excessive data migration when adjusting OSD weights.  These are most
  obvious when a very small weight change (e.g., a change from 0 to
  .01) triggers a large amount of movement, but the same set of bugs
  can also lead to excessive (though less noticeable) movement in
  other cases.

  However, because the bug may already have affected your cluster,
  fixing it may trigger movement *back* to the more correct location.
  For this reason, you must manually opt-in to the fixed behavior.

  In order to set the new tunable to correct the behavior::

 ceph osd crush set-tunable straw_calc_version 1

  Note that this change will have no immediate effect.  However, from
  this point forward, any 'straw' bucket in your CRUSH map that is
  adjusted will get non-buggy internal weights, and that transition
  may trigger some rebalancing.
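
  One way to confirm the tunable took effect, a sketch (the exact decompiled
  output may vary by version)::

     ceph osd getcrushmap -o /tmp/cm
     crushtool -d /tmp/cm -o /tmp/cm.txt
     grep straw_calc_version /tmp/cm.txt   # expect a line like: tunable straw_calc_version 1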

  You can estimate how much rebalancing will eventually be necessary
  on your cluster with::

     ceph osd getcrushmap -o /tmp/cm
     crushtool -i /tmp/cm --num-rep 3 --test --show-mappings > /tmp/a 2>&1
     crushtool -i /tmp/cm --set-straw-calc-version 1 -o /tmp/cm2
     crushtool -i /tmp/cm2 --reweight -o /tmp/cm2
     crushtool -i /tmp/cm2 --num-rep 3 --test --show-mappings > /tmp/b 2>&1
     wc -l /tmp/a                          # num total mappings
     diff -u /tmp/a /tmp/b | grep -c ^+    # num changed mappings

   Divide the number of changed mappings by the total number of lines in
   /tmp/a to get the fraction that will move.  We've found that most clusters
   are under 10%.
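
   As a purely hypothetical example: if wc -l reports 51200 total mappings in
   /tmp/a and the diff reports 3100 changed, then roughly 3100/51200, i.e.
   about 6%, of the mappings will eventually move.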

   You can force all of this rebalancing to happen at once with::

 ceph osd crush reweight-all

   Otherwise, it will happen at some unknown point in the future when
   CRUSH weights are next adjusted.

Notable Changes
---------------

* ceph-fuse: flock, fcntl lock support (Yan, Zheng, Greg Farnum)
* crush: fix straw bucket weight calculation, add straw_calc_version
  tunable (#10095 Sage Weil)
* crush: fix tree bucket (Rongzu Zhu)
* crush: fix underflow of tree weights (Loic Dachary, Sage Weil)
* crushtool: add --reweight (Sage Weil)
* librbd: complete pending operations before closing image (#10299 Jason
  Dillaman)
* librbd: fix read caching performance regression (#9854 Jason Dillaman)
* librbd: gracefully handle deleted/renamed pools (#10270 Jason Dillaman)
* mon: fix dump of chooseleaf_vary_r tunable (Sage Weil)
* osd: fix PG ref leak in snaptrimmer on peering (#10421 Kefu Chai)
* osd: handle no-op write with snapshot (#10262 Sage Weil)
* radosgw-admi




On 03/16/2015 12:37 PM, Alexandre DERUMIER wrote:
 VMs are running on the same nodes as the OSDs.
 Are you sure you didn't hit some kind of out-of-memory condition?
 PG rebalancing can be memory hungry (depending on how many OSDs you have).

2 OSDs per host, and 5 hosts in this cluster.
hosts h
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Doesn't Support Qcow2 Disk images

2015-03-12 Thread Azad Aliyar
Community, please explain the second warning on this page:

http://ceph.com/docs/master/rbd/rbd-openstack/

Important: Ceph doesn’t support QCOW2 for hosting a virtual machine disk.
Thus, if you want to boot virtual machines in Ceph (ephemeral backend or
boot from volume), the Glance image format must be RAW.
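
In practice this means converting images to RAW before uploading them to
Glance. A minimal sketch, assuming an existing image named server.qcow2
(file names are illustrative, and the exact Glance CLI flags depend on your
client version):

    qemu-img convert -f qcow2 -O raw server.qcow2 server.raw
    glance image-create --name "server-raw" --disk-format raw \
        --container-format bare --file server.raw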


-- 
   Warm Regards,  Azad Aliyar
 Linux Server Engineer
 *Email* :  azad.ali...@sparksupport.com   *|*   *Skype* :   spark.azad
http://www.sparksupport.com http://www.sparkmycloud.com
https://www.facebook.com/sparksupport
http://www.linkedin.com/company/244846  https://twitter.com/sparksupport
3rd Floor, Leela Infopark, Phase -2,Kakanad, Kochi-30, Kerala, India
*Phone*:+91 484 6561696 , *Mobile*:91-8129270421.   *Confidentiality
Notice:* Information in this e-mail is proprietary to SparkSupport. and is
intended for use only by the addressed, and may contain information that is
privileged, confidential or exempt from disclosure. If you are not the
intended recipient, you are notified that any use of this information in
any manner is strictly prohibited. Please delete this mail & notify us
immediately at i...@sparksupport.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph BIG outage: 200+ OSDs are down, OSD cannot create thread

2015-03-09 Thread Azad Aliyar
*Check Max Threadcount:* If you have a node with a lot of OSDs, you may be
hitting the default maximum number of threads (usually 32k), especially
during recovery. You can use sysctl to check whether raising the limit to the
maximum possible number of threads (4194303) helps. For example:

sysctl -w kernel.pid_max=4194303

 If increasing the maximum thread count resolves the issue, you can make it
permanent by including a kernel.pid_max setting in the /etc/sysctl.conf
file. For example:

kernel.pid_max = 4194303
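
A quick sanity check before and after the change, a sketch (the thread count
below is a rough total across all processes on the node):

    sysctl kernel.pid_max      # current limit
    ps -eLf | wc -l            # approximate number of threads currently in use
    echo 'kernel.pid_max = 4194303' >> /etc/sysctl.conf
    sysctl -p                  # reload settings from /etc/sysctl.conf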


On Mon, Mar 9, 2015 at 4:11 PM, Karan Singh karan.si...@csc.fi wrote:

 Hello Community, I need help to fix a long-standing Ceph problem.

 The cluster is unhealthy and multiple OSDs are DOWN. When I try to restart
 the OSDs I get this error:


 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
 common/Thread.cc: 129: FAILED assert(ret == 0)


 *Environment*: 4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5,
 3.17.2-1.el6.elrepo.x86_64

 Tried upgrading from 0.80.7 to 0.80.8, but no luck.

 Tried the CentOS stock kernel 2.6.32, but no luck.

 Memory is not a problem; more than 150 GB is free.


 Has anyone ever faced this problem??

 *Cluster status*

     cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
      health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete;
 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive; 8938 pgs stuck
 stale; 10320 pgs stuck unclean; recovery 6061/31080 objects degraded
 (19.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02,
 mon.pouta-s03
      monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0},
 election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
      osdmap e26633: 239 osds: 85 up, 196 in
       pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
             4699 GB used, 707 TB / 711 TB avail
             6061/31080 objects degraded (19.501%)
                   14 down+remapped+peering
                   39 active
                 3289 active+clean
                  547 peering
                  663 stale+down+peering
                  705 stale+active+remapped
                    1 active+degraded+remapped
                    1 stale+down+incomplete
                  484 down+peering
                  455 active+remapped
                 3696 stale+active+degraded
                    4 remapped+peering
                   23 stale+down+remapped+peering
                   51 stale+active
                 3637 active+degraded
                 3799 stale+active+clean

 *OSD Logs*

 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
 common/Thread.cc: 129: FAILED assert(ret == 0)

  ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
  1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
  2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
  3: (Accepter::entry()+0x265) [0xb5c635]
  4: /lib64/libpthread.so.0() [0x3c8a6079d1]
  5: (clone()+0x6d) [0x3c8a2e89dd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
 to interpret this.


 *More information at Ceph Tracker Issue:*
 http://tracker.ceph.com/issues/10988#change-49018


 
 Karan Singh
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
   Warm Regards,  Azad Aliyar
 Linux Server Engineer
 *Email* :  azad.ali...@sparksupport.com   *|*   *Skype* :   spark.azad
http://www.sparksupport.com http://www.sparkmycloud.com
https://www.facebook.com/sparksupport
http://www.linkedin.com/company/244846  https://twitter.com/sparksupport
3rd Floor, Leela Infopark, Phase -2,Kakanad, Kochi-30, Kerala, India
*Phone*:+91 484 6561696 , *Mobile*:91-8129270421.   *Confidentiality
Notice:* Information in this e-mail is proprietary to SparkSupport. and is
intended for use only by the addressed, and may contain information that is
privileged, confidential or exempt from disclosure. If you are not the
intended recipient, you are notified that any use of this information in
any manner is strictly prohibited. Please delete this mail & notify us
immediately at i...@sparksupport.com

Re: [ceph-users] Ceph BIG outage: 200+ OSDs are down, OSD cannot create thread

2015-03-09 Thread Azad Aliyar
Great Karan.

On Mon, Mar 9, 2015 at 9:32 PM, Karan Singh karan.si...@csc.fi wrote:

 Thanks guys, kernel.pid_max=4194303 did the trick.

 - Karan -

 On 09 Mar 2015, at 14:48, Christian Eichelmann 
 christian.eichelm...@1und1.de wrote:

 Hi Karan,

 as you actually write in your own book, the problem is the sysctl
 setting kernel.pid_max. I've seen in your bug report that you were
 setting it to 65536, which is still too low for high-density hardware.

 In our cluster, one OSD server has about 66,000 threads in an idle
 situation (60 OSDs per server). The number of threads increases when you
 increase the number of placement groups in the cluster, which I think
 is what triggered your problem.

 Set the kernel.pid_max setting to 4194303 (the maximum), as Azad
 Aliyar suggested, and the problem should be gone.
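
 A quick way to see how many threads the ceph-osd processes alone account
 for on a node, a sketch (numbers will of course vary):

    ps -o nlwp= -C ceph-osd | awk '{sum += $1} END {print sum}'   # total threads of all ceph-osd processes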

 Regards,
 Christian

 On 09.03.2015 at 11:41, Karan Singh wrote:

 Hello Community, I need help to fix a long-standing Ceph problem.

 The cluster is unhealthy and multiple OSDs are DOWN. When I try to
 restart the OSDs I get this error:


 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
 common/Thread.cc: 129: FAILED assert(ret == 0)


 *Environment*: 4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5,
 3.17.2-1.el6.elrepo.x86_64

 Tried upgrading from 0.80.7 to 0.80.8, but no luck.

 Tried the CentOS stock kernel 2.6.32, but no luck.

 Memory is not a problem; more than 150 GB is free.


 Has anyone ever faced this problem??

 *Cluster status*

     cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
      health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete;
 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive; 8938 pgs stuck
 stale; 10320 pgs stuck unclean; recovery 6061/31080 objects degraded
 (19.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02,
 mon.pouta-s03
      monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0},
 election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
      osdmap e26633: 239 osds: 85 up, 196 in
       pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
             4699 GB used, 707 TB / 711 TB avail
             6061/31080 objects degraded (19.501%)
                   14 down+remapped+peering
                   39 active
                 3289 active+clean
                  547 peering
                  663 stale+down+peering
                  705 stale+active+remapped
                    1 active+degraded+remapped
                    1 stale+down+incomplete
                  484 down+peering
                  455 active+remapped
                 3696 stale+active+degraded
                    4 remapped+peering
                   23 stale+down+remapped+peering
                   51 stale+active
                 3637 active+degraded
                 3799 stale+active+clean

 *OSD Logs*

 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
 common/Thread.cc: 129: FAILED assert(ret == 0)

  ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
  1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
  2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
  3: (Accepter::entry()+0x265) [0xb5c635]
  4: /lib64/libpthread.so.0() [0x3c8a6079d1]
  5: (clone()+0x6d) [0x3c8a2e89dd]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
 to interpret this.


 *More information at Ceph Tracker Issue:*
 http://tracker.ceph.com/issues/10988#change-49018


 
 Karan Singh
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



 --
 Christian Eichelmann
 System Administrator

 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
 Brauerstraße 48 · DE-76135 Karlsruhe
 Phone: +49 721 91374-8026
 christian.eichelm...@1und1.de

 Amtsgericht Montabaur / HRB 6484
 Executive Board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
 Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
 Chairman of the Supervisory Board: Michael Scheeren





-- 
   Warm Regards,  Azad Aliyar
 Linux Server Engineer
 *Email* :  azad.ali...@sparksupport.com   *|*   *Skype* :   spark.azad
http://www.sparksupport.com http

[ceph-users] Multiple OSD's in Each node with replica 2

2015-03-06 Thread Azad Aliyar
I have a doubt. In a scenario (3 nodes x 4 OSDs each x 2 replicas), I tested
with a node down and, as long as there was space available, all objects were
still there.

Is it possible for all replicas of an object to be saved on the same node?

Is it possible to lose any?

Is there a mechanism that prevents replicas from being stored on another OSD
in the same node?

I would love for someone to answer this; any information is highly appreciated.

-- 
   Warm Regards,  Azad Aliyar
 Linux Server Engineer
 *Email* :  azad.ali...@sparksupport.com   *|*   *Skype* :   spark.azad
http://www.sparksupport.com http://www.sparkmycloud.com
https://www.facebook.com/sparksupport
http://www.linkedin.com/company/244846  https://twitter.com/sparksupport
3rd Floor, Leela Infopark, Phase -2,Kakanad, Kochi-30, Kerala, India
*Phone*:+91 484 6561696 , *Mobile*:91-8129270421.   *Confidentiality
Notice:* Information in this e-mail is proprietary to SparkSupport. and is
intended for use only by the addressed, and may contain information that is
privileged, confidential or exempt from disclosure. If you are not the
intended recipient, you are notified that any use of this information in
any manner is strictly prohibited. Please delete this mail & notify us
immediately at i...@sparksupport.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com