Re: [ceph-users] [Ceph-community] working ceph.conf file?

2014-08-08 Thread Matt Harlum

One thing to add: I had a similar issue with manually created OSDs not coming 
back up after a reboot; they were being mounted but not started.
To resolve this I had to create a file named sysvinit in each OSD’s data directory.
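For reference, a minimal sketch of what I mean, assuming the default data path and an example OSD id of 10:

touch /var/lib/ceph/osd/ceph-10/sysvinit

With that marker in place the init script should treat the OSD as sysvinit-managed and start it on boot.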

Regards,
Matt

On 9 Aug 2014, at 7:57 am, Andrew Woodward  wrote:

> Dan,
> 
> It is not necessary to specify the OSD data in ceph.conf anymore. Ceph has 
> two auto-start mechanisms besides this method:
> 
> udev rules:
> ceph ships a udev rule that scans for partitions whose GPT type code is set to 
> one of the Ceph-specific GUIDs and attempts to mount (and activate) them; the 
> type code is set with:
> sgdisk --typecode=<partnum>:<type GUID> /dev/<disk>
> 
> The exact GUIDs to use can be found in 
> https://github.com/ceph/ceph/blob/master/udev/95-ceph-osd.rules. These are 
> set automatically by ceph-disk (or ceph-deploy) if it creates the partition 
> from an empty disk; if it does not, you have to set them by hand, although it 
> should probably do this for you, or at least tell you that you need to.
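> As an illustration only (the GUIDs below are the OSD-data and journal type 
> codes from that rules file; /dev/sdb and the partition numbers are made-up 
> examples, so check the file for your release):
> 
> # mark partition 1 as a ceph OSD data partition
> sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/sdb
> # mark partition 2 as a ceph journal partition
> sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
> # re-read the partition table so the udev rule fires
> partprobe /dev/sdb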
> 
> ceph init script:
> the ceph init script will scan /var/lib/ceph/osd (or the otherwise configured 
> location) for <cluster>-<id> folders (the default cluster name is ceph) and 
> attempt to start the osd service for each of them if they look correct
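> For example, on a host using the defaults you might see something like this 
> (the OSD ids are just examples):
> 
> $ ls /var/lib/ceph/osd/
> ceph-10  ceph-17  ceph-18
> $ service ceph start osd.10
> 
> and "service ceph start" with no daemon argument should walk all of those 
> directories.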
> 
> Lastly, and possibly the most annoying option, you can configure each OSD 
> and its path in ceph.conf. I don't have any good examples, as the two prior 
> methods are more flexible and require less config.
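> If you do want to go that route, a minimal sketch of a per-OSD stanza (the id, 
> host and device below are only examples; adjust for your cluster):
> 
> [osd.0]
>     host = your-osd-host
>     devs = /dev/sdb1
>     osd data = /var/lib/ceph/osd/ceph-0
>     osd journal = /var/lib/ceph/osd/ceph-0/journal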
> 
> 
> 
> 
> On Fri, Aug 8, 2014 at 8:53 AM, O'Reilly, Dan  wrote:
> Does anybody have a good sample ceph.conf file I can use for reference?  I’m 
> having a problem where OSDs won’t come back up after a system reboot.
> 
>  
> 
> Dan O'Reilly
> 
> UNIX Systems Administration
> 
> 
> 
> 9601 S. Meridian Blvd.
> 
> Englewood, CO 80112
> 
> 720-514-6293
> 
>  
> 
>  
> 
> 
> ___
> Ceph-community mailing list
> ceph-commun...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
> 
> 
> 
> 
> -- 
> Andrew
> Mirantis
> Ceph community

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Can't start OSD

2014-08-08 Thread Matt Harlum
Hi,

Can you run: ls -lah /var/lib/ceph/osd/ceph-10/journal

The log is saying it can’t find the journal.
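For reference, on a ceph-disk/ceph-deploy prepared OSD that path is normally a 
symlink to the journal partition, so something like the following is worth 
checking (device names here are only examples):

ls -lah /var/lib/ceph/osd/ceph-10/journal
# typically: journal -> /dev/disk/by-partuuid/<uuid>  (or a raw /dev/sdX2)
readlink -f /var/lib/ceph/osd/ceph-10/journal

If the symlink target doesn’t exist after a reboot, the journal device or its 
partition simply isn’t visible yet.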

Regards,
Matt 

On 9 Aug 2014, at 12:51 am, O'Reilly, Dan  wrote:

> I’m afraid I don’t know exactly how to interpret this, but after a reboot:
>  
> 2014-08-08 08:48:44.616005 7f0c3b1447a0  0 ceph version 0.80.1 
> (a38fe1169b6d2ac98b427334c12d7cf81f809b74), process ceph-osd, pid 2978
> 2014-08-08 08:48:44.635680 7f0c3b1447a0  0 
> filestore(/var/lib/ceph/osd/ceph-10) mount detected xfs (libxfs)
> 2014-08-08 08:48:44.635730 7f0c3b1447a0  1 
> filestore(/var/lib/ceph/osd/ceph-10)  disabling 'filestore replica fadvise' 
> due to known issues with fadvise(DONTNEED) on xfs
> 2014-08-08 08:48:44.681911 7f0c3b1447a0  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-10) detect_features: FIEMAP 
> ioctl is supported and appears to work
> 2014-08-08 08:48:44.681959 7f0c3b1447a0  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-10) detect_features: FIEMAP 
> ioctl is disabled via 'filestore fiemap' config option
> 2014-08-08 08:48:44.748483 7f0c3b1447a0  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-10) detect_features: 
> syscall(SYS_syncfs, fd) fully supported
> 2014-08-08 08:48:44.748605 7f0c3b1447a0  0 
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-10) detect_feature: extsize is 
> supported
> 2014-08-08 08:48:44.889826 7f0c3b1447a0  0 
> filestore(/var/lib/ceph/osd/ceph-10) mount: enabling WRITEAHEAD journal mode: 
> checkpoint is not enabled
> 2014-08-08 08:48:45.064198 7f0c3b1447a0 -1 
> filestore(/var/lib/ceph/osd/ceph-10) mount failed to open journal 
> /var/lib/ceph/osd/ceph-10/journal: (2) No such file or directory
> 2014-08-08 08:48:45.074220 7f0c3b1447a0 -1  ** ERROR: error converting store 
> /var/lib/ceph/osd/ceph-10: (2) No such file or directory
> 2014-08-08 08:49:19.957725 7f2c40c1a7a0  0 ceph version 0.80.1 
> (a38fe1169b6d2ac98b427334c12d7cf81f809b74), process ceph-osd, pid 4707
> 2014-08-08 08:49:19.973896 7f2c40c1a7a0  0 
> filestore(/var/lib/ceph/osd/ceph-10) mount detected xfs (libxfs)
> 2014-08-08 08:49:19.973931 7f2c40c1a7a0  1 
> filestore(/var/lib/ceph/osd/ceph-10)  disabling 'filestore replica fadvise' 
> due to known issues with fadvise(DONTNEED) on xfs
> 2014-08-08 08:49:20.016413 7f2c40c1a7a0  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-10) detect_features: FIEMAP 
> ioctl is supported and appears to work
> 2014-08-08 08:49:20.016444 7f2c40c1a7a0  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-10) detect_features: FIEMAP 
> ioctl is disabled via 'filestore fiemap' config option
> 2014-08-08 08:49:20.083052 7f2c40c1a7a0  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-10) detect_features: 
> syscall(SYS_syncfs, fd) fully supported
> 2014-08-08 08:49:20.083179 7f2c40c1a7a0  0 
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-10) detect_feature: extsize is 
> supported
> 2014-08-08 08:49:20.134213 7f2c40c1a7a0  0 
> filestore(/var/lib/ceph/osd/ceph-10) mount: enabling WRITEAHEAD journal mode: 
> checkpoint is not enabled
> 2014-08-08 08:49:20.136710 7f2c40c1a7a0 -1 
> filestore(/var/lib/ceph/osd/ceph-10) mount failed to open journal 
> /var/lib/ceph/osd/ceph-10/journal: (2) No such file or directory
> 2014-08-08 08:49:20.146797 7f2c40c1a7a0 -1  ** ERROR: error converting store 
> /var/lib/ceph/osd/ceph-10: (2) No such file or directory
>  
> From: German Anders [mailto:gand...@despegar.com] 
> Sent: Friday, August 08, 2014 8:23 AM
> To: O'Reilly, Dan
> Cc: Karan Singh; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Can't start OSD
>  
> How about the logs? Is there anything in them?
> 
> ls /var/log/ceph/
>  
> German Anders
>  
> --- Original message --- 
> Subject: Re: [ceph-users] Can't start OSD 
> From: "O'Reilly, Dan"  
> To: Karan Singh  
> Cc: ceph-users@lists.ceph.com  
> Date: Friday, 08/08/2014 10:53
> 
> 
> Nope.  Nothing works.  This is VERY frustrating.
>  
> What happened:
>  
> - I rebooted the box, simulating a system failure.
> - When the system came back up, ceph wasn’t started and the osd volumes weren’t mounted.
> - I did a “service ceph start osd” and the ceph processes don’t start.
> - I did a “ceph-deploy activate” on the devices, so they’re mounted. “service ceph start” still doesn’t start anything.
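> To confirm the mounts are actually there (default paths assumed), something like:
> 
> mount | grep /var/lib/ceph/osd   # confirm the data partitions really are mounted
> ls /var/lib/ceph/osd/ceph-*/     # each should contain whoami, fsid, journal, etc.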
>  
> Right now:
>  
> # service ceph restart
> === osd.18 ===
> === osd.18 ===
> Stopping Ceph osd.18 on tm1cldosdl04...done
> === osd.18 ===
> create-or-move updated item name 'osd.18' weight 0.45 at location 
> {host=tm1cldosdl04,root=default} to crush map
> Starting Ceph osd.18 on tm1cldosdl04...
> starting osd.18 at :/0 osd_data /var/lib/ceph/osd/ceph-18 
> /var/lib/ceph/osd/ceph-18/journal
> === osd.17 ===
> === osd.17 ===
> Stopping Ceph osd.17 on tm1cldosdl04...done
> === osd.17 ===
> create-or-move updated item name 'osd.17' weight 0.45 at location 
> {host=tm1cldosdl04,root=default} to crush map
> Starting Ceph osd.17 on tm1cldosdl04...
> starting o

Re: [ceph-users] Placement groups forever in "creating" state and dont map to OSD

2014-08-04 Thread Matt Harlum
Hi

What distributions are your machines using? And is SELinux enabled on them?

I ran into the same issue once; I had to disable SELinux on all the machines 
and then reinstall.
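If it helps, a quick way to check and temporarily rule SELinux out (a sketch for 
RHEL/CentOS-style systems; adjust for your distro):

getenforce      # prints Enforcing / Permissive / Disabled
setenforce 0    # switch to Permissive until the next reboot
# to make it permanent, set SELINUX=permissive (or disabled) in /etc/selinux/config and reboot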


On 4 Aug 2014, at 5:25 pm, yogesh_d...@dell.com wrote:

> Dell - Internal Use - Confidential
> 
> Matt,
> Thanks for responding.
> As suggested, I tried to set replication to 2x using the commands you provided:
>  
> $ceph osd pool set data size 2
> $ceph osd pool set data min_size 2
> $ceph osd pool set rbd size 2
> $ceph osd pool set rbd min_size 2
> $ceph osd pool set metadata size 2
> $ceph osd pool set metadata min_size 2
>  
> It told me –
> set pool 0 size to 2
> set pool 0 min_size to 2
> set pool 2 size to 2
> set pool 2 min_size to 2
> set pool 1 size to 2
> set pool 1 min_size to 2
>  
> To verify that pool size had indeed changed – I checked again
>  
> $ceph osd dump | grep 'rep size'
> pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 
> pgp_num 64 last_change 90 owner 0 crash_replay_interval 45
> pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 64 
> pgp_num 64 last_change 94 owner 0
> pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 64 
> pgp_num 64 last_change 92 owner 0
> pool 3 'datapool' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 10 
> pgp_num 10 last_change 38 owner 0
>  
>  
> However, my cluster is still in the same state:
>  
> $ceph -s
>health HEALTH_WARN 202 pgs stuck inactive; 202 pgs stuck unclean
>monmap e1: 1 mons at {slesceph1=160.110.73.200:6789/0}, election epoch 1, 
> quorum 0 slesceph1
>osdmap e106: 2 osds: 2 up, 2 in
> pgmap v171: 202 pgs: 202 creating; 0 bytes data, 10306 MB used, 71573 MB 
> / 81880 MB avail
>mdsmap e1: 0/0/1 up
> Yogesh Devi,
> Architect,  Dell Cloud Clinical Archive
> Dell 
>  
>  
> Land Phone +91 80 28413000 Extension – 2781
> Hand Phone+91 99014 71082
>  
> From: Matt Harlum [mailto:m...@cactuar.net] 
> Sent: Saturday, August 02, 2014 6:01 AM
> To: Devi, Yogesh
> Cc: Pulicken, Antony
> Subject: Re: [ceph-users] Placement groups forever in "creating" state and 
> dont map to OSD
>  
> Hi Yogesh,
>  
> By default ceph is configured to create 3 replicas of the data; with only 
> two OSDs it cannot create all of the pgs required to do this.
>  
> You will need to change the replication to 2x for your pools; this can be 
> done like so:
> ceph osd pool set data size 2
> ceph osd pool set data min_size 2
> ceph osd pool set rbd size 2
> ceph osd pool set rbd min_size 2
> ceph osd pool set metadata size 2
> ceph osd pool set metadata min_size 2
>  
> Once you do this your ceph cluster should go to a healthy state.
>  
> Regards,
> Matt
>  
>  
>  
> On 2 Aug 2014, at 12:57 am, yogesh_d...@dell.com wrote:
> 
> 
> Dell - Internal Use - Confidential
> 
> Hello Ceph Experts,
>  
> I am using ceph (ceph version 0.56.6) on SUSE Linux.
> I created a simple cluster with one monitor server and two OSDs.
> The conf file is attached.
>  
> When I start my cluster and do “ceph -s”, I see the following message:
>  
> $ceph -s
> health HEALTH_WARN 202 pgs stuck inactive; 202 pgs stuck unclean
>monmap e1: 1 mons at {slesceph1=160.110.73.200:6789/0}, election epoch 1, 
> quorum 0 slesceph1
>osdmap e56: 2 osds: 2 up, 2 in
> pgmap v100: 202 pgs: 202 creating; 0 bytes data, 10305 MB used, 71574 MB 
> / 81880 MB avail
>mdsmap e1: 0/0/1 up
>  
>  
> Basically there is some problem with my placement groups: they are forever 
> stuck in “creating” state and there is no OSD associated with them (despite 
> having two OSDs that are “up and in”). When I do a “ceph pg stat” I see the 
> following:
>  
> $ceph pg stat
> v100: 202 pgs: 202 creating; 0 bytes data, 10305 MB used, 71574 MB / 81880 MB 
> avail
>  
>  
> If I query any individual pg, I see it isn’t mapped to any OSD:
> $ ceph pg 0.d query
> pgid currently maps to no osd
>  
> I tried restarting OSDs and tuning my configuration, to no avail.
>  
> Any suggestions ?
>  
> Yogesh Devi

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-26 Thread Matt Harlum

On 25 Jul 2014, at 5:54 pm, Christian Balzer  wrote:

> On Fri, 25 Jul 2014 13:31:34 +1000 Matt Harlum wrote:
> 
>> Hi,
>> 
>> I’ve purchased a couple of 45Drives enclosures and would like to figure
>> out the best way to configure these for ceph?
>> 
> That's the second time within a month somebody mentions these 45 drive
> chassis. 
> Would you mind elaborating which enclosures these are precisely?
> 
> I'm wondering especially about the backplane, as 45 is such an odd number.
> 

The chassis is from 45drives.com. It has 3 rows of 15 direct-wire SAS 
connectors, wired to two HighPoint Rocket 750s via 12 SFF-8087 connectors. 
I’m considering replacing the HighPoints with 3x LSI 9201-16i cards.
The chassis are loaded up with 45 Seagate 4TB drives, and separate from the 45 
large drives are the two boot drives in RAID 1.

> Also if you don't mind, specify "a couple" and what your net storage
> requirements are.
> 

The total is 3 of these 45drives.com enclosures, for 3 replicas of our data.

> In fact, read this before continuing:
> ---
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg11011.html
> ---
> 
>> Mainly I was wondering if it was better to set up multiple raid groups
>> and then put an OSD on each rather than an OSD for each of the 45 drives
>> in the chassis? 
>> 
> Steve already toed the conservative Ceph party line here; let me give you
> some alternative views and options on top of that, and recap what I
> wrote in the thread above.
> 
> In addition to his links, read this:
> ---
> https://objects.dreamhost.com/inktankweb/Inktank_Hardware_Configuration_Guide.pdf
> ---
> 
> Let's go from cheap and cheerful to "comes with racing stripes".
> 
> 1) All spinning rust, all the time. Plunk in 45 drives, as JBOD behind the
> cheapest (and densest) controllers you can get. Having the journal on the
> disks will halve their performance, but you just wanted the space and are
> not that pressed for IOPS. 
> The best you can expect per node with this setup is something around 2300
> IOPS with normal (7200RPM) disks.
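> (Back-of-the-envelope: ~100 write IOPS per 7200RPM drive x 45 drives, halved
> by the on-disk journals, works out to roughly 2250.)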
> 
> 2) Same as 1), but use controllers with a large HW cache (4GB Areca comes
> to mind) in JBOD (or 45 times RAID0) mode. 
> This will alleviate some of the thrashing problems, particularly if you're
> expecting the high IOPS to come in short bursts.
> 
> 3) Ceph Classic, basically what Steve wrote. 
> 32 HDDs, 8 SSDs for journals (you do NOT want an uneven spread of journals). 
> This will give you sustainable 3200 IOPS, but of course the journals on
> SSDs not only avoid all that thrashing about on the disk but also allow for
> coalescing of writes, so this is going to be the fastest solution so far.
> Of course you will need 3 of these at minimum for acceptable redundancy,
> unlike 4) which just needs a replication level of 2.
> 
> 4) The anti-cephalopod. See my reply from a month ago in the link above.
> All the arguments apply, it very much depends upon your use case and
> budget. In my case the higher density, lower cost and ease of maintaining
> the cluster were well worth the lower IOPS.
> 
> 5) We can improve upon 3) by using HW cached controllers of course. And
> hey, you did need to connect those drive bays somehow anyway. ^o^ 
> Maybe even squeeze some more out of it by having the SSD controller
> separate from the HDD one(s).
> This is as fast (IOPS) as it comes w/o going to full SSD.
> 
> 

Thanks, “All Spinning Rust” will probably be fine; we’re looking to just store 
full server backups for a long time, so there’s no high IO expected or 
anything like that.
The servers came with some pretty underpowered specs re: CPU/RAM; they 
support a max of 32GB each and a single socket, but at some point I plan to 
upgrade the motherboards to allow much more RAM to be fitted.

Mainly the reason I ask whether it’s a good idea to set up RAID groups for the 
OSDs is that I can’t put 96GB of RAM in these and can’t put enough CPU power 
into them. I’m imagining it’ll all start to fall to pieces if I try to operate 
these with ceph, given the small amount of RAM and CPU?
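For context, by the commonly quoted rule of thumb of roughly 1GB of RAM and 
about 1GHz of a core per OSD daemon (more during recovery), 45 individual OSDs 
would want something like 45GB+ of RAM plus a hefty core count, which these 
boards can’t take; grouping the drives into a handful of RAID volumes, each 
backing one OSD, brings that requirement down accordingly.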

> Networking:
> Either of the setups above will saturate a single 10Gb/s aka 1GB/s as
> Steve noted. 
> In fact 3) to 5) will be able to write up to 4GB/s in theory based on the
> HDDs' sequential performance, but that is unlikely to be seen in real life.
> And of course your maximum write speed is based on the speed of the SSDs.
> So for example with 3) you would want those 8 SSDs to have write speeds of
> about 250MB/s, giving you 2GB/s max write.
> Which in turn means 2x 10Gb/s links at least, up to 4 if you want
> redundancy and/or a separation of public and cluster network.
> 
> RAM:
> The more, the merrie

[ceph-users] Optimal OSD Configuration for 45 drives?

2014-07-24 Thread Matt Harlum
Hi,

I’ve purchased a couple of 45Drives enclosures and would like to figure out the 
best way to configure these for ceph?

Mainly I was wondering if it was better to set up multiple raid groups and then 
put an OSD on each rather than an OSD for each of the 45 drives in the chassis? 

Regards,
Matt

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com