Re: [ceph-users] ceph-deploy Errors - Fedora 21

2015-01-02 Thread Ken Dreyer
On 01/02/2015 12:38 PM, Travis Rhoden wrote:
> Hello,
> 
> I believe this is a problem specific to Fedora packaging.  The Fedora
> package for ceph-deploy is a bit different than the ones hosted at
> ceph.com. Can you please tell me the output of "rpm
> -q python-remoto"?
> 
> I believe the problem is that the python-remoto package is too old, and
> there is not a correct dependency on it when it comes to versions.  The
> minimum version should be 0.0.22, but the latest in Fedora is 0.0.21
> (and latest upstream is 0.0.23).  I'll push to get this updated
> correctly.  The Fedora package maintainers will need to put out a new
> release of python-remoto, and hopefully update the spec file for
> ceph-deploy to require >= 0.0.22.

Thanks Travis for tracking this down!

Federico has granted me access to the Fedora Rawhide and F21 branches
today (thanks Federico!) in Fedora's package database [1].

I've built python-remoto 0.0.23 in Rawhide (Fedora 22) and Fedora 21 [2].

deeepdish, you can grab the Fedora 21 build directly from [3]
immediately if you wish.

If you'd rather wait for signed builds, you can wait a few days for the
Fedora infra admins to sign the package and push it out to the Fedora
mirrors [4]. When that's done, you can run "yum
--enablerepo=updates-testing update python-remoto", and yum will then
update your system to python-remoto-0.0.23-1.fc21.
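
Once the update is installed, a quick sanity check would be something like
this (a sketch -- the exact arch suffix in the output may differ):

  $ rpm -q python-remoto
  python-remoto-0.0.23-1.fc21.noarch

and then re-run the ceph-deploy command that originally failed.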

Either way, we'd really welcome your feedback and confirmation that this
does in fact fix your issue.

- Ken

[1] https://admin.fedoraproject.org/pkgdb/package/python-remoto/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1146478
[3] http://koji.fedoraproject.org/koji/taskinfo?taskID=8516634
[4] https://admin.fedoraproject.org/updates/python-remoto-0.0.23-1.fc21

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd map hangs

2015-01-02 Thread Dyweni - Ceph-Users
Your OSDs are full.  The cluster will block until space is freed up and
both OSDs leave the full state.


You have 2 OSDs, so I'm assuming you are running a replica size of 2?  A
quick (but risky) method might be to reduce your replica size to 1 to get
the cluster unblocked, clean up space, then go back to replica size 2.
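
Roughly (a sketch -- the pool name is a placeholder, and remember that a
single replica means any disk failure during the cleanup loses data):

  # see current replica sizes
  ceph osd dump | grep 'replicated size'

  # temporarily drop to one replica to free space, clean up, then restore
  ceph osd pool set <pool> size 1
  ceph osd pool set <pool> size 2

Depending on your release you may also be able to nudge the full threshold
up a little instead, e.g. "ceph pg set_full_ratio 0.97", which avoids
touching the replica count at all.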





On 2015-01-02 13:44, Max Power wrote:
After I tried to copy some files into a rbd device I ran into a "osd full"
state. So I restarted my server and wanted to remove some files from the
filesystem again. But at this moment I cannot execute "rbd map" anymore and
I do not know why.

This all happened in my testing environment and this is the current state
with 'ceph status'
 health HEALTH_ERR
2 full osd(s)
 monmap e1: 1 mons at {test1=10.0.0.141:6789/0}
election epoch 1, quorum 0 test1
 osdmap e69: 2 osds: 2 up, 2 in
flags full
  pgmap v469: 100 pgs, 1 pools, 1727 MB data, 438 objects
3917 MB used, 156 MB / 4073 MB avail
 100 active+clean

strace reports this before 'rbd map pool/disk' hangs
[...]
access("/sys/bus/rbd", F_OK)= 0
access("/run/udev/control", F_OK)   = 0
socket(PF_NETLINK, SOCK_RAW|SOCK_CLOEXEC|SOCK_NONBLOCK, NETLINK_KOBJECT_UEVENT) = 3
setsockopt(3, SOL_SOCKET, SO_ATTACH_FILTER, "\r\0\0\0\0\0\0\0@k\211\240\377\177\0\0", 16) = 0
bind(3, {sa_family=AF_NETLINK, pid=0, groups=0002}, 12) = 0
getsockname(3, {sa_family=AF_NETLINK, pid=1192, groups=0002}, [12]) = 0
setsockopt(3, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0
open("/sys/bus/rbd/add_single_major", O_WRONLY) = 4
write(4, "10.0.0.141:6789 name=admin,key=c"..., 61

Any idea why I cannot access the rbd device anymore?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph-deploy install and pinning on Ubuntu 14.04

2015-01-02 Thread Travis Rhoden
Hi Giuseppe,

ceph-deploy does try to do some pinning for the Ceph packages.  Those
settings should be found at /etc/apt/preferences.d/ceph.pref

If you find something is incorrect there, please let us know what it is and
we can look into it!
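
For reference, an apt pin that prefers the ceph.com packages looks roughly
like this (a sketch -- the exact stanza ceph-deploy writes depends on which
repo you pointed it at):

  Package: *
  Pin: origin ceph.com
  Pin-Priority: 1001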

 - Travis

On Sat, Dec 20, 2014 at 11:32 AM, Giuseppe Civitella <
giuseppe.civite...@gmail.com> wrote:

> Hi all,
>
> I'm using deph-deploy on Ubuntu 14.04. When I do a ceph-deploy install I
> see packages getting installed from ubuntu repositories instead of ceph's
> ones, am I missing something? Do I need to do some pinning on repositories?
>
> Thanks
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rbd map hangs

2015-01-02 Thread Max Power
After I tried to copy some files into a rbd device I ran into a "osd full"
state. So I restarted my server and wanted to remove some files from the
filesystem again. But at this moment I cannot execute "rbd map" anymore and I do
not know why.

This all happened in my testing environment and this is the current state with
'ceph status'
 health HEALTH_ERR
2 full osd(s)
 monmap e1: 1 mons at {test1=10.0.0.141:6789/0}
election epoch 1, quorum 0 test1
 osdmap e69: 2 osds: 2 up, 2 in
flags full
  pgmap v469: 100 pgs, 1 pools, 1727 MB data, 438 objects
3917 MB used, 156 MB / 4073 MB avail
 100 active+clean
strace reports this before 'rbd map pool/disk' hangs
[...]
access("/sys/bus/rbd", F_OK)= 0
access("/run/udev/control", F_OK)   = 0
socket(PF_NETLINK, SOCK_RAW|SOCK_CLOEXEC|SOCK_NONBLOCK, NETLINK_KOBJECT_UEVENT) = 3
setsockopt(3, SOL_SOCKET, SO_ATTACH_FILTER,
"\r\0\0\0\0\0\0\0@k\211\240\377\177\0\0", 16) = 0
bind(3, {sa_family=AF_NETLINK, pid=0, groups=0002}, 12) = 0
getsockname(3, {sa_family=AF_NETLINK, pid=1192, groups=0002}, [12]) = 0
setsockopt(3, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0
open("/sys/bus/rbd/add_single_major", O_WRONLY) = 4
write(4, "10.0.0.141:6789 name=admin,key=c"..., 61

Any idea why I cannot access the rbd device anymore?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy Errors - Fedora 21

2015-01-02 Thread Travis Rhoden
Hello,

I believe this is a problem specific to Fedora packaging.  The Fedora
package for ceph-deploy is a bit different than the ones hosted at ceph.com.
Can you please tell me the output of "rpm -q python-remoto"?

I believe the problem is that the python-remoto package is too old, and
there is not a correct dependency on it when it comes to versions.  The
minimum version should be 0.0.22, but the latest in Fedora is 0.0.21 (and
latest upstream is 0.0.23).  I'll push to get this updated correctly.  The
Fedora package maintainers will need to put out a new release of
python-remoto, and hopefully update the spec file for ceph-deploy to
require >= 0.0.22.

 - Travis

On Mon, Dec 29, 2014 at 10:24 PM, deeepdish  wrote:

> Hello.
>
> I’m having an issue with ceph-deploy on Fedora 21.
>
> - Installed ceph-deploy via 'yum install ceph-deploy'
> - created non-root user
> - assigned sudo privs as per documentation -
> http://ceph.com/docs/master/rados/deployment/preflight-checklist/
>
> $ ceph-deploy install smg01.erbus.kupsta.net
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /cephfs/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.20): /bin/ceph-deploy install
> [hostname]
> [ceph_deploy.install][DEBUG ] Installing stable version firefly on
> cluster ceph hosts [hostname]
> [ceph_deploy.install][DEBUG ] Detecting platform for host [hostname] ...
> [ceph_deploy][ERROR ] RuntimeError: connecting to
> host: [hostname] resulted in errors: TypeError __init__() got an unexpected
> keyword argument 'detect_sudo'
>
>
> Thank you.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Weighting question

2015-01-02 Thread Gregory Farnum
The meant-for-human-consumption free space estimates and similar numbers
won't be accurate if you weight evenly instead of by size, but otherwise
things should work just fine -- you'll simply get full OSD warnings once
you reach about 1TB of data per OSD.
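
Concretely that would just be (a sketch with made-up OSD ids):

  # weight the 1TB OSDs the same as the 3TB ones instead of by capacity
  ceph osd crush reweight osd.1 1.0
  ceph osd crush reweight osd.3 1.0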
-Greg
On Thu, Jan 1, 2015 at 3:10 PM Lindsay Mathieson <
lindsay.mathie...@gmail.com> wrote:

> On Thu, 1 Jan 2015 08:27:33 AM Dyweni - Ceph-Users wrote:
> > I suspect a better configuration would be to leave your weights alone
> > and to
> > change your primary affinity so that the osd with the ssd is used first.
>
> Interesting
>
> > You might see a little improvement on the writes (since the spinners
> > have to work too), but the reads should have the most improvement
> > (since ceph only has to read from the ssd).
>
> Couple of things:
> - The SSD will be partitioned for each OSD to have a journal
>
> - I thought Journals were for writes only, not reads?
>
> --
> Lindsay___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Adding Crush Rules

2015-01-02 Thread Gregory Farnum
I'm on my phone at the moment, but I think if you run "ceph osd crush rule"
it will prompt you with the relevant options?
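
From memory, the non-decompile route is something like this (a sketch --
rule and bucket names are placeholders):

  # list rules and show their full definitions
  ceph osd crush rule ls
  ceph osd crush rule dump

  # create a simple replicated rule choosing one OSD per host under "default"
  ceph osd crush rule create-simple by-host default host

  # point a pool at the new rule
  ceph osd pool set <pool> crush_ruleset <rule-id>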
On Tue, Dec 30, 2014 at 6:00 PM Lindsay Mathieson <
lindsay.mathie...@gmail.com> wrote:

> Is there a command to do this without decompiling/editing/compiling the
> crush map? Makes me nervous ...
> --
> Lindsay___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RadosGW slow gc

2015-01-02 Thread Gregory Farnum
You can store radosgw data in a regular EC pool without any caching in
front. I suspect this will work better for you, as part of the slowness is
probably the OSDs trying to look up all the objects in the ec pool before
deleting them. You should be able to check if that's the case by looking at
the osd perfcounters over time. (We've discussed cache counters before;
check the docs or the list).
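
Something along these lines should show what's going on (a sketch -- the
osd id and default admin socket are assumptions):

  # per-OSD counters, sampled over time, on an OSD backing the cache pool
  ceph daemon osd.12 perf dump

  # radosgw's own view of the pending garbage collection backlog
  radosgw-admin gc list --include-all | head

  # kick gc manually rather than waiting for the next cycle
  radosgw-admin gc process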
-Greg
On Thu, Jan 1, 2015 at 1:01 PM Aaron Bassett 
wrote:

> I’m doing some load testing on radosgw to get ready for production and I
> had a problem with it stalling out. I had 100 cores from several nodes
> doing multipart uploads in parallel. This ran great for about two days,
> managing to upload about 2000 objects with an average size of 100GB. Then
> it stalled out and stopped. Ever since then, the gw has been gc’ing very
> slowly. During the upload run, it was creating objects at ~100/s; now it's
> cleaning them at ~3/s. At this rate it won't be done for nearly a year and
> this is only a fraction of the data I need to put in.
>
> The pool I’m writing to is a cache pool at size 2 with an EC pool at 10+2
> behind it. (This data is not mission critical so we are trying to save
> space). I don’t know if this will affect the slow gc or not.
>
> I tried turning up rgw gc max objs to 256, but it didn’t seem to make a
> difference.
>
> I’m working under the assumption that my uploads started stalling because
> too many un-gc’ed parts accumulated, but I may be way off base there.
>
> Any thoughts would be much appreciated, Aaron
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy Errors - Fedora 21

2015-01-02 Thread Ken Dreyer
On 12/29/2014 08:24 PM, deeepdish wrote:
> Hello.
> 
> I’m having an issue with ceph-deploy on Fedora 21.   
> 
> - Installed ceph-deploy via 'yum install ceph-deploy'
> - created non-root user
> - assigned sudo privs as per documentation
> - http://ceph.com/docs/master/rados/deployment/preflight-checklist/
>  
> $ ceph-deploy install smg01.erbus.kupsta.net
>  
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /cephfs/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.20): /bin/ceph-deploy install
> [hostname]
> [ceph_deploy.install][DEBUG ] Installing stable version firefly on
> cluster ceph hosts [hostname]
> [ceph_deploy.install][DEBUG ] Detecting platform for host [hostname] ...
> [ceph_deploy][ERROR ] RuntimeError: connecting to
> host: [hostname] resulted in errors: TypeError __init__() got an
> unexpected keyword argument 'detect_sudo'

Hi deeepdish,

Sorry you're having issues with ceph-deploy. Would you mind filing a bug
at http://tracker.ceph.com/ so this doesn't get lost? There's a lot of
traffic on ceph-users and it's best if we have a ticket for this.

If you don't already have an account in our tracker, you can register
for a new account using the "Register" link in the upper-right corner.

Also, it would be useful to have a bit more information from you:

1) What is the host OS and version of smg01.erbus.kupsta.net ?
2) What is the content of your .cephdeploy.conf file?

- Ken
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Weird scrub problem

2015-01-02 Thread Samuel Just
If the file structure is corrupted, then all bets are kind of off.
You'd have to characterize precisely the kind of corruption you want
handled and add a feature request for that.
-Sam

On Sat, Dec 27, 2014 at 5:14 PM, Andrey Korolyov  wrote:
> On Sat, Dec 27, 2014 at 4:09 PM, Andrey Korolyov  wrote:
>> On Tue, Dec 23, 2014 at 4:17 AM, Samuel Just  wrote:
>>> Oh, that's a bit less interesting.  The bug might be still around though.
>>> -Sam
>>>
>>> On Mon, Dec 22, 2014 at 2:50 PM, Andrey Korolyov  wrote:
 On Tue, Dec 23, 2014 at 1:12 AM, Samuel Just  wrote:
> You'll have to reproduce with logs on all three nodes.  I suggest you
> open a high priority bug and attach the logs.
>
> debug osd = 20
> debug filestore = 20
> debug ms = 1
>
> I'll be out for the holidays, but I should be able to look at it when
> I get back.
> -Sam
>


 Thanks Sam,

 although I am not sure whether this is of more than historical interest
 (the mentioned cluster is running cuttlefish), I'll try to collect logs
 for scrub.
>>
>> Same stuff:
>> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg15447.html
>> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg14918.html
>>
>> Looks like the issue is still with us, though it requires meta or file
>> structure corruption to show itself. I'll check if it can be
>> reproduced via rsync -X sec pg subdir -> pri pg subdir or vice versa.
>> My case shows slightly different pathnames for the same objects with
>> the same checksums, which may be the root cause. As every case mentioned,
>> including mine, happened in an oh-shit-hardware-is-broken case, I suggest
>> that the incurable corruption happens during primary backfill from the
>> active replica at recovery time.
>
> Recovery/backfill from a corrupted primary copy results in a crash
> (attached) of the primary OSD; for example, it can be triggered by purging
> one of the secondary copies (top of cuttlefish branch for line numbers).
> However, as the secondaries preserve the same data with the same checksums,
> it is possible to destroy both the meta record and the pg directory and
> refill the primary. The interesting point is that the corrupted primary was
> completely refilled after the hardware failure, but it looks like it
> survived long enough after the failure event to spread corruption to the
> copies; I simply cannot imagine a better explanation.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is there an negative relationship between storage utilization and ceph performance?

2015-01-02 Thread Udo Lembke
Hi again,
... after a long time!

Now I have change the whole ceph-cluster from xfs to ext4 (60 OSDs),
change tunables and fill the cluster again.

So I can compare the bench values.

For my setup the cluster runs better with ext4 than with xfs - latency
dropped from ~14ms to ~8ms (rados -p test bench 60 seq --no-cleanup),
still with the old tunables.

Now with the new tunables (and filled again to 65%) the read performance
was also much better - it rose from 440MB/s to ~760MB/s.

The write performance is a little lower than before, but my problem had
been with the read performance (write was OK for me).

I lost a little bit of space - the weight of each disk was 3.64 before
and is 3.58 now.

For me it looks like storage utilization has less impact with ext4, and
ext4 performs better than xfs!
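
For anyone repeating the comparison, the commands boil down to roughly this
(a sketch -- the pool name "test" is taken from the bench command above):

  # write some objects first, then measure sequential reads
  rados -p test bench 60 write --no-cleanup
  rados -p test bench 60 seq

  # raw per-OSD write bench, handy for comparing xfs and ext4 disks directly
  ceph tell osd.0 bench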

Udo

On 05.11.2014 01:22, Christian Balzer wrote:
> 
> Hello,
> 
> On Tue, 04 Nov 2014 20:49:02 +0100 Udo Lembke wrote:
> 
>> Hi,
>> for a long time I have been looking for performance improvements for our
>> ceph-cluster.
>> The last expansion got better performance, because we added another node
>> (with 12 OSDs). The storage utilization after that was 60%.
>>
> Another node of course does more than lower per OSD disk utilization, it
> also adds more RAM (cached objects), more distribution of requests, etc.
> 
> So the question here is, did the usage (number of client IOPS) stay the
> same and just the total amount of stored data did grow?
> 
>> Now we have reached 69% again (the next nodes are waiting for installation)
>> and the performance dropped! OK, we also changed the ceph version from
>> 0.72.x to firefly.
>> But I wonder if there is a relationship between utilization and
>> performance?! The OSDs are xfs disks, but now I am starting to use ext4,
>> because of the bad fragmentation on an xfs filesystem (yes, I use the
>> mount option allocsize=4M already).
>>
> Does defragmenting (all of) the XFS backed OSDs help?
> 
>> Has anybody the same effect?
>>
> I have nothing anywhere near that full, but I can confirm that XFS
> fragments worse than ext4 and the less said about BTRFS, the better. ^.^
> Also defragmenting (not that they needed it) ext4 volumes felt more
> lightweight than XFS.
> 
> Since you now have ext4 OSDs, how about doing a osd bench and fio on those
> compared to XFS backed ones?
> 
> Other than the above, Mark listed a number of good reasons why OSDs (HDDs)
> become slower when getting fuller besides fragmentation.
> 
> Christian
>> Udo
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Weighting question

2015-01-02 Thread Dyweni - Ceph-Users



On 2015-01-01 14:04, Lindsay Mathieson wrote:

On Thu, 1 Jan 2015 08:27:33 AM Dyweni - Ceph-Users wrote:

You might see a little improvement on the writes (since the spinners have
to work too), but the reads should have the most improvement (since ceph
only has to read from the ssd).


Couple of things:
- The SSD will be partitioned for each OSD to have a journal

- I thought Journals were for writes only, not reads?



I believe that's correct; journals are only for writes.

If you're using the SSD only for journals, then you won't see
any read improvements.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] redundancy with 2 nodes

2015-01-02 Thread Mark Kirkwood

On 01/01/15 23:16, Christian Balzer wrote:


Hello,

On Thu, 01 Jan 2015 18:25:47 +1300 Mark Kirkwood wrote:

but I agree that you should probably not get a HEALTH OK status when you
have just set up 2 (or in fact any even number of) monitors... HEALTH WARN
would make more sense, with a wee message suggesting adding at least one
more!



I think what Jiri meant is that when the whole cluster goes into a deadlock
due to losing monitor quorum, ceph -s etc. won't work anymore either.



Right - but looking at health output from his earlier post:

cephadmin@ceph1:~$ ceph status
cluster bce2ff4d-e03b-4b75-9b17-8a48ee4d7788
 health HEALTH_OK
 monmap e1: 2 mons at {ceph1=192.168.30.21:6789/0,ceph2=192.168.30.22:6789/0}, election epoch 12, quorum 0,1 ceph1,ceph2
 mdsmap e7: 1/1/1 up {0=ceph1=up:active}, 1 up:standby
 osdmap e88: 4 osds: 4 up, 4 in
  pgmap v2051: 1280 pgs, 5 pools, 13184 MB data, 3328 objects
26457 MB used, 11128 GB / 11158 GB avail
1280 active+clean

...if he had received some sort of caution about the number of mons 
instead of HEALTH OK from that health status, then he might have added 
another *before* everything locked up. That's what I was meaning before.



And while the cluster rightfully shouldn't be doing anything in such a
state, querying the surviving/reachable monitor and being told as much
would indeed be a nice feature, as opposed to deafening silence.



Sure, getting nothing is highly undesirable.


As for your suggestion, while certainly helpful, it is my not-so-humble
opinion that the WARN state right now is totally overloaded and quite
frankly bogus.
This is particularly a problem with monitor plugins that just pick up the
WARN state without further discrimination.




Yeah, I agree that WARN is hopelessly overloaded. In the past I have had to
dig backward through the logs to see what a warning is actually about, and
whether it is really something that needs attention.


regards

Mark
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Weighting question

2015-01-02 Thread Lindsay Mathieson
On Thu, 1 Jan 2015 08:50:20 AM you wrote:
> > http://ceph.com/docs/master/rados/operations/crush-map/#primary-affinity
> >
> > 
> 
> This may help you too:
> 
> http://cephnotes.ksperis.com/blog/2014/08/20/ceph-primary-affinity

Hmmm - so if I have three OSDs on a node (looking at getting two extra drives
per node now) - and set the affinity:

osd.0 = 0
osd.1 = 1
osd.2 = 1

Then reads will be randomly split between OSDs 1 & 2?
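
(For reference, the knob itself is set per OSD -- a sketch, assuming primary
affinity is enabled on the release in use:)

  # older releases may also need "mon osd allow primary affinity = true"
  # 0 = avoid being primary, 1 = default behaviour
  ceph osd primary-affinity osd.0 0
  ceph osd primary-affinity osd.1 1
  ceph osd primary-affinity osd.2 1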

ceph - a tweaker's delight. We need a new DSM entry:

  COCD : "Ceph Obsessive Compulsive Disorder"

-- 
Lindsay

signature.asc
Description: This is a digitally signed message part.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Worthwhile setting up Cache tier with small leftover SSD partions?

2015-01-02 Thread Lindsay Mathieson
Expanding my tiny ceph setup from 2 OSDs to six, plus two extra SSDs for
journals (IBM 530 120GB).

Yah, I know the 5300's would be much better

Assuming I use 10GB per OSD for journals and 5GB spare to improve the SSD
lifetime, that leaves 85GB spare per SSD.


Is it worthwhile setting up a 2 x 85GB OSD cache tier (replica 2)? Usage is
for approx 15 active VMs, used mainly for development and light database
work.

Maybe it's way too small and would be continually shuffling hot data.

Also - is writeback dangerous for cache tiering? It seems safe to me, as the
data is written safely to the cache tier and will be flushed to the backing
store on restart after a power failure etc.
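
For reference, the tier setup in question boils down to roughly this (a
sketch -- pool names, PG counts and thresholds are placeholders):

  ceph osd pool create cache-pool 128 128
  ceph osd tier add rbd cache-pool
  ceph osd tier cache-mode cache-pool writeback
  ceph osd tier set-overlay rbd cache-pool

  # cap the cache at roughly the usable SSD space (2 x 85GB raw, replica 2)
  ceph osd pool set cache-pool target_max_bytes 85000000000
  ceph osd pool set cache-pool cache_target_dirty_ratio 0.4
  ceph osd pool set cache-pool cache_target_full_ratio 0.8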

-- 
Lindsay

signature.asc
Description: This is a digitally signed message part.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Weighting question

2015-01-02 Thread Dyweni - Ceph-Users



On 2015-01-01 08:27, Dyweni - Ceph-Users wrote:
Hi, I'm going to take a stab at this, since I've just recently/am currently
dealing with this/something similar myself.


On 2014-12-31 21:59, Lindsay Mathieson wrote:
As mentioned before :) we have two osd nodes with one 3TB osd each.
(replica 2)

About to add a smaller (1TB) faster drive to each node

From the docs, normal practice would be to weight it in accordance with
size, i.e. 3 for the 3TB OSD, 1 for the 1TB OSD.

But I'd like to spread it 50/50 to take better advantage of the faster
drive, so weight them all at 1. Bad idea?


As long as your total data used (ceph df) / # of osds < your smallest
drive capacity, you should be fine.

I suspect a better configuration would be to leave your weights alone and
to change your primary affinity so that the osd with the ssd is used
first.  You might see a little improvement on the writes (since the
spinners have to work too), but the reads should have the most improvement
(since ceph only has to read from the ssd).

http://ceph.com/docs/master/rados/operations/crush-map/#primary-affinity


This may help you too:

http://cephnotes.ksperis.com/blog/2014/08/20/ceph-primary-affinity


We only have 1TB of data so I'm presuming the 1TB drives would get
500GB each.


--
Lindsay

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Weighting question

2015-01-02 Thread Dyweni - Ceph-Users


Hi, I'm going to take a stab at this, since I've just recently/am currently
dealing with this/something similar myself.


On 2014-12-31 21:59, Lindsay Mathieson wrote:
As mentioned before :) we have two osd nodes with one 3TB osd each.
(replica 2)

About to add a smaller (1TB) faster drive to each node

From the docs, normal practice would be to weight it in accordance with
size, i.e. 3 for the 3TB OSD, 1 for the 1TB OSD.

But I'd like to spread it 50/50 to take better advantage of the faster
drive, so weight them all at 1. Bad idea?


As long as your total data used (ceph df) / # of osds < your smallest
drive capacity, you should be fine.

I suspect a better configuration would be to leave your weights alone and
to change your primary affinity so that the osd with the ssd is used
first.  You might see a little improvement on the writes (since the
spinners have to work too), but the reads should have the most improvement
(since ceph only has to read from the ssd).

http://ceph.com/docs/master/rados/operations/crush-map/#primary-affinity


We only have 1TB of data so I'm presuming the 1TB drives would get
500GB each.


--
Lindsay

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] redundancy with 2 nodes

2015-01-02 Thread Jiri Kanicky

Hi,

I noticed this message after shutting down the other node. You might be 
right that I need 3 monitors.

2015-01-01 15:47:35.990260 7f22858dd700  0 monclient: hunting for new mon

But what is quite unexpected is that you cannot even run "ceph status"
on the running node to find out the state of the cluster.
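
(For reference, adding a third monitor is roughly a one-liner with
ceph-deploy -- the hostname below is hypothetical:)

  ceph-deploy mon add ceph3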


Thx Jiri


On 1/01/2015 15:46, Jiri Kanicky wrote:

Hi,

I have:
- 2 monitors, one on each node
- 4 OSDs, two on each node
- 2 MDS, one on each node

Yes, all pools are set with size=2 and min_size=1

cephadmin@ceph1:~$ ceph osd dump
epoch 88
fsid bce2ff4d-e03b-4b75-9b17-8a48ee4d7788
created 2014-12-27 23:38:00.455097
modified 2014-12-30 20:45:51.343217
flags
pool 0 'rbd' replicated *size 2 min_size 1* crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 86 flags hashpspool stripe_width 0
pool 1 'media' replicated *size 2 min_size 1* crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 60 flags hashpspool stripe_width 0
pool 2 'data' replicated *size 2 min_size 1* crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 63 flags hashpspool stripe_width 0
pool 3 'cephfs_test' replicated *size 2 min_size 1* crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 71 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 4 'cephfs_metadata' replicated *size 2 min_size 1* crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 69 flags hashpspool stripe_width 0

max_osd 4
osd.0 up   in  weight 1 up_from 55 up_thru 86 down_at 51 last_clean_interval [39,50) 192.168.30.21:6800/17319 10.1.1.21:6800/17319 10.1.1.21:6801/17319 192.168.30.21:6801/17319 exists,up 4f3172e1-adb8-4ca3-94af-6f0b8fcce35a
osd.1 up   in  weight 1 up_from 57 up_thru 86 down_at 53 last_clean_interval [41,52) 192.168.30.21:6803/17684 10.1.1.21:6802/17684 10.1.1.21:6804/17684 192.168.30.21:6805/17684 exists,up 1790347a-94fa-4b81-b429-1e7c7f11d3fd
osd.2 up   in  weight 1 up_from 79 up_thru 86 down_at 74 last_clean_interval [13,73) 192.168.30.22:6801/3178 10.1.1.22:6800/3178 10.1.1.22:6801/3178 192.168.30.22:6802/3178 exists,up 5520835f-c411-4750-974b-34e9aea2585d
osd.3 up   in  weight 1 up_from 81 up_thru 86 down_at 72 last_clean_interval [20,71) 192.168.30.22:6804/3414 10.1.1.22:6802/3414 10.1.1.22:6803/3414 192.168.30.22:6805/3414 exists,up 25e62059-6392-4a69-99c9-214ae335004


Thx Jiri

On 1/01/2015 15:21, Lindsay Mathieson wrote:

On Thu, 1 Jan 2015 02:59:05 PM Jiri Kanicky wrote:

I would expect that if I shut down one node, the system will keep
running. But when I tested it, I could not even execute the "ceph status"
command on the running node.

2 osd Nodes, 3 Mon nodes here, works perfectly for me.

How many monitors do you have?
Maybe you need a third monitor only node for quorum?



I set "osd_pool_default_size = 2" (min_size=1) on all pools, so I
thought that each copy will reside on each node. Which means that if 1
node goes down the second one will be still operational.

does:
ceph osd pool get {pool name} size
   return 2

ceph osd pool get {pool name} min_size
   return 1




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Weighting question

2015-01-02 Thread Christian Balzer
On Thu, 01 Jan 2015 13:59:57 +1000 Lindsay Mathieson wrote:

> As mentioned before :) we have two osd nodes with one 3TB osd each.
> (replica 2)
> 
> About to add a smaller (1TB) faster drive to each node
> 
> From the docs, normal practice would be to weight it in accordance with
> size, i.e 3 for the 3TB OSD, 1 for the 1TB OSD.
> 
> But I'd like to spread it 50/50 to take better advantage of the faster
> drive, so weight them all at 1. Bad idea?
> 
Other than the wasted space, no. It should achieve what you want.

> We only have 1TB of data so I'm presuming the 1TB drives would get 500GB
> each.
> 

Expect a good deal of variance, Ceph still isn't very good at evenly
distributing data (PGs actually):
---
Filesystem  1K-blocks  Used  Available Use% Mounted on
/dev/sdi1  2112738204 211304052 1794043640  11% /var/lib/ceph/osd/ceph-19
/dev/sdk1  2112738204 140998368 1864349324   8% /var/lib/ceph/osd/ceph-21
---

OSD 19 holds 157 PGs, OSD 21 just 105, which perfectly explains the size
difference of about 33%.

That's on a Firefly cluster with 24 OSDs and a more than adequate number of
PGs per OSD (128).
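
For anyone wanting to check their own spread, "ceph pg dump" lists the
up/acting set of every PG, so a rough per-OSD count is only a grep away
(a sketch -- the osd id is a placeholder):

  # rough count of PGs mapped to osd.19 (2-replica sets look like [19,21])
  ceph pg dump pgs_brief | grep -c '\[19,\|,19\]'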

Christian 
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Not running multiple services on the same machine?

2015-01-02 Thread Lindsay Mathieson
On Fri, 2 Jan 2015 09:59:36 PM Gregory Farnum wrote:
> The only technical issue I can think of is that you don't want to put
> kernel clients on the same OS as an OSD (due to deadlock scenarios under
> memory pressure and writeback).

The only kernel client is the cephfs driver?

The qemu rbd client is usermode, isn't it?

-- 
Lindsay

signature.asc
Description: This is a digitally signed message part.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Not running multiple services on the same machine?

2015-01-02 Thread Gregory Farnum
I think it's just for service isolation that people recommend splitting
them. The only technical issue I can think of is that you don't want to put
kernel clients on the same OS as an OSD (due to deadlock scenarios under
memory pressure and writeback).
-Greg
On Sat, Dec 27, 2014 at 12:11 PM Christopher Armstrong 
wrote:

> Hi folks,
>
> I've heard several comments on the mailing list warning against running
> multiple Ceph services (monitors, daemons, MDS, gateway) on the same
> machine. I was wondering if someone could shed more light on the dangers of
> this. In Deis[1] we only require clusters to be 3 machines big, and we need
> to run monitors, daemons, and MDS servers. Deis runs on CoreOS, so all of
> our services are shipped as Docker containers. We run Ceph within
> containers as our store[2] component, so on a single CoreOS host we're
> running a monitor, daemon, MDS, gateway, and consuming the cluster with a
> CephFS mount.
>
> I know it's ill-advised, but my question is - why? What sort of issues are
> we looking at? Data loss, performance, etc.? When I implemented this I was
> unaware of the recommendation not to do this, and I'd like to address any
> potential issues now.
>
> Thanks!
>
> Chris
>
> [1]: https://github.com/deis/deis
> [2]: https://github.com/deis/deis/tree/master/store
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com