Simplified LRC in CEPH

2014-08-01 Thread Andreas Joachim Peters
Hi Loic et al. I managed to prototype (and understand) LRC encoding similar to Xorbas in the ISA plug-in. As an example take a (16,4) code (which gives nice alignment for 4k blocks): For 4 sub groups of the data chunks you build e.g. local parities LP1-LP4 LP1 = 1 ^ 2 ^ 3 ^ 4 LP2 = 5 ^ 6
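The local-parity construction in the preview can be sketched like this (a minimal illustration of the (16,4) example; chunk contents and helper names are assumptions, not the ISA plug-in's actual code). Each local parity is the byte-wise XOR of its group of data chunks, so a single lost chunk is repaired from its small group instead of all k chunks:

```python
def xor_chunks(chunks):
    """Byte-wise XOR of equally sized chunks, e.g. LP1 = c1 ^ c2 ^ c3 ^ c4."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)

# Sixteen 4 KiB data chunks split into four groups of four,
# one local parity per group (LP1..LP4), as in the (16,4) example.
data = [bytes([g * 4 + j]) * 4096 for g in range(4) for j in range(4)]
local_parities = [xor_chunks(data[g * 4:(g + 1) * 4]) for g in range(4)]

# Losing one chunk of a group is repaired from the other three plus the
# local parity -- only (k/l) = 4 reads instead of k = 16.
recovered = xor_chunks([data[1], data[2], data[3], local_parities[0]])
assert recovered == data[0]
```

This is the locality benefit the thread is after: repair traffic for a single-chunk failure stays inside one small group.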

call for comments - ceph-disk making OSD directories on typos and is inconsistent (usability).

2014-08-01 Thread Owen Synge
Dear all, By default ceph-disk will do the following: # ceph-disk - prepare --fs-type xfs --cluster ceph -- /dev/sdk DEBUG:ceph-disk:Preparing osd data dir /dev/sdk No block device /dev/sdk exists so ceph-disk decides a block device is not wanted and makes a directory for an OSD. I think
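The surprising fallback described above can be illustrated with a small sketch (a hypothetical check for illustration, not ceph-disk's actual code): the decision between disk mode and directory mode hinges on whether the path exists as a block device, so a typo'd device path silently becomes a directory-backed OSD:

```python
import os
import stat

def classify_osd_target(path):
    """Simplified sketch of the decision the mail complains about: a path
    that exists as a block device is prepared as a disk; anything else --
    including a typo for a device that does not exist -- silently falls
    back to preparing an OSD data directory."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "directory"          # a mistyped /dev/sdk lands here
    if stat.S_ISBLK(mode):
        return "block device"
    return "directory"

print(classify_osd_target("/no/such/device"))  # directory
```

Failing loudly on a nonexistent path under /dev, instead of falling back, is the behavior change the call for comments argues for.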

[PATCH] osd: add local_mtime to struct object_info_t

2014-08-01 Thread Wang, Zhiqiang
As we discussed before, adding a new field in struct object_info_t to solve the skipping flush problem. This patch is also available as a pull request at https://github.com/ceph/ceph/pull/2188 This fixes a bug when the time of the OSDs and clients are not synchronized (especially when client is
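The clock-skew problem the patch addresses can be sketched as follows (field and function names besides local_mtime are illustrative assumptions, not the actual object_info_t layout). The idea is to record the OSD's own clock alongside the client-supplied mtime, so age-based decisions such as "skip flushing recently modified objects" do not depend on client clocks being in sync:

```python
class ObjectInfo:
    """Sketch of the idea behind the patch: keep the OSD-local timestamp
    (local_mtime) next to the client-supplied mtime."""
    def __init__(self, client_mtime, local_mtime):
        self.mtime = client_mtime        # stamped from the client's clock
        self.local_mtime = local_mtime   # stamped from the OSD's clock

def old_enough_to_flush(info, min_age, now):
    # Comparing against local_mtime keeps the age check correct even
    # when the client's clock runs ahead of the OSD's.
    return now - info.local_mtime >= min_age

now = 1000.0
# Client clock is an hour ahead; the object was really written 60s ago.
info = ObjectInfo(client_mtime=now + 3600 - 60, local_mtime=now - 60)
assert old_enough_to_flush(info, min_age=30, now=now)
assert not (now - info.mtime >= 30)  # a client-mtime check would wrongly skip the flush
```

With only the client mtime, an object written by a fast-clocked client would appear to stay "too recent" and be skipped indefinitely.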

RE: Non existing monitor

2014-08-01 Thread Aanchal Agrawal
Any help on this ... -Original Message- From: Aanchal Agrawal Sent: Wednesday, July 30, 2014 3:37 PM To: 'ceph-devel@vger.kernel.org' Cc: Pavan Rallabhandi Subject: Non existing monitor Hi, We found a case (bug?) in ceph mon code, wherein an attempt to start a non-existing monitor is

Re: Non existing monitor

2014-08-01 Thread Wido den Hollander
On 07/30/2014 12:07 PM, Aanchal Agrawal wrote: Hi, We found a case (bug?) in ceph mon code, wherein an attempt to start a non-existing monitor is throwing up a leveldb error saying failed to create new leveldb store, instead we thought an appropriate message say No Monitor present with that

Re: Simplified LRC in CEPH

2014-08-01 Thread Loic Dachary
Hi Andreas, Enlightening explanation, thank you ! On 01/08/2014 13:45, Andreas Joachim Peters wrote: Hi Loic et al. I managed to prototype (and understand) LRC encoding similar to Xorbas in the ISA plug-in. As an example take a (16,4) code (which gives nice alignment for 4k blocks):

RE: Simplified LRC in CEPH

2014-08-01 Thread Andreas Joachim Peters
Hi Loic, It would, definitely. How would you control where data / parity chunks are located ? I ordered the chunks after encoding in this way: ( 1 2 3 4 LP1 ) ( 5 6 7 8 LP2 ) ( 9 10 11 12 LP3 ) ( 13 14 15 16 LP4 ) ( R2 R3 R4 LP5 ) Always (k/l)+1 consecutive chunks belong location-wise
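The placement rule quoted above, (k/l)+1 consecutive chunks per location group, can be sketched like this (chunk labels follow the mail's ordering; the helper name is illustrative):

```python
def placement_groups(chunk_names, group_size):
    """Split the encoded chunk sequence into location groups of
    consecutive chunks, matching the ordering in the mail: each group
    holds its data (or global-parity) chunks plus one local parity,
    so a whole group can share a failure domain."""
    return [chunk_names[i:i + group_size]
            for i in range(0, len(chunk_names), group_size)]

# (16,4) example from the thread: 16 data chunks, local parities
# LP1-LP4, then global parities R2-R4 plus LP5; group size (k/l)+1 = 5.
chunks = ([str(i) for i in range(1, 5)] + ["LP1"] +
          [str(i) for i in range(5, 9)] + ["LP2"] +
          [str(i) for i in range(9, 13)] + ["LP3"] +
          [str(i) for i in range(13, 17)] + ["LP4"] +
          ["R2", "R3", "R4", "LP5"])
groups = placement_groups(chunks, 5)
assert groups[0] == ["1", "2", "3", "4", "LP1"]
assert groups[-1] == ["R2", "R3", "R4", "LP5"]
```

Grouping consecutive chunks this way lets a CRUSH rule place each group together, so single-chunk repairs stay local to one failure domain.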

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-08-01 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 6:39 PM, Peter Zijlstra pet...@infradead.org wrote: On Thu, Jul 31, 2014 at 04:30:52PM +0200, Mike Galbraith wrote: On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: Smells like maybe current->state != TASK_RUNNING Bingo [ 1200.851004] kjournald D

Re: Simplified LRC in CEPH

2014-08-01 Thread Loic Dachary
Hi Andreas, It probably is just what we need. Although https://github.com/ceph/ceph/pull/1921 is more flexible in terms of chunk placement, I can't think of a use case where it would actually be useful. Maybe it's just me being back from holidays but it smells like a solution to a non

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-08-01 Thread Peter Zijlstra
On Fri, Aug 01, 2014 at 04:56:27PM +0400, Ilya Dryomov wrote: I'm going to fix up rbd_request_fn(), but I want to make sure I understand this in full. - Previously the danger of calling blocking primitives on the way to schedule(), i.e. with task->state != TASK_RUNNING, was that if the

[ANN] ceph-deploy 1.5.10 released

2014-08-01 Thread Alfredo Deza
Hi All, There is a new release of ceph-deploy, the easy deployment tool for Ceph. This release comes with a few improvements towards better usage of ceph-disk on remote nodes, with more verbosity so things are a bit more clear when they execute. The full list of fixes for this release can be

RE: Simplified LRC in CEPH

2014-08-01 Thread Andreas Joachim Peters
Hi Loic, your initial scheme is more flexible. Nevertheless I think that a simplified description covers 99% of people's use cases. You can absorb the logic of implied parity into your generic LRC plug-in if you find a good way to describe something like 'compute but don't store'. If you

Re: KeyFileStore ?

2014-08-01 Thread Guang Yang
I really like the idea, one scenario keeps bothering us is that there are too many small files which make the file system indexing slow (so that a single read request could take more than 10 disk IOs for path lookup). If we pursue this proposal, is there a chance we can take one step further,

Re: call for comments - ceph-disk making OSD directories on typos and is inconsistent (usability).

2014-08-01 Thread Sage Weil
On Fri, 1 Aug 2014, Owen Synge wrote: Dear all, By default ceph-disk will do the following: # ceph-disk - prepare --fs-type xfs --cluster ceph -- /dev/sdk DEBUG:ceph-disk:Preparing osd data dir /dev/sdk No block device /dev/sdk exists so ceph-disk decides a block device is not

RE: Simplified LRC in CEPH

2014-08-01 Thread Sage Weil
On Fri, 1 Aug 2014, Andreas Joachim Peters wrote: Hi Loic, your initial scheme is more flexible. Nevertheless I think that a simplified description covers 99% of people's use cases. You can absorb the logic of implied parity into your generic LRC plug-in if you find a good way to describe

Instrumenting RADOS with Zipkin + LTTng

2014-08-01 Thread Marios-Evaggelos Kogias
Hello all, my name is Marios Kogias and I am a student at the National Technical University of Athens. As part of my diploma thesis and my participation in Google Summer of Code 2014 (in the LTTng organization) I am working on a low-overhead tracing infrastructure for distributed systems. I am

Re: call for comments - ceph-disk making OSD directories on typos and is inconsistent (usability).

2014-08-01 Thread Owen Synge
On 08/01/2014 05:10 PM, Sage Weil wrote: On Fri, 1 Aug 2014, Owen Synge wrote: Dear all, By default ceph-disk will do the following: # ceph-disk - prepare --fs-type xfs --cluster ceph -- /dev/sdk DEBUG:ceph-disk:Preparing osd data dir /dev/sdk No block device /dev/sdk exists so

Re: First attempt at rocksdb monitor store stress testing

2014-08-01 Thread Mark Nelson
On 07/30/2014 09:00 PM, Shu, Xinxin wrote: Does your report base on wip-rocksdb-mark branch? Yes, though I've been tweaking Joao's test tool a bit. I ran more tests with the higher ulimit for rocksdb, and also did 10,000 objects instead of 5,000. There are some interesting effects. Leveldb

RE: Non existing monitor

2014-08-01 Thread Pavan Rallabhandi
Greg, The commands used were to start monitor for an instance id that is non existing, for which a leveldb store error is thrown: snip src/ceph-mon -i foo 2014-08-01 02:47:08.210208 7f7341c3c800 -1 failed to create new leveldb store \snip The idea is to fix this behavior by throwing a
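The proposed fix amounts to validating the monitor id before touching the store, so the user sees a clear message instead of a low-level leveldb failure. A hypothetical sketch of that pre-flight check (names and structure are assumptions, not the actual mon startup code):

```python
def start_mon(mon_id, known_mons, store_exists):
    """Hypothetical pre-flight check: report a clear error for an unknown
    monitor id instead of surfacing a raw leveldb store failure."""
    if not store_exists and mon_id not in known_mons:
        raise ValueError(
            "no monitor named %r is configured for this cluster" % mon_id)
    return "starting mon.%s" % mon_id

try:
    start_mon("foo", known_mons={"a", "b", "c"}, store_exists=False)
except ValueError as e:
    print(e)  # prints: no monitor named 'foo' is configured for this cluster
```

The key point is ordering: check the id against the configured monitors first, and only then attempt to create or open the backing store.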

Free LinuxCon/CloudOpen Pass

2014-08-01 Thread Patrick McGarry
Hey cephers, Now that OSCON is in our rearview mirror we have started looking to LinuxCon/CloudOpen, which is looming just over two weeks away. If you haven't arranged tickets yet, and would like to go, let us know! We have an extra ticket (maybe two) and we'd love to have you attend and hang

Re: Instrumenting RADOS with Zipkin + LTTng

2014-08-01 Thread Samuel Just
Reading! -Sam On Fri, Aug 1, 2014 at 9:28 AM, Marios-Evaggelos Kogias marioskog...@gmail.com wrote: Hello all, my name is Marios Kogias and I am a student at the National Technical University of Athens. As part of my diploma thesis and my participation in Google Summer of Code 2014 (in the

Re: Instrumenting RADOS with Zipkin + LTTng

2014-08-01 Thread Samuel Just
I'm probably missing something, is there supposed to be a [6] with your ceph changes? -Sam On Fri, Aug 1, 2014 at 1:17 PM, Samuel Just sam.j...@inktank.com wrote: Reading! -Sam On Fri, Aug 1, 2014 at 9:28 AM, Marios-Evaggelos Kogias marioskog...@gmail.com wrote: Hello all, my name is

Re: Instrumenting RADOS with Zipkin + LTTng

2014-08-01 Thread Adam Crume
I'm a developer working on RBD replay, so I've written a lot of the tracing code. I'd like to start out by saying that I'm speaking for myself, not for the Ceph project as a whole. This certainly is interesting. This would be useful for analysis that simple statistics couldn't capture, like

Re: KeyFileStore ?

2014-08-01 Thread Samuel Just
Sage's basic approach sounds about right to me. I'm fairly skeptical about the benefits of packing small objects together within larger files, though. It seems like for very small objects, we would be better off stashing the contents opportunistically within the onode. For somewhat larger

RE: First attempt at rocksdb monitor store stress testing

2014-08-01 Thread Sage Weil
Hi xinxin, It's merged! We've hit one other snag, though.. rocksdb is failing to build in i386. See http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-trusty-i386-basic/log.cgi?log=52c2182fe833e8a0206787ecd878bd010cc2e529 Do you mind taking a look? Probably the int type in the

firefly backports

2014-08-01 Thread Sage Weil
Sam, Yehuda, I backported a ton of pending stuff to firefly-next. A bunch of it was the cephtool tests and unwinding it from the mds cli changes. Also did ceph df writeable space (with fix). I didn't do some of the core OSD stuff, though, namely 8438 ec objects are not cleaned up 8701