Re: Review request : Erasure Code plugin loader implementation

2013-08-20 Thread Loic Dachary
Hi Sage,

I created the ticket "erasure code : convenience functions to code / decode" 
(http://tracker.ceph.com/issues/6064) to implement the suggested functions. 
Please let me know if this should be merged with another task.

Cheers


Re: Review request : Erasure Code plugin loader implementation

2013-08-19 Thread Loic Dachary
Hi Sage,

This makes a lot more sense indeed. I updated the 
http://tracker.ceph.com/issues/5878 description accordingly.

ceph osd pool create poolname erasure-code-dir=/var/lib/ceph/erasure-code
erasure-code-plugin=jerasure erasure-code-m=10 erasure-code-k=3 
erasure-code-algorithm=Reed-Solomon 

Thanks :-)

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.





Re: Review request : Erasure Code plugin loader implementation

2013-08-19 Thread Loic Dachary


On 19/08/2013 02:01, Sage Weil wrote:
 On Sun, 18 Aug 2013, Loic Dachary wrote:
 Hi Sage,

 Unless I misunderstood something ( which is still possible at this stage ;-) 
 decode() is used both for recovery of missing chunks and retrieval of the 
 original buffer. Decoding the M data chunks is a special case of decoding N 
 = M chunks out of the M+K chunks that were produced by encode(). It can be 
 used to recover parity chunks as well as data chunks.

 https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst#erasure-code-library-abstract-api

 map<int, buffer> decode(const set<int> want_to_read, const map<int, buffer> chunks)

 decode chunks to read the content of the want_to_read chunks and return 
 a map associating the chunk number with its decoded content. For instance, 
 in the simplest case M=2,K=1 for an encoded payload of data A and B with 
 parity Z, calling

 decode([1,2], { 1 => 'A', 2 => 'B', 3 => 'Z' })
 => { 1 => 'A', 2 => 'B' }

 If, however, chunk B is to be read but is missing, it will be:

 decode([2], { 1 => 'A', 3 => 'Z' })
 => { 2 => 'B' }
 
 Ah, I guess this works when some of the chunks contain the original 
 data (as with a parity code).  There are codes that don't work that way, 
 although I suspect we won't use them.
 
 Regardless, I wonder if we should generalize slightly and have some 
 methods work in terms of (offset,length) of the original stripe to 
 generalize that bit.  Then we would have something like
 
  map<int, buffer> transcode(const set<int> want_to_read, const map<int, buffer> chunks);
 
 to go from chunks -> chunks (as we would want to do with, say, an LRC-like 
 code where we can rebuild some shards from a subset of the other shards).  
 And then also have
 
  int decode(const map<int, buffer> chunks, unsigned offset, 
  unsigned len, bufferlist *out);

This function would be implemented more or less as:

  set<int> want_to_read = range_to_chunks(offset, len); // compute what chunks must be retrieved
  set<int> available = the up set
  set<int> minimum = minimum_to_decode(want_to_read, available);
  map<int, buffer> available_chunks = retrieve_chunks_from_osds(minimum);
  map<int, buffer> chunks = transcode(want_to_read, available_chunks); // repairs if necessary
  out = bufferptr(concat_chunks(chunks), offset - offset of the first chunk, len)
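
The steps above can be sketched in C++ roughly as follows. This is only an illustration under simplifying assumptions that are not part of the proposal: std::string stands in for buffer, CHUNK_SIZE is a hypothetical fixed chunk size, data chunks are numbered from 0 in stripe order, and every wanted chunk is assumed to be available, so the minimum_to_decode() and transcode() steps reduce to a lookup.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using Chunk = std::string;       // hypothetical stand-in for buffer
const unsigned CHUNK_SIZE = 4;   // assumed fixed chunk size, in bytes

// Data chunks that cover [offset, offset + len) of the original stripe.
std::set<int> range_to_chunks(unsigned offset, unsigned len) {
  assert(len > 0);
  std::set<int> want;
  for (unsigned c = offset / CHUNK_SIZE; c <= (offset + len - 1) / CHUNK_SIZE; ++c)
    want.insert(static_cast<int>(c));
  return want;
}

// Simplified read path: returns 0 and fills *out, or -1 if a chunk is missing.
int decode(const std::map<int, Chunk> &chunks, unsigned offset, unsigned len,
           std::string *out) {
  std::set<int> want_to_read = range_to_chunks(offset, len);
  std::string joined;
  for (int id : want_to_read) {
    auto it = chunks.find(id);
    if (it == chunks.end()) return -1;   // a real plugin would repair here
    joined += it->second;                // concatenate chunks in stripe order
  }
  // Trim the concatenation down to the requested byte range.
  unsigned first_chunk_offset = *want_to_read.begin() * CHUNK_SIZE;
  *out = joined.substr(offset - first_chunk_offset, len);
  return 0;
}
```

With chunks { 0 => "abcd", 1 => "efgh" }, reading offset 2 and length 4 touches both chunks and yields "cdef".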

or do you have something else in mind ?

 
 that recovers the original data.
 
 In our case, the read path would use decode, and for recovery we would use 
 transcode.  
 
 We'd also want to have alternate minimum_to_decode* methods, like
 
 virtual set<int> minimum_to_decode(unsigned offset, unsigned len, const set<int> available_chunks) = 0;

I also have a convenience wrapper in mind for this but I feel I'm missing 
something.

Cheers

 

Re: Review request : Erasure Code plugin loader implementation

2013-08-19 Thread Sage Weil
On Mon, 19 Aug 2013, Loic Dachary wrote:
 
 
 On 19/08/2013 02:01, Sage Weil wrote:
  On Sun, 18 Aug 2013, Loic Dachary wrote:
  Hi Sage,
 
  Unless I misunderstood something ( which is still possible at this stage 
  ;-) decode() is used both for recovery of missing chunks and retrieval of 
  the original buffer. Decoding the M data chunks is a special case of 
  decoding N = M chunks out of the M+K chunks that were produced by 
  encode(). It can be used to recover parity chunks as well as data chunks.
 
  https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst#erasure-code-library-abstract-api
 
  map<int, buffer> decode(const set<int> want_to_read, const map<int, buffer> chunks)
 
  decode chunks to read the content of the want_to_read chunks and 
  return a map associating the chunk number with its decoded content. For 
  instance, in the simplest case M=2,K=1 for an encoded payload of data A 
  and B with parity Z, calling
 
  decode([1,2], { 1 => 'A', 2 => 'B', 3 => 'Z' })
  => { 1 => 'A', 2 => 'B' }
 
  If, however, chunk B is to be read but is missing, it will be:
 
  decode([2], { 1 => 'A', 3 => 'Z' })
  => { 2 => 'B' }
  
  Ah, I guess this works when some of the chunks contain the original 
  data (as with a parity code).  There are codes that don't work that way, 
  although I suspect we won't use them.
  
  Regardless, I wonder if we should generalize slightly and have some 
  methods work in terms of (offset,length) of the original stripe to 
  generalize that bit.  Then we would have something like
  
   map<int, buffer> transcode(const set<int> want_to_read, const map<int, buffer> chunks);
  
  to go from chunks -> chunks (as we would want to do with, say, an LRC-like 
  code where we can rebuild some shards from a subset of the other shards).  
  And then also have
  
   int decode(const map<int, buffer> chunks, unsigned offset, 
   unsigned len, bufferlist *out);
 
 This function would be implemented more or less as:
 
   set<int> want_to_read = range_to_chunks(offset, len); // compute what chunks must be retrieved
   set<int> available = the up set
   set<int> minimum = minimum_to_decode(want_to_read, available);
   map<int, buffer> available_chunks = retrieve_chunks_from_osds(minimum);
   map<int, buffer> chunks = transcode(want_to_read, available_chunks); // repairs if necessary
   out = bufferptr(concat_chunks(chunks), offset - offset of the first chunk, len)
 
 or do you have something else in mind ?

This makes sense.  I am still wondering if it is worth generalizing this a 
bit further to codes without a nice mapping of a range -> want_to_read 
(i.e. that require decoding the entire stripe to get any part of it).  
For those codes, we would want to choose the N cheapest/available chunks 
and the sequence above would be a bit different.  I guess in reality, 
though, we probably don't care to implement any such codes (I'm not sure 
what their advantages would be, if any)!
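
For such codes, picking the N cheapest available chunks might look like the C++ sketch below. The function name cheapest_chunks and the per-chunk cost map are hypothetical; the thread does not say how the cost of retrieving a chunk would be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Pick the n cheapest chunks among those available.
// `cost` maps chunk id -> hypothetical retrieval cost (e.g. a latency estimate).
std::set<int> cheapest_chunks(const std::map<int, int> &cost, unsigned n) {
  assert(cost.size() >= n);
  std::vector<std::pair<int, int>> by_cost;    // (cost, chunk id)
  for (const auto &kv : cost) by_cost.emplace_back(kv.second, kv.first);
  std::sort(by_cost.begin(), by_cost.end());   // cheapest first, ties by id
  std::set<int> chosen;
  for (unsigned i = 0; i < n; ++i) chosen.insert(by_cost[i].second);
  return chosen;
}
```

The chosen set would then replace the result of minimum_to_decode() in the sequence quoted above, and the whole stripe would be decoded before extracting the requested range.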

sage





  
  

Re: Review request : Erasure Code plugin loader implementation

2013-08-18 Thread Sage Weil
On Sun, 18 Aug 2013, Loic Dachary wrote:
 Hi Ceph,
 
 I've implemented a draft of the Erasure Code plugin loader in the context of 
 http://tracker.ceph.com/issues/5878. It has a trivial unit test and an 
 example plugin. It would be great if someone could do a quick review. The 
 general idea is that the erasure code pool calls something like:
 
 ErasureCodePlugin::factory(erasure_code, example, parameters)
 
 as shown at
 
 https://github.com/ceph/ceph/blob/5a2b1d66ae17b78addc14fee68c73985412f3c8c/src/test/osd/TestErasureCode.cc#L28
 
 to get an object implementing the interface
 
 https://github.com/ceph/ceph/blob/5a2b1d66ae17b78addc14fee68c73985412f3c8c/src/osd/ErasureCodeInterface.h
 
 which matches the proposal described at
 
 https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst#erasure-code-library-abstract-api
 
 The draft is at
 
 https://github.com/ceph/ceph/commit/5a2b1d66ae17b78addc14fee68c73985412f3c8c
 
 Thanks in advance :-)

I haven't been following this discussion too closely, but taking a look 
now, the first 3 make sense, but

virtual map<int, bufferptr> decode(const set<int> want_to_read, const map<int, bufferptr> chunks) = 0;

it seems like this one should be more like

virtual int decode(const map<int, bufferptr> chunks, bufferlist *out);

As in, you'd decode the chunks you have to get the actual data.  If you 
want to get (missing) chunks for recovery, you'd do

  minimum_to_decode(...);  // see what we need
  fetch those chunks from other nodes
  decode(...);   // reconstruct original buffer
  encode(...);   // encode missing chunks from original data
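
A minimal C++ sketch of this four-step recovery sequence, assuming a toy M=2, K=1 XOR parity code. std::string stands in for bufferptr/bufferlist, chunks are numbered 0 to 2, and the recover() helper and the in-memory `stored` map (playing the role of the other nodes) are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using Chunk = std::string;  // hypothetical stand-in for bufferptr/bufferlist

// Toy M=2, K=1 code: chunks 0 and 1 are the two halves, chunk 2 is their XOR.
std::map<int, Chunk> encode(const std::string &data) {
  assert(data.size() % 2 == 0);
  std::size_t half = data.size() / 2;
  Chunk a = data.substr(0, half), b = data.substr(half), z(half, '\0');
  for (std::size_t i = 0; i < half; ++i) z[i] = static_cast<char>(a[i] ^ b[i]);
  return {{0, a}, {1, b}, {2, z}};
}

// Any two of the three chunks are enough; pick the two lowest ids available.
std::set<int> minimum_to_decode(const std::set<int> &available) {
  assert(available.size() >= 2);
  auto it = available.begin();
  int first = *it++;
  return {first, *it};
}

// Rebuild the original buffer from any two chunks; returns 0 on success.
int decode(const std::map<int, Chunk> &chunks, std::string *out) {
  if (chunks.size() < 2) return -1;
  auto x = [](const Chunk &p, const Chunk &q) {
    Chunk r(p.size(), '\0');
    for (std::size_t i = 0; i < p.size(); ++i) r[i] = static_cast<char>(p[i] ^ q[i]);
    return r;
  };
  Chunk a = chunks.count(0) ? chunks.at(0) : x(chunks.at(1), chunks.at(2));
  Chunk b = chunks.count(1) ? chunks.at(1) : x(chunks.at(0), chunks.at(2));
  *out = a + b;
  return 0;
}

// The recovery sequence: minimum_to_decode, fetch, decode, encode.
Chunk recover(int missing, const std::map<int, Chunk> &stored) {
  std::set<int> available;
  for (const auto &kv : stored) available.insert(kv.first);
  std::map<int, Chunk> fetched;
  for (int id : minimum_to_decode(available))  // see what we need
    fetched[id] = stored.at(id);               // fetch those chunks
  std::string original;
  int r = decode(fetched, &original);          // reconstruct original buffer
  assert(r == 0);
  (void)r;
  return encode(original).at(missing);         // re-encode the missing chunk
}
```

The cost of this approach is that the whole original buffer is rebuilt even when a single chunk is missing.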

sage
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Review request : Erasure Code plugin loader implementation

2013-08-18 Thread Loic Dachary
Hi Sage,

Unless I misunderstood something ( which is still possible at this stage ;-) 
decode() is used both for recovery of missing chunks and retrieval of the 
original buffer. Decoding the M data chunks is a special case of decoding N = 
M chunks out of the M+K chunks that were produced by encode(). It can be used 
to recover parity chunks as well as data chunks.

https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst#erasure-code-library-abstract-api

map<int, buffer> decode(const set<int> want_to_read, const map<int, buffer> chunks)

decode chunks to read the content of the want_to_read chunks and return a 
map associating the chunk number with its decoded content. For instance, in the 
simplest case M=2,K=1 for an encoded payload of data A and B with parity Z, 
calling

decode([1,2], { 1 => 'A', 2 => 'B', 3 => 'Z' })
=> { 1 => 'A', 2 => 'B' }

If, however, chunk B is to be read but is missing, it will be:

decode([2], { 1 => 'A', 3 => 'Z' })
=> { 2 => 'B' }
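
The M=2, K=1 example above can be reproduced with a small C++ sketch. It assumes that the parity Z is the byte-wise XOR of A and B (the example does not say which code produces Z) and uses std::string as a stand-in for buffer; xor_chunks is a hypothetical helper.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using Chunk = std::string;  // hypothetical stand-in for the buffer type

// XOR two equally sized chunks byte by byte.
Chunk xor_chunks(const Chunk &a, const Chunk &b) {
  assert(a.size() == b.size());
  Chunk out(a.size(), '\0');
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = static_cast<char>(a[i] ^ b[i]);
  return out;
}

// Toy M=2, K=1 code: chunks 1 and 2 hold data, chunk 3 is their XOR parity.
// Returns the requested chunks, rebuilding a missing one from the other two.
std::map<int, Chunk> decode(const std::set<int> &want_to_read,
                            const std::map<int, Chunk> &chunks) {
  std::map<int, Chunk> out;
  for (int id : want_to_read) {
    auto it = chunks.find(id);
    if (it != chunks.end()) {
      out[id] = it->second;               // chunk is available, copy it
      continue;
    }
    // Chunk is missing: XOR the two other chunks together.
    Chunk rebuilt;
    bool first = true;
    for (int other = 1; other <= 3; ++other) {
      if (other == id) continue;
      const Chunk &c = chunks.at(other);  // throws if two chunks are missing
      rebuilt = first ? c : xor_chunks(rebuilt, c);
      first = false;
    }
    out[id] = rebuilt;
  }
  return out;
}
```

The same function rebuilds the parity chunk when chunk 3 is requested but missing, which is the sense in which decode() recovers parity chunks as well as data chunks.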

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.





Re: Review request : Erasure Code plugin loader implementation

2013-08-18 Thread Sage Weil
On Sun, 18 Aug 2013, Loic Dachary wrote:
 Hi Sage,
 
 Unless I misunderstood something ( which is still possible at this stage ;-) 
 decode() is used both for recovery of missing chunks and retrieval of the 
 original buffer. Decoding the M data chunks is a special case of decoding N 
 = M chunks out of the M+K chunks that were produced by encode(). It can be 
 used to recover parity chunks as well as data chunks.
 
 https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst#erasure-code-library-abstract-api
 
 map<int, buffer> decode(const set<int> want_to_read, const map<int, buffer> chunks)
 
 decode chunks to read the content of the want_to_read chunks and return a 
 map associating the chunk number with its decoded content. For instance, in 
 the simplest case M=2,K=1 for an encoded payload of data A and B with parity 
 Z, calling
 
 decode([1,2], { 1 => 'A', 2 => 'B', 3 => 'Z' })
 => { 1 => 'A', 2 => 'B' }
 
 If, however, chunk B is to be read but is missing, it will be:
 
 decode([2], { 1 => 'A', 3 => 'Z' })
 => { 2 => 'B' }

Ah, I guess this works when some of the chunks contain the original 
data (as with a parity code).  There are codes that don't work that way, 
although I suspect we won't use them.

Regardless, I wonder if we should generalize slightly and have some 
methods work in terms of (offset,length) of the original stripe to 
generalize that bit.  Then we would have something like

 map<int, buffer> transcode(const set<int> want_to_read, const map<int, buffer> chunks);

to go from chunks -> chunks (as we would want to do with, say, an LRC-like 
code where we can rebuild some shards from a subset of the other shards).  
And then also have

 int decode(const map<int, buffer> chunks, unsigned offset, 
 unsigned len, bufferlist *out);

that recovers the original data.

In our case, the read path would use decode, and for recovery we would use 
transcode.  

We'd also want to have alternate minimum_to_decode* methods, like

virtual set<int> minimum_to_decode(unsigned offset, unsigned len, const set<int> available_chunks) = 0;
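
One possible shape for such a range-based minimum_to_decode, as a C++ sketch. It is written as a free function rather than a virtual method, and it assumes a systematic code with M data chunks of CHUNK_SIZE bytes each, numbered from 0, where any M chunks suffice to rebuild the stripe. The constants and these properties are assumptions for illustration, not something the thread settles.

```cpp
#include <cassert>
#include <set>

const unsigned M = 2;            // number of data chunks (assumed)
const unsigned CHUNK_SIZE = 4;   // bytes per chunk (assumed)

// Smallest set of chunks to fetch in order to read [offset, offset + len).
// Returns an empty set when the range cannot be decoded.
std::set<int> minimum_to_decode(unsigned offset, unsigned len,
                                const std::set<int> &available_chunks) {
  assert(len > 0 && offset + len <= M * CHUNK_SIZE);
  // Data chunks covering the requested byte range.
  std::set<int> want;
  for (unsigned c = offset / CHUNK_SIZE; c <= (offset + len - 1) / CHUNK_SIZE; ++c)
    want.insert(static_cast<int>(c));
  bool all_present = true;
  for (int id : want)
    if (!available_chunks.count(id)) all_present = false;
  if (all_present) return want;                 // plain read, nothing to repair
  if (available_chunks.size() < M) return std::set<int>();  // too many lost
  // Otherwise fall back to the first M available chunks.
  auto it = available_chunks.begin();
  std::set<int> minimum;
  while (minimum.size() < M) minimum.insert(*it++);
  return minimum;
}
```

When the wanted data chunks are all present, only those are returned, so a small read does not need to touch the rest of the stripe.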

What do you think?

sage




 


Re: Review request : Erasure Code plugin loader implementation

2013-08-18 Thread Sage Weil
Hi Loic,

One other thought on http://tracker.ceph.com/issues/5878:

The user interface there would let you adjust various parameters of the 
pool's erasure coding scheme after the pool is created.  As a practical 
matter, I suspect that many/most of these fields will be specified exactly 
once (at pool creation time) and will be immutable properties of the pool 
after that.  The m/k at a minimum need to match up with what we are 
requesting out of crush.  And once there is data stored, I don't think it 
will make sense to be able to change the encoding scheme for new objects 
and still be able to deal with old objects.  (Or maybe it will be, if the 
code metadata is in the object_info_t.)

Even if we do support changing some of these on the fly, though, I suspect 
the most important interface, and the first we implement, will be 
something like

 ceph osd pool create name [key=value ...]

with the various parameters listed, like EC algorithm, m, k, and pg_num.  We 
can probably generalize the mon command interface to have a key/value list 
type that will make this easy to plumb from the CLI (and trivial via 
ceph-rest-api).
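
A key/value list of this kind could be parsed along the following lines. This is a C++ sketch only: parse_parameters is a hypothetical helper, not an existing mon command API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Parse "key=value" tokens into a map; returns false on a malformed token.
bool parse_parameters(const std::vector<std::string> &tokens,
                      std::map<std::string, std::string> *parameters) {
  assert(parameters != nullptr);
  for (const std::string &token : tokens) {
    std::size_t eq = token.find('=');
    if (eq == std::string::npos || eq == 0) return false;  // no '=' or no key
    (*parameters)[token.substr(0, eq)] = token.substr(eq + 1);
  }
  return true;
}
```

The resulting map could be handed to the erasure code plugin as is, leaving the interpretation of each key to the plugin.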

sage