Re: rados_clone_range for different pgs

2013-10-08 Thread Oleg Krasnianskiy
We use Ceph to store huge files striped into small (4 MB) objects.
Because files can change unpredictably (data
insertion/modification/deletion in any part of a file), we have to
copy parts of objects between objects, and currently this is done via the client.
I see the following ways to solve this problem:
 - implement a client that is launched on the same host as the source
osd, that will handle the copy process
 - add functionality to the osd, so it can do copy to other osds

Which way best suits the Ceph ideology?
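
To illustrate why an edit in the middle of a striped file forces data to move between objects, here is a minimal sketch. It assumes a plain fixed-size layout in which byte `k` of the file lives in object `k // 4 MB`; the thread does not describe the actual layout, and the function names are hypothetical.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4 MB stripe unit, as mentioned in the message


def locate(file_offset, object_size=OBJECT_SIZE):
    """Map a byte offset in the logical file to (object index, offset in object)."""
    return divmod(file_offset, object_size)


def objects_rewritten_by_insert(file_size, insert_offset, insert_len,
                                object_size=OBJECT_SIZE):
    """Return the indices of objects whose contents change after an insertion.

    Assumption: fixed-size striping with no indirection layer, so every byte
    after the insertion point shifts by insert_len and may cross into the
    next object.
    """
    first, _ = locate(insert_offset, object_size)
    new_size = file_size + insert_len
    last, _ = locate(max(new_size - 1, 0), object_size)
    return list(range(first, last + 1))
```

Under this assumption, inserting even one byte rewrites every object from the insertion point to the end of the file, and the tail of each object spills into its neighbour, which usually lives in a different pg.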

2013/8/2 Sage Weil s...@inktank.com:
 Hi Oleg,

 On Fri, 2 Aug 2013, Oleg Krasnianskiy wrote:
 Hi

 I have asked this question in ceph-users, but did not get any
 response, so I'll test my luck again, but with ceph-devel =)

 Sorry about that!

 Is there any way to copy part of one object into another one if they
 reside in different pgs?
 There is rados_clone_range, but it requires both objects to be inside one pg.

 There is no way currently.  clone_range can only work (reliably) on an
 OSD if both objects are stored with the same locator key; otherwise there
 is only a ~R/N chance that they land on the same OSD (where N is the number
 of OSDs, R is the number of replicas), which isn't worth optimizing for.
 If the objects aren't stored together, you need to read and then write the
 data; this avoids adding additional complexity to the OSD for minimal gain.

 Do you have a use-case in mind where this functionality is important?

 sage

--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: rados_clone_range for different pgs

2013-10-08 Thread Gregory Farnum
On Tue, Oct 8, 2013 at 7:40 AM, Oleg Krasnianskiy
oleg.krasnians...@gmail.com wrote:
 We use Ceph to store huge files striped into small (4 MB) objects.
 Because files can change unpredictably (data
 insertion/modification/deletion in any part of a file), we have to
 copy parts of objects between objects, and currently this is done via the client.
 I see the following ways to solve this problem:
  - implement a client that is launched on the same host as the source
 osd, that will handle the copy process
  - add functionality to the osd, so it can do copy to other osds

 Which way best suits the Ceph ideology?

I'm a bit confused; why does chunking of files into objects
necessitate copying between objects?

In any case, I suspect you will want to do this via OSD commands
rather than by trying to put a client next to the OSD (this is subject
to races if an OSD dies, for instance). We are currently implementing
similar functionality for the first time, in order to support caching
and tiering pools. It's not yet exposed to clients, but it shouldn't
be difficult to extend our new copyfrom interface (used by the OSD)
into a copy_chunk interface that we can expose to clients and that
copies part of one object into another, if somebody wants to take a
stab at it!
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


rados_clone_range for different pgs

2013-08-02 Thread Oleg Krasnianskiy
Hi

I have asked this question in ceph-users, but did not get any
response, so I'll test my luck again, but with ceph-devel =)

Is there any way to copy part of one object into another one if they
reside in different pgs?
There is rados_clone_range, but it requires both objects to be inside one pg.

Thanks!


Re: rados_clone_range for different pgs

2013-08-02 Thread Sage Weil
Hi Oleg,

On Fri, 2 Aug 2013, Oleg Krasnianskiy wrote:
 Hi
 
 I have asked this question in ceph-users, but did not get any
 response, so I'll test my luck again, but with ceph-devel =)

Sorry about that!
 
 Is there any way to copy part of one object into another one if they
 reside in different pgs?
 There is rados_clone_range, but it requires both objects to be inside one pg.

There is no way currently.  clone_range can only work (reliably) on an
OSD if both objects are stored with the same locator key; otherwise there
is only a ~R/N chance that they land on the same OSD (where N is the number
of OSDs, R is the number of replicas), which isn't worth optimizing for.
If the objects aren't stored together, you need to read and then write the
data; this avoids adding additional complexity to the OSD for minimal gain.
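
The read-then-write fallback can be sketched roughly as follows. This is only an illustration, not code from the thread: it assumes an I/O context object with the `read(key, length, offset)` and `write(key, data, offset)` methods of the Python rados bindings' `Ioctx`. `copy_range` and `InMemoryIoctx` are hypothetical names; the latter is a stand-in so the sketch runs without a cluster.

```python
def copy_range(ioctx, src_oid, src_off, dst_oid, dst_off, length, chunk=1 << 20):
    """Copy up to `length` bytes between two objects through the client.

    The data makes a round trip (OSD -> client -> OSD) and the copy is not
    atomic. Returns the number of bytes actually copied.
    """
    copied = 0
    while copied < length:
        want = min(chunk, length - copied)
        data = ioctx.read(src_oid, want, src_off + copied)
        if not data:
            break  # source object is shorter than the requested range
        ioctx.write(dst_oid, data, dst_off + copied)
        copied += len(data)
    return copied


class InMemoryIoctx:
    """Hypothetical stand-in for an I/O context, for trying the sketch locally.

    Unlike a real cluster, reading a missing object returns b"" instead of
    raising an error.
    """

    def __init__(self):
        self.objects = {}

    def read(self, key, length=8192, offset=0):
        return bytes(self.objects.get(key, b"")[offset:offset + length])

    def write(self, key, data, offset=0):
        buf = bytearray(self.objects.get(key, b""))
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill the gap
        buf[offset:offset + len(data)] = data
        self.objects[key] = bytes(buf)
```

With a real cluster, the `ioctx` argument would come from an open pool connection instead of the stand-in; the two objects can then live in any pgs, at the cost of moving the data over the network twice.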

Do you have a use-case in mind where this functionality is important?

sage
