Thank you, Dan. I’ll try it.

Best,
Jialin
NERSC/LBNL

> On Jun 18, 2018, at 12:22 AM, Dan van der Ster <d...@vanderster.com> wrote:
> 
> Hi,
> 
> One way you can see exactly what is happening when you write an object
> is with --debug_ms=1.
> 
> For example, I write a 100MB object to a test pool:
> 
>   rados --debug_ms=1 -p test put 100M.dat 100M.dat
> 
> I pasted the output of this here: https://pastebin.com/Zg8rjaTV
> In this case, the client first gets the cluster maps from a mon, then
> writes the object to osd.58, which is the primary OSD for PG 119.77:
> 
> # ceph pg 119.77 query | jq .up
> [
>  58,
>  49,
>  31
> ]
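> 
> If you drive librados from Python instead of the CLI, you can get the
> same messenger logging by passing the option at connect time. A rough
> sketch, not a recipe -- the pool name 'test', the object/file names,
> and the default conffile path are all assumptions:
> 
>   import rados
> 
>   # conf overrides are applied before connect(); debug_ms=1 makes the
>   # client log every message it sends/receives, like the CLI run above.
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
>                         conf={'debug_ms': '1'})
>   cluster.connect()
>   ioctx = cluster.open_ioctx('test')
>   with open('100M.dat', 'rb') as f:
>       ioctx.write_full('100M.dat', f.read())
>   ioctx.close()
>   cluster.shutdown()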
> 
> Otherwise I answered your questions below...
> 
>> On Sun, Jun 17, 2018 at 8:29 PM Jialin Liu <jaln...@lbl.gov> wrote:
>> 
>> Hello,
>> 
>> I have a couple of questions regarding I/O on OSDs via librados.
>> 
>> 
>> 1. How can I check which OSD is receiving data?
>> 
> 
> See `ceph osd map`.
> For my example above:
> 
> # ceph osd map test 100M.dat
> osdmap e236396 pool 'test' (119) object '100M.dat' -> pg 119.864b0b77
> (119.77) -> up ([58,49,31], p58) acting ([58,49,31], p58)
> 
>> 2. Can the write operation return to the application immediately once
>> the write to the primary OSD is done, or does it return only when the
>> data has been replicated twice? (size=3)
> 
> The write returns only once the data is safe on *all* replicas or EC chunks.
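> 
> You can observe this from librados with an async write: the operation
> is queued immediately, but the completion doesn't fire until the OSDs
> report the data durable on every replica. A minimal Python sketch,
> assuming a pool named 'test' and a made-up object name:
> 
>   import rados
> 
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>   cluster.connect()
>   ioctx = cluster.open_ioctx('test')
> 
>   # aio_write_full returns at once with a Completion object; the
>   # write is only acknowledged after all replicas have persisted it.
>   comp = ioctx.aio_write_full('obj', b'x' * 4096)
>   comp.wait_for_complete()   # blocks until the OSDs ack the write
> 
>   ioctx.close()
>   cluster.shutdown()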
> 
>> 3. What is the I/O size at the lower level in librados? E.g., if I
>> send a 100MB request with one thread, does librados send the data in
>> fixed-size transactions?
> 
> This depends on the client. The `rados` CLI example I showed you broke
> the 100MB object into 4MB parts.
> Most use cases keep objects around 4MB or 8MB.
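> 
> You can mimic that chunking yourself through librados, since the
> object write call takes an offset. A sketch of splitting a buffer
> into 4MB writes (pool, object, and file names are assumptions, and
> reading the whole file into memory is just for brevity):
> 
>   import rados
> 
>   CHUNK = 4 * 1024 * 1024  # 4MB, matching the rados CLI default
> 
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>   cluster.connect()
>   ioctx = cluster.open_ioctx('test')
> 
>   with open('100M.dat', 'rb') as f:
>       data = f.read()
>   # Write the buffer into a single object in fixed-size pieces,
>   # issuing one write per chunk at increasing offsets.
>   for off in range(0, len(data), CHUNK):
>       ioctx.write('100M.dat', data[off:off + CHUNK], off)
> 
>   ioctx.close()
>   cluster.shutdown()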
> 
>> 4. I have 4 OSSs (storage servers) and 48 OSDs; will the 4 servers
>> become the bottleneck? From the Ceph documentation, once the client
>> has received the cluster map it can talk to the OSDs directly, so my
>> assumption is that the maximum parallelism depends on the number of
>> OSDs. Is this correct?
>> 
> 
> That's more or less correct -- the IOPS and BW capacity of the cluster
> generally scales linearly with the number of OSDs.
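> 
> One way to see that parallelism from librados is to issue many async
> writes to different objects: each object name hashes to its own PG,
> so the writes fan out across primary OSDs. A sketch, assuming the
> same 'test' pool and invented object names:
> 
>   import rados
> 
>   cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>   cluster.connect()
>   ioctx = cluster.open_ioctx('test')
> 
>   # Fire off 48 writes without waiting; distinct object names land
>   # on different PGs, and therefore (usually) different primary OSDs.
>   comps = [ioctx.aio_write_full('obj-%d' % i, b'x' * (4 * 1024 * 1024))
>            for i in range(48)]
>   for c in comps:
>       c.wait_for_complete()  # all writes were in flight concurrently
> 
>   ioctx.close()
>   cluster.shutdown()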
> 
> Cheers,
> Dan
> CERN