Sorry, I meant the building phase in this case. Building 900 requests takes too long, so the kernel starts to cancel these I/O requests.

  void AioCompletion::finish_adding_requests(CephContext *cct)
  {
    ldout(cct, 20) << "AioCompletion::finish_adding_requests " << (void*)this
                   << " pending " << pending_count << dendl;
    lock.Lock();
    assert(building);
    building = false;
    if (!pending_count) {
      finalize(cct, rval);
      complete();
    }
    lock.Unlock();
  }

finalize() and complete() are only called once pending_count is 0, i.e. when all I/O is done.

Stefan

On 19.11.2012 09:38, Stefan Priebe - Profihost AG wrote:
Hi Josh,

i got the following info from the qemu devs.

The discards get canceled by the client kernel because they take too long.
This happens due to the fact that ceph handles discards as buffered I/O.

I see that there are up to 800 pending requests, and rbd only returns
success once no requests are left. This is too long for the kernel.

I think discards must be changed to unbuffered I/O to solve this.

Greets,
Stefan

On 18.11.2012 03:38, Josh Durgin wrote:
On 11/17/2012 02:19 PM, Stefan Priebe wrote:
Hello list,

right now librbd returns an error if I issue a discard for a sector /
byte range where ceph does not have any backing object, because I never
wrote to that region.

Thanks for bringing this up again. I haven't had time to dig deeper
into it yet, but I definitely want to fix this for bobtail.

This is not correct. It should return 0 / OK in this case.

Stefan

Example log:
2012-11-02 21:06:17.649922 7f745f7fe700 20 librbd::AioRequest:
WRITE_FLAT
2012-11-02 21:06:17.649924 7f745f7fe700 20 librbd::AioCompletion:
AioCompletion::complete_request() this=0x7f72cc05bd20
complete_cb=0x7f747021d4b0
2012-11-02 21:06:17.649924 7f747015c780  1 -- 10.10.0.2:0/2028325 -->
10.10.0.18:6803/9687 -- osd_op(client.26862.0:3073
rb.0.1044.359ed6c7.000000000bde [delete] 3.bd84636 snapc 2=[]) v4 -- ?+0
0x7f72d81c69b0 con 0x7f74600dbf50
2012-11-02 21:06:17.649934 7f747015c780 20 librbd:  oid
rb.0.1044.359ed6c7.000000000bdf 0~4194304 from [4156556288,4194304]
2012-11-02 21:06:17.649972 7f7465a6e700  1 -- 10.10.0.2:0/2028325 <==
osd.1202 10.10.0.18:6806/9821 143 ==== osd_op_reply(1652
rb.0.1044.359ed6c7.000000000652 [delete] ondisk = -2 (No such file or
directory)) v4 ==== 130+0+0 (2964367729 0 0) 0x7f72dc0f0090 con
0x7f74600e4350
2012-11-02 21:06:17.649994 7f745f7fe700 20 librbd::AioRequest: write
0x7f74600feab0 should_complete: r = -2

This last line isn't printing what's actually being returned to the
application. It's still in librbd's internal processing, and will be
converted to 0 for the application.

Could you try with the master or next branches? After the
'should_complete' line, there should be a line like:

<date> <time> <thread_id> 20 librbd::AioCompletion:
AioCompletion::finalize() rval 0 ...

That 'rval 0' shows the actual return value the application (qemu in
this case) will see.

Josh
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html