Re: Different geoms for an rbd block device

2012-10-30 Thread Josh Durgin

On 10/28/2012 03:02 AM, Andrey Korolyov wrote:

Hi,

Should the following behavior be considered normal?

$ rbd map test-rack0/debiantest --user qemukvm --secret qemukvm.key
$ fdisk /dev/rbd1

Command (m for help): p

Disk /dev/rbd1: 671 MB, 671088640 bytes
255 heads, 63 sectors/track, 81 cylinders, total 1310720 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 4194304 bytes / 4194304 bytes
Disk identifier: 0x00056f14

     Device Boot      Start         End      Blocks   Id  System
/dev/rbd1p1            2048       63487       30720   82  Linux swap / Solaris
Partition 1 does not start on physical sector boundary.
/dev/rbd1p2           63488     1292287      614400   83  Linux
Partition 2 does not start on physical sector boundary.

Meanwhile, in the guest VM over the same image:

$ fdisk /dev/vda

Command (m for help): p

Disk /dev/vda: 671 MB, 671088640 bytes
16 heads, 63 sectors/track, 1300 cylinders, total 1310720 sectors


I'm guessing the reported number of cylinders is the issue?
You can control that with a qemu option. I think

-drive ...cyls=81

will do it. You can also set the min/opt i/o sizes via
qemu device properties min_io_size and opt_io_size in
the same way you can adjust discard granularity:

http://ceph.com/docs/master/rbd/qemu-rbd/#enabling-discard-trim

Unfortunately min_io_size is a uint16 in qemu, so it won't
be able to store 4194304.
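
To make the limit above concrete, here is a minimal Python sketch (not qemu code) that checks which of the values from this thread would fit into an unsigned 16-bit field. The helper name fits_uint16 is made up for illustration.

```python
UINT16_MAX = 2**16 - 1  # 65535, the largest value an unsigned 16-bit field holds


def fits_uint16(value: int) -> bool:
    """Return True if value is representable as an unsigned 16-bit integer."""
    return 0 <= value <= UINT16_MAX


# I/O sizes (in bytes) taken from the two fdisk outputs in this thread.
rbd_io_size = 4194304
guest_io_size = 512

print(fits_uint16(rbd_io_size))    # False
print(fits_uint16(guest_io_size))  # True
```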


Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00056f14

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1            2048       63487       30720   82  Linux swap / Solaris
/dev/vda2           63488     1292287      614400   83  Linux

The real pain starts when I try to repartition the disk after 'rbd
map' using its geometry: it simply breaks the partition layout. For
example, the first block offset moves from sector 2048 to 8192. Of
course I can specify the geometry by hand, but before that I may need
to start the VM at least once, or do something else that will print
out the actual layout.
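
A small sketch of the arithmetic that seems to be behind that shift, assuming the partitioning tool rounds partition starts up to the optimal I/O size reported for the device (an assumption, not checked against fdisk's source). The function name align_up is hypothetical.

```python
SECTOR_SIZE = 512      # bytes, from the fdisk output
OPT_IO_SIZE = 4194304  # bytes, reported for /dev/rbd1


def align_up(sector: int, granularity_sectors: int) -> int:
    """Round a start sector up to the next multiple of the granularity."""
    return -(-sector // granularity_sectors) * granularity_sectors


granularity = OPT_IO_SIZE // SECTOR_SIZE  # alignment unit in sectors

print(granularity)                   # 8192
print(align_up(2048, granularity))   # 8192
print(2048 % granularity == 0)       # False: existing start is not 4 MiB aligned
print(63488 % granularity == 0)      # False: same for the second partition
```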

Thanks!


Setting the geometry at qemu boot time should work, and is a bit easier.
qemu actually has code to try to guess disk geometry from a partition
table, but perhaps it doesn't support the format you're using.
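
Both geometries describe the same 1310720 sectors; the cylinder count simply follows from the assumed heads and sectors per track. A quick sketch of that arithmetic in Python (the function name is made up here):

```python
def cylinders(total_sectors: int, heads: int, sectors_per_track: int) -> int:
    """Number of whole cylinders for a given CHS geometry."""
    return total_sectors // (heads * sectors_per_track)


TOTAL_SECTORS = 1310720  # from both fdisk outputs

print(cylinders(TOTAL_SECTORS, 255, 63))  # 81, as reported for /dev/rbd1
print(cylinders(TOTAL_SECTORS, 16, 63))   # 1300, as reported for /dev/vda
```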

Josh
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Different geoms for an rbd block device

2012-10-30 Thread Andrey Korolyov
On Wed, Oct 31, 2012 at 1:07 AM, Josh Durgin josh.dur...@inktank.com wrote:

 Setting the geometry at qemu boot time should work, and is a bit easier.
 qemu actually has code to try to guess disk geometry from a partition
 table, but perhaps it doesn't support the format you're using.

 Josh

So the preferable geometry is the one provided by the kernel client,
right? Are there any advantages to using large blocks for I/O with
discard (of course, not right now; I'll wait for virtio bus
support :) )? At first sight, TCP transfer speed should not differ on
typical workloads, only on exotic ones, such as delayed commit on the
guest FS plus intensive writes.


Re: Different geoms for an rbd block device

2012-10-30 Thread Josh Durgin

On 10/30/2012 02:41 PM, Andrey Korolyov wrote:

So the preferable geometry is the one provided by the kernel client,
right? Are there any advantages to using large blocks for I/O with
discard (of course, not right now; I'll wait for virtio bus
support :) )? At first sight, TCP transfer speed should not differ on
typical workloads, only on exotic ones, such as delayed commit on the
guest FS plus intensive writes.


Generally larger I/Os are better, but the kernel in the guest will
probably restrict them to less than the full 4MB. I'm not sure how
large discard operations will get, but if they span an entire object
the object will be deleted instead of needing to zero out a chunk of it.
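
As a rough illustration of the last point, this sketch splits a discard range into objects it covers completely (candidates for deletion) and objects it only partly covers (which would need a chunk zeroed). It assumes 4 MiB objects, matching the I/O size reported above; the function name classify_discard is hypothetical and this is not how librbd is actually structured.

```python
MIB = 1024 * 1024
OBJECT_SIZE = 4 * MIB  # assumed object size, in bytes


def classify_discard(offset: int, length: int, object_size: int = OBJECT_SIZE):
    """Return (whole, partial): object indexes fully or partly covered."""
    whole, partial = [], []
    if length <= 0:
        return whole, partial
    end = offset + length
    first = offset // object_size
    last = (end - 1) // object_size
    for idx in range(first, last + 1):
        obj_start = idx * object_size
        obj_end = obj_start + object_size
        if offset <= obj_start and end >= obj_end:
            whole.append(idx)    # discard spans the entire object
        else:
            partial.append(idx)  # only a chunk of the object is discarded
    return whole, partial


print(classify_discard(0, 8 * MIB))         # ([0, 1], [])
print(classify_discard(1 * MIB, 4 * MIB))   # ([], [0, 1])
print(classify_discard(2 * MIB, 10 * MIB))  # ([1, 2], [0])
```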