Re: [Qemu-devel] [Qemu-block] [RFC 1/1] ide: bug #1777315: io_buffer_size and sg.size can represent partial sector sizes

2018-06-19 Thread Amol Surati
On Wed, Jun 20, 2018 at 06:23:19AM +0530, Amol Surati wrote:
> On Tue, Jun 19, 2018 at 05:43:52PM -0400, John Snow wrote:
> > 
> > 
> > On 06/19/2018 05:26 PM, Amol Surati wrote:
> > > On Tue, Jun 19, 2018 at 08:04:03PM +0530, Amol Surati wrote:
> > >> On Tue, Jun 19, 2018 at 09:45:15AM -0400, John Snow wrote:
> > >>>
> > >>>
> > >>> On 06/19/2018 04:53 AM, Kevin Wolf wrote:
> >  Am 19.06.2018 um 06:01 hat Amol Surati geschrieben:
> > > On Mon, Jun 18, 2018 at 08:14:10PM -0400, John Snow wrote:
> > >>
> > >>
> > >> On 06/18/2018 02:02 PM, Amol Surati wrote:
> > >>> On Mon, Jun 18, 2018 at 12:05:15AM +0530, Amol Surati wrote:
> >  This patch fixes the assumption that io_buffer_size is always a 
> >  perfect
> >  multiple of the sector size. The assumption is the cause of the 
> >  firing
> >  of 'assert(n * 512 == s->sg.size);'.
> > 
> >  Signed-off-by: Amol Surati 
> >  ---
> > >>>
> > >>> The repository https://github.com/asurati/1777315 contains a module 
> > >>> for
> > >>> QEMU's 8086:7010 ATA controller, which exercises the code path
> > >>> described in [RFC 0/1] of this series.
> > >>>
> > >>
> > >> Thanks, this made it easier to see what was happening. I was able to
> > >> write an ide-test test case using this source as a guide, and 
> > >> reproduce
> > >> the error.
> > >>
> > >> static void test_bmdma_partial_sector_short_prdt(void)
> > >> {
> > >> QPCIDevice *dev;
> > >> QPCIBar bmdma_bar, ide_bar;
> > >> uint8_t status;
> > >>
> > >> /* Read 2 sectors but only give 1 sector in PRDT */
> > >> PrdtEntry prdt[] = {
> > >> {
> > >> .addr = 0,
> > >> .size = cpu_to_le32(0x200),
> > >> },
> > >> {
> > >> .addr = 512,
> > >> .size = cpu_to_le32(0x44 | PRDT_EOT),
> > >> }
> > >> };
> > >>
> > >> dev = get_pci_device(_bar, _bar);
> > >> status = send_dma_request(CMD_READ_DMA, 0, 2,
> > >>   prdt, ARRAY_SIZE(prdt), NULL);
> > >> g_assert_cmphex(status, ==, 0);
> > >> assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | 
> > >> ERR);
> > >> free_pci_device(dev);
> > >> }
> > >>
> > >>> Loading the module reproduces the bug. Tested on the latest master
> > >>> branch.
> > >>>
> > >>> Steps:
> > >>> - Install a Linux distribution as a guest, ensuring that the boot 
> > >>> disk
> > >>>   resides on non-IDE controllers (such as virtio)
> > >>> - Attach another disk as a master device on the primary
> > >>>   IDE controller (i.e. attach at -hda.)
> > >>> - Blacklist ata_piix, pata_acpi and ata_generic modules, and reboot.
> > >>> - Copy the source files into the guest and build the module.
> > >>> - Load the module. QEMU process should die with the message:
> > >>>   qemu-system-x86_64: hw/ide/core.c:871: ide_dma_cb:
> > >>>   Assertion `n * 512 == s->sg.size' failed.
> > >>>
> > >>>
> > >>> -Amol
> > >>>
> > >>
> > >> I'm less sure of the fix -- certainly the assert is wrong, but just
> > >> incrementing 'n' is wrong too -- we didn't copy (n+1) sectors, we 
> > >> copied
> > >> (n) and a few extra bytes.
> > >
> > > That is true.
> > >
> > > There are (at least) two fields that represent the total size of a DMA
> > > transfer -
> > > (1) The size, as requested through the NSECTOR field.
> > > (2) The size, as calculated through the length fields of the PRD 
> > > entries.
> > >
> > > It makes sense to consider the most restrictive of the sizes, as the 
> > > factor
> > > which determines both the end of a successful DMA transfer and the
> > > condition to assert.
> > >
> > >>
> > >> The sector-based math here would need to be adjusted to be able to 
> > >> cope
> > >> with partial sector reads... or we ought to avoid doing any partial
> > >> sector transfers.
> > >>
> > >>
> > >> I'm not sure which is more correct tonight, it depends:
> > >>
> > >> - If it's OK to transfer partial sectors before reporting overflow,
> > >> adjusting the command loop to work with partial sectors is OK.
> > >>
> > >> - If it's NOT OK to do partial sector transfer, the sglist 
> > >> preparation
> > >> phase needs to produce a truncated SGList that's some multiple of 512
> > >> bytes that leaves the excess bytes in a second sglist that we don't
> > >> throw away and can use as a basis for building the next sglist. (Or 
> > >> the
> > >> DMA helpers need to take a max_bytes parameter and return an sglist
> > >> representing unused buffer space if the command underflowed.)
> > >
> > > Support for 

Re: [Qemu-devel] [Qemu-block] [RFC 1/1] ide: bug #1777315: io_buffer_size and sg.size can represent partial sector sizes

2018-06-19 Thread Amol Surati
On Tue, Jun 19, 2018 at 05:43:52PM -0400, John Snow wrote:
> 
> 
> On 06/19/2018 05:26 PM, Amol Surati wrote:
> > On Tue, Jun 19, 2018 at 08:04:03PM +0530, Amol Surati wrote:
> >> On Tue, Jun 19, 2018 at 09:45:15AM -0400, John Snow wrote:
> >>>
> >>>
> >>> On 06/19/2018 04:53 AM, Kevin Wolf wrote:
>  Am 19.06.2018 um 06:01 hat Amol Surati geschrieben:
> > On Mon, Jun 18, 2018 at 08:14:10PM -0400, John Snow wrote:
> >>
> >>
> >> On 06/18/2018 02:02 PM, Amol Surati wrote:
> >>> On Mon, Jun 18, 2018 at 12:05:15AM +0530, Amol Surati wrote:
>  This patch fixes the assumption that io_buffer_size is always a 
>  perfect
>  multiple of the sector size. The assumption is the cause of the 
>  firing
>  of 'assert(n * 512 == s->sg.size);'.
> 
>  Signed-off-by: Amol Surati 
>  ---
> >>>
> >>> The repository https://github.com/asurati/1777315 contains a module 
> >>> for
> >>> QEMU's 8086:7010 ATA controller, which exercises the code path
> >>> described in [RFC 0/1] of this series.
> >>>
> >>
> >> Thanks, this made it easier to see what was happening. I was able to
> >> write an ide-test test case using this source as a guide, and reproduce
> >> the error.
> >>
> >> static void test_bmdma_partial_sector_short_prdt(void)
> >> {
> >> QPCIDevice *dev;
> >> QPCIBar bmdma_bar, ide_bar;
> >> uint8_t status;
> >>
> >> /* Read 2 sectors but only give 1 sector in PRDT */
> >> PrdtEntry prdt[] = {
> >> {
> >> .addr = 0,
> >> .size = cpu_to_le32(0x200),
> >> },
> >> {
> >> .addr = 512,
> >> .size = cpu_to_le32(0x44 | PRDT_EOT),
> >> }
> >> };
> >>
> >> dev = get_pci_device(_bar, _bar);
> >> status = send_dma_request(CMD_READ_DMA, 0, 2,
> >>   prdt, ARRAY_SIZE(prdt), NULL);
> >> g_assert_cmphex(status, ==, 0);
> >> assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | 
> >> ERR);
> >> free_pci_device(dev);
> >> }
> >>
> >>> Loading the module reproduces the bug. Tested on the latest master
> >>> branch.
> >>>
> >>> Steps:
> >>> - Install a Linux distribution as a guest, ensuring that the boot disk
> >>>   resides on non-IDE controllers (such as virtio)
> >>> - Attach another disk as a master device on the primary
> >>>   IDE controller (i.e. attach at -hda.)
> >>> - Blacklist ata_piix, pata_acpi and ata_generic modules, and reboot.
> >>> - Copy the source files into the guest and build the module.
> >>> - Load the module. QEMU process should die with the message:
> >>>   qemu-system-x86_64: hw/ide/core.c:871: ide_dma_cb:
> >>>   Assertion `n * 512 == s->sg.size' failed.
> >>>
> >>>
> >>> -Amol
> >>>
> >>
> >> I'm less sure of the fix -- certainly the assert is wrong, but just
> >> incrementing 'n' is wrong too -- we didn't copy (n+1) sectors, we 
> >> copied
> >> (n) and a few extra bytes.
> >
> > That is true.
> >
> > There are (at least) two fields that represent the total size of a DMA
> > transfer -
> > (1) The size, as requested through the NSECTOR field.
> > (2) The size, as calculated through the length fields of the PRD 
> > entries.
> >
> > It makes sense to consider the most restrictive of the sizes, as the 
> > factor
> > which determines both the end of a successful DMA transfer and the
> > condition to assert.
> >
> >>
> >> The sector-based math here would need to be adjusted to be able to cope
> >> with partial sector reads... or we ought to avoid doing any partial
> >> sector transfers.
> >>
> >>
> >> I'm not sure which is more correct tonight, it depends:
> >>
> >> - If it's OK to transfer partial sectors before reporting overflow,
> >> adjusting the command loop to work with partial sectors is OK.
> >>
> >> - If it's NOT OK to do partial sector transfer, the sglist preparation
> >> phase needs to produce a truncated SGList that's some multiple of 512
> >> bytes that leaves the excess bytes in a second sglist that we don't
> >> throw away and can use as a basis for building the next sglist. (Or the
> >> DMA helpers need to take a max_bytes parameter and return an sglist
> >> representing unused buffer space if the command underflowed.)
> >
> > Support for partial sector transfers is built into the DMA interface's 
> > PRD
> > mechanism itself, because an entry is allowed to transfer in the units 
> > of
> > even number of bytes.
> >
> > I think the controller's IO process runs in two parts (probably loops 
> > over
> > for a single transfer):
> >
> > (1) The 

Re: [Qemu-devel] [Qemu-block] [RFC 1/1] ide: bug #1777315: io_buffer_size and sg.size can represent partial sector sizes

2018-06-19 Thread John Snow



On 06/19/2018 05:26 PM, Amol Surati wrote:
> On Tue, Jun 19, 2018 at 08:04:03PM +0530, Amol Surati wrote:
>> On Tue, Jun 19, 2018 at 09:45:15AM -0400, John Snow wrote:
>>>
>>>
>>> On 06/19/2018 04:53 AM, Kevin Wolf wrote:
 Am 19.06.2018 um 06:01 hat Amol Surati geschrieben:
> On Mon, Jun 18, 2018 at 08:14:10PM -0400, John Snow wrote:
>>
>>
>> On 06/18/2018 02:02 PM, Amol Surati wrote:
>>> On Mon, Jun 18, 2018 at 12:05:15AM +0530, Amol Surati wrote:
 This patch fixes the assumption that io_buffer_size is always a perfect
 multiple of the sector size. The assumption is the cause of the firing
 of 'assert(n * 512 == s->sg.size);'.

 Signed-off-by: Amol Surati 
 ---
>>>
>>> The repository https://github.com/asurati/1777315 contains a module for
>>> QEMU's 8086:7010 ATA controller, which exercises the code path
>>> described in [RFC 0/1] of this series.
>>>
>>
>> Thanks, this made it easier to see what was happening. I was able to
>> write an ide-test test case using this source as a guide, and reproduce
>> the error.
>>
>> static void test_bmdma_partial_sector_short_prdt(void)
>> {
>> QPCIDevice *dev;
>> QPCIBar bmdma_bar, ide_bar;
>> uint8_t status;
>>
>> /* Read 2 sectors but only give 1 sector in PRDT */
>> PrdtEntry prdt[] = {
>> {
>> .addr = 0,
>> .size = cpu_to_le32(0x200),
>> },
>> {
>> .addr = 512,
>> .size = cpu_to_le32(0x44 | PRDT_EOT),
>> }
>> };
>>
>> dev = get_pci_device(_bar, _bar);
>> status = send_dma_request(CMD_READ_DMA, 0, 2,
>>   prdt, ARRAY_SIZE(prdt), NULL);
>> g_assert_cmphex(status, ==, 0);
>> assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | ERR);
>> free_pci_device(dev);
>> }
>>
>>> Loading the module reproduces the bug. Tested on the latest master
>>> branch.
>>>
>>> Steps:
>>> - Install a Linux distribution as a guest, ensuring that the boot disk
>>>   resides on non-IDE controllers (such as virtio)
>>> - Attach another disk as a master device on the primary
>>>   IDE controller (i.e. attach at -hda.)
>>> - Blacklist ata_piix, pata_acpi and ata_generic modules, and reboot.
>>> - Copy the source files into the guest and build the module.
>>> - Load the module. QEMU process should die with the message:
>>>   qemu-system-x86_64: hw/ide/core.c:871: ide_dma_cb:
>>>   Assertion `n * 512 == s->sg.size' failed.
>>>
>>>
>>> -Amol
>>>
>>
>> I'm less sure of the fix -- certainly the assert is wrong, but just
>> incrementing 'n' is wrong too -- we didn't copy (n+1) sectors, we copied
>> (n) and a few extra bytes.
>
> That is true.
>
> There are (at least) two fields that represent the total size of a DMA
> transfer -
> (1) The size, as requested through the NSECTOR field.
> (2) The size, as calculated through the length fields of the PRD entries.
>
> It makes sense to consider the most restrictive of the sizes, as the 
> factor
> which determines both the end of a successful DMA transfer and the
> condition to assert.
>
>>
>> The sector-based math here would need to be adjusted to be able to cope
>> with partial sector reads... or we ought to avoid doing any partial
>> sector transfers.
>>
>>
>> I'm not sure which is more correct tonight, it depends:
>>
>> - If it's OK to transfer partial sectors before reporting overflow,
>> adjusting the command loop to work with partial sectors is OK.
>>
>> - If it's NOT OK to do partial sector transfer, the sglist preparation
>> phase needs to produce a truncated SGList that's some multiple of 512
>> bytes that leaves the excess bytes in a second sglist that we don't
>> throw away and can use as a basis for building the next sglist. (Or the
>> DMA helpers need to take a max_bytes parameter and return an sglist
>> representing unused buffer space if the command underflowed.)
>
> Support for partial sector transfers is built into the DMA interface's PRD
> mechanism itself, because an entry is allowed to transfer in the units of
> even number of bytes.
>
> I think the controller's IO process runs in two parts (probably loops over
> for a single transfer):
>
> (1) The controller's disk interface transfers between its internal buffer
> and the disk storage. The transfers are likely to be in the
> multiples of a sector.
> (2) The controller's DMA interface transfers between its internal buffer
> and the system memory. The transfers can be sub-sector in size(, and
> are preserving of the areas, of the internal 

Re: [Qemu-devel] [Qemu-block] [RFC 1/1] ide: bug #1777315: io_buffer_size and sg.size can represent partial sector sizes

2018-06-19 Thread Amol Surati
On Tue, Jun 19, 2018 at 08:04:03PM +0530, Amol Surati wrote:
> On Tue, Jun 19, 2018 at 09:45:15AM -0400, John Snow wrote:
> > 
> > 
> > On 06/19/2018 04:53 AM, Kevin Wolf wrote:
> > > Am 19.06.2018 um 06:01 hat Amol Surati geschrieben:
> > >> On Mon, Jun 18, 2018 at 08:14:10PM -0400, John Snow wrote:
> > >>>
> > >>>
> > >>> On 06/18/2018 02:02 PM, Amol Surati wrote:
> >  On Mon, Jun 18, 2018 at 12:05:15AM +0530, Amol Surati wrote:
> > > This patch fixes the assumption that io_buffer_size is always a 
> > > perfect
> > > multiple of the sector size. The assumption is the cause of the firing
> > > of 'assert(n * 512 == s->sg.size);'.
> > >
> > > Signed-off-by: Amol Surati 
> > > ---
> > 
> >  The repository https://github.com/asurati/1777315 contains a module for
> >  QEMU's 8086:7010 ATA controller, which exercises the code path
> >  described in [RFC 0/1] of this series.
> > 
> > >>>
> > >>> Thanks, this made it easier to see what was happening. I was able to
> > >>> write an ide-test test case using this source as a guide, and reproduce
> > >>> the error.
> > >>>
> > >>> static void test_bmdma_partial_sector_short_prdt(void)
> > >>> {
> > >>> QPCIDevice *dev;
> > >>> QPCIBar bmdma_bar, ide_bar;
> > >>> uint8_t status;
> > >>>
> > >>> /* Read 2 sectors but only give 1 sector in PRDT */
> > >>> PrdtEntry prdt[] = {
> > >>> {
> > >>> .addr = 0,
> > >>> .size = cpu_to_le32(0x200),
> > >>> },
> > >>> {
> > >>> .addr = 512,
> > >>> .size = cpu_to_le32(0x44 | PRDT_EOT),
> > >>> }
> > >>> };
> > >>>
> > >>> dev = get_pci_device(_bar, _bar);
> > >>> status = send_dma_request(CMD_READ_DMA, 0, 2,
> > >>>   prdt, ARRAY_SIZE(prdt), NULL);
> > >>> g_assert_cmphex(status, ==, 0);
> > >>> assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | ERR);
> > >>> free_pci_device(dev);
> > >>> }
> > >>>
> >  Loading the module reproduces the bug. Tested on the latest master
> >  branch.
> > 
> >  Steps:
> >  - Install a Linux distribution as a guest, ensuring that the boot disk
> >    resides on non-IDE controllers (such as virtio)
> >  - Attach another disk as a master device on the primary
> >    IDE controller (i.e. attach at -hda.)
> >  - Blacklist ata_piix, pata_acpi and ata_generic modules, and reboot.
> >  - Copy the source files into the guest and build the module.
> >  - Load the module. QEMU process should die with the message:
> >    qemu-system-x86_64: hw/ide/core.c:871: ide_dma_cb:
> >    Assertion `n * 512 == s->sg.size' failed.
> > 
> > 
> >  -Amol
> > 
> > >>>
> > >>> I'm less sure of the fix -- certainly the assert is wrong, but just
> > >>> incrementing 'n' is wrong too -- we didn't copy (n+1) sectors, we copied
> > >>> (n) and a few extra bytes.
> > >>
> > >> That is true.
> > >>
> > >> There are (at least) two fields that represent the total size of a DMA
> > >> transfer -
> > >> (1) The size, as requested through the NSECTOR field.
> > >> (2) The size, as calculated through the length fields of the PRD entries.
> > >>
> > >> It makes sense to consider the most restrictive of the sizes, as the 
> > >> factor
> > >> which determines both the end of a successful DMA transfer and the
> > >> condition to assert.
> > >>
> > >>>
> > >>> The sector-based math here would need to be adjusted to be able to cope
> > >>> with partial sector reads... or we ought to avoid doing any partial
> > >>> sector transfers.
> > >>>
> > >>>
> > >>> I'm not sure which is more correct tonight, it depends:
> > >>>
> > >>> - If it's OK to transfer partial sectors before reporting overflow,
> > >>> adjusting the command loop to work with partial sectors is OK.
> > >>>
> > >>> - If it's NOT OK to do partial sector transfer, the sglist preparation
> > >>> phase needs to produce a truncated SGList that's some multiple of 512
> > >>> bytes that leaves the excess bytes in a second sglist that we don't
> > >>> throw away and can use as a basis for building the next sglist. (Or the
> > >>> DMA helpers need to take a max_bytes parameter and return an sglist
> > >>> representing unused buffer space if the command underflowed.)
> > >>
> > >> Support for partial sector transfers is built into the DMA interface's 
> > >> PRD
> > >> mechanism itself, because an entry is allowed to transfer in the units of
> > >> even number of bytes.
> > >>
> > >> I think the controller's IO process runs in two parts (probably loops 
> > >> over
> > >> for a single transfer):
> > >>
> > >> (1) The controller's disk interface transfers between its internal buffer
> > >> and the disk storage. The transfers are likely to be in the
> > >> multiples of a sector.
> > >> (2) The controller's DMA interface transfers between its internal buffer
> > >> and the system memory. The transfers 

Re: [Qemu-devel] [Qemu-block] [RFC 1/1] ide: bug #1777315: io_buffer_size and sg.size can represent partial sector sizes

2018-06-19 Thread Amol Surati
On Tue, Jun 19, 2018 at 09:45:15AM -0400, John Snow wrote:
> 
> 
> On 06/19/2018 04:53 AM, Kevin Wolf wrote:
> > Am 19.06.2018 um 06:01 hat Amol Surati geschrieben:
> >> On Mon, Jun 18, 2018 at 08:14:10PM -0400, John Snow wrote:
> >>>
> >>>
> >>> On 06/18/2018 02:02 PM, Amol Surati wrote:
>  On Mon, Jun 18, 2018 at 12:05:15AM +0530, Amol Surati wrote:
> > This patch fixes the assumption that io_buffer_size is always a perfect
> > multiple of the sector size. The assumption is the cause of the firing
> > of 'assert(n * 512 == s->sg.size);'.
> >
> > Signed-off-by: Amol Surati 
> > ---
> 
>  The repository https://github.com/asurati/1777315 contains a module for
>  QEMU's 8086:7010 ATA controller, which exercises the code path
>  described in [RFC 0/1] of this series.
> 
> >>>
> >>> Thanks, this made it easier to see what was happening. I was able to
> >>> write an ide-test test case using this source as a guide, and reproduce
> >>> the error.
> >>>
> >>> static void test_bmdma_partial_sector_short_prdt(void)
> >>> {
> >>> QPCIDevice *dev;
> >>> QPCIBar bmdma_bar, ide_bar;
> >>> uint8_t status;
> >>>
> >>> /* Read 2 sectors but only give 1 sector in PRDT */
> >>> PrdtEntry prdt[] = {
> >>> {
> >>> .addr = 0,
> >>> .size = cpu_to_le32(0x200),
> >>> },
> >>> {
> >>> .addr = 512,
> >>> .size = cpu_to_le32(0x44 | PRDT_EOT),
> >>> }
> >>> };
> >>>
> >>> dev = get_pci_device(_bar, _bar);
> >>> status = send_dma_request(CMD_READ_DMA, 0, 2,
> >>>   prdt, ARRAY_SIZE(prdt), NULL);
> >>> g_assert_cmphex(status, ==, 0);
> >>> assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | ERR);
> >>> free_pci_device(dev);
> >>> }
> >>>
>  Loading the module reproduces the bug. Tested on the latest master
>  branch.
> 
>  Steps:
>  - Install a Linux distribution as a guest, ensuring that the boot disk
>    resides on non-IDE controllers (such as virtio)
>  - Attach another disk as a master device on the primary
>    IDE controller (i.e. attach at -hda.)
>  - Blacklist ata_piix, pata_acpi and ata_generic modules, and reboot.
>  - Copy the source files into the guest and build the module.
>  - Load the module. QEMU process should die with the message:
>    qemu-system-x86_64: hw/ide/core.c:871: ide_dma_cb:
>    Assertion `n * 512 == s->sg.size' failed.
> 
> 
>  -Amol
> 
> >>>
> >>> I'm less sure of the fix -- certainly the assert is wrong, but just
> >>> incrementing 'n' is wrong too -- we didn't copy (n+1) sectors, we copied
> >>> (n) and a few extra bytes.
> >>
> >> That is true.
> >>
> >> There are (at least) two fields that represent the total size of a DMA
> >> transfer -
> >> (1) The size, as requested through the NSECTOR field.
> >> (2) The size, as calculated through the length fields of the PRD entries.
> >>
> >> It makes sense to consider the most restrictive of the sizes, as the factor
> >> which determines both the end of a successful DMA transfer and the
> >> condition to assert.
> >>
> >>>
> >>> The sector-based math here would need to be adjusted to be able to cope
> >>> with partial sector reads... or we ought to avoid doing any partial
> >>> sector transfers.
> >>>
> >>>
> >>> I'm not sure which is more correct tonight, it depends:
> >>>
> >>> - If it's OK to transfer partial sectors before reporting overflow,
> >>> adjusting the command loop to work with partial sectors is OK.
> >>>
> >>> - If it's NOT OK to do partial sector transfer, the sglist preparation
> >>> phase needs to produce a truncated SGList that's some multiple of 512
> >>> bytes that leaves the excess bytes in a second sglist that we don't
> >>> throw away and can use as a basis for building the next sglist. (Or the
> >>> DMA helpers need to take a max_bytes parameter and return an sglist
> >>> representing unused buffer space if the command underflowed.)
> >>
> >> Support for partial sector transfers is built into the DMA interface's PRD
> >> mechanism itself, because an entry is allowed to transfer in the units of
> >> even number of bytes.
> >>
> >> I think the controller's IO process runs in two parts (probably loops over
> >> for a single transfer):
> >>
> >> (1) The controller's disk interface transfers between its internal buffer
> >> and the disk storage. The transfers are likely to be in the
> >> multiples of a sector.
> >> (2) The controller's DMA interface transfers between its internal buffer
> >> and the system memory. The transfers can be sub-sector in size(, and
> >> are preserving of the areas, of the internal buffer, not subject to a
> >> write.)
> > 
> > The spec isn't clear about this (or at least I can't find anything where
> > the exact behaviour is specified), but I agree that that's my mental
> > model as well. So I would make IDE send 

Re: [Qemu-devel] [Qemu-block] [RFC 1/1] ide: bug #1777315: io_buffer_size and sg.size can represent partial sector sizes

2018-06-19 Thread John Snow



On 06/19/2018 04:53 AM, Kevin Wolf wrote:
> Am 19.06.2018 um 06:01 hat Amol Surati geschrieben:
>> On Mon, Jun 18, 2018 at 08:14:10PM -0400, John Snow wrote:
>>>
>>>
>>> On 06/18/2018 02:02 PM, Amol Surati wrote:
 On Mon, Jun 18, 2018 at 12:05:15AM +0530, Amol Surati wrote:
> This patch fixes the assumption that io_buffer_size is always a perfect
> multiple of the sector size. The assumption is the cause of the firing
> of 'assert(n * 512 == s->sg.size);'.
>
> Signed-off-by: Amol Surati 
> ---

 The repository https://github.com/asurati/1777315 contains a module for
 QEMU's 8086:7010 ATA controller, which exercises the code path
 described in [RFC 0/1] of this series.

>>>
>>> Thanks, this made it easier to see what was happening. I was able to
>>> write an ide-test test case using this source as a guide, and reproduce
>>> the error.
>>>
>>> static void test_bmdma_partial_sector_short_prdt(void)
>>> {
>>> QPCIDevice *dev;
>>> QPCIBar bmdma_bar, ide_bar;
>>> uint8_t status;
>>>
>>> /* Read 2 sectors but only give 1 sector in PRDT */
>>> PrdtEntry prdt[] = {
>>> {
>>> .addr = 0,
>>> .size = cpu_to_le32(0x200),
>>> },
>>> {
>>> .addr = 512,
>>> .size = cpu_to_le32(0x44 | PRDT_EOT),
>>> }
>>> };
>>>
>>> dev = get_pci_device(_bar, _bar);
>>> status = send_dma_request(CMD_READ_DMA, 0, 2,
>>>   prdt, ARRAY_SIZE(prdt), NULL);
>>> g_assert_cmphex(status, ==, 0);
>>> assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | ERR);
>>> free_pci_device(dev);
>>> }
>>>
 Loading the module reproduces the bug. Tested on the latest master
 branch.

 Steps:
 - Install a Linux distribution as a guest, ensuring that the boot disk
   resides on non-IDE controllers (such as virtio)
 - Attach another disk as a master device on the primary
   IDE controller (i.e. attach at -hda.)
 - Blacklist ata_piix, pata_acpi and ata_generic modules, and reboot.
 - Copy the source files into the guest and build the module.
 - Load the module. QEMU process should die with the message:
   qemu-system-x86_64: hw/ide/core.c:871: ide_dma_cb:
   Assertion `n * 512 == s->sg.size' failed.


 -Amol

>>>
>>> I'm less sure of the fix -- certainly the assert is wrong, but just
>>> incrementing 'n' is wrong too -- we didn't copy (n+1) sectors, we copied
>>> (n) and a few extra bytes.
>>
>> That is true.
>>
>> There are (at least) two fields that represent the total size of a DMA
>> transfer -
>> (1) The size, as requested through the NSECTOR field.
>> (2) The size, as calculated through the length fields of the PRD entries.
>>
>> It makes sense to consider the most restrictive of the sizes, as the factor
>> which determines both the end of a successful DMA transfer and the
>> condition to assert.
>>
>>>
>>> The sector-based math here would need to be adjusted to be able to cope
>>> with partial sector reads... or we ought to avoid doing any partial
>>> sector transfers.
>>>
>>>
>>> I'm not sure which is more correct tonight, it depends:
>>>
>>> - If it's OK to transfer partial sectors before reporting overflow,
>>> adjusting the command loop to work with partial sectors is OK.
>>>
>>> - If it's NOT OK to do partial sector transfer, the sglist preparation
>>> phase needs to produce a truncated SGList that's some multiple of 512
>>> bytes that leaves the excess bytes in a second sglist that we don't
>>> throw away and can use as a basis for building the next sglist. (Or the
>>> DMA helpers need to take a max_bytes parameter and return an sglist
>>> representing unused buffer space if the command underflowed.)
>>
>> Support for partial sector transfers is built into the DMA interface's PRD
>> mechanism itself, because an entry is allowed to transfer in the units of
>> even number of bytes.
>>
>> I think the controller's IO process runs in two parts (probably loops over
>> for a single transfer):
>>
>> (1) The controller's disk interface transfers between its internal buffer
>> and the disk storage. The transfers are likely to be in the
>> multiples of a sector.
>> (2) The controller's DMA interface transfers between its internal buffer
>> and the system memory. The transfers can be sub-sector in size(, and
>> are preserving of the areas, of the internal buffer, not subject to a
>> write.)
> 
> The spec isn't clear about this (or at least I can't find anything where
> the exact behaviour is specified), but I agree that that's my mental
> model as well. So I would make IDE send a byte-granularity request with
> the final partial sector to the block layer, so that the data is
> actually transferred up to that point.
> 
> In practice it probably doesn't matter much because a too short PRDT
> means that the request doesn't complete successfully (the condition is
> 

Re: [Qemu-devel] [Qemu-block] [RFC 1/1] ide: bug #1777315: io_buffer_size and sg.size can represent partial sector sizes

2018-06-19 Thread Kevin Wolf
Am 19.06.2018 um 06:01 hat Amol Surati geschrieben:
> On Mon, Jun 18, 2018 at 08:14:10PM -0400, John Snow wrote:
> > 
> > 
> > On 06/18/2018 02:02 PM, Amol Surati wrote:
> > > On Mon, Jun 18, 2018 at 12:05:15AM +0530, Amol Surati wrote:
> > >> This patch fixes the assumption that io_buffer_size is always a perfect
> > >> multiple of the sector size. The assumption is the cause of the firing
> > >> of 'assert(n * 512 == s->sg.size);'.
> > >>
> > >> Signed-off-by: Amol Surati 
> > >> ---
> > > 
> > > The repository https://github.com/asurati/1777315 contains a module for
> > > QEMU's 8086:7010 ATA controller, which exercises the code path
> > > described in [RFC 0/1] of this series.
> > > 
> > 
> > Thanks, this made it easier to see what was happening. I was able to
> > write an ide-test test case using this source as a guide, and reproduce
> > the error.
> > 
> > static void test_bmdma_partial_sector_short_prdt(void)
> > {
> > QPCIDevice *dev;
> > QPCIBar bmdma_bar, ide_bar;
> > uint8_t status;
> > 
> > /* Read 2 sectors but only give 1 sector in PRDT */
> > PrdtEntry prdt[] = {
> > {
> > .addr = 0,
> > .size = cpu_to_le32(0x200),
> > },
> > {
> > .addr = 512,
> > .size = cpu_to_le32(0x44 | PRDT_EOT),
> > }
> > };
> > 
> > dev = get_pci_device(_bar, _bar);
> > status = send_dma_request(CMD_READ_DMA, 0, 2,
> >   prdt, ARRAY_SIZE(prdt), NULL);
> > g_assert_cmphex(status, ==, 0);
> > assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | ERR);
> > free_pci_device(dev);
> > }
> > 
> > > Loading the module reproduces the bug. Tested on the latest master
> > > branch.
> > > 
> > > Steps:
> > > - Install a Linux distribution as a guest, ensuring that the boot disk
> > >   resides on non-IDE controllers (such as virtio)
> > > - Attach another disk as a master device on the primary
> > >   IDE controller (i.e. attach at -hda.)
> > > - Blacklist ata_piix, pata_acpi and ata_generic modules, and reboot.
> > > - Copy the source files into the guest and build the module.
> > > - Load the module. QEMU process should die with the message:
> > >   qemu-system-x86_64: hw/ide/core.c:871: ide_dma_cb:
> > >   Assertion `n * 512 == s->sg.size' failed.
> > > 
> > > 
> > > -Amol
> > > 
> > 
> > I'm less sure of the fix -- certainly the assert is wrong, but just
> > incrementing 'n' is wrong too -- we didn't copy (n+1) sectors, we copied
> > (n) and a few extra bytes.
> 
> That is true.
> 
> There are (at least) two fields that represent the total size of a DMA
> transfer -
> (1) The size, as requested through the NSECTOR field.
> (2) The size, as calculated through the length fields of the PRD entries.
> 
> It makes sense to consider the most restrictive of the sizes, as the factor
> which determines both the end of a successful DMA transfer and the
> condition to assert.
> 
> > 
> > The sector-based math here would need to be adjusted to be able to cope
> > with partial sector reads... or we ought to avoid doing any partial
> > sector transfers.
> > 
> > 
> > I'm not sure which is more correct tonight, it depends:
> > 
> > - If it's OK to transfer partial sectors before reporting overflow,
> > adjusting the command loop to work with partial sectors is OK.
> > 
> > - If it's NOT OK to do partial sector transfer, the sglist preparation
> > phase needs to produce a truncated SGList that's some multiple of 512
> > bytes that leaves the excess bytes in a second sglist that we don't
> > throw away and can use as a basis for building the next sglist. (Or the
> > DMA helpers need to take a max_bytes parameter and return an sglist
> > representing unused buffer space if the command underflowed.)
> 
> Support for partial sector transfers is built into the DMA interface's PRD
> mechanism itself, because an entry is allowed to transfer in the units of
> even number of bytes.
> 
> I think the controller's IO process runs in two parts (probably loops over
> for a single transfer):
> 
> (1) The controller's disk interface transfers between its internal buffer
> and the disk storage. The transfers are likely to be in the
> multiples of a sector.
> (2) The controller's DMA interface transfers between its internal buffer
> and the system memory. The transfers can be sub-sector in size(, and
> are preserving of the areas, of the internal buffer, not subject to a
> write.)

The spec isn't clear about this (or at least I can't find anything where
the exact behaviour is specified), but I agree that that's my mental
model as well. So I would make IDE send a byte-granularity request with
the final partial sector to the block layer, so that the data is
actually transferred up to that point.

In practice it probably doesn't matter much because a too short PRDT
means that the request doesn't complete successfully (the condition is
indicated by clear Interrupt, Active and Error