Re: [PATCH v4 00/14] Fixes for DP8393X SONIC device emulation

2020-02-18 Thread Laurent Vivier
On 19/02/2020 at 02:57, Aleksandar Markovic wrote:
> On Wed, 19.02.2020 at 2:54 AM, Aleksandar Markovic
> <aleksandar.m.m...@gmail.com> wrote:
>>
>> On Wed, 19.02.2020 at 2:06 AM, Finn Thain wrote:
>> >
>> > On Tue, 18 Feb 2020, Aleksandar Markovic wrote:
>> >
>> > > On Wednesday, January 29, 2020, Finn Thain
>> > > <fth...@telegraphics.com.au> wrote:
>> > >
>> > > > Hi All,
>> > > >
>> > > > There are bugs in the emulated dp8393x device that can stop packet
>> > > > reception in a Linux/m68k guest (q800 machine).
>> > > >
>> > > > With a Linux/m68k v5.5 guest (q800), it's possible to remotely
> trigger
>> > > > an Oops by sending ping floods.
>> > > >
>> > > > With a Linux/mips guest (magnum machine), the driver fails to probe
>> > > > the dp8393x device.
>> > > >
>> > > > With a NetBSD/arc 5.1 guest (magnum), the bugs in the device can be
>> > > > fatal to the guest kernel.
>> > > >
>> > > > Whilst debugging the device, I found that the receiver algorithm
>> > > > differs from the one described in the National Semiconductor
>> > > > datasheet.
>> > > >
>> > > > This patch series resolves these bugs.
>> > > >
>> > > > AFAIK, all bugs in the Linux sonic driver were fixed in Linux v5.5.
>> > > > ---
>> > >
>> > >
>> > > Herve,
>> > >
>> > > Do your Jazz tests pass with these changes?
>> > >
>> >
>> > AFAIK those tests did not expose the NetBSD panic that is caused by
>> > mainline QEMU (mentioned above).
>> >
>> > I have actually run the tests you requested (Hervé described them in an
>> > earlier thread). There was no regression. Quite the reverse -- it's no
>> > longer possible to remotely crash the NetBSD kernel.
>> >
>> > Apparently my testing was also the first time that the jazzsonic driver
>> > (from the Linux/mips Magnum port) was tested successfully with QEMU. It
>> > doesn't work in mainline QEMU.
>> >
>>
>> Well, I apologize if I missed all these facts. I just did not notice
> them, at least not in this form. And, yes, some "Tested-by:" by Herve
> would be desirable and nice.
>>
> 
> Or, perhaps, even "Reviewed-by:".
> 

It would be nice to have this merged before the next release, because q800
machine networking is not reliable without these fixes.

And thank you to Finn for all his hard work on this device emulation.

Laurent



Re: [PATCH v2] scsi-disk: define props in scsi_block_disk to avoid memleaks

2020-02-18 Thread Pan Nengyuan



On 1/22/2020 1:05 AM, Paolo Bonzini wrote:
> On 14/01/20 10:16, pannengy...@huawei.com wrote:
>> From: Pan Nengyuan 
>>
>> scsi_block_realize() uses scsi_realize() to init some props, but
>> these props are not defined in scsi_block_disk_properties, so they will
>> not be freed.
>>
>> This patch defines these props in scsi_block_disk_properties and also
>> calls scsi_unrealize() to avoid memleaks. The leak stack is as
>> follows (it's easy to reproduce by attaching/detaching scsi-block-disks):
>>
>> =
>> ==qemu-system-x86_64==32195==ERROR: LeakSanitizer: detected memory leaks
>>
>> Direct leak of 57 byte(s) in 3 object(s) allocated from:
>>   #0 0x7f19f8bed768 (/lib64/libasan.so.5+0xef768)  ??:?
>>   #1 0x7f19f64d9445 (/lib64/libglib-2.0.so.0+0x52445)  ??:?
>>   #2 0x7f19f64f2d92 (/lib64/libglib-2.0.so.0+0x6bd92)  ??:?
>>   #3 0x55975366e596 (qemu-system-x86_64+0x35c0596)  
>> /mnt/sdb/qemu/hw/scsi/scsi-disk.c:2399
>>   #4 0x559753671201 (qemu-system-x86_64+0x35c3201)  
>> /mnt/sdb/qemu/hw/scsi/scsi-disk.c:2681
>>   #5 0x559753687e3e (qemu-system-x86_64+0x35d9e3e)  
>> /mnt/sdb/qemu/hw/scsi/scsi-bus.c:58
>>   #6 0x55975368ac44 (qemu-system-x86_64+0x35dcc44)  
>> /mnt/sdb/qemu/hw/scsi/scsi-bus.c:216
>>   #7 0x5597532a7840 (qemu-system-x86_64+0x31f9840)  
>> /mnt/sdb/qemu/hw/core/qdev.c:876
>>
>> Direct leak of 15 byte(s) in 3 object(s) allocated from:
>>   #0 0x7f19f8bed768 (/lib64/libasan.so.5+0xef768)  ??:?
>>   #1 0x7f19f64d9445 (/lib64/libglib-2.0.so.0+0x52445)  ??:?
>>   #2 0x7f19f64f2d92 (/lib64/libglib-2.0.so.0+0x6bd92)  ??:?
>>   #3 0x55975366e06f (qemu-system-x86_64+0x35c006f)  
>> /mnt/sdb/qemu/hw/scsi/scsi-disk.c:2388
>>   #4 0x559753671201 (qemu-system-x86_64+0x35c3201)  
>> /mnt/sdb/qemu/hw/scsi/scsi-disk.c:2681
>>   #5 0x559753687e3e (qemu-system-x86_64+0x35d9e3e)  
>> /mnt/sdb/qemu/hw/scsi/scsi-bus.c:58
>>   #6 0x55975368ac44 (qemu-system-x86_64+0x35dcc44)  
>> /mnt/sdb/qemu/hw/scsi/scsi-bus.c:216
>>
>> @@ -3079,9 +3080,8 @@ static const TypeInfo scsi_cd_info = {
>>  
>>  #ifdef __linux__
>>  static Property scsi_block_properties[] = {
>> -DEFINE_BLOCK_ERROR_PROPERTIES(SCSIDiskState, qdev.conf), \
>> +DEFINE_SCSI_DISK_PROPERTIES(),
> The properties defined there shouldn't apply to scsi-block.
> 
> Can you explain what exactly is being leaked?

Oh, I'm sorry, I missed this email and am replying to it so late.

When we attach a scsi-block disk, the props (version/vendor/device_id) are 
malloced in scsi_realize(), which is called by scsi_block_realize(),
but we don't define these props in scsi_block_properties. So these props will 
not be released when we detach the scsi-block disk.

This patch reuses scsi_disk_properties to define those props in 
scsi_block_properties to fix it.
Similarly to scsi_hd, this patch also sets unrealize to call 
del_boot_device_lchs().

Thanks.

> 
> Paolo
> 
> .
> 



Re: [PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2020-02-18 Thread Cédric Le Goater
On 2/19/20 7:44 AM, Alexey Kardashevskiy wrote:
> 
> 
> On 19/02/2020 12:20, Alexey Kardashevskiy wrote:
>>
>>
>> On 18/02/2020 23:59, Cédric Le Goater wrote:
>>> On 2/18/20 1:48 PM, Cédric Le Goater wrote:
 On 2/18/20 10:40 AM, Cédric Le Goater wrote:
> On 2/18/20 10:10 AM, Alexey Kardashevskiy wrote:
>>
>>
>> On 18/02/2020 20:05, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 18/02/2020 18:12, Cédric Le Goater wrote:
 On 2/18/20 1:30 AM, Alexey Kardashevskiy wrote:
>
>
> On 17/02/2020 20:48, Cédric Le Goater wrote:
>> On 2/17/20 3:12 AM, Alexey Kardashevskiy wrote:
>>> The following changes since commit 
>>> 05943fb4ca41f626078014c0327781815c6584c5:
>>>
>>>   ppc: free 'fdt' after reset the machine (2020-02-17 11:27:23 
>>> +1100)
>>>
>>> are available in the Git repository at:
>>>
>>>   g...@github.com:aik/qemu.git tags/qemu-slof-20200217
>>>
>>> for you to fetch changes up to 
>>> ea9a03e5aa023c5391bab5259898475d0298aac2:
>>>
>>>   pseries: Update SLOF firmware image (2020-02-17 13:08:59 +1100)
>>>
>>> 
>>> Alexey Kardashevskiy (1):
>>>   pseries: Update SLOF firmware image
>>>
>>>  pc-bios/README   |   2 +-
>>>  pc-bios/slof.bin | Bin 931032 -> 968560 bytes
>>>  roms/SLOF|   2 +-
>>>  3 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>>
>>> *** Note: this is not for master, this is for pseries
>>>
>>
>> Hello Alexey,
>>
>> QEMU fails to boot from disk. See below.
>
>
> It does boot mine (fedora 30, ubuntu 18.04), see below. I believe I
> could have broken something but I need more detail. Thanks,

 fedora31 boots but not ubuntu 19.10. Could it be GRUB version 2.04 ? 
>>>
>>>
>>> No, not that either:
>>
>>
>> but it might be because of power9 - I only tried power8, rsyncing the
>> image to a p9 machine now...
>
> Here is the disk : 
>
> Disk /dev/sda: 50 GiB, 53687091200 bytes, 104857600 sectors
> Disk model: QEMU HARDDISK   
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disklabel type: gpt
> Disk identifier: 27DCE458-231A-4981-9FF1-983F87C2902D
>
> Device Start   End   Sectors Size Type
> /dev/sda1   2048 16383 14336   7M PowerPC PReP boot
> /dev/sda2  16384 100679679 100663296  48G Linux filesystem
> /dev/sda3  100679680 104857566   4177887   2G Linux swap
>
>
> GPT ? 

 For the failure, I bisected up to :

 f12149908705 ("ext2: Read all 64bit of inode number")
>>>
 Here is a possible fix for it. I did some RPN on my hp28s in the past 
 but I am not fluent in Forth.
>>
>>
>> you basically zeroed the top bits by shifting them too far right :)
>>
>> The proper fix I think is:
>>
>> -  32 lshift or
>> +  20 lshift or
>>
>> I keep forgetting it is all in hex. Can you please give it a try? My
>> 128GB disk does not expose this problem somehow. Thanks,
> 
> Better try this one please:
> 
> https://github.com/aik/SLOF/tree/ext4
Tested with the same image. Looks good. 
 
> What I still do not understand is why GRUB is using ext2 from SLOF, it
> should parse ext4 itself :-/

Here is the fs information.


Filesystem volume name:   
Last mounted on:  /
Filesystem UUID:  8d53f6b4-ffc2-4d8f-bd09-67ac97d7b0c5
Filesystem magic number:  0xEF53
Filesystem revision #:1 (dynamic)
Filesystem features:  has_journal ext_attr resize_inode dir_index filetype 
needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg 
dir_nlink extra_isize
Filesystem flags: unsigned_directory_hash 
Default mount options:user_xattr acl
Filesystem state: clean
Errors behavior:  Continue
Filesystem OS type:   Linux
Inode count:  3127296
Block count:  12582912
Reserved block count: 552210
Free blocks:  7907437
Free inodes:  2863361
First block:  0
Block size:   4096
Fragment size:4096
Reserved GDT blocks:  1021
Blocks per group: 32768
Fragments per group:  32768
Inodes per group: 8144
Inode blocks per group:   509
Flex block group size:16
Filesystem created:   Wed Dec 14 15:40:55 2016
Last mount time:  Wed Feb 19 08:06:52 2020
Last write time:  Wed Feb 19 08:06:46 2020
Mount count:  1863
Maximum mount count:  -1
Last checked: Fri Nov 23 19:09:13 2018
Check interval:   0 ()
Lifetime writes:  883 GB
Reserved blocks 

Re: [PATCH v2 12/22] qemu-iotests/199: fix style

2020-02-18 Thread Andrey Shinkevich

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Mostly, satisfy pep8 complaints.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 40774eed74..de9ba8d94c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -28,8 +28,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
  size = '256G'
  fifo = os.path.join(iotests.test_dir, 'mig_fifo')
  
-class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  
+class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):

  def tearDown(self):
  self.vm_a.shutdown()
  self.vm_b.shutdown()
@@ -54,7 +54,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  
  result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',

 name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
  
  s = 0

  while s < write_size:
@@ -71,7 +71,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  
  result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',

 name='bitmap')
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
  s = 0
  while s < write_size:
  self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
@@ -104,15 +104,16 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
  s += 0x1
  
-result = self.vm_b.qmp('query-block');

+result = self.vm_b.qmp('query-block')
  while len(result['return'][0]['dirty-bitmaps']) > 1:
  time.sleep(2)
-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')
  
  result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',

 node='drive0', name='bitmap')
  
-self.assert_qmp(result, 'return/sha256', sha256);

+self.assert_qmp(result, 'return/sha256', sha256)
+
  
  if __name__ == '__main__':

  iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],



Reviewed-by: Andrey Shinkevich 
--
With the best regards,
Andrey Shinkevich



Re: [PATCH 1/5] aio-posix: fix use after leaving scope in aio_poll()

2020-02-18 Thread Sergio Lopez
On Fri, Feb 14, 2020 at 05:17:08PM +, Stefan Hajnoczi wrote:
> epoll_handler is a stack variable and must not be accessed after it goes
> out of scope:
> 
>   if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
>   AioHandler epoll_handler;
>   ...
>   add_pollfd(&epoll_handler);
>   ret = aio_epoll(ctx, pollfds, npfd, timeout);
>   } ...
> 
>   ...
> 
>   /* if we have any readable fds, dispatch event */
>   if (ret > 0) {
>   for (i = 0; i < npfd; i++) {
>   nodes[i]->pfd.revents = pollfds[i].revents;
>   }
>   }
> 
> nodes[0] is &epoll_handler, which has already gone out of scope.
> 
> There is no need to use pollfds[] for epoll.  We don't need an
> AioHandler for the epoll fd.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  util/aio-posix.c | 20 
>  1 file changed, 8 insertions(+), 12 deletions(-)

Reviewed-by: Sergio Lopez 




Re: [PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2020-02-18 Thread Alexey Kardashevskiy



On 19/02/2020 12:20, Alexey Kardashevskiy wrote:
> 
> 
> On 18/02/2020 23:59, Cédric Le Goater wrote:
>> On 2/18/20 1:48 PM, Cédric Le Goater wrote:
>>> On 2/18/20 10:40 AM, Cédric Le Goater wrote:
 On 2/18/20 10:10 AM, Alexey Kardashevskiy wrote:
>
>
> On 18/02/2020 20:05, Alexey Kardashevskiy wrote:
>>
>>
>> On 18/02/2020 18:12, Cédric Le Goater wrote:
>>> On 2/18/20 1:30 AM, Alexey Kardashevskiy wrote:


 On 17/02/2020 20:48, Cédric Le Goater wrote:
> On 2/17/20 3:12 AM, Alexey Kardashevskiy wrote:
>> The following changes since commit 
>> 05943fb4ca41f626078014c0327781815c6584c5:
>>
>>   ppc: free 'fdt' after reset the machine (2020-02-17 11:27:23 +1100)
>>
>> are available in the Git repository at:
>>
>>   g...@github.com:aik/qemu.git tags/qemu-slof-20200217
>>
>> for you to fetch changes up to 
>> ea9a03e5aa023c5391bab5259898475d0298aac2:
>>
>>   pseries: Update SLOF firmware image (2020-02-17 13:08:59 +1100)
>>
>> 
>> Alexey Kardashevskiy (1):
>>   pseries: Update SLOF firmware image
>>
>>  pc-bios/README   |   2 +-
>>  pc-bios/slof.bin | Bin 931032 -> 968560 bytes
>>  roms/SLOF|   2 +-
>>  3 files changed, 2 insertions(+), 2 deletions(-)
>>
>>
>> *** Note: this is not for master, this is for pseries
>>
>
> Hello Alexey,
>
> QEMU fails to boot from disk. See below.


 It does boot mine (fedora 30, ubuntu 18.04), see below. I believe I
 could have broken something but I need more detail. Thanks,
>>>
>>> fedora31 boots but not ubuntu 19.10. Could it be GRUB version 2.04 ? 
>>
>>
>> No, not that either:
>
>
> but it might be because of power9 - I only tried power8, rsyncing the
> image to a p9 machine now...

 Here is the disk : 

 Disk /dev/sda: 50 GiB, 53687091200 bytes, 104857600 sectors
 Disk model: QEMU HARDDISK   
 Units: sectors of 1 * 512 = 512 bytes
 Sector size (logical/physical): 512 bytes / 512 bytes
 I/O size (minimum/optimal): 512 bytes / 512 bytes
 Disklabel type: gpt
 Disk identifier: 27DCE458-231A-4981-9FF1-983F87C2902D

 Device Start   End   Sectors Size Type
 /dev/sda1   2048 16383 14336   7M PowerPC PReP boot
 /dev/sda2  16384 100679679 100663296  48G Linux filesystem
 /dev/sda3  100679680 104857566   4177887   2G Linux swap


 GPT ? 
>>>
>>> For the failure, I bisected up to :
>>>
>>> f12149908705 ("ext2: Read all 64bit of inode number")
>>
>> Here is a possible fix for it. I did some RPN on my hp28s in the past 
>> but I am not fluent in Forth.
> 
> 
> you basically zeroed the top bits by shifting them too far right :)
> 
> The proper fix I think is:
> 
> -  32 lshift or
> +  20 lshift or
> 
> I keep forgetting it is all in hex. Can you please give it a try? My
> 128GB disk does not expose this problem somehow. Thanks,

Better try this one please:

https://github.com/aik/SLOF/tree/ext4

What I still do not understand is why GRUB is using ext2 from SLOF, it
should parse ext4 itself :-/


> 
> 
>>
>> "slash not found" is still there though. 


Yeah I see these but they are harmless as far as I can tell.



>>
>> Cheers,
>>
>> C.
>>
>>
>> From 92dc9f6dc7c6434419306d5a382adb42169b712a Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= 
>> Date: Tue, 18 Feb 2020 13:54:54 +0100
>> Subject: [PATCH] ext2: Fix 64bit inode number
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> Fixes: f12149908705 ("ext2: Read all 64bit of inode number")
>> Signed-off-by: Cédric Le Goater 
>> ---
>>  slof/fs/packages/ext2-files.fs | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/slof/fs/packages/ext2-files.fs b/slof/fs/packages/ext2-files.fs
>> index b6a7880bd88e..f1d9fdfd67e2 100644
>> --- a/slof/fs/packages/ext2-files.fs
>> +++ b/slof/fs/packages/ext2-files.fs
>> @@ -152,7 +152,7 @@ CONSTANT /ext4-ee
>>dup
>>8 + l@-le   \ reads bg_inode_table_lo
>>swap 28 + l@-le \ reads bg_inode_table_hi
>> -  32 lshift or
>> +  32 rshift or
>>block-size @ *  \ # in group, inode table
>>swap inode-size @ * + xlsplit seek drop  inode @ inode-size @ read drop
>>  ;
>>
> 

-- 
Alexey



[Bug 1759522] Re: windows qemu-img create vpc/vhdx error

2020-02-18 Thread Zixuan Wang
I remember asking a QEMU developer about this. He suggested that I send
an email to the developer mailing list, or send messages to the IRC channel.

I don't have time to do this right now, but if someone else finds this
bug report and wants help, please email them instead.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1759522

Title:
  windows qemu-img create vpc/vhdx error

Status in QEMU:
  New

Bug description:
  On windows, using qemu-img (version 2.11.90) to create vpc/vhdx
  virtual disk tends to fail. Here's the way to reproduce:

  1. Install qemu-w64-setup-20180321.exe

  2. Use `qemu-img create -f vhdx -o subformat=fixed disk.vhdx 512M` to create 
a vhdx:
 Formatting 'disk.vhdx', fmt=vhdx size=536870912 log_size=1048576 
block_size=0 subformat=fixed

  3. Execute `qemu-img info disk.vhdx` gives the result, (note the `disk size` 
is incorrect):
 image: disk.vhdx
 file format: vhdx
 virtual size: 512M (536870912 bytes)
 disk size: 1.4M
 cluster_size: 8388608

  4. On Windows 10 (V1709), double click disk.vhdx gives an error:
 Make sure the file is in an NTFS volume and isn't in a compressed folder 
or volume.

 Using Disk Management -> Action -> Attach VHD gives an error:
 The requested operation could not be completed due to a virtual disk 
system limitation. Virtual hard disk files must be uncompressed and unencrypted 
and must not be sparse.

  Comparison with Windows 10 created VHDX:

  1. Using Disk Management -> Action -> Create VHD:
 File name: win.vhdx
 Virtual hard disk size: 512MB
 Virtual hard disk format: VHDX
 Virtual hard disk type: Fixed size

  2. Detach VHDX

  3. Execute `qemu-img info win.vhdx` gives the result:
 image: win.vhdx
 file format: vhdx
 virtual size: 512M (536870912 bytes)
 disk size: 516M
 cluster_size: 33554432

  Comparison with qemu-img under Ubuntu:

  1. Version: qemu-img version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.16),
  Copyright (c) 2004-2008 Fabrice Bellard

  2. qemu-img create -f vhdx -o subformat=fixed lin.vhdx 512M
 Formatting 'lin.vhdx', fmt=vhdx size=536870912 log_size=1048576 
block_size=0 subformat=fixed

  3. qemu-img info lin.vhdx
 image: lin.vhdx
 file format: vhdx
 virtual size: 512M (536870912 bytes)
 disk size: 520M
 cluster_size: 8388608

  4. Load lin.vhdx under Windows 10 is ok

  The same thing happens with the `vpc` format with or without
  `oformat=fixed`; it seems that the Windows version of qemu-img performs
  some incorrect operation? My guess is that the Windows version of qemu-img
  doesn't handle the description field of vpc/vhdx, which leads to an
  incorrect `disk size` field.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1759522/+subscriptions



Re: [PATCH] migration/throttle: Make throttle slower at tail stage

2020-02-18 Thread zhukeqian



On 2020/2/14 20:28, Dr. David Alan Gilbert wrote:
> * Keqian Zhu (zhukeqi...@huawei.com) wrote:
>> At the tail stage of throttle, VM is very sensitive to
>> CPU percentage. We just throttle 30% of remaining CPU
>> when throttle is more than 80 percent.
> 
> This is a bit unusual;  all of the rest of the throttling has no
> fixed constants; all values are set by parameters.
> 
> You've got the two, the '80' and the '0.3'
> 
> I thinkt he easy solution is to replace your parameter 'tailslow' by two
> new parameters; 'tailstart' and 'tailrate';  both defaulting to 100%.
> 
> Then you make it:
> 
> if (cpu_throttle_now >= pct_tailstart) {
> /* Eat some scale of CPU from remaining */
> cpu_throttle_inc = ceil((100 - cpu_throttle_now) * pct_tailrate);
> 
> (with percentage scaling added).
> 
> Then setting 'tailstart' to 80 and 'tailrate' to 30 is equivalent to
> what you have, but means we have no magical constants in the code.
> 
Yes, this is a good suggestion. Though this patch is not the final idea,
I will apply it when the throttling approach is decided.
> Dave
> 
> 
[...]
>> -- 
>> 2.19.1
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> 
> .
> 
Thanks,
Keqian




Re: [PATCH] migration/throttle: Make throttle slower at tail stage

2020-02-18 Thread zhukeqian



On 2020/2/14 19:46, Eric Blake wrote:
> On 2/13/20 9:27 PM, Keqian Zhu wrote:
>> At the tail stage of throttle, VM is very sensitive to
>> CPU percentage. We just throttle 30% of remaining CPU
>> when throttle is more than 80 percent.
>>
>> This doesn't conflict with cpu_throttle_increment.
>>
>> This may make migration time longer, and is disabled
>> by default.
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>> Cc: Juan Quintela 
>> Cc: "Dr. David Alan Gilbert" 
>> Cc: Eric Blake 
>> Cc: Markus Armbruster 
> 
>> +++ b/qapi/migration.json
>> @@ -532,6 +532,12 @@
>>   #  auto-converge detects that migration is not 
>> making
>>   #  progress. The default value is 10. (Since 2.7)
>>   #
>> +# @cpu-throttle-tailslow: Make throttle slower at tail stage
>> +# At the tail stage of throttle, VM is very 
>> sensitive to
>> +# CPU percentage. We just throttle 30% of remaining 
>> CPU
>> +# when throttle is more than 80 percent. The 
>> default
>> +# value is false. (Since 4.1)
> 
> The next release is 5.0, not 4.1.
Thanks for reminding me.
> 
Thanks,
Keqian





Re: [PATCH] migration/throttle: Make throttle slower at tail stage

2020-02-18 Thread zhukeqian
Hi, Juan

On 2020/2/14 20:37, Juan Quintela wrote:
> Keqian Zhu  wrote:
>> At the tail stage of throttle, VM is very sensitive to
>> CPU percentage. We just throttle 30% of remaining CPU
>> when throttle is more than 80 percentage.
> 
> Why?
> 
My original idea is that if we throttle a fixed percentage of CPU every time,
then the VM becomes more and more sensitive to each performance decrease.

For example, suppose the initial throttle is 10% and we throttle 10% more every
time. At the beginning, the performance changes from 100% to 90%, which has
little effect on the VM. However, if the dirty rate is very high and even an
80% throttle is not enough, then the performance changes from 20% to 10%,
which halves the performance and has a heavy effect on the VM.

In the example above, if an 85% throttle is enough, then throttling 90% causes
unnecessary performance loss on the VM. So this is the reason for slowing down
throttling when we are about to reach the best throttle value.

> If we really think that this is better that current approarch, just do
> this _always_.  And throothre 30% of remaining CPU.  So we go:
> 
> 30%
> 30% + 0.3(70%)
> ...
> 
> Or anything else.
> 

This should not be a new approach; instead, it is an optional enhancement to
the current approach. However, after thinking about it more deeply, throttling
30% of the remaining CPU is unusual and not suitable. We should use another way
to slow down the tail stage.

When the dirty bytes are 50% more than the approximate bytes transferred, we
start or increase throttling. My idea is that we can calculate the expected
throttle increment. When the dirty rate is about to reach 50% of the bandwidth,
the expected throttle increment will be smaller than "cpu_throttle_increment"
at the tail stage.

Maybe the core code looks like this:

-static void mig_throttle_guest_down(void)
+static void mig_throttle_guest_down(uint64_t bytes_dirty, uint64_t bytes_xfer)
 {
 MigrationState *s = migrate_get_current();
 uint64_t pct_initial = s->parameters.cpu_throttle_initial;
-uint64_t pct_icrement = s->parameters.cpu_throttle_increment;
+uint64_t pct_increment = s->parameters.cpu_throttle_increment;
+bool pct_tailslow = s->parameters.cpu_throttle_tailslow;
 int pct_max = s->parameters.max_cpu_throttle;

+uint64_t cpu_throttle_now = cpu_throttle_get_percentage();
+uint64_t cpu_now, cpu_target, cpu_throttle_expect;
+uint64_t cpu_throttle_inc;
+
 /* We have not started throttling yet. Let's start it. */
 if (!cpu_throttle_active()) {
 cpu_throttle_set(pct_initial);
 } else {
 /* Throttling already on, just increase the rate */
-cpu_throttle_set(MIN(cpu_throttle_get_percentage() + pct_icrement,
+cpu_throttle_inc = pct_increment;
+if (pct_tailslow) {
+cpu_now = 100 - cpu_throttle_now;
+cpu_target = ((bytes_xfer / 2.0) / bytes_dirty) * cpu_now;
+cpu_throttle_expect = cpu_now - cpu_target;
+if (cpu_throttle_expect < pct_increment) {
+cpu_throttle_inc = cpu_throttle_expect;
+}
+}
+cpu_throttle_set(MIN(cpu_throttle_now + cpu_throttle_inc,
  pct_max));
 }
 }
__

-if ((rs->num_dirty_pages_period * TARGET_PAGE_SIZE >
-   (bytes_xfer_now - rs->bytes_xfer_prev) / 2) &&
+bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE;
+bytes_xfer_period = bytes_xfer_now - rs->bytes_xfer_prev;
+if ((bytes_dirty_period > bytes_xfer_period / 2) &&
 (++rs->dirty_rate_high_cnt >= 2)) {
 trace_migration_throttle();
 rs->dirty_rate_high_cnt = 0;
-mig_throttle_guest_down();
+mig_throttle_guest_down(bytes_dirty_period,
+bytes_xfer_period);
 }
> My experience is:
> - you really need to go to very high throothle to make migration happens
>   (more than 70% or so usually).
> - The way that we throotle is not completely exact.
> 
>> This doesn't conflict with cpu_throttle_increment.
>>
>> This may make migration time longer, and is disabled
>> by default.
> 
> 
> What do you think?
> I think that it is better to change method and improve documentation
> that yet adding another parameter.
> 
> Later, Juan.
> 
> 
> .
> 
Thanks,
Keqian




Re: QEMU Sockets Networking Backend Multicast Networking Fix

2020-02-18 Thread Jason Wang



On 2020/2/17 6:05 PM, Faisal Al-Humaimidi wrote:

Hello Jason,

But, the local address is not meant to be added to the group; rather, 
we listen on it, hence we bind to the local address. The multicast 
group is a higher layer that the listening host would request to 
join. Here's a similar multicasting example that 
demonstrates this idea in Python: 
https://pymotw.com/2/socket/multicast.html.



Well, I think it kind of violates the multicast overlay here. It allows 
receiving any other traffic (unicast) that may arrive on the 
port, which is not what we want here.


Thanks





Regards,
Faisal Al-Humaimidi

On Mon., Feb. 17, 2020, 1:54 a.m., Jason Wang wrote:



On 2020/2/15 6:39 PM, Markus Armbruster wrote:
> Jason, please have a look.
>
> Faisal Al-Humaimidi <falhuma...@gmail.com> writes:
>
>> Hello QEMU developers,
>>
>> I have noticed a bug in the `mcast` option of the `socket`
networking
>> backend, where I simply cannot join a multicast group (tested
in Windows 10
>> with QEMU 4.2.0 release). I have found a fix to the problem.
The problem
>> was mainly due to the fact that QEMU was binding to the
multicast address,
>> and not the local address or the default INADDR_ANY (0.0.0.0)
if no local
>> address is used.
>>
>> Here's the patch text (as well as attached with this email),
that outlines
>> my fix:
>>
>> ```
>> diff -uarN qemu-4.2.0.original/net/socket.c
qemu-4.2.0.modified/net/socket.c
>> --- qemu-4.2.0.original/net/socket.c 2019-12-12
10:20:48.0 -0800
>> +++ qemu-4.2.0.modified/net/socket.c 2020-02-14
10:30:16.395973453 -0800
>> @@ -253,6 +253,15 @@
>>           goto fail;
>>       }
>>
>> +    /* Preserve the multicast address, and bind to a
non-multicast group
>> (e.g. a
>> +     * local address).
>> +     */
>> +    struct in_addr group_addr = mcastaddr->sin_addr;
>> +    if (localaddr) {
>> +        mcastaddr->sin_addr = *localaddr;
>> +    } else {
>> +        mcastaddr->sin_addr.s_addr = htonl(INADDR_ANY);
>> +    }
>>       ret = bind(fd, (struct sockaddr *)mcastaddr,
sizeof(*mcastaddr));


This looks wrong, AFAIK the local address should be added through
IP_ADD_MEMBERSHIP which is already handled in this function I believe.

Thanks


>>       if (ret < 0) {
>>           error_setg_errno(errp, errno, "can't bind ip=%s to
socket",
>> @@ -260,7 +269,10 @@
>>           goto fail;
>>       }
>>
>> -    /* Add host to multicast group */
>> +    /* Restore the multicast address. */
>> +    mcastaddr->sin_addr = group_addr;
>> +
>> +    /* Add host to multicast group. */
>>       imr.imr_multiaddr = mcastaddr->sin_addr;
>>       if (localaddr) {
>>           imr.imr_interface = *localaddr;
>> @@ -277,7 +289,7 @@
>>           goto fail;
>>       }
>>
>> -    /* Force mcast msgs to loopback (eg. several QEMUs in same
host */
>> +    /* Force mcast msgs to loopback (eg. several QEMUs in same
host). */
>>       loop = 1;
>>       ret = qemu_setsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP,
>>                             &loop, sizeof(loop));
>> @@ -287,7 +299,7 @@
>>           goto fail;
>>       }
>>
>> -    /* If a bind address is given, only send packets from that
address */
>> +    /* If a bind address is given, only send packets from that
address. */
>>       if (localaddr != NULL) {
>>           ret = qemu_setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF,
>>                                 localaddr, sizeof(*localaddr));
>> ```
>>
>> Regards,
>> Faisal Al-Humaimidi
>






Re: [PATCH v12 Kernel 4/7] vfio iommu: Implementation of ioctl to for dirty pages tracking.

2020-02-18 Thread Alex Williamson
On Wed, 19 Feb 2020 09:51:32 +0530
Kirti Wankhede  wrote:

> On 2/19/2020 3:11 AM, Alex Williamson wrote:
> > On Tue, 18 Feb 2020 11:28:53 +0530
> > Kirti Wankhede  wrote:
> >   
> >> 
> >>  
> >>> As I understand the above algorithm, we find a vfio_dma
> >>> overlapping the request and populate the bitmap for that range.  Then
> >>> we go back and put_user() for each byte that we touched.  We could
> >>> instead simply work on a one byte buffer as we enumerate the requested
> >>> range and do a put_user() every time we reach the end of it and have 
> >>> bits
> >>> set. That would greatly simplify the above example.  But I would 
> >>> expect
> >>> that we're a) more likely to get asked for ranges covering a single
> >>> vfio_dma  
> >>
> >> QEMU ask for single vfio_dma during each iteration.
> >>
> >> If we restrict this ABI to cover single vfio_dma only, then it
> >> simplifies the logic here. That was my original suggestion. Should we
> >> think about that again?  
> >
> > But we currently allow unmaps that overlap multiple vfio_dmas as long
> > as no vfio_dma is bisected, so I think that implies that an unmap while
> > asking for the dirty bitmap has even further restricted semantics.  I'm
> > also reluctant to design an ABI around what happens to be the current
> > QEMU implementation.
> >
> > If we take your example above, ranges {0x,0xa000} and
> > {0xa000,0x1} ({start,end}), I think you're working with the
> > following two bitmaps in this implementation:
> >
> > 0011 b
> > 0011b
> >
> > And we need to combine those into:
> >
> >  b
> >
> > Right?
> >
> > But it seems like that would be easier if the second bitmap was instead:
> >
> > 1100b
> >
> > Then we wouldn't need to worry about the entire bitmap being shifted by
> > the bit offset within the byte, which limits our fixes to the boundary
> > byte and allows us to use copy_to_user() directly for the bulk of the
> > copy.  So how do we get there?
> >
> > I think we start with allocating the vfio_dma bitmap to account for
> > this initial offset, so we calculate bitmap_base_iova as:
> >  (iova & ~((PAGE_SIZE << 3) - 1))
> > We then use bitmap_base_iova in calculating which bits to set.
> >
> > The user needs to follow the same rules, and maybe this adds some value
> > to the user providing the bitmap size rather than the kernel
> > calculating it.  For example, if the user wanted the dirty bitmap for
> > the range {0xa000,0x1} above, they'd provide at least a 1 byte
> > bitmap, but we'd return bit #2 set to indicate 0xa000 is dirty.
> >
> > Effectively the user can ask for any iova range, but the buffer will be
> > filled relative to the zeroth bit of the bitmap following the above
> > bitmap_base_iova formula (and replacing PAGE_SIZE with the user
> > requested pgsize).  I'm tempted to make this explicit in the user
> > interface (ie. only allow bitmaps starting on aligned pages), but a
> > user is able to map and unmap single pages and we need to support
> > returning a dirty bitmap with an unmap, so I don't think we can do that.
> > 
> 
>  Sigh, finding adjacent vfio_dmas within the same byte seems simpler than
>  this.  
> >>>
> >>> How does KVM do this?  My intent was that if all of our bitmaps share
> >>> the same alignment then we can merge the intersection and continue to
> >>> use copy_to_user() on either side.  However, if QEMU doesn't do the
> >>> same, it doesn't really help us.  Is QEMU stuck with an implementation
> >>> of only retrieving dirty bits per MemoryRegionSection exactly because
> >>> of this issue and therefore we can rely on it in our implementation as
> >>> well?  Thanks,
> >>>  
> >>
> >> QEMU syncs the dirty_bitmap per MemoryRegionSection. Within a
> >> MemoryRegionSection there can be multiple KVMSlots. QEMU queries the
> >> dirty_bitmap per KVMSlot and marks pages dirty for each KVMSlot.
> >> On kernel side, KVM_GET_DIRTY_LOG ioctl calls
> >> kvm_get_dirty_log_protect(), where it uses copy_to_user() to copy bitmap
> >> of that memSlot.
> >> vfio_dma is per MemoryRegionSection. We can rely on MemoryRegionSection
> >> in our implementation. But to get the bitmap during unmap, we have to take
> >> care of concatenating bitmaps.  
> > 
> > So KVM does not worry about bitmap alignment because the interface is
> > based on slots, a dirty bitmap can only be retrieved for a single,
> > entire slot.  We need VFIO_IOMMU_UNMAP_DMA to maintain its support for
> > spanning multiple vfio_dmas, but maybe we have some leeway that we
> > don't need to support both multiple vfio_dmas and dirty bitmap at the
> > same time.  It seems like it would be a massive simplification if we
> > required an unmap with dirty bitmap to span exactly one vfio_dma,
> > right?

Re: [PATCH v12 Kernel 4/7] vfio iommu: Implementation of ioctl to for dirty pages tracking.

2020-02-18 Thread Kirti Wankhede




On 2/19/2020 3:11 AM, Alex Williamson wrote:

On Tue, 18 Feb 2020 11:28:53 +0530
Kirti Wankhede  wrote:





As I understand the above algorithm, we find a vfio_dma
overlapping the request and populate the bitmap for that range.  Then
we go back and put_user() for each byte that we touched.  We could
instead simply work on a one byte buffer as we enumerate the requested
range and do a put_user() every time we reach the end of it and have bits
set. That would greatly simplify the above example.  But I would expect
that we're a) more likely to get asked for ranges covering a single
vfio_dma


QEMU asks for a single vfio_dma during each iteration.

If we restrict this ABI to cover single vfio_dma only, then it
simplifies the logic here. That was my original suggestion. Should we
think about that again?


But we currently allow unmaps that overlap multiple vfio_dmas as long
as no vfio_dma is bisected, so I think that implies that an unmap while
asking for the dirty bitmap has even further restricted semantics.  I'm
also reluctant to design an ABI around what happens to be the current
QEMU implementation.

If we take your example above, ranges {0x,0xa000} and
{0xa000,0x1} ({start,end}), I think you're working with the
following two bitmaps in this implementation:

0011 b
0011b

And we need to combine those into:

 b

Right?

But it seems like that would be easier if the second bitmap was instead:

1100b

Then we wouldn't need to worry about the entire bitmap being shifted by
the bit offset within the byte, which limits our fixes to the boundary
byte and allows us to use copy_to_user() directly for the bulk of the
copy.  So how do we get there?
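As a rough numeric illustration of the difference (plain C, values from the example above: 4 KiB pages, pages 0-9 dirty in the first vfio_dma and pages 10-15 in the second), unaligned reporting forces a cross-byte shift at the boundary, while aligned reporting reduces to a plain OR:

```c
#include <assert.h>
#include <stdint.h>

/* Merge a second range's boundary byte into the combined bitmap when
 * the second range was reported relative to its own bit 0: the bits
 * must be shifted by the range's offset within the byte (and a longer
 * range would also spill 'bits >> (8 - bit_off)' into the next byte). */
static void merge_unaligned(uint8_t *combined, int idx, uint8_t bits,
                            unsigned bit_off)
{
    combined[idx] |= (uint8_t)(bits << bit_off);
}

/* If the kernel instead reports bits relative to the byte-aligned base
 * IOVA, the boundary byte is a plain OR and everything past it can go
 * straight through copy_to_user(). */
static void merge_aligned(uint8_t *combined, int idx, uint8_t bits)
{
    combined[idx] |= bits;
}
```

With the first vfio_dma contributing {0xff, 0x03} (pages 0-9), the second range's six dirty pages arrive as 0x3f in the unaligned form (the merge needs `<< 2`), but as 0xfc in the aligned form; either way the combined result is {0xff, 0xff}.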

I think we start with allocating the vfio_dma bitmap to account for
this initial offset, so we calculate bitmap_base_iova as:
 (iova & ~((PAGE_SIZE << 3) - 1))
We then use bitmap_base_iova in calculating which bits to set.

The user needs to follow the same rules, and maybe this adds some value
to the user providing the bitmap size rather than the kernel
calculating it.  For example, if the user wanted the dirty bitmap for
the range {0xa000,0x1} above, they'd provide at least a 1 byte
bitmap, but we'd return bit #2 set to indicate 0xa000 is dirty.

Effectively the user can ask for any iova range, but the buffer will be
filled relative to the zeroth bit of the bitmap following the above
bitmap_base_iova formula (and replacing PAGE_SIZE with the user
requested pgsize).  I'm tempted to make this explicit in the user
interface (ie. only allow bitmaps starting on aligned pages), but a
user is able to map and unmap single pages and we need to support
returning a dirty bitmap with an unmap, so I don't think we can do that.
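The base/bit arithmetic described here can be sketched as follows (hypothetical helper names; a 4 KiB reporting page size is assumed, so one bitmap byte spans 0x8000 bytes of IOVA space):

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 0x1000ULL  /* assumed user-requested reporting page size */

/* Byte-aligned bitmap base: the IOVA that bit 0 of the reported bitmap
 * refers to.  One bitmap byte covers 8 pages, hence the (PGSIZE << 3)
 * alignment, mirroring the formula in the mail. */
static uint64_t bitmap_base_iova(uint64_t iova)
{
    return iova & ~((PGSIZE << 3) - 1);
}

/* Bit index at which a dirty page at 'iova' is reported. */
static unsigned dirty_bit(uint64_t iova)
{
    return (unsigned)((iova - bitmap_base_iova(iova)) / PGSIZE);
}
```

For a request starting at 0xa000 this yields a base of 0x8000 and reports the page at 0xa000 as bit #2 of the user's first byte, matching the example above.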
  


Sigh, finding adjacent vfio_dmas within the same byte seems simpler than
this.


How does KVM do this?  My intent was that if all of our bitmaps share
the same alignment then we can merge the intersection and continue to
use copy_to_user() on either side.  However, if QEMU doesn't do the
same, it doesn't really help us.  Is QEMU stuck with an implementation
of only retrieving dirty bits per MemoryRegionSection exactly because
of this issue and therefore we can rely on it in our implementation as
well?  Thanks,
   


QEMU syncs the dirty_bitmap per MemoryRegionSection. Within a
MemoryRegionSection there can be multiple KVMSlots. QEMU queries the
dirty_bitmap per KVMSlot and marks pages dirty for each KVMSlot.
On kernel side, KVM_GET_DIRTY_LOG ioctl calls
kvm_get_dirty_log_protect(), where it uses copy_to_user() to copy bitmap
of that memSlot.
vfio_dma is per MemoryRegionSection. We can rely on MemoryRegionSection
in our implementation. But to get the bitmap during unmap, we have to take
care of concatenating bitmaps.


So KVM does not worry about bitmap alignment because the interface is
based on slots, a dirty bitmap can only be retrieved for a single,
entire slot.  We need VFIO_IOMMU_UNMAP_DMA to maintain its support for
spanning multiple vfio_dmas, but maybe we have some leeway that we
don't need to support both multiple vfio_dmas and dirty bitmap at the
same time.  It seems like it would be a massive simplification if we
required an unmap with dirty bitmap to span exactly one vfio_dma,
right? 


Yes.


I don't see that we'd break any existing users with that, it's
unfortunate that we can't have the flexibility of the existing calling
convention, but I think there's good reason for it here.  Our separate
dirty bitmap log reporting would follow the same semantics.  I think
this all aligns with how the MemoryListener works in QEMU right now,
correct?  For example we wouldn't need any extra per MAP_DMA tracking
in QEMU like KVM has for its slots.



That's right.
Should we go ahead with the implementation to get the dirty bitmap for one 
vfio_dma for the GET_DIRTY ioctl and for unmap with the dirty-bitmap ioctl? 
Accordingly, we can add sanity checks to these ioctls.


Thanks,
Kirti


In QEMU, in function kvm_physical_sync_dirty_bitmap() 

Re: [PATCH] pcie_root_port: Add disable_hotplug option

2020-02-18 Thread Michael S. Tsirkin
On Tue, Feb 18, 2020 at 10:02:19PM -0500, Laine Stump wrote:
> Also, is there a rhyme/reason for some options having true/false, and some
> being off/on? disable-acs seems to be true/false, but disable-modern is
> on/off. Doesn't make any difference to me in the end, but just thought I'd
> bring it up in case there might be a reason to use on/off instead of
> true/false for this one.

Some places accept on/off, some true/false, some on/off/true/false
others on/off/yes/no and others on/off/true/false/yes/no.

In this case both use the user visitor machinery, which I *think*
means on/off is the safe choice and true/false can be
broken in some places.

We really should clean up this mess ... Julia, what do you think?
Let's make them all support all options?


-- 
MST




RE: Emulating Solaris 10 on SPARC64 sun4u

2020-02-18 Thread jasper.lowell
Excuse the delay. I believe the reason why I am unable to locate the error 
string "Interrupt not seen after set_features" in the OpenSolaris source code 
is because it belongs to a proprietary driver that was not distributed with 
OpenSolaris. Rather than rely on source code I've had to debug this problem by 
observing Solaris 10's behaviour.

I previously linked 
https://docs.oracle.com/cd/E23824_01/html/821-1475/uata-7d.html that seems to 
indicate that this error is fatal.
Considering that the CMD646 IDE controller driver experiences a fatal error 
during the bootstrapping of the system, I suspect that the file system on the 
CDROM might not be accessible.
I'm not sure if this is directly related to the unresponsive serial console but 
I wouldn't be surprised.

When configuring devices, Solaris 10 uses the SET_FEATURE command on the CMD646 
to set the transfer mode to MDMA mode.
From what I can tell, this is successful and the emulated IDE controller 
raises an interrupt acknowledging that the command was completed successfully. 
To determine whether or not this interrupt was successfully propagated to 
Solaris 10, I made manual changes to ensure that the interrupt was not raised 
for this event at this specific time. This resulted in a new error from 
Solaris 10 regarding "set_features".
- Solaris 10 appears to be able to see the interrupt from the completion of the 
SET_FEATURE command.
- Solaris 10 appears to then perform two reads on the status register. From 
what I understand, this has the side effect of clearing interrupts.
- Solaris 10 then writes to the device/head register.
- Solaris 10 then spins on ARTTIM23_INTR_CH1 expecting it to be set. When it is 
not set, the operation times out and we are presented with the fatal error 
regarding set_features.

I am not intimately familiar with the workings of the CMD646 or the ATA 
specification so I can only speculate.
- If the interrupt that Solaris 10 expects is the one from the SET_FEATURE 
command, then Solaris 10 is not expecting reading from the status register to 
clear ARTTIM23_INTR_CH1.
- If the interrupt that Solaris 10 expects is not the one from the SET_FEATURE 
command, then it must expect an interrupt to occur from writing to the 
device/head register.

I found it strange that Solaris 10 was spinning on ARTTIM23_INTR_CH1. Is it 
possible that Solaris 10 is not expecting the values of ARTTIM23_INTR_CH1 and 
MRDMODE_INTR_CH1 to be synced? I made changes to disable the syncing and the 
fatal error from Solaris 10 disappeared. Unfortunately, I can't tell whether or 
not this actually improved the emulation of Solaris 10 as the serial console is 
still unresponsive.
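To make the syncing hypothesis concrete, here is a toy model (the structure and behaviour are inferred from the observations above, not the real CMD646 programming model) of why mirroring a status-read clear into ARTTIM23_INTR_CH1 would make the driver's poll time out:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a status-register read clears the MRDMODE interrupt bit,
 * and the open question is whether the device model should mirror that
 * clear into ARTTIM23_INTR_CH1. */
struct toy_cmd646 {
    int mrdmode_intr_ch1;
    int arttim23_intr_ch1;
    int sync_bits;          /* does a status read clear both bits? */
};

static void set_feature_completes(struct toy_cmd646 *c)
{
    /* SET_FEATURE finishes and the controller raises its interrupt. */
    c->mrdmode_intr_ch1 = 1;
    c->arttim23_intr_ch1 = 1;
}

static uint8_t read_status(struct toy_cmd646 *c)
{
    c->mrdmode_intr_ch1 = 0;       /* side effect of reading status */
    if (c->sync_bits) {
        c->arttim23_intr_ch1 = 0;  /* synced model drops both bits */
    }
    return 0x50;                   /* DRDY | DSC: drive ready, idle */
}
```

Replaying Solaris' observed sequence (interrupt, two status reads, then spin on ARTTIM23_INTR_CH1): in the synced model the bit is already clear and the driver times out, while in the unsynced model it is still set and the driver proceeds.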

If there is a bug in the Solaris 10 driver I would expect this error to be more 
widely referenced online. I suspect that the emulated CMD646 is not perfectly 
faithful to the hardware and this is causing problems for Solaris 10.
I am not convinced that this problem is related to IRQ routing as Solaris 10 
appears to recognise interrupts when they happen (or don't). Because of this, I 
don't think this error is related  to the DMA problem under MorphOS but I could 
be wrong.

Does anyone have any ideas that might explain why Solaris 10 insists that 
ARTTIM23_INTR_CH1 is set despite two previous reads of the status register?

Thanks,
Jasper Lowell.

-Original Message-
From: Mark Cave-Ayland  
Sent: Sunday, 9 February 2020 10:26 PM
To: Lowell,J,Jasper,VIM R ; qemu-devel@nongnu.org
Cc: atar4q...@gmail.com
Subject: Re: Emulating Solaris 10 on SPARC64 sun4u

On 05/02/2020 06:31, jasper.low...@bt.com wrote:

> I'm currently working towards emulating Solaris 10 on sun4u.
> 
>  
> 
> The Solaris 10 ISO image I am attempting to boot is the one from the 
> Oracle
> 
> download page at
> https://www.oracle.com/solaris/solaris10/downloads/solaris10-get-jsp-downloads.html.
> 
> Image: sol-10-u11-ga-sparc-dvd.iso
> 
> MD5:   53e8b066f7f250ce2fd2cef063f8072b
> 
>  
> 
> I am using QEMU commit 7bd9d0a9e26c7a3c67c0f174f0009ba19969b158.
> 
>  
> 
> The command I am using to run QEMU is:
> 
> ./qemu/sparc64-softmmu/qemu-system-sparc64 -bios 
> ./openbios/obj-sparc64/openbios-builtin.elf -cdrom 
> ./iso/solaris/sol-10-u11-ga-sparc-dvd.iso -boot d -nographic -m 3G
> 
>  
> 
> ```
> 
> CPUs: 1 x SUNW,UltraSPARC-IIi
> 
> UUID: ----
> 
> Welcome to OpenBIOS v1.1 built on Feb 5 2020 19:15
> 
>   Type 'help' for detailed information
> 
> Trying cdrom:f...
> 
> Not a bootable ELF image
> 
> Not a bootable a.out image
> 
>  
> 
> Loading FCode image...
> 
> Loaded 7420 bytes
> 
> entry point is 0x4000
> 
> Evaluating FCode...
> 
> Evaluating FCode...
> 
> Ignoring failed claim for va 100 memsz af6d6!
> 
> Ignoring failed claim for va 1402000 memsz 4dcc8!
> 
> Ignoring failed claim for va 180 memsz 510c8!
> 
> SunOS Release 5.10 Version Generic_147147-26 64-bit
> 
> Copyright (c) 1983, 2013, Oracle and/or its affiliates. All rights reserved.
> 
> could 

Re: [PATCH v4 00/14] Fixes for DP8393X SONIC device emulation

2020-02-18 Thread Jason Wang



On 2020/2/19 2:30 AM, Aleksandar Markovic wrote:



On Tuesday, February 4, 2020, Jason Wang wrote:



On 2020/1/29 5:27 PM, Finn Thain wrote:

Hi All,

There are bugs in the emulated dp8393x device that can stop packet
reception in a Linux/m68k guest (q800 machine).

With a Linux/m68k v5.5 guest (q800), it's possible to remotely
trigger
an Oops by sending ping floods.

With a Linux/mips guest (magnum machine), the driver fails to
probe
the dp8393x device.

With a NetBSD/arc 5.1 guest (magnum), the bugs in the device
can be
fatal to the guest kernel.

Whilst debugging the device, I found that the receiver algorithm
differs from the one described in the National Semiconductor
datasheet.

This patch series resolves these bugs.

AFAIK, all bugs in the Linux sonic driver were fixed in Linux
v5.5.
---
Changed since v1:
  - Minor revisions as described beneath commit logs.
  - Dropped patches 4/10 and 7/10.
  - Added 5 new patches.

Changed since v2:
  - Minor revisions as described beneath commit logs.
  - Dropped patch 13/13.
  - Added 2 new patches.

Changed since v3:
  - Replaced patch 13/14 with patch suggested by Philippe
Mathieu-Daudé.


Finn Thain (14):
   dp8393x: Mask EOL bit from descriptor addresses
   dp8393x: Always use 32-bit accesses
   dp8393x: Clean up endianness hacks
   dp8393x: Have dp8393x_receive() return the packet size
   dp8393x: Update LLFA and CRDA registers from rx descriptor
   dp8393x: Clear RRRA command register bit only when appropriate
   dp8393x: Implement packet size limit and RBAE interrupt
   dp8393x: Don't clobber packet checksum
   dp8393x: Use long-word-aligned RRA pointers in 32-bit mode
   dp8393x: Pad frames to word or long word boundary
   dp8393x: Clear descriptor in_use field to release packet
   dp8393x: Always update RRA pointers and sequence numbers
   dp8393x: Don't reset Silicon Revision register
   dp8393x: Don't stop reception upon RBE interrupt assertion

  hw/net/dp8393x.c | 202
+++
  1 file changed, 134 insertions(+), 68 deletions(-)



Applied.


Hi, Jason,

I generally have some reservations towards patches that did not 
receive any R-bs. I think we should hear from Herve in this case, to 
confirm that this change doesn't cause other problems while solving 
the original ones.



That's fine, but if it's agreed that we should hear from somebody for a 
specific part of the code, it's better to have that person listed as 
maintainer/reviewer in MAINTAINERS.


Thanks




I hope this is not the case.

Regards,
Aleksandar







Re: [PATCH] pcie_root_port: Add disable_hotplug option

2020-02-18 Thread Laine Stump

On 2/18/20 1:40 PM, Julia Suvorova wrote:

On Tue, Feb 18, 2020 at 6:18 PM Laine Stump  wrote:


On 2/18/20 11:17 AM, Julia Suvorova wrote:

Make hot-plug/hot-unplug on PCIe Root Ports optional to allow libvirt
to manage it and restrict unplug for the entire machine. This is going
to prevent user-initiated unplug in guests (Windows mostly).
Usage:
  -device pcie-root-port,disable-hotplug=true,...


Double negatives (e.g. "disable-hotplug=false") tend to confuse simple
minds like mine. Would it be any more difficult to make the name of the
option positive instead (e.g. "enable-hotplug") with the default set to
"true"?


disable-hotplug=false will not be used, because it's default. And it
follows previous naming (''disable-acs').


Yeah, I don't like the name of that one either (or of "disable-modern" 
or "disable-legacy") but I don't follow qemu-devel closely so I didn't 
see them when their patches went by. But now is my chance to complain :-)


I can live with it either way, but still think it's much better to not 
have "negative" option names. Feel free to ignore, and I'll just be 
happy that I didn't accept it silently.


Also, is there a rhyme/reason for some options having true/false, and 
some being off/on? disable-acs seems to be true/false, but 
disable-modern is on/off. Doesn't make any difference to me in the end, 
but just thought I'd bring it up in case there might be a reason to use 
on/off instead of true/false for this one.





Re: [PATCH v2] Avoid address_space_rw() with a constant is_write argument

2020-02-18 Thread Edgar E. Iglesias
On Tue, Feb 18, 2020 at 11:24:57AM +, Peter Maydell wrote:
> The address_space_rw() function allows either reads or writes
> depending on the is_write argument passed to it; this is useful
> when the direction of the access is determined programmatically
> (as for instance when handling the KVM_EXIT_MMIO exit reason).
> Under the hood it just calls either address_space_write() or
> address_space_read_full().
> 
> We also use it a lot with a constant is_write argument, though,
> which has two issues:
>  * when reading "address_space_rw(..., 1)" this is less
>immediately clear to the reader as being a write than
>"address_space_write(...)"
>  * calling address_space_rw() bypasses the optimization
>in address_space_read() that fast-paths reads of a
>fixed length
> 
> This commit was produced with the included Coccinelle script
> scripts/coccinelle/as-rw-const.patch.
> 
> Two lines in hw/net/dp8393x.c that Coccinelle produced that
> were over 80 characters were re-wrapped by hand.
> 
> Signed-off-by: Peter Maydell 
> ---
> I could break this down into separate patches by submaintainer,
> but the patch is not that large and I would argue that it's
> better for the project if we can try to avoid introducing too
> much friction into the process of doing 'safe' tree-wide
> minor refactorings.


For xlnx-zdma:
Reviewed-by: Edgar E. Iglesias 



> 
> v1->v2: put the coccinelle script in scripts/coccinelle rather
> than just in the commit message.
> ---
>  accel/kvm/kvm-all.c  |  6 +--
>  dma-helpers.c|  4 +-
>  exec.c   |  4 +-
>  hw/dma/xlnx-zdma.c   | 11 ++---
>  hw/net/dp8393x.c | 68 ++--
>  hw/net/i82596.c  | 25 +-
>  hw/net/lasi_i82596.c |  5 +-
>  hw/ppc/pnv_lpc.c |  8 ++--
>  hw/s390x/css.c   | 12 ++---
>  qtest.c  | 52 ++---
>  target/i386/hvf/x86_mmu.c| 12 ++---
>  scripts/coccinelle/as_rw_const.cocci | 30 
>  12 files changed, 133 insertions(+), 104 deletions(-)
>  create mode 100644 scripts/coccinelle/as_rw_const.cocci
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index c111312dfdd..0cfe6fd8ded 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2178,9 +2178,9 @@ void kvm_flush_coalesced_mmio_buffer(void)
>  ent = &ring->coalesced_mmio[ring->first];
>  
>  if (ent->pio == 1) {
> -address_space_rw(&address_space_io, ent->phys_addr,
> - MEMTXATTRS_UNSPECIFIED, ent->data,
> - ent->len, true);
> +address_space_write(&address_space_io, ent->phys_addr,
> +MEMTXATTRS_UNSPECIFIED, ent->data,
> +ent->len);
>  } else {
>  cpu_physical_memory_write(ent->phys_addr, ent->data, 
> ent->len);
>  }
> diff --git a/dma-helpers.c b/dma-helpers.c
> index d3871dc61ea..e8a26e81e16 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -28,8 +28,8 @@ int dma_memory_set(AddressSpace *as, dma_addr_t addr, 
> uint8_t c, dma_addr_t len)
>  memset(fillbuf, c, FILLBUF_SIZE);
>  while (len > 0) {
>  l = len < FILLBUF_SIZE ? len : FILLBUF_SIZE;
> -error |= address_space_rw(as, addr, MEMTXATTRS_UNSPECIFIED,
> -  fillbuf, l, true);
> +error |= address_space_write(as, addr, MEMTXATTRS_UNSPECIFIED,
> + fillbuf, l);
>  len -= l;
>  addr += l;
>  }
> diff --git a/exec.c b/exec.c
> index 8e9cc3b47cf..baefe582393 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -3810,8 +3810,8 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> addr,
>  address_space_write_rom(cpu->cpu_ases[asidx].as, phys_addr,
>  attrs, buf, l);
>  } else {
> -address_space_rw(cpu->cpu_ases[asidx].as, phys_addr,
> - attrs, buf, l, 0);
> +address_space_read(cpu->cpu_ases[asidx].as, phys_addr, attrs, 
> buf,
> +   l);
>  }
>  len -= l;
>  buf += l;
> diff --git a/hw/dma/xlnx-zdma.c b/hw/dma/xlnx-zdma.c
> index 8fb83f5b078..31936061e21 100644
> --- a/hw/dma/xlnx-zdma.c
> +++ b/hw/dma/xlnx-zdma.c
> @@ -311,8 +311,7 @@ static bool zdma_load_descriptor(XlnxZDMA *s, uint64_t 
> addr, void *buf)
>  return false;
>  }
>  
> -address_space_rw(s->dma_as, addr, s->attr,
> - buf, sizeof(XlnxZDMADescr), false);
> +address_space_read(s->dma_as, addr, s->attr, buf, sizeof(XlnxZDMADescr));
>  return true;
>  }
>  
> @@ -364,7 +363,7 @@ static uint64_t zdma_update_descr_addr(XlnxZDMA *s, bool 
> type,
>  } else {
>  addr = 

Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC

2020-02-18 Thread Programmingkid


> On Feb 18, 2020, at 12:10 PM, BALATON Zoltan  wrote:
> 
> While other targets take advantage of using host FPU to do floating
> point computations, this was disabled for PPC target because always
> clearing exception flags before every FP op made it slightly slower
> than emulating everything with softfloat. To emulate some FPSCR bits,
> clearing of fp_status may be necessary (unless these could be handled
> e.g. using FP exceptions on host but there's no API for that in QEMU
> yet) but preserving at least the inexact flag makes hardfloat usable
> and faster than softfloat. Since most clients don't actually care
> about this flag, we can gain some speed trading some emulation
> accuracy.
> 
> This patch implements a simple way to keep the inexact flag set for
> hardfloat while still allowing to revert to softfloat for workloads
> that need more accurate albeit slower emulation. (Set hardfloat
> property of CPU, i.e. -cpu name,hardfloat=false for that.) There may
> still be room for further improvement but this seems to increase
> floating point performance. Unfortunately the softfloat case is slower
> than before this patch so this patch only makes sense if the default
> is also set to enable hardfloat.
> 
> Because of the above this patch at the moment is mainly for testing
> different workloads to evaluate how viable would this be in practice.
> Thus, RFC and not ready for merge yet.
> 
> Signed-off-by: BALATON Zoltan 
> ---
> v2: use different approach to avoid needing if () in
> helper_reset_fpstatus() but this does not seem to change overhead
> much, also make it a single patch as adding the hardfloat option is
> only a few lines; with this we can use same value at other places where
> float_status is reset and maybe enable hardfloat for a few more places
> for a little more performance but not too much. With this I got:



Thank you for working on this. It is about time we have a better FPU. 

I applied your patch over David Gibson's ppc-for-5.0 branch. It applied cleanly 
and compiled easily.

Tests were done on a Mac OS 10.4.3 VM. The CPU was set to G3. 

I did several tests and here are my results:

With hard float:
- The USB audio device does not produce any sound. 
- Converting a MIDI file to AAC in iTunes happens at 0.4x (faster than soft 
float :) ).
For my FPSCR test program, 21 tests failed. The high number is because the 
inexact exception is being set for situations it should not be set for.

With soft float:
- Some sound can be heard from the USB audio device. It isn't good sounding. I 
had to force quit Quicktime player because it stopped working.
- Converting a MIDI file to AAC in iTunes happens at 0.3x (slower than hard 
float).
- 13 tests failed with my FPSCR test program.

This patch is a good start. I'm not worried about the Floating Point Status and 
Control Register flags being wrong since hardly any software bothers to check 
them. I think more optimizations can happen by simplifying the FPU. As it is 
now it makes a lot of calls per operation.
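The reset behaviour the patch is changing (clearing all of fp_status before each op versus preserving the sticky inexact flag) can be sketched with toy types; the names below are illustrative, not QEMU's actual softfloat API:

```c
#include <assert.h>

/* Illustrative stand-ins for QEMU's softfloat exception flags. */
enum {
    float_flag_inexact  = 1,
    float_flag_overflow = 2,
};

typedef struct { int exception_flags; } toy_float_status;

/* Old behaviour: clear every flag before each FP op, which forces the
 * softfloat slow path and is what made hardfloat a loss for PPC. */
static void reset_fpstatus_full(toy_float_status *s)
{
    s->exception_flags = 0;
}

/* The patch's idea: keep the sticky inexact flag set, so the hardfloat
 * fast path (which only ever reports inexact) stays usable. */
static void reset_fpstatus_keep_inexact(toy_float_status *s)
{
    s->exception_flags &= float_flag_inexact;
}
```

The cost is exactly what the test results above show: flags other than inexact are still cleared per op, but inexact itself stays latched, so guests that do inspect FPSCR can see it set in situations where it should not be.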




[PATCH v4 11/12] target/ppc: Streamline construction of VRMA SLB entry

2020-02-18 Thread David Gibson
When in VRMA mode (i.e. a guest thinks it has the MMU off, but the
hypervisor is still applying translation) we use a special SLB entry,
rather than looking up an SLBE by address as we do when guest translation
is on.

We build that special entry in ppc_hash64_update_vrma() along with some
logic for handling some non-VRMA cases.  Split the actual build of the
VRMA SLBE into a separate helper and streamline it a bit.

Signed-off-by: David Gibson 
---
 target/ppc/mmu-hash64.c | 78 -
 1 file changed, 38 insertions(+), 40 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 203a41cca1..5ce7cc8359 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -791,6 +791,39 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
 }
 }
 
+static int build_vrma_slbe(PowerPCCPU *cpu, ppc_slb_t *slb)
+{
+CPUPPCState *env = &cpu->env;
+target_ulong lpcr = env->spr[SPR_LPCR];
+uint32_t vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
+target_ulong vsid = SLB_VSID_VRMA | ((vrmasd << 4) & SLB_VSID_LLP_MASK);
+int i;
+
+/*
+ * Make one up. Mostly ignore the ESID which will not be needed
+ * for translation
+ */
+for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+const PPCHash64SegmentPageSizes *sps = &cpu->hash64_opts->sps[i];
+
+if (!sps->page_shift) {
+break;
+}
+
+if ((vsid & SLB_VSID_LLP_MASK) == sps->slb_enc) {
+slb->esid = SLB_ESID_V;
+slb->vsid = vsid;
+slb->sps = sps;
+return 0;
+}
+}
+
+error_report("Bad page size encoding in LPCR[VRMASD]; LPCR=0x"
+ TARGET_FMT_lx"\n", lpcr);
+
+return -1;
+}
+
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 int rwx, int mmu_idx)
 {
@@ -1046,53 +1079,18 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, 
target_ulong ptex,
 static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
-const PPCHash64SegmentPageSizes *sps = NULL;
-target_ulong esid, vsid, lpcr;
 ppc_slb_t *slb = &env->vrma_slb;
-uint32_t vrmasd;
-int i;
-
-/* First clear it */
-slb->esid = slb->vsid = 0;
-slb->sps = NULL;
 
 /* Is VRMA enabled ? */
 if (ppc_hash64_use_vrma(env)) {
-return;
-}
-
-/*
- * Make one up. Mostly ignore the ESID which will not be needed
- * for translation
- */
-lpcr = env->spr[SPR_LPCR];
-vsid = SLB_VSID_VRMA;
-vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
-vsid |= (vrmasd << 4) & (SLB_VSID_L | SLB_VSID_LP);
-esid = SLB_ESID_V;
-
-for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
-const PPCHash64SegmentPageSizes *sps1 = &cpu->hash64_opts->sps[i];
-
-if (!sps1->page_shift) {
-break;
-}
-
-if ((vsid & SLB_VSID_LLP_MASK) == sps1->slb_enc) {
-sps = sps1;
-break;
+if (build_vrma_slbe(cpu, slb) == 0) {
+return;
 }
 }
 
-if (!sps) {
-error_report("Bad page size encoding esid 0x"TARGET_FMT_lx
- " vsid 0x"TARGET_FMT_lx, esid, vsid);
-return;
-}
-
-slb->vsid = vsid;
-slb->esid = esid;
-slb->sps = sps;
+/* Otherwise, clear it to indicate error */
+slb->esid = slb->vsid = 0;
+slb->sps = NULL;
 }
 
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
-- 
2.24.1




[PATCH v4 10/12] target/ppc: Only calculate RMLS derived RMA limit on demand

2020-02-18 Thread David Gibson
When the LPCR is written, we update the env->rmls field with the RMA limit
it implies.  Simplify things by just calculating the value directly from
the LPCR value when we need it.

It's possible this is a little slower, but it's unlikely to be significant,
since this is only for real mode accesses in a translation configuration
that's not used very often, and the whole thing is behind the qemu TLB
anyway.  Therefore, keeping the number of state variables down and not
having to worry about making sure it's always in sync seems the better
option.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/cpu.h| 1 -
 target/ppc/mmu-hash64.c | 8 +---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 8077fdb068..f9871b1233 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1046,7 +1046,6 @@ struct CPUPPCState {
 uint64_t insns_flags2;
 #if defined(TARGET_PPC64)
 ppc_slb_t vrma_slb;
-target_ulong rmls;
 #endif
 
 int error_code;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 46690bc79b..203a41cca1 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -844,8 +844,10 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
 
 goto skip_slb_search;
 } else {
+target_ulong limit = rmls_limit(cpu);
+
 /* Emulated old-style RMO mode, bounds check against RMLS */
-if (raddr >= env->rmls) {
+if (raddr >= limit) {
 if (rwx == 2) {
 ppc_hash64_set_isi(cs, SRR1_PROTFAULT);
 } else {
@@ -1007,8 +1009,9 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 return -1;
 }
 } else {
+target_ulong limit = rmls_limit(cpu);
 /* Emulated old-style RMO mode, bounds check against RMLS */
-if (raddr >= env->rmls) {
+if (raddr >= limit) {
 return -1;
 }
 return raddr | env->spr[SPR_RMOR];
@@ -1098,7 +1101,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-env->rmls = rmls_limit(cpu);
 ppc_hash64_update_vrma(cpu);
 }
 
-- 
2.24.1




[PATCH v4 12/12] target/ppc: Don't store VRMA SLBE persistently

2020-02-18 Thread David Gibson
Currently, we construct the SLBE used for VRMA translations when the LPCR
is written (which controls some bits in the SLBE), then use it later for
translations.

This is a bit complex and confusing - simplify it by simply constructing
the SLBE directly from the LPCR when we need it.

Signed-off-by: David Gibson 
---
 target/ppc/cpu.h|  3 ---
 target/ppc/mmu-hash64.c | 28 ++--
 2 files changed, 6 insertions(+), 25 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index f9871b1233..5a55fb02bd 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1044,9 +1044,6 @@ struct CPUPPCState {
 uint32_t flags;
 uint64_t insns_flags;
 uint64_t insns_flags2;
-#if defined(TARGET_PPC64)
-ppc_slb_t vrma_slb;
-#endif
 
 int error_code;
 uint32_t pending_interrupts;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 5ce7cc8359..7e6f4f62cb 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -829,6 +829,7 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
 {
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
+ppc_slb_t vrma_slbe;
 ppc_slb_t *slb;
 unsigned apshift;
 hwaddr ptex;
@@ -867,8 +868,8 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
 }
 } else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
-slb = &env->vrma_slb;
-if (!slb->sps) {
+slb = &vrma_slbe;
+if (build_vrma_slbe(cpu, slb) != 0) {
 /* Invalid VRMA setup, machine check */
 cs->exception_index = POWERPC_EXCP_MCHECK;
 env->error_code = 0;
@@ -1016,6 +1017,7 @@ skip_slb_search:
 hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr)
 {
 CPUPPCState *env = &cpu->env;
+ppc_slb_t vrma_slbe;
 ppc_slb_t *slb;
 hwaddr ptex, raddr;
 ppc_hash_pte64_t pte;
@@ -1037,8 +1039,8 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 return raddr | env->spr[SPR_HRMOR];
 } else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
-slb = &env->vrma_slb;
-if (!slb->sps) {
+slb = &vrma_slbe;
+if (build_vrma_slbe(cpu, slb) != 0) {
 return -1;
 }
 } else {
@@ -1076,30 +1078,12 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, 
target_ulong ptex,
 cpu->env.tlb_need_flush = TLB_NEED_GLOBAL_FLUSH | TLB_NEED_LOCAL_FLUSH;
 }
 
-static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
-{
-CPUPPCState *env = &cpu->env;
-ppc_slb_t *slb = &env->vrma_slb;
-
-/* Is VRMA enabled ? */
-if (ppc_hash64_use_vrma(env)) {
-if (build_vrma_slbe(cpu, slb) == 0) {
-return;
-}
-}
-
-/* Otherwise, clear it to indicate error */
-slb->esid = slb->vsid = 0;
-slb->sps = NULL;
-}
-
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 {
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-ppc_hash64_update_vrma(cpu);
 }
 
 void helper_store_lpcr(CPUPPCState *env, target_ulong val)
-- 
2.24.1
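The pattern this patch adopts - computing a short-lived descriptor at the point of use instead of caching it in persistent CPU state - can be sketched in isolation. The following fragment is not the real QEMU code: `toy_slbe` and `toy_build_vrma_slbe()` are simplified stand-ins, and only the stack-allocation and the success/failure contract mirror the patch.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Simplified stand-in for ppc_slb_t: just enough fields to show the shape. */
typedef struct {
    uint64_t esid;
    uint64_t vsid;
    const int *sps;              /* page-size info; NULL would mean invalid */
} toy_slbe;

static const int toy_sps_4k = 12;

/*
 * Build the VRMA-style entry from current (simulated) register state,
 * returning 0 on success or -1 if the setup is invalid.
 */
static int toy_build_vrma_slbe(int lpcr_valid, toy_slbe *slb)
{
    if (!lpcr_valid) {
        return -1;
    }
    slb->esid = 0;
    slb->vsid = 1;
    slb->sps = &toy_sps_4k;
    return 0;
}

/* Use site: the entry lives on the stack for exactly one translation. */
int toy_translate(int lpcr_valid)
{
    toy_slbe vrma_slbe;

    if (toy_build_vrma_slbe(lpcr_valid, &vrma_slbe) != 0) {
        return -1;               /* machine check in the real code */
    }
    return 0;                    /* would proceed with the hash lookup */
}
```

The design choice is the same trade-off the commit message describes: a tiny amount of recomputation per use, in exchange for removing cached state that had to be kept in sync with the LPCR.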




[PATCH v4 06/12] target/ppc: Remove RMOR register from POWER9 & POWER10

2020-02-18 Thread David Gibson
Currently we create the Real Mode Offset Register (RMOR) on all Book3S cpus
from POWER7 onwards.  However, the translation mode that the RMOR controls
is no longer supported in POWER9, and so the register has been removed from
the architecture.

Remove it from our model on POWER9 and POWER10.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/translate_init.inc.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index ab79975fec..925bc31ca5 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8015,12 +8015,16 @@ static void gen_spr_book3s_ids(CPUPPCState *env)
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
  0x00000000);
-spr_register_hv(env, SPR_RMOR, "RMOR",
+spr_register_hv(env, SPR_HRMOR, "HRMOR",
  SPR_NOACCESS, SPR_NOACCESS,
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
  0x00000000);
-spr_register_hv(env, SPR_HRMOR, "HRMOR",
+}
+
+static void gen_spr_rmor(CPUPPCState *env)
+{
+spr_register_hv(env, SPR_RMOR, "RMOR",
  SPR_NOACCESS, SPR_NOACCESS,
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
  0x00000000);
 }
@@ -8535,6 +8539,7 @@ static void init_proc_POWER7(CPUPPCState *env)
 
 /* POWER7 Specific Registers */
 gen_spr_book3s_ids(env);
+gen_spr_rmor(env);
 gen_spr_amr(env);
 gen_spr_book3s_purr(env);
 gen_spr_power5p_common(env);
@@ -8676,6 +8681,7 @@ static void init_proc_POWER8(CPUPPCState *env)
 
 /* POWER8 Specific Registers */
 gen_spr_book3s_ids(env);
+gen_spr_rmor(env);
 gen_spr_amr(env);
 gen_spr_iamr(env);
 gen_spr_book3s_purr(env);
-- 
2.24.1




[PATCH v4 07/12] target/ppc: Use class fields to simplify LPCR masking

2020-02-18 Thread David Gibson
When we store the Logical Partitioning Control Register (LPCR) we have a
big switch statement to work out which bits are valid for the cpu model
we're emulating.

As well as being ugly, this isn't really conceptually correct, since it is
based on the mmu_model variable, whereas the LPCR isn't (only) about the
MMU, so mmu_model is basically just acting as a proxy for the cpu model.

Handle this in a simpler way, by adding a suitable lpcr_mask to the QOM
class.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/cpu-qom.h|  1 +
 target/ppc/mmu-hash64.c | 37 ++---
 target/ppc/translate_init.inc.c | 27 
 3 files changed, 26 insertions(+), 39 deletions(-)

diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index e499575dc8..15d6b54a7d 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -177,6 +177,7 @@ typedef struct PowerPCCPUClass {
 uint64_t insns_flags;
 uint64_t insns_flags2;
 uint64_t msr_mask;
+uint64_t lpcr_mask; /* Available bits in the LPCR */
 uint64_t lpcr_pm;   /* Power-saving mode Exit Cause Enable bits */
 powerpc_mmu_t   mmu_model;
 powerpc_excp_t  excp_model;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 2d54f065d9..8acd1f78ae 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1095,43 +1095,10 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 {
+PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 CPUPPCState *env = &cpu->env;
-uint64_t lpcr = 0;
 
-/* Filter out bits */
-switch (env->mmu_model) {
-case POWERPC_MMU_2_03: /* P5p */
-lpcr = val & (LPCR_RMLS | LPCR_ILE |
-  LPCR_LPES0 | LPCR_LPES1 |
-  LPCR_RMI | LPCR_HDICE);
-break;
-case POWERPC_MMU_2_06: /* P7 */
-lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_DPFD |
-  LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
-  LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2 |
-  LPCR_MER | LPCR_TC |
-  LPCR_LPES0 | LPCR_LPES1 | LPCR_HDICE);
-break;
-case POWERPC_MMU_2_07: /* P8 */
-lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_KBV |
-  LPCR_DPFD | LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
-  LPCR_AIL | LPCR_ONL | LPCR_P8_PECE0 | LPCR_P8_PECE1 |
-  LPCR_P8_PECE2 | LPCR_P8_PECE3 | LPCR_P8_PECE4 |
-  LPCR_MER | LPCR_TC | LPCR_LPES0 | LPCR_HDICE);
-break;
-case POWERPC_MMU_3_00: /* P9 */
-lpcr = val & (LPCR_VPM1 | LPCR_ISL | LPCR_KBV | LPCR_DPFD |
-  (LPCR_PECE_U_MASK & LPCR_HVEE) | LPCR_ILE | LPCR_AIL |
-  LPCR_UPRT | LPCR_EVIRT | LPCR_ONL | LPCR_HR | LPCR_LD |
-  (LPCR_PECE_L_MASK & (LPCR_PDEE | LPCR_HDEE | LPCR_EEE |
-  LPCR_DEE | LPCR_OEE)) | LPCR_MER | LPCR_GTSE | LPCR_TC |
-  LPCR_HEIC | LPCR_LPES0 | LPCR_HVICE | LPCR_HDICE);
-break;
-default:
-g_assert_not_reached();
-;
-}
-env->spr[SPR_LPCR] = lpcr;
+env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
 ppc_hash64_update_rmls(cpu);
 ppc_hash64_update_vrma(cpu);
 }
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 925bc31ca5..5b7a5226e1 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8476,6 +8476,8 @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data)
 (1ull << MSR_DR) |
 (1ull << MSR_PMM) |
 (1ull << MSR_RI);
+pcc->lpcr_mask = LPCR_RMLS | LPCR_ILE | LPCR_LPES0 | LPCR_LPES1 |
+LPCR_RMI | LPCR_HDICE;
 pcc->mmu_model = POWERPC_MMU_2_03;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
@@ -8653,6 +8655,12 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 (1ull << MSR_PMM) |
 (1ull << MSR_RI) |
 (1ull << MSR_LE);
+pcc->lpcr_mask = LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_DPFD |
+LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
+LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2 |
+LPCR_MER | LPCR_TC |
+LPCR_LPES0 | LPCR_LPES1 | LPCR_HDICE;
+pcc->lpcr_pm = LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2;
 pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
@@ -8669,7 +8677,6 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 pcc->l1_dcache_size = 0x8000;
 pcc->l1_icache_size = 0x8000;
 pcc->interrupts_big_endian = ppc_cpu_interrupts_big_endian_lpcr;
-pcc->lpcr_pm = LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2;
 }
 
 static void init_proc_POWER8(CPUPPCState *env)
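After this patch the LPCR store reduces to a single mask operation against a per-class field. A toy rendition of that pattern, runnable on its own (the bit positions below are made up for illustration; only the filtering shape matches the patch):

```c
#include <stdint.h>
#include <assert.h>

/* Illustrative LPCR bits - not the real PowerPC encodings. */
#define TOY_LPCR_ILE   (1ull << 0)
#define TOY_LPCR_AIL   (1ull << 1)
#define TOY_LPCR_HDICE (1ull << 2)

typedef struct {
    uint64_t lpcr_mask;          /* bits this CPU model implements */
} toy_cpu_class;

/* Writes to the LPCR silently drop bits the model doesn't implement. */
uint64_t toy_store_lpcr(const toy_cpu_class *pcc, uint64_t val)
{
    return val & pcc->lpcr_mask;
}
```

Moving the knowledge of "which bits exist" into class data also matches how the file already handles `msr_mask`, so each CPU family declares its capabilities in one place.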

[PATCH v4 02/12] ppc: Remove stub of PPC970 HID4 implementation

2020-02-18 Thread David Gibson
The PowerPC 970 CPU was a cut-down POWER4, which had hypervisor capability.
However, it can be (and often was) strapped into "Apple mode", where the
hypervisor capabilities were disabled (essentially putting it always in
hypervisor mode).

That's actually the only mode of the 970 we support in qemu, and we're
unlikely to change that any time soon.  However, we do have a partial
implementation of the 970's HID4 register which affects things only
relevant for hypervisor mode.

That stub is also really ugly, since it attempts to duplicate the effects
of HID4 by re-encoding it into the LPCR register used in newer CPUs, but
in a really confusing way.

Just get rid of it.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Greg Kurz 
---
 target/ppc/mmu-hash64.c | 28 +---
 target/ppc/translate_init.inc.c | 20 
 2 files changed, 9 insertions(+), 39 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index da8966ccf5..a881876647 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1091,33 +1091,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 
 /* Filter out bits */
 switch (env->mmu_model) {
-case POWERPC_MMU_64B: /* 970 */
-if (val & 0x40) {
-lpcr |= LPCR_LPES0;
-}
-if (val & 0x8000000000000000ull) {
-lpcr |= LPCR_LPES1;
-}
-if (val & 0x20) {
-lpcr |= (0x4ull << LPCR_RMLS_SHIFT);
-}
-if (val & 0x4000000000000000ull) {
-lpcr |= (0x2ull << LPCR_RMLS_SHIFT);
-}
-if (val & 0x2000000000000000ull) {
-lpcr |= (0x1ull << LPCR_RMLS_SHIFT);
-}
-env->spr[SPR_RMOR] = ((lpcr >> 41) & 0xfffull) << 26;
-
-/*
- * XXX We could also write LPID from HID4 here
- * but since we don't tag any translation on it
- * it doesn't actually matter
- *
- * XXX For proper emulation of 970 we also need
- * to dig HRMOR out of HID5
- */
-break;
 case POWERPC_MMU_2_03: /* P5p */
 lpcr = val & (LPCR_RMLS | LPCR_ILE |
   LPCR_LPES0 | LPCR_LPES1 |
@@ -1154,6 +1127,7 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 }
 break;
 default:
+g_assert_not_reached();
 ;
 }
 env->spr[SPR_LPCR] = lpcr;
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index a0d0eaabf2..ab79975fec 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -7895,25 +7895,21 @@ static void spr_write_lpcr(DisasContext *ctx, int sprn, 
int gprn)
 {
 gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
 }
-
-static void spr_write_970_hid4(DisasContext *ctx, int sprn, int gprn)
-{
-#if defined(TARGET_PPC64)
-spr_write_generic(ctx, sprn, gprn);
-gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
-#endif
-}
-
 #endif /* !defined(CONFIG_USER_ONLY) */
 
 static void gen_spr_970_lpar(CPUPPCState *env)
 {
 #if !defined(CONFIG_USER_ONLY)
-/* Logical partitionning */
-/* PPC970: HID4 is effectively the LPCR */
+/*
+ * PPC970: HID4 covers things later controlled by the LPCR and
+ * RMOR in later CPUs, but with a different encoding.  We only
+ * support the 970 in "Apple mode" which has all hypervisor
+ * facilities disabled by strapping, so we can basically just
+ * ignore it
+ */
 spr_register(env, SPR_970_HID4, "HID4",
  SPR_NOACCESS, SPR_NOACCESS,
- &spr_read_generic, &spr_write_970_hid4,
+ &spr_read_generic, &spr_write_generic,
  0x00000000);
 #endif
 }
-- 
2.24.1




[PATCH v4 04/12] target/ppc: Introduce ppc_hash64_use_vrma() helper

2020-02-18 Thread David Gibson
When running guests under a hypervisor, the hypervisor obviously needs to
be protected from guest accesses even if those are in what the guest
considers real mode (translation off).  The POWER hardware provides two
ways of doing that: The old way has guest real mode accesses simply offset
and bounds checked into host addresses.  It works, but requires that a
significant chunk of the guest's memory - the RMA - be physically
contiguous in the host, which is pretty inconvenient.  The new way, known
as VRMA, has guest real mode accesses translated in roughly the normal way
but with some special parameters.

In POWER7 and POWER8 the LPCR[VPM0] bit selected between the two modes, but
in POWER9 only VRMA mode is supported and LPCR[VPM0] no longer exists.  We
handle that difference in behaviour in ppc_hash64_set_isi(), but not in
other places where we blindly check LPCR[VPM0].

Correct those instances with a new helper to tell if we should be in VRMA
mode.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 43 -
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 5fabd93c92..6b3c214879 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -668,6 +668,21 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
 return 0;
 }
 
+static bool ppc_hash64_use_vrma(CPUPPCState *env)
+{
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+/*
+ * ISAv3.0 (POWER9) always uses VRMA, the VPM0 field and RMOR
+ * register no longer exist
+ */
+return true;
+
+default:
+return !!(env->spr[SPR_LPCR] & LPCR_VPM0);
+}
+}
+
 static void ppc_hash64_set_isi(CPUState *cs, uint64_t error_code)
 {
CPUPPCState *env = &POWERPC_CPU(cs)->env;
@@ -676,15 +691,7 @@ static void ppc_hash64_set_isi(CPUState *cs, uint64_t 
error_code)
 if (msr_ir) {
 vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
 } else {
-switch (env->mmu_model) {
-case POWERPC_MMU_3_00:
-/* Field deprecated in ISAv3.00 - interrupts always go to hyperv */
-vpm = true;
-break;
-default:
-vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
-break;
-}
+vpm = ppc_hash64_use_vrma(env);
 }
 if (vpm && !msr_hv) {
 cs->exception_index = POWERPC_EXCP_HISI;
@@ -702,15 +709,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, uint64_t dar, 
uint64_t dsisr)
 if (msr_dr) {
 vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
 } else {
-switch (env->mmu_model) {
-case POWERPC_MMU_3_00:
-/* Field deprecated in ISAv3.00 - interrupts always go to hyperv */
-vpm = true;
-break;
-default:
-vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
-break;
-}
+vpm = ppc_hash64_use_vrma(env);
 }
 if (vpm && !msr_hv) {
 cs->exception_index = POWERPC_EXCP_HDSI;
@@ -799,7 +798,7 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
 if (!(eaddr >> 63)) {
 raddr |= env->spr[SPR_HRMOR];
 }
-} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
 slb = &env->vrma_slb;
 if (!slb->sps) {
@@ -967,7 +966,7 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 } else if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
 /* In HV mode, add HRMOR if top EA bit is clear */
 return raddr | env->spr[SPR_HRMOR];
-} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
 slb = &env->vrma_slb;
 if (!slb->sps) {
@@ -1056,8 +1055,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 slb->sps = NULL;
 
 /* Is VRMA enabled ? */
-lpcr = env->spr[SPR_LPCR];
-if (!(lpcr & LPCR_VPM0)) {
+if (ppc_hash64_use_vrma(env)) {
 return;
 }
 
@@ -1065,6 +1063,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
  * Make one up. Mostly ignore the ESID which will not be needed
  * for translation
  */
+lpcr = env->spr[SPR_LPCR];
 vsid = SLB_VSID_VRMA;
 vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
 vsid |= (vrmasd << 4) & (SLB_VSID_L | SLB_VSID_LP);
-- 
2.24.1
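The helper's decision logic is small enough to model stand-alone. The fragment below mirrors `ppc_hash64_use_vrma()` from the patch, but with toy constants in place of QEMU's `powerpc_mmu_t` enum and SPR array, so the bit position is illustrative only:

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

enum toy_mmu_model { TOY_MMU_2_06, TOY_MMU_2_07, TOY_MMU_3_00 };

#define TOY_LPCR_VPM0 (1ull << 1)   /* illustrative bit position */

/*
 * POWER9 (ISAv3.0) always uses VRMA - the VPM0 field no longer exists -
 * while earlier hash-MMU CPUs consult LPCR[VPM0].
 */
bool toy_use_vrma(enum toy_mmu_model model, uint64_t lpcr)
{
    switch (model) {
    case TOY_MMU_3_00:
        return true;
    default:
        return (lpcr & TOY_LPCR_VPM0) != 0;
    }
}
```

Centralising the check is what lets the later patches in the series drop the duplicated `switch (env->mmu_model)` blocks at each call site.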




[PATCH v4 08/12] target/ppc: Streamline calculation of RMA limit from LPCR[RMLS]

2020-02-18 Thread David Gibson
Currently we use a big switch statement in ppc_hash64_update_rmls() to work
out what the right RMA limit is based on the LPCR[RMLS] field.  There's no
formula for this - it's just an arbitrary mapping defined by the existing
CPU implementations - but we can make it a bit more readable by using a
lookup table rather than a switch.  In addition we can use the MiB/GiB
symbols to make it a bit clearer.

While we're there, add a bit of clarity and rationale to the comment about
what happens if LPCR[RMLS] doesn't contain a valid value.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 71 -
 1 file changed, 35 insertions(+), 36 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 8acd1f78ae..4e6c1f722b 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -18,6 +18,7 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 #include "qemu/osdep.h"
+#include "qemu/units.h"
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
@@ -757,6 +758,39 @@ static void ppc_hash64_set_c(PowerPCCPU *cpu, hwaddr ptex, 
uint64_t pte1)
 stb_phys(CPU(cpu)->as, base + offset, (pte1 & 0xff) | 0x80);
 }
 
+static target_ulong rmls_limit(PowerPCCPU *cpu)
+{
+CPUPPCState *env = &cpu->env;
+/*
+ * This is the full 4 bits encoding of POWER8. Previous
+ * CPUs only support a subset of these but the filtering
+ * is done when writing LPCR
+ */
+const target_ulong rma_sizes[] = {
+[0] = 0,
+[1] = 16 * GiB,
+[2] = 1 * GiB,
+[3] = 64 * MiB,
+[4] = 256 * MiB,
+[5] = 0,
+[6] = 0,
+[7] = 128 * MiB,
+[8] = 32 * MiB,
+};
+target_ulong rmls = (env->spr[SPR_LPCR] & LPCR_RMLS) >> LPCR_RMLS_SHIFT;
+
+if (rmls < ARRAY_SIZE(rma_sizes)) {
+return rma_sizes[rmls];
+} else {
+/*
+ * Bad value, so the OS has shot itself in the foot.  Return a
+ * 0-sized RMA which we expect to trigger an immediate DSI or
+ * ISI
+ */
+return 0;
+}
+}
+
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 int rwx, int mmu_idx)
 {
@@ -1006,41 +1040,6 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, 
target_ulong ptex,
 cpu->env.tlb_need_flush = TLB_NEED_GLOBAL_FLUSH | TLB_NEED_LOCAL_FLUSH;
 }
 
-static void ppc_hash64_update_rmls(PowerPCCPU *cpu)
-{
-CPUPPCState *env = &cpu->env;
-uint64_t lpcr = env->spr[SPR_LPCR];
-
-/*
- * This is the full 4 bits encoding of POWER8. Previous
- * CPUs only support a subset of these but the filtering
- * is done when writing LPCR
- */
-switch ((lpcr & LPCR_RMLS) >> LPCR_RMLS_SHIFT) {
-case 0x8: /* 32MB */
-env->rmls = 0x2000000ull;
-break;
-case 0x3: /* 64MB */
-env->rmls = 0x4000000ull;
-break;
-case 0x7: /* 128MB */
-env->rmls = 0x8000000ull;
-break;
-case 0x4: /* 256MB */
-env->rmls = 0x10000000ull;
-break;
-case 0x2: /* 1GB */
-env->rmls = 0x40000000ull;
-break;
-case 0x1: /* 16GB */
-env->rmls = 0x400000000ull;
-break;
-default:
-/* What to do here ??? */
-env->rmls = 0;
-}
-}
-
 static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
@@ -1099,7 +1098,7 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-ppc_hash64_update_rmls(cpu);
+env->rmls = rmls_limit(cpu);
 ppc_hash64_update_vrma(cpu);
 }
 
-- 
2.24.1
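The switch-to-table conversion is easy to verify independently. This fragment reimplements the lookup with the same POWER8 encodings the patch's table uses (the `LPCR[RMLS]` field extraction is reduced to a plain 4-bit argument for brevity, and the `TOY_` size macros stand in for QEMU's `qemu/units.h` symbols):

```c
#include <stdint.h>
#include <assert.h>

#define TOY_KIB 1024ull
#define TOY_MIB (1024 * TOY_KIB)
#define TOY_GIB (1024 * TOY_MIB)

/* RMA sizes keyed by the 4-bit LPCR[RMLS] field, POWER8 encoding. */
static const uint64_t toy_rma_sizes[] = {
    [1] = 16 * TOY_GIB,
    [2] = 1 * TOY_GIB,
    [3] = 64 * TOY_MIB,
    [4] = 256 * TOY_MIB,
    [7] = 128 * TOY_MIB,
    [8] = 32 * TOY_MIB,
};

/* Unlisted values yield a 0-sized RMA, provoking an immediate DSI/ISI. */
uint64_t toy_rmls_limit(unsigned rmls)
{
    if (rmls < sizeof(toy_rma_sizes) / sizeof(toy_rma_sizes[0])) {
        return toy_rma_sizes[rmls];
    }
    return 0;
}
```

Designated initializers leave the unnamed slots (0, 5, 6) zero, which is exactly the "bad value" behaviour the comment in the patch describes, so the table needs no explicit default case.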




[PATCH v4 05/12] spapr, ppc: Remove VPM0/RMLS hacks for POWER9

2020-02-18 Thread David Gibson
For the "pseries" machine, we use "virtual hypervisor" mode where we
only model the CPU in non-hypervisor privileged mode.  This means that
we need guest physical addresses within the modelled cpu to be treated
as absolute physical addresses.

We used to do that by clearing LPCR[VPM0] and setting LPCR[RMLS] to a high
limit so that the old offset-based translation for guest mode applied,
which does what we need.  However, POWER9 has removed support for that
translation mode, which meant we had some ugly hacks to keep it working.

We now explicitly handle this sort of translation for virtual hypervisor
mode, so the hacks aren't necessary.  We don't need to set VPM0 and RMLS
from the machine type code - they're now ignored in vhyp mode.  On the cpu
side we don't need to allow LPCR[RMLS] to be set on POWER9 in vhyp mode -
that was only there to allow the hack on the machine side.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 hw/ppc/spapr_cpu_core.c | 6 +-
 target/ppc/mmu-hash64.c | 8 
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index d09125d9af..ea5e11f1d9 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -58,14 +58,10 @@ static void spapr_reset_vcpu(PowerPCCPU *cpu)
  * we don't get spurious wakups before an RTAS start-cpu call.
  * For the same reason, set PSSCR_EC.
  */
-lpcr &= ~(LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_KBV | pcc->lpcr_pm);
+lpcr &= ~(LPCR_VPM1 | LPCR_ISL | LPCR_KBV | pcc->lpcr_pm);
 lpcr |= LPCR_LPES0 | LPCR_LPES1;
 env->spr[SPR_PSSCR] |= PSSCR_EC;
 
-/* Set RMLS to the max (ie, 16G) */
-lpcr &= ~LPCR_RMLS;
-lpcr |= 1ull << LPCR_RMLS_SHIFT;
-
 ppc_store_lpcr(cpu, lpcr);
 
 /* Set a full AMOR so guest can use the AMR as it sees fit */
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 6b3c214879..2d54f065d9 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1126,14 +1126,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
   (LPCR_PECE_L_MASK & (LPCR_PDEE | LPCR_HDEE | LPCR_EEE |
   LPCR_DEE | LPCR_OEE)) | LPCR_MER | LPCR_GTSE | LPCR_TC |
   LPCR_HEIC | LPCR_LPES0 | LPCR_HVICE | LPCR_HDICE);
-/*
- * If we have a virtual hypervisor, we need to bring back RMLS. It
- * doesn't exist on an actual P9 but that's all we know how to
- * configure with softmmu at the moment
- */
-if (cpu->vhyp) {
-lpcr |= (val & LPCR_RMLS);
-}
 break;
 default:
 g_assert_not_reached();
-- 
2.24.1




[PATCH v4 01/12] ppc: Remove stub support for 32-bit hypervisor mode

2020-02-18 Thread David Gibson
Commit a4f30719a8cd, way back in 2007, noted that "PowerPC hypervisor mode is not
fundamentally available only for PowerPC 64" and added a 32-bit version
of the MSR[HV] bit.

But nothing was ever really done with that; there is no meaningful support
for 32-bit hypervisor mode 13 years later.  Let's stop pretending and just
remove the stubs.

Signed-off-by: David Gibson 
---
 target/ppc/cpu.h| 21 +++--
 target/ppc/translate_init.inc.c |  6 +++---
 2 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index b283042515..8077fdb068 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -24,8 +24,6 @@
 #include "exec/cpu-defs.h"
 #include "cpu-qom.h"
 
-/* #define PPC_EMULATE_32BITS_HYPV */
-
 #define TCG_GUEST_DEFAULT_MO 0
 
 #define TARGET_PAGE_BITS_64K 16
@@ -300,13 +298,12 @@ typedef struct ppc_v3_pate_t {
 #define MSR_SF   63 /* Sixty-four-bit modehflags */
 #define MSR_TAG  62 /* Tag-active mode (POWERx ?)*/
 #define MSR_ISF  61 /* Sixty-four-bit interrupt mode on 630  */
-#define MSR_SHV  60 /* hypervisor state   hflags */
+#define MSR_HV   60 /* hypervisor state   hflags */
 #define MSR_TS0  34 /* Transactional state, 2 bits (Book3s)  */
 #define MSR_TS1  33
 #define MSR_TM   32 /* Transactional Memory Available (Book3s)   */
 #define MSR_CM   31 /* Computation mode for BookE hflags */
 #define MSR_ICM  30 /* Interrupt computation mode for BookE  */
-#define MSR_THV  29 /* hypervisor state for 32 bits PowerPC   hflags */
 #define MSR_GS   28 /* guest state for BookE */
 #define MSR_UCLE 26 /* User-mode cache lock enable for BookE */
 #define MSR_VR   25 /* altivec availablex hflags */
@@ -401,10 +398,13 @@ typedef struct ppc_v3_pate_t {
 
 #define msr_sf   ((env->msr >> MSR_SF)   & 1)
 #define msr_isf  ((env->msr >> MSR_ISF)  & 1)
-#define msr_shv  ((env->msr >> MSR_SHV)  & 1)
+#if defined(TARGET_PPC64)
+#define msr_hv   ((env->msr >> MSR_HV)   & 1)
+#else
+#define msr_hv   (0)
+#endif
 #define msr_cm   ((env->msr >> MSR_CM)   & 1)
 #define msr_icm  ((env->msr >> MSR_ICM)  & 1)
-#define msr_thv  ((env->msr >> MSR_THV)  & 1)
 #define msr_gs   ((env->msr >> MSR_GS)   & 1)
 #define msr_ucle ((env->msr >> MSR_UCLE) & 1)
 #define msr_vr   ((env->msr >> MSR_VR)   & 1)
@@ -449,16 +449,9 @@ typedef struct ppc_v3_pate_t {
 
 /* Hypervisor bit is more specific */
 #if defined(TARGET_PPC64)
-#define MSR_HVB (1ULL << MSR_SHV)
-#define msr_hv  msr_shv
-#else
-#if defined(PPC_EMULATE_32BITS_HYPV)
-#define MSR_HVB (1ULL << MSR_THV)
-#define msr_hv  msr_thv
+#define MSR_HVB (1ULL << MSR_HV)
 #else
 #define MSR_HVB (0ULL)
-#define msr_hv  (0)
-#endif
 #endif
 
 /* DSISR */
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 53995f62ea..a0d0eaabf2 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8804,7 +8804,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_PM_ISA206;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
@@ -9017,7 +9017,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
@@ -9228,7 +9228,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
-- 
2.24.1
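What remains after this cleanup is a single 64-bit MSR[HV] bit at position 60, with the 32-bit variant gone. A standalone sketch of the extraction, following the file's `msr_*` macro convention but parameterised here for testability (`TOY_` names are stand-ins, not the QEMU identifiers):

```c
#include <stdint.h>
#include <assert.h>

#define TOY_MSR_HV  60                    /* hypervisor state, 64-bit only */
#define TOY_MSR_HVB (1ull << TOY_MSR_HV)  /* mask form, cf. MSR_HVB */

/* Extract MSR[HV] the way the msr_hv macro does on TARGET_PPC64. */
static inline int toy_msr_hv(uint64_t msr)
{
    return (msr >> TOY_MSR_HV) & 1;
}
```

On 32-bit targets the patch makes `msr_hv` a constant 0 and `MSR_HVB` a zero mask, so hypervisor checks compile away entirely rather than routing through the removed MSR_THV stub.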




[PATCH v4 03/12] target/ppc: Correct handling of real mode accesses with vhyp on hash MMU

2020-02-18 Thread David Gibson
On ppc we have the concept of virtual hypervisor ("vhyp") mode, where we
only model the non-hypervisor-privileged parts of the cpu.  Essentially we
model the hypervisor's behaviour from the point of view of a guest OS, but
we don't model the hypervisor's execution.

In particular, in this mode, qemu's notion of target physical address is
a guest physical address from the vcpu's point of view.  So accesses in
guest real mode don't require translation.  If we were modelling the
hypervisor mode, we'd need to translate the guest physical address into
a host physical address.

Currently, we handle this sloppily: we rely on setting up the virtual LPCR
and RMOR registers so that GPAs are simply HPAs plus an offset, which we
set to zero.  This is already conceptually dubious, since the LPCR and RMOR
registers don't exist in the non-hypervisor portion of the CPU.  It gets
worse with POWER9, where RMOR and LPCR[VPM0] no longer exist at all.

Clean this up by explicitly handling the vhyp case.  While we're there,
remove some unnecessary nesting of if statements that made the logic to
select the correct real mode behaviour a bit less clear than it could be.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 60 -
 1 file changed, 35 insertions(+), 25 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index a881876647..5fabd93c92 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -789,27 +789,30 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
  */
 raddr = eaddr & 0x0FFFFFFFFFFFFFFFULL;
 
-/* In HV mode, add HRMOR if top EA bit is clear */
-if (msr_hv || !env->has_hv_mode) {
+if (cpu->vhyp) {
+/*
+ * In virtual hypervisor mode, there's nothing to do:
+ *   EA == GPA == qemu guest address
+ */
+} else if (msr_hv || !env->has_hv_mode) {
+/* In HV mode, add HRMOR if top EA bit is clear */
 if (!(eaddr >> 63)) {
 raddr |= env->spr[SPR_HRMOR];
 }
-} else {
-/* Otherwise, check VPM for RMA vs VRMA */
-if (env->spr[SPR_LPCR] & LPCR_VPM0) {
-slb = &env->vrma_slb;
-if (slb->sps) {
-goto skip_slb_search;
-}
-/* Not much else to do here */
+} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+/* Emulated VRMA mode */
+slb = &env->vrma_slb;
+if (!slb->sps) {
+/* Invalid VRMA setup, machine check */
 cs->exception_index = POWERPC_EXCP_MCHECK;
 env->error_code = 0;
 return 1;
-} else if (raddr < env->rmls) {
-/* RMA. Check bounds in RMLS */
-raddr |= env->spr[SPR_RMOR];
-} else {
-/* The access failed, generate the approriate interrupt */
+}
+
+goto skip_slb_search;
+} else {
+/* Emulated old-style RMO mode, bounds check against RMLS */
+if (raddr >= env->rmls) {
 if (rwx == 2) {
 ppc_hash64_set_isi(cs, SRR1_PROTFAULT);
 } else {
@@ -821,6 +824,8 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
 }
 return 1;
 }
+
+raddr |= env->spr[SPR_RMOR];
 }
 tlb_set_page(cs, eaddr & TARGET_PAGE_MASK, raddr & TARGET_PAGE_MASK,
  PAGE_READ | PAGE_WRITE | PAGE_EXEC, mmu_idx,
@@ -953,22 +958,27 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 /* In real mode the top 4 effective address bits are ignored */
 raddr = addr & 0x0FFFFFFFFFFFFFFFULL;
 
-/* In HV mode, add HRMOR if top EA bit is clear */
-if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
+if (cpu->vhyp) {
+/*
+ * In virtual hypervisor mode, there's nothing to do:
+ *   EA == GPA == qemu guest address
+ */
+return raddr;
+} else if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
+/* In HV mode, add HRMOR if top EA bit is clear */
 return raddr | env->spr[SPR_HRMOR];
-}
-
-/* Otherwise, check VPM for RMA vs VRMA */
-if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+/* Emulated VRMA mode */
 slb = &env->vrma_slb;
 if (!slb->sps) {
 return -1;
 }
-} else if (raddr < env->rmls) {
-/* RMA. Check bounds in RMLS */
-return raddr | env->spr[SPR_RMOR];
 } else {
-return -1;
+/* Emulated old-style RMO mode, bounds check against RMLS */
+if (raddr >= env->rmls) {
+ 

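The decision chain this patch establishes for real-mode accesses - vhyp passthrough, HV-mode HRMOR offset, emulated VRMA, then RMLS-bounds-checked RMO - can be seen in one piece in the following condensed model. All constants, parameters, and the failure convention are simplified for illustration; a VRMA hit, which in QEMU continues into the hash table walk, is abbreviated to returning the masked address.

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

#define TOY_ADDR_FAIL ((uint64_t)-1)

/*
 * Resolve a real-mode effective address in the order the reworked
 * ppc_hash64_get_phys_page_debug() checks its cases.  vrma_ok stands in
 * for "the VRMA SLBE is valid".
 */
uint64_t toy_real_mode_xlate(bool vhyp, bool hv, bool vrma, bool vrma_ok,
                             uint64_t eaddr, uint64_t hrmor,
                             uint64_t rmor, uint64_t rmls)
{
    uint64_t raddr = eaddr & 0x0FFFFFFFFFFFFFFFull; /* top 4 bits ignored */

    if (vhyp) {
        return raddr;                  /* EA == GPA == qemu guest address */
    } else if (hv && !(eaddr >> 63)) {
        return raddr | hrmor;          /* HV mode: apply HRMOR */
    } else if (vrma) {
        return vrma_ok ? raddr : TOY_ADDR_FAIL; /* would use the VRMA SLBE */
    } else if (raddr >= rmls) {
        return TOY_ADDR_FAIL;          /* outside the RMA: fault */
    }
    return raddr | rmor;               /* old-style RMO offset */
}
```

Note the ordering matters: the vhyp case must win over everything else, and the RMLS bounds check applies only when neither HV mode nor VRMA handled the access, which is exactly the flattening the patch performs on the previously nested ifs.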
[PATCH v4 09/12] target/ppc: Correct RMLS table

2020-02-18 Thread David Gibson
The table of RMA limits based on the LPCR[RMLS] field is slightly wrong.
We're missing the RMLS == 0 => 256 GiB RMA option, which is available on
POWER8, so add that.

The comment that goes with the table is much more wrong.  We *don't* filter
invalid RMLS values when writing the LPCR, and there's not really a
sensible way to do so.  Furthermore, while in theory the set of RMLS values
is implementation dependent, it seems in practice the same set has been
available since around POWER4+ up until POWER8, the last model which
supports RMLS at all.  So, correct that as well.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 4e6c1f722b..46690bc79b 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -762,12 +762,12 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
 /*
- * This is the full 4 bits encoding of POWER8. Previous
- * CPUs only support a subset of these but the filtering
- * is done when writing LPCR
+ * In theory the meanings of RMLS values are implementation
+ * dependent.  In practice, this seems to have been the set from
+ * POWER4+..POWER8, and RMLS is no longer supported in POWER9.
  */
 const target_ulong rma_sizes[] = {
-[0] = 0,
+[0] = 256 * GiB,
 [1] = 16 * GiB,
 [2] = 1 * GiB,
 [3] = 64 * MiB,
-- 
2.24.1




Re: [PATCH v3 00/12] target/ppc: Correct some errors with real mode handling

2020-02-18 Thread David Gibson
On Wed, Feb 19, 2020 at 11:54:02AM +1100, David Gibson wrote:
> POWER "book S" (server class) cpus have a concept of "real mode" where
> MMU translation is disabled... sort of.  In fact this can mean a bunch
> of slightly different things when hypervisor mode and other
> considerations are present.
> 
> We had some errors in edge cases here, so clean some things up and
> correct them.

Duh.  Forgot to run checkpatch, new version coming shortly.

> 
> Changes since v2:
>  * Removed 32-bit hypervisor stubs more completely
>  * Minor polish based on review comments
> Changes since RFCv1:
>  * Add a number of extra patches taking advantage of the initial
>cleanups
> 
> David Gibson (12):
>   ppc: Remove stub support for 32-bit hypervisor mode
>   ppc: Remove stub of PPC970 HID4 implementation
>   target/ppc: Correct handling of real mode accesses with vhyp on hash
> MMU
>   target/ppc: Introduce ppc_hash64_use_vrma() helper
>   spapr, ppc: Remove VPM0/RMLS hacks for POWER9
>   target/ppc: Remove RMOR register from POWER9 & POWER10
>   target/ppc: Use class fields to simplify LPCR masking
>   target/ppc: Streamline calculation of RMA limit from LPCR[RMLS]
>   target/ppc: Correct RMLS table
>   target/ppc: Only calculate RMLS derived RMA limit on demand
>   target/ppc: Streamline construction of VRMA SLB entry
>   target/ppc: Don't store VRMA SLBE persistently
> 
>  hw/ppc/spapr_cpu_core.c |   6 +-
>  target/ppc/cpu-qom.h|   1 +
>  target/ppc/cpu.h|  25 +--
>  target/ppc/mmu-hash64.c | 329 
>  target/ppc/translate_init.inc.c |  60 --
>  5 files changed, 175 insertions(+), 246 deletions(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[PATCH v4 00/12] target/ppc: Correct some errors with real mode handling

2020-02-18 Thread David Gibson
POWER "book S" (server class) cpus have a concept of "real mode" where
MMU translation is disabled... sort of.  In fact this can mean a bunch
of slightly different things when hypervisor mode and other
considerations are present.

We had some errors in edge cases here, so clean some things up and
correct them.

Changes since v3:
 * Fix style errors reported by checkpatch
Changes since v2:
 * Removed 32-bit hypervisor stubs more completely
 * Minor polish based on review comments
Changes since RFCv1:
 * Add a number of extra patches taking advantage of the initial
   cleanups

David Gibson (12):
  ppc: Remove stub support for 32-bit hypervisor mode
  ppc: Remove stub of PPC970 HID4 implementation
  target/ppc: Correct handling of real mode accesses with vhyp on hash
MMU
  target/ppc: Introduce ppc_hash64_use_vrma() helper
  spapr, ppc: Remove VPM0/RMLS hacks for POWER9
  target/ppc: Remove RMOR register from POWER9 & POWER10
  target/ppc: Use class fields to simplify LPCR masking
  target/ppc: Streamline calculation of RMA limit from LPCR[RMLS]
  target/ppc: Correct RMLS table
  target/ppc: Only calculate RMLS derived RMA limit on demand
  target/ppc: Streamline construction of VRMA SLB entry
  target/ppc: Don't store VRMA SLBE persistently

 hw/ppc/spapr_cpu_core.c |   6 +-
 target/ppc/cpu-qom.h|   1 +
 target/ppc/cpu.h|  25 +--
 target/ppc/mmu-hash64.c | 331 
 target/ppc/translate_init.inc.c |  63 --
 5 files changed, 179 insertions(+), 247 deletions(-)

-- 
2.24.1




Re: [PATCH v4 00/14] Fixes for DP8393X SONIC device emulation

2020-02-18 Thread Aleksandar Markovic
2:54 AM Wed, 19.02.2020, Aleksandar Markovic
wrote:
>
> 2:06 AM Wed, 19.02.2020, Finn Thain
wrote:
> >
> > On Tue, 18 Feb 2020, Aleksandar Markovic wrote:
> >
> > > On Wednesday, January 29, 2020, Finn Thain 
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > There are bugs in the emulated dp8393x device that can stop packet
> > > > reception in a Linux/m68k guest (q800 machine).
> > > >
> > > > With a Linux/m68k v5.5 guest (q800), it's possible to remotely
trigger
> > > > an Oops by sending ping floods.
> > > >
> > > > With a Linux/mips guest (magnum machine), the driver fails to probe
> > > > the dp8393x device.
> > > >
> > > > With a NetBSD/arc 5.1 guest (magnum), the bugs in the device can be
> > > > fatal to the guest kernel.
> > > >
> > > > Whilst debugging the device, I found that the receiver algorithm
> > > > differs from the one described in the National Semiconductor
> > > > datasheet.
> > > >
> > > > This patch series resolves these bugs.
> > > >
> > > > AFAIK, all bugs in the Linux sonic driver were fixed in Linux v5.5.
> > > > ---
> > >
> > >
> > > Herve,
> > >
> > > Do your Jazz tests pass with these changes?
> > >
> >
> > AFAIK those tests did not expose the NetBSD panic that is caused by
> > mainline QEMU (mentioned above).
> >
> > I have actually run the tests you requested (Hervé described them in an
> > earlier thread). There was no regression. Quite the reverse -- it's no
> > longer possible to remotely crash the NetBSD kernel.
> >
> > Apparently my testing was also the first time that the jazzsonic driver
> > (from the Linux/mips Magnum port) was tested successfully with QEMU. It
> > doesn't work in mainline QEMU.
> >
>
> Well, I apologize if I missed all these facts. I just did not notice
them, at least not in this form. And, yes, some "Tested-by:" by Herve would
be desirable and nice.
>

Or, perhaps, even "Reviewed-by:".

> Yours,
> Aleksandar
>
> > Anyway, more testing is always nice, and I'd certainly welcome an
> > 'acked-by' or 'tested-by' if Hervé would like to send one.
> >
> > Please consider backporting this series of bug fixes to QEMU stable
> > branch(es).
> >
> > Regards,
> > Finn
> >
> > > Regards,
> > > Aleksandar
> > >
> > >
> > >
> > > > Changed since v1:
> > > >  - Minor revisions as described beneath commit logs.
> > > >  - Dropped patches 4/10 and 7/10.
> > > >  - Added 5 new patches.
> > > >
> > > > Changed since v2:
> > > >  - Minor revisions as described beneath commit logs.
> > > >  - Dropped patch 13/13.
> > > >  - Added 2 new patches.
> > > >
> > > > Changed since v3:
> > > >  - Replaced patch 13/14 with patch suggested by Philippe
Mathieu-Daudé.
> > > >
> > > >
> > > > Finn Thain (14):
> > > >   dp8393x: Mask EOL bit from descriptor addresses
> > > >   dp8393x: Always use 32-bit accesses
> > > >   dp8393x: Clean up endianness hacks
> > > >   dp8393x: Have dp8393x_receive() return the packet size
> > > >   dp8393x: Update LLFA and CRDA registers from rx descriptor
> > > >   dp8393x: Clear RRRA command register bit only when appropriate
> > > >   dp8393x: Implement packet size limit and RBAE interrupt
> > > >   dp8393x: Don't clobber packet checksum
> > > >   dp8393x: Use long-word-aligned RRA pointers in 32-bit mode
> > > >   dp8393x: Pad frames to word or long word boundary
> > > >   dp8393x: Clear descriptor in_use field to release packet
> > > >   dp8393x: Always update RRA pointers and sequence numbers
> > > >   dp8393x: Don't reset Silicon Revision register
> > > >   dp8393x: Don't stop reception upon RBE interrupt assertion
> > > >
> > > >  hw/net/dp8393x.c | 202
+++
> > > >  1 file changed, 134 insertions(+), 68 deletions(-)
> > > >
> > > > --
> > > > 2.24.1
> > > >
> > > >
> > > >
> > >


Re: [PATCH v4 00/14] Fixes for DP8393X SONIC device emulation

2020-02-18 Thread Aleksandar Markovic
2:06 AM Wed, 19.02.2020, Finn Thain
wrote:
>
> On Tue, 18 Feb 2020, Aleksandar Markovic wrote:
>
> > On Wednesday, January 29, 2020, Finn Thain 
> > wrote:
> >
> > > Hi All,
> > >
> > > There are bugs in the emulated dp8393x device that can stop packet
> > > reception in a Linux/m68k guest (q800 machine).
> > >
> > > With a Linux/m68k v5.5 guest (q800), it's possible to remotely trigger
> > > an Oops by sending ping floods.
> > >
> > > With a Linux/mips guest (magnum machine), the driver fails to probe
> > > the dp8393x device.
> > >
> > > With a NetBSD/arc 5.1 guest (magnum), the bugs in the device can be
> > > fatal to the guest kernel.
> > >
> > > Whilst debugging the device, I found that the receiver algorithm
> > > differs from the one described in the National Semiconductor
> > > datasheet.
> > >
> > > This patch series resolves these bugs.
> > >
> > > AFAIK, all bugs in the Linux sonic driver were fixed in Linux v5.5.
> > > ---
> >
> >
> > Herve,
> >
> > Do your Jazz tests pass with these changes?
> >
>
> AFAIK those tests did not expose the NetBSD panic that is caused by
> mainline QEMU (mentioned above).
>
> I have actually run the tests you requested (Hervé described them in an
> earlier thread). There was no regression. Quite the reverse -- it's no
> longer possible to remotely crash the NetBSD kernel.
>
> Apparently my testing was also the first time that the jazzsonic driver
> (from the Linux/mips Magnum port) was tested successfully with QEMU. It
> doesn't work in mainline QEMU.
>

Well, I apologize if I missed all these facts. I just did not notice them,
at least not in this form. And, yes, some "Tested-by:" by Herve would be
desirable and nice.

Yours,
Aleksandar

> Anyway, more testing is always nice, and I'd certainly welcome an
> 'acked-by' or 'tested-by' if Hervé would like to send one.
>
> Please consider backporting this series of bug fixes to QEMU stable
> branch(es).
>
> Regards,
> Finn
>
> > Regards,
> > Aleksandar
> >
> >
> >
> > > Changed since v1:
> > >  - Minor revisions as described beneath commit logs.
> > >  - Dropped patches 4/10 and 7/10.
> > >  - Added 5 new patches.
> > >
> > > Changed since v2:
> > >  - Minor revisions as described beneath commit logs.
> > >  - Dropped patch 13/13.
> > >  - Added 2 new patches.
> > >
> > > Changed since v3:
> > >  - Replaced patch 13/14 with patch suggested by Philippe
Mathieu-Daudé.
> > >
> > >
> > > Finn Thain (14):
> > >   dp8393x: Mask EOL bit from descriptor addresses
> > >   dp8393x: Always use 32-bit accesses
> > >   dp8393x: Clean up endianness hacks
> > >   dp8393x: Have dp8393x_receive() return the packet size
> > >   dp8393x: Update LLFA and CRDA registers from rx descriptor
> > >   dp8393x: Clear RRRA command register bit only when appropriate
> > >   dp8393x: Implement packet size limit and RBAE interrupt
> > >   dp8393x: Don't clobber packet checksum
> > >   dp8393x: Use long-word-aligned RRA pointers in 32-bit mode
> > >   dp8393x: Pad frames to word or long word boundary
> > >   dp8393x: Clear descriptor in_use field to release packet
> > >   dp8393x: Always update RRA pointers and sequence numbers
> > >   dp8393x: Don't reset Silicon Revision register
> > >   dp8393x: Don't stop reception upon RBE interrupt assertion
> > >
> > >  hw/net/dp8393x.c | 202
+++
> > >  1 file changed, 134 insertions(+), 68 deletions(-)
> > >
> > > --
> > > 2.24.1
> > >
> > >
> > >
> >


Re: [PATCH v3 00/12] target/ppc: Correct some errors with real mode handling

2020-02-18 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200219005414.15635-1-da...@gibson.dropbear.id.au/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH v3 00/12] target/ppc: Correct some errors with real mode 
handling
Message-id: 20200219005414.15635-1-da...@gibson.dropbear.id.au
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag] patchew/20200219005414.15635-1-da...@gibson.dropbear.id.au 
-> patchew/20200219005414.15635-1-da...@gibson.dropbear.id.au
Switched to a new branch 'test'
275db2f target/ppc: Don't store VRMA SLBE persistently
8f4ef78 target/ppc: Streamline construction of VRMA SLB entry
5329f3b target/ppc: Only calculate RMLS derived RMA limit on demand
656a372 target/ppc: Correct RMLS table
6432e7f target/ppc: Streamline calculation of RMA limit from LPCR[RMLS]
00f78cd target/ppc: Use class fields to simplify LPCR masking
c6f6cea target/ppc: Remove RMOR register from POWER9 & POWER10
c6daae6 spapr, ppc: Remove VPM0/RMLS hacks for POWER9
3374197 target/ppc: Introduce ppc_hash64_use_vrma() helper
7e14e97 target/ppc: Correct handling of real mode accesses with vhyp on hash MMU
7c298cb ppc: Remove stub of PPC970 HID4 implementation
4525879 ppc: Remove stub support for 32-bit hypervisor mode

=== OUTPUT BEGIN ===
1/12 Checking commit 4525879e6fae (ppc: Remove stub support for 32-bit 
hypervisor mode)
2/12 Checking commit 7c298cb58821 (ppc: Remove stub of PPC970 HID4 
implementation)
WARNING: Block comments use a leading /* on a separate line
#98: FILE: target/ppc/translate_init.inc.c:7904:
+/* PPC970: HID4 covers things later controlled by the LPCR and

WARNING: Block comments use a trailing */ on a separate line
#102: FILE: target/ppc/translate_init.inc.c:7908:
+ * ignore it */

total: 0 errors, 2 warnings, 71 lines checked

Patch 2/12 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/12 Checking commit 7e14e97fa725 (target/ppc: Correct handling of real mode 
accesses with vhyp on hash MMU)
4/12 Checking commit 337419739ee8 (target/ppc: Introduce ppc_hash64_use_vrma() 
helper)
WARNING: Block comments use a leading /* on a separate line
#41: FILE: target/ppc/mmu-hash64.c:675:
+/* ISAv3.0 (POWER9) always uses VRMA, the VPM0 field and RMOR

WARNING: Block comments use a trailing */ on a separate line
#42: FILE: target/ppc/mmu-hash64.c:676:
+ * register no longer exist */

total: 0 errors, 2 warnings, 83 lines checked

Patch 4/12 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/12 Checking commit c6daae6e9b06 (spapr, ppc: Remove VPM0/RMLS hacks for 
POWER9)
6/12 Checking commit c6f6ceaa43fc (target/ppc: Remove RMOR register from POWER9 
& POWER10)
7/12 Checking commit 00f78cdfebbd (target/ppc: Use class fields to simplify 
LPCR masking)
8/12 Checking commit 6432e7fe864f (target/ppc: Streamline calculation of RMA 
limit from LPCR[RMLS])
9/12 Checking commit 656a372f677c (target/ppc: Correct RMLS table)
10/12 Checking commit 5329f3b07fba (target/ppc: Only calculate RMLS derived RMA 
limit on demand)
11/12 Checking commit 8f4ef78a4af3 (target/ppc: Streamline construction of VRMA 
SLB entry)
ERROR: braces {} are necessary for all arms of this statement
#80: FILE: target/ppc/mmu-hash64.c:1084:
+if (build_vrma_slbe(cpu, slb) == 0)
[...]

total: 1 errors, 0 warnings, 97 lines checked

Patch 11/12 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

12/12 Checking commit 275db2fe2a84 (target/ppc: Don't store VRMA SLBE 
persistently)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200219005414.15635-1-da...@gibson.dropbear.id.au/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2020-02-18 Thread Alexey Kardashevskiy



On 18/02/2020 23:59, Cédric Le Goater wrote:
> On 2/18/20 1:48 PM, Cédric Le Goater wrote:
>> On 2/18/20 10:40 AM, Cédric Le Goater wrote:
>>> On 2/18/20 10:10 AM, Alexey Kardashevskiy wrote:


 On 18/02/2020 20:05, Alexey Kardashevskiy wrote:
>
>
> On 18/02/2020 18:12, Cédric Le Goater wrote:
>> On 2/18/20 1:30 AM, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 17/02/2020 20:48, Cédric Le Goater wrote:
 On 2/17/20 3:12 AM, Alexey Kardashevskiy wrote:
> The following changes since commit 
> 05943fb4ca41f626078014c0327781815c6584c5:
>
>   ppc: free 'fdt' after reset the machine (2020-02-17 11:27:23 +1100)
>
> are available in the Git repository at:
>
>   g...@github.com:aik/qemu.git tags/qemu-slof-20200217
>
> for you to fetch changes up to 
> ea9a03e5aa023c5391bab5259898475d0298aac2:
>
>   pseries: Update SLOF firmware image (2020-02-17 13:08:59 +1100)
>
> 
> Alexey Kardashevskiy (1):
>   pseries: Update SLOF firmware image
>
>  pc-bios/README   |   2 +-
>  pc-bios/slof.bin | Bin 931032 -> 968560 bytes
>  roms/SLOF|   2 +-
>  3 files changed, 2 insertions(+), 2 deletions(-)
>
>
> *** Note: this is not for master, this is for pseries
>

 Hello Alexey,

 QEMU fails to boot from disk. See below.
>>>
>>>
>>> It does boot mine (fedora 30, ubuntu 18.04), see below. I believe I
>>> could have broken something but I need more detail. Thanks,
>>
>> fedora31 boots but not ubuntu 19.10. Could it be GRUB version 2.04 ? 
>
>
> No, not that either:


 but it might be because of power9 - I only tried power8, rsyncing the
 image to a p9 machine now...
>>>
>>> Here is the disk : 
>>>
>>> Disk /dev/sda: 50 GiB, 53687091200 bytes, 104857600 sectors
>>> Disk model: QEMU HARDDISK   
>>> Units: sectors of 1 * 512 = 512 bytes
>>> Sector size (logical/physical): 512 bytes / 512 bytes
>>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> Disklabel type: gpt
>>> Disk identifier: 27DCE458-231A-4981-9FF1-983F87C2902D
>>>
>>> Device Start   End   Sectors Size Type
>>> /dev/sda1   2048 16383 14336   7M PowerPC PReP boot
>>> /dev/sda2  16384 100679679 100663296  48G Linux filesystem
>>> /dev/sda3  100679680 104857566   4177887   2G Linux swap
>>>
>>>
>>> GPT ? 
>>
>> For the failure, I bisected up to :
>>
>> f12149908705 ("ext2: Read all 64bit of inode number")
> 
> Here is a possible fix for it. I did some RPN on my HP-28S in the past
> but I am not Forth fluent.


you basically zeroed the top bits by shifting them too far right :)

The proper fix I think is:

-  32 lshift or
+  20 lshift or

I keep forgetting it is all in hex. Can you please give it a try? My
128GB disk does not expose this problem somehow. Thanks,


> 
> "slash not found" is still there though. 
> 
> Cheers,
> 
> C.
> 
> 
> From 92dc9f6dc7c6434419306d5a382adb42169b712a Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= 
> Date: Tue, 18 Feb 2020 13:54:54 +0100
> Subject: [PATCH] ext2: Fix 64bit inode number
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Fixes: f12149908705 ("ext2: Read all 64bit of inode number")
> Signed-off-by: Cédric Le Goater 
> ---
>  slof/fs/packages/ext2-files.fs | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/slof/fs/packages/ext2-files.fs b/slof/fs/packages/ext2-files.fs
> index b6a7880bd88e..f1d9fdfd67e2 100644
> --- a/slof/fs/packages/ext2-files.fs
> +++ b/slof/fs/packages/ext2-files.fs
> @@ -152,7 +152,7 @@ CONSTANT /ext4-ee
>dup
>8 + l@-le   \ reads bg_inode_table_lo
>swap 28 + l@-le \ reads bg_inode_table_hi
> -  32 lshift or
> +  32 rshift or
>block-size @ *  \ # in group, inode table
>swap inode-size @ * + xlsplit seek drop  inode @ inode-size @ read drop
>  ;
> 

-- 
Alexey



Re: [PATCH] xilinx_spips: Correct the number of dummy cycles for the FAST_READ_4 cmd

2020-02-18 Thread Edgar E. Iglesias
On Tue, Feb 18, 2020 at 12:33:50PM +0100, Francisco Iglesias wrote:
> From: Francisco Iglesias 
> 
> Correct the number of dummy cycles required by the FAST_READ_4 command (to
> be eight, one dummy byte).
> 
> Fixes: ef06ca3946 ("xilinx_spips: Add support for RX discard and RX drain")
> Suggested-by: Cédric Le Goater 
> Signed-off-by: Francisco Iglesias 

Reviewed-by: Edgar E. Iglesias 



> ---
>  hw/ssi/xilinx_spips.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
> index 6c9ef59779..c57850a505 100644
> --- a/hw/ssi/xilinx_spips.c
> +++ b/hw/ssi/xilinx_spips.c
> @@ -576,11 +576,11 @@ static int xilinx_spips_num_dummies(XilinxQSPIPS *qs, 
> uint8_t command)
>  case FAST_READ:
>  case DOR:
>  case QOR:
> +case FAST_READ_4:
>  case DOR_4:
>  case QOR_4:
>  return 1;
>  case DIOR:
> -case FAST_READ_4:
>  case DIOR_4:
>  return 2;
>  case QIOR:
> -- 
> 2.11.0
> 



Re: [PATCH v4 2/4] target/riscv: configure and turn on vector extension from command line

2020-02-18 Thread Alistair Francis
On Tue, Feb 18, 2020 at 4:46 PM LIU Zhiwei  wrote:
>
> Hi, Alistair
>
> On 2020/2/19 6:34, Alistair Francis wrote:
> > On Mon, Feb 10, 2020 at 12:12 AM LIU Zhiwei  wrote:
> >> Vector extension is default on only for "any" cpu. It can be turned
> >> on by command line "-cpu rv64,v=true,vlen=128,elen=64,vext_spec=v0.7.1".
> >>
> >> vlen is the vector register length, default value is 128 bit.
> >> elen is the max operator size in bits, default value is 64 bit.
> >> vext_spec is the vector specification version, default value is v0.7.1.
> >> These properties and cpu can be specified with other values.
> >>
> >> Signed-off-by: LIU Zhiwei 
> > This looks fine to me. Shouldn't this be the last patch though?
> Yes, it should be the last patch.
> > As in
> > once the vector extension has been added to QEMU you can turn it on
> > from the command line. Right now this turns it on but it isn't
> > implemented.
> Maybe I should just add fields in the RISCVCPU structure, and never turn
> the vector extension on or add configure properties until the implementation
> is ready.

Yes, I think that is a good idea.

>
> It's still a little awkward as the reviewers will not be able to test
> the patch until the
> last patch.

I understand, but I don't think anyone is going to want to test the
extension half way through it being added to QEMU. This way we can
start to merge patches even without full support as users can't turn
it on.

Alistair

>
> > Alistair
> >
> >> ---
> >>   target/riscv/cpu.c | 48 --
> >>   target/riscv/cpu.h |  8 
> >>   2 files changed, 54 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> >> index 8c86ebc109..95fdb6261e 100644
> >> --- a/target/riscv/cpu.c
> >> +++ b/target/riscv/cpu.c
> >> @@ -98,6 +98,11 @@ static void set_priv_version(CPURISCVState *env, int 
> >> priv_ver)
> >>   env->priv_ver = priv_ver;
> >>   }
> >>
> >> +static void set_vext_version(CPURISCVState *env, int vext_ver)
> >> +{
> >> +env->vext_ver = vext_ver;
> >> +}
> >> +
> >>   static void set_feature(CPURISCVState *env, int feature)
> >>   {
> >>   env->features |= (1ULL << feature);
> >> @@ -113,7 +118,7 @@ static void set_resetvec(CPURISCVState *env, int 
> >> resetvec)
> >>   static void riscv_any_cpu_init(Object *obj)
> >>   {
> >>   CPURISCVState *env = &RISCV_CPU(obj)->env;
> >> -set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
> >> +set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU | RVV);
> >>   set_priv_version(env, PRIV_VERSION_1_11_0);
> >>   set_resetvec(env, DEFAULT_RSTVEC);
> >>   }
> >> @@ -320,6 +325,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> >> **errp)
> >>   CPURISCVState *env = &cpu->env;
> >>   RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
> >>   int priv_version = PRIV_VERSION_1_11_0;
> >> +int vext_version = VEXT_VERSION_0_07_1;
> >>   target_ulong target_misa = 0;
> >>   Error *local_err = NULL;
> >>
> >> @@ -343,8 +349,18 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> >> **errp)
> >>   return;
> >>   }
> >>   }
> >> -
> >> +if (cpu->cfg.vext_spec) {
> >> +if (!g_strcmp0(cpu->cfg.vext_spec, "v0.7.1")) {
> >> +vext_version = VEXT_VERSION_0_07_1;
> >> +} else {
> >> +error_setg(errp,
> >> +   "Unsupported vector spec version '%s'",
> >> +   cpu->cfg.vext_spec);
> >> +return;
> >> +}
> >> +}
> >>   set_priv_version(env, priv_version);
> >> +set_vext_version(env, vext_version);
> >>   set_resetvec(env, DEFAULT_RSTVEC);
> >>
> >>   if (cpu->cfg.mmu) {
> >> @@ -409,6 +425,30 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> >> **errp)
> >>   if (cpu->cfg.ext_u) {
> >>   target_misa |= RVU;
> >>   }
> >> +if (cpu->cfg.ext_v) {
> >> +target_misa |= RVV;
> >> +if (!is_power_of_2(cpu->cfg.vlen)) {
> >> +error_setg(errp,
> >> +   "Vector extension VLEN must be power of 2");
> >> +return;
> >> +}
> >> +if (cpu->cfg.vlen > RV_VLEN_MAX || cpu->cfg.vlen < 128) {
> >> +error_setg(errp,
> >> +   "Vector extension implementation only supports 
> >> VLEN "
> >> +   "in the range [128, %d]", RV_VLEN_MAX);
> >> +return;
> >> +}
> >> +if (!is_power_of_2(cpu->cfg.elen)) {
> >> +error_setg(errp,
> >> +   "Vector extension ELEN must be power of 2");
> >> +return;
> >> +}
> >> +if (cpu->cfg.elen > 64) {
> >> +error_setg(errp,
> >> +   "Vector extension ELEN must <= 64");
> >> +return;
> >> +}
> >> +}
> >>
> 

Re: [PATCH v1] block/nvme: introduce PMR support from NVMe 1.4 spec

2020-02-18 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200218224811.30050-1-andrzej.jakow...@linux.intel.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  hw/display/sii9022.o
  CC  hw/display/ssd0303.o
/tmp/qemu-test/src/hw/block/nvme.c: In function 'nvme_pmr_read':
/tmp/qemu-test/src/hw/block/nvme.c:1342:15: error: implicit declaration of 
function 'msync'; did you mean 'fsync'? [-Werror=implicit-function-declaration]
 ret = msync(n->pmrbuf, n->f_pmr_size, MS_SYNC);
   ^
   fsync
/tmp/qemu-test/src/hw/block/nvme.c:1342:15: error: nested extern declaration of 
'msync' [-Werror=nested-externs]
/tmp/qemu-test/src/hw/block/nvme.c:1342:47: error: 'MS_SYNC' undeclared (first 
use in this function)
 ret = msync(n->pmrbuf, n->f_pmr_size, MS_SYNC);
   ^~~
/tmp/qemu-test/src/hw/block/nvme.c:1342:47: note: each undeclared identifier is 
reported only once for each function it appears in
/tmp/qemu-test/src/hw/block/nvme.c: In function 'nvme_realize':
/tmp/qemu-test/src/hw/block/nvme.c:1413:21: error: implicit declaration of 
function 'mmap'; did you mean 'max'? [-Werror=implicit-function-declaration]
 n->pmrbuf = mmap(NULL, n->f_pmr_size,
 ^~~~
 max
/tmp/qemu-test/src/hw/block/nvme.c:1413:21: error: nested extern declaration of 
'mmap' [-Werror=nested-externs]
/tmp/qemu-test/src/hw/block/nvme.c:1414:27: error: 'PROT_READ' undeclared 
(first use in this function); did you mean 'OF_READ'?
  (PROT_READ | PROT_WRITE), MAP_SHARED, fd, 0);
   ^
   OF_READ
/tmp/qemu-test/src/hw/block/nvme.c:1414:39: error: 'PROT_WRITE' undeclared 
(first use in this function); did you mean 'OF_WRITE'?
  (PROT_READ | PROT_WRITE), MAP_SHARED, fd, 0);
   ^~
   OF_WRITE
/tmp/qemu-test/src/hw/block/nvme.c:1414:52: error: 'MAP_SHARED' undeclared 
(first use in this function); did you mean 'RAM_SHARED'?
  (PROT_READ | PROT_WRITE), MAP_SHARED, fd, 0);
^~
RAM_SHARED
/tmp/qemu-test/src/hw/block/nvme.c:1416:26: error: 'MAP_FAILED' undeclared 
(first use in this function); did you mean 'WAIT_FAILED'?
 if (n->pmrbuf == MAP_FAILED) {
  ^~
  WAIT_FAILED
/tmp/qemu-test/src/hw/block/nvme.c: In function 'nvme_exit':
/tmp/qemu-test/src/hw/block/nvme.c:1583:13: error: implicit declaration of 
function 'munmap' [-Werror=implicit-function-declaration]
 munmap(n->pmrbuf, n->f_pmr_size);
 ^~
/tmp/qemu-test/src/hw/block/nvme.c:1583:13: error: nested extern declaration of 
'munmap' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [/tmp/qemu-test/src/rules.mak:69: hw/block/nvme.o] Error 1
make: *** Waiting for unfinished jobs
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in 
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=058833b8d96e4c67a646a099f4118351', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-0kstbie3/src/docker-src.2020-02-18-20.04.28.2259:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=058833b8d96e4c67a646a099f4118351
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-0kstbie3/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real2m39.403s
user0m8.170s


The full log is available at
http://patchew.org/logs/20200218224811.30050-1-andrzej.jakow...@linux.intel.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH v4 00/14] Fixes for DP8393X SONIC device emulation

2020-02-18 Thread Finn Thain
On Tue, 18 Feb 2020, Aleksandar Markovic wrote:

> On Wednesday, January 29, 2020, Finn Thain 
> wrote:
> 
> > Hi All,
> >
> > There are bugs in the emulated dp8393x device that can stop packet
> > reception in a Linux/m68k guest (q800 machine).
> >
> > With a Linux/m68k v5.5 guest (q800), it's possible to remotely trigger
> > an Oops by sending ping floods.
> >
> > With a Linux/mips guest (magnum machine), the driver fails to probe
> > the dp8393x device.
> >
> > With a NetBSD/arc 5.1 guest (magnum), the bugs in the device can be
> > fatal to the guest kernel.
> >
> > Whilst debugging the device, I found that the receiver algorithm
> > differs from the one described in the National Semiconductor
> > datasheet.
> >
> > This patch series resolves these bugs.
> >
> > AFAIK, all bugs in the Linux sonic driver were fixed in Linux v5.5.
> > ---
> 
> 
> Herve,
> 
> Do your Jazz tests pass with these changes?
> 

AFAIK those tests did not expose the NetBSD panic that is caused by 
mainline QEMU (mentioned above).

I have actually run the tests you requested (Hervé described them in an 
earlier thread). There was no regression. Quite the reverse -- it's no 
longer possible to remotely crash the NetBSD kernel.

Apparently my testing was also the first time that the jazzsonic driver 
(from the Linux/mips Magnum port) was tested successfully with QEMU. It 
doesn't work in mainline QEMU.

Anyway, more testing is always nice, and I'd certainly welcome an 
'acked-by' or 'tested-by' if Hervé would like to send one.

Please consider backporting this series of bug fixes to QEMU stable 
branch(es).

Regards,
Finn

> Regards,
> Aleksandar
> 
> 
> 
> > Changed since v1:
> >  - Minor revisions as described beneath commit logs.
> >  - Dropped patches 4/10 and 7/10.
> >  - Added 5 new patches.
> >
> > Changed since v2:
> >  - Minor revisions as described beneath commit logs.
> >  - Dropped patch 13/13.
> >  - Added 2 new patches.
> >
> > Changed since v3:
> >  - Replaced patch 13/14 with patch suggested by Philippe Mathieu-Daudé.
> >
> >
> > Finn Thain (14):
> >   dp8393x: Mask EOL bit from descriptor addresses
> >   dp8393x: Always use 32-bit accesses
> >   dp8393x: Clean up endianness hacks
> >   dp8393x: Have dp8393x_receive() return the packet size
> >   dp8393x: Update LLFA and CRDA registers from rx descriptor
> >   dp8393x: Clear RRRA command register bit only when appropriate
> >   dp8393x: Implement packet size limit and RBAE interrupt
> >   dp8393x: Don't clobber packet checksum
> >   dp8393x: Use long-word-aligned RRA pointers in 32-bit mode
> >   dp8393x: Pad frames to word or long word boundary
> >   dp8393x: Clear descriptor in_use field to release packet
> >   dp8393x: Always update RRA pointers and sequence numbers
> >   dp8393x: Don't reset Silicon Revision register
> >   dp8393x: Don't stop reception upon RBE interrupt assertion
> >
> >  hw/net/dp8393x.c | 202 +++
> >  1 file changed, 134 insertions(+), 68 deletions(-)
> >
> > --
> > 2.24.1
> >
> >
> >
> 

[PATCH v3 08/12] target/ppc: Streamline calculation of RMA limit from LPCR[RMLS]

2020-02-18 Thread David Gibson
Currently we use a big switch statement in ppc_hash64_update_rmls() to work
out what the right RMA limit is based on the LPCR[RMLS] field.  There's no
formula for this - it's just an arbitrary mapping defined by the existing
CPU implementations - but we can make it a bit more readable by using a
lookup table rather than a switch.  In addition we can use the MiB/GiB
symbols to make it a bit clearer.

While there we add a bit of clarity and rationale to the comment about
what happens if the LPCR[RMLS] doesn't contain a valid value.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 71 -
 1 file changed, 35 insertions(+), 36 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 127b7250ae..bb9ebeaf48 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -18,6 +18,7 @@
  * License along with this library; if not, see .
  */
 #include "qemu/osdep.h"
+#include "qemu/units.h"
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
@@ -755,6 +756,39 @@ static void ppc_hash64_set_c(PowerPCCPU *cpu, hwaddr ptex, 
uint64_t pte1)
 stb_phys(CPU(cpu)->as, base + offset, (pte1 & 0xff) | 0x80);
 }
 
+static target_ulong rmls_limit(PowerPCCPU *cpu)
+{
+CPUPPCState *env = &cpu->env;
+/*
+ * This is the full 4 bits encoding of POWER8. Previous
+ * CPUs only support a subset of these but the filtering
+ * is done when writing LPCR
+ */
+const target_ulong rma_sizes[] = {
+[0] = 0,
+[1] = 16 * GiB,
+[2] = 1 * GiB,
+[3] = 64 * MiB,
+[4] = 256 * MiB,
+[5] = 0,
+[6] = 0,
+[7] = 128 * MiB,
+[8] = 32 * MiB,
+};
+target_ulong rmls = (env->spr[SPR_LPCR] & LPCR_RMLS) >> LPCR_RMLS_SHIFT;
+
+if (rmls < ARRAY_SIZE(rma_sizes)) {
+return rma_sizes[rmls];
+} else {
+/*
+ * Bad value, so the OS has shot itself in the foot.  Return a
+ * 0-sized RMA which we expect to trigger an immediate DSI or
+ * ISI
+ */
+return 0;
+}
+}
+
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 int rwx, int mmu_idx)
 {
@@ -1004,41 +1038,6 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, target_ulong ptex,
 cpu->env.tlb_need_flush = TLB_NEED_GLOBAL_FLUSH | TLB_NEED_LOCAL_FLUSH;
 }
 
-static void ppc_hash64_update_rmls(PowerPCCPU *cpu)
-{
-CPUPPCState *env = &cpu->env;
-uint64_t lpcr = env->spr[SPR_LPCR];
-
-/*
- * This is the full 4 bits encoding of POWER8. Previous
- * CPUs only support a subset of these but the filtering
- * is done when writing LPCR
- */
-switch ((lpcr & LPCR_RMLS) >> LPCR_RMLS_SHIFT) {
-case 0x8: /* 32MB */
-env->rmls = 0x200ull;
-break;
-case 0x3: /* 64MB */
-env->rmls = 0x400ull;
-break;
-case 0x7: /* 128MB */
-env->rmls = 0x800ull;
-break;
-case 0x4: /* 256MB */
-env->rmls = 0x1000ull;
-break;
-case 0x2: /* 1GB */
-env->rmls = 0x4000ull;
-break;
-case 0x1: /* 16GB */
-env->rmls = 0x4ull;
-break;
-default:
-/* What to do here ??? */
-env->rmls = 0;
-}
-}
-
 static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
@@ -1097,7 +1096,7 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-ppc_hash64_update_rmls(cpu);
+env->rmls = rmls_limit(cpu);
 ppc_hash64_update_vrma(cpu);
 }
 
-- 
2.24.1




[PATCH v3 07/12] target/ppc: Use class fields to simplify LPCR masking

2020-02-18 Thread David Gibson
When we store the Logical Partitioning Control Register (LPCR) we have a
big switch statement to work out which are valid bits for the cpu model
we're emulating.

As well as being ugly, this isn't really conceptually correct, since it is
based on the mmu_model variable, whereas the LPCR isn't (only) about the
MMU, so mmu_model is basically just acting as a proxy for the cpu model.

Handle this in a simpler way, by adding a suitable lpcr_mask to the QOM
class.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/cpu-qom.h|  1 +
 target/ppc/mmu-hash64.c | 37 ++---
 target/ppc/translate_init.inc.c | 27 
 3 files changed, 26 insertions(+), 39 deletions(-)

diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index e499575dc8..15d6b54a7d 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -177,6 +177,7 @@ typedef struct PowerPCCPUClass {
 uint64_t insns_flags;
 uint64_t insns_flags2;
 uint64_t msr_mask;
+uint64_t lpcr_mask; /* Available bits in the LPCR */
 uint64_t lpcr_pm;   /* Power-saving mode Exit Cause Enable bits */
 powerpc_mmu_t   mmu_model;
 powerpc_excp_t  excp_model;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index d7f9933e6d..127b7250ae 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1093,43 +1093,10 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 {
+PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 CPUPPCState *env = &cpu->env;
-uint64_t lpcr = 0;
 
-/* Filter out bits */
-switch (env->mmu_model) {
-case POWERPC_MMU_2_03: /* P5p */
-lpcr = val & (LPCR_RMLS | LPCR_ILE |
-  LPCR_LPES0 | LPCR_LPES1 |
-  LPCR_RMI | LPCR_HDICE);
-break;
-case POWERPC_MMU_2_06: /* P7 */
-lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_DPFD |
-  LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
-  LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2 |
-  LPCR_MER | LPCR_TC |
-  LPCR_LPES0 | LPCR_LPES1 | LPCR_HDICE);
-break;
-case POWERPC_MMU_2_07: /* P8 */
-lpcr = val & (LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_KBV |
-  LPCR_DPFD | LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
-  LPCR_AIL | LPCR_ONL | LPCR_P8_PECE0 | LPCR_P8_PECE1 |
-  LPCR_P8_PECE2 | LPCR_P8_PECE3 | LPCR_P8_PECE4 |
-  LPCR_MER | LPCR_TC | LPCR_LPES0 | LPCR_HDICE);
-break;
-case POWERPC_MMU_3_00: /* P9 */
-lpcr = val & (LPCR_VPM1 | LPCR_ISL | LPCR_KBV | LPCR_DPFD |
-  (LPCR_PECE_U_MASK & LPCR_HVEE) | LPCR_ILE | LPCR_AIL |
-  LPCR_UPRT | LPCR_EVIRT | LPCR_ONL | LPCR_HR | LPCR_LD |
-  (LPCR_PECE_L_MASK & (LPCR_PDEE | LPCR_HDEE | LPCR_EEE |
-  LPCR_DEE | LPCR_OEE)) | LPCR_MER | LPCR_GTSE | LPCR_TC |
-  LPCR_HEIC | LPCR_LPES0 | LPCR_HVICE | LPCR_HDICE);
-break;
-default:
-g_assert_not_reached();
-;
-}
-env->spr[SPR_LPCR] = lpcr;
+env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
 ppc_hash64_update_rmls(cpu);
 ppc_hash64_update_vrma(cpu);
 }
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index c5629d8ba9..823c3c7b54 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8475,6 +8475,8 @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data)
 (1ull << MSR_DR) |
 (1ull << MSR_PMM) |
 (1ull << MSR_RI);
+pcc->lpcr_mask = LPCR_RMLS | LPCR_ILE | LPCR_LPES0 | LPCR_LPES1 |
+LPCR_RMI | LPCR_HDICE;
 pcc->mmu_model = POWERPC_MMU_2_03;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
@@ -8652,6 +8654,12 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 (1ull << MSR_PMM) |
 (1ull << MSR_RI) |
 (1ull << MSR_LE);
+pcc->lpcr_mask = LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_DPFD |
+LPCR_VRMASD | LPCR_RMLS | LPCR_ILE |
+LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2 |
+LPCR_MER | LPCR_TC |
+LPCR_LPES0 | LPCR_LPES1 | LPCR_HDICE;
+pcc->lpcr_pm = LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2;
 pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
@@ -8668,7 +8676,6 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 pcc->l1_dcache_size = 0x8000;
 pcc->l1_icache_size = 0x8000;
 pcc->interrupts_big_endian = ppc_cpu_interrupts_big_endian_lpcr;
-pcc->lpcr_pm = LPCR_P7_PECE0 | LPCR_P7_PECE1 | LPCR_P7_PECE2;
 }
 
 static void init_proc_POWER8(CPUPPCState *env)

[PATCH v3 09/12] target/ppc: Correct RMLS table

2020-02-18 Thread David Gibson
The table of RMA limits based on the LPCR[RMLS] field is slightly wrong.
We're missing the RMLS == 0 => 256 GiB RMA option, which is available on
POWER8, so add that.

The comment that goes with the table is much more wrong.  We *don't* filter
invalid RMLS values when writing the LPCR, and there's not really a
sensible way to do so.  Furthermore, while in theory the set of RMLS values
is implementation dependent, it seems in practice the same set has been
available since around POWER4+ up until POWER8, the last model which
supports RMLS at all.  So, correct that as well.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index bb9ebeaf48..e6f24be93e 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -760,12 +760,12 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
 /*
- * This is the full 4 bits encoding of POWER8. Previous
- * CPUs only support a subset of these but the filtering
- * is done when writing LPCR
+ * In theory the meanings of RMLS values are implementation
+ * dependent.  In practice, this seems to have been the set from
+ * POWER4+..POWER8, and RMLS is no longer supported in POWER9.
  */
 const target_ulong rma_sizes[] = {
-[0] = 0,
+[0] = 256 * GiB,
 [1] = 16 * GiB,
 [2] = 1 * GiB,
 [3] = 64 * MiB,
-- 
2.24.1




[PATCH v3 11/12] target/ppc: Streamline construction of VRMA SLB entry

2020-02-18 Thread David Gibson
When in VRMA mode (i.e. a guest thinks it has the MMU off, but the
hypervisor is still applying translation) we use a special SLB entry,
rather than looking up an SLBE by address as we do when guest translation
is on.

We build that special entry in ppc_hash64_update_vrma() along with some
logic for handling some non-VRMA cases.  Split the actual build of the
VRMA SLBE into a separate helper and streamline it a bit.

Signed-off-by: David Gibson 
---
 target/ppc/mmu-hash64.c | 79 -
 1 file changed, 38 insertions(+), 41 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 170a78bd2e..06cfff9860 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -789,6 +789,39 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
 }
 }
 
+static int build_vrma_slbe(PowerPCCPU *cpu, ppc_slb_t *slb)
+{
+CPUPPCState *env = &cpu->env;
+target_ulong lpcr = env->spr[SPR_LPCR];
+uint32_t vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
+target_ulong vsid = SLB_VSID_VRMA | ((vrmasd << 4) & SLB_VSID_LLP_MASK);
+int i;
+
+/*
+ * Make one up. Mostly ignore the ESID which will not be needed
+ * for translation
+ */
+for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+const PPCHash64SegmentPageSizes *sps = &cpu->hash64_opts->sps[i];
+
+if (!sps->page_shift) {
+break;
+}
+
+if ((vsid & SLB_VSID_LLP_MASK) == sps->slb_enc) {
+slb->esid = SLB_ESID_V;
+slb->vsid = vsid;
+slb->sps = sps;
+return 0;
+}
+}
+
+error_report("Bad page size encoding in LPCR[VRMASD]; LPCR=0x"
+ TARGET_FMT_lx, lpcr);
+
+return -1;
+}
+
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 int rwx, int mmu_idx)
 {
@@ -1044,53 +1077,17 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, target_ulong ptex,
 static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
-const PPCHash64SegmentPageSizes *sps = NULL;
-target_ulong esid, vsid, lpcr;
 ppc_slb_t *slb = &env->vrma_slb;
-uint32_t vrmasd;
-int i;
-
-/* First clear it */
-slb->esid = slb->vsid = 0;
-slb->sps = NULL;
 
 /* Is VRMA enabled ? */
 if (ppc_hash64_use_vrma(env)) {
-return;
-}
-
-/*
- * Make one up. Mostly ignore the ESID which will not be needed
- * for translation
- */
-lpcr = env->spr[SPR_LPCR];
-vsid = SLB_VSID_VRMA;
-vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
-vsid |= (vrmasd << 4) & (SLB_VSID_L | SLB_VSID_LP);
-esid = SLB_ESID_V;
-
-for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
-const PPCHash64SegmentPageSizes *sps1 = &cpu->hash64_opts->sps[i];
-
-if (!sps1->page_shift) {
-break;
-}
-
-if ((vsid & SLB_VSID_LLP_MASK) == sps1->slb_enc) {
-sps = sps1;
-break;
-}
-}
-
-if (!sps) {
-error_report("Bad page size encoding esid 0x"TARGET_FMT_lx
- " vsid 0x"TARGET_FMT_lx, esid, vsid);
-return;
+if (build_vrma_slbe(cpu, slb) == 0)
+return;
 }
 
-slb->vsid = vsid;
-slb->esid = esid;
-slb->sps = sps;
+/* Otherwise, clear it to indicate error */
+slb->esid = slb->vsid = 0;
+slb->sps = NULL;
 }
 
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
-- 
2.24.1




[PATCH v3 06/12] target/ppc: Remove RMOR register from POWER9 & POWER10

2020-02-18 Thread David Gibson
Currently we create the Real Mode Offset Register (RMOR) on all Book3S cpus
from POWER7 onwards.  However the translation mode which the RMOR controls
is no longer supported in POWER9, and so the register has been removed from
the architecture.

Remove it from our model on POWER9 and POWER10.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/translate_init.inc.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index d7d4f012b8..c5629d8ba9 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8014,12 +8014,16 @@ static void gen_spr_book3s_ids(CPUPPCState *env)
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
  0x00000000);
-spr_register_hv(env, SPR_RMOR, "RMOR",
+spr_register_hv(env, SPR_HRMOR, "HRMOR",
  SPR_NOACCESS, SPR_NOACCESS,
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
  0x00000000);
-spr_register_hv(env, SPR_HRMOR, "HRMOR",
+}
+
+static void gen_spr_rmor(CPUPPCState *env)
+{
+spr_register_hv(env, SPR_RMOR, "RMOR",
  SPR_NOACCESS, SPR_NOACCESS,
  SPR_NOACCESS, SPR_NOACCESS,
  &spr_read_generic, &spr_write_generic,
@@ -8534,6 +8538,7 @@ static void init_proc_POWER7(CPUPPCState *env)
 
 /* POWER7 Specific Registers */
 gen_spr_book3s_ids(env);
+gen_spr_rmor(env);
 gen_spr_amr(env);
 gen_spr_book3s_purr(env);
 gen_spr_power5p_common(env);
@@ -8675,6 +8680,7 @@ static void init_proc_POWER8(CPUPPCState *env)
 
 /* POWER8 Specific Registers */
 gen_spr_book3s_ids(env);
+gen_spr_rmor(env);
 gen_spr_amr(env);
 gen_spr_iamr(env);
 gen_spr_book3s_purr(env);
-- 
2.24.1




[PATCH v3 05/12] spapr, ppc: Remove VPM0/RMLS hacks for POWER9

2020-02-18 Thread David Gibson
For the "pseries" machine, we use "virtual hypervisor" mode where we
only model the CPU in non-hypervisor privileged mode.  This means that
we need guest physical addresses within the modelled cpu to be treated
as absolute physical addresses.

We used to do that by clearing LPCR[VPM0] and setting LPCR[RMLS] to a high
limit so that the old offset based translation for guest mode applied,
which does what we need.  However, POWER9 has removed support for that
translation mode, which meant we had some ugly hacks to keep it working.

We now explicitly handle this sort of translation for virtual hypervisor
mode, so the hacks aren't necessary.  We don't need to set VPM0 and RMLS
from the machine type code - they're now ignored in vhyp mode.  On the cpu
side we don't need to allow LPCR[RMLS] to be set on POWER9 in vhyp mode -
that was only there to allow the hack on the machine side.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 hw/ppc/spapr_cpu_core.c | 6 +-
 target/ppc/mmu-hash64.c | 8 
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index d09125d9af..ea5e11f1d9 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -58,14 +58,10 @@ static void spapr_reset_vcpu(PowerPCCPU *cpu)
  * we don't get spurious wakups before an RTAS start-cpu call.
  * For the same reason, set PSSCR_EC.
  */
-lpcr &= ~(LPCR_VPM0 | LPCR_VPM1 | LPCR_ISL | LPCR_KBV | pcc->lpcr_pm);
+lpcr &= ~(LPCR_VPM1 | LPCR_ISL | LPCR_KBV | pcc->lpcr_pm);
 lpcr |= LPCR_LPES0 | LPCR_LPES1;
 env->spr[SPR_PSSCR] |= PSSCR_EC;
 
-/* Set RMLS to the max (ie, 16G) */
-lpcr &= ~LPCR_RMLS;
-lpcr |= 1ull << LPCR_RMLS_SHIFT;
-
 ppc_store_lpcr(cpu, lpcr);
 
 /* Set a full AMOR so guest can use the AMR as it sees fit */
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index d878180df5..d7f9933e6d 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1124,14 +1124,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
   (LPCR_PECE_L_MASK & (LPCR_PDEE | LPCR_HDEE | LPCR_EEE |
   LPCR_DEE | LPCR_OEE)) | LPCR_MER | LPCR_GTSE | LPCR_TC |
   LPCR_HEIC | LPCR_LPES0 | LPCR_HVICE | LPCR_HDICE);
-/*
- * If we have a virtual hypervisor, we need to bring back RMLS. It
- * doesn't exist on an actual P9 but that's all we know how to
- * configure with softmmu at the moment
- */
-if (cpu->vhyp) {
-lpcr |= (val & LPCR_RMLS);
-}
 break;
 default:
 g_assert_not_reached();
-- 
2.24.1




[Bug 1863819] [NEW] repeated KVM single step crashes leaks into SMP guest and crashes guest application

2020-02-18 Thread Dustin Spicuzza
Public bug reported:

Guest: Windows 7 x64
Host: Ubuntu 18.04.4 (kernel 5.3.0-40-generic)
QEMU: master 6c599282f8ab382fe59f03a6cae755b89561a7b3

If I try to use GDB to repeatedly single-step a userspace process while
running a KVM guest, the userspace process will eventually crash with a
0x8004 exception (single step). This is easily reproducible on a
Windows guest, I've not tried another guest type but I've been told it's
the same there also.

On a Ubuntu 16 host with an older kernel, this will hang the entire
machine. However, it seems it may have been fixed by
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5cc244a20b86090c087073c124284381cdf47234
?

It's not clear to me whether this is a KVM or a QEMU bug. A TCG guest
does not crash the userspace process in the same way, but it does hang
the VM.

I've tried a variety of QEMU versions (3.0, 4.2, master) and they all
exhibit the same behavior. I'm happy to dig into this more if someone
can point me in the right direction.

Here's the outline for reproducing the bug:

* Compile iloop.cpp (attached) as a 32-bit application using MSVC
* Start Windows 7 x64 guest under GDB
  * Pass '-enable-kvm -smp 4,cores=2 -gdb tcp::4567' to QEMU along with other 
typical options

(need to get CR3 to ensure we're in the right application context -- if there's 
an easier way to do this I'd love to hear it!)
* Install WinDBG on guest
* Copy SysInternals LiveKD to guest
* Start iloop.exe in guest, note loop address
* Run LiveKD from administrative prompt
  * livekd64.exe -w
* In WinDBG:
  * !process 0 0
  * Search for iloop.exe, note DirBase (this is CR3)

In GDB:
* Execute 'target remote tcp::4567'
* Execute 'c'
* Hit CTRL-C to pause the VM
* Execute 'p/x $cr3'
  .. continue if not equal to DirBase in WinDBG, keep stopping until it is equal
* Once $cr3 is correct value, if you 'stepi' a few times you'll note the 
process going in a loop, it should keep hitting the address echoed to the 
console by iloop.exe

Crash the process from GDB:
* Execute 'stepi 1'
* Watch the process, eventually it'll die with an 0x8004 error

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "iloop.cpp"
   https://bugs.launchpad.net/bugs/1863819/+attachment/5329416/+files/iloop.cpp

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1863819




[PATCH v1] block/nvme: introduce PMR support from NVMe 1.4 spec

2020-02-18 Thread Andrzej Jakowski
This patch introduces support for the PMR (Persistent Memory Region) defined
in the NVMe 1.4 spec. Users can now specify a pmr_file which will be mmap'ed
into the qemu address space and subsequently exposed in PCI BAR 2. The guest
OS can perform MMIO reads and writes to the PMR region, and the contents will
stay persistent across system reboots.

Signed-off-by: Andrzej Jakowski 
---
 hw/block/nvme.c   | 145 ++-
 hw/block/nvme.h   |   5 ++
 hw/block/trace-events |   5 ++
 include/block/nvme.h  | 172 ++
 4 files changed, 326 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d28335cbf3..836cf8d426 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -19,10 +19,14 @@
  *  -drive file=,if=none,id=
  *  -device nvme,drive=,serial=,id=, \
  *  cmb_size_mb=, \
+ *  [pmr_file=,] \
  *  num_queues=
  *
  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
+ *
+ * Either cmb or pmr can be enabled, but not both - due to a limitation
+ * in available BAR indexes. The pmr_file needs to be preallocated and
+ * its size must be a multiple of MiB.
  */
 
 #include "qemu/osdep.h"
@@ -1141,6 +1145,26 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
 NVME_GUEST_ERR(nvme_ub_mmiowr_cmbsz_readonly,
"invalid write to read only CMBSZ, ignored");
 return;
+case 0xE00: /* PMRCAP */
+NVME_GUEST_ERR(nvme_ub_mmiowr_pmrcap_readonly,
+   "invalid write to PMRCAP register, ignored");
+return;
+case 0xE04: /* TODO PMRCTL */
+break;
+case 0xE08: /* PMRSTS */
+NVME_GUEST_ERR(nvme_ub_mmiowr_pmrsts_readonly,
+   "invalid write to PMRSTS register, ignored");
+return;
+case 0xE0C: /* PMREBS */
+NVME_GUEST_ERR(nvme_ub_mmiowr_pmrebs_readonly,
+   "invalid write to PMREBS register, ignored");
+return;
+case 0xE10: /* PMRSWTP */
+NVME_GUEST_ERR(nvme_ub_mmiowr_pmrswtp_readonly,
+   "invalid write to PMRSWTP register, ignored");
+return;
+case 0xE14: /* TODO PMRMSC */
+break;
 default:
 NVME_GUEST_ERR(nvme_ub_mmiowr_invalid,
"invalid MMIO write,"
@@ -1303,6 +1327,38 @@ static const MemoryRegionOps nvme_cmb_ops = {
 },
 };
 
+static void nvme_pmr_write(void *opaque, hwaddr addr, uint64_t data,
+unsigned size)
+{
+NvmeCtrl *n = (NvmeCtrl *)opaque;
+stn_le_p(&n->pmrbuf[addr], size, data);
+}
+
+static uint64_t nvme_pmr_read(void *opaque, hwaddr addr, unsigned size)
+{
+NvmeCtrl *n = (NvmeCtrl *)opaque;
+if (!NVME_PMRCAP_PMRWBM(n->bar.pmrcap)) {
+int ret;
+ret = msync(n->pmrbuf, n->f_pmr_size, MS_SYNC);
+if (ret < 0) {
+NVME_GUEST_ERR(nvme_ub_mmiowr_pmrread_barrier,
+   "error while persisting data");
+}
+}
+return ldn_le_p(&n->pmrbuf[addr], size);
+}
+
+static const MemoryRegionOps nvme_pmr_ops = {
+.read = nvme_pmr_read,
+.write = nvme_pmr_write,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.impl = {
+.min_access_size = 1,
+.max_access_size = 8,
+},
+};
+
+
 static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 {
 NvmeCtrl *n = NVME(pci_dev);
@@ -1332,6 +1388,37 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 error_setg(errp, "serial property not set");
 return;
 }
+
+if (!n->cmb_size_mb && n->pmr_file) {
+int fd;
+
+n->f_pmr = fopen(n->pmr_file, "r+b");
+if (!n->f_pmr) {
+error_setg(errp, "pmr backend file open error");
+return;
+}
+
+fseek(n->f_pmr, 0L, SEEK_END);
+n->f_pmr_size = ftell(n->f_pmr);
+fseek(n->f_pmr, 0L, SEEK_SET);
+
+/* pmr file size needs to be multiple of MiB in size */
+if (!n->f_pmr_size || n->f_pmr_size % (1 << 20)) {
+error_setg(errp, "pmr backend file size needs to be greater than 0 "
+ "and a multiple of MiB in size");
+return;
+}
+
+fd = fileno(n->f_pmr);
+n->pmrbuf = mmap(NULL, n->f_pmr_size,
+ (PROT_READ | PROT_WRITE), MAP_SHARED, fd, 0);
+
+if (n->pmrbuf == MAP_FAILED) {
+error_setg(errp, "pmr backend file mmap error");
+return;
+}
+}
+
 blkconf_blocksizes(&n->conf);
 if (!blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk),
false, errp)) {
@@ -1393,7 +1480,6 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 n->bar.intmc = n->bar.intms = 0;
 
 if (n->cmb_size_mb) {
-
 NVME_CMBLOC_SET_BIR(n->bar.cmbloc, 2);
 NVME_CMBLOC_SET_OFST(n->bar.cmbloc, 0);
 
@@ -1415,6 +1501,52 @@ static void 

[PATCH v3 12/12] target/ppc: Don't store VRMA SLBE persistently

2020-02-18 Thread David Gibson
Currently, we construct the SLBE used for VRMA translations when the LPCR
is written (which controls some bits in the SLBE), then use it later for
translations.

This is a bit complex and confusing - simplify it by simply constructing
the SLBE directly from the LPCR when we need it.

Signed-off-by: David Gibson 
---
 target/ppc/cpu.h|  3 ---
 target/ppc/mmu-hash64.c | 27 ++-
 2 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index f9871b1233..5a55fb02bd 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1044,9 +1044,6 @@ struct CPUPPCState {
 uint32_t flags;
 uint64_t insns_flags;
 uint64_t insns_flags2;
-#if defined(TARGET_PPC64)
-ppc_slb_t vrma_slb;
-#endif
 
 int error_code;
 uint32_t pending_interrupts;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 06cfff9860..d93dcf3a08 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -827,6 +827,7 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 {
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
+ppc_slb_t vrma_slbe;
 ppc_slb_t *slb;
 unsigned apshift;
 hwaddr ptex;
@@ -865,8 +866,8 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 }
 } else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
-slb = &env->vrma_slb;
-if (!slb->sps) {
+slb = &vrma_slbe;
+if (build_vrma_slbe(cpu, slb) != 0) {
 /* Invalid VRMA setup, machine check */
 cs->exception_index = POWERPC_EXCP_MCHECK;
 env->error_code = 0;
@@ -1014,6 +1015,7 @@ skip_slb_search:
 hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr)
 {
 CPUPPCState *env = &cpu->env;
+ppc_slb_t vrma_slbe;
 ppc_slb_t *slb;
 hwaddr ptex, raddr;
 ppc_hash_pte64_t pte;
@@ -1035,8 +1037,8 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr)
 return raddr | env->spr[SPR_HRMOR];
 } else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
-slb = &env->vrma_slb;
-if (!slb->sps) {
+slb = &vrma_slbe;
+if (build_vrma_slbe(cpu, slb) != 0) {
 return -1;
 }
 } else {
@@ -1074,29 +1076,12 @@ void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu, target_ulong ptex,
 cpu->env.tlb_need_flush = TLB_NEED_GLOBAL_FLUSH | TLB_NEED_LOCAL_FLUSH;
 }
 
-static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
-{
-CPUPPCState *env = &cpu->env;
-ppc_slb_t *slb = &env->vrma_slb;
-
-/* Is VRMA enabled ? */
-if (ppc_hash64_use_vrma(env)) {
-if (build_vrma_slbe(cpu, slb) == 0)
-return;
-}
-
-/* Otherwise, clear it to indicate error */
-slb->esid = slb->vsid = 0;
-slb->sps = NULL;
-}
-
 void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 {
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-ppc_hash64_update_vrma(cpu);
 }
 
 void helper_store_lpcr(CPUPPCState *env, target_ulong val)
-- 
2.24.1




[PATCH v3 10/12] target/ppc: Only calculate RMLS derived RMA limit on demand

2020-02-18 Thread David Gibson
When the LPCR is written, we update the env->rmls field with the RMA limit
it implies.  Simplify things by just calculating the value directly from
the LPCR value when we need it.

It's possible this is a little slower, but it's unlikely to be significant,
since this is only for real mode accesses in a translation configuration
that's not used very often, and the whole thing is behind the qemu TLB
anyway.  Therefore, keeping the number of state variables down and not
having to worry about making sure it's always in sync seems the better
option.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/cpu.h| 1 -
 target/ppc/mmu-hash64.c | 8 +---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 8077fdb068..f9871b1233 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1046,7 +1046,6 @@ struct CPUPPCState {
 uint64_t insns_flags2;
 #if defined(TARGET_PPC64)
 ppc_slb_t vrma_slb;
-target_ulong rmls;
 #endif
 
 int error_code;
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index e6f24be93e..170a78bd2e 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -842,8 +842,10 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
 
 goto skip_slb_search;
 } else {
+target_ulong limit = rmls_limit(cpu);
+
 /* Emulated old-style RMO mode, bounds check against RMLS */
-if (raddr >= env->rmls) {
+if (raddr >= limit) {
 if (rwx == 2) {
 ppc_hash64_set_isi(cs, SRR1_PROTFAULT);
 } else {
@@ -1005,8 +1007,9 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr)
 return -1;
 }
 } else {
+target_ulong limit = rmls_limit(cpu);
 /* Emulated old-style RMO mode, bounds check against RMLS */
-if (raddr >= env->rmls) {
+if (raddr >= limit) {
 return -1;
 }
 return raddr | env->spr[SPR_RMOR];
@@ -1096,7 +1099,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 CPUPPCState *env = &cpu->env;
 
 env->spr[SPR_LPCR] = val & pcc->lpcr_mask;
-env->rmls = rmls_limit(cpu);
 ppc_hash64_update_vrma(cpu);
 }
 
-- 
2.24.1




[PATCH v3 00/12] target/ppc: Correct some errors with real mode handling

2020-02-18 Thread David Gibson
POWER "book S" (server class) cpus have a concept of "real mode" where
MMU translation is disabled... sort of.  In fact this can mean a bunch
of slightly different things when hypervisor mode and other
considerations are present.

We had some errors in edge cases here, so clean some things up and
correct them.

Changes since v2:
 * Removed 32-bit hypervisor stubs more completely
 * Minor polish based on review comments
Changes since RFCv1:
 * Add a number of extra patches taking advantage of the initial
   cleanups

David Gibson (12):
  ppc: Remove stub support for 32-bit hypervisor mode
  ppc: Remove stub of PPC970 HID4 implementation
  target/ppc: Correct handling of real mode accesses with vhyp on hash
MMU
  target/ppc: Introduce ppc_hash64_use_vrma() helper
  spapr, ppc: Remove VPM0/RMLS hacks for POWER9
  target/ppc: Remove RMOR register from POWER9 & POWER10
  target/ppc: Use class fields to simplify LPCR masking
  target/ppc: Streamline calculation of RMA limit from LPCR[RMLS]
  target/ppc: Correct RMLS table
  target/ppc: Only calculate RMLS derived RMA limit on demand
  target/ppc: Streamline construction of VRMA SLB entry
  target/ppc: Don't store VRMA SLBE persistently

 hw/ppc/spapr_cpu_core.c |   6 +-
 target/ppc/cpu-qom.h|   1 +
 target/ppc/cpu.h|  25 +--
 target/ppc/mmu-hash64.c | 329 
 target/ppc/translate_init.inc.c |  60 --
 5 files changed, 175 insertions(+), 246 deletions(-)

-- 
2.24.1




[PATCH v3 04/12] target/ppc: Introduce ppc_hash64_use_vrma() helper

2020-02-18 Thread David Gibson
When running guests under a hypervisor, the hypervisor obviously needs to
be protected from guest accesses even if those are in what the guest
considers real mode (translation off).  The POWER hardware provides two
ways of doing that: The old way has guest real mode accesses simply offset
and bounds checked into host addresses.  It works, but requires that a
significant chunk of the guest's memory - the RMA - be physically
contiguous in the host, which is pretty inconvenient.  The new way, known
as VRMA, has guest real mode accesses translated in roughly the normal way
but with some special parameters.

In POWER7 and POWER8 the LPCR[VPM0] bit selected between the two modes, but
in POWER9 only VRMA mode is supported and LPCR[VPM0] no longer exists.  We
handle that difference in behaviour in ppc_hash64_set_isi().. but not in
other places that we blindly check LPCR[VPM0].

Correct those instances with a new helper to tell if we should be in VRMA
mode.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 41 +++--
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 5fabd93c92..d878180df5 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -668,6 +668,19 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
 return 0;
 }
 
+static bool ppc_hash64_use_vrma(CPUPPCState *env)
+{
+switch (env->mmu_model) {
+case POWERPC_MMU_3_00:
+/* ISAv3.0 (POWER9) always uses VRMA, the VPM0 field and RMOR
+ * register no longer exist */
+return true;
+
+default:
+return !!(env->spr[SPR_LPCR] & LPCR_VPM0);
+}
+}
+
 static void ppc_hash64_set_isi(CPUState *cs, uint64_t error_code)
 {
 CPUPPCState *env = &POWERPC_CPU(cs)->env;
@@ -676,15 +689,7 @@ static void ppc_hash64_set_isi(CPUState *cs, uint64_t 
error_code)
 if (msr_ir) {
 vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
 } else {
-switch (env->mmu_model) {
-case POWERPC_MMU_3_00:
-/* Field deprecated in ISAv3.00 - interrupts always go to hyperv */
-vpm = true;
-break;
-default:
-vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
-break;
-}
+vpm = ppc_hash64_use_vrma(env);
 }
 if (vpm && !msr_hv) {
 cs->exception_index = POWERPC_EXCP_HISI;
@@ -702,15 +707,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, uint64_t dar, 
uint64_t dsisr)
 if (msr_dr) {
 vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
 } else {
-switch (env->mmu_model) {
-case POWERPC_MMU_3_00:
-/* Field deprecated in ISAv3.00 - interrupts always go to hyperv */
-vpm = true;
-break;
-default:
-vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
-break;
-}
+vpm = ppc_hash64_use_vrma(env);
 }
 if (vpm && !msr_hv) {
 cs->exception_index = POWERPC_EXCP_HDSI;
@@ -799,7 +796,7 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
 if (!(eaddr >> 63)) {
 raddr |= env->spr[SPR_HRMOR];
 }
-} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
slb = &env->vrma_slb;
 if (!slb->sps) {
@@ -967,7 +964,7 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 } else if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
 /* In HV mode, add HRMOR if top EA bit is clear */
 return raddr | env->spr[SPR_HRMOR];
-} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (ppc_hash64_use_vrma(env)) {
 /* Emulated VRMA mode */
slb = &env->vrma_slb;
 if (!slb->sps) {
@@ -1056,8 +1053,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
 slb->sps = NULL;
 
 /* Is VRMA enabled ? */
-lpcr = env->spr[SPR_LPCR];
-if (!(lpcr & LPCR_VPM0)) {
+if (!ppc_hash64_use_vrma(env)) {
 return;
 }
 
@@ -1065,6 +1061,7 @@ static void ppc_hash64_update_vrma(PowerPCCPU *cpu)
  * Make one up. Mostly ignore the ESID which will not be needed
  * for translation
  */
+lpcr = env->spr[SPR_LPCR];
 vsid = SLB_VSID_VRMA;
 vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
 vsid |= (vrmasd << 4) & (SLB_VSID_L | SLB_VSID_LP);
-- 
2.24.1




[PATCH v3 02/12] ppc: Remove stub of PPC970 HID4 implementation

2020-02-18 Thread David Gibson
The PowerPC 970 CPU was a cut-down POWER4, which had hypervisor capability.
However, it can be (and often was) strapped into "Apple mode", where the
hypervisor capabilities were disabled (essentially putting it always in
hypervisor mode).

That's actually the only mode of the 970 we support in qemu, and we're
unlikely to change that any time soon.  However, we do have a partial
implementation of the 970's HID4 register which affects things only
relevant for hypervisor mode.

That stub is also really ugly, since it attempts to duplicate the effects
of HID4 by re-encoding it into the LPCR register used in newer CPUs, but
in a really confusing way.

Just get rid of it.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Greg Kurz 
---
 target/ppc/mmu-hash64.c | 28 +---
 target/ppc/translate_init.inc.c | 17 ++---
 2 files changed, 7 insertions(+), 38 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index da8966ccf5..a881876647 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -1091,33 +1091,6 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 
 /* Filter out bits */
 switch (env->mmu_model) {
-case POWERPC_MMU_64B: /* 970 */
-if (val & 0x40) {
-lpcr |= LPCR_LPES0;
-}
-if (val & 0x8000ull) {
-lpcr |= LPCR_LPES1;
-}
-if (val & 0x20) {
-lpcr |= (0x4ull << LPCR_RMLS_SHIFT);
-}
-if (val & 0x4000ull) {
-lpcr |= (0x2ull << LPCR_RMLS_SHIFT);
-}
-if (val & 0x2000ull) {
-lpcr |= (0x1ull << LPCR_RMLS_SHIFT);
-}
-env->spr[SPR_RMOR] = ((lpcr >> 41) & 0xull) << 26;
-
-/*
- * XXX We could also write LPID from HID4 here
- * but since we don't tag any translation on it
- * it doesn't actually matter
- *
- * XXX For proper emulation of 970 we also need
- * to dig HRMOR out of HID5
- */
-break;
 case POWERPC_MMU_2_03: /* P5p */
 lpcr = val & (LPCR_RMLS | LPCR_ILE |
   LPCR_LPES0 | LPCR_LPES1 |
@@ -1154,6 +1127,7 @@ void ppc_store_lpcr(PowerPCCPU *cpu, target_ulong val)
 }
 break;
 default:
+g_assert_not_reached();
 ;
 }
 env->spr[SPR_LPCR] = lpcr;
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index a0d0eaabf2..d7d4f012b8 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -7895,25 +7895,20 @@ static void spr_write_lpcr(DisasContext *ctx, int sprn, 
int gprn)
 {
 gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
 }
-
-static void spr_write_970_hid4(DisasContext *ctx, int sprn, int gprn)
-{
-#if defined(TARGET_PPC64)
-spr_write_generic(ctx, sprn, gprn);
-gen_helper_store_lpcr(cpu_env, cpu_gpr[gprn]);
-#endif
-}
-
 #endif /* !defined(CONFIG_USER_ONLY) */
 
 static void gen_spr_970_lpar(CPUPPCState *env)
 {
 #if !defined(CONFIG_USER_ONLY)
 /* Logical partitionning */
-/* PPC970: HID4 is effectively the LPCR */
+/* PPC970: HID4 covers things later controlled by the LPCR and
+ * RMOR in later CPUs, but with a different encoding.  We only
+ * support the 970 in "Apple mode" which has all hypervisor
+ * facilities disabled by strapping, so we can basically just
+ * ignore it */
 spr_register(env, SPR_970_HID4, "HID4",
  SPR_NOACCESS, SPR_NOACCESS,
- &spr_read_generic, &spr_write_970_hid4,
+ &spr_read_generic, &spr_write_generic,
  0x);
 #endif
 }
-- 
2.24.1




[PATCH v3 01/12] ppc: Remove stub support for 32-bit hypervisor mode

2020-02-18 Thread David Gibson
Commit a4f30719a8cd, way back in 2007, noted that "PowerPC hypervisor mode is
not fundamentally available only for PowerPC 64" and added a 32-bit version
of the MSR[HV] bit.

But nothing was ever really done with that; there is no meaningful support
for 32-bit hypervisor mode 13 years later.  Let's stop pretending and just
remove the stubs.

Signed-off-by: David Gibson 
---
 target/ppc/cpu.h| 21 +++--
 target/ppc/translate_init.inc.c |  6 +++---
 2 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index b283042515..8077fdb068 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -24,8 +24,6 @@
 #include "exec/cpu-defs.h"
 #include "cpu-qom.h"
 
-/* #define PPC_EMULATE_32BITS_HYPV */
-
 #define TCG_GUEST_DEFAULT_MO 0
 
 #define TARGET_PAGE_BITS_64K 16
@@ -300,13 +298,12 @@ typedef struct ppc_v3_pate_t {
 #define MSR_SF   63 /* Sixty-four-bit modehflags */
 #define MSR_TAG  62 /* Tag-active mode (POWERx ?)*/
 #define MSR_ISF  61 /* Sixty-four-bit interrupt mode on 630  */
-#define MSR_SHV  60 /* hypervisor state   hflags */
+#define MSR_HV   60 /* hypervisor state   hflags */
 #define MSR_TS0  34 /* Transactional state, 2 bits (Book3s)  */
 #define MSR_TS1  33
 #define MSR_TM   32 /* Transactional Memory Available (Book3s)   */
 #define MSR_CM   31 /* Computation mode for BookE hflags */
 #define MSR_ICM  30 /* Interrupt computation mode for BookE  */
-#define MSR_THV  29 /* hypervisor state for 32 bits PowerPC   hflags */
 #define MSR_GS   28 /* guest state for BookE */
 #define MSR_UCLE 26 /* User-mode cache lock enable for BookE */
 #define MSR_VR   25 /* altivec availablex hflags */
@@ -401,10 +398,13 @@ typedef struct ppc_v3_pate_t {
 
 #define msr_sf   ((env->msr >> MSR_SF)   & 1)
 #define msr_isf  ((env->msr >> MSR_ISF)  & 1)
-#define msr_shv  ((env->msr >> MSR_SHV)  & 1)
+#if defined(TARGET_PPC64)
+#define msr_hv   ((env->msr >> MSR_HV)   & 1)
+#else
+#define msr_hv   (0)
+#endif
 #define msr_cm   ((env->msr >> MSR_CM)   & 1)
 #define msr_icm  ((env->msr >> MSR_ICM)  & 1)
-#define msr_thv  ((env->msr >> MSR_THV)  & 1)
 #define msr_gs   ((env->msr >> MSR_GS)   & 1)
 #define msr_ucle ((env->msr >> MSR_UCLE) & 1)
 #define msr_vr   ((env->msr >> MSR_VR)   & 1)
@@ -449,16 +449,9 @@ typedef struct ppc_v3_pate_t {
 
 /* Hypervisor bit is more specific */
 #if defined(TARGET_PPC64)
-#define MSR_HVB (1ULL << MSR_SHV)
-#define msr_hv  msr_shv
-#else
-#if defined(PPC_EMULATE_32BITS_HYPV)
-#define MSR_HVB (1ULL << MSR_THV)
-#define msr_hv  msr_thv
+#define MSR_HVB (1ULL << MSR_HV)
 #else
 #define MSR_HVB (0ULL)
-#define msr_hv  (0)
-#endif
 #endif
 
 /* DSISR */
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 53995f62ea..a0d0eaabf2 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8804,7 +8804,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_PM_ISA206;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
@@ -9017,7 +9017,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
@@ -9228,7 +9228,7 @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
 pcc->msr_mask = (1ull << MSR_SF) |
-(1ull << MSR_SHV) |
+(1ull << MSR_HV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
-- 
2.24.1




[PATCH v3 03/12] target/ppc: Correct handling of real mode accesses with vhyp on hash MMU

2020-02-18 Thread David Gibson
On ppc we have the concept of virtual hypervisor ("vhyp") mode, where we
only model the non-hypervisor-privileged parts of the cpu.  Essentially we
model the hypervisor's behaviour from the point of view of a guest OS, but
we don't model the hypervisor's execution.

In particular, in this mode, qemu's notion of target physical address is
a guest physical address from the vcpu's point of view.  So accesses in
guest real mode don't require translation.  If we were modelling the
hypervisor mode, we'd need to translate the guest physical address into
a host physical address.

Currently, we handle this sloppily: we rely on setting up the virtual LPCR
and RMOR registers so that GPAs are simply HPAs plus an offset, which we
set to zero.  This is already conceptually dubious, since the LPCR and RMOR
registers don't exist in the non-hypervisor portion of the CPU.  It gets
worse with POWER9, where RMOR and LPCR[VPM0] no longer exist at all.

Clean this up by explicitly handling the vhyp case.  While we're there,
remove some unnecessary nesting of if statements that made the logic to
select the correct real mode behaviour a bit less clear than it could be.

Signed-off-by: David Gibson 
Reviewed-by: Cédric Le Goater 
---
 target/ppc/mmu-hash64.c | 60 -
 1 file changed, 35 insertions(+), 25 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index a881876647..5fabd93c92 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -789,27 +789,30 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
  */
 raddr = eaddr & 0x0FFFULL;
 
-/* In HV mode, add HRMOR if top EA bit is clear */
-if (msr_hv || !env->has_hv_mode) {
+if (cpu->vhyp) {
+/*
+ * In virtual hypervisor mode, there's nothing to do:
+ *   EA == GPA == qemu guest address
+ */
+} else if (msr_hv || !env->has_hv_mode) {
+/* In HV mode, add HRMOR if top EA bit is clear */
 if (!(eaddr >> 63)) {
 raddr |= env->spr[SPR_HRMOR];
 }
-} else {
-/* Otherwise, check VPM for RMA vs VRMA */
-if (env->spr[SPR_LPCR] & LPCR_VPM0) {
-slb = &env->vrma_slb;
-if (slb->sps) {
-goto skip_slb_search;
-}
-/* Not much else to do here */
+} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+/* Emulated VRMA mode */
+slb = &env->vrma_slb;
+if (!slb->sps) {
+/* Invalid VRMA setup, machine check */
 cs->exception_index = POWERPC_EXCP_MCHECK;
 env->error_code = 0;
 return 1;
-} else if (raddr < env->rmls) {
-/* RMA. Check bounds in RMLS */
-raddr |= env->spr[SPR_RMOR];
-} else {
-/* The access failed, generate the approriate interrupt */
+}
+
+goto skip_slb_search;
+} else {
+/* Emulated old-style RMO mode, bounds check against RMLS */
+if (raddr >= env->rmls) {
 if (rwx == 2) {
 ppc_hash64_set_isi(cs, SRR1_PROTFAULT);
 } else {
@@ -821,6 +824,8 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
eaddr,
 }
 return 1;
 }
+
+raddr |= env->spr[SPR_RMOR];
 }
 tlb_set_page(cs, eaddr & TARGET_PAGE_MASK, raddr & TARGET_PAGE_MASK,
  PAGE_READ | PAGE_WRITE | PAGE_EXEC, mmu_idx,
@@ -953,22 +958,27 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 /* In real mode the top 4 effective address bits are ignored */
 raddr = addr & 0x0FFFULL;
 
-/* In HV mode, add HRMOR if top EA bit is clear */
-if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
+if (cpu->vhyp) {
+/*
+ * In virtual hypervisor mode, there's nothing to do:
+ *   EA == GPA == qemu guest address
+ */
+return raddr;
+} else if ((msr_hv || !env->has_hv_mode) && !(addr >> 63)) {
+/* In HV mode, add HRMOR if top EA bit is clear */
 return raddr | env->spr[SPR_HRMOR];
-}
-
-/* Otherwise, check VPM for RMA vs VRMA */
-if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+} else if (env->spr[SPR_LPCR] & LPCR_VPM0) {
+/* Emulated VRMA mode */
slb = &env->vrma_slb;
 if (!slb->sps) {
 return -1;
 }
-} else if (raddr < env->rmls) {
-/* RMA. Check bounds in RMLS */
-return raddr | env->spr[SPR_RMOR];
 } else {
-return -1;
+/* Emulated old-style RMO mode, bounds check against RMLS */
+if (raddr >= env->rmls) {
+ 

Re: [PATCH v4 2/4] target/riscv: configure and turn on vector extension from command line

2020-02-18 Thread LIU Zhiwei

Hi, Alistair

On 2020/2/19 6:34, Alistair Francis wrote:

On Mon, Feb 10, 2020 at 12:12 AM LIU Zhiwei  wrote:

Vector extension is default on only for "any" cpu. It can be turned
on by command line "-cpu rv64,v=true,vlen=128,elen=64,vext_spec=v0.7.1".

vlen is the vector register length, default value is 128 bit.
elen is the max operator size in bits, default value is 64 bit.
vext_spec is the vector specification version, default value is v0.7.1.
These properties and the CPU type can be given other values on the command line.

Signed-off-by: LIU Zhiwei 

This looks fine to me. Shouldn't this be the last patch though?

Yes, it should be the last patch.

As in
once the vector extension has been added to QEMU you can turn it on
from the command line. Right now this turns it on but it isn't
implemented.

Maybe I should just add the fields to the RISCVCPU structure, and not
enable the vector extension or add the configuration properties until the
implementation is ready.


It's still a little awkward, as reviewers will not be able to test the
patch until the last patch.


Alistair


---
  target/riscv/cpu.c | 48 --
  target/riscv/cpu.h |  8 
  2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8c86ebc109..95fdb6261e 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -98,6 +98,11 @@ static void set_priv_version(CPURISCVState *env, int 
priv_ver)
  env->priv_ver = priv_ver;
  }

+static void set_vext_version(CPURISCVState *env, int vext_ver)
+{
+env->vext_ver = vext_ver;
+}
+
  static void set_feature(CPURISCVState *env, int feature)
  {
  env->features |= (1ULL << feature);
@@ -113,7 +118,7 @@ static void set_resetvec(CPURISCVState *env, int resetvec)
  static void riscv_any_cpu_init(Object *obj)
  {
  CPURISCVState *env = _CPU(obj)->env;
-set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
+set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU | RVV);
  set_priv_version(env, PRIV_VERSION_1_11_0);
  set_resetvec(env, DEFAULT_RSTVEC);
  }
@@ -320,6 +325,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
**errp)
  CPURISCVState *env = &cpu->env;
  RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
  int priv_version = PRIV_VERSION_1_11_0;
+int vext_version = VEXT_VERSION_0_07_1;
  target_ulong target_misa = 0;
  Error *local_err = NULL;

@@ -343,8 +349,18 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
**errp)
  return;
  }
  }
-
+if (cpu->cfg.vext_spec) {
+if (!g_strcmp0(cpu->cfg.vext_spec, "v0.7.1")) {
+vext_version = VEXT_VERSION_0_07_1;
+} else {
+error_setg(errp,
+   "Unsupported vector spec version '%s'",
+   cpu->cfg.vext_spec);
+return;
+}
+}
  set_priv_version(env, priv_version);
+set_vext_version(env, vext_version);
  set_resetvec(env, DEFAULT_RSTVEC);

  if (cpu->cfg.mmu) {
@@ -409,6 +425,30 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
**errp)
  if (cpu->cfg.ext_u) {
  target_misa |= RVU;
  }
+if (cpu->cfg.ext_v) {
+target_misa |= RVV;
+if (!is_power_of_2(cpu->cfg.vlen)) {
+error_setg(errp,
+   "Vector extension VLEN must be power of 2");
+return;
+}
+if (cpu->cfg.vlen > RV_VLEN_MAX || cpu->cfg.vlen < 128) {
+error_setg(errp,
+   "Vector extension implementation only supports VLEN "
+   "in the range [128, %d]", RV_VLEN_MAX);
+return;
+}
+if (!is_power_of_2(cpu->cfg.elen)) {
+error_setg(errp,
+   "Vector extension ELEN must be power of 2");
+return;
+}
+if (cpu->cfg.elen > 64) {
+error_setg(errp,
+   "Vector extension ELEN must <= 64");
+return;
+}
+}

  set_misa(env, RVXLEN | target_misa);
  }
@@ -444,10 +484,14 @@ static Property riscv_cpu_properties[] = {
  DEFINE_PROP_BOOL("c", RISCVCPU, cfg.ext_c, true),
  DEFINE_PROP_BOOL("s", RISCVCPU, cfg.ext_s, true),
  DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true),
+DEFINE_PROP_BOOL("v", RISCVCPU, cfg.ext_v, false),
  DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
  DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
  DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true),
  DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec),
+DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec),
+DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
+DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),
  DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true),
  

Re: [PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2020-02-18 Thread David Gibson
On Tue, Feb 18, 2020 at 06:48:43AM +0100, Philippe Mathieu-Daudé wrote:
> On 2/17/20 11:46 PM, David Gibson wrote:
> > On Mon, Feb 17, 2020 at 11:24:11AM +0100, Philippe Mathieu-Daudé wrote:
> > > On 2/17/20 10:26 AM, Philippe Mathieu-Daudé wrote:
> > > > Hi Alexey,
> > > > 
> > > > On 2/17/20 3:12 AM, Alexey Kardashevskiy wrote:
> > > > > The following changes since commit
> > > > > 05943fb4ca41f626078014c0327781815c6584c5:
> > > > > 
> > > > >     ppc: free 'fdt' after reset the machine (2020-02-17 11:27:23 
> > > > > +1100)
> > > > > 
> > > > > are available in the Git repository at:
> > > > > 
> > > > >     g...@github.com:aik/qemu.git tags/qemu-slof-20200217
> > > > > 
> > > > > for you to fetch changes up to 
> > > > > ea9a03e5aa023c5391bab5259898475d0298aac2:
> > > > > 
> > > > >     pseries: Update SLOF firmware image (2020-02-17 13:08:59 +1100)
> > > > > 
> > > > > 
> > > > > Alexey Kardashevskiy (1):
> > > > >     pseries: Update SLOF firmware image
> > > > > 
> > > > >    pc-bios/README   |   2 +-
> > > > >    pc-bios/slof.bin | Bin 931032 -> 968560 bytes
> > > > >    roms/SLOF    |   2 +-
> > > > >    3 files changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > I only received the cover, not the patch, have you posted it?
> > > 
> > > OK I see the SLOF binary is almost 1MB. Maybe this got blocked by spam
> > > filter. FYI you can use 'git-format-patch --no-binary' to emit the patch
> > > with the commit description but without the content.
> > 
> > Generally Alexey sends SLOF updates to me just as pull requests
> > without patches in full, because a huge slab of base64 encoded
> > firmware isn't particularly illuminating.
> 
> I understand; this is why I later suggested that Alexey use 'git-format-patch
> --no-binary': Laszlo uses it for the EDK2 submodule, and it allows the
> change to be reviewed quickly on the list (without posting the base64), see:

Hm.  What's to review?  The only change apart from the binary blob and
the submodule tag is the version numbers in pc-bios/README.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH] spapr: Rework hash<->radix transitions at CAS

2020-02-18 Thread David Gibson
On Fri, Feb 14, 2020 at 07:19:00PM +0100, Greg Kurz wrote:
> On Fri, 14 Feb 2020 09:28:35 +1100
> David Gibson  wrote:
> 
> > On Thu, Feb 13, 2020 at 04:38:38PM +0100, Greg Kurz wrote:
> > > Until the CAS negotiation is over, an HPT can be allocated on three
> > > different paths:
> > > 
> > > 1) during machine reset if the host doesn't support radix,
> > > 
> > > 2) during CAS if the guest wants hash and doesn't support HPT resizing,
> > >in which case we pre-emptively resize the HPT to accommodate maxram,
> > > 
> > > 3) during CAS if no CAS reboot was requested, the guest wants hash but
> > >we're currently configured for radix.
> > > 
> > > Depending on the various combinations of host or guest MMU support,
> > > HPT resizing guest support and the possibility of a CAS reboot, it
> > > is quite hard to know which of these allocates the HPT that will
> > > be ultimately used by the guest that wants to do hash. Also, some of
> > > them have bugs:
> > > 
> > > - 2) calls spapr_reallocate_hpt() instead of spapr_setup_hpt_and_vrma()
> > >   and thus doesn't update the VRMA size, even though we've just extended
> > >   the HPT. Not sure what issues this can cause,
> > > 
> > > - 3) doesn't check for HPT resizing support and will always allocate a
> > >   small HPT based on the initial RAM size. This caps the total amount of
> > >   RAM the guest can see, especially if maxram is much higher than the
> > >   initial ram.
> > > 
> > > We only support guests that do CAS and we already assume that the HPT
> > > isn't being used when we do the pre-emptive resizing at CAS. It thus
> > > seems reasonable to only allocate the HPT at the end of CAS, when no
> > > CAS reboot was requested.
> > > 
> > > Consolidate the logic so that we only create the HPT during 3), ie.
> > > when we're done with the CAS reboot cycles, and ensure HPT resizing
> > > is taken into account. This fixes the radix->hash transition for
> > > all cases.
> > 
> > Uh.. I'm pretty sure this can't work for KVM on a POWER8 host.  We
> > need the HPT at all times there, or there's nowhere to put VRMA
> > entries, so we can't run even in real mode.
> > 
> 
> Well it happens to be working anyway because KVM automatically
> creates an HPT (default size 16MB) in kvmppc_hv_setup_htab_rma()
> if QEMU didn't do so already... Would a comment to emphasize this
> be enough or do you prefer I don't drop the HPT allocation currently
> performed at machine reset ?

Relying on the automatic allocation is not a good idea.  With host
kernels before HPT resizing, once that automatic allocation happens,
we can't change the HPT size *at all*, even with a reset or CAS.

So, yes, the current code is annoyingly complex, but it's that way for
a reason.
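For scale, the sizing behind this can be sketched: QEMU aims for an HPT of roughly 1/128 of RAM with a 256 KiB (2^18) floor. The ratio and floor are assumptions of this sketch; the real spapr_hpt_shift_for_ramsize() also clamps to architected limits:

```c
#include <assert.h>
#include <stdint.h>

/* Rough sketch of the HPT sizing heuristic: pick a hash table of about
 * 1/128 the size of RAM, never smaller than 256 KiB (shift 18).  The
 * ratio and floor are assumptions for illustration only. */
static int hpt_shift_for_ramsize(uint64_t ramsize)
{
    uint64_t target = ramsize / 128;
    int shift = 18;

    while ((1ULL << shift) < target) {
        shift++;
    }
    return shift;
}
```

Sized from a 1 GiB initial RAM this gives shift 23 (an 8 MiB table); sized from a 64 GiB maxram it gives shift 29 (512 MiB) — which is why allocating from the initial RAM size caps what the guest can later see.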

> > > The guest can theoretically call CAS several times, without a CAS
> > > reboot in between. Linux guests don't do that, but better safe than
> > > sorry, let's ensure we can also handle the symmetrical hash->radix
> > > transition correctly: free the HPT and set the GR bit in PATE.
> > > A helper is introduced for the latter, since this is already what
> > > we do during machine reset when going for radix.
> > > 
> > > As a bonus, this removes one user of spapr->cas_reboot, which we
> > > want to get rid of in the future.
> > > 
> > > Signed-off-by: Greg Kurz 
> > > ---
> > >  hw/ppc/spapr.c |   25 +++-
> > >  hw/ppc/spapr_hcall.c   |   59 
> > > 
> > >  include/hw/ppc/spapr.h |1 +
> > >  3 files changed, 44 insertions(+), 41 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 828e2cc1359a..88bc0e4e3ca1 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -1573,9 +1573,19 @@ void spapr_setup_hpt_and_vrma(SpaprMachineState 
> > > *spapr)
> > >  {
> > >  int hpt_shift;
> > >  
> > > +/*
> > > + * HPT resizing is a bit of a special case, because when enabled
> > > + * we assume an HPT guest will support it until it says it
> > > + * doesn't, instead of assuming it won't support it until it says
> > > + * it does.  Strictly speaking that approach could break for
> > > + * guests which don't make a CAS call, but those are so old we
> > > + * don't care about them.  Without that assumption we'd have to
> > > + * make at least a temporary allocation of an HPT sized for max
> > > + * memory, which could be impossibly difficult under KVM HV if
> > > + * maxram is large.
> > > + */
> > >  if ((spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED)
> > > -|| (spapr->cas_reboot
> > > -&& !spapr_ovec_test(spapr->ov5_cas, OV5_HPT_RESIZE))) {
> > > +|| !spapr_ovec_test(spapr->ov5_cas, OV5_HPT_RESIZE)) {
> > >  hpt_shift = 
> > > spapr_hpt_shift_for_ramsize(MACHINE(spapr)->maxram_size);
> > >  } else {
> > >  uint64_t current_ram_size;
> > > @@ -1604,6 +1614,12 @@ static int spapr_reset_drcs(Object *child, void 
> 

Re: [PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2020-02-18 Thread David Gibson
On Tue, Feb 18, 2020 at 01:59:44PM +0100, Cédric Le Goater wrote:
> On 2/18/20 1:48 PM, Cédric Le Goater wrote:
> > On 2/18/20 10:40 AM, Cédric Le Goater wrote:
> >> On 2/18/20 10:10 AM, Alexey Kardashevskiy wrote:
> >>>
> >>>
> >>> On 18/02/2020 20:05, Alexey Kardashevskiy wrote:
> 
> 
>  On 18/02/2020 18:12, Cédric Le Goater wrote:
> > On 2/18/20 1:30 AM, Alexey Kardashevskiy wrote:
> >>
> >>
> >> On 17/02/2020 20:48, Cédric Le Goater wrote:
> >>> On 2/17/20 3:12 AM, Alexey Kardashevskiy wrote:
>  The following changes since commit 
>  05943fb4ca41f626078014c0327781815c6584c5:
> 
>    ppc: free 'fdt' after reset the machine (2020-02-17 11:27:23 +1100)
> 
>  are available in the Git repository at:
> 
>    g...@github.com:aik/qemu.git tags/qemu-slof-20200217
> 
>  for you to fetch changes up to 
>  ea9a03e5aa023c5391bab5259898475d0298aac2:
> 
>    pseries: Update SLOF firmware image (2020-02-17 13:08:59 +1100)
> 
>  
>  Alexey Kardashevskiy (1):
>    pseries: Update SLOF firmware image
> 
>   pc-bios/README   |   2 +-
>   pc-bios/slof.bin | Bin 931032 -> 968560 bytes
>   roms/SLOF|   2 +-
>   3 files changed, 2 insertions(+), 2 deletions(-)
> 
> 
>  *** Note: this is not for master, this is for pseries
> 
> >>>
> >>> Hello Alexey,
> >>>
> >>> QEMU fails to boot from disk. See below.
> >>
> >>
> >> It does boot mine (fedora 30, ubuntu 18.04), see below. I believe I
> >> could have broken something but I need more detail. Thanks,
> >
> > fedora31 boots but not ubuntu 19.10. Could it be GRUB version 2.04 ? 
> 
> 
>  No, not that either:
> >>>
> >>>
> >>> but it might be because of power9 - I only tried power8, rsyncing the
> >>> image to a p9 machine now...
> >>
> >> Here is the disk : 
> >>
> >> Disk /dev/sda: 50 GiB, 53687091200 bytes, 104857600 sectors
> >> Disk model: QEMU HARDDISK   
> >> Units: sectors of 1 * 512 = 512 bytes
> >> Sector size (logical/physical): 512 bytes / 512 bytes
> >> I/O size (minimum/optimal): 512 bytes / 512 bytes
> >> Disklabel type: gpt
> >> Disk identifier: 27DCE458-231A-4981-9FF1-983F87C2902D
> >>
> >> Device Start   End   Sectors Size Type
> >> /dev/sda1   2048 16383 14336   7M PowerPC PReP boot
> >> /dev/sda2  16384 100679679 100663296  48G Linux filesystem
> >> /dev/sda3  100679680 104857566   4177887   2G Linux swap
> >>
> >>
> >> GPT ? 
> > 
> > For the failure, I bisected up to :
> > 
> > f12149908705 ("ext2: Read all 64bit of inode number")
> 
> Here is a possible fix for it. I did some RPN on my hp28s in the past,
> but I am not fluent in Forth.
> 
> "slash not found" is still there though. 

I've removed this SLOF update from my ppc-for-5.0 staging tree until
we figure out what's going wrong here.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v4 1/3] target/arm: Support SError injection

2020-02-18 Thread Gavin Shan

Hi Marc,

On 2/19/20 3:28 AM, Marc Zyngier wrote:

On 2020-02-18 02:04, Gavin Shan wrote:

This adds support for SError injection, which will be used by the "virt"
board to simulate the behavior of NMI injection in the next patch. As Peter
Maydell suggested, this adds a new interrupt (ARM_CPU_SERROR), which is
parallel to CPU_INTERRUPT_HARD. The backend depends on whether KVM is
enabled. kvm_vcpu_ioctl(cpu, KVM_SET_VCPU_EVENTS) is leveraged to inject
SError or data abort to the guest. When TCG is enabled, the behavior is
simulated by injecting SError and data abort to guest.


s/and/or/ (you can't inject both at the same time).



Absolutely, it will be corrected in v5, which I will hold for now. I hope to
receive comments from Peter and Richard before doing another respin :)



Signed-off-by: Gavin Shan 
---
 target/arm/cpu.c  | 69 +++
 target/arm/cpu.h  | 20 -
 target/arm/helper.c   | 12 
 target/arm/m_helper.c |  8 +
 target/arm/machine.c  |  3 +-
 5 files changed, 91 insertions(+), 21 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index de733aceeb..e5750080bc 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -78,7 +78,7 @@ static bool arm_cpu_has_work(CPUState *cs)
 && cs->interrupt_request &
 (CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD
  | CPU_INTERRUPT_VFIQ | CPU_INTERRUPT_VIRQ
- | CPU_INTERRUPT_EXITTB);
+ | CPU_INTERRUPT_SERROR | CPU_INTERRUPT_EXITTB);
 }

 void arm_register_pre_el_change_hook(ARMCPU *cpu, ARMELChangeHookFn *hook,
@@ -449,6 +449,9 @@ static inline bool arm_excp_unmasked(CPUState *cs,
unsigned int excp_idx,
 return false;
 }
 return !(env->daif & PSTATE_I);
+    case EXCP_SERROR:
+   pstate_unmasked = !(env->daif & PSTATE_A);
+   break;


nit: Consider keeping the physical interrupts together, as they are closely
related.



Sorry, I didn't get the point. Maybe you're suggesting something like below?
If yes, I'm not sure if it's necessary.

pstate_unmasked = !(env->daif & (PSTATE_A | PSTATE_I));

I think PSTATE_A is enough to mask out SError, according to the ARMv8
Architecture Reference Manual (D1.7), as below:

   A, I, F Asynchronous exception mask bits:
   A
  SError interrupt mask bit.
   I
  IRQ interrupt mask bit.
   F
  FIQ interrupt mask bit.
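That reading reduces to a tiny predicate per exception class. The PSTATE bit positions used here (A=8, I=7, F=6) are taken from QEMU's definitions and are an assumption of this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Asynchronous exception masking per ARMv8 ARM D1.7: each class is
 * masked by exactly one PSTATE bit.  Bit positions are assumed to
 * match QEMU's PSTATE_{A,I,F} definitions. */
#define PSTATE_F (1U << 6)
#define PSTATE_I (1U << 7)
#define PSTATE_A (1U << 8)

static bool serror_unmasked(uint32_t daif) { return !(daif & PSTATE_A); }
static bool irq_unmasked(uint32_t daif)    { return !(daif & PSTATE_I); }
static bool fiq_unmasked(uint32_t daif)    { return !(daif & PSTATE_F); }
```

In particular, masking IRQs alone leaves SError deliverable, so testing PSTATE_A by itself matches the manual.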


 default:
 g_assert_not_reached();
 }
@@ -538,6 +541,15 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int
interrupt_request)

 /* The prioritization of interrupts is IMPLEMENTATION DEFINED. */

+    if (interrupt_request & CPU_INTERRUPT_SERROR) {
+    excp_idx = EXCP_SERROR;
+    target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
+    if (arm_excp_unmasked(cs, excp_idx, target_el,
+  cur_el, secure, hcr_el2)) {
+    goto found;
+    }
+    }
+
 if (interrupt_request & CPU_INTERRUPT_FIQ) {
 excp_idx = EXCP_FIQ;
 target_el = arm_phys_excp_target_el(cs, excp_idx, cur_el, secure);
@@ -570,6 +582,7 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int
interrupt_request)
 goto found;
 }
 }
+
 return false;

  found:
@@ -585,7 +598,7 @@ static bool arm_v7m_cpu_exec_interrupt(CPUState
*cs, int interrupt_request)
 CPUClass *cc = CPU_GET_CLASS(cs);
 ARMCPU *cpu = ARM_CPU(cs);
CPUARMState *env = &cpu->env;
-    bool ret = false;
+    uint32_t excp_idx;

 /* ARMv7-M interrupt masking works differently than -A or -R.
  * There is no FIQ/IRQ distinction. Instead of I and F bits
@@ -594,13 +607,26 @@ static bool arm_v7m_cpu_exec_interrupt(CPUState
*cs, int interrupt_request)
  * (which depends on state like BASEPRI, FAULTMASK and the
  * currently active exception).
  */
-    if (interrupt_request & CPU_INTERRUPT_HARD
-    && (armv7m_nvic_can_take_pending_exception(env->nvic))) {
-    cs->exception_index = EXCP_IRQ;
-    cc->do_interrupt(cs);
-    ret = true;
+    if (!armv7m_nvic_can_take_pending_exception(env->nvic)) {
+    return false;
+    }
+
+    if (interrupt_request & CPU_INTERRUPT_SERROR) {
+    excp_idx = EXCP_SERROR;
+    goto found;
+    }
+
+    if (interrupt_request & CPU_INTERRUPT_HARD) {
+    excp_idx = EXCP_IRQ;
+    goto found;
 }
-    return ret;
+
+    return false;
+
+found:
+    cs->exception_index = excp_idx;
+    cc->do_interrupt(cs);
+    return true;
 }
 #endif

@@ -656,7 +682,8 @@ static void arm_cpu_set_irq(void *opaque, int irq,
int level)
 [ARM_CPU_IRQ] = CPU_INTERRUPT_HARD,
 [ARM_CPU_FIQ] = CPU_INTERRUPT_FIQ,
 [ARM_CPU_VIRQ] = CPU_INTERRUPT_VIRQ,
-    [ARM_CPU_VFIQ] = CPU_INTERRUPT_VFIQ
+    [ARM_CPU_VFIQ] = CPU_INTERRUPT_VFIQ,
+    [ARM_CPU_SERROR] = CPU_INTERRUPT_SERROR,
 };

 if (level) {
@@ -676,6 +703,7 @@ static void arm_cpu_set_irq(void *opaque, int irq,
int level)
 break;
 case ARM_CPU_IRQ:
 case 

Re: [PATCH v2] Avoid address_space_rw() with a constant is_write argument

2020-02-18 Thread David Gibson
On Tue, Feb 18, 2020 at 11:24:57AM +, Peter Maydell wrote:
> The address_space_rw() function allows either reads or writes
> depending on the is_write argument passed to it; this is useful
> when the direction of the access is determined programmatically
> (as for instance when handling the KVM_EXIT_MMIO exit reason).
> Under the hood it just calls either address_space_write() or
> address_space_read_full().
> 
> We also use it a lot with a constant is_write argument, though,
> which has two issues:
>  * when reading "address_space_rw(..., 1)" this is less
>immediately clear to the reader as being a write than
>"address_space_write(...)"
>  * calling address_space_rw() bypasses the optimization
>in address_space_read() that fast-paths reads of a
>fixed length
> 
> This commit was produced with the included Coccinelle script
> scripts/coccinelle/as-rw-const.patch.
> 
> Two lines in hw/net/dp8393x.c that Coccinelle produced that
> were over 80 characters were re-wrapped by hand.
> 
> Signed-off-by: Peter Maydell 

ppc parts

Acked-by: David Gibson 

> ---
> I could break this down into separate patches by submaintainer,
> but the patch is not that large and I would argue that it's
> better for the project if we can try to avoid introducing too
> much friction into the process of doing 'safe' tree-wide
> minor refactorings.
> 
> v1->v2: put the coccinelle script in scripts/coccinelle rather
> than just in the commit message.
> ---
>  accel/kvm/kvm-all.c  |  6 +--
>  dma-helpers.c|  4 +-
>  exec.c   |  4 +-
>  hw/dma/xlnx-zdma.c   | 11 ++---
>  hw/net/dp8393x.c | 68 ++--
>  hw/net/i82596.c  | 25 +-
>  hw/net/lasi_i82596.c |  5 +-
>  hw/ppc/pnv_lpc.c |  8 ++--
>  hw/s390x/css.c   | 12 ++---
>  qtest.c  | 52 ++---
>  target/i386/hvf/x86_mmu.c| 12 ++---
>  scripts/coccinelle/as_rw_const.cocci | 30 
>  12 files changed, 133 insertions(+), 104 deletions(-)
>  create mode 100644 scripts/coccinelle/as_rw_const.cocci
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index c111312dfdd..0cfe6fd8ded 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2178,9 +2178,9 @@ void kvm_flush_coalesced_mmio_buffer(void)
>  ent = &ring->coalesced_mmio[ring->first];
>  
>  if (ent->pio == 1) {
> -address_space_rw(&address_space_io, ent->phys_addr,
> - MEMTXATTRS_UNSPECIFIED, ent->data,
> - ent->len, true);
> +address_space_write(&address_space_io, ent->phys_addr,
> +MEMTXATTRS_UNSPECIFIED, ent->data,
> +ent->len);
>  } else {
>  cpu_physical_memory_write(ent->phys_addr, ent->data, 
> ent->len);
>  }
> diff --git a/dma-helpers.c b/dma-helpers.c
> index d3871dc61ea..e8a26e81e16 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -28,8 +28,8 @@ int dma_memory_set(AddressSpace *as, dma_addr_t addr, 
> uint8_t c, dma_addr_t len)
>  memset(fillbuf, c, FILLBUF_SIZE);
>  while (len > 0) {
>  l = len < FILLBUF_SIZE ? len : FILLBUF_SIZE;
> -error |= address_space_rw(as, addr, MEMTXATTRS_UNSPECIFIED,
> -  fillbuf, l, true);
> +error |= address_space_write(as, addr, MEMTXATTRS_UNSPECIFIED,
> + fillbuf, l);
>  len -= l;
>  addr += l;
>  }
> diff --git a/exec.c b/exec.c
> index 8e9cc3b47cf..baefe582393 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -3810,8 +3810,8 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> addr,
>  address_space_write_rom(cpu->cpu_ases[asidx].as, phys_addr,
>  attrs, buf, l);
>  } else {
> -address_space_rw(cpu->cpu_ases[asidx].as, phys_addr,
> - attrs, buf, l, 0);
> +address_space_read(cpu->cpu_ases[asidx].as, phys_addr, attrs, 
> buf,
> +   l);
>  }
>  len -= l;
>  buf += l;
> diff --git a/hw/dma/xlnx-zdma.c b/hw/dma/xlnx-zdma.c
> index 8fb83f5b078..31936061e21 100644
> --- a/hw/dma/xlnx-zdma.c
> +++ b/hw/dma/xlnx-zdma.c
> @@ -311,8 +311,7 @@ static bool zdma_load_descriptor(XlnxZDMA *s, uint64_t 
> addr, void *buf)
>  return false;
>  }
>  
> -address_space_rw(s->dma_as, addr, s->attr,
> - buf, sizeof(XlnxZDMADescr), false);
> +address_space_read(s->dma_as, addr, s->attr, buf, sizeof(XlnxZDMADescr));
>  return true;
>  }
>  
> @@ -364,7 +363,7 @@ static uint64_t zdma_update_descr_addr(XlnxZDMA *s, bool 
> type,
>  } else {
>  addr = zdma_get_regaddr64(s, 

Re: [PATCH 2/3] hw/ppc/virtex_ml507:fix leak of device tree blob

2020-02-18 Thread David Gibson
On Tue, Feb 18, 2020 at 05:11:53PM +0800, kuhn.chen...@huawei.com wrote:
> From: Chen Qun 
> 
> The device tree blob returned by load_device_tree is malloced.
> We should free it after cpu_physical_memory_write().
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 

I've applied this patch to my ppc-for-5.0 staging tree.

> ---
>  hw/ppc/virtex_ml507.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ppc/virtex_ml507.c b/hw/ppc/virtex_ml507.c
> index 91dd00ee91..4eef70069f 100644
> --- a/hw/ppc/virtex_ml507.c
> +++ b/hw/ppc/virtex_ml507.c
> @@ -188,6 +188,7 @@ static int xilinx_load_device_tree(hwaddr addr,
>  if (r < 0)
>  fprintf(stderr, "couldn't set /chosen/bootargs\n");
>  cpu_physical_memory_write(addr, fdt, fdt_size);
> +g_free(fdt);
>  return fdt_size;
>  }
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH 00/22] linux-user: generate syscall_nr.sh

2020-02-18 Thread Alistair Francis
On Mon, Feb 17, 2020 at 2:36 PM Laurent Vivier  wrote:
>
> This series copies the files syscall.tbl from linux v5.5 and generates
> the file syscall_nr.h from them.
>
> This is done for all the QEMU targets that have a syscall.tbl
> in the linux source tree: mips, mips64, i386, x86_64, sparc, s390x,
> ppc, arm, microblaze, sh4, xtensa, m68k, hppa and alpha.
>
> tilegx and cris are deprecated in Linux (tilegx has no maintainer in QEMU)
>
> aarch64, nios2, openrisc and riscv have no syscall.tbl in linux.

What's the plan with these other architectures?

RISC-V uses asm-generic, is there some way to generate syscall_nr.h from that?

Alistair

>
> It seems there is a bug in QEMU that forces us to manually disable arch_prctl
> with the i386 target: do_arch_prctl() is only defined with TARGET_ABI32 but
> TARGET_ABI32 is never defined with TARGET_I386 (nor TARGET_X86_64).
>
> I have also removed all syscalls in s390x/syscall_nr.h defined for
> !defined(TARGET_S390X).
>
> I have added a script to copy all these files from linux and updated
> them at the end of the series with their latest version for today.
>
> The two last patches manage the special case for mips O32 that needs
> to know the number of arguments. We find them in strace sources.
>
> Laurent Vivier (22):
>   linux-user: introduce parameters to generate syscall_nr.h
>   linux-user,alpha: add syscall table generation support
>   linux-user,hppa: add syscall table generation support
>   linux-user,m68k: add syscall table generation support
>   linux-user,xtensa: add syscall table generation support
>   linux-user,sh4: add syscall table generation support
>   linux-user,microblaze: add syscall table generation support
>   linux-user,arm: add syscall table generation support
>   linux-user,ppc: split syscall_nr.h
>   linux-user,ppc: add syscall table generation support
>   linux-user,s390x: remove syscall definitions for !TARGET_S390X
>   linux-user,s390x: add syscall table generation support
>   linux-user,sparc,sparc64: add syscall table generation support
>   linux-user,i386: add syscall table generation support
>   linux-user,x86_64: add syscall table generation support
>   linux-user,mips: add syscall table generation support
>   linux-user,mips64: split syscall_nr.h
>   linux-user,mips64: add syscall table generation support
>   linux-user,scripts: add a script to update syscall.tbl
>   linux-user: update syscall.tbl from linux 0bf999f9c5e7
>   linux-user,mips: move content of mips_syscall_args
>   linux-user,mips: update syscall-args-o32.c.inc
>
>  MAINTAINERS|   1 +
>  Makefile.target|   3 +-
>  configure  |  23 +
>  linux-user/Makefile.objs   |  19 +-
>  linux-user/alpha/Makefile.objs |   5 +
>  linux-user/alpha/syscall.tbl   | 479 
>  linux-user/alpha/syscall_nr.h  | 492 -
>  linux-user/alpha/syscallhdr.sh |  32 ++
>  linux-user/arm/Makefile.objs   |   8 +
>  linux-user/arm/syscall.tbl | 453 
>  linux-user/arm/syscall_nr.h| 447 ---
>  linux-user/arm/syscallhdr.sh   |  31 ++
>  linux-user/hppa/Makefile.objs  |   5 +
>  linux-user/hppa/syscall.tbl| 437 +++
>  linux-user/hppa/syscall_nr.h   | 358 
>  linux-user/hppa/syscallhdr.sh  |  32 ++
>  linux-user/i386/Makefile.objs  |   5 +
>  linux-user/i386/syscall_32.tbl | 444 +++
>  linux-user/i386/syscall_nr.h   | 387 -
>  linux-user/i386/syscallhdr.sh  |  28 +
>  linux-user/m68k/Makefile.objs  |   5 +
>  linux-user/m68k/syscall.tbl| 439 +++
>  linux-user/m68k/syscall_nr.h   | 434 ---
>  linux-user/m68k/syscallhdr.sh  |  32 ++
>  linux-user/microblaze/Makefile.objs|   5 +
>  linux-user/microblaze/syscall.tbl  | 445 +++
>  linux-user/microblaze/syscall_nr.h | 442 ---
>  linux-user/microblaze/syscallhdr.sh|  32 ++
>  linux-user/mips/Makefile.objs  |   5 +
>  linux-user/mips/cpu_loop.c | 440 +--
>  linux-user/mips/syscall-args-o32.c.inc | 436 +++
>  linux-user/mips/syscall_nr.h   | 425 ---
>  linux-user/mips/syscall_o32.tbl| 427 +++
>  linux-user/mips/syscallhdr.sh  |  36 ++
>  linux-user/mips64/Makefile.objs|   9 +
>  linux-user/mips64/syscall_n32.tbl  | 378 +
>  linux-user/mips64/syscall_n64.tbl  | 354 
>  linux-user/mips64/syscall_nr.h | 719 +
>  linux-user/mips64/syscallhdr.sh|  33 ++
>  linux-user/ppc/Makefile.objs   |   9 +
>  linux-user/ppc/signal.c|   2 +-
>  linux-user/ppc/syscall.tbl | 521 ++
>  linux-user/ppc/syscall_nr.h| 394 

Re: [PATCH v4 2/4] target/riscv: configure and turn on vector extension from command line

2020-02-18 Thread Alistair Francis
On Mon, Feb 10, 2020 at 12:12 AM LIU Zhiwei  wrote:
>
> Vector extension is default on only for "any" cpu. It can be turned
> on by command line "-cpu rv64,v=true,vlen=128,elen=64,vext_spec=v0.7.1".
>
> vlen is the vector register length, default value is 128 bit.
> elen is the max operator size in bits, default value is 64 bit.
> vext_spec is the vector specification version, default value is v0.7.1.
> These properties and the cpu can be specified with other values.
>
> Signed-off-by: LIU Zhiwei 

This looks fine to me. Shouldn't this be the last patch though? As in
once the vector extension has been added to QEMU you can turn it on
from the command line. Right now this turns it on but it isn't
implemented.

Alistair

> ---
>  target/riscv/cpu.c | 48 --
>  target/riscv/cpu.h |  8 
>  2 files changed, 54 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8c86ebc109..95fdb6261e 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -98,6 +98,11 @@ static void set_priv_version(CPURISCVState *env, int 
> priv_ver)
>  env->priv_ver = priv_ver;
>  }
>
> +static void set_vext_version(CPURISCVState *env, int vext_ver)
> +{
> +env->vext_ver = vext_ver;
> +}
> +
>  static void set_feature(CPURISCVState *env, int feature)
>  {
>  env->features |= (1ULL << feature);
> @@ -113,7 +118,7 @@ static void set_resetvec(CPURISCVState *env, int resetvec)
>  static void riscv_any_cpu_init(Object *obj)
>  {
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
> -set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
> +set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU | RVV);
>  set_priv_version(env, PRIV_VERSION_1_11_0);
>  set_resetvec(env, DEFAULT_RSTVEC);
>  }
> @@ -320,6 +325,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> **errp)
>  CPURISCVState *env = &cpu->env;
>  RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
>  int priv_version = PRIV_VERSION_1_11_0;
> +int vext_version = VEXT_VERSION_0_07_1;
>  target_ulong target_misa = 0;
>  Error *local_err = NULL;
>
> @@ -343,8 +349,18 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> **errp)
>  return;
>  }
>  }
> -
> +if (cpu->cfg.vext_spec) {
> +if (!g_strcmp0(cpu->cfg.vext_spec, "v0.7.1")) {
> +vext_version = VEXT_VERSION_0_07_1;
> +} else {
> +error_setg(errp,
> +   "Unsupported vector spec version '%s'",
> +   cpu->cfg.vext_spec);
> +return;
> +}
> +}
>  set_priv_version(env, priv_version);
> +set_vext_version(env, vext_version);
>  set_resetvec(env, DEFAULT_RSTVEC);
>
>  if (cpu->cfg.mmu) {
> @@ -409,6 +425,30 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> **errp)
>  if (cpu->cfg.ext_u) {
>  target_misa |= RVU;
>  }
> +if (cpu->cfg.ext_v) {
> +target_misa |= RVV;
> +if (!is_power_of_2(cpu->cfg.vlen)) {
> +error_setg(errp,
> +   "Vector extension VLEN must be power of 2");
> +return;
> +}
> +if (cpu->cfg.vlen > RV_VLEN_MAX || cpu->cfg.vlen < 128) {
> +error_setg(errp,
> +   "Vector extension implementation only supports VLEN "
> +   "in the range [128, %d]", RV_VLEN_MAX);
> +return;
> +}
> +if (!is_power_of_2(cpu->cfg.elen)) {
> +error_setg(errp,
> +   "Vector extension ELEN must be power of 2");
> +return;
> +}
> +if (cpu->cfg.elen > 64) {
> +error_setg(errp,
> +   "Vector extension ELEN must <= 64");
> +return;
> +}
> +}
>
>  set_misa(env, RVXLEN | target_misa);
>  }
> @@ -444,10 +484,14 @@ static Property riscv_cpu_properties[] = {
>  DEFINE_PROP_BOOL("c", RISCVCPU, cfg.ext_c, true),
>  DEFINE_PROP_BOOL("s", RISCVCPU, cfg.ext_s, true),
>  DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true),
> +DEFINE_PROP_BOOL("v", RISCVCPU, cfg.ext_v, false),
>  DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
>  DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
>  DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true),
>  DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec),
> +DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec),
> +DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
> +DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),
>  DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true),
>  DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true),
>  DEFINE_PROP_END_OF_LIST(),
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 07e63016a7..bf2b4b55af 100644
> 

Re: [PATCH v2] Avoid address_space_rw() with a constant is_write argument

2020-02-18 Thread Alistair Francis
On Tue, Feb 18, 2020 at 3:25 AM Peter Maydell  wrote:
>
> The address_space_rw() function allows either reads or writes
> depending on the is_write argument passed to it; this is useful
> when the direction of the access is determined programmatically
> (as for instance when handling the KVM_EXIT_MMIO exit reason).
> Under the hood it just calls either address_space_write() or
> address_space_read_full().
>
> We also use it a lot with a constant is_write argument, though,
> which has two issues:
>  * when reading "address_space_rw(..., 1)" this is less
>immediately clear to the reader as being a write than
>"address_space_write(...)"
>  * calling address_space_rw() bypasses the optimization
>in address_space_read() that fast-paths reads of a
>fixed length
>
> This commit was produced with the included Coccinelle script
> scripts/coccinelle/as-rw-const.patch.
>
> Two lines in hw/net/dp8393x.c that Coccinelle produced that
> were over 80 characters were re-wrapped by hand.
>
> Signed-off-by: Peter Maydell 

Reviewed-by: Alistair Francis 

Alistair

> ---
> I could break this down into separate patches by submaintainer,
> but the patch is not that large and I would argue that it's
> better for the project if we can try to avoid introducing too
> much friction into the process of doing 'safe' tree-wide
> minor refactorings.
>
> v1->v2: put the coccinelle script in scripts/coccinelle rather
> than just in the commit message.
> ---
>  accel/kvm/kvm-all.c  |  6 +--
>  dma-helpers.c|  4 +-
>  exec.c   |  4 +-
>  hw/dma/xlnx-zdma.c   | 11 ++---
>  hw/net/dp8393x.c | 68 ++--
>  hw/net/i82596.c  | 25 +-
>  hw/net/lasi_i82596.c |  5 +-
>  hw/ppc/pnv_lpc.c |  8 ++--
>  hw/s390x/css.c   | 12 ++---
>  qtest.c  | 52 ++---
>  target/i386/hvf/x86_mmu.c| 12 ++---
>  scripts/coccinelle/as_rw_const.cocci | 30 
>  12 files changed, 133 insertions(+), 104 deletions(-)
>  create mode 100644 scripts/coccinelle/as_rw_const.cocci
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index c111312dfdd..0cfe6fd8ded 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2178,9 +2178,9 @@ void kvm_flush_coalesced_mmio_buffer(void)
>  ent = &ring->coalesced_mmio[ring->first];
>
>  if (ent->pio == 1) {
> -address_space_rw(&address_space_io, ent->phys_addr,
> - MEMTXATTRS_UNSPECIFIED, ent->data,
> - ent->len, true);
> +address_space_write(&address_space_io, ent->phys_addr,
> +MEMTXATTRS_UNSPECIFIED, ent->data,
> +ent->len);
>  } else {
>  cpu_physical_memory_write(ent->phys_addr, ent->data, 
> ent->len);
>  }
> diff --git a/dma-helpers.c b/dma-helpers.c
> index d3871dc61ea..e8a26e81e16 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -28,8 +28,8 @@ int dma_memory_set(AddressSpace *as, dma_addr_t addr, 
> uint8_t c, dma_addr_t len)
>  memset(fillbuf, c, FILLBUF_SIZE);
>  while (len > 0) {
>  l = len < FILLBUF_SIZE ? len : FILLBUF_SIZE;
> -error |= address_space_rw(as, addr, MEMTXATTRS_UNSPECIFIED,
> -  fillbuf, l, true);
> +error |= address_space_write(as, addr, MEMTXATTRS_UNSPECIFIED,
> + fillbuf, l);
>  len -= l;
>  addr += l;
>  }
> diff --git a/exec.c b/exec.c
> index 8e9cc3b47cf..baefe582393 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -3810,8 +3810,8 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong 
> addr,
>  address_space_write_rom(cpu->cpu_ases[asidx].as, phys_addr,
>  attrs, buf, l);
>  } else {
> -address_space_rw(cpu->cpu_ases[asidx].as, phys_addr,
> - attrs, buf, l, 0);
> +address_space_read(cpu->cpu_ases[asidx].as, phys_addr, attrs, 
> buf,
> +   l);
>  }
>  len -= l;
>  buf += l;
> diff --git a/hw/dma/xlnx-zdma.c b/hw/dma/xlnx-zdma.c
> index 8fb83f5b078..31936061e21 100644
> --- a/hw/dma/xlnx-zdma.c
> +++ b/hw/dma/xlnx-zdma.c
> @@ -311,8 +311,7 @@ static bool zdma_load_descriptor(XlnxZDMA *s, uint64_t 
> addr, void *buf)
>  return false;
>  }
>
> -address_space_rw(s->dma_as, addr, s->attr,
> - buf, sizeof(XlnxZDMADescr), false);
> +address_space_read(s->dma_as, addr, s->attr, buf, sizeof(XlnxZDMADescr));
>  return true;
>  }
>
> @@ -364,7 +363,7 @@ static uint64_t zdma_update_descr_addr(XlnxZDMA *s, bool 
> type,
>  } else {
>  addr = zdma_get_regaddr64(s, basereg);
>  

Re: [PATCH v3 0/3] arm: allwinner: Wire up USB ports

2020-02-18 Thread Niek Linnenbank
Hi Guenter, Philippe,

On Tue, Feb 18, 2020 at 7:39 AM Philippe Mathieu-Daudé 
wrote:

> Cc'ing Niek.
>
> On 2/17/20 9:48 PM, Guenter Roeck wrote:
> > Instantiate EHCI and OHCI controllers on Allwinner A10.
> >
> > The first patch in the series moves the declaration of OHCISysBusState
> > from hcd-ohci.c to hcd-ohci.h. This lets us add the structure to
> > AwA10State. Similarly, TYPE_SYSBUS_OHCI is moved to be able to use it
> > outside its driver.
> >
> > The second patch introduces the ehci-sysbus property "companion-enable".
> > This lets us use object_property_set_bool() to enable companion mode.
> >
> > The third patch instantiates EHCI and OHCI ports for Allwinner-A10
> > and marks the OHCI ports as companions of the respective EHCI ports.
> >
> > Tested by attaching various high speed and full speed devices, and by
> > booting from USB drive.
> >
> > v3: Rebased to master
> > v2: Add summary
> >  Rewrite to instantiate OHCI in companion mode; add patch 2/3
> >  Merge EHCI and OHCI instantiation into a single patch
> >
> > 
> > Guenter Roeck (3):
> >hw: usb: hcd-ohci: Move OHCISysBusState and TYPE_SYSBUS_OHCI to
> include file
> >hcd-ehci: Introduce "companion-enable" sysbus property
> >arm: allwinner: Wire up USB ports
> >
> >   hw/arm/allwinner-a10.c | 43
> ++
> >   hw/usb/hcd-ehci-sysbus.c   |  2 ++
> >   hw/usb/hcd-ohci.c  | 15 ---
> >   hw/usb/hcd-ohci.h  | 16 
> >   include/hw/arm/allwinner-a10.h |  6 ++
> >   5 files changed, 67 insertions(+), 15 deletions(-)
> >
>
>
Thanks for contributing this! I was able to test & verify it on my local
machine using the latest QEMU master and Linux 5.5.0.
I just had to add the -usb flag to the QEMU command and re-compile Linux
with CONFIG_USB_STORAGE.

Output with buildroot on a USB mass storage disk as rootfs:

++ ./arm-softmmu/qemu-system-arm -M cubieboard -kernel
$HOME/cubie/linux.git/arch/arm/boot/zImage -nographic -append
'console=ttyS0,115200 earlyprintk debug rootwait root=/dev/sda ro
init=/sbin/init' -dtb
$HOME/cubie/linux.git/arch/arm/boot/dts/sun4i-a10-cubieboard.dtb -m 512 -s
-usb -drive
if=none,id=stick,file=$HOME/cubie/buildroot-2019.11/output/images/rootfs.ext2
-device usb-storage,bus=usb-bus.1,drive=stick -nic user
[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 5.5.0-rc3 (me@host) (gcc version 7.4.0
(Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1)) #6 SMP Tue Feb 18 23:21:36 CET 2020
[0.00] CPU: ARMv7 Processor [410fc080] revision 0 (ARMv7),
cr=10c5387d
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing
instruction cache
[0.00] OF: fdt: Machine model: Cubietech Cubieboard
[0.00] Memory policy: Data cache writeback
...
[4.559154] random: fast init done
[5.481107] scsi 1:0:0:0: Direct-Access QEMU QEMU HARDDISK
 2.5+ PQ: 0 ANSI: 5
[5.493282] sd 1:0:0:0: Power-on or device reset occurred
[5.513539] sd 1:0:0:0: [sda] 122880 512-byte logical blocks: (62.9
MB/60.0 MiB)
[5.521970] sd 1:0:0:0: [sda] Write Protect is off
[5.522683] sd 1:0:0:0: [sda] Mode Sense: 63 00 00 08
[5.524552] sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled,
doesn't support DPO or FUA
[5.613064] sd 1:0:0:0: [sda] Attached SCSI disk
[5.681764] EXT4-fs (sda): INFO: recovery required on readonly filesystem
[5.682530] EXT4-fs (sda): write access will be enabled during recovery
...
[6.129348] EXT4-fs (sda): re-mounted. Opts: (null)
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Initializing random number generator: OK
Saving random seed: [7.205617] random: dd: uninitialized urandom read
(512 bytes read)
OK
Starting network: OK

Welcome to Cubieboard2!
Cubieboard2 login:


Thanks!

Tested-by: Niek Linnenbank 

Regards,
Niek


-- 
Niek Linnenbank


Re: [PATCH v2 fixed 07/16] exec: Drop "shared" parameter from ram_block_add()

2020-02-18 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:45PM +0100, David Hildenbrand wrote:
> Properly store it in the flags of the ram block instead (and the flag
> even already exists and is used).
> 
> E.g., qemu_ram_is_shared() now properly succeeds on all ram blocks that are
> actually shared.
> 
> Reviewed-by: Igor Kotrasinski 
> Reviewed-by: Richard Henderson 
> Cc: Richard Henderson 
> Cc: Paolo Bonzini 
> Cc: Igor Mammedov 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v2 fixed 06/16] exec: Reuse qemu_ram_apply_settings() in qemu_ram_remap()

2020-02-18 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:44PM +0100, David Hildenbrand wrote:
> I don't see why we shouldn't apply all settings to make it look like the
> surrounding RAM (and enable proper VMA merging).
> 
> Note: memory backend settings might have overridden these settings. We
> would need a callback to let the memory backend fix that up.
> 
> Reviewed-by: Richard Henderson 
> Cc: Richard Henderson 
> Cc: Paolo Bonzini 
> Cc: Igor Mammedov 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v2 fixed 05/16] exec: Factor out setting ram settings (madvise ...) into qemu_ram_apply_settings()

2020-02-18 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:43PM +0100, David Hildenbrand wrote:
> Factor all settings out into qemu_ram_apply_settings().
> 
> For memory_try_enable_merging(), the important bit is that it won't be
> called with XEN - which is now still the case as new_block->host will
> remain NULL.
> 
> Reviewed-by: Richard Henderson 
> Cc: Richard Henderson 
> Cc: Paolo Bonzini 
> Cc: Igor Mammedov 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v2 fixed 04/16] util: vfio-helpers: Factor out removal from qemu_vfio_undo_mapping()

2020-02-18 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:42PM +0100, David Hildenbrand wrote:
> Factor it out and properly use it where applicable. Make
> qemu_vfio_undo_mapping() look like qemu_vfio_do_mapping(), passing the
> size and iova, not the mapping.
> 
> Cc: Richard Henderson 
> Cc: Paolo Bonzini 
> Cc: Eduardo Habkost 
> Cc: Marcel Apfelbaum 
> Cc: Alex Williamson 
> Cc: Stefan Hajnoczi 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v2 fixed 03/16] util: vfio-helpers: Remove Error parameter from qemu_vfio_undo_mapping()

2020-02-18 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:41PM +0100, David Hildenbrand wrote:
> Everybody discards the error. Let's error_report() instead so this error
> doesn't get lost.
> 
> Cc: Richard Henderson 
> Cc: Paolo Bonzini 
> Cc: Eduardo Habkost 
> Cc: Marcel Apfelbaum 
> Cc: Alex Williamson 
> Cc: Stefan Hajnoczi 
> Signed-off-by: David Hildenbrand 

IMHO error_setg() should be preferred over error_report(),
because it carries a context to be delivered to the caller, so the error
has a better chance of being used in a better way (e.g., QMP only
supports error_setg()).

A better solution is to deliver the error further up.  For example,
qemu_vfio_dma_map() is one caller of qemu_vfio_undo_mapping(); if you
look at the callers of qemu_vfio_dma_map() you'll notice most of them have
an Error** defined (e.g., nvme_init_queue).  Then we can link all of them
up.

Another lazy solution (and especially if vfio-helpers are still mostly
used only by advanced users) is that we can simply pass in &error_abort for
the three callers, so the errors won't be missed...

Thanks,

> ---
>  util/vfio-helpers.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
> index d6332522c1..13dd962d95 100644
> --- a/util/vfio-helpers.c
> +++ b/util/vfio-helpers.c
> @@ -540,8 +540,7 @@ static int qemu_vfio_do_mapping(QEMUVFIOState *s, void 
> *host, size_t size,
>  /**
>   * Undo the DMA mapping from @s with VFIO, and remove from mapping list.
>   */
> -static void qemu_vfio_undo_mapping(QEMUVFIOState *s, IOVAMapping *mapping,
> -   Error **errp)
> +static void qemu_vfio_undo_mapping(QEMUVFIOState *s, IOVAMapping *mapping)
>  {
>  int index;
>  struct vfio_iommu_type1_dma_unmap unmap = {
> @@ -556,7 +555,7 @@ static void qemu_vfio_undo_mapping(QEMUVFIOState *s, 
> IOVAMapping *mapping,
>  assert(QEMU_IS_ALIGNED(mapping->size, qemu_real_host_page_size));
>  assert(index >= 0 && index < s->nr_mappings);
>  if (ioctl(s->container, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
> -error_setg(errp, "VFIO_UNMAP_DMA failed: %d", -errno);
> +error_report("VFIO_UNMAP_DMA failed: %d", -errno);
>  }
>  memmove(mapping, &s->mappings[index + 1],
>  sizeof(s->mappings[0]) * (s->nr_mappings - index - 1));
> @@ -621,7 +620,7 @@ int qemu_vfio_dma_map(QEMUVFIOState *s, void *host, 
> size_t size,
>  assert(qemu_vfio_verify_mappings(s));
>  ret = qemu_vfio_do_mapping(s, host, size, iova0);
>  if (ret) {
> -qemu_vfio_undo_mapping(s, mapping, NULL);
> +qemu_vfio_undo_mapping(s, mapping);
>  goto out;
>  }
>  s->low_water_mark += size;
> @@ -681,7 +680,7 @@ void qemu_vfio_dma_unmap(QEMUVFIOState *s, void *host)
>  if (!m) {
>  goto out;
>  }
> -qemu_vfio_undo_mapping(s, m, NULL);
> +qemu_vfio_undo_mapping(s, m);
>  out:
>  qemu_mutex_unlock(&s->lock);
>  }
> @@ -698,7 +697,7 @@ void qemu_vfio_close(QEMUVFIOState *s)
>  return;
>  }
>  while (s->nr_mappings) {
> -qemu_vfio_undo_mapping(s, &s->mappings[s->nr_mappings - 1], NULL);
> +qemu_vfio_undo_mapping(s, &s->mappings[s->nr_mappings - 1]);
>  }
>  ram_block_notifier_remove(>ram_notifier);
>  qemu_vfio_reset(s);
> -- 
> 2.24.1
> 
> 

-- 
Peter Xu




Re: [PATCH v2 fixed 02/16] util: vfio-helpers: Fix qemu_vfio_close()

2020-02-18 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:40PM +0100, David Hildenbrand wrote:
> qemu_vfio_undo_mapping() will decrement the number of mappings and
> reshuffle the array elements to fit into the reduced size.
> 
> Iterating over all elements like this does not work as expected, let's make
> sure to remove all mappings properly.
> 
> Cc: Richard Henderson 
> Cc: Paolo Bonzini 
> Cc: Eduardo Habkost 
> Cc: Marcel Apfelbaum 
> Cc: Alex Williamson 
> Cc: Stefan Hajnoczi 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v2 fixed 01/16] util: vfio-helpers: Factor out and fix processing of existing ram blocks

2020-02-18 Thread Peter Xu
On Wed, Feb 12, 2020 at 02:42:39PM +0100, David Hildenbrand wrote:
> Factor it out into common code when a new notifier is registered, just
> as done with the memory region notifier. This allows us to have the
> logic about how to process existing ram blocks at a central place (which
> will be extended soon).
> 
> Just like when adding a new ram block, we have to register the max_length
> for now. We don't have a way to get notified about resizes yet, and some
> memory would not be mapped when growing the ram block.
> 
> Note: Currently, ram blocks are only "fake resized". All memory
> (max_length) is accessible.
> 
> We can get rid of a bunch of functions in stubs/ram-block.c . Print the
> warning from inside qemu_vfio_ram_block_added().
> 
> Cc: Richard Henderson 
> Cc: Paolo Bonzini 
> Cc: Eduardo Habkost 
> Cc: Marcel Apfelbaum 
> Cc: Alex Williamson 
> Cc: Stefan Hajnoczi 
> Signed-off-by: David Hildenbrand 
> ---
>  exec.c|  5 +
>  hw/core/numa.c| 14 ++
>  include/exec/cpu-common.h |  1 +
>  stubs/ram-block.c | 20 
>  util/vfio-helpers.c   | 28 +++-
>  5 files changed, 27 insertions(+), 41 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 67e520d18e..05cfe868ab 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2017,6 +2017,11 @@ ram_addr_t qemu_ram_get_used_length(RAMBlock *rb)
>  return rb->used_length;
>  }
>  
> +ram_addr_t qemu_ram_get_max_length(RAMBlock *rb)
> +{
> +return rb->max_length;
> +}
> +
>  bool qemu_ram_is_shared(RAMBlock *rb)
>  {
>  return rb->flags & RAM_SHARED;
> diff --git a/hw/core/numa.c b/hw/core/numa.c
> index 0d1b4be76a..6599c69e05 100644
> --- a/hw/core/numa.c
> +++ b/hw/core/numa.c
> @@ -899,9 +899,23 @@ void query_numa_node_mem(NumaNodeMem node_mem[], 
> MachineState *ms)
>  }
>  }
>  
> +static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
> +{
> +const ram_addr_t max_size = qemu_ram_get_max_length(rb);
> +void *host = qemu_ram_get_host_addr(rb);
> +RAMBlockNotifier *notifier = opaque;
> +
> +if (host) {
> +notifier->ram_block_added(notifier, host, max_size);
> +}
> +return 0;
> +}
> +
>  void ram_block_notifier_add(RAMBlockNotifier *n)
>  {
>  QLIST_INSERT_HEAD(&ram_list.ramblock_notifiers, n, next);
> +/* Notify about all existing ram blocks. */
> +qemu_ram_foreach_block(ram_block_notify_add_single, n);
>  }
>  
>  void ram_block_notifier_remove(RAMBlockNotifier *n)
> diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
> index 81753bbb34..9760ac9068 100644
> --- a/include/exec/cpu-common.h
> +++ b/include/exec/cpu-common.h
> @@ -59,6 +59,7 @@ const char *qemu_ram_get_idstr(RAMBlock *rb);
>  void *qemu_ram_get_host_addr(RAMBlock *rb);
>  ram_addr_t qemu_ram_get_offset(RAMBlock *rb);
>  ram_addr_t qemu_ram_get_used_length(RAMBlock *rb);
> +ram_addr_t qemu_ram_get_max_length(RAMBlock *rb);
>  bool qemu_ram_is_shared(RAMBlock *rb);
>  bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
>  void qemu_ram_set_uf_zeroable(RAMBlock *rb);
> diff --git a/stubs/ram-block.c b/stubs/ram-block.c
> index 73c0a3ee08..10855b52dd 100644
> --- a/stubs/ram-block.c
> +++ b/stubs/ram-block.c
> @@ -2,21 +2,6 @@
>  #include "exec/ramlist.h"
>  #include "exec/cpu-common.h"
>  
> -void *qemu_ram_get_host_addr(RAMBlock *rb)
> -{
> -return 0;
> -}
> -
> -ram_addr_t qemu_ram_get_offset(RAMBlock *rb)
> -{
> -return 0;
> -}
> -
> -ram_addr_t qemu_ram_get_used_length(RAMBlock *rb)
> -{
> -return 0;
> -}

Maybe put into another patch?

Actually I'm thinking whether it would be worth doing...  They're still
declared in include/exec/cpu-common.h, so logically whoever includes the
header but links against the stubs can still call these functions.  So
keeping them there still makes sense to me.

> -
>  void ram_block_notifier_add(RAMBlockNotifier *n)
>  {
>  }
> @@ -24,8 +9,3 @@ void ram_block_notifier_add(RAMBlockNotifier *n)
>  void ram_block_notifier_remove(RAMBlockNotifier *n)
>  {
>  }
> -
> -int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
> -{
> -return 0;
> -}
> diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
> index 813f7ec564..71e02e7f35 100644
> --- a/util/vfio-helpers.c
> +++ b/util/vfio-helpers.c
> @@ -376,8 +376,13 @@ static void qemu_vfio_ram_block_added(RAMBlockNotifier 
> *n,
>void *host, size_t size)
>  {
>  QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
> +int ret;
> +
>  trace_qemu_vfio_ram_block_added(s, host, size);
> -qemu_vfio_dma_map(s, host, size, false, NULL);
> +ret = qemu_vfio_dma_map(s, host, size, false, NULL);
> +if (ret) {
> +error_report("qemu_vfio_dma_map(%p, %zu) failed: %d", host, size, 
> ret);
> +}

Irrelevant change (another patch)?

>  }
>  
>  static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n,
> @@ -390,33 +395,14 @@ static void 
> 

Re: [PATCH v4 00/20] Add Allwinner H3 SoC and Orange Pi PC Machine

2020-02-18 Thread Niek Linnenbank
Hi Peter & Philippe,

On Tue, Feb 18, 2020 at 11:05 AM Peter Maydell 
wrote:

> On Tue, 18 Feb 2020 at 06:46, Philippe Mathieu-Daudé 
> wrote:
> > IIRC from the specs, cards are block devices and the only alignment
> > required is the size of a block (512KiB for your 4GiB card).
>
> Isn't there something related to erase blocks too, which impose
> a larger granularity than just a single block?
>
> Anyway, in general the backing image for an SD card device
> needs to be exactly the size of the SD card you're emulating,
> because QEMU needs somewhere it can write back the data
> if the guest decides to write to the last block on the card.
> So short-length images generally don't work (true for all
> block devices, not just SD cards, I think). This often bites users
> if they're using some distro "here's a disk/sd card image file"
> where the expected use with real hardware is "dd the image
> file onto the SD card".
>

Yes, the description you gave here is indeed the issue.
And unfortunately in this particular case, the distro did not give a very
understandable diagnostic error message.

Kind regards,
Niek


>
> thanks
> -- PMM
>


-- 
Niek Linnenbank


Re: [PATCH] memory: batch allocate ioeventfds[] in address_space_update_ioeventfds()

2020-02-18 Thread Peter Xu
On Tue, Feb 18, 2020 at 06:22:26PM +, Stefan Hajnoczi wrote:
> Reallocing the ioeventfds[] array each time an element is added is very
> expensive as the number of ioeventfds increases.  Batch allocate instead
> to amortize the cost of realloc.
> 
> This patch reduces Linux guest boot times from 362s to 140s when there
> are 2 virtio-blk devices with 1 virtqueue and 99 virtio-blk devices with
> 32 virtqueues.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  memory.c | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/memory.c b/memory.c
> index aeaa8dcc9e..2d6f931f8c 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -794,10 +794,18 @@ static void 
> address_space_update_ioeventfds(AddressSpace *as)
>  FlatView *view;
>  FlatRange *fr;
>  unsigned ioeventfd_nb = 0;
> -MemoryRegionIoeventfd *ioeventfds = NULL;
> +unsigned ioeventfd_max;
> +MemoryRegionIoeventfd *ioeventfds;
>  AddrRange tmp;
>  unsigned i;
>  
> +/*
> + * It is likely that the number of ioeventfds hasn't changed much, so use
> + * the previous size as the starting value.
> + */
> +ioeventfd_max = as->ioeventfd_nb;
> +ioeventfds = g_new(MemoryRegionIoeventfd, ioeventfd_max);

Would ioeventfd_max then be cached and never go down, only stay the same
or increase?  I'm not sure if that's a big problem, but considering the
commit message mentions 99 virtio-blk devices with 32 queues each, I'm
not sure... :)

I'm thinking maybe start with a relatively big number that is still
under control (e.g., 64), then...

> +
>  view = address_space_get_flatview(as);
>  FOR_EACH_FLAT_RANGE(fr, view) {
>  for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
> @@ -806,8 +814,11 @@ static void address_space_update_ioeventfds(AddressSpace 
> *as)
>   
> int128_make64(fr->offset_in_region)));
>  if (addrrange_intersects(fr->addr, tmp)) {
>  ++ioeventfd_nb;
> -ioeventfds = g_realloc(ioeventfds,
> -  ioeventfd_nb * 
> sizeof(*ioeventfds));
> +if (ioeventfd_nb > ioeventfd_max) {
> +ioeventfd_max += 64;

... do exponential increase here (max*=2) instead so still easy to
converge?

Thanks,

> +ioeventfds = g_realloc(ioeventfds,
> +ioeventfd_max * sizeof(*ioeventfds));
> +}
>  ioeventfds[ioeventfd_nb-1] = fr->mr->ioeventfds[i];
>  ioeventfds[ioeventfd_nb-1].addr = tmp;
>  }
> -- 
> 2.24.1
> 

-- 
Peter Xu




Re: [PATCH v12 Kernel 4/7] vfio iommu: Implementation of ioctl to for dirty pages tracking.

2020-02-18 Thread Alex Williamson
On Tue, 18 Feb 2020 11:28:53 +0530
Kirti Wankhede  wrote:

> 
> 
> >As I understand the above algorithm, we find a vfio_dma
> > overlapping the request and populate the bitmap for that range.  Then
> > we go back and put_user() for each byte that we touched.  We could
> > instead simply work on a one byte buffer as we enumerate the requested
> > range and do a put_user() ever time we reach the end of it and have bits
> > set. That would greatly simplify the above example.  But I would expect
> > that we're a) more likely to get asked for ranges covering a single
> > vfio_dma  
> 
>  QEMU asks for a single vfio_dma during each iteration.
> 
>  If we restrict this ABI to cover single vfio_dma only, then it
>  simplifies the logic here. That was my original suggestion. Should we
>  think about that again?  
> >>>
> >>> But we currently allow unmaps that overlap multiple vfio_dmas as long
> >>> as no vfio_dma is bisected, so I think that implies that an unmap while
> >>> asking for the dirty bitmap has even further restricted semantics.  I'm
> >>> also reluctant to design an ABI around what happens to be the current
> >>> QEMU implementation.
> >>>
> >>> If we take your example above, ranges {0x,0xa000} and
> >>> {0xa000,0x1} ({start,end}), I think you're working with the
> >>> following two bitmaps in this implementation:
> >>>
> >>> 0011 b
> >>> 0011b
> >>>
> >>> And we need to combine those into:
> >>>
> >>>  b
> >>>
> >>> Right?
> >>>
> >>> But it seems like that would be easier if the second bitmap was instead:
> >>>
> >>> 1100b
> >>>
> >>> Then we wouldn't need to worry about the entire bitmap being shifted by
> >>> the bit offset within the byte, which limits our fixes to the boundary
> >>> byte and allows us to use copy_to_user() directly for the bulk of the
> >>> copy.  So how do we get there?
> >>>
> >>> I think we start with allocating the vfio_dma bitmap to account for
> >>> this initial offset, so we calculate bitmap_base_iova as:
> >>> (iova & ~((PAGE_SIZE << 3) - 1))
> >>> We then use bitmap_base_iova in calculating which bits to set.
> >>>
> >>> The user needs to follow the same rules, and maybe this adds some value
> >>> to the user providing the bitmap size rather than the kernel
> >>> calculating it.  For example, if the user wanted the dirty bitmap for
> >>> the range {0xa000,0x1} above, they'd provide at least a 1 byte
> >>> bitmap, but we'd return bit #2 set to indicate 0xa000 is dirty.
> >>>
> >>> Effectively the user can ask for any iova range, but the buffer will be
> >>> filled relative to the zeroth bit of the bitmap following the above
> >>> bitmap_base_iova formula (and replacing PAGE_SIZE with the user
> >>> requested pgsize).  I'm tempted to make this explicit in the user
> >>> interface (ie. only allow bitmaps starting on aligned pages), but a
> >>> user is able to map and unmap single pages and we need to support
> >>> returning a dirty bitmap with an unmap, so I don't think we can do that.
> >>>  
> >>
> >> Sigh, finding adjacent vfio_dmas within the same byte seems simpler than
> >> this.  
> > 
> > How does KVM do this?  My intent was that if all of our bitmaps share
> > the same alignment then we can merge the intersection and continue to
> > use copy_to_user() on either side.  However, if QEMU doesn't do the
> > same, it doesn't really help us.  Is QEMU stuck with an implementation
> > of only retrieving dirty bits per MemoryRegionSection exactly because
> > of this issue and therefore we can rely on it in our implementation as
> > well?  Thanks,
> >   
> 
> QEMU syncs dirty_bitmap per MemoryRegionSection. Within a 
> MemoryRegionSection there could be multiple KVMSlots. QEMU queries 
> dirty_bitmap per KVMSlot and marks dirty pages for each KVMSlot.
> On kernel side, KVM_GET_DIRTY_LOG ioctl calls 
> kvm_get_dirty_log_protect(), where it uses copy_to_user() to copy bitmap 
> of that memSlot.
> vfio_dma is per MemoryRegionSection. We can rely on MemoryRegionSection 
> in our implementation. But to get the bitmap during unmap, we have to 
> take care of concatenating bitmaps.

So KVM does not worry about bitmap alignment because the interface is
based on slots, a dirty bitmap can only be retrieved for a single,
entire slot.  We need VFIO_IOMMU_UNMAP_DMA to maintain its support for
spanning multiple vfio_dmas, but maybe we have some leeway that we
don't need to support both multiple vfio_dmas and dirty bitmap at the
same time.  It seems like it would be a massive simplification if we
required an unmap with dirty bitmap to span exactly one vfio_dma,
right?  I don't see that we'd break any existing users with that, it's
unfortunate that we can't have the flexibility of the existing calling
convention, but I think there's good reason for it here.  Our separate
dirty bitmap log reporting would follow the same semantics.  I think
this all aligns with how the MemoryListener works in 

Re: [Qemu-devel] [PATCH] linux-user: Implement membarrier syscall

2020-02-18 Thread Laurent Vivier
Le 13/05/2019 à 11:02, Andreas Schwab a écrit :
> Signed-off-by: Andreas Schwab 
> ---
>  linux-user/syscall.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index f5ff6f5dc8..80399f4eb0 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -313,6 +313,9 @@ _syscall3(int, getrandom, void *, buf, size_t, buflen, 
> unsigned int, flags)
>  _syscall5(int, kcmp, pid_t, pid1, pid_t, pid2, int, type,
>unsigned long, idx1, unsigned long, idx2)
>  #endif
> +#if defined(TARGET_NR_membarrier) && defined(__NR_membarrier)
> +_syscall2(int, membarrier, int, cmd, int, flags)
> +#endif
>  
>  static bitmask_transtbl fcntl_flags_tbl[] = {
>{ TARGET_O_ACCMODE,   TARGET_O_WRONLY,O_ACCMODE,   O_WRONLY,},
> @@ -11620,6 +11623,10 @@ static abi_long do_syscall1(void *cpu_env, int num, 
> abi_long arg1,
>  /* PowerPC specific.  */
>  return do_swapcontext(cpu_env, arg1, arg2, arg3);
>  #endif
> +#if defined TARGET_NR_membarrier && defined __NR_membarrier
> +case TARGET_NR_membarrier:
> +return get_errno(membarrier(arg1, arg2));
> +#endif
>  
>  default:
>  qemu_log_mask(LOG_UNIMP, "Unsupported syscall: %d\n", num);
> 

Applied to my linux-user branch.

Thanks,
Laurent



Re: [Qemu-devel] [PATCH] linux-user: implement getsockopt SO_RCVTIMEO and SO_SNDTIMEO

2020-02-18 Thread Laurent Vivier
Le 13/05/2019 à 11:06, Andreas Schwab a écrit :
> Signed-off-by: Andreas Schwab 
> ---
>  linux-user/syscall.c | 36 ++--
>  1 file changed, 34 insertions(+), 2 deletions(-)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index d113a65831..ba5775a94e 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -2171,10 +2171,42 @@ static abi_long do_getsockopt(int sockfd, int level, 
> int optname,
>  level = SOL_SOCKET;
>  switch (optname) {
>  /* These don't just return a single integer */
> -case TARGET_SO_RCVTIMEO:
> -case TARGET_SO_SNDTIMEO:
>  case TARGET_SO_PEERNAME:
>  goto unimplemented;
> +case TARGET_SO_RCVTIMEO: {
> +struct timeval tv;
> +socklen_t tvlen;
> +
> +optname = SO_RCVTIMEO;
> +
> +get_timeout:
> +if (get_user_u32(len, optlen)) {
> +return -TARGET_EFAULT;
> +}
> +if (len < 0) {
> +return -TARGET_EINVAL;
> +}
> +
> +tvlen = sizeof(tv);
> +ret = get_errno(getsockopt(sockfd, level, optname,
> +   &tv, &tvlen));
> +if (ret < 0) {
> +return ret;
> +}
> +if (len > sizeof(struct target_timeval)) {
> +len = sizeof(struct target_timeval);
> +}
> +if (copy_to_user_timeval(optval_addr, &tv)) {
> +return -TARGET_EFAULT;
> +}
> +if (put_user_u32(len, optlen)) {
> +return -TARGET_EFAULT;
> +}
> +break;
> +}
> +case TARGET_SO_SNDTIMEO:
> +optname = SO_SNDTIMEO;
> +goto get_timeout;
>  case TARGET_SO_PEERCRED: {
>  struct ucred cr;
>  socklen_t crlen;
> 

Applied to my linux-user branch.

Thanks,
Laurent




[PATCH] linux-user: fix socket() strace

2020-02-18 Thread Laurent Vivier
print_socket_type() doesn't manage flags and the correct type cannot
be displayed

Signed-off-by: Laurent Vivier 
---
 linux-user/strace.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 4f7130b2ff63..bdfc5177555e 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -444,7 +444,7 @@ print_socket_domain(int domain)
 static void
 print_socket_type(int type)
 {
-switch (type) {
+switch (type & TARGET_SOCK_TYPE_MASK) {
 case TARGET_SOCK_DGRAM:
 qemu_log("SOCK_DGRAM");
 break;
@@ -464,6 +464,12 @@ print_socket_type(int type)
 qemu_log("SOCK_PACKET");
 break;
 }
+if (type & TARGET_SOCK_CLOEXEC) {
+qemu_log("|SOCK_CLOEXEC");
+}
+if (type & TARGET_SOCK_NONBLOCK) {
+qemu_log("|SOCK_NONBLOCK");
+}
 }
 
 static void
-- 
2.24.1




Re: [PATCH v2 4/4] linux-user: Add support for FDGETFDCSTAT ioctl

2020-02-18 Thread Laurent Vivier
Le 24/01/2020 à 16:47, Aleksandar Markovic a écrit :
> From: Aleksandar Markovic 
> 
> FDGETFDCSTAT's third argument is a pointer to the structure:
> 
> struct floppy_fdc_state {
> int spec1;
> int spec2;
> int dtr;
> unsigned char version;
> unsigned char dor;
> unsigned long address;
> unsigned int rawcmd:2;
> unsigned int reset:1;
> unsigned int need_configure:1;
> unsigned int perp_mode:2;
> unsigned int has_fifo:1;
> unsigned int driver_version;
> unsigned char track[4];
> };
> 
> defined in Linux kernel header .
> 
> Since there is a field of the structure of type 'unsigned long', there is
> a need to define "target_floppy_fdc_state". Also, the five fields rawcmd,
> reset, need_configure, perp_mode, and has_fifo are all just bitfields and
> are part of a single 'unsigned int' field.
> 
> Signed-off-by: Aleksandar Markovic 
> ---
>  linux-user/ioctls.h|  2 ++
>  linux-user/syscall_defs.h  | 18 ++
>  linux-user/syscall_types.h | 12 
>  3 files changed, 32 insertions(+)
> 
> diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
> index adc07ad..b3bbe6a 100644
> --- a/linux-user/ioctls.h
> +++ b/linux-user/ioctls.h
> @@ -145,6 +145,8 @@
>   IOCTL(FDSETMAXERRS, IOC_W, MK_PTR(MK_STRUCT(STRUCT_floppy_max_errors)))
>   IOCTL(FDGETMAXERRS, IOC_R, MK_PTR(MK_STRUCT(STRUCT_floppy_max_errors)))
>   IOCTL(FDRESET, 0, TYPE_NULL)
> + IOCTL(FDGETFDCSTAT, IOC_R,
> +   MK_PTR(MK_STRUCT(STRUCT_target_floppy_fdc_state)))
>   IOCTL(FDRAWCMD, 0, TYPE_NULL)
>   IOCTL(FDTWADDLE, 0, TYPE_NULL)
>   IOCTL(FDEJECT, 0, TYPE_NULL)
> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
> index ae4c048..e08e5bc 100644
> --- a/linux-user/syscall_defs.h
> +++ b/linux-user/syscall_defs.h
> @@ -933,6 +933,23 @@ struct target_rtc_pll_info {
>  
>  /* From  */
>  
> +struct target_floppy_fdc_state {
> +int spec1;  /* spec1 value last used */
> +int spec2;  /* spec2 value last used */
> +int dtr;
> +unsigned char version;  /* FDC version code */
> +unsigned char dor;
> +abi_ulong address;  /* io address */
> +unsigned int rawcmd:2;
> +unsigned int reset:1;
> +unsigned int need_configure:1;
> +unsigned int perp_mode:2;
> +unsigned int has_fifo:1;
> +unsigned int driver_version;/* version code for floppy driver */
> +unsigned char track[4];
> +};
> +

use abi_int/abi_uint rather than "int/unsigned int".

Thanks,
Laurent



Re: [PATCH] Avoid cpu_physical_memory_rw() with a constant is_write argument

2020-02-18 Thread Peter Maydell
On Tue, 18 Feb 2020 at 20:07, Philippe Mathieu-Daudé  wrote:
> I don't understand well cpu_physical_memory*(). Aren't these obsolete?
> They confuse me when using multi-core CPUs.

They sort of are, but there is no simple mechanical replacement
for them -- you need to look at the individual use to see what
address space it should really be using. For instance the cases
in hw/dma/ probably would require the device to be updated to
the new pattern where it takes a MemoryRegion defining what
it should be doing DMA to, and then it can create an AddressSpace
and use that with address_space_*. But that's a bunch of work
on older devices which mostly people don't care very much about.

In theory we could do a textual replacement of cpu_physical_memory*
with address_space_rw(&address_space_memory, ...)
but that's usually not the right address space, so I'm not
sure that churn is worthwhile.

thanks
-- PMM



Re: [PATCH v2 3/4] linux-user: Add support for FIFREEZE and FITHAW ioctls

2020-02-18 Thread Laurent Vivier
Le 24/01/2020 à 16:47, Aleksandar Markovic a écrit :
> From: Aleksandar Markovic 
> 
> Both FIFREEZE and FITHAW ioctls accept an integer as their third
> argument.
> 
> All ioctls in this group (FI* ioctls) are guarded with "#ifdef", so the
> guards are used in this implementation too for consistency (however,
> many ioctls in the FI* group are old enough that their #ifdef guards
> could be removed, but this is outside the scope of this patch).

They have been added in v2.6.29

Could you add this information coming from the kernel commit adding them:

  o Freeze the filesystem
      int ioctl(int fd, int FIFREEZE, arg)
        fd: The file descriptor of the mountpoint
        FIFREEZE: request code for the freeze
        arg: Ignored
        Return value: 0 if the operation succeeds. Otherwise, -1

  o Unfreeze the filesystem
      int ioctl(int fd, int FITHAW, arg)
        fd: The file descriptor of the mountpoint
        FITHAW: request code for unfreeze
        arg: Ignored
        Return value: 0 if the operation succeeds. Otherwise, -1
        Error number: If the filesystem has already been unfrozen,
                      errno is set to EINVAL.

> Reviewed-by: Laurent Vivier 
> Signed-off-by: Aleksandar Markovic 
> ---
>  linux-user/ioctls.h   | 6 ++
>  linux-user/syscall_defs.h | 4 
>  2 files changed, 10 insertions(+)
> 
> diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
> index 944fbeb..adc07ad 100644
> --- a/linux-user/ioctls.h
> +++ b/linux-user/ioctls.h
> @@ -152,6 +152,12 @@
>  #ifdef FIBMAP
>   IOCTL(FIBMAP, IOC_W | IOC_R, MK_PTR(TYPE_LONG))
>  #endif
> +#ifdef FIFREEZE
> + IOCTL(FIFREEZE, IOC_W | IOC_R, TYPE_INT)
> +#endif
> +#ifdef FITHAW
> + IOCTL(FITHAW, IOC_W | IOC_R, TYPE_INT)
> +#endif
>  #ifdef FITRIM
>   IOCTL(FITRIM, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_fstrim_range)))
>  #endif
> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
> index 8761841..ae4c048 100644
> --- a/linux-user/syscall_defs.h
> +++ b/linux-user/syscall_defs.h
> @@ -950,7 +950,11 @@ struct target_rtc_pll_info {
>  #define TARGET_FIBMAP TARGET_IO(0x00,1)  /* bmap access */
>  #define TARGET_FIGETBSZ   TARGET_IO(0x00,2)  /* get the block size used for 
> bmap */
>  
> +#define TARGET_FIFREEZE   TARGET_IOWR('X', 119, int)/* Freeze */
> +#define TARGET_FITHAW TARGET_IOWR('X', 120, int)/* Thaw */
> +#ifdef FITRIM
>  #define TARGET_FITRIM TARGET_IOWR('X', 121, struct fstrim_range)
> +#endif

move "#ifdef FITRIM" to previous patch.

>  #define TARGET_FICLONETARGET_IOW(0x94, 9, int)
>  #define TARGET_FICLONERANGE TARGET_IOW(0x94, 13, struct file_clone_range)
>  
> 




[Bug 1863685] Re: ARM: HCR.TSW traps are not implemented

2020-02-18 Thread Julien Freche
Makes sense. Debugging is on me then :) Both patches behave as expected,
thanks!

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1863685

Title:
  ARM: HCR.TSW traps are not implemented

Status in QEMU:
  In Progress

Bug description:
  On 32-bit and 64-bit ARM platforms, setting HCR.TSW is supposed to
  "Trap data or unified cache maintenance instructions that operate by
  Set/Way." Quoting the ARM manual:

  If EL1 is using AArch64 state, accesses to DC ISW, DC CSW, DC CISW are 
trapped to EL2, reported using EC syndrome value 0x18.
  If EL1 is using AArch32 state, accesses to DCISW, DCCSW, DCCISW are trapped 
to EL2, reported using EC syndrome value 0x03.

  However, QEMU does not trap those instructions/registers. This was
  tested on the branch master of the git repo.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1863685/+subscriptions



Re: [PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-02-18 Thread Eric Blake

On 2/18/20 2:02 PM, Andrey Shinkevich wrote:

qemu-iotests:$ ./check -qcow2
PASSED
(except always failed 261 and 272)


Have you reported those failures on the threads that introduced those tests?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 2/4] linux-user: Add support for FITRIM ioctl

2020-02-18 Thread Laurent Vivier
Le 18/02/2020 à 21:53, Laurent Vivier a écrit :
> Le 24/01/2020 à 16:47, Aleksandar Markovic a écrit :
>> From: Aleksandar Markovic 
>>
>> FITRIM ioctl accepts a pointer to the structure
>>
>> struct fstrim_range {
>> __u64 start;
>> __u64 len;
>> __u64 minlen;
>> };
>>
>> as its third argument.
>>
>> All ioctls in this group (FI* ioctls) are guarded with "#ifdef", so the
>> guards are used in this implementation too for consistency (however,
>> many ioctls in the FI* group are old enough that their #ifdef guards
>> could be removed, but this is outside the scope of this patch).
>>
>> Signed-off-by: Aleksandar Markovic 
>> ---
>>  linux-user/ioctls.h| 3 +++
>>  linux-user/syscall_defs.h  | 1 +
>>  linux-user/syscall_types.h | 5 +
>>  3 files changed, 9 insertions(+)
>>
>> diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
>> index 9fb9d6f..944fbeb 100644
>> --- a/linux-user/ioctls.h
>> +++ b/linux-user/ioctls.h
>> @@ -152,6 +152,9 @@
>>  #ifdef FIBMAP
>>   IOCTL(FIBMAP, IOC_W | IOC_R, MK_PTR(TYPE_LONG))
>>  #endif
>> +#ifdef FITRIM
>> + IOCTL(FITRIM, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_fstrim_range)))
>> +#endif
>>  #ifdef FICLONE
>>   IOCTL(FICLONE, IOC_W, TYPE_INT)
>>   IOCTL(FICLONERANGE, IOC_W, MK_PTR(MK_STRUCT(STRUCT_file_clone_range)))
>> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
>> index ed5068f..8761841 100644
>> --- a/linux-user/syscall_defs.h
>> +++ b/linux-user/syscall_defs.h
>> @@ -950,6 +950,7 @@ struct target_rtc_pll_info {
>>  #define TARGET_FIBMAP TARGET_IO(0x00,1)  /* bmap access */
>>  #define TARGET_FIGETBSZ   TARGET_IO(0x00,2)  /* get the block size used for 
>> bmap */
>>  
>> +#define TARGET_FITRIM TARGET_IOWR('X', 121, struct fstrim_range)

You need also the "#ifdef" that is in the next patch.

>>  #define TARGET_FICLONETARGET_IOW(0x94, 9, int)
>>  #define TARGET_FICLONERANGE TARGET_IOW(0x94, 13, struct file_clone_range)
>>  
>> diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h
>> index 5ba4155..dfd7608 100644
>> --- a/linux-user/syscall_types.h
>> +++ b/linux-user/syscall_types.h
>> @@ -226,6 +226,11 @@ STRUCT(dm_target_versions,
>>  STRUCT(dm_target_msg,
>> TYPE_ULONGLONG) /* sector */
>>  
>> +STRUCT(fstrim_range,
>> +   TYPE_LONGLONG, /* start */
>> +   TYPE_LONGLONG, /* len */
>> +   TYPE_LONGLONG) /* minlen */
> 
> they are __u64, use TYPE_ULONGLONG.
> 
> With that changed, you can add my:
> 
> Reviewed-by: Laurent Vivier 
> 
> 




Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE

2020-02-18 Thread Eric Blake

On 2/17/20 9:13 AM, Max Reitz wrote:

Hi,

It’s my understanding that without some is_zero infrastructure for QEMU,
it’s impossible to implement this flag in qemu’s NBD server.


You're right that we may need some more infrastructure before being able 
to decide when to report this bit in all cases.  But for raw files, that 
infrastructure already exists: a block_status query at offset 0 with the 
entire image as length can report that the whole file is a hole. 
And for qcow2 files, it would not be that hard to teach a similar 
block_status request to report the entire image as a hole based on my 
proposed qcow2 autoclear bit tracking that the image still reads as zero.




At the same time, I still haven’t understood what we need the flag for.

As far as I understood in our discussion on your qemu series, there is
no case where anyone would need to know whether an image is zero.  All
practical cases involve someone having to ensure that some image is
zero.  Knowing whether an image is zero can help with that, but that can
be an implementation detail.

For qcow2, the idea would be that there is some flag that remains true
as long as the image is guaranteed to be zero.  Then we’d have some
bdrv_make_zero function, and qcow2’s implementation would use this
information to gauge whether there’s something to do as all.

For NBD, we cannot use this idea directly because to implement such a
flag (as you’re describing in this mail), we’d need separate is_zero
infrastructure, and that kind of makes the point of “drivers’
bdrv_make_zero() implementations do the right thing by themselves” moot.


We don't necessarily need a separate is_zero infrastructure if we can 
instead teach the existing block_status infrastructure to report that 
the entire image reads as zero.  You're right that clients that need to 
force an entire image to be zero won't need to directly call 
block_status (they can just call bdrv_make_zero, and let that worry 
about whether a block status call makes sense among its list of steps to 
try).  But since block_status can report all-zero status for some cases, 
it's not hard to use that for feeding the NBD bit.


However, there's a difference between qemu's block status (which is 
already typed correctly to return a 64-bit answer, even if it may need a 
few tweaks for clients that currently don't expect it to request more 
than 32 bits) and NBD's block status (which can only report 32 bits 
barring a new extension to the protocol), and where a single all-zero 
bit at NBD_OPT_GO is just as easy of an extension as a way to report a 
64-bit all-zero response to NBD_CMD_BLOCK_STATUS.




OTOH, we wouldn’t need such a flag for the implementation, because we
could just send a 64-bit discard/make_zero over the whole block device
length to the NBD server, and then the server internally does the right
thing(TM).  AFAIU discard and write_zeroes currently have only 32 bit
length fields, but there were plans for adding support for 64 bit
versions anyway.  From my naïve outsider perspective, doing that doesn’t
seem a more complicated protocol addition than adding some way to tell
whether an NBD export is zero.


Adding 64-bit commands to NBD is more invasive than adding a single 
startup status bit.  Both ideas can be done - doing one does not 
preclude the other.  But at the same time, not all servers will 
implement both ideas - if one is easy to implement while the other is 
hard, it is not unlikely that qemu will still encounter NBD servers that 
advertise startup state but not support 64-bit make_zero (even if qemu 
as NBD server starts supporting 64-bit make zero) or even 64-bit block 
status results.


Another thing to think about here is timing.  With the proposed NBD 
addition, it is the server telling the client that "the image you are 
connecting to started zero", prior to the point that the client even has 
a chance to request "can you make the image all zero in a quick manner 
(and if not, I'll fall back to writing zeroes as I go)".  And even if 
NBD gains a 64-bit block status and/or make zero command, it is still 
less network traffic for the server to advertise up-front that the image 
is all zero than it is for the client to have to issue command requests 
of the server (network traffic is not always the bottleneck, but it can 
be a consideration).




So I’m still wondering whether there are actually cases where we need to
tell whether some image or NBD export is zero that do not involve making
it zero if it isn’t.


Just because we don't think that qemu-img has such a case does not mean 
that other NBD clients will not be able to come up with some use for 
knowing if an image starts all zero.




(I keep asking because it seems to me that if all we ever really want to
do is to ensure that some images/exports are zero, we should implement
that.)


The problem is WHERE do you implement it.  Is it more efficient to 
implement make_zero in the NBD server (the client merely requests to 
make 

Re: [PATCH v2 2/4] linux-user: Add support for FITRIM ioctl

2020-02-18 Thread Laurent Vivier
Le 24/01/2020 à 16:47, Aleksandar Markovic a écrit :
> From: Aleksandar Markovic 
> 
> FITRIM ioctl accepts a pointer to the structure
> 
> struct fstrim_range {
> __u64 start;
> __u64 len;
> __u64 minlen;
> };
> 
> as its third argument.
> 
> All ioctls in this group (FI* ioctls) are guarded with "#ifdef", so the
> guards are used in this implementation too for consistency (however,
> many ioctls in the FI* group are old enough that their #ifdef guards
> could be removed, but this is outside the scope of this patch).
> 
> Signed-off-by: Aleksandar Markovic 
> ---
>  linux-user/ioctls.h| 3 +++
>  linux-user/syscall_defs.h  | 1 +
>  linux-user/syscall_types.h | 5 +
>  3 files changed, 9 insertions(+)
> 
> diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
> index 9fb9d6f..944fbeb 100644
> --- a/linux-user/ioctls.h
> +++ b/linux-user/ioctls.h
> @@ -152,6 +152,9 @@
>  #ifdef FIBMAP
>   IOCTL(FIBMAP, IOC_W | IOC_R, MK_PTR(TYPE_LONG))
>  #endif
> +#ifdef FITRIM
> + IOCTL(FITRIM, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_fstrim_range)))
> +#endif
>  #ifdef FICLONE
>   IOCTL(FICLONE, IOC_W, TYPE_INT)
>   IOCTL(FICLONERANGE, IOC_W, MK_PTR(MK_STRUCT(STRUCT_file_clone_range)))
> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
> index ed5068f..8761841 100644
> --- a/linux-user/syscall_defs.h
> +++ b/linux-user/syscall_defs.h
> @@ -950,6 +950,7 @@ struct target_rtc_pll_info {
>  #define TARGET_FIBMAP TARGET_IO(0x00,1)  /* bmap access */
>  #define TARGET_FIGETBSZ   TARGET_IO(0x00,2)  /* get the block size used for 
> bmap */
>  
> +#define TARGET_FITRIM TARGET_IOWR('X', 121, struct fstrim_range)
>  #define TARGET_FICLONETARGET_IOW(0x94, 9, int)
>  #define TARGET_FICLONERANGE TARGET_IOW(0x94, 13, struct file_clone_range)
>  
> diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h
> index 5ba4155..dfd7608 100644
> --- a/linux-user/syscall_types.h
> +++ b/linux-user/syscall_types.h
> @@ -226,6 +226,11 @@ STRUCT(dm_target_versions,
>  STRUCT(dm_target_msg,
> TYPE_ULONGLONG) /* sector */
>  
> +STRUCT(fstrim_range,
> +   TYPE_LONGLONG, /* start */
> +   TYPE_LONGLONG, /* len */
> +   TYPE_LONGLONG) /* minlen */

they are __u64, use TYPE_ULONGLONG.

With that changed, you can add my:

Reviewed-by: Laurent Vivier 




RE: [PATCH] WHPX: Assigning maintainer for Windows Hypervisor Platform

2020-02-18 Thread Justin Terry (SF)
Looks good to me! Thanks Sunil.

Signed-off-by: Justin Terry (VM) 

> -Original Message-
> From: Sunil Muthuswamy 
> Sent: Tuesday, February 18, 2020 12:39 PM
> To: Eduardo Habkost ; Paolo Bonzini
> ; Richard Henderson 
> Cc: Stefan Weil ; qemu-devel@nongnu.org; Justin Terry
> (SF) 
> Subject: [PATCH] WHPX: Assigning maintainer for Windows Hypervisor
> Platform
> 
> Signed-off-by: Sunil Muthuswamy 
> ---
>  MAINTAINERS | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1740a4fddc..9b3ba4e1b5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -404,6 +404,14 @@ S: Supported
>  F: target/i386/kvm.c
>  F: scripts/kvm/vmxcap
> 
> +WHPX CPUs
> +M: Sunil Muthuswamy 
> +S: Supported
> +F: target/i386/whpx-all.c
> +F: target/i386/whp-dispatch.h
> +F: accel/stubs/whpx-stub.c
> +F: include/sysemu/whpx.h
> +
>  Guest CPU Cores (Xen)
>  -
>  X86 Xen CPUs
> --
> 2.17.1



[Bug 1863685] Re: ARM: HCR.TSW traps are not implemented

2020-02-18 Thread Richard Henderson
I can't think of any reason that DACR would have an incorrect
register value.  It would be treated as any other system register,
and there's only one code path involved.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1863685

Title:
  ARM: HCR.TSW traps are not implemented

Status in QEMU:
  In Progress

Bug description:
  On 32-bit and 64-bit ARM platforms, setting HCR.TSW is supposed to
  "Trap data or unified cache maintenance instructions that operate by
  Set/Way." Quoting the ARM manual:

  If EL1 is using AArch64 state, accesses to DC ISW, DC CSW, DC CISW are 
trapped to EL2, reported using EC syndrome value 0x18.
  If EL1 is using AArch32 state, accesses to DCISW, DCCSW, DCCISW are trapped 
to EL2, reported using EC syndrome value 0x03.

  However, QEMU does not trap those instructions/registers. This was
  tested on the branch master of the git repo.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1863685/+subscriptions



Re: [PATCH v2 1/4] linux-user: Add support for FS_IOC_FSXATTR ioctls

2020-02-18 Thread Laurent Vivier
Le 24/01/2020 à 16:47, Aleksandar Markovic a écrit :
> From: Aleksandar Markovic 
> 
> Both FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR accept a pointer to
> the structure
> 
> struct fsxattr {
> __u32 fsx_xflags; /* xflags field value (get/set) */
> __u32 fsx_extsize;/* extsize field value (get/set)*/
> __u32 fsx_nextents;   /* nextents field value (get)   */
> __u32 fsx_projid; /* project identifier (get/set) */
> __u32 fsx_cowextsize; /* CoW extsize field value (get/set)*/
> unsigned char fsx_pad[8];
> };
> 
> as their third argument.
> 
> These ioctls were relatively recently introduced, so the "#ifdef"
> guards are used in this implementation.
> 
> Signed-off-by: Aleksandar Markovic 
> ---
>  linux-user/ioctls.h   | 7 +++
>  linux-user/syscall_defs.h | 6 ++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
> index 73dcc76..9fb9d6f 100644
> --- a/linux-user/ioctls.h
> +++ b/linux-user/ioctls.h
> @@ -173,6 +173,13 @@
>   IOCTL(FS_IOC32_SETFLAGS, IOC_W, MK_PTR(TYPE_INT))
>   IOCTL(FS_IOC32_GETVERSION, IOC_R, MK_PTR(TYPE_INT))
>   IOCTL(FS_IOC32_SETVERSION, IOC_W, MK_PTR(TYPE_INT))
> +#ifdef FS_IOC_FSGETXATTR
> + IOCTL(FS_IOC_FSGETXATTR, IOC_W, MK_PTR(MK_STRUCT(STRUCT_fsxattr)))

kernel declares that as IOC_R

> +#endif
> +#ifdef FS_IOC_FSSETXATTR
> + IOCTL(FS_IOC_FSSETXATTR, IOC_W, MK_PTR(MK_STRUCT(STRUCT_fsxattr)))
> +#endif
> +
>  
>  #ifdef CONFIG_USBFS
>/* USB ioctls */
> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
> index 9b61ae8..ed5068f 100644
> --- a/linux-user/syscall_defs.h
> +++ b/linux-user/syscall_defs.h
> @@ -966,6 +966,12 @@ struct target_rtc_pll_info {
>  #define TARGET_FS_IOC32_SETFLAGS TARGET_IOW('f', 2, int)
>  #define TARGET_FS_IOC32_GETVERSION TARGET_IOR('v', 1, int)
>  #define TARGET_FS_IOC32_SETVERSION TARGET_IOW('v', 2, int)
> +#ifdef FS_IOC_FSGETXATTR
> +#define TARGET_FS_IOC_FSGETXATTR TARGET_IOR('X', 31, struct fsxattr)
> +#endif
> +#ifdef FS_IOC_FSSETXATTR
> +#define TARGET_FS_IOC_FSSETXATTR TARGET_IOR('X', 32, struct fsxattr)

kernel declares that as _IOW

> +#endif
>  
>  /* usb ioctls */
>  #define TARGET_USBDEVFS_CONTROL TARGET_IOWRU('U', 0)
> 

Where is STRUCT(fsxattr, ...) defined?

Thanks,
Laurent



[PATCH] WHPX: Assigning maintainer for Windows Hypervisor Platform

2020-02-18 Thread Sunil Muthuswamy
Signed-off-by: Sunil Muthuswamy 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1740a4fddc..9b3ba4e1b5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -404,6 +404,14 @@ S: Supported
 F: target/i386/kvm.c
 F: scripts/kvm/vmxcap
 
+WHPX CPUs
+M: Sunil Muthuswamy 
+S: Supported
+F: target/i386/whpx-all.c
+F: target/i386/whp-dispatch.h
+F: accel/stubs/whpx-stub.c
+F: include/sysemu/whpx.h
+
 Guest CPU Cores (Xen)
 -
 X86 Xen CPUs
-- 
2.17.1



Re: [PATCH v8 00/13] linux-user: Add support for real time clock and

2020-02-18 Thread Laurent Vivier
Le 15/01/2020 à 20:36, Filip Bozuta a écrit :
> This series covers following RTC and sound timer ioctls:
> 
>   RTC ioctls(22):
> 
> * RTC_AIE_ON  * RTC_ALM_SET * RTC_WKALM_SET
> * RTC_AIE_OFF * RTC_ALM_READ* RTC_WKALM_RD
> * RTC_UIE_ON  * RTC_RD_TIME * RTC_PLL_GET
> * RTC_UIE_OFF * RTC_SET_TIME* RTC_PLL_SET
> * RTC_PIE_ON  * RTC_IRQP_READ   * RTC_VL_READ
> * RTC_PIE_OFF * RTC_IRQP_SET* RTC_VL_CLR
> * RTC_WIE_ON  * RTC_EPOCH_READ
> * RTC_WIE_OFF * RTC_EPOCH_SET
> 
>   Sound timer ioctls(14):
> 
> * SNDRV_TIMER_IOCTL_PVERSION  * SNDRV_TIMER_IOCTL_INFO
> * SNDRV_TIMER_IOCTL_NEXT_DEVICE   * SNDRV_TIMER_IOCTL_PARAMS
> * SNDRV_TIMER_IOCTL_TREAD * SNDRV_TIMER_IOCTL_STATUS
> * SNDRV_TIMER_IOCTL_GINFO * SNDRV_TIMER_IOCTL_START
> * SNDRV_TIMER_IOCTL_GPARAMS   * SNDRV_TIMER_IOCTL_STOP
> * SNDRV_TIMER_IOCTL_GSTATUS   * SNDRV_TIMER_IOCTL_CONTINUE
> * SNDRV_TIMER_IOCTL_SELECT* SNDRV_TIMER_IOCTL_PAUSE
> 
> The functionalities of individual ioctls were described in this series
> patch commit messages.
> 
> Testing method for RTC ioctls:
> 
> Mini test programs were written for each ioctl. Those programs were
> compiled (sometimes using cross-compilers) for the following
> architectures:
> 
> * Intel 64-bit (little endian)
> * Power pc 32-bit (big endian)
> * Power pc 64-bit (big endian)
> 
> The corresponding native programs were executed without using
> QEMU on following hosts:
> 
> * Intel Core i7-4790K (x86_64 host)
> * Power 7447A (ppc32 host)
> 
> All applicable compiled programs were in turn executed through QEMU
> and the results obtained were the same ones gotten for native
> execution.
> 
> Example of a test program:
> 
> For ioctl RTC_RD_TIME the following test program was used:
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <errno.h>
> #include <sys/ioctl.h>
> #include <linux/rtc.h>
> 
> #define ERROR -1
> 
> int main()
> {
> 
> int fd = open("/dev/rtc", O_RDWR | O_NONBLOCK);
> 
> if(fd == ERROR)
> {
> perror("open");
> return -1;
> }
> 
> struct rtc_time cur_time;
> 
> if(ioctl(fd, RTC_RD_TIME, &cur_time) < 0)
> {
> perror("ioctl");
> return -1;
> }
> 
> printf("Second: %d, Minute: %d, Hour: %d, Day: %d, Month: %d, 
> Year: %d,",
> cur_time.tm_sec, cur_time.tm_min, cur_time.tm_hour, 
> cur_time.tm_mday, cur_time.tm_mon, cur_time.tm_year);
> 
> return 0;
> }
> 
> Limitations of testing:
> 
> The test host pc that was used for testing (intel pc) has RTC
> that doesn't support all RTC features that are accessible
> through ioctls. This means that testing can't discover
> functionality errors related to the third argument of ioctls
> that are used for features which are not supported. For example,
> running the test program for ioctl RTC_EPOCH_READ gives
> the error output: inappropriate ioctl for device. As expected,
> the same output was obtained through QEMU which means that this
> ioctl is recognized in QEMU but doesn't really do anything
> because it is not supported in the host computer's RTC.
> 
> Conclusion: Some RTC ioctls need to be tested on computers
> that support their functionalities so that it can be inferred
> that they are really supported in QEMU. In absence of such
> test hosts, the specifications of those ioctls need to be
> carefully checked manually and the implementations should be
> updated accordingly.
> 
> Testing method for sound timer ioctls:
> 
> The alsa ioctl test suite, that can be found on github
> ("https://github.com/alsa-project/alsa-utils";), was used to test
> the implemented ioctls. The file "timer.c", located in this test
> suite, contains test functions that are used to test alsa timer
> ioctls. This file was compiled (sometimes using cross-compilers) 
> for the following architectures:
> 
> * Intel 64-bit (little endian)
> * Power pc 32-bit (big endian)
> * Power pc 64-bit (big endian)
> 
> The corresponding native compiled test files were executed without using
> QEMU on following hosts:
> 
> * Intel Core i7-4790K (x86_64 host)
> * Power 7447A (ppc32 host)
> 
> All compiled test files were in 

[Bug 1863685] Re: ARM: HCR.TSW traps are not implemented

2020-02-18 Thread Julien Freche
Sorry, I meant the operation is a write (TVM is on). The result of the
operation is setting DACR to 0 so the guest stops progressing after
that.

Anyway, since the issue could also be on my side, I don't want to block
you with this.




[Bug 1863685] Re: ARM: HCR.TSW traps are not implemented

2020-02-18 Thread Julien Freche
Thanks for the quick turn around! I tested both your patches together
(it's useful to have both to emulate set/way flushing inside a guest)
and I am getting something unexpected. At some point, we are trapping on
an access to DACR but ESR_EL2 doesn't seem to make a lot of sense:
0xfe00dc0. I am running a 32-bit Linux inside a VM.

Decoding it: Rt is set to 0xe which is LR_usr. Also, this is a read
operation. So, essentially the guest seems to try to set DACR to LR_usr
which seems unreasonable.

It could be an issue with the hypervisor on my side (I am running some
custom code). But, it's not obvious to me and the code behaves fine on
bare-metal. Is there a chance that ESR is not populated correctly?

In any case, I do see traps for set/way instructions so, from that point
of view, the patch is good.




Re: [PATCH v3 0/4] migration: Replace gemu_log with qemu_log

2020-02-18 Thread Laurent Vivier
Le 04/02/2020 à 03:54, Josh Kunz a écrit :
> Summary of v2->v3 changes:
>   * Removed assert for CMSG handling, replaced with LOG_UNIMP. Will
> switch to assert in follow-up patch.
>   * Fixed BSD-user build (dangling references to qemu_add_log), and
> verified the user-mode build works.
> 
> Summary of v1->v2 changes:
>   * Removed backwards-compatibility code for non-strace log statements.
>   * Removed new qemu_log interface for adding or removing fields from
> the log mask.
>   * Removed LOG_USER and converted all uses (except one) to LOG_UNIMP.
> * One gemu_log statement was converted to an assert.
>   * Some style cleanup.
> 
> The linux-user and bsd-user trees both widely use a function called
> `gemu_log` (notice the 'g') for miscellaneous and strace logging. This
> function predates the newer `qemu_log` function, and has a few drawbacks
> compared to `qemu_log`:
> 
>   1. Always logs to `stderr`, no logging redirection.
>   2. "Miscellaneous" logging cannot be disabled, so it may mix with guest
>  logging.
>   3. Inconsistency with other parts of the QEMU codebase, and a
>  confusing name.
> 
> The second issue is especially troubling because it can interfere with
> programs that expect to communicate via stderr.
> 
> This change introduces one new logging masks to the `qemu_log` subsystem
> to support its use for user-mode logging: the `LOG_STRACE` mask for
> strace-specific logging. Further, it replaces all existing uses of
> `gemu_log` with the appropriate `qemu_log_mask(LOG_{UNIMP,STRACE}, ...)`
> based on the log message.
> 
> Backwards incompatibility:
>   * Log messages for unimplemented user-mode features are no longer
> logged by default. They have to be enabled by setting the LOG_UNIMP
> mask.
>   * Log messages for strace/unimplemented user-mode features may be
> redirected based on `-D`, instead of always logging to stderr.
> 
> Tested:
>   * Built with clang 9 and g++ 8.3
>   * `make check` run with clang 9 build 
>   * Verified:
> * QEMU_STRACE/-strace still works for linux-user
>   * `make vm-build-netbsd EXTRA_CONFIGURE_OPTS="--disable-system" \
>  BUILD_TARGET="all"` passed.
> 
> Josh Kunz (4):
>   linux-user: Use `qemu_log' for non-strace logging
>   linux-user: Use `qemu_log' for strace
>   linux-user: remove gemu_log from the linux-user tree
>   bsd-user: Replace gemu_log with qemu_log
> 
>  bsd-user/main.c   |  29 ++-
>  bsd-user/qemu.h   |   2 -
>  bsd-user/strace.c |  32 ++-
>  bsd-user/syscall.c|  31 ++-
>  include/qemu/log.h|   2 +
>  linux-user/arm/cpu_loop.c |   5 +-
>  linux-user/fd-trans.c |  55 +++--
>  linux-user/main.c |  39 ++--
>  linux-user/qemu.h |   2 -
>  linux-user/signal.c   |   2 +-
>  linux-user/strace.c   | 479 +++---
>  linux-user/syscall.c  |  48 ++--
>  linux-user/vm86.c |   3 +-
>  util/log.c|   2 +
>  14 files changed, 387 insertions(+), 344 deletions(-)
> 

Applied patches 1 to 3 to my linux-user branch.

Thanks,
Laurent



[PATCH v2] sh4: Fix PCI ISA IO memory subregion

2020-02-18 Thread Guenter Roeck
Booting the r2d machine from flash fails because flash is not discovered.
Looking at the flattened memory tree, we see the following.

FlatView #1
 AS "memory", root: system
 AS "cpu-memory-0", root: system
 AS "sh_pci_host", root: bus master container
 Root memory region: system
  - (prio 0, i/o): io
  0001-00ff (prio 0, i/o): r2d.flash @0001

The overlapping memory region is sh_pci.isa, i.e. the ISA I/O region bridge.
This region is initially assigned to address 0xfe240000, but overwritten
with a write into the PCIIOBR register. This write is expected to adjust
the PCI memory window, but not to change the region's base address.

Peter Maydell provided the following detailed explanation.

"Section 22.3.7 and in particular figure 22.3 (of "SH7751R user's manual:
hardware") are clear about how this is supposed to work: there is a window
at 0xfe240000 in the system register space for PCI I/O space. When the CPU
makes an access into that area, the PCI controller calculates the PCI
address to use by combining bits 0..17 of the system address with the
bits 31..18 value that the guest has put into the PCIIOBR. That is, writing
to the PCIIOBR changes which section of the IO address space is visible in
the 0xfe240000 window. Instead what QEMU's implementation does is move the
window to whatever value the guest writes to the PCIIOBR register -- so if
the guest writes 0 we put the window at 0 in system address space."

Fix the problem by calling memory_region_set_alias_offset() instead of
removing and re-adding the PCI ISA subregion on writes into PCIIOBR.
At the same time, in sh_pci_device_realize(), don't set iobr since
it is overwritten later anyway. Instead, pass the base address to
memory_region_add_subregion() directly.

Many thanks to Peter Maydell for the detailed problem analysis, and for
providing suggestions on how to fix the problem.

Signed-off-by: Guenter Roeck 
---
v2: Complete rework based on Peter's analysis. Don't remove the
'sh_pci.isa' alias, which was perfectly fine.
Instead, fix the underlying problem.
Rename subject from "'sh4: Remove bad memory alias 'sh_pci.isa'"

 hw/sh4/sh_pci.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/hw/sh4/sh_pci.c b/hw/sh4/sh_pci.c
index 71afd23b67..08f2fc1dde 100644
--- a/hw/sh4/sh_pci.c
+++ b/hw/sh4/sh_pci.c
@@ -67,12 +67,8 @@ static void sh_pci_reg_write (void *p, hwaddr addr, uint64_t val,
         pcic->mbr = val & 0xff000001;
         break;
     case 0x1c8:
-        if ((val & 0xfffc0000) != (pcic->iobr & 0xfffc0000)) {
-            memory_region_del_subregion(get_system_memory(), &pcic->isa);
-            pcic->iobr = val & 0xfffc0001;
-            memory_region_add_subregion(get_system_memory(),
-                                        pcic->iobr & 0xfffc0000, &pcic->isa);
-        }
+        pcic->iobr = val & 0xfffc0001;
+        memory_region_set_alias_offset(&pcic->isa, val & 0xfffc0000);
         break;
     case 0x220:
         pci_data_write(phb->bus, pcic->par, val, 4);
@@ -147,8 +143,7 @@ static void sh_pci_device_realize(DeviceState *dev, Error **errp)
                              get_system_io(), 0, 0x40000);
     sysbus_init_mmio(sbd, &s->memconfig_p4);
     sysbus_init_mmio(sbd, &s->memconfig_a7);
-    s->iobr = 0xfe240000;
-    memory_region_add_subregion(get_system_memory(), s->iobr, &s->isa);
+    memory_region_add_subregion(get_system_memory(), 0xfe240000, &s->isa);
 
     s->dev = pci_create_simple(phb->bus, PCI_DEVFN(0, 0), "sh_pci_host");
 }
-- 
2.17.1




Re: [PATCH v3 1/4] linux-user: Use `qemu_log' for non-strace logging

2020-02-18 Thread Laurent Vivier
Le 04/02/2020 à 03:54, Josh Kunz a écrit :
> Since most calls to `gemu_log` are actually logging unimplemented features,
> this change replaces most non-strace calls to `gemu_log` with calls to
> `qemu_log_mask(LOG_UNIMP, ...)`.  This allows the user to easily log to
> a file, and to mask out these log messages if they desire.
> 
> Note: This change is slightly backwards incompatible, since now these
> "unimplemented" log messages will not be logged by default.
> 
> Signed-off-by: Josh Kunz 
> ---
>  linux-user/arm/cpu_loop.c |  5 ++--
>  linux-user/fd-trans.c | 55 +--
>  linux-user/syscall.c  | 35 -
>  linux-user/vm86.c |  3 ++-
>  4 files changed, 62 insertions(+), 36 deletions(-)
> 
> diff --git a/linux-user/arm/cpu_loop.c b/linux-user/arm/cpu_loop.c
> index 1fae90c6df..cf618daa1c 100644
> --- a/linux-user/arm/cpu_loop.c
> +++ b/linux-user/arm/cpu_loop.c
> @@ -349,8 +349,9 @@ void cpu_loop(CPUARMState *env)
>  env->regs[0] = cpu_get_tls(env);
>  break;
>  default:
> -gemu_log("qemu: Unsupported ARM syscall: 0x%x\n",
> - n);
> +qemu_log_mask(LOG_UNIMP,
> +  "qemu: Unsupported ARM syscall: 
> 0x%x\n",
> +  n);
>  env->regs[0] = -TARGET_ENOSYS;
>  break;
>  }
> diff --git a/linux-user/fd-trans.c b/linux-user/fd-trans.c
> index 9b92386abf..c0687c52e6 100644
> --- a/linux-user/fd-trans.c
> +++ b/linux-user/fd-trans.c
> @@ -514,7 +514,8 @@ static abi_long host_to_target_data_bridge_nlattr(struct 
> nlattr *nlattr,
>  u32[1] = tswap32(u32[1]); /* optmask */
>  break;
>  default:
> -gemu_log("Unknown QEMU_IFLA_BR type %d\n", nlattr->nla_type);
> +qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_BR type %d\n",
> +  nlattr->nla_type);
>  break;
>  }
>  return 0;
> @@ -577,7 +578,8 @@ static abi_long 
> host_to_target_slave_data_bridge_nlattr(struct nlattr *nlattr,
>  case QEMU_IFLA_BRPORT_BRIDGE_ID:
>  break;
>  default:
> -gemu_log("Unknown QEMU_IFLA_BRPORT type %d\n", nlattr->nla_type);
> +qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_BRPORT type %d\n",
> +  nlattr->nla_type);
>  break;
>  }
>  return 0;
> @@ -605,7 +607,8 @@ static abi_long host_to_target_data_tun_nlattr(struct 
> nlattr *nlattr,
>  *u32 = tswap32(*u32);
>  break;
>  default:
> -gemu_log("Unknown QEMU_IFLA_TUN type %d\n", nlattr->nla_type);
> +qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_TUN type %d\n",
> +  nlattr->nla_type);
>  break;
>  }
>  return 0;
> @@ -652,7 +655,8 @@ static abi_long 
> host_to_target_data_linkinfo_nlattr(struct nlattr *nlattr,
>NULL,
>  
> host_to_target_data_tun_nlattr);
>  } else {
> -gemu_log("Unknown QEMU_IFLA_INFO_KIND %s\n", li_context->name);
> +qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_INFO_KIND %s\n",
> +  li_context->name);
>  }
>  break;
>  case QEMU_IFLA_INFO_SLAVE_DATA:
> @@ -663,12 +667,13 @@ static abi_long 
> host_to_target_data_linkinfo_nlattr(struct nlattr *nlattr,
>NULL,
> 
> host_to_target_slave_data_bridge_nlattr);
>  } else {
> -gemu_log("Unknown QEMU_IFLA_INFO_SLAVE_KIND %s\n",
> +qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_INFO_SLAVE_KIND 
> %s\n",
>   li_context->slave_name);
>  }
>  break;
>  default:
> -gemu_log("Unknown host QEMU_IFLA_INFO type: %d\n", nlattr->nla_type);
> +qemu_log_mask(LOG_UNIMP, "Unknown host QEMU_IFLA_INFO type: %d\n",
> +  nlattr->nla_type);
>  break;
>  }
>  
> @@ -690,7 +695,8 @@ static abi_long host_to_target_data_inet_nlattr(struct 
> nlattr *nlattr,
>  }
>  break;
>  default:
> -gemu_log("Unknown host AF_INET type: %d\n", nlattr->nla_type);
> +qemu_log_mask(LOG_UNIMP, "Unknown host AF_INET type: %d\n",
> +  nlattr->nla_type);
>  }
>  return 0;
>  }
> @@ -741,7 +747,8 @@ static abi_long host_to_target_data_inet6_nlattr(struct 
> nlattr *nlattr,
>  }
>  break;
>  default:
> -gemu_log("Unknown host AF_INET6 type: %d\n", nlattr->nla_type);
> +qemu_log_mask(LOG_UNIMP, "Unknown host AF_INET6 type: %d\n",
> +  nlattr->nla_type);
>  }
>  return 0;
>  }
> @@ 

Re: [PATCH] Avoid cpu_physical_memory_rw() with a constant is_write argument

2020-02-18 Thread Philippe Mathieu-Daudé

On 2/18/20 7:49 PM, Peter Maydell wrote:

On Tue, 18 Feb 2020 at 17:57, Stefan Weil  wrote:


Am 18.02.20 um 14:20 schrieb Philippe Mathieu-Daudé:


This commit was produced with the included Coccinelle script
scripts/coccinelle/as-rw-const.patch.

Inspired-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
---
Based-on: <20200218112457.22712-1-peter.mayd...@linaro.org>

[...]

diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c
index a8b6e5aeb8..f5971ccc74 100644
--- a/target/i386/hax-all.c
+++ b/target/i386/hax-all.c
@@ -376,8 +376,8 @@ static int hax_handle_fastmmio(CPUArchState *env, struct hax_fastmmio *hft)
      *  hft->direction == 2: gpa ==> gpa2
      */
     uint64_t value;
-    cpu_physical_memory_rw(hft->gpa, (uint8_t *) &value, hft->size, 0);
-    cpu_physical_memory_rw(hft->gpa2, (uint8_t *) &value, hft->size, 1);
+    cpu_physical_memory_read(hft->gpa, (uint8_t *) &value, hft->size);
+    cpu_physical_memory_write(hft->gpa2, (uint8_t *) &value, hft->size);



Maybe those type casts could be removed, too. They are no longer needed
after your modification.


I think that we should fix the inconsistency where these functions
all take "uint8_t* buf":

  - address_space_rw()
  - address_space_read()
  - address_space_write()
  - address_space_write_rom()
  - cpu_physical_memory_rw()
  - cpu_memory_rw_debug()

but these take void*:
  - cpu_physical_memory_read()
  - cpu_physical_memory_write()
  - address_space_write_cached()
  - address_space_read_cached_slow()
  - address_space_write_cached_slow()
  - pci_dma_read()
  - pci_dma_write()
  - pci_dma_rw()
  - dma_memory_read()
  - dma_memory_write()
  - dma_memory_rw()
  - dma_memory_rw_relaxed()


I don't understand well cpu_physical_memory*(). Aren't these obsolete?
They confuse me when using multi-core CPUs.



Depending on which way we go we would either want to remove these
casts, or not.

I guess that we have more cases of 'void*', and that would
certainly be the easier way to convert (otherwise we probably
need to add a bunch of new casts to uint8_t* in various callsites),
but I don't have a strong opinion. Paolo ?


I thought about it too, but it is quite some work, and I have to admit I
lost some faith with my previous chardev conversion. There Paolo/Daniel 
agreed to follow the libc read()/write() prototypes.




thanks
-- PMM






Re: [PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-02-18 Thread Andrey Shinkevich

qemu-iotests:$ ./check -qcow2
PASSED
(except always failed 261 and 272)

Andrey

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Original idea of bitmaps postcopy migration is that bitmaps are non
critical data, and their loss is not serious problem. So, using postcopy
method on any failure we should just drop unfinished bitmaps and
continue guest execution.

However, it doesn't work so. It crashes, fails, it goes to
postcopy-recovery feature. It does anything except for behavior we want.
These series fixes at least some problems with error handling during
bitmaps migration postcopy.

v1 was "[PATCH 0/7] Fix crashes on early shutdown during bitmaps postcopy"

v2:

Most of patches are new or changed a lot.
Only patches 06,07 mostly unchanged, just rebased on refactorings.

Vladimir Sementsov-Ogievskiy (22):
   migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
   migration/block-dirty-bitmap: rename state structure types
   migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup
   migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init
   migration/block-dirty-bitmap: refactor state global variables
   migration/block-dirty-bitmap: rename finish_lock to just lock
   migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
   migration/block-dirty-bitmap: keep bitmap state for all bitmaps
   migration/block-dirty-bitmap: relax error handling in incoming part
   migration/block-dirty-bitmap: cancel migration on shutdown
   migration/savevm: don't worry if bitmap migration postcopy failed
   qemu-iotests/199: fix style
   qemu-iotests/199: drop extra constraints
   qemu-iotests/199: better catch postcopy time
   qemu-iotests/199: improve performance: set bitmap by discard
   qemu-iotests/199: change discard patterns
   qemu-iotests/199: increase postcopy period
   python/qemu/machine: add kill() method
   qemu-iotests/199: prepare for new test-cases addition
   qemu-iotests/199: check persistent bitmaps
   qemu-iotests/199: add early shutdown case to bitmaps postcopy
   qemu-iotests/199: add source-killed case to bitmaps postcopy

Cc: John Snow 
Cc: Vladimir Sementsov-Ogievskiy 
Cc: Stefan Hajnoczi 
Cc: Fam Zheng 
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
Cc: Eduardo Habkost 
Cc: Cleber Rosa 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: qemu-bl...@nongnu.org
Cc: qemu-devel@nongnu.org
Cc: qemu-sta...@nongnu.org # for patch 01

  migration/migration.h  |   3 +-
  migration/block-dirty-bitmap.c | 444 +
  migration/migration.c  |  15 +-
  migration/savevm.c |  37 ++-
  python/qemu/machine.py |  12 +-
  tests/qemu-iotests/199 | 244 ++
  tests/qemu-iotests/199.out |   4 +-
  7 files changed, 529 insertions(+), 230 deletions(-)



--
With the best regards,
Andrey Shinkevich


